[PATCH 0/4 V5] Storage volume upload

This patch set depends on "[Kimchi-devel] [PATCH v2 1/4] fix: Use correct path
when setting 'ref_cnt' to a new volume", so I included it in this patch set to
help testing.

v4 -> v5:
- Reduce number of locks
- Update MockModel and test cases accordingly

Aline Manera (2):
  Upload storage volume
  Remove storage volume creation from file

Crístian Deives (1):
  fix: Use correct path when setting 'ref_cnt' to a new volume

Royce Lv (1):
  Update controller to make update accept formdata params

 docs/API.md                        |  8 +++-
 src/kimchi/API.json                | 22 +++++++++
 src/kimchi/control/base.py         |  6 +--
 src/kimchi/i18n.py                 |  6 +++
 src/kimchi/mockmodel.py            | 34 +++++++++-----
 src/kimchi/model/storagevolumes.py | 93 ++++++++++++++++++++++----------------
 tests/test_model_storagevolume.py  | 89 ++++++++++++++++++++++++++++++------
 tests/test_rest.py                 | 45 +-----------------
 8 files changed, 187 insertions(+), 116 deletions(-)

--
2.1.0

From: Crístian Deives <cristiandeives@gmail.com>

When a storage volume is created with the parameter 'capacity', the
parameter 'ref_cnt' is set on the storage pool path instead of the
storage volume path. 'ref_cnt's are only related to storage volumes,
not pools.

Use the storage volume path instead of the storage pool path when
setting 'ref_cnt' on a new volume.

Signed-off-by: Crístian Deives <cristiandeives@gmail.com>
---
 src/kimchi/model/storagevolumes.py | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/kimchi/model/storagevolumes.py b/src/kimchi/model/storagevolumes.py
index f02efca..cf4611d 100644
--- a/src/kimchi/model/storagevolumes.py
+++ b/src/kimchi/model/storagevolumes.py
@@ -182,12 +182,14 @@ class StorageVolumesModel(object):
                                  {'name': name, 'pool': pool,
                                   'err': e.get_error_message()})

-        path = StoragePoolModel(
-            conn=self.conn, objstore=self.objstore).lookup(pool_name)['path']
+        vol_info = StorageVolumeModel(conn=self.conn,
+                                      objstore=self.objstore).lookup(pool_name,
+                                                                     name)
         try:
             with self.objstore as session:
-                session.store('storagevolume', path, {'ref_cnt': 0})
+                session.store('storagevolume', vol_info['path'],
+                              {'ref_cnt': 0})
         except Exception as e:
             # If the storage volume was created flawlessly, then lets hide this
             # error to avoid more error in the VM creation process
--
2.1.0

From: Royce Lv <lvroyce@linux.vnet.ibm.com>

When update does not accept params in the base class, cherrypy will raise
an error that extra params are provided in the body. So allow the update
function to accept params.

Signed-off-by: Royce Lv <lvroyce@linux.vnet.ibm.com>
---
 src/kimchi/control/base.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/kimchi/control/base.py b/src/kimchi/control/base.py
index b50ea5c..b9520c0 100644
--- a/src/kimchi/control/base.py
+++ b/src/kimchi/control/base.py
@@ -156,7 +156,7 @@ class Resource(object):
             raise cherrypy.HTTPError(400, e.message)

     @cherrypy.expose
-    def index(self):
+    def index(self, *args, **kargs):
         method = validate_method(('GET', 'DELETE', 'PUT'),
                                  self.role_key, self.admin_methods)
@@ -167,7 +167,7 @@ class Resource(object):
             return {'GET': self.get,
                     'DELETE': self.delete,
-                    'PUT': self.update}[method]()
+                    'PUT': self.update}[method](*args, **kargs)
         except InvalidOperation, e:
             raise cherrypy.HTTPError(400, e.message)
         except InvalidParameter, e:
@@ -194,7 +194,7 @@ class Resource(object):
         return user_name in users or len(set(user_groups) & set(groups)) > 0

-    def update(self):
+    def update(self, *args, **kargs):
         try:
             update = getattr(self.model, model_fn(self, 'update'))
         except AttributeError:
--
2.1.0

Storage volume upload will use the same REST API as 'capacity' to create
the storage volume for upload. To do that, the 'upload' parameter must be
set accordingly.

POST /storagepools/<pool-name>/storagevolumes/
     {"capacity": 1000000, "format": "raw", "name": "volume-1",
      "upload": true}

After the storage volume is created, the data transfer will be done
through several PUT requests:

PUT /storagepools/<pool-name>/storagevolumes/volume-1
    {"chunk_size": "1024", "chunk": form-data}
PUT /storagepools/<pool-name>/storagevolumes/volume-1
    {"chunk_size": "1024", "chunk": form-data}
...
PUT /storagepools/<pool-name>/storagevolumes/volume-1
    {"chunk_size": "1024", "chunk": form-data}

Kimchi will keep the file offset internally to avoid data corruption, so
all the PUT requests must be sequential.

This patch also updates MockModel to do the upload process manually, as
the virStorageVol.upload() function is not supported by the libvirt Test
driver. The test cases were also updated accordingly to those changes.

Signed-off-by: Aline Manera <alinefm@linux.vnet.ibm.com>
Signed-off-by: Royce Lv <lvroyce@linux.vnet.ibm.com>
---
 docs/API.md                        |  5 +++
 src/kimchi/API.json                | 22 ++++++++++
 src/kimchi/i18n.py                 |  6 +++
 src/kimchi/mockmodel.py            | 18 +++++++-
 src/kimchi/model/storagevolumes.py | 53 ++++++++++++++++++++---
 tests/test_model_storagevolume.py  | 89 +++++++++++++++++++++++++++++++------
 tests/test_rest.py                 | 45 +------------------
 7 files changed, 173 insertions(+), 65 deletions(-)

diff --git a/docs/API.md b/docs/API.md
index 62e4063..88c5fec 100644
--- a/docs/API.md
+++ b/docs/API.md
@@ -487,6 +487,8 @@ A interface represents available network interface on VM.
                 The unit is bytes
     * format: The format of the defined Storage Volume. Only used when creating
               a storage volume with 'capacity'.
+    * upload: True to start an upload process. False, otherwise.
+             Only used when creating a storage volume with 'capacity' parameter.
     * file: File to be uploaded, passed through form data
     * url: URL to be downloaded
@@ -515,6 +517,9 @@ A interface represents available network interface on VM.
 * **DELETE**: Remove the Storage Volume
 * **POST**: *See Storage Volume Actions*
+* **PUT**: Upload storage volume chunk
+    * chunk_size: Chunk size of the slice in Bytes.
+    * chunk: Actual data of uploaded file

 **Actions (POST):**

diff --git a/src/kimchi/API.json b/src/kimchi/API.json
index 474661c..a6330ae 100644
--- a/src/kimchi/API.json
+++ b/src/kimchi/API.json
@@ -202,6 +202,11 @@
                     "minimum": 1,
                     "error": "KCHVOL0020E"
                 },
+                "upload": {
+                    "description": "When the storage volume will be uploaded",
+                    "type": "boolean",
+                    "error": "KCHVOL0025E"
+                },
                 "allocation": {
                     "description": "The size(MiB) of allocation when create the storage volume",
                     "type": "number",
@@ -222,6 +227,23 @@
                 }
             }
         },
+        "storagevolume_update": {
+            "type": "object",
+            "properties": {
+                "chunk": {
+                    "description": "Upload storage volume chunk",
+                    "error": "KCHVOL0024E",
+                    "required": true
+                },
+                "chunk_size": {
+                    "description": "Chunk size of uploaded storage volume",
+                    "type": "string",
+                    "error": "KCHVOL0024E",
+                    "required": true
+                }
+            },
+            "additionalProperties": false
+        },
         "vms_create": {
             "type": "object",
             "error": "KCHVM0016E",

diff --git a/src/kimchi/i18n.py b/src/kimchi/i18n.py
index 18e84bc..9f169ab 100644
--- a/src/kimchi/i18n.py
+++ b/src/kimchi/i18n.py
@@ -214,6 +214,12 @@ messages = {
     "KCHVOL0021E": _("Storage volume URL must be http://, https://, ftp:// or ftps://."),
     "KCHVOL0022E": _("Unable to access file %(url)s. Please, check it."),
     "KCHVOL0023E": _("Unable to clone storage volume '%(name)s' in pool '%(pool)s'. Details: %(err)s"),
+    "KCHVOL0024E": _("Specify chunk data and its size to upload a file."),
+    "KCHVOL0025E": _("In order to upload a storage volume, specify the 'upload' parameter."),
+    "KCHVOL0026E": _("Unable to upload chunk data as it does not match with requested chunk size."),
+    "KCHVOL0027E": _("The storage volume %(vol)s is not under an upload process."),
+    "KCHVOL0028E": _("The upload chunk data will exceed the storage volume size."),
+    "KCHVOL0029E": _("Unable to upload chunk data to storage volume. Details: %(err)s."),

     "KCHIFACE0001E": _("Interface %(name)s does not exist"),

diff --git a/src/kimchi/mockmodel.py b/src/kimchi/mockmodel.py
index ae97354..4395883 100644
--- a/src/kimchi/mockmodel.py
+++ b/src/kimchi/mockmodel.py
@@ -38,7 +38,7 @@ from kimchi.model.libvirtstoragepool import IscsiPoolDef, NetfsPoolDef
 from kimchi.model.libvirtstoragepool import StoragePoolDef
 from kimchi.model.model import Model
 from kimchi.model.storagepools import StoragePoolModel
-from kimchi.model.storagevolumes import StorageVolumesModel
+from kimchi.model.storagevolumes import StorageVolumeModel, StorageVolumesModel
 from kimchi.model.templates import LibvirtVMTemplate
 from kimchi.model.users import PAMUsersModel
 from kimchi.model.groups import PAMGroupsModel
@@ -112,6 +112,7 @@ class MockModel(Model):
         DeviceModel.lookup = self._mock_device_lookup
         StoragePoolModel._update_lvm_disks = self._update_lvm_disks
         StorageVolumesModel.get_list = self._mock_storagevolumes_get_list
+        StorageVolumeModel.doUpload = self._mock_storagevolume_doUpload
         DebugReportsModel._gen_debugreport_file = self._gen_debugreport_file
         LibvirtVMTemplate._get_volume_path = self._get_volume_path
         VMTemplate.get_iso_info = self._probe_image
@@ -313,6 +314,21 @@ class MockModel(Model):

         return self._model_storagevolume_lookup(pool, vol)

+    def _mock_storagevolume_doUpload(self, vol, offset, data, data_size):
+        vol_path = vol.path()
+
+        # MockModel does not create the storage volume as a file
+        # So create it to do the file upload
+        if offset == 0:
+            dirname = os.path.dirname(vol_path)
+            if not os.path.exists(dirname):
+                os.makedirs(dirname)
+            open(vol_path, 'w').close()
+
+        with open(vol_path, 'a') as fd:
+            fd.seek(offset)
+            fd.write(data)
+
     def _mock_partitions_get_list(self):
         return self._mock_partitions.partitions.keys()

diff --git a/src/kimchi/model/storagevolumes.py b/src/kimchi/model/storagevolumes.py
index cf4611d..0afc74b 100644
--- a/src/kimchi/model/storagevolumes.py
+++ b/src/kimchi/model/storagevolumes.py
@@ -21,6 +21,7 @@ import contextlib
 import lxml.etree as ET
 import os
 import tempfile
+import threading
 import time
 import urllib2
 from lxml.builder import E
@@ -44,12 +45,11 @@ VOLUME_TYPE_MAP = {0: 'file',
                    2: 'directory',
                    3: 'network'}
-
 READ_CHUNK_SIZE = 1048576  # 1 MiB
-
-
 REQUIRE_NAME_PARAMS = ['capacity']

+upload_volumes = dict()
+

 class StorageVolumesModel(object):
     def __init__(self, **kargs):
@@ -186,10 +186,13 @@ class StorageVolumesModel(object):
                                       objstore=self.objstore).lookup(pool_name,
                                                                      name)

+        vol_path = vol_info['path']
+        if params.get('upload', False):
+            upload_volumes[vol_path] = {'lock': threading.Lock(), 'offset': 0}
+
         try:
             with self.objstore as session:
-                session.store('storagevolume', vol_info['path'],
-                              {'ref_cnt': 0})
+                session.store('storagevolume', vol_path, {'ref_cnt': 0})
         except Exception as e:
             # If the storage volume was created flawlessly, then lets hide this
             # error to avoid more error in the VM creation process
@@ -491,6 +494,46 @@ class StorageVolumeModel(object):

         cb('OK', True)

+    def doUpload(self, vol, offset, data, data_size):
+        try:
+            st = self.conn.get().newStream(0)
+            vol.upload(st, offset, data_size)
+            st.send(data)
+            st.finish()
+        except Exception as e:
+            st and st.abort()
+            raise OperationFailed("KCHVOL0029E", {"err": e.message})
+
+    def update(self, pool, name, params):
+        chunk_data = params['chunk'].fullvalue()
+        chunk_size = int(params['chunk_size'])
+
+        if len(chunk_data) != chunk_size:
+            raise OperationFailed("KCHVOL0026E")
+
+        vol = StorageVolumeModel.get_storagevolume(pool, name, self.conn)
+        vol_path = vol.path()
+        vol_capacity = vol.info()[1]
+
+        vol_data = upload_volumes.get(vol_path)
+        if vol_data is None:
+            raise OperationFailed("KCHVOL0027E", {"vol": vol_path})
+
+        lock = vol_data['lock']
+        offset = vol_data['offset']
+        if (offset + chunk_size) > vol_capacity:
+            raise OperationFailed("KCHVOL0028E")
+
+        with lock:
+            self.doUpload(vol, offset, chunk_data, chunk_size)
+
+        vol_data['offset'] += chunk_size
+        if vol_data['offset'] == vol_capacity:
+            del upload_volumes[vol_path]
+
+        # Refresh to make sure volume can be found in following lookup
+        StoragePoolModel.get_storagepool(pool, self.conn).refresh(0)
+

 class IsoVolumesModel(object):
     def __init__(self, **kargs):

diff --git a/tests/test_model_storagevolume.py b/tests/test_model_storagevolume.py
index a3c3ce3..005ac75 100644
--- a/tests/test_model_storagevolume.py
+++ b/tests/test_model_storagevolume.py
@@ -149,24 +149,83 @@ def _do_volume_test(self, model, host, ssl_port, pool_name):
             resp = self.request(vol_uri)
             self.assertEquals(404, resp.status)

-            # Create storage volume with 'file'
-            filepath = os.path.join(paths.get_prefix(), 'COPYING.LGPL')
-            url = 'https://%s:%s' % (host, ssl_port) + uri
-            with open(filepath, 'rb') as fd:
-                r = requests.post(url, files={'file': fd},
-                                  verify=False,
-                                  headers=fake_auth_header())
-
+            # Storage volume upload
+            # It is done through a sequence of POST and several PUT requests
+            filename = 'COPYING.LGPL'
+            filepath = os.path.join(paths.get_prefix(), filename)
+            filesize = os.stat(filepath).st_size
+
+            # Create storage volume for upload
+            req = json.dumps({'name': filename, 'format': 'raw',
+                              'capacity': filesize, 'upload': True})
+            resp = self.request(uri, req, 'POST')
             if pool_info['type'] in READONLY_POOL_TYPE:
-                self.assertEquals(r.status_code, 400)
+                self.assertEquals(400, resp.status)
             else:
-                rollback.prependDefer(model.storagevolume_delete, pool_name,
-                                      'COPYING.LGPL')
-                self.assertEquals(r.status_code, 202)
-                task = r.json()
-                wait_task(_task_lookup, task['id'])
-                resp = self.request(uri + '/COPYING.LGPL')
+                rollback.prependDefer(rollback_wrapper,
+                                      model.storagevolume_delete,
+                                      pool_name, filename)
+                self.assertEquals(202, resp.status)
+                task_id = json.loads(resp.read())['id']
+                wait_task(_task_lookup, task_id)
+                status = json.loads(self.request('/tasks/%s' % task_id).read())
+                self.assertEquals('finished', status['status'])
+
+                # Upload volume content
+                url = 'https://%s:%s' % (host, ssl_port) + uri + '/' + filename
+
+                # Create a file with 5M to upload
+                # Max body size is set to 4M so the upload will fail with 413
+                newfile = '/tmp/5m-file'
+                with open(newfile, 'wb') as fd:
+                    fd.seek(5*1024*1024-1)
+                    fd.write("\0")
+                rollback.prependDefer(os.remove, newfile)
+
+                with open(newfile, 'rb') as fd:
+                    with open(newfile + '.tmp', 'wb') as tmp_fd:
+                        data = fd.read()
+                        tmp_fd.write(data)
+
+                with open(newfile + '.tmp', 'rb') as tmp_fd:
+                    r = requests.put(url, data={'chunk_size': len(data)},
+                                     files={'chunk': tmp_fd},
+                                     verify=False,
+                                     headers=fake_auth_header())
+                self.assertEquals(r.status_code, 413)
+
+                # Do upload
+                index = 0
+                chunk_size = 2 * 1024
+
+                with open(filepath, 'rb') as fd:
+                    while True:
+                        with open(filepath + '.tmp', 'wb') as tmp_fd:
+                            fd.seek(index*chunk_size)
+                            data = fd.read(chunk_size)
+                            tmp_fd.write(data)
+
+                        with open(filepath + '.tmp', 'rb') as tmp_fd:
+                            r = requests.put(url,
+                                             data={'chunk_size': len(data)},
+                                             files={'chunk': tmp_fd},
+                                             verify=False,
+                                             headers=fake_auth_header())
+                        self.assertEquals(r.status_code, 200)
+                        index = index + 1
+
+                        if len(data) < chunk_size:
+                            break
+
+                rollback.prependDefer(os.remove, filepath + '.tmp')
+                resp = self.request(uri + '/' + filename)
                 self.assertEquals(200, resp.status)
+                uploaded_path = json.loads(resp.read())['path']
+                with open(filepath) as fd:
+                    content = fd.read()
+
+                with open(uploaded_path) as fd:
+                    uploaded_content = fd.read()
+
+                self.assertEquals(content, uploaded_content)

             # Create storage volume with 'url'
             url = 'https://github.com/kimchi-project/kimchi/raw/master/COPYING'

diff --git a/tests/test_rest.py b/tests/test_rest.py
index c2ebd88..914b602 100644
--- a/tests/test_rest.py
+++ b/tests/test_rest.py
@@ -21,7 +21,6 @@
 import json
 import os
 import re
-import requests
 import time
 import unittest
 import urllib2
@@ -35,7 +34,7 @@
 import kimchi.server
 from kimchi.osinfo import get_template_default
 from kimchi.rollbackcontext import RollbackContext
 from kimchi.utils import add_task
-from utils import fake_auth_header, get_free_port, patch_auth, request
+from utils import get_free_port, patch_auth, request
 from utils import run_server, wait_task
@@ -1188,48 +1187,6 @@ class RestTests(unittest.TestCase):
         resp = self.request('%s/fedora-fake' % base_uri, '{}', 'DELETE')
         self.assertEquals(204, resp.status)

-    def test_upload(self):
-        with RollbackContext() as rollback:
-            url = "https://%s:%s/storagepools/default-pool/storagevolumes" % \
-                (host, ssl_port)
-
-            # Create a file with 3M to upload
-            vol_path = '/tmp/3m-file'
-            with open(vol_path, 'wb') as fd:
-                fd.seek(3*1024*1024-1)
-                fd.write("\0")
-            rollback.prependDefer(os.remove, vol_path)
-
-            with open(vol_path, 'rb') as fd:
-                r = requests.post(url,
-                                  files={'file': fd},
-                                  verify=False,
-                                  headers=fake_auth_header())
-
-            self.assertEquals(r.status_code, 202)
-            task = r.json()
-            wait_task(self._task_lookup, task['id'], 15)
-            uri = '/storagepools/default-pool/storagevolumes/%s'
-            resp = self.request(uri % task['target_uri'].split('/')[-1])
-
-            self.assertEquals(200, resp.status)
-
-            # Create a file with 5M to upload
-            # Max body size is set to 4M so the upload will fail with 413
-            vol_path = '/tmp/5m-file'
-            with open(vol_path, 'wb') as fd:
-                fd.seek(5*1024*1024-1)
-                fd.write("\0")
-            rollback.prependDefer(os.remove, vol_path)
-
-            with open(vol_path, 'rb') as fd:
-                r = requests.post(url,
-                                  files={'file': fd},
-                                  verify=False,
-                                  headers=fake_auth_header())
-
-            self.assertEquals(r.status_code, 413)
-

 class HttpsRestTests(RestTests):
     """
--
2.1.0
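[Editorial sketch] The POST-then-sequential-PUT protocol described in the commit message can be driven from any client. The following is an illustrative sketch, not part of the patch: the helper names (`iter_chunks`, `upload_volume`) are made up, and the actual PUT is abstracted behind a `put` callable, which in practice would be built with `requests.put` passing the chunk as form data.

```python
def iter_chunks(path, chunk_size):
    """Yield the file's content one chunk at a time, in order."""
    with open(path, 'rb') as fd:
        while True:
            data = fd.read(chunk_size)
            if not data:
                break
            yield data


def upload_volume(path, put, chunk_size=2 * 1024):
    """Send each chunk with one PUT request.

    'put' is any callable taking (chunk, chunk_size); in the real tests
    it would wrap requests.put(url, data={'chunk_size': n},
    files={'chunk': ...}). The calls must stay sequential: Kimchi keeps
    the file offset server-side, so parallel PUTs would corrupt data.
    """
    for chunk in iter_chunks(path, chunk_size):
        put(chunk, len(chunk))
```

The last chunk is simply shorter than `chunk_size`, which matches the loop-termination check (`len(data) < chunk_size`) used in the test case above.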

On 11-05-2015 15:34, Aline Manera wrote:
+ "chunk_size": { + "description": "Chunk size of uploaded storage volume", + "type": "string", + "error": "KCHVOL0024E", + "required": true + }
Is there a specific reason why "chunk_size" is not a number? It should be a number.
+        lock = vol_data['lock']
+        offset = vol_data['offset']
+        if (offset + chunk_size) > vol_capacity:
+            raise OperationFailed("KCHVOL0028E")
+
+        with lock:
+            self.doUpload(vol, offset, chunk_data, chunk_size)
+
+        vol_data['offset'] += chunk_size
+        if vol_data['offset'] == vol_capacity:
+            del upload_volumes[vol_path]
+
I think everything from "lock = vol_data['lock']" down to "del upload_volumes[vol_path]" should be enclosed by the lock, not just the call to "doUpload". Suppose two PUT commands are sent one right after the other: both of them will use "offset" with the value it had before both calls, so the second command will actually write over what the first command has written. You're locking the access to "doUpload", but "offset" (which controls where "doUpload" will write) isn't locked from concurrent access.
+        # Refresh to make sure volume can be found in following lookup
+        StoragePoolModel.get_storagepool(pool, self.conn).refresh(0)
There's no need to call "refresh" here. The volume already existed before (and after...) the call to "update". That function is only needed when you add a file manually to a storage pool directory and you want libvirt to see it. This is not the case here.
+                # Create a file with 5M to upload
+                # Max body size is set to 4M so the upload will fail with 413
"4M"? Why? I'd say the volume capacity there is the size of COPYING.LGPL, which is about ~7 KiB, not 4 MiB. I understand that the test below will fail anyway, but the comment doesn't seem right.
+                newfile = '/tmp/5m-file'
+                with open(newfile, 'wb') as fd:
+                    fd.seek(5*1024*1024-1)
+                    fd.write("\0")
+                rollback.prependDefer(os.remove, newfile)
+
+                with open(newfile, 'rb') as fd:
+                    with open(newfile + '.tmp', 'wb') as tmp_fd:
+                        data = fd.read()
+                        tmp_fd.write(data)
+
+                with open(newfile + '.tmp', 'rb') as tmp_fd:
+                    r = requests.put(url, data={'chunk_size': len(data)},
+                                     files={'chunk': tmp_fd},
+                                     verify=False,
+                                     headers=fake_auth_header())
+                self.assertEquals(r.status_code, 413)
Why going all the trouble of creating one file with \0s, then copying it to another file? Why not using only one?
+                # Do upload
+                index = 0
+                chunk_size = 2 * 1024
+
+                with open(filepath, 'rb') as fd:
+                    while True:
+                        with open(filepath + '.tmp', 'wb') as tmp_fd:
+                            fd.seek(index*chunk_size)
+                            data = fd.read(chunk_size)
+                            tmp_fd.write(data)
Again, why do you need to create a new file here with a copy of the original data?
+                with open(filepath) as fd:
+                    content = fd.read()
Why don't you use the existing loop above to set the variable "content", one chunk at a time? The code has already read the file once, its contents haven't changed. I feel that there's too much unnecessary I/O going on here...

On 12/05/2015 13:38, Crístian Deives wrote:
On 11-05-2015 15:34, Aline Manera wrote:
+ "chunk_size": { + "description": "Chunk size of uploaded storage volume", + "type": "string", + "error": "KCHVOL0024E", + "required": true + }
Is there a specific reason why "chunk_size" is not a number? It should be a number.
It is because the data is sent in the request body, so all data is encoded as unicode. As you can see in the tests, the chunk_size is actually an integer, but it will be encoded as unicode for the request.
+        lock = vol_data['lock']
+        offset = vol_data['offset']
+        if (offset + chunk_size) > vol_capacity:
+            raise OperationFailed("KCHVOL0028E")
+
+        with lock:
+            self.doUpload(vol, offset, chunk_data, chunk_size)
+
+        vol_data['offset'] += chunk_size
+        if vol_data['offset'] == vol_capacity:
+            del upload_volumes[vol_path]
+
I think everything from "lock = vol_data['lock']" down to "del upload_volumes[vol_path]" should be enclosed by the lock, not just the call to "doUpload". Suppose two PUT commands are sent one right after the other: both of them will use "offset" with the value it had before both calls, so the second command will actually write over what the first command has written. You're locking the access to "doUpload", but "offset" (which controls where "doUpload" will write) isn't locked from concurrent access.
Agree. I will change it and send V2.
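[Editorial sketch] The fix agreed on above would widen the critical section so the offset read, the bounds check, the write, and the bookkeeping all happen under the per-volume lock. A minimal sketch of the idea, not the actual V2 patch (names and the error type are illustrative):

```python
import threading

# Per-volume upload state, as in the patch: a lock plus the next offset.
upload_volumes = {}


def handle_chunk(vol_path, vol_capacity, chunk_data, do_upload):
    """Write one chunk; everything touching 'offset' is inside the lock,
    so two concurrent PUTs can no longer observe the same offset."""
    vol_data = upload_volumes[vol_path]
    with vol_data['lock']:
        offset = vol_data['offset']
        if offset + len(chunk_data) > vol_capacity:
            # stands in for OperationFailed("KCHVOL0028E")
            raise ValueError("chunk would exceed the volume size")
        do_upload(offset, chunk_data)
        vol_data['offset'] += len(chunk_data)
        if vol_data['offset'] == vol_capacity:
            del upload_volumes[vol_path]
```

With this shape, two back-to-back PUTs serialize on the lock and each sees the offset left by the previous one.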
+        # Refresh to make sure volume can be found in following lookup
+        StoragePoolModel.get_storagepool(pool, self.conn).refresh(0)
There's no need to call "refresh" here. The volume already existed before (and after...) the call to "update". That function is only needed when you add a file manually to a storage pool directory and you want libvirt to see it. This is not the case here.
ACK.
+                # Create a file with 5M to upload
+                # Max body size is set to 4M so the upload will fail with 413
"4M"? Why? I'd say the volume capacity there is the size of COPYING.LGPL, which is about ~7 KiB, not 4 MiB. I understand that the test below will fail anyway, but the comment doesn't seem right.
The max body size is a server configuration. It represents how much data the request body can have. The test below is not related to the COPYING.LGPL file. It creates a new file, '/tmp/5m-file', which is 5M in size (it is an empty file), and then tries to upload all its content using the upload API. The test will fail as the max body size is 4M and the content is 5M.
+                newfile = '/tmp/5m-file'
+                with open(newfile, 'wb') as fd:
+                    fd.seek(5*1024*1024-1)
+                    fd.write("\0")
+                rollback.prependDefer(os.remove, newfile)
+
+                with open(newfile, 'rb') as fd:
+                    with open(newfile + '.tmp', 'wb') as tmp_fd:
+                        data = fd.read()
+                        tmp_fd.write(data)
+
+                with open(newfile + '.tmp', 'rb') as tmp_fd:
+                    r = requests.put(url, data={'chunk_size': len(data)},
+                                     files={'chunk': tmp_fd},
+                                     verify=False,
+                                     headers=fake_auth_header())
+                self.assertEquals(r.status_code, 413)
Why going all the trouble of creating one file with \0s, then copying it to another file? Why not using only one?
It is a requirement on how to build the request: the file descriptor used for the request must have the data written; otherwise, it will be assumed to be empty.
+                # Do upload
+                index = 0
+                chunk_size = 2 * 1024
+
+                with open(filepath, 'rb') as fd:
+                    while True:
+                        with open(filepath + '.tmp', 'wb') as tmp_fd:
+                            fd.seek(index*chunk_size)
+                            data = fd.read(chunk_size)
+                            tmp_fd.write(data)
Again, why do you need to create a new file here with a copy of the original data?
Same as I said before. And also, the first file descriptor will point to the entire file and the second one will point to a piece of content to be uploaded.
+                with open(filepath) as fd:
+                    content = fd.read()
Why don't you use the existing loop above to set the variable "content", one chunk at a time? The code has already read the file once, its contents haven't changed. I feel that there's too much unnecessary I/O going on here...
ACK.
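[Editorial sketch] One way to cut the extra I/O acknowledged above is to keep each chunk in memory instead of writing a '.tmp' copy to disk: requests accepts an in-memory file object in `files=` just like a real file. This is an illustrative sketch under that assumption, not the actual test change (the helper name is made up):

```python
import io


def chunks_as_file_objects(path, chunk_size):
    """Read each chunk of the source file exactly once and wrap it in a
    BytesIO, so no '.tmp' copy is written to disk. Each yielded object
    can be passed to requests as files={'chunk': fileobj}."""
    with open(path, 'rb') as fd:
        while True:
            data = fd.read(chunk_size)
            if not data:
                break
            yield io.BytesIO(data), len(data)
```

The same iteration could also accumulate the chunks to build the `content` variable, addressing the reviewer's point about reading the file twice.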

On 12-05-2015 13:59, Aline Manera wrote:
Is there a specific reason why "chunk_size" is not a number? It should be a number.
It is because the data is sent in the request body, so all data is encoded as unicode. As you can see in the tests, the chunk_size is actually an integer, but it will be encoded as unicode for the request.
Still, this doesn't look good to me :-(
"4M"? Why? I'd say the volume capacity there is the size of COPYING.LGPL, which is about ~7 KiB, not 4 MiB. I understand that the test below will fail anyway, but the comment doesn't seem right.
The max body size is a server configuration. It represents how much data the request body can have.
The test below is not related to the COPYING.LGPL file. It creates a new file, '/tmp/5m-file', which is 5M in size (it is an empty file), and then tries to upload all its content using the upload API. The test will fail as the max body size is 4M and the content is 5M.
OK, thanks for the explanation! I thought this test was trying to write more than what the storage volume can hold (KCHVOL0028E), but this doesn't seem to have anything to do with the upload operation itself; this is a generic limitation for all REST commands.
+                newfile = '/tmp/5m-file'
+                with open(newfile, 'wb') as fd:
+                    fd.seek(5*1024*1024-1)
+                    fd.write("\0")
+                rollback.prependDefer(os.remove, newfile)
+
+                with open(newfile, 'rb') as fd:
+                    with open(newfile + '.tmp', 'wb') as tmp_fd:
+                        data = fd.read()
+                        tmp_fd.write(data)
+
+                with open(newfile + '.tmp', 'rb') as tmp_fd:
+                    r = requests.put(url, data={'chunk_size': len(data)},
+                                     files={'chunk': tmp_fd},
+                                     verify=False,
+                                     headers=fake_auth_header())
+                self.assertEquals(r.status_code, 413)
Why going all the trouble of creating one file with \0s, then copying it to another file? Why not using only one?
It is a requirement on how to build the request: the file descriptor used for the request must have the data written; otherwise, it will be assumed to be empty.
+                # Do upload
+                index = 0
+                chunk_size = 2 * 1024
+
+                with open(filepath, 'rb') as fd:
+                    while True:
+                        with open(filepath + '.tmp', 'wb') as tmp_fd:
+                            fd.seek(index*chunk_size)
+                            data = fd.read(chunk_size)
+                            tmp_fd.write(data)
Again, why do you need to create a new file here with a copy of the original data?
Same as I said before. And also, the first file descriptor will point to the entire file and the second one will point to a piece of content to be uploaded.
So you mean the browser will also need to create one file for each data chunk and copy its contents locally before sending it to the server? I.e. will uploading a 4 GiB ISO file copy 4 GiB of local data + send 4 GiB of remote data? Isn't it possible to pass the original file handler seek'd to the desired offset, and the server only reads "chunk_size" bytes?

On 12/05/2015 16:31, Crístian Deives wrote:
On 12-05-2015 13:59, Aline Manera wrote:
Is there a specific reason why "chunk_size" is not a number? It should be a number.
It is because the data is sent in the request body, so all data is encoded as unicode. As you can see in the tests, the chunk_size is actually an integer, but it will be encoded as unicode for the request.
Still, this doesn't look good to me :-(
Sorry! But as it is in the request body, we need to deal with it.
"4M"? Why? I'd say the volume capacity there is the size of COPYING.LGPL, which is about ~7 KiB, not 4 MiB. I understand that the test below will fail anyway, but the comment doesn't seem right.
The max body size is a server configuration. It represents how much data the request body can have.
The test below is not related to the COPYING.LGPL file. It creates a new file, '/tmp/5m-file', which is 5M in size (it is an empty file), and then tries to upload all its content using the upload API. The test will fail as the max body size is 4M and the content is 5M.
OK, thanks for the explanation! I thought this test was trying to write more than what the storage volume can hold (KCHVOL0028E), but this doesn't seem to have anything to do with the upload operation itself; this is a generic limitation for all REST commands.
You are correct when you say it is a generic limitation for all REST commands. But the upload API is the only one which accepts the request body this way, so the test is made using it.
+                newfile = '/tmp/5m-file'
+                with open(newfile, 'wb') as fd:
+                    fd.seek(5*1024*1024-1)
+                    fd.write("\0")
+                rollback.prependDefer(os.remove, newfile)
+
+                with open(newfile, 'rb') as fd:
+                    with open(newfile + '.tmp', 'wb') as tmp_fd:
+                        data = fd.read()
+                        tmp_fd.write(data)
+
+                with open(newfile + '.tmp', 'rb') as tmp_fd:
+                    r = requests.put(url, data={'chunk_size': len(data)},
+                                     files={'chunk': tmp_fd},
+                                     verify=False,
+                                     headers=fake_auth_header())
+                self.assertEquals(r.status_code, 413)
Why going all the trouble of creating one file with \0s, then copying it to another file? Why not using only one?
It is a requirement on how to build the request: the file descriptor used for the request must have the data written; otherwise, it will be assumed to be empty.
+                # Do upload
+                index = 0
+                chunk_size = 2 * 1024
+
+                with open(filepath, 'rb') as fd:
+                    while True:
+                        with open(filepath + '.tmp', 'wb') as tmp_fd:
+                            fd.seek(index*chunk_size)
+                            data = fd.read(chunk_size)
+                            tmp_fd.write(data)
Again, why do you need to create a new file here with a copy of the original data?
Same as I said before. And also, the first file descriptor will point to the entire file and the second one will point to a piece of content to be uploaded.
So you mean the browser will also need to create one file for each data chunk and copy its contents locally before sending it to the server?
On the UI we will use the FormData object. That means we get the file, read what we need to insert into the FormData object, and send the request. Then we read the next chunk and do it all again.
I.e. will uploading a 4 GiB ISO file copy 4 GiB of local data + send 4 GiB of remote data?
Isn't it possible to pass the original file handler seek'd to the desired offset, and the server only reads "chunk_size" bytes?
Passing the original file handler means passing a large amount of data which will hang the browser.
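[Editorial sketch] On the test-suite (Python) side, the reviewer's suggestion amounts to seeking into the original file and reading only `chunk_size` bytes per request, rather than copying the whole file. A hypothetical helper illustrating this (the name is made up):

```python
def read_chunk(path, index, chunk_size):
    """Return only the index-th slice of the file. Nothing else is read
    or copied, so memory use stays bounded by chunk_size regardless of
    how large the source file is."""
    with open(path, 'rb') as fd:
        fd.seek(index * chunk_size)
        return fd.read(chunk_size)
```

The browser-side analogue is `File.slice(offset, offset + chunk_size)`, which produces a Blob view without copying the whole file.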

Now the upload process is a sequence of one POST and several PUT requests to the storage volume, to avoid crashing the browser. So remove the former way to upload a file to the Kimchi server, which used the 'file' parameter in the POST request.

Signed-off-by: Aline Manera <alinefm@linux.vnet.ibm.com>
---
 docs/API.md                        |  3 +--
 src/kimchi/mockmodel.py            | 16 +++++-----------
 src/kimchi/model/storagevolumes.py | 36 ++----------------------------------
 3 files changed, 8 insertions(+), 47 deletions(-)

diff --git a/docs/API.md b/docs/API.md
index 88c5fec..71f2539 100644
--- a/docs/API.md
+++ b/docs/API.md
@@ -481,7 +481,7 @@ A interface represents available network interface on VM.
   in the defined Storage Pool
 * **POST**: Create a new Storage Volume in the Storage Pool
             The return resource is a task resource * See Resource: Task *
-            Only one of 'file', 'capacity', 'url' can be specified.
+            Only one of 'capacity', 'url' can be specified.
     * name: The name of the Storage Volume
     * capacity: The total space which can be used to store volumes
                 The unit is bytes
@@ -490,7 +490,6 @@ A interface represents available network interface on VM.
     * upload: True to start an upload process. False, otherwise.
               Only used when creating a storage volume 'capacity' parameter.
-    * file: File to be uploaded, passed through form data
     * url: URL to be downloaded

 ### Resource: Storage Volume

diff --git a/src/kimchi/mockmodel.py b/src/kimchi/mockmodel.py
index 4395883..b205608 100644
--- a/src/kimchi/mockmodel.py
+++ b/src/kimchi/mockmodel.py
@@ -282,21 +282,15 @@ class MockModel(Model):
         kimchi_log.info("The host system will be rebooted")

     def _mock_storagevolumes_create(self, pool, params):
-        vol_source = ['file', 'url', 'capacity']
+        vol_source = ['url', 'capacity']
         index_list = list(i for i in range(len(vol_source))
                           if vol_source[i] in params)
         create_param = vol_source[index_list[0]]
         name = params.get('name')
-        if name is None:
-            if create_param == 'file':
-                name = os.path.basename(params['file'].filename)
-                del params['file']
-                params['capacity'] = 1024
-            elif create_param == 'url':
-                name = os.path.basename(params['url'])
-                del params['url']
-                params['capacity'] = 1024
-            params['name'] = name
+        if name is None and create_param == 'url':
+            params['name'] = os.path.basename(params['url'])
+            del params['url']
+            params['capacity'] = 1024

         return self._model_storagevolumes_create(pool, params)

diff --git a/src/kimchi/model/storagevolumes.py b/src/kimchi/model/storagevolumes.py
index 0afc74b..e2128f1 100644
--- a/src/kimchi/model/storagevolumes.py
+++ b/src/kimchi/model/storagevolumes.py
@@ -58,7 +58,7 @@ class StorageVolumesModel(object):
         self.task = TaskModel(**kargs)

     def create(self, pool_name, params):
-        vol_source = ['file', 'url', 'capacity']
+        vol_source = ['url', 'capacity']

         name = params.get('name')

@@ -89,9 +89,7 @@
         # if 'name' is omitted - except for the methods listed in
         # 'REQUIRE_NAME_PARAMS' - the default volume name will be the
         # file/URL basename.
-        if create_param == 'file':
-            name = os.path.basename(params['file'].filename)
-        elif create_param == 'url':
+        if create_param == 'url':
             name = os.path.basename(params['url'])
         else:
             name = 'upload-%s' % int(time.time())
@@ -120,36 +118,6 @@
         taskid = add_task(targeturi, create_func, self.objstore, params)
         return self.task.lookup(taskid)

-    def _create_volume_with_file(self, cb, params):
-        pool_name = params.pop('pool')
-        dir_path = StoragePoolModel(
-            conn=self.conn, objstore=self.objstore).lookup(pool_name)['path']
-        file_path = os.path.join(dir_path, params['name'])
-        if os.path.exists(file_path):
-            raise InvalidParameter('KCHVOL0001E', {'name': params['name']})
-
-        upload_file = params['file']
-        f_len = upload_file.fp.length
-        try:
-            size = 0
-            with open(file_path, 'wb') as f:
-                while True:
-                    data = upload_file.file.read(READ_CHUNK_SIZE)
-                    if not data:
-                        break
-                    size += len(data)
-                    f.write(data)
-                    cb('%s/%s' % (size, f_len))
-        except Exception as e:
-            raise OperationFailed('KCHVOL0007E',
-                                  {'name': params['name'],
-                                   'pool': pool_name,
-                                   'err': e.message})
-
-        # Refresh to make sure volume can be found in following lookup
-        StoragePoolModel.get_storagepool(pool_name, self.conn).refresh(0)
-        cb('OK', True)
-
     def _create_volume_with_capacity(self, cb, params):
         pool_name = params.pop('pool')
         vol_xml = """
--
2.1.0
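The flow the commit message describes — one POST to create the volume, then several PUT requests, each carrying one piece of the file — can be sketched as a client-side request plan. This is a hypothetical outline only; the 'chunk' and 'chunk_size' form-data field names are assumptions taken from the discussion in this thread, not a confirmed API:

```python
def upload_sequence(name, data, chunk_size=8 * 1024 * 1024):
    """Yield (method, fields) pairs for the POST-then-PUTs upload flow.

    Field names are assumptions based on the list discussion."""
    # One POST creates the (empty) volume and starts the upload task.
    yield ('POST', {'name': name, 'capacity': len(data), 'upload': True})
    # Several PUTs follow, each carrying one chunk as form data, so the
    # browser never has to hold the whole file in a single request.
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        yield ('PUT', {'chunk': chunk, 'chunk_size': len(chunk)})

reqs = list(upload_sequence('disk.img', b'\x00' * (20 * 1024 * 1024)))
print([m for m, _ in reqs])  # → ['POST', 'PUT', 'PUT', 'PUT']
```

Splitting the transfer this way is what lets the server track progress per chunk instead of buffering one huge request body.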

Shouldn't the UI be updated to re-enable file uploads now?

On 12/05/2015 12:46, Crístian Deives wrote:
Shouldn't the UI be updated to re-enable file uploads now?
I am working on the UI right now. As upload is currently disabled in the UI, I don't see a problem with merging the backend first. Do you agree?

On 12-05-2015 13:11, Aline Manera wrote:
I am working on the UI right now. As upload is currently disabled in the UI, I don't see a problem with merging the backend first. Do you agree?
Well, there's no problem in merging this code (as in "it won't break anything"), but there's no way for anyone to actually test a file upload and make sure the code works "in real life" (other than writing a test function...). I can't use the REST API from the command line because this command requires passing a real object instead of JSON data in the variable "chunk", and there's no UI to use it either. So only with the UI will I be able to actually test whether the implementation works.
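Since the PUT takes form data rather than JSON, one way to drive the endpoint from a script (without the UI) would be to build the multipart/form-data body by hand with the stdlib. A hedged sketch: the 'chunk' and 'chunk_size' field names come from this thread, and the endpoint path in the comment is an assumption that should be checked against API.json:

```python
import io

def multipart_body(fields, boundary='----kimchi-upload'):
    """Encode (name, value) pairs as a multipart/form-data body.

    bytes values become file parts (like the 'chunk' itself);
    everything else becomes a plain text field (like 'chunk_size')."""
    b = boundary.encode()
    buf = io.BytesIO()
    for name, value in fields:
        buf.write(b'--' + b + b'\r\n')
        if isinstance(value, bytes):
            buf.write(b'Content-Disposition: form-data; name="' +
                      name.encode() + b'"; filename="' + name.encode() +
                      b'"\r\n')
            buf.write(b'Content-Type: application/octet-stream\r\n\r\n')
            buf.write(value)
        else:
            buf.write(b'Content-Disposition: form-data; name="' +
                      name.encode() + b'"\r\n\r\n')
            buf.write(str(value).encode())
        buf.write(b'\r\n')
    buf.write(b'--' + b + b'--\r\n')
    return buf.getvalue()

body = multipart_body([('chunk', b'\x00' * 1024), ('chunk_size', 1024)])
# With a live server this body would be sent as, e.g. (path assumed):
#   PUT /storagepools/default/storagevolumes/disk.img
#   Content-Type: multipart/form-data; boundary=----kimchi-upload
```

The same body could also be produced by curl's -F option, which is why a command-line test should be possible once the field names are settled.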

On 12/05/2015 13:44, Crístian Deives wrote:
Well, there's no problem in merging this code (as in "it won't break anything"), but there's no way for anyone to actually test a file upload and make sure the code works "in real life" (other than writing a test function...). I can't use the REST API from the command line because this command requires passing a real object instead of JSON data in the variable "chunk", and there's no UI to use it either. So only with the UI will I be able to actually test whether the implementation works.
Well, we have a specific test case that exercises file upload. So if the test passes...

That assumption didn't seem to work well with the current implementation of the upload feature, where it works in the tests but not in "real life" ☺

On Tue, May 12, 2015, 13:46 Aline Manera <alinefm@linux.vnet.ibm.com> wrote:
Well, we have a specific test case that exercises file upload. So if the test passes...
participants (3)
- Aline Manera
- Crístian Deives
- Crístian Viana