[Kimchi-devel] [PATCH 3/4] Upload storage volume

Tue May 12 16:38:18 UTC 2015

On 11-05-2015 15:34, Aline Manera wrote:
> +                "chunk_size": {
> +                    "description": "Chunk size of uploaded storage volume",
> +                    "type": "string",
> +                    "error": "KCHVOL0024E",
> +                    "required": true
> +                }

Is there a specific reason to why "chunk_size" is not a number? It 
should be a number.

> +
> +        lock = vol_data['lock']
> +        offset = vol_data['offset']
> +        if (offset + chunk_size) > vol_capacity:
> +            raise OperationFailed("KCHVOL0028E")
> +
> +        with lock:
> +            self.doUpload(vol, offset, chunk_data, chunk_size)
> +
> +        vol_data['offset'] += chunk_size
> +        if vol_data['offset'] == vol_capacity:
> +            del upload_volumes[vol_path]
> +

I think everything from "lock = vol_data['lock']" down to "del 
upload_volumes[vol_path]" should be enclosed by the lock, not just the 
call to "doUpload". Suppose two PUT commands are sent one right after 
the other: both of them will use "offset" with the value it had before 
both calls, so the second command will actually write over what the 
first command has written. You're locking the access to "doUpload", but 
"offset" (which controls where "doUpload" will write) isn't locked from 
concurrent access.

> +        # Refresh to make sure volume can be found in following lookup
> +        StoragePoolModel.get_storagepool(pool, self.conn).refresh(0)

There's no need to call "refresh" here. The volume already existed 
before (and after...) the call to "update". That function is only needed 
when you add a file manually to a storage pool directory and you want 
libvirt to see it. This is not the case here.

> +
> +            # Create a file with 5M to upload
> +            # Max body size is set to 4M so the upload will fail with 413

"4M"? Why? I'd say the volume capacity there is the size of 
COPYING.LGPL, which is about ~7 KiB, not 4 MiB. I understand that the 
test below will fail anyway, but the comment doesn't seem right.

> +            newfile = '/tmp/5m-file'
> +            with open(newfile, 'wb') as fd:
> +                fd.seek(5*1024*1024-1)
> +                fd.write("\0")
> +            rollback.prependDefer(os.remove, newfile)
> +
> +            with open(newfile, 'rb') as fd:
> +                with open(newfile + '.tmp', 'wb') as tmp_fd:
> +                    data = fd.read()
> +                    tmp_fd.write(data)
> +
> +                with open(newfile + '.tmp', 'rb') as tmp_fd:
> +                    r = requests.put(url, data={'chunk_size': len(data)},
> +                                     files={'chunk': tmp_fd},
> +                                     verify=False,
> +                                     headers=fake_auth_header())
> +                    self.assertEquals(r.status_code, 413)

Why going all the trouble of creating one file with \0s, then copying it 
to another file? Why not using only one?

> +            # Do upload
> +            index = 0
> +            chunk_size = 2 * 1024
> +
> +            with open(filepath, 'rb') as fd:
> +                while True:
> +                    with open(filepath + '.tmp', 'wb') as tmp_fd:
> +                        fd.seek(index*chunk_size)
> +                        data = fd.read(chunk_size)
> +                        tmp_fd.write(data)

Again, why do you need to create a new file here with a copy of the 
original data?

> +            with open(filepath) as fd:
> +                content = fd.read()

Why don't you use the existing loop above to set the variable "content", 
one chunk at a time? The code has already read the file once, its 
contents haven't changed. I feel that there's too much unnecessary I/O 
going on here...