On Fri, Nov 23, 2018 at 2:49 PM Ranjit DSouza <Ranjit.DSouza(a)veritas.com>
wrote:
...
I am trying to upload a snapshot disk in chunks. Everything seems to work
fine, but I observed that the actual_size after upload is much smaller than
the actual_size of the original disk.
Here are the steps:
1. Take a snapshot of a vm disk and download it (using Image
Transfer mechanism). Save it on the file system somewhere. This disk name
is *3gbdisk*. It is Raw + sparse. Resides on nfs storage. The size of
this downloaded file is 3 GB.
"actual_size" : "*1389109248*", //1 GB
This is the allocated size (what du -sh filename will show).
However, 4.2 does not yet support detecting zero or unallocated areas in
the image, so you always download the complete image. Zero or unallocated
areas are downloaded as zeros.
...
2. Now create a new floating disk, (raw + sparse), with
provisioned_size = 3221225472, or 3 GB. This disk name is vmRestoreDisk
3. Upload to this disk using the Image Transfer API, using libcurl in
chunks of 128 MB. This is done in a while loop, sequentially reading
portions of the file downloaded in step 1 and uploading these chunks via
libcurl. I use the transfer URL, not the proxy URL.
Here is the trace of the first chunk. Note the Content-Range and
Content-Length headers. Start offset = 0, end offset = 134217727 (or 128 MB)
upload request for chunk, start offset: 0, end offset: 134217727
Upload Started
Header:Content-Range: bytes 0-134217727/3221225472
The Content-Range header looks correct...
Header:Content-Length: 3221225472
* Trying 10.210.46.215...
* TCP_NODELAY set
* Connected to pnm86hpch30bl15.pne.ven.veritas.com (10.210.46.215) port 54322 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: O=pne.ven.veritas.com; CN=pnm86hpch30bl15.pne.ven.veritas.com
* start date: Oct 7 08:55:24 2018 GMT
* expire date: Oct 7 08:55:24 2023 GMT
* issuer: C=US; O=pne.ven.veritas.com; CN=pravauto20.pne.ven.veritas.com.59289
* SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
> PUT /images/8ebc9fa8-d322-423e-8a14-5e46ca10ed4e HTTP/1.1
Host: pnm86hpch30bl15.pne.ven.veritas.com:54322
Accept: */*
Content-Length: 134217728
Expect: 100-continue
But you did not send the Content-Range header for this request...
* Done waiting for 100-continue
* We are completely uploaded and fine
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
The request was successful, writing the first 128 MiB...
< Date: Fri, 23 Nov 2018 11:52:53 GMT
< Server: WSGIServer/0.1 Python/2.7.5
< Content-Type: application/json; charset=UTF-8
< Content-Length: 0
<
* Closing connection 0
http response code from curl 200
Upload Finished. Return Value: 0
Looking at the attached trace, you never sent the Content-Range header, so
imageio happily wrote all chunks to the start of the image...
4. Finalize the Image Transfer after all chunks are uploaded.
Observed that the disk status goes from ‘uploading via API’ to finalizing
to OK.
5. Do a GET call on the disk (vmRestoreDisk).
"actual_size" : "*134217728*", //128MB
Which explains why the file size is smaller than expected.
"alias" : "vmRestoreDisk",
"content_type" : "data",
"format" : "*raw*",
"image_id" : "3eda3df2-514a-4e78-b999-1729216b25db",
"propagate_errors" : "false",
"provisioned_size" : "3221225472",
"shareable" : "false",
"*sparse*" : "*true*",
"status" : "ok",
"storage_type" : "image",
"total_size" : "0",
"wipe_after_delete" : "false",
As you can see, the actual size is just 128 MB, not 1 GB. I have attached
the logs of the upload operation. I think I may be missing something, let
me know in case you need further information.
Please always include the relevant part from
/var/log/ovirt-imageio-daemon/daemon.log
If you check this log you will find that all requests for this upload have:
WRITE offset=0 size=134217728 ...
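A minimal sketch of how the per-chunk headers could be computed so that every request carries its own offset. It is written in Python rather than libcurl for brevity; the function name is hypothetical, and the header format simply follows the Content-Range line shown in the trace above.

```python
def content_range_headers(total_size, chunk_size):
    """Yield (offset, length, headers) for each chunk of an upload.

    Hypothetical helper: the caller is expected to send these headers
    with the PUT request for the matching chunk.
    """
    offset = 0
    while offset < total_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, total_size - offset)
        # The end position in Content-Range is inclusive.
        end = offset + length - 1
        headers = {
            "Content-Range": "bytes %d-%d/%d" % (offset, end, total_size),
            # Content-Length is the size of this chunk, not of the image.
            "Content-Length": str(length),
        }
        yield offset, length, headers
        offset += length
```

With the sizes from this thread (3221225472 bytes in 134217728-byte chunks) this yields 24 requests, and only the first one starts at offset 0.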
Other issues I see in the attached trace:
- You close the connection after every request. This is not needed and
  reduces throughput; use the same connection for the entire upload.
- libcurl sends the "Expect: 100-continue" header, but imageio does not
  handle this yet in 4.2. This may cause a 1 second delay for every request,
  while libcurl waits for the "100 Continue" response before sending the
  payload. This feature should be available in 4.3[4]. Until it is
  supported, it would be a good idea to disable the 100-continue header in
  libcurl[5]. If you cannot disable the option, you can change the
  timeout[6] to avoid the delay.
- You don't check the server capabilities using an OPTIONS[0] request. Every
  upload should start by checking the server capabilities so you can
  optimize the upload using zero and flush operations.
- You don't use the ?flush=no query string. This is recommended for
  improving performance. If you use flush=no, you should send a
  PATCH/flush[1] request at the end of the transfer.
- It would be more efficient to send bigger chunks. The chunk size depends
  on the amount of data you are willing to resend if a request fails.
- You can speed up the upload if you detect zero areas in the image and
  send them using a PATCH/zero[2] request.
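As a rough illustration of the last point, zero areas could be detected by scanning the image block by block and merging adjacent blocks of the same kind. This is only a sketch in Python: the function name and the 1 MiB default block size are assumptions, and the actual PUT and PATCH/zero requests are left out because their format is defined by the imageio documentation, not by this example.

```python
def iter_extents(f, block_size=1024 * 1024):
    """Yield (offset, length, is_zero) runs for a file-like object.

    Hypothetical helper: data runs would be uploaded with PUT, and
    zero runs could be sent with a single zero request instead.
    """
    zero_block = bytes(block_size)  # a block of block_size zero bytes
    offset = 0       # position of the next block to read
    start = 0        # start of the current run
    current = None   # True for a zero run, False for a data run
    while True:
        block = f.read(block_size)
        if not block:
            break
        # Compare against a zero buffer of the same length, since the
        # last block may be shorter than block_size.
        is_zero = block == zero_block[:len(block)]
        if current is None:
            current = is_zero
        elif is_zero != current:
            # The kind of block changed, so the current run ends here.
            yield start, offset - start, current
            start = offset
            current = is_zero
        offset += len(block)
    if current is not None:
        yield start, offset - start, current
```

Merging blocks into runs keeps the number of requests small: a large unallocated area becomes one zero request instead of many data chunks.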
For an example using all these features, see the imageio python client[3].
If you can use the client you will get all this for free. Otherwise you can
use it as example code for implementing the upload in another language.
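For a client that cannot use the imageio client, the advice about reusing one connection and sending Content-Range with every chunk might look roughly like the sketch below. It is an illustration only: upload_in_chunks is a hypothetical name, conn is assumed to behave like Python's http.client.HTTPSConnection, and the server capabilities check, flush and zero handling are left out.

```python
def upload_in_chunks(conn, path, src, total_size, chunk_size):
    """Upload src with one PUT per chunk over a single connection.

    conn is assumed to provide request() and getresponse() like
    http.client.HTTPSConnection. Returns the number of bytes sent.
    """
    offset = 0
    while offset < total_size:
        data = src.read(min(chunk_size, total_size - offset))
        if not data:
            raise RuntimeError("unexpected end of file at offset %d" % offset)
        end = offset + len(data) - 1  # inclusive end position
        headers = {
            "Content-Range": "bytes %d-%d/%d" % (offset, end, total_size),
        }
        # http.client adds Content-Length for a bytes body.
        conn.request("PUT", path, body=data, headers=headers)
        res = conn.getresponse()
        res.read()  # drain the response so the connection can be reused
        if res.status != 200:
            raise RuntimeError(
                "upload failed at offset %d: status %d" % (offset, res.status))
        offset += len(data)
    return offset

# Hypothetical usage; host, port, CA file and ticket id are placeholders:
#
#   import http.client, ssl
#   context = ssl.create_default_context(cafile="ca.pem")
#   conn = http.client.HTTPSConnection("host.example.com", 54322,
#                                      context=context)
#   with open("3gbdisk", "rb") as src:
#       upload_in_chunks(conn, "/images/<ticket-id>", src,
#                        3221225472, 128 * 1024**2)
#   conn.close()
```

Python's http.client does not add an "Expect: 100-continue" header on its own, so the delay described above does not arise in this sketch; a libcurl client still needs to disable the header or shorten the timeout.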
[0] http://ovirt.github.io/ovirt-imageio/random-io.html#options
[1] http://ovirt.github.io/ovirt-imageio/random-io.html#flush-operation
[2] http://ovirt.github.io/ovirt-imageio/random-io.html#zero-operation
[3] https://github.com/oVirt/ovirt-imageio/blob/master/common/ovirt_imageio_c...
[4] https://bugzilla.redhat.com/1512324
[5] https://curl.haxx.se/mail/lib-2017-07/0013.html
[6] https://curl.haxx.se/libcurl/c/CURLOPT_EXPECT_100_TIMEOUT_MS.html
Nir