Thanks Nir for the detailed response.
On the point below:
>> Check how many transfer are active on all available hosts, and schedule
a new transfer on a host which is less busy
Are there any resource limits configured on hosts that we should query in such a case? Or, if there is a documented way to identify the best host for a transfer, can you share it?
Thanks
Suchitra
From:
Nir Soffer <nsoffer@redhat.com>
Date: Tuesday, August 28, 2018 at 3:01 PM
To: Suchitra Herwadkar <Suchitra.Herwadkar@veritas.com>, Daniel Gur <dagur@redhat.com>
Cc: Daniel Erez <derez@redhat.com>, "Nisan, Tal" <tnisan@redhat.com>, Pavan Chavva <pchavva@redhat.com>, "Yaniv Lavi (Dary)" <ylavi@redhat.com>, "devel@ovirt.org" <devel@ovirt.org>
Subject: [EXTERNAL] Re: Image Transfer mechanism queries/API support
On Mon, 27 Aug 2018, 20:10 Suchitra Herwadkar, <Suchitra.Herwadkar@veritas.com> wrote:
Hi team
>>>
For best performance, you should run your application on an oVirt host, using unix
socket to communicate with imageio. See:
http://ovirt.github.io/ovirt-imageio/unix-socket
(will be available in 4.2.5)
Just to reconfirm here... the unix socket option to communicate with imageio is only available if the client application runs on the oVirt host. For a remote client, the only way is the REST API, right?
Correct.
Here is more detailed description.
1. Find which host can access the disk.
Check on which storage domain the disk is located, and to which data center
this storage domain belongs.
For example, in this setup:
dc1
    host1
    host2
    storage1
        disk1
dc2
    host3
    host4
    storage2
        disk2
If we want to back up or restore disk1, we should run our program on host1 or
host2, so it can use the unix socket and avoid sending the data over the network.
If you cannot run your transfer program on host1 or host2, you must use HTTPS
and send the data over the network.
The host performing the upload should also be active. If the host is in maintenance,
the disk will not be attached to this host.
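The lookup in step 1 can be sketched with plain data structures. This only
illustrates the selection logic for the example setup above. The dictionary
layout, the hosts_for_disk helper, and the "up"/"maintenance" status strings
are assumptions made for this sketch, not the engine API; a real program would
get this information from the engine.

```python
# Hypothetical in-memory model of the example setup. host2 is marked as
# being in maintenance only to show the filtering.
DATA_CENTERS = {
    "dc1": {
        "hosts": {"host1": "up", "host2": "maintenance"},
        "storage_domains": {"storage1": ["disk1"]},
    },
    "dc2": {
        "hosts": {"host3": "up", "host4": "up"},
        "storage_domains": {"storage2": ["disk2"]},
    },
}


def hosts_for_disk(data_centers, disk):
    """Return the active hosts of the data center whose storage holds disk."""
    for dc in data_centers.values():
        for disks in dc["storage_domains"].values():
            if disk in disks:
                # A host in maintenance will not have the disk attached.
                return sorted(
                    name for name, status in dc["hosts"].items()
                    if status == "up"
                )
    return []
```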
2. Balance the backup/restore on all available hosts
Check how many transfers are active on all available hosts, and schedule
a new transfer on a host which is less busy.
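A minimal sketch of this scheduling rule, assuming you already have the number
of active transfers per host. How you collect these counts is up to your
management system; the pick_least_busy name and the input dictionary are
hypothetical.

```python
def pick_least_busy(active_transfers):
    """Return the host with the fewest active transfers.

    active_transfers maps host name to the number of transfers currently
    running on that host.
    """
    if not active_transfers:
        raise ValueError("no hosts available")
    # Compare by (count, name) so ties are resolved deterministically.
    return min(
        active_transfers,
        key=lambda host: (active_transfers[host], host),
    )
```

For example, pick_least_busy({"host1": 3, "host2": 1}) selects host2.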
For reference, for virt-v2v imports, copying 10s of 100g images from VMware
to RHV, we got the best results when running 2-3 imports on each of 4 hosts,
compared with running 10 imports on one host.
There is a useful document about virt-v2v performance recommendations created
by our scale team. I think the recommendations should hold for backup/restore.
Daniel, can you share a link to the document?
3. Starting the image transfer
Once you chose a host, you can start a transfer:
- from your management system, and run the transfer program on that host
- or run the transfer program on the host, and let it start the transfer
virt-v2v took the second approach. This way the transfer program is easy to test,
but you may find the first approach better for your needs.
To start the transfer on the current host, you can use the host hardware id.
The best example for this is the virt-v2v rhv-upload-plugin (BSD license):
4. Transferring the data
First check the host capabilities using an OPTIONS request over HTTPS:
This will tell you if the local imageio daemon supports zero and flush operations,
and unix socket. All operations are supported since 4.2.3, but a robust program
should handle an older daemon that does not support the new APIs.
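Here is a rough sketch of such a capability check in Python. It assumes the
daemon answers OPTIONS with a JSON body containing a "features" list and a
"unix_socket" address; this is my understanding of the response, so verify it
against the daemon you are using. The function names and the fallback to "no
optional features" are choices made for this sketch.

```python
import http.client
import json
import ssl

NO_FEATURES = {"zero": False, "flush": False, "unix_socket": None}


def parse_options(body):
    """Extract the supported features from an OPTIONS response body."""
    info = json.loads(body)
    features = info.get("features", [])
    return {
        "zero": "zero" in features,
        "flush": "flush" in features,
        # An older daemon may not report a unix socket at all.
        "unix_socket": info.get("unix_socket"),
    }


def query_options(host, port, path, cafile):
    """Send OPTIONS over HTTPS and return the parsed capabilities."""
    context = ssl.create_default_context(cafile=cafile)
    conn = http.client.HTTPSConnection(host, port, context=context)
    try:
        conn.request("OPTIONS", path)
        res = conn.getresponse()
        body = res.read()  # always consume the entire response
        if res.status != 200:
            # Assumption: a daemon rejecting OPTIONS has no new features.
            return dict(NO_FEATURES)
        return parse_options(body)
    finally:
        conn.close()
```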
If the daemon supports unix socket, and you started the transfer on the same
host your transfer program is running on, you should close the HTTPS connection
and open a new one using unix socket, to get 15% better performance with
less CPU usage.
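If you use Python, one common way to speak HTTP over a unix socket is to
subclass http.client.HTTPConnection and override connect(). This is a generic
sketch, not the imageio client code; the class name is hypothetical, and the
socket address to pass is whatever the daemon reported.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection using a unix socket instead of TCP."""

    def __init__(self, path, timeout=60):
        # "localhost" is used only for the Host header; no TCP connection
        # is made.
        super().__init__("localhost", timeout=timeout)
        self.path = path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.settimeout(self.timeout)
        self.sock.connect(self.path)
```

After connecting, the object is used like any other HTTPConnection, so the
rest of the transfer code does not need to know which transport is in use.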
If the daemon supports zero, you can use a PATCH/zero request. With 4.2.6,
this can give a significant performance improvement.
If the daemon supports flush, you can defer flushing, possibly improving
performance with some storage.
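As an illustration, the PATCH bodies for zero and flush could be built like
this. The JSON field names ("op", "offset", "size", "flush") reflect my
understanding of the imageio PATCH API and should be treated as assumptions
to check against the daemon documentation; the helper names are hypothetical.

```python
import json


def zero_request(offset, size, flush=False):
    """Build the body of a PATCH request zeroing size bytes at offset.

    With flush=False the daemon is not asked to flush after this request,
    so flushing can be deferred to one flush request at the end.
    """
    msg = {"op": "zero", "offset": offset, "size": size, "flush": flush}
    return json.dumps(msg).encode("utf-8")


def flush_request():
    """Build the body of a PATCH request flushing previous writes."""
    return json.dumps({"op": "flush"}).encode("utf-8")
```

A transfer program would send zero_request() for every zero range with
flush=False, and a single flush_request() after the last request.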
In all cases you should reuse the same connection for all requests, and
always consume the entire response for all requests.
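With Python's http.client, this can be done with a small helper that always
reads the full response before returning. This is a sketch under the
assumption that conn is an http.client connection (or a compatible object)
kept open for the whole transfer; the send name and the error handling are
only illustrative.

```python
def send(conn, method, path, body=None, headers=None):
    """Send one request on an existing connection and drain the response."""
    conn.request(method, path, body=body, headers=headers or {})
    res = conn.getresponse()
    # Reading the whole body leaves the connection ready for the next
    # request, so the same connection can be reused.
    data = res.read()
    if res.status >= 400:
        raise RuntimeError(
            "%s %s failed: %s %r" % (method, path, res.status, data[:200])
        )
    return res.status, data
```

Call send() with the same conn for every request of the transfer, and close
the connection only once, when the transfer is done.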
The best example for this is imageio examples upload script (GPL license):
But note that this example works only for local upload. We are working on
improving it to also support remote uploads, see here:
With this patch, if you can use Python, doing:
from ovirt_imageio_common import client
client.upload(filename, url, use_unix_socket=True)
Will do the right thing.
If you cannot use Python, you can use this as a reference for how to implement
the same thing in other languages.
All this will be eventually documented properly.
Nir