Hi,
A user was unable to import a VM template from a gluster export storage
for several reasons. Since I'm afraid this could hit many users (who
would lose days waiting for import tasks to... fail), I would like to
share the issues and some ideas with you:
1) Slowness:
FYI, working with sparse files on GlusterFS mounts will be very slow
until https://bugzilla.redhat.com/show_bug.cgi?id=1220173 is implemented.
The same may also apply to other network file systems.
2) Engine Task timeout:
The last few times the user tried to import this template (using nightly
builds), the task was apparently deleted by Engine because of a timeout
while the qemu-img process kept running (and consuming resources).
I'm not sure, but I believe Engine deletes tasks after 4 or 5 days
regardless of whether the qemu-img process is still running (!?).
If so, slow import tasks will never finish.
Can someone please confirm?
Maybe we can improve something here to support long import tasks.
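
To help confirm whether a qemu-img process has outlived its Engine task,
something like the following could be run on the host. This is only a
rough sketch: the function name list_long_converts is made up for
illustration, and the 4-day limit (345600 seconds) is just my guess from
above, not a known Engine setting. The sample lines stand in for real
"ps" output.

```shell
# list_long_converts LIMIT_SECONDS
# Reads "pid elapsed_seconds command args..." lines on stdin and prints
# the pid and elapsed time of every 'qemu-img convert' older than the limit.
# (Hypothetical helper, not part of oVirt or VDSM.)
list_long_converts() {
    awk -v limit="$1" '
        $2 + 0 > limit && $3 ~ /(^|\/)qemu-img$/ && $4 == "convert" {
            print $1, $2
        }'
}

# Sample input in the same shape as: ps -eo pid=,etimes=,args=
# 345600 seconds = 4 days (assumed timeout, see above).
printf '%s\n' \
    '101 400000 /usr/bin/qemu-img convert -t none -T none -f raw /src -O raw /dst' \
    '102 120 /usr/bin/qemu-img convert -f raw /a -O raw /b' \
    '103 500000 /usr/bin/qemu-img info /c' |
    list_long_converts 345600
```

On a real host the input would come from "ps -eo pid=,etimes=,args="
(procps) piped into the same function. Any pid it prints that has no
matching task in Engine would support the theory above.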
3) Wrong SPM:
If $SRC is on host-1, $DST is on host-2 and the SPM is host-3, the image
data will unnecessarily pass through host-3 and hurt cluster performance.
I guess Engine should automatically choose the appropriate SPM host
depending on the task (host-1 or host-2 in this case).
But it seems oVirt doesn't currently support changing the SPM while
tasks are running, judging by the warning the user gets when trying to
do it manually.
4) Convert Optimization:
I see oVirt is running:
/usr/bin/qemu-img convert -t none -T none -f raw $SRC -O raw $DST
In this case, $SRC is in raw format and there are no backing chains.
Shouldn't we do a simple 'cp --sparse=always' instead of a 'qemu-img
convert' in this case?
I guess qemu-img should be doing this optimization for us, but maybe
this raw-to-raw conversion use case is just too silly and will not be
considered by the qemu-img maintainers.
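
To illustrate the idea, here is a minimal sketch of the sparse copy on a
throwaway test file, assuming GNU coreutils. The paths and sizes are
made up for the example ($SRC and $DST here are temporary files, not the
real template image), and this is not what oVirt does today.

```shell
#!/bin/sh
# Sketch: copy a raw, sparse image with cp instead of qemu-img convert.
set -eu

workdir=$(mktemp -d)
SRC="$workdir/src.raw"   # hypothetical stand-in for the raw template image
DST="$workdir/dst.raw"   # hypothetical destination image

# Create a 64 MiB sparse file with 1 MiB of real data at the start.
truncate -s 64M "$SRC"
dd if=/dev/urandom of="$SRC" bs=1M count=1 conv=notrunc status=none

# Sparse-aware copy: runs of zero bytes become holes in the destination.
cp --sparse=always "$SRC" "$DST"

# The content must be byte-for-byte identical.
cmp "$SRC" "$DST"

# Compare the apparent size with the space actually allocated on disk.
apparent=$(stat -c %s "$DST")
allocated=$(( $(stat -c %b "$DST") * $(stat -c %B "$DST") ))
echo "apparent=$apparent allocated=$allocated"
```

One difference to keep in mind: "-t none -T none" makes qemu-img bypass
the host page cache, while plain cp goes through it, so the two are not
strictly equivalent in how they load the host.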