Unable to import templates

Hi,

A user was unable to import a VM template from a gluster export storage for several reasons. Since I'm afraid this could hit many users (who would lose many days waiting for import tasks to... fail), I would like to share the issues and some ideas with you:

1) Slowness:
FYI, working with sparse files on GlusterFS mounts will be very slow until https://bugzilla.redhat.com/show_bug.cgi?id=1220173 is implemented. The same may apply to other network file systems.

2) Engine Task timeout:
The last few times the user tried to import this template (using nightly builds), the task was apparently deleted by Engine because of a timeout while the qemu-img process continued running (and consuming resources). I'm not sure, but I believe Engine deletes tasks after 4 or 5 days, whether or not the qemu-img process is still running (!?). If so, slow import tasks will never finish. Can someone please confirm? Maybe we can improve something here to support long import tasks.

3) Wrong SPM:
If $SRC is on host-1, $DST is on host-2 and the SPM is host-3, the image will needlessly pass through host-3 and kill the cluster performance. I guess Engine should automatically choose the appropriate SPM host depending on the task (host-1 or host-2 in this case). But it seems that oVirt doesn't currently support changing the SPM while tasks are running, since the user gets a warning when trying to do it manually.

4) Convert Optimization:
I see oVirt is running:
/usr/bin/qemu-img convert -t none -T none -f raw $SRC -O raw $DST
In this case, $SRC is in raw format and there are no backing chains. Shouldn't we do a simple 'cp --sparse=always' instead of a 'qemu-img convert' in this case? I guess qemu-img should be doing this optimization for us, but maybe this raw-to-raw conversion use case is just too silly and will not be considered by the qemu-img maintainers.

----- Original Message -----
From: "Christopher Pereira" <kripper@imatronix.cl> To: devel@ovirt.org Sent: Tuesday, May 26, 2015 1:44:16 AM Subject: [ovirt-devel] Unable to import templates
Hi,
A user was unable to import a VM template from a gluster export storage because of various reasons, and since I'm afraid this could hit many users (losing many days waiting for import tasks to...fail) I would like to share the issues and some ideas with you:
1) Slowness:
FYI, working with sparse files on GlusterFS mounts will be very slow until https://bugzilla.redhat.com/show_bug.cgi?id=1220173 is implemented. Maybe the same applies also to other network file-systems.
2) Engine Task timeout:
The last times the user tried to import this template (using nightly builds), the task was apparently deleted by Engine because of timeout while the qemu-img process continued running (and consuming resources). I'm not sure, but I believe the tasks are deleted by Engine after 4 or 5 days no matter if the qemu-img process is still running (!?). Thus, slow import tasks will never finish. Can someone please confirm? Maybe we can improve something here to support long import tasks.
Sounds like a bug. Could you please report it with all the logs, etc. attached?
3) Wrong SPM:
If $SRC is on host-1, $DST on host-2 and SPM is host-3, the image will be unnecessarily crossing host-3 and kill the cluster performance. I guess Engine should choose automatically the appropriate SPM host depending on the tasks (host-1 or host-2 in this case). But it seems like oVirt doesn't currently support changing the SPM when there are running tasks, based on the fact that the user gets a warning when trying to do it manually.
Moving around SPM is WAY too dangerous to do per operation. Looking forward, this should be part of the non-SPM architecture, probably after 3.6.0.
4) Convert Optimization:
I see oVirt is running: /usr/bin/qemu-img convert -t none -T none -f raw $SRC -O raw $DST
In this case, $SRC is in raw format and there are no backing chains.
Shouldn't we do a simple 'cp --sparse=always' instead of a 'qemu-img convert' in this case? I guess qemu-img should be doing this optimization for us, but maybe this raw-to-raw conversion use case is just too silly and will not be considered by qemu-img maintainers.
Definitely worth checking
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
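Editorial sketch for point 4: the shortcut only makes sense when the source really is raw and has no backing file. One way to check that is to parse the output of `qemu-img info --output=json`. The helper names below (`qemu_img_info`, `is_plain_raw`) are made up for illustration and are not part of oVirt or VDSM. A caller that already knows the format (as in the `-f raw` command above) may not need the probe at all.

```python
import json
import subprocess


def qemu_img_info(path):
    """Return the parsed output of `qemu-img info --output=json` for path."""
    out = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def is_plain_raw(info):
    """True if the image is reported as raw and has no backing file."""
    return info.get("format") == "raw" and "backing-filename" not in info
```

Usage would be along the lines of `is_plain_raw(qemu_img_info(src))`, falling back to `qemu-img convert` whenever the check fails.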

----- Original Message -----
From: "Allon Mureinik" <amureini@redhat.com> To: "Christopher Pereira" <kripper@imatronix.cl> Cc: devel@ovirt.org Sent: Thursday, May 28, 2015 3:31:36 PM Subject: Re: [ovirt-devel] Unable to import templates
4) Convert Optimization:
I see oVirt is running: /usr/bin/qemu-img convert -t none -T none -f raw $SRC -O raw $DST
In this case, $SRC is in raw format and there are no backing chains.
Shouldn't we do a simple 'cp --sparse=always' instead of a 'qemu-img convert' in this case? I guess qemu-img should be doing this optimization for us, but maybe this raw-to-raw conversion use case is just too silly and will not be considered by qemu-img maintainers.
Definitely worth checking
In general, we prefer to do all image operations using qemu-img. If an optimization is needed, it is better to do it in qemu-img, so that all users benefit.
But if you send a patch for this and back it up with benchmark results proving that it makes a significant improvement, I think it should be easy to get it merged.
Nir
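Editorial sketch of the kind of benchmark asked for here: time both commands on the same source image and keep the best run of each. The function names and the destination naming scheme are assumptions made for this example. Note that the qemu-img command line from the thread disables caching with `-t none -T none` while plain `cp` goes through the page cache, so a fair comparison would also have to account for caching.

```python
import subprocess
import time


def build_commands(src, dst_prefix):
    """Return the two candidate copy commands, keyed by a short label."""
    return {
        "qemu-img": ["/usr/bin/qemu-img", "convert", "-t", "none", "-T", "none",
                     "-f", "raw", src, "-O", "raw", dst_prefix + ".qemu-img"],
        "cp": ["cp", "--sparse=always", src, dst_prefix + ".cp"],
    }


def time_command(argv, runner=subprocess.run):
    """Run argv once and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    runner(argv, check=True)
    return time.monotonic() - start


def benchmark(src, dst_prefix, repeats=3, runner=subprocess.run):
    """Time each command `repeats` times and return the best run per label."""
    results = {}
    for label, argv in build_commands(src, dst_prefix).items():
        results[label] = min(time_command(argv, runner) for _ in range(repeats))
    return results
```

The `runner` parameter exists only so the harness can be exercised without touching real images; in a real measurement it would stay at its default.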

----- Original Message -----
From: "Nir Soffer" <nsoffer@redhat.com> To: "Allon Mureinik" <amureini@redhat.com> Cc: "Christopher Pereira" <kripper@imatronix.cl>, devel@ovirt.org, "Federico Simoncelli" <fsimonce@redhat.com> Sent: Friday, May 29, 2015 5:37:16 PM Subject: Re: [ovirt-devel] Unable to import templates
4) Convert Optimization:
I see oVirt is running: /usr/bin/qemu-img convert -t none -T none -f raw $SRC -O raw $DST
In this case, $SRC is in raw format and there are no backing chains.
Shouldn't we do a simple 'cp --sparse=always' instead of a 'qemu-img convert' in this case? I guess qemu-img should be doing this optimization for us, but maybe this raw-to-raw conversion use case is just too silly and will not be considered by qemu-img maintainers.
Definitely worth checking
In general, we prefer to do all image operations using qemu-img. If an optimization is needed, better do it in qemu-img, improving all users.
But if you send a patch for this and back it up with benchmark results, proving that this makes a significant improvement, I think it should be easy to get it merged.
And don't forget that you'll have to implement the progress report for "cp" as well.
--
Federico
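Editorial sketch of what a sparse copy with progress reporting could look like if done in-process instead of shelling out to `cp`. This is not how oVirt or VDSM copy images; it assumes a platform and file system where `lseek` supports `SEEK_DATA` and `SEEK_HOLE` (for example Linux with Python 3.3 or later), and `sparse_copy` and its `progress` callback are names invented for the example.

```python
import errno
import os


def sparse_copy(src, dst, progress=None, chunk=1024 * 1024):
    """Copy only the data extents of src to dst, leaving holes unwritten.

    progress, if given, is called as progress(position, total), where
    position is the file offset reached so far and total the apparent size.
    """
    fin = os.open(src, os.O_RDONLY)
    try:
        fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            total = os.fstat(fin).st_size
            os.ftruncate(fout, total)  # final size; unwritten ranges stay holes
            offset = 0
            while offset < total:
                try:
                    start = os.lseek(fin, offset, os.SEEK_DATA)
                except OSError as e:
                    if e.errno != errno.ENXIO:
                        raise
                    break  # only a hole remains up to end of file
                end = os.lseek(fin, start, os.SEEK_HOLE)
                pos = start
                while pos < end:
                    buf = os.pread(fin, min(chunk, end - pos), pos)
                    if not buf:
                        break  # source shrank while copying
                    # Advance by what was actually written; a short write
                    # is retried because the next read starts at pos.
                    pos += os.pwrite(fout, buf, pos)
                    if progress:
                        progress(pos, total)
                offset = end
            if progress:
                progress(total, total)
        finally:
            os.close(fout)
    finally:
        os.close(fin)
```

Progress here is expressed as an offset within the apparent file size, so skipping a large hole shows up as a jump rather than as time spent copying.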

On 28-05-2015 9:31, Allon Mureinik wrote:
2) Engine Task timeout:
The last times the user tried to import this template (using nightly builds), the task was apparently deleted by Engine because of timeout while the qemu-img process continued running (and consuming resources). I'm not sure, but I believe the tasks are deleted by Engine after 4 or 5 days no matter if the qemu-img process is still running (!?). Thus, slow import tasks will never finish. Can someone please confirm? Maybe we can improve something here to support long import tasks.
Sounds like a bug. Could you please report it with all the logs, etc. attached?
Reported with full logs here: https://bugzilla.redhat.com/show_bug.cgi?id=1226450