[ovirt-devel] Stale "Make Template" tasks and locked templates + SEEK_HOLE optimization for copying images

Christopher Pereira kripper at imatronix.cl
Thu May 7 09:09:46 UTC 2015


Hi,

I'm testing nightly builds.
In general, my experience after reinstalling from scratch and having 
solved my storage issues is very good.
Today I tried to "Make Template" and had some UX problems I would like 
to share with you.
I would also like to discuss a possible optimization for copying 
images using SEEK_HOLE.

1) "Make Template" UX and performance problems:

Source disk and destination disk (template) were on the same StorageDomain.
By accident (probably a very common situation) the SPM was set on an 
external host (one that was not hosting this StorageDomain), so the 
whole image data went out to that host and came back to the same 
source machine.
This obviously takes very long (hours), while copying the sparse files 
directly only takes about 10 seconds with the optimization described below.

While making the templates, I believe I restarted VDSM or rebooted the 
SPM, so the tasks went stale. My fault again.
I was able to remove the stale tasks in Engine by suspending the VMs, 
stopping VDSM so that the host was marked as non-responsive, and using 
"Confirm host has been rebooted".
Putting the host into maintenance in order to confirm the reboot was 
not possible because it still had asynchronous tasks running.
Aren't these tasks' PIDs being checked to see if they are still alive?
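For what it's worth, checking whether a PID is still alive is cheap; 
below is a minimal sketch in Python (the helper name is mine, this is 
not actual VDSM code):

    import errno
    import os

    def pid_is_alive(pid):
        """Return True if a process with this PID still exists."""
        try:
            os.kill(pid, 0)               # signal 0: existence check only
        except OSError as e:
            if e.errno == errno.ESRCH:    # no such process
                return False
            if e.errno == errno.EPERM:    # exists, but owned by another user
                return True
            raise
        return True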

2) I saw that VDSM was running "/usr/bin/qemu-img convert":

In this case, I believe it is enough to just copy the images instead 
of converting them.
I ran some tests and found that "cp --sparse=always" is the best way 
to copy images to gluster mounts, because it is faster and because the 
resulting files stay sparse ('du' reports exactly the same sizes).
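To make that comparison reproducible, here is a small Python sketch 
(the paths are hypothetical) that performs the copy and then compares 
the apparent size with the allocated size, which is what 'du' 
effectively reports:

    import os
    import subprocess

    SRC = "/path/to/source/image"              # hypothetical paths
    DST = "/path/to/gluster/mount/image"

    # Copy, leaving holes in the destination wherever the source
    # blocks are all zeros.
    subprocess.check_call(["cp", "--sparse=always", SRC, DST])

    st = os.stat(DST)
    print("apparent size:  %d bytes" % st.st_size)
    print("allocated size: %d bytes" % (st.st_blocks * 512))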
But I also discovered a bottleneck.
When copying sparse files (e.g. a 2 TB sparse image that only uses 
800 MB on disk, a common scenario when we create templates from fresh 
installs), the 'cp' command behaves differently depending on whether 
we are reading from a gluster mount or from a filesystem that supports 
SEEK_HOLE (available in kernels >= 3.1):

a) If we read from a gluster mount, 'cp' reads the full 2 TB of zeros, 
even though it only writes the non-zero data (iotop shows the 'cp' 
process reading those 2 TB). Only the sparse writing is optimized.

b) If we read from a SEEK_HOLE-supporting filesystem (ext4, xfs, etc.), 
'cp' only reads the non-zero content, so reading and writing take 
about 10 seconds instead of hours.

It seems gluster is not using SEEK_HOLE for reading (?!).
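For reference, this is roughly what a SEEK_DATA/SEEK_HOLE driven copy 
does; a minimal Python 3.3+ sketch (not what 'cp' or VDSM actually 
run), assuming the source filesystem reports holes correctly:

    import os

    def copy_sparse(src, dst, chunk=1024 * 1024):
        """Copy only the data extents of src; holes stay holes in dst."""
        fin = os.open(src, os.O_RDONLY)
        fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(fin).st_size
            offset = 0
            while offset < size:
                try:
                    # Jump to the next extent that actually contains data.
                    data = os.lseek(fin, offset, os.SEEK_DATA)
                except OSError:            # ENXIO: only holes remain
                    break
                hole = os.lseek(fin, data, os.SEEK_HOLE)
                os.lseek(fin, data, os.SEEK_SET)
                os.lseek(fout, data, os.SEEK_SET)  # seek past EOF = hole
                remaining = hole - data
                while remaining > 0:
                    buf = os.read(fin, min(chunk, remaining))
                    if not buf:
                        break
                    remaining -= len(buf)
                    while buf:                     # handle partial writes
                        written = os.write(fout, buf)
                        buf = buf[written:]
                offset = hole
            os.ftruncate(fout, size)       # keep the apparent (logical) size
        finally:
            os.close(fin)
            os.close(fout)

If the filesystem does not report holes, SEEK_DATA typically just 
returns the requested offset and the whole file ends up being read, 
which matches what iotop shows on the gluster mount.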

Since the source image is not modified during the "Make Template" 
process and we have access to the gluster bricks, it is possible to 
'cp' the source image directly from the bricks (on top of the 
SEEK_HOLE-supporting FS) instead of reading it from the gluster mount.
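Just to make the idea concrete, a sketch with hypothetical paths (it 
assumes a replicated volume, so that a single brick holds a complete, 
healthy copy of the file; on a distributed or sharded volume this 
would not work as-is):

    import subprocess

    # Hypothetical paths: the same image seen directly on one of the
    # bricks (xfs/ext4, SEEK_HOLE-aware) and the destination through
    # the gluster mount.
    BRICK_PATH = "/gluster/brick1/vol/<sd_uuid>/images/<img_uuid>/<vol_uuid>"
    DEST_PATH = ("/rhev/data-center/mnt/glusterSD/host:_vol/"
                 "<sd_uuid>/images/<tmpl_uuid>/<vol_uuid>")

    # Reading from the brick lets 'cp' skip the zeros; writing through
    # the mount keeps the new file replicated and visible cluster-wide.
    subprocess.check_call(["cp", "--sparse=always", BRICK_PATH, DEST_PATH])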

The difference is really impressive (seconds instead of hours).
I tested it by cloning some VMs and it works.

Am I missing something?
Maybe there is a gluster option or optimization to enable SEEK_HOLE 
support on gluster mounts?
