Hi,

I'm testing the nightly builds.
In general, my experience after reinstalling from scratch and having
solved my storage issues has been very good.
Today I tried "Make Template" and ran into some UX problems I would
like to share with you.
I would also like to discuss a possible optimization for copying images
using SEEK_HOLE.
1) "Make Template" UX and performance problems:
Source disk and destination disk (template) were on the same StorageDomain.
By accident (probably something very common) the SPM was set on an
external host (that was not hosting this StorageDomain), so the whole
image data went out and back to the same source machine.
This obviously takes very long (hours), while copying the sparse files
directly only takes about 10 [s] with the below optimization.

While making the templates, I believe I restarted VDSM or rebooted the
SPM, so the tasks went stale. My fault again.
I was able to remove the stale tasks in Engine by suspending the VMs,
stopping VDSM so the host was marked as non-responsive, and using
"confirm host has been rebooted".
Setting the host to maintenance in order to confirm it was rebooted was
not possible because it still had asynchronous tasks running.
Aren't these tasks' PIDs being checked to see whether they are still
alive?
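
For reference, here is a minimal Python sketch of the kind of liveness
check I have in mind; the task-to-PID mapping is my assumption, I do
not know how VDSM actually tracks its task processes internally:

    import errno
    import os

    def pid_is_alive(pid):
        # Signal 0 performs error checking only; no signal is sent.
        try:
            os.kill(pid, 0)
        except OSError as e:
            if e.errno == errno.ESRCH:   # no such process
                return False
            if e.errno == errno.EPERM:   # exists, owned by another user
                return True
            raise
        return True

    # Hypothetical task table: task id -> PID of the worker process.
    tasks = {"task-uuid-1": 12345}
    stale = [t for t, pid in tasks.items() if not pid_is_alive(pid)]

Stale entries could then be cleaned up automatically instead of
blocking maintenance mode.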

2) I saw that VDSM was running "/usr/bin/qemu-img convert":

In this case, I believe it is enough to just copy the images instead of
converting them.
I ran some tests and found that "cp --sparse=always" is the best way to
copy images to gluster mounts: it is faster, and the resulting files
are still sparse ('du' reports exactly the same sizes).
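
The same comparison 'du' makes can be scripted to verify that a copy
stayed sparse; a small Python sketch (the paths are made-up examples):

    import os

    def disk_usage(path):
        # st_blocks is counted in 512-byte units; this is the same
        # number 'du' is based on.
        return os.stat(path).st_blocks * 512

    src = "/rhev/data-center/mnt/example/source.img"  # hypothetical
    dst = "/rhev/data-center/mnt/example/copy.img"    # hypothetical
    print("apparent size:", os.stat(src).st_size)
    print("src on disk:  ", disk_usage(src))
    print("dst on disk:  ", disk_usage(dst))  # equals src if still sparse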

But I also discovered a bottleneck.
When copying sparse files (e.g. a 2 TB sparse image that only uses 800
MB on disk, a common scenario when we create templates from fresh
installs), 'cp' behaves differently depending on whether we are reading
from a gluster mount or from a filesystem that supports SEEK_HOLE
(available in kernels >= 3.1):

a) If we read from a gluster mount, 'cp' reads the full 2 TB of zeros,
even though it only writes the non-zero data (iotop shows the 'cp'
process reading those 2 TB). Only the sparse writing is optimized.

b) If we read from a filesystem that supports SEEK_HOLE (ext4, xfs,
etc.), 'cp' only reads the non-zero content, so reading and writing
takes about 10 s instead of hours.

It seems gluster is not using SEEK_HOLE for reading (?!).
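
A quick way to test whether a mount reports holes at all is to ask for
the first hole in a known-sparse file: if the filesystem does not
support SEEK_HOLE, the kernel falls back to treating the whole file as
one data extent, so the first "hole" is reported at end-of-file. A
Python sketch (needs Python >= 3.3; the path is a made-up example):

    import os

    def first_hole(path):
        fd = os.open(path, os.O_RDONLY)
        try:
            # Offset of the first hole at or after offset 0.
            hole = os.lseek(fd, 0, os.SEEK_HOLE)
            size = os.fstat(fd).st_size
        finally:
            os.close(fd)
        return hole, size

    hole, size = first_hole("/mnt/gluster/images/sparse.img")
    if hole == size:
        print("no holes reported (SEEK_HOLE unsupported, or file dense)")
    else:
        print("first hole at offset", hole)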

Considering that the source image is not modified during the "Make
Template" process and that we have access to the gluster bricks, it is
possible to 'cp' the source image directly from the brick (which sits
on a SEEK_HOLE-supporting FS) instead of reading through the gluster
mount.
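
For completeness, this is the core of what 'cp' does on a SEEK_HOLE
filesystem, as a minimal Python sketch: walk the data extents with
SEEK_DATA/SEEK_HOLE and copy only those, recreating the holes with
truncate (the brick path is a made-up example, and error handling is
omitted):

    import os

    def sparse_copy(src, dst):
        in_fd = os.open(src, os.O_RDONLY)
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(in_fd).st_size
            offset = 0
            while offset < size:
                try:
                    # Start of the next data extent at or after offset.
                    data = os.lseek(in_fd, offset, os.SEEK_DATA)
                except OSError:  # ENXIO: only holes remain
                    break
                # End of that data extent.
                hole = os.lseek(in_fd, data, os.SEEK_HOLE)
                os.lseek(in_fd, data, os.SEEK_SET)
                os.lseek(out_fd, data, os.SEEK_SET)
                remaining = hole - data
                while remaining > 0:
                    chunk = os.read(in_fd, min(remaining, 1 << 20))
                    if not chunk:
                        break
                    os.write(out_fd, chunk)
                    remaining -= len(chunk)
                offset = hole
            # Restore the apparent size; the skipped ranges stay holes.
            os.ftruncate(out_fd, size)
        finally:
            os.close(in_fd)
            os.close(out_fd)

    sparse_copy("/gluster/brick1/images/disk.img", "/tmp/disk.img")

This only ever reads what the filesystem reports as data, which is
exactly why reading from the brick is fast and reading from the mount
is not.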

The difference is really impressive (seconds instead of hours).
I tested it by cloning some VMs, and it works.

Am I missing something?
Maybe there is a gluster optimization to enable SEEK_HOLE support on
gluster mounts?