[ovirt-users] low level Image copy failed

Mon Oct 24 19:54:54 UTC 2016

On Sun, Oct 23, 2016 at 8:57 PM, Jonas Israelsson
<jonas.israelsson at elementary.se> wrote:
> On 23/10/16 20:06, Nir Soffer wrote:
>
>> On Sun, Oct 23, 2016 at 5:34 PM, Jonas Israelsson
>> <jonas.israelsson at elementary.se> wrote:
>>>
>>> Greetings.
>>>
>>> We are in the process of migrating from oVirt 3.6 to 4.0. To properly test
>>> 4.0 we have set up a parallel 4.0 environment.
>>>
>>> For the non-critical VMs we thought we would try the "export vms --> move
>>> storage domain to the other DC --> import vms" method.
>>>
>>> While many imports are successful, quite a few fail with 'low level Image
>>> copy failed'.
>>>
>>> One of the VMs that is impossible to import has the following disk layout.
>>>
>>> * Disk 1 - 100GB  (Thin)
>>>
>>> * Disk2 - 32GB (Preallocated)
>>
>> According to the volume .meta file below, this is COW/SPARSE,
>> not preallocated.
>
> It's because I'm an idiot and gave you information about the wrong disk. My
> apologies.
>
> $ /usr/bin/qemu-img.org info
> /rhev/data-center/9d200b26-359e-48b6-972a-90da179e4829/61842ad9-42da-40a9-8ec8-dd7807a82916/images/9eb60288-27b6-4fb1-aef1-4246455d588e/ddf8b402-514c-4a3c-9683-26810a7c41c0
>
> image:
> /rhev/data-center/9d200b26-359e-48b6-972a-90da179e4829/61842ad9-42da-40a9-8ec8-dd7807a82916/images/9eb60288-27b6-4fb1-aef1-4246455d588e/ddf8b402-514c-4a3c-9683-26810a7c41c0
> file format: raw
> virtual size: 35G (37849399296 bytes)
> disk size: 35G
>
>
> [root@patty tmp]# cat
> /rhev/data-center/9d200b26-359e-48b6-972a-90da179e4829/61842ad9-42da-40a9-8ec8-dd7807a82916/images/9eb60288-27b6-4fb1-aef1-4246455d588e/ddf8b402-514c-4a3c-9683-26810a7c41c0.meta
> DOMAIN=61842ad9-42da-40a9-8ec8-dd7807a82916
> VOLTYPE=LEAF
> CTIME=1476880543
> FORMAT=RAW
> IMAGE=9eb60288-27b6-4fb1-aef1-4246455d588e
> DISKTYPE=2
> PUUID=00000000-0000-0000-0000-000000000000
> LEGALITY=LEGAL
> MTIME=0
> POOL_UUID=
> SIZE=67108864

SIZE is in 512-byte sectors, so this is 32G (34359738368 bytes), but
qemu-img says this is a 35G image...

> TYPE=PREALLOCATED
> DESCRIPTION=
> EOF
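
The mismatch between the two sizes can be checked with a few lines of Python.
This is only a sketch: it assumes SIZE in the .meta file counts 512-byte
sectors (which matches the 32G figure), and parse_meta is a hypothetical
helper, not part of vdsm.

```python
SECTOR_SIZE = 512  # assumption: SIZE in the .meta file counts 512-byte sectors


def parse_meta(text):
    """Parse KEY=VALUE lines from volume metadata, stopping at the EOF marker."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "EOF":
            break
        if "=" in line:
            key, value = line.split("=", 1)
            meta[key] = value
    return meta


# Values copied from the .meta file and the qemu-img output quoted in this thread.
META_TEXT = """\
FORMAT=RAW
SIZE=67108864
TYPE=PREALLOCATED
EOF
"""
QEMU_IMG_VIRTUAL_SIZE = 37849399296  # bytes, from "qemu-img info"

meta = parse_meta(META_TEXT)
meta_bytes = int(meta["SIZE"]) * SECTOR_SIZE

print(meta_bytes)                          # 34359738368 (32 GiB)
print(QEMU_IMG_VIRTUAL_SIZE - meta_bytes)  # 3489660928 bytes not covered by the metadata size
```

So the image on the export domain is about 3.25 GiB larger than the size
recorded in its metadata.
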
>
>
>
>>
>> Can you share the original vm disk metadata before the export?
>
> Could you please instruct me how to? It's on a FC-LUN, so it's hiding
> on an lv somewhere. I could perhaps just move it to an NFS data domain...?

On block storage, the volume metadata is stored in the metadata lv,
/dev/vg-uuid/metadata.

To locate the metadata, get the MD_NNN tag from the lv:

    # lvs -o tags vg-uuid/lv-uuid
    ... MD_42 ...

The number in the tag is the block number: this volume's metadata is in
block 42 (counting 512-byte blocks) of the metadata lv.

To extract the metadata, use:

    # dd if=/dev/vg-uuid/metadata bs=512 count=1 skip=42

The format is the same as the .meta file on file storage.
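
If you prefer to script the lookup, here is a minimal Python sketch of the
same steps. It assumes one 512-byte block per volume, as in the dd command;
metadata_slot and read_volume_metadata are hypothetical helper names, not
vdsm APIs.

```python
import re

BLOCK_SIZE = 512  # assumption: one 512-byte metadata block per volume


def metadata_slot(lv_tags):
    """Return NNN from the MD_NNN tag in the output of `lvs -o tags`."""
    match = re.search(r"\bMD_(\d+)\b", lv_tags)
    if match is None:
        raise ValueError("no MD_NNN tag found")
    return int(match.group(1))


def read_volume_metadata(metadata_lv_path, slot):
    """Read one block from the metadata lv and parse its KEY=VALUE lines."""
    with open(metadata_lv_path, "rb") as f:
        f.seek(slot * BLOCK_SIZE)  # byte offset of the block within the lv
        block = f.read(BLOCK_SIZE)
    # Drop trailing NUL padding before decoding the text part of the block.
    text = block.rstrip(b"\0").decode("utf-8", errors="replace")
    meta = {}
    for line in text.splitlines():
        if line.strip() == "EOF":
            break
        if "=" in line:
            key, value = line.split("=", 1)
            meta[key] = value
    return meta
```

Usage would be something like
read_volume_metadata("/dev/vg-uuid/metadata", metadata_slot(tags)), run as
root on a host where the metadata lv is active.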

>> Looking at the metadata before the export, after the export, and after
>> the import, we can understand what is the root cause.
>>
>> It will be hard to find the metadata after the failed copy since vdsm tries
>> hard to clean up after errors, but the information should be available
>> in vdsm log.
>
> Yes I noticed, hence the qemu-img wrapper
>>>
>>> * Disk3 - 32GB (Thin)
>>>
>>> The two thin disks (1 & 3) are successfully imported, but disk 2, the
>>> preallocated one, always fails.
>>>
>> ...
>>>
>>> and from vdsm.log
>>>
>> ...
>>>
>>> CopyImageError: low level Image copy failed: ('ecode=1, stdout=,
>>> stderr=qemu-img: error while writing sector 73912303: No space left on
>>> device\n, message=None',)
>>
>> We need the log from the entire flow, starting at "Run and protect:
>> copyImage..."
>>
>> ...
>>>
>>> The first checks the size of the image (37849399296), and the second the
>>> size of the logical volume (34359738368) just created to hold this image.
>>> As you can see, the volume is smaller than the image it should hold, so we
>>> are under the impression that something made an incorrect decision when
>>> creating that volume.
>>
>> The destination image size depends on the destination format. If the
>> destination is preallocated, the logical volume size *must* be the virtual
>> size (32G). If it is sparse, the logical volume should be the file size on
>> the export domain (35G).
>>
>> According to your findings, we created a destination image for a
>> preallocated disk (32G), and then tried to run "qemu-img convert" with qcow2
>> format as both source and destination. However, this is only a guess, since
>> I don't have the log showing the actual qemu-img command.
>
> 12:37:15 685557156   ---   Identifier: 51635 , Arguments: convert -p -t none
> -T none -f raw
> /rhev/data-center/9d200b26-359e-48b6-972a-90da179e4829/61842ad9-42da-40a9-8ec8-dd7807a82916/images/9eb60288-27b6-4fb1-aef1-4246455d588e/ddf8b402-514c-4a3c-9683-26810a7c41c0
> -O raw
> /rhev/data-center/mnt/blockSD/cb64e1fc-98b6-4b8c-916e-418d05bcd467/images/a1d70c22-cace-48d2-9809-caadc70b77e7/71f5fe82-81dd-47e9-aa3f-1a66622db4cb

So we are copying a raw volume to a raw volume, and this cannot succeed
if the destination device is smaller than the image.
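
The numbers already posted in this thread are consistent with that. A rough
Python check, assuming the sector number in the qemu-img error counts
512-byte sectors (raw_copy_fits is a hypothetical helper name):

```python
SECTOR_SIZE = 512  # assumption: qemu-img reports 512-byte sectors

IMAGE_BYTES = 37849399296    # virtual size of the raw source (qemu-img info)
DEVICE_BYTES = 34359738368   # size of the destination logical volume
FAILED_SECTOR = 73912303     # from "error while writing sector ..." in vdsm.log


def raw_copy_fits(image_bytes, device_bytes):
    """A raw-to-raw copy needs a destination at least as large as the image."""
    return device_bytes >= image_bytes


device_sectors = DEVICE_BYTES // SECTOR_SIZE
image_sectors = IMAGE_BYTES // SECTOR_SIZE

print(raw_copy_fits(IMAGE_BYTES, DEVICE_BYTES))         # False
print(device_sectors <= FAILED_SECTOR < image_sectors)  # True
```

Under that assumption, the failing sector lies inside the image but past the
last sector of the 32G device, which matches the "No space left on device"
error.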

We need the original volume metadata, and the vdsm logs showing
the copy image from the original volume to the export domain.

Fixing the virtual size in the .meta file manually will work, but you should
check that the size matches the virtual size in the engine database.

Nir