This actually looks to be related to sharding. Running strace on the qemu-img process, I can see that it is using lseek, but after the first shard the call starts failing with EOPNOTSUPP and qemu-img dies.
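
A quick sanity check on the sharding theory is to see where the failing offset from the qemu-img error quoted below (byte 738197504) falls relative to shard boundaries. The sketch below assumes a 64 MiB shard block size, which is the gluster default; the actual value for this volume is not stated in this thread, and `locate_shard` is just an illustrative helper name.

```python
# Assumption: 64 MiB shard block size (the gluster default). The real
# features.shard-block-size for this volume is not given in the thread.
SHARD_BLOCK_SIZE = 64 * 1024 * 1024


def locate_shard(offset, shard_size=SHARD_BLOCK_SIZE):
    """Return (shard_index, offset_within_shard) for a byte offset."""
    return divmod(offset, shard_size)


# Offset taken from the qemu-img error message in the original report.
failing_offset = 738197504
shard_index, offset_in_shard = locate_shard(failing_offset)
print(shard_index, offset_in_shard)  # 11 0
```

If the assumption holds, the write fails exactly on a shard boundary (offset 0 within a shard), which would be consistent with the problem being shard related rather than a random I/O error.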


If I temporarily disable sharding, touch a file (so it will not be sharded), then re-enable sharding and use qemu-img to overwrite that new unsharded file, there are no issues.
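
To compare a sharded file with an unsharded one without going through qemu-img, something like the following rough sketch could be used. It assumes the lseek calls seen in strace are SEEK_DATA probes (the whence value is not stated above) and again assumes a 64 MiB shard size; `probe_seek_data` is a hypothetical helper, not part of VDSM or qemu.

```python
import errno
import os

# Assumption: 64 MiB shard block size (gluster default); adjust to the volume.
SHARD_BLOCK_SIZE = 64 * 1024 * 1024


def probe_seek_data(path, step=SHARD_BLOCK_SIZE, max_probes=16):
    """Call lseek(fd, offset, SEEK_DATA) at multiples of `step`.

    Returns a list of (offset, result) tuples, where result is either the
    offset returned by lseek or the symbolic errno name if the call failed
    (for example 'EOPNOTSUPP' or 'ENXIO').
    """
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size and len(results) < max_probes:
            try:
                results.append((offset, os.lseek(fd, offset, os.SEEK_DATA)))
            except OSError as exc:
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                results.append((offset, name))
            offset += step
    finally:
        os.close(fd)
    return results
```

Running `probe_seek_data()` against an image on the gluster mount and against the file created while sharding was disabled should show whether the errors only start at the second probe (the first shard boundary) on the sharded file.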


On 2021-03-07 18:33, Nir Soffer wrote:

On Sun, Mar 7, 2021 at 1:14 PM Alex McWhirter <alex@triadic.us> wrote:

I've been wrestling with this all night, digging through various bits of VDSM code trying to figure out why and how this is happening. I need to make some templates, but I simply can't.


VDSM <host> command HSMGetAllTasksStatusesVDS failed: value=low level Image copy failed: ("Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'raw', '-O', 'raw', '/rhev/data-center/mnt/glusterSD/<gluster>:_Temp/45740f16-b3c9-4bb5-ba5f-3e64657fb663/images/6f87b073-c4ec-42f2-87da-d1cb6a08a150/f2dfb779-b49c-4ec9-86cd-741f3fe5b781', '/rhev/data-center/mnt/glusterSD/<gluster>:_Temp/45740f16-b3c9-4bb5-ba5f-3e64657fb663/images/84e56da6-8c26-4518-80a4-20bc395214db/039c5ada-ad6d-45c0-8393-bd4db0bbc366'] failed with rc=1 out=b'' err=bytearray(b'qemu-img: error while writing at byte 738197504: No such file or directory\\n')",)

This is a gluster issue that was reported in the past. write() should never fail with errno ENOENT,
and qemu-img cannot recover from this error.
 
Can you reproduce this when running the same qemu-img command from the shell?
 
Are you running the latest gluster version?
 
Nir
 

abortedcode=261

3/7/21 5:44:44 AM
 
 
Following the VDSM logs, I can see the new image gets created, permissions set, etc., but as soon as qemu-img starts, it fails like this. I updated all hosts and the engine and rebooted the entire stack, to no avail. So I detached the storage domain, wiped every host, fresh installed both the engine and all nodes, and imported the storage domain, and still no dice. The storage domain is a single-node gluster volume, created in oVirt.
 
It happens when I make a template, copy an image, or make a new VM from a template. I can still create new VMs from blank and upload images via the web UI. Watching the gluster share, I can see the image being created, but it's deleted at some point. It does not appear to be deleted by the template / copying process, as immediately after the above error, I get this one.
 
VDSM command DeleteImageGroupVDS failed: Image does not exist in domain: 'image=4f359545-01a8-439b-832b-18c26194b066, domain=b4507449-ac40-4e35-be66-56441bb696ac'
 
3/7/21 5:44:44 AM
 
I thought maybe garbage collection, but I don't see any indication of that in the logs.
 
Any ideas? I redacted host names from the log output.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/75BZNMZUH23ZWCUHCDYCJSC6EZ45UW5O/

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/33JF7NWW7XW4KKWM57H52TK5QDBDA4J5/