On Wed, Aug 12, 2020 at 2:25 AM <thomas@hoberg.net> wrote:
While trying to diagnose an issue with a set of VMs that get stopped for I/O problems at startup, I found that their boot disks cause this issue no matter where I connect them. They might have been the first disks I ever tried to sparsify, and I was afraid that might have messed them up. The images are for a nested oVirt deployment and they worked just fine before I shut down those VMs...

So I first tried to hook them up as secondary disks to another VM to have a look, but that just caused the other VM to stop at boot.

I also tried downloading, exporting, and plain copying the disks, to no avail; OVA exports of the entire VM fail again (fix is in!).

So to make sure copying disks between volumes *generally* works, I tried copying a disk from a working (but stopped) VM from 'vmstore' to 'data' on my 3nHCI farm, but that failed, too!

Plenty of space all around, but all disks are using thin/sparse/VDO on SSD underneath.

Before I open a bug, I'd like some feedback: is this covered by a standard QA test, is this happening to you, etc.?

Still on oVirt 4.3.11 with pack_ova.py patched to wait for the udev settle.

This is from the engine.log on the hosted-engine:

2020-08-12 00:04:15,870+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM gem2 command HSMGetAllTasksStatusesVDS failed: low level Image copy failed: ("Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14', '-O', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5'] failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading sector 131072: Transport endpoint is not connected\\nqemu-img: error while reading sector 135168: Transport endpoint is not connected\\nqemu-img: error while reading sector 139264: Transport endpoint is not connected\\nqemu-img: error while reading sector 143360: Transport endpoint is not connected\\nqemu-img: error while reading sector 147456: Transport endpoint is not connected\\nqemu-img: error while reading sector 151552: Transport endpoint is not connected\\n')",)

and this is from the vdsm.log on the gem2 node:
Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14', '-O', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5'] failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading sector 131072: Transport endpoint is not connected\nqemu-img: error while reading sector 135168: Transport endpoint is not connected\nqemu-img: error while reading sector 139264: Transport endpoint is not connected\nqemu-img: error while reading sector 143360: Transport endpoint is not connected\nqemu-img: error while reading sector 147456: Transport endpoint is not connected\nqemu-img: error while reading sector 151552: Transport endpoint is not connected\n')
2020-08-12 00:03:15,428+0200 ERROR (tasks/7) [storage.Image] Unexpected error (image:849)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 837, in copyCollapsed
    raise se.CopyImageError(str(e))
CopyImageError: low level Image copy failed: ("Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14', '-O', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5'] failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading sector 131072: Transport endpoint is not connected\\nqemu-img: error while reading sector 135168: Transport endpoint is not connected\\nqemu-img: error while reading sector 139264: Transport endpoint is not connected\\nqemu-img: error while reading sector 143360: Transport endpoint is not connected\\nqemu-img: error while reading sector 147456: Transport endpoint is not connected\\nqemu-img: error while reading sector 151552: Transport endpoint is not connected\\n')",)
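
To see where in the image the reads fail, the sector numbers in the qemu-img stderr can be converted to byte offsets. The following is only a sketch: `failing_offsets` is a hypothetical helper name, and it assumes the 512-byte sector unit that qemu-img uses in these messages.

```python
import re

SECTOR_SIZE = 512  # assumption: qemu-img reports 512-byte sectors in these messages
MIB = 1024 * 1024


def failing_offsets(stderr_text):
    """Return (sector, byte_offset) pairs for every read error in the text."""
    sectors = re.findall(r"error while reading sector (\d+)", stderr_text)
    return [(int(s), int(s) * SECTOR_SIZE) for s in sectors]


# Two of the messages from the vdsm.log excerpt above.
sample = (
    "qemu-img: error while reading sector 131072: Transport endpoint is not connected\n"
    "qemu-img: error while reading sector 135168: Transport endpoint is not connected\n"
)

for sector, offset in failing_offsets(sample):
    print("sector %d -> byte offset %d (%d MiB)" % (sector, offset, offset // MIB))
```

With these numbers the first failure is at exactly 64 MiB and the following ones are 2 MiB apart. If the volume uses sharding with a 64 MiB shard size (an assumption; `gluster volume info` would confirm it), that would mean reads start failing at the first shard boundary, which may be worth mentioning in the bug report.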

Please file a gluster bug for this.

You should be able to reproduce by running qemu-img manually:

    qemu-img convert -p -t none -T none -f raw -O raw \
        /rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14  \
        /rhev/data-center/mnt/glusterSD/192.168.0.91:_data/test.raw
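
If the manual qemu-img run fails the same way, a plain sequential read of the source image can help show whether the error comes from the gluster mount itself rather than from qemu-img. This is a rough sketch only: `scan_image` is a hypothetical helper, and it reads through the page cache, unlike qemu-img with `-T none`, so results may differ from the direct I/O case.

```python
import errno
import os

CHUNK = 2 * 1024 * 1024  # 2 MiB, matching the spacing of the reported errors


def scan_image(path, chunk=CHUNK):
    """Read the file sequentially.

    Returns (bytes_read, failures), where failures is a list of
    (offset, errno_name) for every chunk that could not be read.
    """
    failures = []
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                data = os.pread(fd, chunk, offset)
            except OSError as e:
                # "Transport endpoint is not connected" is ENOTCONN.
                name = errno.errorcode.get(e.errno, str(e.errno))
                failures.append((offset, name))
                offset += chunk  # skip this chunk and keep going
                continue
            if not data:
                break
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total, failures
```

Calling `scan_image()` on the source volume path from the command above and printing the failures list would show which offsets return ENOTCONN, if any.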
 
2020-08-12 00:03:15,429+0200 ERROR (tasks/7) [storage.TaskManager.Task] (Task='6399d533-e96a-412d-b0c3-0548e24d658d') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1633, in copyImage
    postZero, force, discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 837, in copyCollapsed
    raise se.CopyImageError(str(e))
CopyImageError: low level Image copy failed: ("Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14', '-O', 'raw', u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5'] failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading sector 131072: Transport endpoint is not connected\\nqemu-img: error while reading sector 135168: Transport endpoint is not connected\\nqemu-img: error while reading sector 139264: Transport endpoint is not connected\\nqemu-img: error while reading sector 143360: Transport endpoint is not connected\\nqemu-img: error while reading sector 147456: Transport endpoint is not connected\\nqemu-img: error while reading sector 151552: Transport endpoint is not connected\\n')",)