Sounds like the gluster ACL bug.
Did you recently patch your gluster?
Did you test oVirt's functionality after the gluster upgrade?
Check the brick logs for errors mentioning 'acl' (you might need to increase the
log level temporarily).
If the brick logs do report ACL-related issues, you can downgrade all gluster
packages (but you will need to restart the gluster brick processes afterwards).
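
To make the brick log check less tedious, something along these lines could be run on each node. It is only a rough sketch: the directory /var/log/glusterfs/bricks is an assumption about where the brick logs live on your hosts, the helper names are made up for illustration, and the match is a plain case-insensitive search for 'acl', so expect some noise.

```python
import re
from pathlib import Path

# Assumption: brick logs live here; adjust the path for your installation.
BRICK_LOG_DIR = Path("/var/log/glusterfs/bricks")

# Plain case-insensitive substring match, as suggested in the text above.
ACL_PATTERN = re.compile(r"acl", re.IGNORECASE)


def find_acl_lines(lines):
    """Return (line_number, text) pairs for lines mentioning 'acl'."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ACL_PATTERN.search(line)
    ]


def scan_brick_logs(log_dir=BRICK_LOG_DIR):
    """Scan every *.log file in log_dir and collect the matching lines."""
    hits = {}
    for log_file in sorted(Path(log_dir).glob("*.log")):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with log_file.open(errors="replace") as handle:
            matches = find_acl_lines(handle)
        if matches:
            hits[log_file.name] = matches
    return hits


if __name__ == "__main__":
    for name, matches in scan_brick_logs().items():
        for number, text in matches:
            print(f"{name}:{number}: {text}")
```

If nothing shows up, that does not rule the bug out: the relevant messages may only be logged at a higher log level.
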
Best Regards,
Strahil Nikolov
On 12 August 2020 2:23:03 GMT+03:00, thomas(a)hoberg.net wrote:
>While trying to diagnose an issue with a set of VMs that get stopped
>for I/O problems at startup, I try to deal with the fact that their
>boot disks cause this issue, no matter where I connect them. They might
>have been the first disks I ever tried to sparsify and I was afraid
>that might have messed them up. The images are for a nested oVirt
>deployment and they worked just fine, before I shut down those VMs...
>
>So I first tried to hook them as secondary disks to another VM to have
>a look, but that just caused the other VM to stop at boot.
>
>I also tried downloading, exporting, and plain copying the disks, to no
>avail; OVA exports of the entire VM fail again (fix is in!).
>
>So to make sure copying disks between volumes *generally* works, I tried
>copying a disk from a working (but stopped) VM from 'vmstore' to 'data'
>on my 3nHCI farm, but that failed, too!
>
>Plenty of space all around, but all disks are using thin/sparse/VDO on
>SSD underneath.
>
>Before I open a bug, I'd like some feedback on whether this is covered by
>a standard QA test, whether this is happening to you, etc.
>
>Still on oVirt 4.3.11 with pack_ova.py patched to wait for the udev
>settle.
>
>This is from the engine.log on the hosted-engine:
>
>2020-08-12 00:04:15,870+02 ERROR
>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] EVENT_ID:
>VDS_BROKER_COMMAND_FAILURE(10,802), VDSM gem2 command
>HSMGetAllTasksStatusesVDS failed: low level Image copy failed:
>("Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T',
>'none', '-f', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
>'-O', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
>failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
>sector 131072: Transport endpoint is not connected\\nqemu-img: error
>while reading sector 135168: Transport endpoint is not
>connected\\nqemu-img: error while reading sector 139264: Transport
>endpoint is not connected\\nqemu-img: error while reading sector
>143360: Transport endpoint is not connected\\nqemu-img: error while
>reading sector 147456: Transport endpoint is not connected\\nqemu-img:
>error while reading sector 151552: Transport endpoint is not
>connected\\n')",)
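
To see where in the image the reads start failing, the sector numbers in the qemu-img output can be converted to byte offsets. This is a small sketch, assuming qemu-img reports 512-byte sectors; the function name and constants are made up for illustration, and the sample text is copied from the error above.

```python
import re

# Assumption: qemu-img reports sector numbers in units of 512 bytes.
SECTOR_SIZE = 512
MIB = 1024 * 1024

SECTOR_RE = re.compile(r"error while reading sector (\d+)")


def failing_offsets(stderr_text):
    """Return (sector, byte_offset) pairs for every read error in the text."""
    return [
        (int(sector), int(sector) * SECTOR_SIZE)
        for sector in SECTOR_RE.findall(stderr_text)
    ]


if __name__ == "__main__":
    # First two error lines from the engine.log excerpt above.
    sample = (
        "qemu-img: error while reading sector 131072: "
        "Transport endpoint is not connected\n"
        "qemu-img: error while reading sector 135168: "
        "Transport endpoint is not connected\n"
    )
    for sector, offset in failing_offsets(sample):
        print(f"sector {sector} -> byte {offset} ({offset // MIB} MiB)")
```

With 512-byte sectors, the first failing read (sector 131072) is at exactly 64 MiB into the image, and the following errors come in 2 MiB steps (66, 68, 70, 72 and 74 MiB).
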
>
>and this is from the vdsm.log on the gem2 node:
>Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none',
>'-T', 'none', '-f', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
>'-O', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
>failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
>sector 131072: Transport endpoint is not connected\nqemu-img: error
>while reading sector 135168: Transport endpoint is not
>connected\nqemu-img: error while reading sector 139264: Transport
>endpoint is not connected\nqemu-img: error while reading sector 143360:
>Transport endpoint is not connected\nqemu-img: error while reading
>sector 147456: Transport endpoint is not connected\nqemu-img: error
>while reading sector 151552: Transport endpoint is not connected\n')
>2020-08-12 00:03:15,428+0200 ERROR (tasks/7) [storage.Image] Unexpected
>error (image:849)
>Traceback (most recent call last):
>File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line
>837, in copyCollapsed
> raise se.CopyImageError(str(e))
>CopyImageError: low level Image copy failed: ("Command
>['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none',
>'-f', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
>'-O', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
>failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
>sector 131072: Transport endpoint is not connected\\nqemu-img: error
>while reading sector 135168: Transport endpoint is not
>connected\\nqemu-img: error while reading sector 139264: Transport
>endpoint is not connected\\nqemu-img: error while reading sector
>143360: Transport endpoint is not connected\\nqemu-img: error while
>reading sector 147456: Transport endpoint is not connected\\nqemu-img:
>error while reading sector 151552: Transport endpoint is not
>connected\\n')",)
>2020-08-12 00:03:15,429+0200 ERROR (tasks/7) [storage.TaskManager.Task]
>(Task='6399d533-e96a-412d-b0c3-0548e24d658d') Unexpected error
>(task:875)
>Traceback (most recent call last):
>File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
>in _run
> return fn(*args, **kargs)
>File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336,
>in run
> return self.cmd(*self.argslist, **self.argsdict)
>File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line
>79, in wrapper
> return method(self, *args, **kwargs)
>File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1633,
>in copyImage
> postZero, force, discard)
>File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line
>837, in copyCollapsed
> raise se.CopyImageError(str(e))
>CopyImageError: low level Image copy failed: ("Command
>['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none',
>'-f', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/aca27b96-7215-476f-b793-fb0396543a2e/311f853c-e9cc-4b9e-8a00-5885ec7adf14',
>'-O', 'raw',
>u'/rhev/data-center/mnt/glusterSD/192.168.0.91:_data/32129b5f-d47c-495b-a282-7eae1079257e/images/f6a08d2a-4ddb-42da-88e6-4f92a38b9c95/e0d00d46-61a1-4d8c-8cb4-2e5f1683d7f5']
>failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
>sector 131072: Transport endpoint is not connected\\nqemu-img: error
>while reading sector 135168: Transport endpoint is not
>connected\\nqemu-img: error while reading sector 139264: Transport
>endpoint is not connected\\nqemu-img: error while reading sector
>143360: Transport endpoint is not connected\\nqemu-img: error while
>reading sector 147456: Transport endpoint is not connected\\nqemu-img:
>error while reading sector 151552: Transport endpoint is not
>connected\\n')",)