
Hi All,

I have successfully installed oVirt (version 4.1) with 3 nodes on top of GlusterFS several times. This time, when trying to configure the same setup, I am facing the following issue, which doesn't seem to go away. During installation I get the error:

Failed to execute stage 'Misc configuration': Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))

The only difference in this setup is that instead of standard partitioning I have GPT partitioning, and the disks have a 4K block size instead of 512 bytes.

/var/log/sanlock.log has the following lines:

2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engin-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/ids:0
2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 for 2,9,23040
2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids:0
2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 match res
2017-06-03 19:21:36+0200 23471 [23522]: read_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids
2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune
2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail result -22

And /var/log/vdsm/vdsm.log says:

2017-06-03 19:19:38,176+0200 WARN (jsonrpc/3) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
2017-06-03 19:21:12,379+0200 WARN (periodic/1) [throttled] MOM not available. (throttledlog:105)
2017-06-03 19:21:12,380+0200 WARN (periodic/1) [throttled] MOM not available, KSM stats will be missing. (throttledlog:105)
2017-06-03 19:21:14,714+0200 WARN (jsonrpc/1) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) [storage.initSANLock] Cannot initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 (clusterlock:238) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 234, in initSANLock sanlock.init_lockspace(sdUUID, idsPath) SanlockException: (107, 'Sanlock lockspace init failure', 'Transport endpoint is not connected')
2017-06-03 19:21:15,515+0200 WARN (jsonrpc/4) [storage.StorageDomainManifest] lease did not initialize successfully (sd:557) Traceback (most recent call last): File "/usr/share/vdsm/storage/sd.py", line 552, in initDomainLock self._domainLock.initLock(self.getDomainLease()) File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 271, in initLock initSANLock(self._sdUUID, self._idsPath, lease) File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 239, in initSANLock raise se.ClusterLockInitError() ClusterLockInitError: Could not initialize cluster lock: ()
2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) [storage.StoragePool] Create pool hosted_datacenter canceled (sp:655) Traceback (most recent call last): File "/usr/share/vdsm/storage/sp.py", line 652, in create self.attachSD(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD dom.acquireHostId(self.id) File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId self._manifest.acquireHostId(hostId, async) File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId self._domainLock.acquireHostId(hostId, async) File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId raise se.AcquireHostIdFailure(self._sdUUID, e) AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528) Traceback (most recent call last): File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains self.detachSD(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD raise se.CannotDetachMasterStorageDomain(sdUUID) CannotDetachMasterStorageDomain: Illegal action: (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',)
2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528) Traceback (most recent call last): File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains self.detachSD(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD self.validateAttachedDomain(dom) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 542, in validateAttachedDomain self.validatePoolSD(dom.sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 535, in validatePoolSD raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID) StorageDomainNotMemberOfPool: Domain is not member in pool: u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922'
2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error (task:870) Traceback (most recent call last): File "/usr/share/vdsm/storage/task.py", line 877, in _run return fn(*args, **kargs) File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper res = f(*args, **kwargs) File "/usr/share/vdsm/storage/hsm.py", line 959, in createStoragePool leaseParams) File "/usr/share/vdsm/storage/sp.py", line 652, in create self.attachSD(sdUUID) File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper return method(self, *args, **kwargs) File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD dom.acquireHostId(self.id) File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId self._manifest.acquireHostId(hostId, async) File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId self._domainLock.acquireHostId(hostId, async) File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId raise se.AcquireHostIdFailure(self._sdUUID, e) AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))", 'code': 661}} (dispatcher:77)

The gluster volume prepared for the engine storage is online and no split brain is reported. I don't understand what needs to be done to overcome this. Any idea will be appreciated.

Thank you,
Alex
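A quick way to confirm what sector sizes the disks actually report (the device name /dev/sda below is only an example):

blockdev --getss --getpbsz /dev/sda   # logical and physical sector size
cat /sys/block/sda/queue/logical_block_size
cat /sys/block/sda/queue/physical_block_size

A 512e drive reports 512 logical / 4096 physical, while a 4K-native drive reports 4096 for both; sanlock at this point appears to only support 512-byte logical sectors, which is consistent with the add_lockspace "Invalid argument" failure in sanlock.log above.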

Hi Alex,

I saw a bug that might be related to the issue you encountered: https://bugzilla.redhat.com/show_bug.cgi?id=1386443

Sahina, maybe you have any advice? Do you think that BZ 1386443 is related?

Regards,
Maor

On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi <rightkicktech@gmail.com> wrote:

Hi Maor,

My disks have a 4K block size, and from that bug it seems that a gluster replica needs a 512B block size. Is there a way to make gluster work with 4K drives?

Thank you!

On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk@redhat.com> wrote:

I clean installed everything and ran into the same issue. I then ran gdeploy and hit the same error when deploying the engine. It seems that gluster (?) doesn't like 4K sector drives; I am not sure if it has to do with alignment. The weird thing is that the gluster volumes are all OK, replicating normally, and no split brain is reported.

The solution to the mentioned bug (1386443 <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to format with a 512-byte sector size, which in my case is not an option:

mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
illegal sector size 512; hw sector is 4096

Is there any workaround to address this?

Thanx,
Alex

On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
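For context, mkfs.xfs refuses -s size=512 here because the device advertises 4096-byte logical sectors and XFS cannot use a sector size smaller than that. A minimal way to see what the LV and an existing brick report, assuming example names (/dev/gluster/engine for the LV, /gluster_bricks/engine for the brick mount point):

lsblk -o NAME,LOG-SEC,PHY-SEC /dev/gluster/engine   # logical/physical sector size of the LV
xfs_info /gluster_bricks/engine | grep sectsz       # sector size the brick filesystem was created with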

On Sun, Jun 4, 2017 at 8:51 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Adding Sahina to the thread

Can we have the gluster mount logs and brick logs to check if it's the same issue?

On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
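For reference, on a default Gluster installation the client (mount) logs and the brick logs usually live under /var/log/glusterfs/; the exact file names depend on the mount point and brick path, so the patterns below are only examples:

ls /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log   # FUSE mount log for the engine volume
ls /var/log/glusterfs/bricks/                                # one log file per brick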

Hi Sahina,

Attached are the logs. Let me know if anything else is needed.

I have 5 disks (with 4K physical sectors) in RAID 5. The RAID has a 64K stripe size at the moment. I have prepared the storage as below:

pvcreate --dataalignment 256K /dev/sda4
vgcreate --physicalextentsize 256K gluster /dev/sda4
lvcreate -n engine --size 120G gluster
mkfs.xfs -f -i size=512 /dev/gluster/engine

Thanx,
Alex

On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sabose@redhat.com> wrote:
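As a sanity check on the numbers above: with 5 disks in RAID 5 there are 4 data disks, so the full stripe is 4 x 64K = 256K, which matches the --dataalignment and --physicalextentsize values used. A rough way to verify the alignment and sector sizes end to end, assuming the brick is mounted at /gluster_bricks/engine (example path):

pvs -o +pe_start /dev/sda4                       # first extent should start at a multiple of 256K
blockdev --getss --getpbsz /dev/gluster/engine   # sector sizes the LV inherits from the RAID volume
xfs_info /gluster_bricks/engine                  # sectsz= shows the sector size XFS was created with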
Can we have the gluster mount logs and brick logs to check if it's the same issue?
On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
I clean installed everything and ran into the same. I then ran gdeploy and encountered the same issue when deploying engine. Seems that gluster (?) doesn't like 4K sector drives. I am not sure if it has to do with alignment. The weird thing is that gluster volumes are all ok, replicating normally and no split brain is reported.
The solution to the mentioned bug (1386443 <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to format with 512 sector size, which for my case is not an option:
mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine illegal sector size 512; hw sector is 4096
Is there any workaround to address this?
Thanx, Alex
On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Maor,
My disk are of 4K block size and from this bug seems that gluster replica needs 512B block size. Is there a way to make gluster function with 4K drives?
Thank you!
On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk@redhat.com> wrote:
Hi Alex,
I saw a bug that might be related to the issue you encountered at https://bugzilla.redhat.com/show_bug.cgi?id=1386443
Sahina, maybe you have any advise? Do you think that BZ1386443is related?
Regards, Maor

This seems like a case of O_DIRECT reads and writes gone wrong, judging by the 'Invalid argument' errors.
The two operations that have failed on gluster bricks are:
[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv] 0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]
But then, both the write and the read have 512-byte-aligned offset, size and buf address (which is correct).
Are you saying you don't see this issue with 4K block-size?
-Krutika
On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Sahina,
Attached are the logs. Let me know if something else is needed.
I have 5 disks (with 4K physical sectors) in RAID5. The RAID has a 64K stripe size at the moment. I have prepared the storage as below:
pvcreate --dataalignment 256K /dev/sda4
vgcreate --physicalextentsize 256K gluster /dev/sda4
lvcreate -n engine --size 120G gluster
mkfs.xfs -f -i size=512 /dev/gluster/engine
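(For reference only: a sketch of how the sector size and RAID geometry could be made explicit at mkfs time, assuming the 64K stripe unit spans the 4 data disks of the RAID5, followed by a check of what the brick filesystem actually reports. The brick mount point below is just an example path.)
mkfs.xfs -f -i size=512 -s size=4096 -d su=64k,sw=4 /dev/gluster/engine
xfs_info /gluster/brick1/engine | grep sectsz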
Thanx, Alex
On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sabose@redhat.com> wrote:
Can we have the gluster mount logs and brick logs to check if it's the same issue?
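(On a typical setup these live under /var/log/glusterfs/; the exact file names depend on the mount path and brick path, roughly along the lines of:)
ls /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log
ls /var/log/glusterfs/bricks/*.log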

Hi Krutika,
I am saying that I am facing this issue with 4K drives. I never encountered this issue with 512-byte drives.
Alex

The question that arises is what is needed to make gluster aware of the 4K physical sectors presented to it (the logical sector is also 4K). The offset (127488) in the log does not appear to be 4K-aligned: 127488 = 249 * 512, while 127488 / 4096 = 31.125, so the read is aligned to 512 bytes only.
Alex

Also, when testing with dd I get the following:
Testing on the gluster mount:
dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img oflag=direct bs=512 count=1
dd: error writing '/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img': Transport endpoint is not connected
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00336755 s, 0.0 kB/s
Testing on the /root directory (XFS):
dd if=/dev/zero of=/test2.img oflag=direct bs=512 count=1
dd: error writing '/test2.img': Invalid argument
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000321239 s, 0.0 kB/s
Seems that gluster is trying to do the same and fails.

OK. So for the 'Transport endpoint is not connected' issue, could you share the mount and brick logs?
Hmmm... 'Invalid argument' error even on the root partition. What if you change bs to 4096 and run?
The logs I showed in my earlier mail show that gluster is merely returning the error it got from the disk file system where the brick is hosted. But you're right about the fact that the offset 127488 is not 4K-aligned.
If the dd on /root worked for you with bs=4096, could you try the same directly on the gluster mount point on a dummy file and capture the strace output of dd? You can perhaps reuse your existing gluster volume by mounting it at another location and doing the dd. Here's what you need to execute:
strace -ff -T -p <pid-of-mount-process> -o <path-to-the-file-where-you-want-the-output-saved>
FWIW, here's something I found in man(2) open:
"Under Linux 2.4, transfer sizes, and the alignment of the user buffer and the file offset must all be multiples of the logical block size of the filesystem. Since Linux 2.6.0, alignment to the logical block size of the underlying storage (typically 512 bytes) suffices. The logical block size can be determined using the ioctl(2) BLKSSZGET operation or from the shell using the command: blockdev --getss"
-Krutika
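(A concrete run of the above might look like this; the temporary mount point and PID are purely illustrative:)
mount -t glusterfs 10.100.100.1:/engine /mnt/enginetest
pgrep -f 'glusterfs.*enginetest'    # note the client PID, e.g. 12345
strace -ff -T -p 12345 -o /tmp/dd-trace &
dd if=/dev/zero of=/mnt/enginetest/dummy oflag=direct bs=4096 count=1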
Also when testing with dd i get the following:
*Testing on the gluster mount: * dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img oflag=direct bs=512 count=1 dd: error writing β/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.imgβ: *Transport endpoint is not connected* 1+0 records in 0+0 records out 0 bytes (0 B) copied, 0.00336755 s, 0.0 kB/s
*Testing on the /root directory (XFS): * dd if=/dev/zero of=/test2.img oflag=direct bs=512 count=1 dd: error writing β/test2.imgβ:* Invalid argument* 1+0 records in 0+0 records out 0 bytes (0 B) copied, 0.000321239 s, 0.0 kB/s
Seems that the gluster is trying to do the same and fails.
On Mon, Jun 5, 2017 at 10:10 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
The question that rises is what is needed to make gluster aware of the 4K physical sectors presented to it (the logical sector is also 4K). The offset (127488) at the log does not seem aligned at 4K.
Alex
On Mon, Jun 5, 2017 at 2:47 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Krutika,
I am saying that I am facing this issue with 4k drives. I never encountered this issue with 512 drives.
Alex
On Jun 5, 2017 14:26, "Krutika Dhananjay" <kdhananj@redhat.com> wrote:
This seems like a case of O_DIRECT reads and writes gone wrong, judging by the 'Invalid argument' errors.
The two operations that have failed on gluster bricks are:
[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument] [2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv] 0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]
But then, both the write and the read have 512byte-aligned offset, size and buf address (which is correct).
Are you saying you don't see this issue with 4K block-size?
-Krutika
On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Sahina,
Attached are the logs. Let me know if sth else is needed.
I have 5 disks (with 4K physical sector) in RAID5. The RAID has 64K stripe size at the moment. I have prepared the storage as below:
pvcreate --dataalignment 256K /dev/sda4
vgcreate --physicalextentsize 256K gluster /dev/sda4
lvcreate -n engine --size 120G gluster
mkfs.xfs -f -i size=512 /dev/gluster/engine
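For reference, the 256K data alignment above corresponds to one full RAID5 stripe: 5 disks in RAID5 give 4 data disks per stripe, and 4 x 64K = 256K. If you also want XFS to be created explicitly with the native 4K sector and the RAID geometry, a variant along these lines could be tried (just a sketch, not what was run above; su/sw must match the actual controller settings, and the mount point is only an example):

mkfs.xfs -f -i size=512 -s size=4096 -d su=64k,sw=4 /dev/gluster/engine
xfs_info /gluster/engine   # after mounting, should report sectsz=4096 plus the chosen sunit/swidth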
Thanx, Alex
On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sabose@redhat.com> wrote:
Can we have the gluster mount logs and brick logs to check if it's the same issue?
On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkicktech@gmail.com > wrote:
> I clean installed everything and ran into the same. > I then ran gdeploy and encountered the same issue when deploying > engine. > Seems that gluster (?) doesn't like 4K sector drives. I am not sure > if it has to do with alignment. The weird thing is that gluster volumes are > all ok, replicating normally and no split brain is reported. > > The solution to the mentioned bug (1386443 > <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to > format with 512 sector size, which for my case is not an option: > > mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine > illegal sector size 512; hw sector is 4096 > > Is there any workaround to address this? > > Thanx, > Alex > > > On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkicktech@gmail.com > > wrote: > >> Hi Maor, >> >> My disk are of 4K block size and from this bug seems that gluster >> replica needs 512B block size. >> Is there a way to make gluster function with 4K drives? >> >> Thank you! >> >> On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk@redhat.com> >> wrote: >> >>> Hi Alex, >>> >>> I saw a bug that might be related to the issue you encountered at >>> https://bugzilla.redhat.com/show_bug.cgi?id=1386443 >>> >>> Sahina, maybe you have any advise? Do you think that BZ1386443is >>> related? >>> >>> Regards, >>> Maor >>> >>> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi < >>> rightkicktech@gmail.com> wrote: >>> > Hi All, >>> > >>> > I have installed successfully several times oVirt (version 4.1) >>> with 3 nodes >>> > on top glusterfs. >>> > >>> > This time, when trying to configure the same setup, I am facing >>> the >>> > following issue which doesn't seem to go away. During >>> installation i get the >>> > error: >>> > >>> > Failed to execute stage 'Misc configuration': Cannot acquire >>> host id: >>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>> 'Sanlock >>> > lockspace add failure', 'Invalid argument')) >>> > >>> > The only different in this setup is that instead of standard >>> partitioning i >>> > have GPT partitioning and the disks have 4K block size instead >>> of 512. 
>>> > >>> > The /var/log/sanlock.log has the following lines: >>> > >>> > 2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace >>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/m >>> nt/_var_lib_ovirt-hosted-engin-setup_tmptjkIDI/ba6bd862-c2b8 >>> -46e7-b2c8-91e4a5bb2047/dom_md/ids:0 >>> > 2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource >>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/m >>> nt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b >>> 8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 >>> > for 2,9,23040 >>> > 2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace >>> > a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/m >>> nt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8 >>> b4d5e5e922/dom_md/ids:0 >>> > 2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD >>> > 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 match >>> res >>> > 2017-06-03 19:21:36+0200 23471 [23522]: read_sectors >>> delta_leader offset >>> > 127488 rv -22 >>> > /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e >>> 7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids >>> > 2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 >>> > 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune >>> > 2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail >>> result -22 >>> > >>> > And /var/log/vdsm/vdsm.log says: >>> > >>> > 2017-06-03 19:19:38,176+0200 WARN (jsonrpc/3) >>> > [storage.StorageServer.MountConnection] Using user specified >>> > backup-volfile-servers option (storageServer:253) >>> > 2017-06-03 19:21:12,379+0200 WARN (periodic/1) [throttled] MOM >>> not >>> > available. (throttledlog:105) >>> > 2017-06-03 19:21:12,380+0200 WARN (periodic/1) [throttled] MOM >>> not >>> > available, KSM stats will be missing. 
(throttledlog:105) >>> > 2017-06-03 19:21:14,714+0200 WARN (jsonrpc/1) >>> > [storage.StorageServer.MountConnection] Using user specified >>> > backup-volfile-servers option (storageServer:253) >>> > 2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) >>> [storage.initSANLock] Cannot >>> > initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5 >>> e5e922 >>> > (clusterlock:238) >>> > Traceback (most recent call last): >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/clusterlock.py", line >>> > 234, in initSANLock >>> > sanlock.init_lockspace(sdUUID, idsPath) >>> > SanlockException: (107, 'Sanlock lockspace init failure', >>> 'Transport >>> > endpoint is not connected') >>> > 2017-06-03 19:21:15,515+0200 WARN (jsonrpc/4) >>> > [storage.StorageDomainManifest] lease did not initialize >>> successfully >>> > (sd:557) >>> > Traceback (most recent call last): >>> > File "/usr/share/vdsm/storage/sd.py", line 552, in >>> initDomainLock >>> > self._domainLock.initLock(self.getDomainLease()) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/clusterlock.py", line >>> > 271, in initLock >>> > initSANLock(self._sdUUID, self._idsPath, lease) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/clusterlock.py", line >>> > 239, in initSANLock >>> > raise se.ClusterLockInitError() >>> > ClusterLockInitError: Could not initialize cluster lock: () >>> > 2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) >>> [storage.StoragePool] Create >>> > pool hosted_datacenter canceled (sp:655) >>> > Traceback (most recent call last): >>> > File "/usr/share/vdsm/storage/sp.py", line 652, in create >>> > self.attachSD(sdUUID) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/securable.py", line >>> > 79, in wrapper >>> > return method(self, *args, **kwargs) >>> > File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD >>> > dom.acquireHostId(self.id) >>> > File "/usr/share/vdsm/storage/sd.py", line 790, in >>> acquireHostId >>> > self._manifest.acquireHostId(hostId, async) >>> > File "/usr/share/vdsm/storage/sd.py", line 449, in >>> acquireHostId >>> > self._domainLock.acquireHostId(hostId, async) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/clusterlock.py", line >>> > 297, in acquireHostId >>> > raise se.AcquireHostIdFailure(self._sdUUID, e) >>> > AcquireHostIdFailure: Cannot acquire host id: >>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>> 'Sanlock >>> > lockspace add failure', 'Invalid argument')) >>> > 2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) >>> [storage.StoragePool] Domain >>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD >>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528) >>> > Traceback (most recent call last): >>> > File "/usr/share/vdsm/storage/sp.py", line 525, in >>> __cleanupDomains >>> > self.detachSD(sdUUID) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/securable.py", line >>> > 79, in wrapper >>> > return method(self, *args, **kwargs) >>> > File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD >>> > raise se.CannotDetachMasterStorageDomain(sdUUID) >>> > CannotDetachMasterStorageDomain: Illegal action: >>> > (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',) >>> > 2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) >>> [storage.StoragePool] Domain >>> > a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD >>> > ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. 
(sp:528) >>> > Traceback (most recent call last): >>> > File "/usr/share/vdsm/storage/sp.py", line 525, in >>> __cleanupDomains >>> > self.detachSD(sdUUID) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/securable.py", line >>> > 79, in wrapper >>> > return method(self, *args, **kwargs) >>> > File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD >>> > self.validateAttachedDomain(dom) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/securable.py", line >>> > 79, in wrapper >>> > return method(self, *args, **kwargs) >>> > File "/usr/share/vdsm/storage/sp.py", line 542, in >>> validateAttachedDomain >>> > self.validatePoolSD(dom.sdUUID) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/securable.py", line >>> > 79, in wrapper >>> > return method(self, *args, **kwargs) >>> > File "/usr/share/vdsm/storage/sp.py", line 535, in >>> validatePoolSD >>> > raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID) >>> > StorageDomainNotMemberOfPool: Domain is not member in pool: >>> > u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, >>> > domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922' >>> > 2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) >>> [storage.TaskManager.Task] >>> > (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error >>> (task:870) >>> > Traceback (most recent call last): >>> > File "/usr/share/vdsm/storage/task.py", line 877, in _run >>> > return fn(*args, **kargs) >>> > File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", >>> line 52, in >>> > wrapper >>> > res = f(*args, **kwargs) >>> > File "/usr/share/vdsm/storage/hsm.py", line 959, in >>> createStoragePool >>> > leaseParams) >>> > File "/usr/share/vdsm/storage/sp.py", line 652, in create >>> > self.attachSD(sdUUID) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/securable.py", line >>> > 79, in wrapper >>> > return method(self, *args, **kwargs) >>> > File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD >>> > dom.acquireHostId(self.id) >>> > File "/usr/share/vdsm/storage/sd.py", line 790, in >>> acquireHostId >>> > self._manifest.acquireHostId(hostId, async) >>> > File "/usr/share/vdsm/storage/sd.py", line 449, in >>> acquireHostId >>> > self._domainLock.acquireHostId(hostId, async) >>> > File "/usr/lib/python2.7/site-packa >>> ges/vdsm/storage/clusterlock.py", line >>> > 297, in acquireHostId >>> > raise se.AcquireHostIdFailure(self._sdUUID, e) >>> > AcquireHostIdFailure: Cannot acquire host id: >>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>> 'Sanlock >>> > lockspace add failure', 'Invalid argument')) >>> > 2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) >>> [storage.Dispatcher] >>> > {'status': {'message': "Cannot acquire host id: >>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, >>> 'Sanlock >>> > lockspace add failure', 'Invalid argument'))", 'code': 661}} >>> (dispatcher:77) >>> > >>> > The gluster volume prepared for engine storage is online and no >>> split brain >>> > is reported. I don't understand what needs to be done to >>> overcome this. Any >>> > idea will be appreciated. >>> > >>> > Thank you, >>> > Alex >>> > >>> > _______________________________________________ >>> > Users mailing list >>> > Users@ovirt.org >>> > http://lists.ovirt.org/mailman/listinfo/users >>> > >>> >> >> > > _______________________________________________ > Users mailing list > Users@ovirt.org > http://lists.ovirt.org/mailman/listinfo/users > >
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

I stand corrected. Just realised the strace command I gave was wrong. Here's what you would actually need to execute:

strace -y -ff -o <path-where-you-want-your-output-saved> <dd command here>

-Krutika

On Tue, Jun 6, 2017 at 3:20 PM, Krutika Dhananjay <kdhananj@redhat.com> wrote:
OK.
So for the 'Transport endpoint is not connected' issue, could you share the mount and brick logs?
Hmmm.. 'Invalid argument' error even on the root partition. What if you change bs to 4096 and run?
The logs I showed in my earlier mail show that gluster is merely returning the error it got from the disk file system where the brick is hosted. But you're right about the fact that the offset 127488 is not 4K-aligned.
If the dd on /root worked for you with bs=4096, could you try the same directly on the gluster mount point on a dummy file and capture the strace output of dd? You can perhaps reuse your existing gluster volume by mounting it at another location and doing the dd. Here's what you need to execute:
strace -ff -T -p <pid-of-mount-process> -o <path-to-the-file-where-you-want-the-output-saved>
FWIW, here's something I found in man(2) open:
*Under Linux 2.4, transfer sizes, and the alignment of the user buffer and the file offset must all be multiples of the logical block size of the filesystem. Since Linux 2.6.0, alignment to the logical block size of the underlying storage (typically 512 bytes) suffices. The logical block size can be determined using the ioctl(2) BLKSSZGET operation or from the shell using the command: blockdev --getss*
-Krutika
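To relate the excerpt above to this setup, the logical block size can be checked at every layer before re-running dd (device names are the ones mentioned earlier in the thread; the brick mount point is only an example):

blockdev --getss /dev/sda              # logical sector size of the raw disk
blockdev --getss /dev/gluster/engine   # logical sector size exposed by the brick LV
xfs_info /path/to/brick | grep sectsz  # sector size the brick XFS was formatted with

With O_DIRECT, a dd with bs=512 can only work if every layer reports 512; if the brick filesystem sits on a 4096-byte logical sector, the 512-byte write is rejected with EINVAL by the local filesystem and gluster just passes that error back to the client, as noted above.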
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

Hi Krutika,

My comments inline. Also attached the strace of:

strace -y -ff -o /root/512-trace-on-root.log dd if=/dev/zero of=/mnt/test2.img oflag=direct bs=512 count=1

and of:

strace -y -ff -o /root/4096-trace-on-root.log dd if=/dev/zero of=/mnt/test2.img oflag=direct bs=4096 count=16

I have mounted gluster volume at /mnt. The dd with bs=4096 is successful.

The gluster mount log gives only the following:

[2017-06-06 12:04:54.102576] W [MSGID: 114031] [client-rpc-fops.c:854:client3_3_writev_cbk] 0-engine-client-0: remote operation failed [Invalid argument]
[2017-06-06 12:04:54.102591] W [MSGID: 114031] [client-rpc-fops.c:854:client3_3_writev_cbk] 0-engine-client-1: remote operation failed [Invalid argument]
[2017-06-06 12:04:54.103355] W [fuse-bridge.c:2312:fuse_writev_cbk] 0-glusterfs-fuse: 205: WRITE => -1 gfid=075ab3a5-0274-4f07-a075-2748c3b4d394 fd=0x7faf1d08706c (Transport endpoint is not connected)

The gluster brick log gives:

[2017-06-06 12:07:03.793080] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-06 12:07:03.793172] E [MSGID: 115067] [server-rpc-fops.c:1346:server_writev_cbk] 0-engine-server: 291: WRITEV 0 (075ab3a5-0274-4f07-a075-2748c3b4d394) ==> (Invalid argument) [Invalid argument]

On Tue, Jun 6, 2017 at 12:50 PM, Krutika Dhananjay <kdhananj@redhat.com> wrote:
OK.
So for the 'Transport endpoint is not connected' issue, could you share the mount and brick logs?
Hmmm.. 'Invalid argument' error even on the root partition. What if you change bs to 4096 and run?
If I use bs=4096 the dd is successful on /root and on the gluster-mounted volume.
The logs I showed in my earlier mail show that gluster is merely returning the error it got from the disk file system where the brick is hosted. But you're right about the fact that the offset 127488 is not 4K-aligned.
If the dd on /root worked for you with bs=4096, could you try the same directly on the gluster mount point on a dummy file and capture the strace output of dd? You can perhaps reuse your existing gluster volume by mounting it at another location and doing the dd. Here's what you need to execute:
strace -ff -T -p <pid-of-mount-process> -o <path-to-the-file-where-you-want-the-output-saved>
FWIW, here's something I found in man(2) open:
*Under Linux 2.4, transfer sizes, and the alignment of the user buffer and the file offset must all be multiples of the logical block size of the filesystem. Since Linux 2.6.0, alignment to the logical block size of the underlying storage (typically 512 bytes) suffices. The logical block size can be determined using the ioctl(2) BLKSSZGET operation or from the shell using the command: blockdev --getss*
Please note also that the physical disks are of 4K sector size (native). Thus the OS has a 4096/4096 logical/physical sector size.

[root@v0 ~]# blockdev --getss /dev/sda
4096
[root@v0 ~]# blockdev --getpbsz /dev/sda
4096
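One way to see this for the whole stack at once (disk, partitions and the LVs the bricks sit on), assuming a reasonably recent util-linux:

lsblk -o NAME,LOG-SEC,PHY-SEC /dev/sda

If LOG-SEC is 4096 for the LV backing the brick as well, then any 512-byte O_DIRECT request will be refused there with 'Invalid argument', matching the brick log.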
-Krutika
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

Also I see this article from Red Hat that discusses whether 4K sectors are supported, but I am not able to read it as I don't have a subscription: https://access.redhat.com/solutions/56494

It's hard to believe that 4K drives have not been used by others in oVirt deployments.

Alex

On Tue, Jun 6, 2017 at 3:18 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Krutika,
My comments inline.
Also attached the strace of:
strace -y -ff -o /root/512-trace-on-root.log dd if=/dev/zero of=/mnt/test2.img oflag=direct bs=512 count=1
and of:
strace -y -ff -o /root/4096-trace-on-root.log dd if=/dev/zero of=/mnt/test2.img oflag=direct bs=4096 count=16
I have mounted gluster volume at /mnt. The dd with bs=4096 is successful.
The gluster mount log gives only the following:

[2017-06-06 12:04:54.102576] W [MSGID: 114031] [client-rpc-fops.c:854:client3_3_writev_cbk] 0-engine-client-0: remote operation failed [Invalid argument]
[2017-06-06 12:04:54.102591] W [MSGID: 114031] [client-rpc-fops.c:854:client3_3_writev_cbk] 0-engine-client-1: remote operation failed [Invalid argument]
[2017-06-06 12:04:54.103355] W [fuse-bridge.c:2312:fuse_writev_cbk] 0-glusterfs-fuse: 205: WRITE => -1 gfid=075ab3a5-0274-4f07-a075-2748c3b4d394 fd=0x7faf1d08706c (Transport endpoint is not connected)

The gluster brick log gives:

[2017-06-06 12:07:03.793080] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-06 12:07:03.793172] E [MSGID: 115067] [server-rpc-fops.c:1346:server_writev_cbk] 0-engine-server: 291: WRITEV 0 (075ab3a5-0274-4f07-a075-2748c3b4d394) ==> (Invalid argument) [Invalid argument]
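To pull the failing call out of those per-process trace files quickly (strace -ff appends the PID to the -o prefix, so the names below are the expected pattern rather than exact filenames):

grep -H '= -1 E' /root/512-trace-on-root.log.*

The interesting line should be the write on test2.img returning -1 ENOTCONN ('Transport endpoint is not connected') in the bs=512 trace, matching the fuse_writev_cbk error in the mount log, while the bs=4096 trace should show the write succeeding.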
On Tue, Jun 6, 2017 at 12:50 PM, Krutika Dhananjay <kdhananj@redhat.com> wrote:
OK.
So for the 'Transport endpoint is not connected' issue, could you share the mount and brick logs?
Hmmm.. 'Invalid argument' error even on the root partition. What if you change bs to 4096 and run?
If I use bs=4096 the dd is successful on /root and on the gluster-mounted volume.
The logs I showed in my earlier mail show that gluster is merely returning the error it got from the disk file system where the brick is hosted. But you're right about the fact that the offset 127488 is not 4K-aligned.
If the dd on /root worked for you with bs=4096, could you try the same directly on the gluster mount point on a dummy file and capture the strace output of dd? You can perhaps reuse your existing gluster volume by mounting it at another location and doing the dd. Here's what you need to execute:
strace -ff -T -p <pid-of-mount-process> -o <path-to-the-file-where-you-want-the-output-saved>
FWIW, here's something I found in man(2) open:
*Under Linux 2.4, transfer sizes, and the alignment of the user buffer and the file offset must all be multiples of the logical block size of the filesystem. Since Linux 2.6.0, alignment to the logical block size of the underlying storage (typically 512 bytes) suffices. The logical block size can be determined using the ioctl(2) BLKSSZGET operation or from the shell using the command: blockdev --getss*
Please note also that the physical disks are of 4K sector size (native). Thus the OS has a 4096/4096 logical/physical sector size.

[root@v0 ~]# blockdev --getss /dev/sda
4096
[root@v0 ~]# blockdev --getpbsz /dev/sda
4096
-Krutika
On Tue, Jun 6, 2017 at 1:18 AM, Abi Askushi <rightkicktech@gmail.com> wrote:
Also when testing with dd i get the following:
*Testing on the gluster mount:*
dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img oflag=direct bs=512 count=1
dd: error writing '/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img': *Transport endpoint is not connected*
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00336755 s, 0.0 kB/s

*Testing on the /root directory (XFS):*
dd if=/dev/zero of=/test2.img oflag=direct bs=512 count=1
dd: error writing '/test2.img': *Invalid argument*
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000321239 s, 0.0 kB/s
Seems that the gluster is trying to do the same and fails.
On Mon, Jun 5, 2017 at 10:10 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
The question that arises is what is needed to make gluster aware of the 4K physical sectors presented to it (the logical sector is also 4K). The offset (127488) in the log does not seem aligned at 4K.
Alex
On Mon, Jun 5, 2017 at 2:47 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Krutika,
I am saying that I am facing this issue with 4k drives. I never encountered this issue with 512 drives.
Alex
On Jun 5, 2017 14:26, "Krutika Dhananjay" <kdhananj@redhat.com> wrote:
This seems like a case of O_DIRECT reads and writes gone wrong, judging by the 'Invalid argument' errors.
The two operations that have failed on gluster bricks are:
[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument] [2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv] 0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]
But then, both the write and the read have 512byte-aligned offset, size and buf address (which is correct).
Are you saying you don't see this issue with 4K block-size?
-Krutika
On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
> Hi Sahina, > > Attached are the logs. Let me know if sth else is needed. > > I have 5 disks (with 4K physical sector) in RAID5. The RAID has 64K > stripe size at the moment. > I have prepared the storage as below: > > pvcreate --dataalignment 256K /dev/sda4 > vgcreate --physicalextentsize 256K gluster /dev/sda4 > > lvcreate -n engine --size 120G gluster > mkfs.xfs -f -i size=512 /dev/gluster/engine > > Thanx, > Alex > > On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sabose@redhat.com> > wrote: > >> Can we have the gluster mount logs and brick logs to check if it's >> the same issue? >> >> On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi < >> rightkicktech@gmail.com> wrote: >> >>> I clean installed everything and ran into the same. >>> I then ran gdeploy and encountered the same issue when deploying >>> engine. >>> Seems that gluster (?) doesn't like 4K sector drives. I am not >>> sure if it has to do with alignment. The weird thing is that gluster >>> volumes are all ok, replicating normally and no split brain is reported. >>> >>> The solution to the mentioned bug (1386443 >>> <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to >>> format with 512 sector size, which for my case is not an option: >>> >>> mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine >>> illegal sector size 512; hw sector is 4096 >>> >>> Is there any workaround to address this? >>> >>> Thanx, >>> Alex >>> >>> >>> On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi < >>> rightkicktech@gmail.com> wrote: >>> >>>> Hi Maor, >>>> >>>> My disk are of 4K block size and from this bug seems that gluster >>>> replica needs 512B block size. >>>> Is there a way to make gluster function with 4K drives? >>>> >>>> Thank you! >>>> >>>> On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk@redhat.com >>>> > wrote: >>>> >>>>> Hi Alex, >>>>> >>>>> I saw a bug that might be related to the issue you encountered at >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1386443 >>>>> >>>>> Sahina, maybe you have any advise? Do you think that BZ1386443is >>>>> related? >>>>> >>>>> Regards, >>>>> Maor >>>>> >>>>> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi < >>>>> rightkicktech@gmail.com> wrote: >>>>> > Hi All, >>>>> > >>>>> > I have installed successfully several times oVirt (version >>>>> 4.1) with 3 nodes >>>>> > on top glusterfs. >>>>> > >>>>> > This time, when trying to configure the same setup, I am >>>>> facing the >>>>> > following issue which doesn't seem to go away. During >>>>> installation i get the >>>>> > error: >>>>> > >>>>> > Failed to execute stage 'Misc configuration': Cannot acquire >>>>> host id: >>>>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', >>>>> SanlockException(22, 'Sanlock >>>>> > lockspace add failure', 'Invalid argument')) >>>>> > >>>>> > The only different in this setup is that instead of standard >>>>> partitioning i >>>>> > have GPT partitioning and the disks have 4K block size instead >>>>> of 512. 

Just to note that the logs mentioned below are from the dd runs with bs=512, which were the failing ones. Attached are the full logs from the mount and the brick. Alex On Tue, Jun 6, 2017 at 3:18 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Krutika,
My comments inline.
Also attached is the strace of: strace -y -ff -o /root/512-trace-on-root.log dd if=/dev/zero of=/mnt/test2.img oflag=direct bs=512 count=1
and of: strace -y -ff -o /root/4096-trace-on-root.log dd if=/dev/zero of=/mnt/test2.img oflag=direct bs=4096 count=16
I have mounted the gluster volume at /mnt. The dd with bs=4096 is successful.
The gluster mount log gives only the following:
[2017-06-06 12:04:54.102576] W [MSGID: 114031] [client-rpc-fops.c:854:client3_3_writev_cbk] 0-engine-client-0: remote operation failed [Invalid argument]
[2017-06-06 12:04:54.102591] W [MSGID: 114031] [client-rpc-fops.c:854:client3_3_writev_cbk] 0-engine-client-1: remote operation failed [Invalid argument]
[2017-06-06 12:04:54.103355] W [fuse-bridge.c:2312:fuse_writev_cbk] 0-glusterfs-fuse: 205: WRITE => -1 gfid=075ab3a5-0274-4f07-a075-2748c3b4d394 fd=0x7faf1d08706c (Transport endpoint is not connected)
The gluster brick log gives:
[2017-06-06 12:07:03.793080] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-06 12:07:03.793172] E [MSGID: 115067] [server-rpc-fops.c:1346:server_writev_cbk] 0-engine-server: 291: WRITEV 0 (075ab3a5-0274-4f07-a075-2748c3b4d394) ==> (Invalid argument) [Invalid argument]
On Tue, Jun 6, 2017 at 12:50 PM, Krutika Dhananjay <kdhananj@redhat.com> wrote:
OK.
So for the 'Transport endpoint is not connected' issue, could you share the mount and brick logs?
Hmmm.. 'Invalid argument' error even on the root partition. What if you change bs to 4096 and run?
If I use bs=4096 the dd is successful on /root and at gluster mounted volume.
The logs I showed in my earlier mail show that gluster is merely returning the error it got from the disk file system where the brick is hosted. But you're right about the fact that the offset 127488 is not 4K-aligned.
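(For reference, that misalignment is plain arithmetic and easy to verify; the short Python check below assumes nothing about vdsm or gluster internals, it only looks at the offset reported in the sanlock/brick logs.)

# 127488 is a multiple of 512 but not of 4096, so storage that enforces
# 4K-aligned direct IO rejects the read with EINVAL (Invalid argument).
offset = 127488
for block_size in (512, 4096):
    remainder = offset % block_size
    print("offset %d %% %d = %d (%s)"
          % (offset, block_size, remainder,
             "aligned" if remainder == 0 else "not aligned"))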
If the dd on /root worked for you with bs=4096, could you try the same directly on gluster mount point on a dummy file and capture the strace output of dd? You can perhaps reuse your existing gluster volume by mounting it at another location and doing the dd. Here's what you need to execute:
strace -ff -T -p <pid-of-mount-process> -o <path-to-the-file-where-you-want-the-output-saved>
FWIW, here's something I found in man(2) open:
Under Linux 2.4, transfer sizes, and the alignment of the user buffer and the file offset must all be multiples of the logical block size of the filesystem. Since Linux 2.6.0, alignment to the logical block size of the underlying storage (typically 512 bytes) suffices. The logical block size can be determined using the ioctl(2) BLKSSZGET operation or from the shell using the command: blockdev --getss
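(As a programmatic complement to blockdev, the same values can be read via ioctl; this is a minimal Python sketch, and the ioctl numbers are the usual <linux/fs.h> values, so treat them as an assumption to double-check on your kernel.)

import fcntl
import struct
import sys

BLKSSZGET = 0x1268    # logical sector size (blockdev --getss)
BLKPBSZGET = 0x127B   # physical sector size (blockdev --getpbsz)

def sector_sizes(device):
    # Open the block device and ask the kernel for both sector sizes.
    with open(device, "rb") as dev:
        logical = struct.unpack("i", fcntl.ioctl(dev, BLKSSZGET, b"\0" * 4))[0]
        physical = struct.unpack("i", fcntl.ioctl(dev, BLKPBSZGET, b"\0" * 4))[0]
    return logical, physical

if __name__ == "__main__":
    dev = sys.argv[1] if len(sys.argv) > 1 else "/dev/sda"
    logical, physical = sector_sizes(dev)
    print("%s: logical=%d physical=%d" % (dev, logical, physical))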
Please note also that the physical disks are of 4K sector size (native). Thus the OS reports a 4096/4096 logical/physical sector size:
[root@v0 ~]# blockdev --getss /dev/sda
4096
[root@v0 ~]# blockdev --getpbsz /dev/sda
4096
-Krutika
On Tue, Jun 6, 2017 at 1:18 AM, Abi Askushi <rightkicktech@gmail.com> wrote:
Also when testing with dd i get the following:
Testing on the gluster mount:
dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img oflag=direct bs=512 count=1
dd: error writing '/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/test2.img': Transport endpoint is not connected
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00336755 s, 0.0 kB/s
Testing on the /root directory (XFS):
dd if=/dev/zero of=/test2.img oflag=direct bs=512 count=1
dd: error writing '/test2.img': Invalid argument
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000321239 s, 0.0 kB/s
It seems that gluster is attempting the same 512-byte direct write and failing in the same way.
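(The same behaviour can be reproduced without dd; below is a minimal Python sketch using O_DIRECT, with the target path as a placeholder. On a filesystem whose logical block size is 4096, the 512-byte write is expected to fail with EINVAL while the 4096-byte write succeeds; behaviour through the gluster FUSE mount may of course differ.)

import errno
import mmap
import os

def try_direct_write(path, size):
    # mmap gives a page-aligned, zero-filled buffer, which O_DIRECT requires.
    buf = mmap.mmap(-1, size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        os.write(fd, buf)
        return "ok"
    except OSError as e:
        return "failed: %s" % errno.errorcode.get(e.errno, e.errno)
    finally:
        os.close(fd)
        buf.close()

for size in (512, 4096):
    print("bs=%d -> %s" % (size, try_direct_write("/mnt/direct-io-test.img", size)))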
On Mon, Jun 5, 2017 at 10:10 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
The question that arises is what is needed to make gluster aware of the 4K physical sectors presented to it (the logical sector size is also 4K). The offset (127488) in the log does not seem to be 4K-aligned.
Alex
On Mon, Jun 5, 2017 at 2:47 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Krutika,
I am saying that I am facing this issue with 4k drives. I never encountered this issue with 512 drives.
Alex
On Jun 5, 2017 14:26, "Krutika Dhananjay" <kdhananj@redhat.com> wrote:
This seems like a case of O_DIRECT reads and writes gone wrong, judging by the 'Invalid argument' errors.
The two operations that have failed on gluster bricks are:
[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv] 0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]
But then, both the write and the read have a 512-byte-aligned offset, size and buf address (which is correct).
Are you saying you don't see this issue with 4K block-size?
-Krutika
On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Sahina,
Attached are the logs. Let me know if sth else is needed.
I have 5 disks (with 4K physical sector) in RAID5. The RAID has a 64K stripe size at the moment.
I have prepared the storage as below:
pvcreate --dataalignment 256K /dev/sda4
vgcreate --physicalextentsize 256K gluster /dev/sda4
lvcreate -n engine --size 120G gluster
mkfs.xfs -f -i size=512 /dev/gluster/engine
Thanx,
Alex
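(As an aside, the --dataalignment value lines up with the stated RAID geometry, assuming the quoted 64K is the per-disk stripe unit; the check below is pure arithmetic and assumes nothing else about the setup.)

# RAID5 over 5 disks leaves 4 data-bearing disks per stripe, so with a 64K
# stripe unit the full stripe width is 4 * 64K = 256K, matching
# pvcreate --dataalignment 256K.
disks = 5
parity_disks = 1          # RAID5 loses one disk's worth of capacity per stripe to parity
stripe_unit_kib = 64

full_stripe_kib = (disks - parity_disks) * stripe_unit_kib
print("full stripe width: %dK -> pvcreate --dataalignment %dK"
      % (full_stripe_kib, full_stripe_kib))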

Hi Sahina, Did you have the chance to check the logs and have any idea how this may be addressed? Thanx, Alex On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sabose@redhat.com> wrote:
Can we have the gluster mount logs and brick logs to check if it's the same issue?
On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
I did a clean install of everything and ran into the same issue. I then ran gdeploy and encountered the same issue when deploying the engine. It seems that gluster (?) doesn't like 4K sector drives. I am not sure if it has to do with alignment. The weird thing is that the gluster volumes are all ok, replicating normally, and no split brain is reported.
The solution to the mentioned bug (1386443 <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to format with 512 sector size, which for my case is not an option:
mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
illegal sector size 512; hw sector is 4096
Is there any workaround to address this?
Thanx, Alex
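(A quick way to see up front that -s size=512 cannot work here is to compare it with the device's logical sector size; the Python sketch below is only illustrative, and the LV path is taken from the commands quoted earlier.)

import subprocess

device = "/dev/gluster/engine"   # placeholder for any block device

logical = int(subprocess.check_output(["blockdev", "--getss", device]))
# XFS cannot use a sector size smaller than the device's logical sector size,
# which is why -s size=512 is rejected on a 4K-native disk.
sector = max(512, logical)
print("logical sector size of %s is %d -> mkfs.xfs -f -i size=512 -s size=%d %s"
      % (device, logical, sector, device))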
On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Maor,
My disks are of 4K block size, and from this bug it seems that gluster replica needs a 512B block size. Is there a way to make gluster function with 4K drives?
Thank you!
On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk@redhat.com> wrote:
Hi Alex,
I saw a bug that might be related to the issue you encountered at https://bugzilla.redhat.com/show_bug.cgi?id=1386443
Sahina, maybe you have any advice? Do you think that BZ 1386443 is related?
Regards, Maor

Hello Alex, On Wed, Jun 7, 2017 at 11:39 AM, Abi Askushi <rightkicktech@gmail.com> wrote:
Hi Sahina,
Did you have the chance to check the logs and have any idea how this may be addressed?
It seems to be a VDSM issue, as VDSM uses direct I/O (and it actually calls dd) and assumes that the block size is 512. I see in the code that the block size is defined as a constant, so it probably can be adjusted, but I think it would be better if we ask someone who knows that part better. Anyway, could you please file a bug on that issue? Thanks in advance.
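(To illustrate the failure mode described here, and not as the actual vdsm code, a hedged Python sketch: a helper that shells out to dd with a hard-coded 512-byte block size, next to a hypothetical variant that asks the device for its logical block size first.)

import subprocess

BLOCK_SIZE = 512  # the kind of hard-coded constant that breaks on 4K-native disks

def read_first_block_fixed(path):
    # Direct IO with bs=512 fails with 'Invalid argument' when the backing
    # storage requires 4K-aligned transfers.
    return subprocess.check_output(
        ["dd", "if=" + path, "iflag=direct", "bs=%d" % BLOCK_SIZE, "count=1"])

def read_first_block_adaptive(path, device):
    # Hypothetical variant: derive the block size from the device instead.
    block_size = int(subprocess.check_output(["blockdev", "--getss", device]))
    return subprocess.check_output(
        ["dd", "if=" + path, "iflag=direct", "bs=%d" % block_size, "count=1"])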

Hi Denis, Ok, I will file a bug for this. I am not sure I will be able to provide troubleshooting info for much longer, as I have already put forward the replacement of the disks with 512-byte ones. Alex

Filed the Bug 1459855 <https://bugzilla.redhat.com/show_bug.cgi?id=1459855>. Alex
participants (5)
- Abi Askushi
- Denis Chaplygin
- Krutika Dhananjay
- Maor Lipchuk
- Sahina Bose