[ovirt-users] oVirt gluster sanlock issue

Maor Lipchuk mlipchuk at redhat.com
Sun Jun 4 19:39:25 UTC 2017


On Sun, Jun 4, 2017 at 8:51 PM, Abi Askushi <rightkicktech at gmail.com> wrote:
> I did a clean install of everything and ran into the same issue.
> I then ran gdeploy and hit the same issue when deploying the engine.
> It seems that gluster (?) doesn't like 4K sector drives. I am not sure if it
> has to do with alignment. The weird thing is that the gluster volumes are all
> ok, replicating normally, and no split brain is reported.
>
> The solution to the mentioned bug (1386443) was to format with a 512-byte
> sector size, which in my case is not an option:
>
> mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
> illegal sector size 512; hw sector is 4096
>
> Is there any workaround to address this?
>
> Thanks,
> Alex
>
>
> On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkicktech at gmail.com> wrote:
>>
>> Hi Maor,
>>
>> My disks have a 4K block size, and from this bug it seems that a gluster
>> replica needs a 512B block size.
>> Is there a way to make gluster work with 4K drives?
>>
>> Thank you!
>>
>> On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk at redhat.com> wrote:
>>>
>>> Hi Alex,
>>>
>>> I saw a bug that might be related to the issue you encountered at
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1386443
>>>
>>> Sahina, do you have any advice? Do you think that BZ1386443 is related?
>>>
>>> Regards,
>>> Maor
>>>
>>> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi <rightkicktech at gmail.com> wrote:
>>> > Hi All,
>>> >
>>> > I have successfully installed oVirt (version 4.1) several times with 3
>>> > nodes on top of glusterfs.
>>> >
>>> > This time, when trying to configure the same setup, I am facing the
>>> > following issue, which doesn't seem to go away. During installation I
>>> > get the error:
>>> >
>>> > Failed to execute stage 'Misc configuration': Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>> >
>>> > The only difference in this setup is that instead of standard
>>> > partitioning I have GPT partitioning, and the disks have a 4K block
>>> > size instead of 512.
>>> >
>>> > The /var/log/sanlock.log has the following lines:
>>> >
>>> > 2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engin-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/ids:0
>>> > 2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 for 2,9,23040
>>> > 2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids:0
>>> > 2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 match res
>>> > 2017-06-03 19:21:36+0200 23471 [23522]: read_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids
>>> > 2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune
>>> > 2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail result -22
>>> >
>>> > And /var/log/vdsm/vdsm.log says:
>>> >
>>> > 2017-06-03 19:19:38,176+0200 WARN  (jsonrpc/3) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>> > 2017-06-03 19:21:12,379+0200 WARN  (periodic/1) [throttled] MOM not available. (throttledlog:105)
>>> > 2017-06-03 19:21:12,380+0200 WARN  (periodic/1) [throttled] MOM not available, KSM stats will be missing. (throttledlog:105)
>>> > 2017-06-03 19:21:14,714+0200 WARN  (jsonrpc/1) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>> > 2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) [storage.initSANLock] Cannot initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 (clusterlock:238)
>>> > Traceback (most recent call last):
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 234, in initSANLock
>>> >     sanlock.init_lockspace(sdUUID, idsPath)
>>> > SanlockException: (107, 'Sanlock lockspace init failure', 'Transport endpoint is not connected')
>>> > 2017-06-03 19:21:15,515+0200 WARN  (jsonrpc/4) [storage.StorageDomainManifest] lease did not initialize successfully (sd:557)
>>> > Traceback (most recent call last):
>>> >   File "/usr/share/vdsm/storage/sd.py", line 552, in initDomainLock
>>> >     self._domainLock.initLock(self.getDomainLease())
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 271, in initLock
>>> >     initSANLock(self._sdUUID, self._idsPath, lease)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 239, in initSANLock
>>> >     raise se.ClusterLockInitError()
>>> > ClusterLockInitError: Could not initialize cluster lock: ()
>>> > 2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) [storage.StoragePool] Create pool hosted_datacenter canceled  (sp:655)
>>> > Traceback (most recent call last):
>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>> >     self.attachSD(sdUUID)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>> >     return method(self, *args, **kwargs)
>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>> >     dom.acquireHostId(self.id)
>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>> >     self._manifest.acquireHostId(hostId, async)
>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>> >     self._domainLock.acquireHostId(hostId, async)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>> > 2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>> > Traceback (most recent call last):
>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>> >     self.detachSD(sdUUID)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>> >     return method(self, *args, **kwargs)
>>> >   File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD
>>> >     raise se.CannotDetachMasterStorageDomain(sdUUID)
>>> > CannotDetachMasterStorageDomain: Illegal action: (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',)
>>> > 2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>> > Traceback (most recent call last):
>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>> >     self.detachSD(sdUUID)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>> >     return method(self, *args, **kwargs)
>>> >   File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD
>>> >     self.validateAttachedDomain(dom)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>> >     return method(self, *args, **kwargs)
>>> >   File "/usr/share/vdsm/storage/sp.py", line 542, in validateAttachedDomain
>>> >     self.validatePoolSD(dom.sdUUID)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>> >     return method(self, *args, **kwargs)
>>> >   File "/usr/share/vdsm/storage/sp.py", line 535, in validatePoolSD
>>> >     raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID)
>>> > StorageDomainNotMemberOfPool: Domain is not member in pool: u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922'
>>> > 2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error (task:870)
>>> > Traceback (most recent call last):
>>> >   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>>> >     return fn(*args, **kargs)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
>>> >     res = f(*args, **kwargs)
>>> >   File "/usr/share/vdsm/storage/hsm.py", line 959, in createStoragePool
>>> >     leaseParams)
>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>> >     self.attachSD(sdUUID)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>> >     return method(self, *args, **kwargs)
>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>> >     dom.acquireHostId(self.id)
>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>> >     self._manifest.acquireHostId(hostId, async)
>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>> >     self._domainLock.acquireHostId(hostId, async)
>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>> > 2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))", 'code': 661}} (dispatcher:77)
>>> >
>>> > The gluster volume prepared for engine storage is online and no split
>>> > brain is reported. I don't understand what needs to be done to overcome
>>> > this. Any ideas would be appreciated.
>>> >
>>> > Thank you,
>>> > Alex
>>
>>
>

Adding Sahina to the thread
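
For what it's worth, the offset in the quoted sanlock.log looks consistent
with an alignment problem. Below is a minimal sketch of the arithmetic. It
assumes sanlock places the delta lease for host id N at (N - 1) * sector_size,
and that the ids file is read with direct I/O, so every offset has to be a
multiple of the device's logical sector size. Both are assumptions on my side
that the logs do not confirm, and the function names are made up for
illustration.

```python
# Sketch only: checks whether a lease offset is aligned to a sector size.
# ASSUMPTION: host N's delta lease lives at (N - 1) * sector_size. This
# matches "offset 127488" for host id 250 in the quoted log, but it is
# inferred from the log, not taken from sanlock documentation.


def delta_lease_offset(host_id, sector_size=512):
    """Byte offset of a host's delta lease under the assumed layout."""
    if host_id < 1:
        raise ValueError("host_id must be >= 1")
    return (host_id - 1) * sector_size


def is_aligned(offset, sector_size):
    """True if offset is a whole multiple of sector_size."""
    return offset % sector_size == 0


if __name__ == "__main__":
    offset = delta_lease_offset(250)  # host id 250 from the lockspace line
    for size in (512, 4096):
        print("offset %d aligned to %d: %s"
              % (offset, size, is_aligned(offset, size)))
```

If those assumptions hold, offset 127488 is a multiple of 512 but not of 4096,
which could explain the -22 (EINVAL) from read_sectors on a 4K-native brick.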

