[ovirt-users] oVirt gluster sanlock issue

Abi Askushi rightkicktech at gmail.com
Wed Jun 7 09:39:37 UTC 2017


Hi Sahina,

Did you have a chance to check the logs, and do you have any idea how this
may be addressed?


Thanx,
Alex

On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sabose at redhat.com> wrote:

> Can we have the gluster mount logs and brick logs to check if it's the
> same issue?
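>
> (For reference, the client mount log is usually found under
> /var/log/glusterfs/ on the host, named after the mount point, and the
> brick logs under /var/log/glusterfs/bricks/ on each storage node; exact
> file names depend on the mount path, e.g.:
>
> ls /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log
> ls /var/log/glusterfs/bricks/
>
> )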
>
> On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkicktech at gmail.com>
> wrote:
>
>> I clean-installed everything and ran into the same issue.
>> I then ran gdeploy and hit the same problem when deploying the engine.
>> It seems that gluster (?) doesn't like 4K-sector drives; I am not sure if it
>> has to do with alignment. The weird thing is that the gluster volumes are all
>> ok, replicating normally, and no split brain is reported.
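>>
>> For reference, volume health and split-brain status were checked with the
>> usual commands (the volume here is named "engine"):
>>
>> gluster volume status engine
>> gluster volume heal engine info split-brain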
>>
>> The solution to the mentioned bug (1386443
>> <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to format
>> with a 512-byte sector size, which in my case is not an option:
>>
>> mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
>> illegal sector size 512; hw sector is 4096
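>>
>> The 4K geometry is also what the block layer reports (here /dev/sdX
>> stands for the actual disk; on 4Kn drives both values come back as 4096):
>>
>> blockdev --getss --getpbsz /dev/sdX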
>>
>> Is there any workaround to address this?
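>>
>> One workaround I am considering, though untested and it needs a
>> util-linux new enough to support losetup --sector-size (2.30+), is to
>> expose the LV through a loop device with 512-byte logical sectors and
>> format that instead:
>>
>> LOOP=$(losetup --find --show --sector-size 512 /dev/gluster/engine)
>> mkfs.xfs -f -i size=512 -s size=512 $LOOP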
>>
>> Thanx,
>> Alex
>>
>>
>> On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkicktech at gmail.com>
>> wrote:
>>
>>> Hi Maor,
>>>
>>> My disks have a 4K block size, and from this bug it seems that gluster
>>> replica needs a 512B block size.
>>> Is there a way to make gluster work with 4K drives?
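>>>
>>> If it helps, I think sanlock's failure (result -22, i.e. EINVAL) can be
>>> mimicked outside sanlock with a 512-byte O_DIRECT read of the ids file
>>> (path taken from sanlock.log); I would expect this to fail with "Invalid
>>> argument" on the 4K-backed volume while bs=4096 works:
>>>
>>> dd if=/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids of=/dev/null bs=512 count=1 iflag=direct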
>>>
>>> Thank you!
>>>
>>> On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk at redhat.com>
>>> wrote:
>>>
>>>> Hi Alex,
>>>>
>>>> I saw a bug that might be related to the issue you encountered at
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1386443
>>>>
>>>> Sahina, do you have any advice? Do you think that BZ 1386443 is
>>>> related?
>>>>
>>>> Regards,
>>>> Maor
>>>>
>>>> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi <rightkicktech at gmail.com>
>>>> wrote:
>>>> > Hi All,
>>>> >
>>>> > I have successfully installed oVirt (version 4.1) with 3 nodes on top of glusterfs several times.
>>>> >
>>>> > This time, when trying to configure the same setup, I am facing the following issue, which doesn't seem to go away. During installation I get the error:
>>>> >
>>>> > Failed to execute stage 'Misc configuration': Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>> >
>>>> > The only difference in this setup is that instead of standard partitioning I have GPT partitioning, and the disks have a 4K block size instead of 512.
>>>> >
>>>> > The /var/log/sanlock.log has the following lines:
>>>> >
>>>> > 2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/ids:0
>>>> > 2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 for 2,9,23040
>>>> > 2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids:0
>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 match res
>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: read_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids
>>>> > 2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune
>>>> > 2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail result -22
>>>> >
>>>> > And /var/log/vdsm/vdsm.log says:
>>>> >
>>>> > 2017-06-03 19:19:38,176+0200 WARN  (jsonrpc/3) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>>> > 2017-06-03 19:21:12,379+0200 WARN  (periodic/1) [throttled] MOM not available. (throttledlog:105)
>>>> > 2017-06-03 19:21:12,380+0200 WARN  (periodic/1) [throttled] MOM not available, KSM stats will be missing. (throttledlog:105)
>>>> > 2017-06-03 19:21:14,714+0200 WARN  (jsonrpc/1) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>>> > 2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) [storage.initSANLock] Cannot initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 (clusterlock:238)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 234, in initSANLock
>>>> >     sanlock.init_lockspace(sdUUID, idsPath)
>>>> > SanlockException: (107, 'Sanlock lockspace init failure', 'Transport endpoint is not connected')
>>>> > 2017-06-03 19:21:15,515+0200 WARN  (jsonrpc/4) [storage.StorageDomainManifest] lease did not initialize successfully (sd:557)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 552, in initDomainLock
>>>> >     self._domainLock.initLock(self.getDomainLease())
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 271, in initLock
>>>> >     initSANLock(self._sdUUID, self._idsPath, lease)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 239, in initSANLock
>>>> >     raise se.ClusterLockInitError()
>>>> > ClusterLockInitError: Could not initialize cluster lock: ()
>>>> > 2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) [storage.StoragePool] Create pool hosted_datacenter canceled  (sp:655)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>>> >     self.attachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>>> >     dom.acquireHostId(self.id)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>> >     self._manifest.acquireHostId(hostId, async)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>> >     self._domainLock.acquireHostId(hostId, async)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>> > 2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>>> >     self.detachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD
>>>> >     raise se.CannotDetachMasterStorageDomain(sdUUID)
>>>> > CannotDetachMasterStorageDomain: Illegal action: (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',)
>>>> > 2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>>> >     self.detachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD
>>>> >     self.validateAttachedDomain(dom)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 542, in validateAttachedDomain
>>>> >     self.validatePoolSD(dom.sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 535, in validatePoolSD
>>>> >     raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID)
>>>> > StorageDomainNotMemberOfPool: Domain is not member in pool: u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922'
>>>> > 2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error (task:870)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>>>> >     return fn(*args, **kargs)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
>>>> >     res = f(*args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/hsm.py", line 959, in createStoragePool
>>>> >     leaseParams)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>>> >     self.attachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>>> >     dom.acquireHostId(self.id)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>> >     self._manifest.acquireHostId(hostId, async)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>> >     self._domainLock.acquireHostId(hostId, async)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>> > 2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))", 'code': 661}} (dispatcher:77)
>>>> >
>>>> > The gluster volume prepared for engine storage is online and no split brain is reported. I don't understand what needs to be done to overcome this. Any ideas will be appreciated.
>>>> >
>>>> > Thank you,
>>>> > Alex
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>>
>>
>