This seems like a case of O_DIRECT reads and writes gone wrong, judging by
the 'Invalid argument' errors.
The two operations that have failed on gluster bricks are:
[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2017-06-05 09:41:00.865760] E [MSGID: 113040] [posix.c:3178:posix_readv] 0-engine-posix: read failed on gfid=8c94f658-ac3c-4e3a-b368-8c038513a914, fd=0x7f408584c06c, offset=127488 size=512, buf=0x7f4083c0b000 [Invalid argument]
But then, both the write and the read have a 512-byte-aligned offset, size and
buf address (which is correct for a 512-byte block size).
Are you saying you don't see this issue with 4K block-size?
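
To make the alignment point concrete, here is a minimal sketch in Python that
checks the values from the failed readv against both 512-byte and 4096-byte
alignment. The helper name is my own and not anything from gluster or sanlock,
and it assumes the usual O_DIRECT rule that offset, length and buffer address
must all be multiples of the device's logical block size.

```python
# Values copied from the posix_readv error in the brick log above.
READ_OFFSET = 127488
READ_SIZE = 512
READ_BUF = 0x7F4083C0B000


def misaligned_fields(offset, size, buf, block_size):
    """Return the names of the fields that are not multiples of block_size."""
    # Hypothetical helper for illustration; not part of gluster or sanlock.
    fields = {"offset": offset, "size": size, "buf": buf}
    return [name for name, value in fields.items() if value % block_size != 0]


if __name__ == "__main__":
    for block_size in (512, 4096):
        bad = misaligned_fields(READ_OFFSET, READ_SIZE, READ_BUF, block_size)
        print(block_size, bad if bad else "all aligned")
```

With these numbers the request is aligned for 512 bytes, but the offset and
size are not multiples of 4096, which on a 4K-sector brick could be enough for
an O_DIRECT read to fail with EINVAL. Also, 127488 = 249 * 512, which looks
like a 512-byte slot per host id (250 here), though that is my reading of the
numbers rather than something the logs state.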
-Krutika
On Mon, Jun 5, 2017 at 3:21 PM, Abi Askushi <rightkicktech(a)gmail.com> wrote:
Hi Sahina,
Attached are the logs. Let me know if anything else is needed.
I have 5 disks (with 4K physical sector) in RAID5. The RAID has 64K stripe
size at the moment.
I have prepared the storage as below:
pvcreate --dataalignment 256K /dev/sda4
vgcreate --physicalextentsize 256K gluster /dev/sda4
lvcreate -n engine --size 120G gluster
mkfs.xfs -f -i size=512 /dev/gluster/engine
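
For reference, the 256K used above is meant to be the full stripe width of the
array. A rough sketch of the arithmetic in Python, assuming the 64K figure is
the per-disk stripe unit and that RAID5 spends one disk's worth of capacity on
parity (the function name is just illustrative):

```python
def full_stripe_kib(total_disks, stripe_unit_kib, parity_disks=1):
    """Full stripe width in KiB: stripe unit times the number of data disks."""
    # Hypothetical helper for illustration; parity_disks=1 models RAID5.
    data_disks = total_disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return stripe_unit_kib * data_disks


if __name__ == "__main__":
    # 5 disks in RAID5 with a 64K stripe unit leave 4 data disks.
    print(full_stripe_kib(5, 64))
```

That gives 4 * 64K = 256K, which is the value passed to --dataalignment and
--physicalextentsize.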
Thanx,
Alex
On Mon, Jun 5, 2017 at 12:14 PM, Sahina Bose <sabose(a)redhat.com> wrote:
> Can we have the gluster mount logs and brick logs to check if it's the
> same issue?
>
> On Sun, Jun 4, 2017 at 11:21 PM, Abi Askushi <rightkicktech(a)gmail.com>
> wrote:
>
>> I clean installed everything and ran into the same.
>> I then ran gdeploy and encountered the same issue when deploying engine.
>> Seems that gluster (?) doesn't like 4K sector drives. I am not sure if
>> it has to do with alignment. The weird thing is that gluster volumes are
>> all ok, replicating normally and no split brain is reported.
>>
>> The solution to the mentioned bug (1386443
>> <https://bugzilla.redhat.com/show_bug.cgi?id=1386443>) was to format
>> with 512 sector size, which for my case is not an option:
>>
>> mkfs.xfs -f -i size=512 -s size=512 /dev/gluster/engine
>> illegal sector size 512; hw sector is 4096
>>
>> Is there any workaround to address this?
>>
>> Thanx,
>> Alex
>>
>>
>> On Sun, Jun 4, 2017 at 5:48 PM, Abi Askushi <rightkicktech(a)gmail.com>
>> wrote:
>>
>>> Hi Maor,
>>>
>>> My disks have a 4K block size, and from this bug it seems that gluster
>>> replica needs a 512B block size.
>>> Is there a way to make gluster function with 4K drives?
>>>
>>> Thank you!
>>>
>>> On Sun, Jun 4, 2017 at 2:34 PM, Maor Lipchuk <mlipchuk(a)redhat.com>
>>> wrote:
>>>
>>>> Hi Alex,
>>>>
>>>> I saw a bug that might be related to the issue you encountered:
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1386443
>>>>
>>>> Sahina, do you have any advice? Do you think that BZ 1386443 is
>>>> related?
>>>>
>>>> Regards,
>>>> Maor
>>>>
>>>> On Sat, Jun 3, 2017 at 8:45 PM, Abi Askushi <rightkicktech(a)gmail.com>
>>>> wrote:
>>>> > Hi All,
>>>> >
>>>> > I have successfully installed oVirt (version 4.1) several times with
>>>> > 3 nodes on top of glusterfs.
>>>> >
>>>> > This time, when trying to configure the same setup, I am facing the
>>>> > following issue, which doesn't seem to go away. During installation I
>>>> > get the error:
>>>> >
>>>> > Failed to execute stage 'Misc configuration': Cannot acquire host id:
>>>> > (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock
>>>> > lockspace add failure', 'Invalid argument'))
>>>> >
>>>> > The only difference in this setup is that instead of standard
>>>> > partitioning I have GPT partitioning, and the disks have a 4K block
>>>> > size instead of 512.
>>>> >
>>>> > The /var/log/sanlock.log has the following lines:
>>>> >
>>>> > 2017-06-03 19:21:15+0200 23450 [943]: s9 lockspace ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:250:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/ids:0
>>>> > 2017-06-03 19:21:36+0200 23471 [944]: s9:r5 resource ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047:SDM:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmptjkIDI/ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047/dom_md/leases:1048576 for 2,9,23040
>>>> > 2017-06-03 19:21:36+0200 23471 [943]: s10 lockspace a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922:250:/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids:0
>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: a5a6b0e7 aio collect RD 0x7f59b00008c0:0x7f59b00008d0:0x7f59b0101000 result -22:0 match res
>>>> > 2017-06-03 19:21:36+0200 23471 [23522]: read_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922/dom_md/ids
>>>> > 2017-06-03 19:21:37+0200 23472 [930]: s9 host 250 1 23450 88c2244c-a782-40ed-9560-6cfa4d46f853.v0.neptune
>>>> > 2017-06-03 19:21:37+0200 23472 [943]: s10 add_lockspace fail result -22
>>>> >
>>>> > And /var/log/vdsm/vdsm.log says:
>>>> >
>>>> > 2017-06-03 19:19:38,176+0200 WARN (jsonrpc/3) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>>> > 2017-06-03 19:21:12,379+0200 WARN (periodic/1) [throttled] MOM not available. (throttledlog:105)
>>>> > 2017-06-03 19:21:12,380+0200 WARN (periodic/1) [throttled] MOM not available, KSM stats will be missing. (throttledlog:105)
>>>> > 2017-06-03 19:21:14,714+0200 WARN (jsonrpc/1) [storage.StorageServer.MountConnection] Using user specified backup-volfile-servers option (storageServer:253)
>>>> > 2017-06-03 19:21:15,515+0200 ERROR (jsonrpc/4) [storage.initSANLock] Cannot initialize SANLock for domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 (clusterlock:238)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 234, in initSANLock
>>>> >     sanlock.init_lockspace(sdUUID, idsPath)
>>>> > SanlockException: (107, 'Sanlock lockspace init failure', 'Transport endpoint is not connected')
>>>> > 2017-06-03 19:21:15,515+0200 WARN (jsonrpc/4) [storage.StorageDomainManifest] lease did not initialize successfully (sd:557)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 552, in initDomainLock
>>>> >     self._domainLock.initLock(self.getDomainLease())
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 271, in initLock
>>>> >     initSANLock(self._sdUUID, self._idsPath, lease)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 239, in initSANLock
>>>> >     raise se.ClusterLockInitError()
>>>> > ClusterLockInitError: Could not initialize cluster lock: ()
>>>> > 2017-06-03 19:21:37,867+0200 ERROR (jsonrpc/2) [storage.StoragePool] Create pool hosted_datacenter canceled (sp:655)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>>> >     self.attachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>>> >     dom.acquireHostId(self.id)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>> >     self._manifest.acquireHostId(hostId, async)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>> >     self._domainLock.acquireHostId(hostId, async)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>> > 2017-06-03 19:21:37,870+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>>> >     self.detachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 1046, in detachSD
>>>> >     raise se.CannotDetachMasterStorageDomain(sdUUID)
>>>> > CannotDetachMasterStorageDomain: Illegal action: (u'ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047',)
>>>> > 2017-06-03 19:21:37,872+0200 ERROR (jsonrpc/2) [storage.StoragePool] Domain a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922 detach from MSD ba6bd862-c2b8-46e7-b2c8-91e4a5bb2047 Ver 1 failed. (sp:528)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 525, in __cleanupDomains
>>>> >     self.detachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 1043, in detachSD
>>>> >     self.validateAttachedDomain(dom)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 542, in validateAttachedDomain
>>>> >     self.validatePoolSD(dom.sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 535, in validatePoolSD
>>>> >     raise se.StorageDomainNotMemberOfPool(self.spUUID, sdUUID)
>>>> > StorageDomainNotMemberOfPool: Domain is not member in pool: u'pool=a1e7e9dd-0cf4-41ae-ba13-36297ed66309, domain=a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922'
>>>> > 2017-06-03 19:21:40,063+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='a2476a33-26f8-4ebd-876d-02fe5d13ef78') Unexpected error (task:870)
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>>>> >     return fn(*args, **kargs)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
>>>> >     res = f(*args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/hsm.py", line 959, in createStoragePool
>>>> >     leaseParams)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 652, in create
>>>> >     self.attachSD(sdUUID)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
>>>> >     return method(self, *args, **kwargs)
>>>> >   File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
>>>> >     dom.acquireHostId(self.id)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>> >     self._manifest.acquireHostId(hostId, async)
>>>> >   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>> >     self._domainLock.acquireHostId(hostId, async)
>>>> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>> >     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>> > AcquireHostIdFailure: Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>> > 2017-06-03 19:21:40,067+0200 ERROR (jsonrpc/2) [storage.Dispatcher] {'status': {'message': "Cannot acquire host id: (u'a5a6b0e7-fc3f-4838-8e26-c8b4d5e5e922', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))", 'code': 661}} (dispatcher:77)
>>>> >
>>>> > The gluster volume prepared for engine storage is online and no
>>>> > split brain is reported. I don't understand what needs to be done to
>>>> > overcome this. Any idea will be appreciated.
>>>> >
>>>> > Thank you,
>>>> > Alex
>>>> >
>>>> > _______________________________________________
>>>> > Users mailing list
>>>> > Users(a)ovirt.org
>>>> > http://lists.ovirt.org/mailman/listinfo/users
>>>> >
>>>>
>>>
>>>
>>
>