[ovirt-users] oVirt split brain resolution

Abi Askushi rightkicktech at gmail.com
Fri Jun 23 16:21:40 UTC 2017


Hi Denis,

I get "Operation not permitted" when I run that command, as shown below:

gluster volume heal engine split-brain latest-mtime /e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
Healing /e1c80750-b880-495e-9609-b8bc7760d101/ha_agent failed:Operation not permitted.
Volume heal failed.
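
When a policy-based heal is refused like this, a common next step in Gluster troubleshooting is to compare the extended attributes of the affected entry on every brick. The sketch below only prints the commands to run and executes nothing. The brick paths are the ones reported by "gluster volume heal engine info split-brain" in this thread; the helper name xattr_commands is made up for illustration, and the code assumes the entry lives at the brick path followed by the path gluster reports.

```python
import shlex

# Bricks and entry as reported by "heal info split-brain" in this thread.
BRICKS = [
    "gluster0:/gluster/engine/brick",
    "gluster1:/gluster/engine/brick",
    "gluster2:/gluster/engine/brick",
]
ENTRY = "/e1c80750-b880-495e-9609-b8bc7760d101/ha_agent"


def xattr_commands(bricks, entry):
    """Return (host, command) pairs; each command is meant to be run on that host."""
    commands = []
    for brick in bricks:
        host, brick_path = brick.split(":", 1)
        # Assumption: the entry sits directly under the brick directory.
        on_brick = brick_path.rstrip("/") + entry
        # getfattr -d -m . -e hex dumps all extended attributes in hex,
        # including the trusted.afr.* changelog attributes (run as root).
        cmd = ["getfattr", "-d", "-m", ".", "-e", "hex", on_brick]
        commands.append((host, " ".join(shlex.quote(part) for part in cmd)))
    return commands


for host, command in xattr_commands(BRICKS, ENTRY):
    print(f"{host}: {command}")
```

Comparing the trusted.afr.* and trusted.gfid values across the three bricks usually shows which copy disagrees with the others.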


When I shut down host3, the remaining two hosts report no split brain. When
I power host3 back up, the split brain reappears and host3 logs the
following in ovirt-hosted-engine-ha/agent.log:

MainThread::INFO::2017-06-23
16:18:06,067::hosted_engine::594::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
Failed set the storage domain: 'Failed to set storage domain VdsmBackend,
options {'hosted-engine.lockspace':
'7B22696D6167655F75756964223A202238323132626637382D663933332D346465652D616333372D346265633734353035366235222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A202236323739303162652D666261332D346263342D393037632D393931356138333632633537227D',
'sp_uuid': '00000000-0000-0000-0000-000000000000', 'dom_type': 'glusterfs',
'hosted-engine.metadata':
'7B22696D6167655F75756964223A202263353930633034372D613462322D346539312D613832362D643438623961643537323330222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A202230353166653865612D333339632D346134302D383438382D386335313138666438373238227D',
'sd_uuid': 'e1c80750-b880-495e-9609-b8bc7760d101'}: Request failed: <type
'exceptions.OSError'>'. Waiting '5's before the next attempt
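
The long 'hosted-engine.lockspace' and 'hosted-engine.metadata' values in that log line are hex-encoded text. A minimal sketch for decoding them, assuming the value is hex-encoded UTF-8 JSON as the lockspace value above appears to be (the function name decode_option is made up for illustration):

```python
import json


def decode_option(hex_value):
    """Decode a hex-encoded option value into a Python object."""
    return json.loads(bytes.fromhex(hex_value).decode("utf-8"))


# The 'hosted-engine.lockspace' value copied from the agent.log line above,
# split into pieces only to keep the lines short.
LOCKSPACE_HEX = (
    "7B22696D6167655F75756964223A2022"
    "38323132626637382D663933332D346465652D616333372D346265633734353035366235"
    "222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A2022"
    "36323739303162652D666261332D346263342D393037632D393931356138333632633537"
    "227D"
)

if __name__ == "__main__":
    print(decode_option(LOCKSPACE_HEX))
```

The decoded object names the image and volume UUIDs of the lockspace, which can help locate the corresponding files on the storage domain.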

and the following in /var/log/messages:
Jun 23 16:19:43 v2 journal: vdsm root ERROR failed to retrieve Hosted
Engine HA info#012Traceback (most recent call last):#012  File
"/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in
_getHaInfo#012    stats = instance.get_all_stats()#012  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
line 105, in get_all_stats#012    stats =
broker.get_stats_from_storage(service)#012  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 233, in get_stats_from_storage#012    result =
self._checked_communicate(request)#012  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 261, in _checked_communicate#012    .format(message or
response))#012RequestError: Request failed: failed to read metadata: [Errno
5] Input/output error:
'/rhev/data-center/mnt/glusterSD/10.100.100.1:_engine/e1c80750-b880-495e-9609-b8bc7760d101/ha_agent/hosted-engine.metadata'
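
The "#012" sequences in that syslog entry are escaped newlines (octal 012), which is how rsyslog typically writes control characters. A small sketch that turns such an entry back into a readable traceback, assuming every "#NNN" with three octal digits is such an escape (a literal "#123" in a message would be converted too); the function name unescape_syslog is made up for illustration:

```python
import re

# "#" followed by exactly three octal digits, e.g. "#012" for a newline.
_ESCAPE = re.compile(r"#([0-7]{3})")


def unescape_syslog(line):
    """Replace #NNN octal escapes with the character they stand for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), line)


# First part of the /var/log/messages entry above, with the mail wrapping undone.
SAMPLE = (
    'vdsm root ERROR failed to retrieve Hosted Engine HA info'
    '#012Traceback (most recent call last):'
    '#012  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in _getHaInfo'
    '#012    stats = instance.get_all_stats()'
)

print(unescape_syslog(SAMPLE))
```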

Thanx


On Fri, Jun 23, 2017 at 6:05 PM, Denis Chaplygin <dchaplyg at redhat.com>
wrote:

> Hello Abi,
>
> On Fri, Jun 23, 2017 at 4:47 PM, Abi Askushi <rightkicktech at gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have a 3-node oVirt 4.1 setup. I lost one node due to RAID controller
>> issues. After restoring it I have the following split brain, although the
>> hosts have mounted the storage domains:
>>
>> gluster volume heal engine info split-brain
>> Brick gluster0:/gluster/engine/brick
>> /e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick gluster1:/gluster/engine/brick
>> /e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick gluster2:/gluster/engine/brick
>> /e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>>
>>
> It is definitely an issue on the gluster side. You could try to use
>
> gluster volume heal engine split-brain latest-mtime /e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
>
>
> I also added the gluster developers to this thread, so they may be able
> to give you better advice.
>
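
The "heal info split-brain" output quoted above has a regular shape, so it can be summarized per brick when checking several volumes or repeating the check after a heal attempt. A minimal sketch, assuming the layout shown in this thread ("Brick <host>:<path>" followed by one entry per line, either a path or a <gfid:...> reference); the function name parse_split_brain_info is made up for illustration:

```python
def parse_split_brain_info(text):
    """Map each brick to the list of entries it reports in split-brain."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            current = line[len("Brick "):]
            result[current] = []
        elif current is not None and (line.startswith("/") or line.startswith("<gfid:")):
            result[current].append(line)
    return result


# Output copied from the original report in this thread.
SAMPLE = """\
Brick gluster0:/gluster/engine/brick
/e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
Status: Connected
Number of entries in split-brain: 1

Brick gluster1:/gluster/engine/brick
/e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
Status: Connected
Number of entries in split-brain: 1

Brick gluster2:/gluster/engine/brick
/e1c80750-b880-495e-9609-b8bc7760d101/ha_agent
Status: Connected
Number of entries in split-brain: 1
"""

for brick, entries in parse_split_brain_info(SAMPLE).items():
    print(brick, entries)
```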