Hi Alexei,
> 1.2 All bricks healed (gluster volume heal data info summary) and
> no split-brain
gluster volume heal data info
Brick node-msk-gluster203:/opt/gluster/data
Status: Connected
Number of entries: 0
Brick node-msk-gluster205:/opt/gluster/data
<gfid:18c78043-0943-48f8-a4fe-9b23e2ba3404>
<gfid:b6f7d8e7-1746-471b-a49d-8d824db9fd72>
<gfid:6db6a49e-2be2-4c4e-93cb-d76c32f8e422>
<gfid:e39cb2a8-5698-4fd2-b49c-102e5ea0a008>
<gfid:5fad58f8-4370-46ce-b976-ac22d2f680ee>
<gfid:7d0b4104-6ad6-433f-9142-7843fd260c70>
<gfid:706cd1d9-f4c9-4c89-aa4c-42ca91ab827e>
Status: Connected
Number of entries: 7
Brick node-msk-gluster201:/opt/gluster/data
<gfid:18c78043-0943-48f8-a4fe-9b23e2ba3404>
<gfid:b6f7d8e7-1746-471b-a49d-8d824db9fd72>
<gfid:6db6a49e-2be2-4c4e-93cb-d76c32f8e422>
<gfid:e39cb2a8-5698-4fd2-b49c-102e5ea0a008>
<gfid:5fad58f8-4370-46ce-b976-ac22d2f680ee>
<gfid:7d0b4104-6ad6-433f-9142-7843fd260c70>
<gfid:706cd1d9-f4c9-4c89-aa4c-42ca91ab827e>
Status: Connected
Number of entries: 7
Data needs healing.
Run: gluster volume heal data full
If it still doesn't heal (check in 5 min), go to
/rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data
and run 'find . -exec stat {} \;' without the quotes.
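For example, roughly (a sketch assuming the volume is named 'data' and the oVirt mount point above; adjust paths to your setup):

gluster volume heal data full
# wait ~5 minutes, then re-check the pending entries
gluster volume heal data info
# if entries remain, stat every file through the FUSE mount to trigger healing
cd /rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data
find . -exec stat {} \; > /dev/null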
As I understand it, the oVirt Hosted Engine is running and can be started on all nodes
except one.
>
> 2. Go to the problematic host and check the mount point is there
There is no mount point on the problematic node:
/rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data
If I create the mount point manually, it is deleted after the node is activated.
Other nodes can mount this volume without problems; only this node has connection
problems after the update.
Here is part of the log from the time the node was activated:
vdsm log
2019-03-18 16:46:00,548+0300 INFO (jsonrpc/5) [vds] Setting Hosted Engine HA local
maintenance to False (API:1630)
2019-03-18 16:46:00,549+0300 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call
Host.setHaMaintenanceMode succeeded in 0.00 seconds (__init__:573)
2019-03-18 16:46:00,581+0300 INFO (jsonrpc/7) [vdsm.api] START
connectStorageServer(domType=7, spUUID=u'5a5cca91-01f8-01af-0297-00000000025f',
conList=[{u'id': u'5799806e-7969-45da-b17d-b47a63e6a8e4',
u'connection': u'msk-gluster-facility.xxxx:/data', u'iqn':
u'', u'user': u'', u'tpgt': u'1',
u'vfs_type': u'glusterfs', u'password': '********',
u'port': u''}], options=None) from=::ffff:10.77.253.210,56630,
flow_id=81524ed, task_id=5f353993-95de-480d-afea-d32dc94fd146 (api:46)
2019-03-18 16:46:00,621+0300 INFO (jsonrpc/7) [storage.StorageServer.MountConnection]
Creating directory
u'/rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data'
(storageServer:167)
2019-03-18 16:46:00,622+0300 INFO (jsonrpc/7) [storage.fileUtils] Creating directory:
/rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data mode: None
(fileUtils:197)
2019-03-18 16:46:00,622+0300 WARN (jsonrpc/7) [storage.StorageServer.MountConnection]
gluster server u'msk-gluster-facility.xxxx' is not in bricks
['node-msk-gluster203', 'node-msk-gluster205',
'node-msk-gluster201'], possibly mounting duplicate servers (storageServer:317)
This seems very strange. As you have hidden the hostname, I'm not sure which one this
is.
Check that DNS resolution works from all hosts and that this host's hostname is
resolvable.
Also check if it is in the peer list.
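Something like this (a quick sketch; run the gluster commands on any node that is part of the trusted pool):

# on every oVirt host, the storage FQDN must resolve
getent hosts msk-gluster-facility.xxxx
# on a gluster node, every peer should be connected and listed under the expected hostname
gluster peer status
gluster pool list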
Try to manually mount the gluster volume:
mount -t glusterfs msk-gluster-facility.xxxx:/data /mnt
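If the manual mount works, check what the client sees and unmount again; if it fails, the FUSE client log (named after the mount path, for /mnt usually /var/log/glusterfs/mnt.log) should show why:

df -h /mnt        # should list msk-gluster-facility.xxxx:/data
umount /mnt
# on failure, inspect the client log:
tail -n 50 /var/log/glusterfs/mnt.log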
Is this a second FQDN/IP of this server?
If so, gluster will accept it once you run 'gluster peer probe IP2'.
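For example (IP2 stands in for the additional address; purely illustrative):

gluster peer probe IP2
# the extra name/IP should now show up for that peer (e.g. under "Other names")
gluster peer status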
> 2.1. Check permissions (should be vdsm:kvm) and fix with chown -R
> if needed
> 2.2. Check the OVF_STORE from the logs that it exists
How can I do this?
Go to /rhev/data-center/mnt/glusterSD/host_engine and use find inside the domain UUID
directory for files that are not owned by vdsm:kvm.
I usually run 'chown -R vdsm:kvm 823xx-xxxx-yyyy-zzz' and it fixes any ownership
misconfiguration.
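A rough sketch of what I mean (the UUID below is just a placeholder for your storage domain directory):

cd /rhev/data-center/mnt/glusterSD/host_engine
# list anything inside the domain that is not owned by vdsm:kvm
find 823xx-xxxx-yyyy-zzz \( ! -user vdsm -o ! -group kvm \) -ls
# fix ownership recursively
chown -R vdsm:kvm 823xx-xxxx-yyyy-zzz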
Best Regards,
Strahil Nikolov