I restored my engine to a gluster volume named engine on a three-node hyperconverged
oVirt 4.3.3.1 cluster. Before restoring I checked the status of the volumes. They
were clean: no heal entries, all peers connected, gluster volume status looked good. Then
I restored. This went well; the engine is up. But the engine gluster volume shows heal entries
on node02 and node03. The engine was deployed to node01. I still have to deploy the engine to
the other two hosts to reach full HA, but I suspect putting a host into maintenance is not
possible until the volume is healed.
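
For completeness, the pre-restore checks were roughly these, and all of them came
back clean on all three nodes:

gluster peer status
gluster volume status engine
gluster volume heal engine info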
I tried "gluster volume heal engine" also with added "full". The heal
entries will disappear for a few seconds and then /dom_md/ids will pop up again. The
__DIRECT_IO_TEST__ will join later. The split-brain info has no entries. Is this some kind
of hidden split brain? Maybe there is data on node01 brick which got not synced to the
other two nodes? I can only speculate. Gluster docs say: this should heal. But it
doesn't. I have two other volumes. Those are fine. One of them containing 3 VMs that
are running. I also tried to shut down the engine, so no-one was using the volume. Then
heal. Same effect. Those two files will always show up. But none other. Heal can always be
started successfully from any of the participating nodes.
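
If it helps with the diagnosis, I can dump the AFR changelog xattrs for the ids file
directly on each brick. A rough sketch of what I'd run (the domain UUID path is taken
from the heal info output below; trusted.afr.engine-client-N is my assumption of the
default xattr naming on a replica 3 volume):

getfattr -d -m . -e hex /gluster_bricks/engine/engine/9f4d5ae9-e01d-4b73-8b6d-e349279e9782/dom_md/ids

Non-zero trusted.afr.engine-client-* values on node02/node03 but not on node01 would at
least tell me which brick is accusing which.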
Reset the volume bricks one by one and cross fingers?
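
If resetting really is the way to go, my understanding is it would look roughly like
this per brick, waiting for heal info to drop to 0 entries before moving on to the next
one (syntax as I read it in the gluster docs, not yet tested here; node02 used as the
example):

gluster volume reset-brick engine node02.infra.solutions.work:/gluster_bricks/engine/engine start
gluster volume reset-brick engine node02.infra.solutions.work:/gluster_bricks/engine/engine node02.infra.solutions.work:/gluster_bricks/engine/engine commit force
gluster volume heal engine info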
[root@node03 ~]# gluster volume heal engine info
Brick node01.infra.solutions.work:/gluster_bricks/engine/engine
Status: Connected
Number of entries: 0
Brick node02.infra.solutions.work:/gluster_bricks/engine/engine
/9f4d5ae9-e01d-4b73-8b6d-e349279e9782/dom_md/ids
/__DIRECT_IO_TEST__
Status: Connected
Number of entries: 2
Brick node03.infra.solutions.work:/gluster_bricks/engine/engine
/9f4d5ae9-e01d-4b73-8b6d-e349279e9782/dom_md/ids
/__DIRECT_IO_TEST__
Status: Connected
Number of entries: 2