Put ovnode2 into maintenance (tick the option to stop the Gluster service), wait until all VMs have evacuated and the host is really in maintenance, then activate it again.
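If you prefer to script it instead of clicking through the Administration Portal, roughly the same cycle can be driven through the engine REST API. This is only a sketch: the engine URL, password and host id below are placeholders, and I'm assuming your engine version accepts the stop_gluster_service flag on the deactivate action.

# move ovnode2 to maintenance and ask the engine to stop its Gluster service
curl -k -u admin@internal:PASSWORD -X POST \
  -H 'Content-Type: application/xml' \
  -d '<action><stop_gluster_service>true</stop_gluster_service></action>' \
  https://engine.telecom.lan/ovirt-engine/api/hosts/HOST_ID/deactivate

# once the host is really in maintenance, bring it back
curl -k -u admin@internal:PASSWORD -X POST \
  -H 'Content-Type: application/xml' \
  -d '<action/>' \
  https://engine.telecom.lan/ovirt-engine/api/hosts/HOST_ID/activate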

Restarting glusterd should also do the trick, but it's always better to ensure no gluster processes have been left running (including the mount points).
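Something like this (just a sketch, the mount path is only an example) can be run on the node to verify nothing gluster-related survived before glusterd is started again:

systemctl stop glusterd
pgrep -af gluster                      # glusterd/glusterfsd/glusterfs (shd, fuse clients) - should return nothing
grep ' fuse.glusterfs ' /proc/mounts   # any leftover FUSE mount points
# unmount anything still mounted, e.g.:
# umount /rhev/data-center/mnt/glusterSD/ovnode2s.telecom.lan:_datassd
systemctl start glusterd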

Best Regards,
Strahil Nikolov

On Fri, Oct 1, 2021 at 17:06, Dominique D
<dominique.deschenes@gcgenicom.com> wrote:
Yesterday I had a glitch and my second server (ovnode2) restarted.

Here are some errors from the events:

VDSM ovnode3.telecom.lan command SpmStatusVDS failed: Connection timeout for host 'ovnode3.telecom.lan', last response arrived 2455 ms ago.
Host ovnode3.telecom.lan is not responding. It will stay in Connecting state for a grace period of 86 seconds and after that an attempt to fence the host will be issued.
Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host ovnode3.telecom.lan, Error: Network error during communication with the Host.).
Executing power management status on Host ovnode3.telecom.lan using Proxy Host ovnode1.telecom.lan and Fence Agent ipmilan:10.5.1.16.

Now my 3 bricks have errors on my gluster volume:


[root@ovnode2 ~]# gluster volume status
Status of volume: datassd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ovnode1s.telecom.lan:/gluster_bricks/
datassd/datassd                             49152     0          Y       4027
Brick ovnode2s.telecom.lan:/gluster_bricks/
datassd/datassd                             49153     0          Y       2393
Brick ovnode3s.telecom.lan:/gluster_bricks/
datassd/datassd                             49152     0          Y       2347
Self-heal Daemon on localhost               N/A       N/A        Y       2405
Self-heal Daemon on ovnode3s.telecom.lan    N/A       N/A        Y       2366
Self-heal Daemon on 172.16.70.91            N/A       N/A        Y       4043

Task Status of Volume datassd
------------------------------------------------------------------------------
There are no active volume tasks


gluster volume heal datassd info | grep -i "Number of entries:" | grep -v "entries: 0"
Number of entries: 5759
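I can also pull a more detailed per-brick breakdown if that helps (assuming these subcommands exist on my gluster version):

gluster volume heal datassd info summary            # per-brick totals: pending heal / split-brain / possibly healing
gluster volume heal datassd info split-brain        # list only entries in split-brain
gluster volume heal datassd statistics heal-count   # pending heal count per brick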

In the webadmin all the bricks are green, with comments on two of them:

ovnode1 Up, 5834 Unsynced entries present
ovnode2 Up,
ovnode3 Up, 5820 Unsynced entries present

I tried this without success:

gluster volume heal datassd
Launching heal operation to perform index self heal on volume datassd has been unsuccessful:
Glusterd Syncop Mgmt brick op 'Heal' failed. Please check glustershd log file for details.
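I assume the log it refers to is the default self-heal daemon log on each node, so I can check something like:

grep -iE 'error|warn' /var/log/glusterfs/glustershd.log | tail -50
# and the glusterd log itself for the failed 'Heal' brick op:
grep -i 'heal' /var/log/glusterfs/glusterd.log | tail -50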

What are the next steps?

Thank you