Hello
I have a faulty host that reboots itself from time to time because of
hardware issues. It is (or was) one of the three hosts in the group
hosting the HostedEngine, and it now always appears as "Kdumping" in the
web administration panel.
All my hosts run oVirt 4.1 on CentOS 7.3 with GlusterFS 3.7, except this
one, which was updated by mistake to CentOS 7.4 with GlusterFS 3.8.
Is the "Kdumping" state due to the different OS/Gluster version? How can
I "reset" it? I want to remove this host permanently and assign the
HostedEngine to another host.
Moreover, the main GlusterFS volume, the one that holds the HE image,
has some bricks on this failing machine (vm03):
# gluster volume status data_ssd
Status of volume: data_ssd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vm01.storage.billy:/gluster/ssd/data/
brick                                       49156     0          Y       6039
Brick vm02.storage.billy:/gluster/ssd/data/
brick                                       49153     0          Y       99097
Brick vm03.storage.billy:/gluster/ssd/data/
arbiter_brick                               49159     0          Y       5325
Brick vm03.storage.billy:/gluster/ssd/data/
brick                                       N/A       N/A        N       N/A
Brick vm04.storage.billy:/gluster/ssd/data/
brick                                       49152     0          Y       14811
Brick vm02.storage.billy:/gluster/ssd/data/
arbiter_brick                               49154     0          Y       99104
Self-heal Daemon on localhost               N/A       N/A        Y       6753
Self-heal Daemon on vm01.storage.billy      N/A       N/A        Y       79317
Self-heal Daemon on vm02.storage.billy      N/A       N/A        Y       41778
Self-heal Daemon on vm04.storage.billy      N/A       N/A        Y       125116
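To make the offline brick easier to spot, here is a minimal Python sketch that picks out the bricks whose Online column is "N" from plain-text status output like the above. The names `offline_bricks` and `SAMPLE` are hypothetical, and the parser assumes the layout shown here, where a long brick path is wrapped onto a second line. It is not a general parser for every gluster version.

```python
# Hypothetical helper: list bricks reported offline in the plain-text
# output of "gluster volume status <volume>".
SAMPLE = """\
Status of volume: data_ssd
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick vm01.storage.billy:/gluster/ssd/data/
brick 49156 0 Y 6039
Brick vm03.storage.billy:/gluster/ssd/data/
arbiter_brick 49159 0 Y 5325
Brick vm03.storage.billy:/gluster/ssd/data/
brick N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 6753
"""


def offline_bricks(status_text):
    """Return the brick paths whose Online column is 'N'."""
    offline = []
    pending = None  # first half of a brick path wrapped over two lines
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if pending is not None:
            # Continuation line: <rest of path> <tcp> <rdma> <online> <pid>
            if len(fields) == 5 and fields[3] == "N":
                offline.append(pending + fields[0])
            pending = None
        elif fields[0] == "Brick" and len(fields) == 2:
            # Path too long for the column, so the row continues below.
            pending = fields[1]
        elif fields[0] == "Brick" and len(fields) == 6:
            # Unwrapped row: Brick <path> <tcp> <rdma> <online> <pid>
            if fields[4] == "N":
                offline.append(fields[1])
    return offline


if __name__ == "__main__":
    for brick in offline_bricks(SAMPLE):
        print(brick)
```

On the excerpt above, this reports only the data brick on vm03. The arbiter brick on the same host is still listed as online.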
What's the best way to replace them? Is this guide still useful?
https://support.rackspace.com/how-to/recover-from-a-failed-server-in-a-gl...
(I guess so)
Thanks!
--
Davide Ferrari
Lead System Engineer
Billy Performance Network