[ovirt-users] Host in kdumping state

2 Nov 2017

Hello,

I have a faulty host that reboots itself from time to time (due to HW 
issues). It is/was one of the 3 hosts running the HostedEngine, and it 
now always shows up as "Kdumping" in the web administration panel.

All my hosts run oVirt 4.1 on CentOS 7.3 with glusterfs 3.7, except this 
one, which was updated by mistake to CentOS 7.4 with glusterfs 3.8.

Is this due to the different OS/gluster version? How can I "reset" it? I 
want to remove it permanently and assign the HostedEngine to another host.

Moreover, the main glusterfs volume, the one which holds the HE image, 
has some bricks on this failing machine (vm03):

#  gluster volume status data_ssd
Status of volume: data_ssd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vm01.storage.billy:/gluster/ssd/data/
brick                                       49156     0          Y       6039
Brick vm02.storage.billy:/gluster/ssd/data/
brick                                       49153     0          Y       99097
Brick vm03.storage.billy:/gluster/ssd/data/
arbiter_brick                               49159     0          Y       5325
Brick vm03.storage.billy:/gluster/ssd/data/
brick                                       N/A       N/A        N       N/A
Brick vm04.storage.billy:/gluster/ssd/data/
brick                                       49152     0          Y       14811
Brick vm02.storage.billy:/gluster/ssd/data/
arbiter_brick                               49154     0          Y       99104
Self-heal Daemon on localhost               N/A       N/A        Y       6753
Self-heal Daemon on vm01.storage.billy      N/A       N/A        Y       79317
Self-heal Daemon on vm02.storage.billy      N/A       N/A        Y       41778
Self-heal Daemon on vm04.storage.billy      N/A       N/A        Y       125116
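
As a rough illustration of how the output above reads, here is a minimal 
Python sketch that lists the bricks whose Online column is not "Y". It is 
not part of this setup: the function name offline_bricks and the SAMPLE 
excerpt are made up for the example, and it assumes that long brick paths 
wrap onto the next line the way they do above.

```python
# Excerpt of the status output above; whitespace width does not matter here.
SAMPLE = """\
Status of volume: data_ssd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vm03.storage.billy:/gluster/ssd/data/
arbiter_brick                               49159     0          Y       5325
Brick vm03.storage.billy:/gluster/ssd/data/
brick                                       N/A       N/A        N       N/A
Brick vm04.storage.billy:/gluster/ssd/data/
brick                                       49152     0          Y       14811
Self-heal Daemon on localhost               N/A       N/A        Y       6753
"""


def offline_bricks(status_text):
    """Return the brick paths whose Online column is not 'Y'."""
    offline = []
    pending = None  # start of a brick path that wrapped onto the next line
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Brick":
            name, rest = fields[1], fields[2:]
        elif pending is not None:
            # Continuation line: tail of the wrapped path, then the columns.
            name, rest = pending + fields[0], fields[1:]
        else:
            continue  # headers, separator, self-heal daemon rows
        if len(rest) < 4:
            pending = name  # columns are on the following line
            continue
        pending = None
        tcp_port, rdma_port, online, pid = rest[-4:]
        if online != "Y":
            offline.append(name)
    return offline


print(offline_bricks(SAMPLE))
```

On the excerpt this reports only the vm03 data brick; the vm03 arbiter 
brick is still listed as online.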

What's the best way to replace them? Is this guide still useful? 
https://support.rackspace.com/how-to/recover-from-a-failed-server-in-a-glust... 
(I guess so)

Thanks!

-- 
Davide Ferrari
Lead System Engineer
Billy Performance Network