
Hi Benny,

Who should I be reaching out to for help with a Gluster-based hosted engine corruption?

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirtnode1.abcxyzdomains.net
Host ID                            : 1
Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 92254a68
local_conf_timestamp               : 115910
Host timestamp                     : 115910
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=115910 (Mon Jun 18 09:43:20 2018)
    host-id=1
    score=3400
    vm_conf_refresh_time=115910 (Mon Jun 18 09:43:20 2018)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

When I VNC into my HE, all I get is:

    Probing EDD (edd=off to disable)... ok

So that's why it's failing the liveliness check. I cannot get the screen on the HE to change, short of Ctrl-Alt-Del, which will reboot the HE.

I do have backups for the HE that are/were run on a nightly basis.

If the cluster was left alone, the HE VM would bounce from machine to machine trying to boot. This is why the cluster is in maintenance mode.

One of the nodes was down for a period of time and then brought back. Sometime during the night, which is when the automated backup kicks in, the HE started bouncing around. I got nearly 1000 emails.

This seems to be the same error (but may not be the same cause) as listed here:

https://bugzilla.redhat.com/show_bug.cgi?id=1569827

Thanks,

Hanson