
Hi,

I deployed an oVirt (4.3.10) cluster with HostedEngine and GlusterFS volumes (engine, vmstore, data). The GlusterFS cluster runs on node1/node2/node3, and the engine VM can run on any of those 3 nodes. Then I added a 4th node to the cluster.

However, when I work in the Engine Web Portal, it keeps reporting a 503 error. When that happened, I checked `hosted-engine --vm-status`; see below:

```
[root@vhost1 ~]# hosted-engine --vm-status

--== Host vhost1.yhmk.lan (id: 1) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : vhost1.alatest.lan
Host ID                            : 1
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : 1f25baff
local_conf_timestamp               : 1253650
Host timestamp                     : 1253649
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=1253649 (Thu Apr 8 08:05:48 2021)
    host-id=1
    score=0
    vm_conf_refresh_time=1253650 (Thu Apr 8 08:05:48 2021)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineUnexpectedlyDown
    stopped=False
    timeout=Thu Jan 15 20:23:29 1970

--== Host vhost2.yhmk.lan (id: 2) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : vhost2.alatest.lan
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down_unexpected", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 539fc30c
local_conf_timestamp               : 1253343
Host timestamp                     : 1253343
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=1253343 (Thu Apr 8 08:05:46 2021)
    host-id=2
    score=3400
    vm_conf_refresh_time=1253343 (Thu Apr 8 08:05:46 2021)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineDown
    stopped=False

--== Host vhost3.yhmk.lan (id: 3) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : vhost3.alatest.lan
Host ID                            : 3
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "up", "detail": "Powering up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 4072e0b8
local_conf_timestamp               : 1252345
Host timestamp                     : 1252345
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=1252345 (Thu Apr 8 08:05:42 2021)
    host-id=3
    score=3400
    vm_conf_refresh_time=1252345 (Thu Apr 8 08:05:42 2021)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineStarting
    stopped=False
```

After a short wait, I can access the web portal again. When I then check the host status there, one or more hosts are always shown with the label `unavailable as HA score`, but the label disappears later. I also found that the engine VM sometimes migrates to another node while this problem occurs.

So the HostedEngine does not seem stable and this problem keeps recurring. Could you please help me with this?

Thanks!