hosted-engine --vm-status shows a ghost node that is no longer in the cluster: how to remove it?

engine 4.5.2.4

The issue is that in my cluster, when I run:

[root@ovirt-node3 ~]# hosted-engine --vm-status

--== Host ovirt-node3.ovirt (id: 1) status ==--

Host ID                            : 1
Host timestamp                     : 1633143
Score                              : 3400
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-node3.ovirt
Local maintenance                  : False
stopped                            : False
crc32                              : 1cbfcd19
conf_on_shared_storage             : True
local_conf_timestamp               : 1633143
Status up-to-date                  : True
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=1633143 (Wed Aug 31 14:37:53 2022)
        host-id=1
        score=3400
        vm_conf_refresh_time=1633143 (Wed Aug 31 14:37:53 2022)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False

--== Host ovirt-node1.ovirt (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 373629
Score                              : 0
Engine status                      : unknown stale-data
Hostname                           : ovirt-node1.ovirt
Local maintenance                  : True
stopped                            : False
crc32                              : 12a6eb81
conf_on_shared_storage             : True
local_conf_timestamp               : 373630
Status up-to-date                  : False
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=373629 (Tue Jun 14 16:48:50 2022)
        host-id=2
        score=0
        vm_conf_refresh_time=373630 (Tue Jun 14 16:48:50 2022)
        conf_on_shared_storage=True
        maintenance=True
        state=LocalMaintenance
        stopped=False

--== Host ovirt-node2.ovirt (id: 3) status ==--

Host ID                            : 3
Host timestamp                     : 434247
Score                              : 3400
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-node2.ovirt
Local maintenance                  : False
stopped                            : False
crc32                              : badb3751
conf_on_shared_storage             : True
local_conf_timestamp               : 434247
Status up-to-date                  : True
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=434247 (Wed Aug 31 14:37:45 2022)
        host-id=3
        score=3400
        vm_conf_refresh_time=434247 (Wed Aug 31 14:37:45 2022)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False

--== Host ovirt-node4.ovirt (id: 4) status ==--

Host ID                            : 4
Host timestamp                     : 1646655
Score                              : 3400
Engine status                      : {"vm": "up", "health": "good", "detail": "Up"}
Hostname                           : ovirt-node4.ovirt
Local maintenance                  : False
stopped                            : False
crc32                              : 1a16027e
conf_on_shared_storage             : True
local_conf_timestamp               : 1646655
Status up-to-date                  : True
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=1646655 (Wed Aug 31 14:37:43 2022)
        host-id=4
        score=3400
        vm_conf_refresh_time=1646655 (Wed Aug 31 14:37:43 2022)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False

The problem is that ovirt-node1.ovirt is no longer in the cluster: the host list presented by the UI correctly shows no ovirt-node1; it appears only in the command-line output. I did a full-text search in the engine DB, but node1 doesn't appear anywhere; even on the filesystem, a grep finds nothing.

Anyone can give a hint please?

On Fri, Sep 2, 2022 at 5:26 PM Diego Ercolani < diego.ercolani@ssis.sm> wrote:
Anyone can give a hint please?
You can use the procedure at https://access.redhat.com/solutions/2212601

In your case:

# hosted-engine --clean-metadata --host-id=2 --force-clean

should work

--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D PERFORMANCE & SCALE
Red Hat EMEA <https://www.redhat.com/>
sbonazzo@redhat.com
*Red Hat respects your work life balance. Therefore there is no need to answer this email out of your office hours.*
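[Editor's note: a minimal sketch, not from the thread, of how the suggested cleanup could be scripted with a dry-run guard. `DRY_RUN` and the `run` wrapper are illustrative additions; the `hosted-engine` options are the ones quoted above.]

```shell
# Sketch: clean the stale metadata entry for the ghost host (id 2),
# then re-check the status view. With DRY_RUN=1 (the default here)
# the commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

run hosted-engine --clean-metadata --host-id=2 --force-clean
run hosted-engine --vm-status
```

Setting `DRY_RUN=0` would actually execute the commands; leaving the default lets you inspect them first.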

On Mon, Sep 5, 2022 at 7:59 AM Sandro Bonazzola < sbonazzo@redhat.com> wrote:
On Fri, Sep 2, 2022 at 5:26 PM Diego Ercolani < diego.ercolani@ssis.sm> wrote:
Anyone can give a hint please?
You can use the procedure at https://access.redhat.com/solutions/2212601
Procedure also described in https://lists.ovirt.org/archives/list/users@ovirt.org/thread/UFO6DB2PFX7NWGB...
In your case: # hosted-engine --clean-metadata --host-id=2 --force-clean
should work
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D PERFORMANCE & SCALE
Red Hat EMEA <https://www.redhat.com/>
sbonazzo@redhat.com <https://www.redhat.com/>
*Red Hat respects your work life balance. Therefore there is no need to answer this email out of your office hours.*
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D PERFORMANCE & SCALE
Red Hat EMEA <https://www.redhat.com/>
sbonazzo@redhat.com
*Red Hat respects your work life balance. Therefore there is no need to answer this email out of your office hours.*

Perfect! It works. Thank you Sandro. Note that the help text discourages its use:

[root@ovirt-node3 ~]# hosted-engine --clean-metadata --help
Usage: /usr/sbin/hosted-engine --clean_metadata [--force-cleanup] [--host-id=<id>]
    Remove host's metadata from the global status database.
    Available only in properly deployed cluster with properly stopped agent.
    --force-cleanup  This option overrides the safety checks. Use at your own risk DANGEROUS.
    --host-id=<id>   Specify an explicit host id to clean
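[Editor's note: the help text above says the command is meant for a "properly stopped agent". A minimal sketch of that non-forced path, assuming the host being cleaned is still reachable; `ovirt-ha-agent` and `ovirt-ha-broker` are the standard oVirt HA service names, and the dry-run guard is an illustrative addition.]

```shell
# Sketch: the "safe" path implied by the help text - stop the HA
# services on the host whose metadata is being cleaned, then clean
# without --force-clean. DRY_RUN=1 (default) only prints the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

run systemctl stop ovirt-ha-agent ovirt-ha-broker
run hosted-engine --clean-metadata --host-id=2
```

When the host has already been removed (as in this thread), its agent cannot be stopped, which is why `--force-clean` was needed.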

Just to follow up: I had to run the same command on all the nodes, because on the nodes where I didn't run it the ghost host continued to appear. This should be resolved now. Thank you
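[Editor's note: a hedged sketch, not from the thread, of repeating the cleanup on every remaining node as the follow-up describes. The node list, SSH-as-root access, and the dry-run guard are assumptions for illustration; the ghost host id (2) comes from the thread.]

```shell
# Sketch: run the same metadata cleanup on each remaining node, since
# each node's view may keep showing the ghost host until cleaned there.
# DRY_RUN=1 (default) only prints the commands.
DRY_RUN=${DRY_RUN:-1}
NODES="ovirt-node2.ovirt ovirt-node3.ovirt ovirt-node4.ovirt"
GHOST_ID=2

for node in $NODES; do
  cmd="ssh root@$node hosted-engine --clean-metadata --host-id=$GHOST_ID --force-clean"
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $cmd"
  else
    $cmd
  fi
done
```

With three nodes in `NODES`, the dry run prints three `would run:` lines, one per node.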
participants (2)
-
Diego Ercolani
-
Sandro Bonazzola