hosted-engine --vm-start not working

Hi,

When I run:

    hosted-engine --vm-start

I get this:

    VM exists and is Down, cleaning up and restarting
    VM in WaitForLaunch

But the VM never starts:

    virsh list --all
     Id   Name           State
    -------------------------------
     -    HostedEngine   shut off

    systemctl status -l ovirt-ha-agent
    ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
       Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)
       Active: active (running) since Wed 2021-06-16 13:27:27 CEST; 3min 26s ago
     Main PID: 79702 (ovirt-ha-agent)
        Tasks: 2 (limit: 198090)
       Memory: 28.3M
       CGroup: /system.slice/ovirt-ha-agent.service
               └─79702 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent

    Jun 16 13:27:27 hej1.5ervers.lan systemd[1]: ovirt-ha-agent.service: Succeeded.
    Jun 16 13:27:27 hej1.5ervers.lan systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
    Jun 16 13:27:27 hej1.5ervers.lan systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
    Jun 16 13:29:42 hej1.5ervers.lan ovirt-ha-agent[79702]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost

    hosted-engine --vm-status

    --== Host hej1.5ervers.lan (id: 1) status ==--

    Host ID                            : 1
    Host timestamp                     : 3547
    Score                              : 3400
    Engine status                      : {"vm": "down", "health": "bad", "detail": "Down", "reason": "bad vm status"}
    Hostname                           : hej1.5ervers.lan
    Local maintenance                  : False
    stopped                            : False
    crc32                              : f35899f8
    conf_on_shared_storage             : True
    local_conf_timestamp               : 3547
    Status up-to-date                  : True
    Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=3547 (Wed Jun 16 13:32:12 2021)
        host-id=1
        score=3400
        vm_conf_refresh_time=3547 (Wed Jun 16 13:32:12 2021)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False

    --== Host hej2.5ervers.lan (id: 2) status ==--

    Host ID                            : 2
    Host timestamp                     : 94681
    Score                              : 0
    Engine status                      : {"vm": "down_unexpected", "health": "bad", "detail": "Down", "reason": "bad vm status"}
    Hostname                           : hej2.5ervers.lan
    Local maintenance                  : False
    stopped                            : False
    crc32                              : 40a3f809
    conf_on_shared_storage             : True
    local_conf_timestamp               : 94681
    Status up-to-date                  : True
    Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=94681 (Wed Jun 16 13:32:05 2021)
        host-id=2
        score=0
        vm_conf_refresh_time=94681 (Wed Jun 16 13:32:05 2021)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUnexpectedlyDown
        stopped=False
        timeout=Fri Jan 2 03:23:40 1970

    --== Host hej3.5ervers.lan (id: 3) status ==--

    Host ID                            : 3
    Host timestamp                     : 94666
    Score                              : 0
    Engine status                      : {"vm": "down_unexpected", "health": "bad", "detail": "Down", "reason": "bad vm status"}
    Hostname                           : hej3.5ervers.lan
    Local maintenance                  : False
    stopped                            : False
    crc32                              : a50c2b3e
    conf_on_shared_storage             : True
    local_conf_timestamp               : 94666
    Status up-to-date                  : True
    Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=94666 (Wed Jun 16 13:32:09 2021)
        host-id=3
        score=0
        vm_conf_refresh_time=94666 (Wed Jun 16 13:32:09 2021)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUnexpectedlyDown
        stopped=False
        timeout=Fri Jan 2 03:23:16 1970

Check the ovirt-ha-broker and ovirt-ha-agent logs on all systems to identify why the VM is down. You can also check the libvirt logs for clues.

Best Regards,
Strahil Nikolov

On Mon, Jun 21, 2021 at 19:20, Harry O <harryo.dk@gmail.com> wrote:
Hi, When I run: hosted-engine --vm-start I get this: VM exists and is Down, cleaning up and restarting VM in WaitForLaunch
But the VM never starts [...]
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/33W3L6MTJ45U25...
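
The log check suggested in this reply could look roughly like the following, run on each host. This is only a sketch: the paths are the usual default log locations on an oVirt host (an assumption, they were not given in the thread), and show_errors is a hypothetical helper name chosen for illustration.

```shell
#!/bin/sh
# show_errors FILE...: print the last 20 lines mentioning "error" or "warn"
# (case-insensitive) from each log file that is readable.
# The helper name and the 20-line limit are illustrative choices.
show_errors() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            echo "== $log"
            grep -Ei 'error|warn' "$log" | tail -n 20
        else
            echo "== $log (not readable, skipped)"
        fi
    done
}

# Assumed default log locations; adjust if your installation differs.
show_errors \
    /var/log/ovirt-hosted-engine-ha/agent.log \
    /var/log/ovirt-hosted-engine-ha/broker.log \
    /var/log/vdsm/vdsm.log \
    /var/log/libvirt/qemu/HostedEngine.log
```

Comparing the timestamps of the first errors across the three hosts may help narrow down which component failed first.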

I would see this when the engine VM storage was not accessible. In my case, a targetd iSCSI server for the hosted storage wouldn't serve it after rebooting from a power outage. More specifically, the iSCSI host had let LVM scan the device at startup, which prevented targetd from serving it.

On Mon, Jun 21, 2021 at 12:21 PM Harry O <harryo.dk@gmail.com> wrote:
Hi, When i run: hosted-engine --vm-start I get this: VM exists and is Down, cleaning up and restarting VM in WaitForLaunch
But the VM never starts:

    virsh list --all
     Id   Name           State
    -------------------------------
     -    HostedEngine   shut off
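
For the inaccessible-storage case described in the reply above, a quick way to see whether the hosted-engine storage is reachable might be the following. It is a sketch under assumptions: the thread does not say which commands were used, and run_if_present is a hypothetical helper. It only wraps standard tools (iscsiadm, lsblk, vgs, lvs) and skips any that are not installed, so the same script can be run on the oVirt host and on the iSCSI target.

```shell
#!/bin/sh
# run_if_present CMD [ARGS...]: run a diagnostic only if the tool is installed,
# and report a non-zero exit status instead of aborting.
# The helper name is illustrative, not part of oVirt.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" || echo "   (exit status $?)"
    else
        echo "== $1 not installed, skipped"
    fi
}

# On the oVirt host: is there an active iSCSI session, and are the
# block devices and volume groups of the storage domain visible?
run_if_present iscsiadm -m session
run_if_present lsblk
run_if_present vgs

# On the iSCSI target: did the local LVM activate logical volumes that
# live inside the device the target is supposed to export?
run_if_present lvs -o vg_name,lv_name,lv_attr
```

If the target's own LVM has claimed the exported device, one common remedy is an LVM filter on the target so that it no longer scans that device; the exact setting depends on the device names in use.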
participants (3)
- Edward Berger
- Harry O
- Strahil Nikolov