[ovirt-users] [hosted-engine] engine VM doesn't respawn when its host was killed (poweroff)

Wee Sritippho wee.s at forest.go.th
Mon Apr 25 08:58:20 UTC 2016


Hi,

According to the hosted-engine FAQ, the engine VM should be up and running 
again about 5 minutes after its host is forcibly powered off. However, after 
updating oVirt from 3.6.4 to 3.6.5, the engine VM won't restart automatically 
even after 10+ minutes (I already made sure that global maintenance mode 
is set to none). I initially thought it was a time sync issue, so I 
installed and enabled ntp on the hosts and the engine. However, the issue 
still persists.

###Versions:
[root@host01 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.5.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
[root@host01 ~]# rpm -qa | grep vdsm
vdsm-infra-4.17.26-0.el7.centos.noarch
vdsm-jsonrpc-4.17.26-0.el7.centos.noarch
vdsm-gluster-4.17.26-0.el7.centos.noarch
vdsm-python-4.17.26-0.el7.centos.noarch
vdsm-yajsonrpc-4.17.26-0.el7.centos.noarch
vdsm-4.17.26-0.el7.centos.noarch
vdsm-cli-4.17.26-0.el7.centos.noarch
vdsm-xmlrpc-4.17.26-0.el7.centos.noarch
vdsm-hook-vmfex-dev-4.17.26-0.el7.centos.noarch

###Log files:
https://app.box.com/s/fkurmwagogwkv5smkwwq7i4ztmwf9q9r

###After host02 was killed:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : host01.ovirt.forest.go.th
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 396766e0
Host timestamp                     : 4391


--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : host02.ovirt.forest.go.th
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 0
stopped                            : True
Local maintenance                  : False
crc32                              : 3a345b65
Host timestamp                     : 1458


--== Host 3 status ==--

Status up-to-date                  : True
Hostname                           : host03.ovirt.forest.go.th
Host ID                            : 3
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 4c34b0ed
Host timestamp                     : 11958

###After host02 was killed for a while:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : False
Hostname                           : host01.ovirt.forest.go.th
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 72e4e418
Host timestamp                     : 4415


--== Host 2 status ==--

Status up-to-date                  : False
Hostname                           : host02.ovirt.forest.go.th
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : True
Local maintenance                  : False
crc32                              : 3a345b65
Host timestamp                     : 1458


--== Host 3 status ==--

Status up-to-date                  : False
Hostname                           : host03.ovirt.forest.go.th
Host ID                            : 3
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 4c34b0ed
Host timestamp                     : 11958

###After host02 was up again completely:
[root@host03 wees]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : host01.ovirt.forest.go.th
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : f5728fca
Host timestamp                     : 5555


--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : host02.ovirt.forest.go.th
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : e5284763
Host timestamp                     : 715


--== Host 3 status ==--

Status up-to-date                  : True
Hostname                           : host03.ovirt.forest.go.th
Host ID                            : 3
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : bc10c7fc
Host timestamp                     : 13119
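
To make the snapshots above easier to compare, here is a minimal sketch that parses the text printed by `hosted-engine --vm-status` and lists the hosts that report stale data, a score of 0, or `stopped : True`. It is not part of oVirt: the function names (`parse_vm_status`, `engine_state`, `suspicious_hosts`) are hypothetical, and the parser assumes only the field layout visible in the output above, with one snapshot passed in per call. Lines wrapped by the mail client are joined back onto the previous field.

```python
import json
import re

# Section header such as "--== Host 1 status ==--"
HEADER = re.compile(r"--== Host (\d+) status ==--")
# Field line such as "Score                              : 3400"
FIELD = re.compile(r"^(\S.*?)\s+: (.*)$")


def parse_vm_status(text):
    """Return {host_id: {field_name: value}} for one --vm-status snapshot."""
    hosts = {}
    current = None   # dict of the host section being read
    last_key = None  # last field seen, used to re-join wrapped lines
    for raw in text.splitlines():
        line = raw.rstrip()
        header = HEADER.search(line)
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            last_key = None
            continue
        if current is None or not line.strip():
            continue
        field = FIELD.match(line)
        if field:
            last_key = field.group(1)
            current[last_key] = field.group(2)
        elif last_key is not None:
            # Continuation of a value that was wrapped onto the next line.
            current[last_key] += " " + line.strip()
    return hosts


def engine_state(fields):
    """Decode the 'Engine status' value; keep the raw text if it is not JSON."""
    raw = fields.get("Engine status", "")
    try:
        return json.loads(raw)
    except ValueError:
        return {"raw": raw}  # e.g. "unknown stale-data"


def suspicious_hosts(hosts):
    """Return {host_id: [reasons]} for hosts that look unhealthy."""
    report = {}
    for host_id, fields in sorted(hosts.items()):
        reasons = []
        if fields.get("Status up-to-date") == "False":
            reasons.append("stale data")
        if fields.get("Score") == "0":
            reasons.append("score 0")
        if fields.get("stopped") == "True":
            reasons.append("stopped=True")
        if reasons:
            report[host_id] = reasons
    return report
```

With the captured output of one run in a string `text`, `suspicious_hosts(parse_vm_status(text))` gives the flagged hosts, and `engine_state(fields)` gives the decoded engine status for a single host.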

-- 
Wee



