Hi,
It seems that all nodes lost access to storage for some reason after
the host was killed. Where is your hosted engine storage located?
Regards
--
Martin Sivak
SLA / oVirt
On Mon, Apr 25, 2016 at 10:58 AM, Wee Sritippho <wee.s(a)forest.go.th> wrote:
Hi,
According to the hosted-engine FAQ, the engine VM should be up and running
about 5 minutes after its host is forcibly powered off. However, after
updating oVirt from 3.6.4 to 3.6.5, the engine VM won't restart automatically
even after 10+ minutes (I already made sure that global maintenance mode is
set to none). I initially thought it was a time sync issue, so I installed and
enabled ntp on the hosts and the engine. However, the issue still persists.
###Versions:
[root@host01 ~]# rpm -qa | grep ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-setup-1.3.5.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
[root@host01 ~]# rpm -qa | grep vdsm
vdsm-infra-4.17.26-0.el7.centos.noarch
vdsm-jsonrpc-4.17.26-0.el7.centos.noarch
vdsm-gluster-4.17.26-0.el7.centos.noarch
vdsm-python-4.17.26-0.el7.centos.noarch
vdsm-yajsonrpc-4.17.26-0.el7.centos.noarch
vdsm-4.17.26-0.el7.centos.noarch
vdsm-cli-4.17.26-0.el7.centos.noarch
vdsm-xmlrpc-4.17.26-0.el7.centos.noarch
vdsm-hook-vmfex-dev-4.17.26-0.el7.centos.noarch
###Log files:
https://app.box.com/s/fkurmwagogwkv5smkwwq7i4ztmwf9q9r
###After host02 was killed:
[root@host03 wees]# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : True
Hostname : host01.ovirt.forest.go.th
Host ID : 1
Engine status : {"reason": "vm not running on this
host", "health": "bad", "vm": "down",
"detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 396766e0
Host timestamp : 4391
--== Host 2 status ==--
Status up-to-date : True
Hostname : host02.ovirt.forest.go.th
Host ID : 2
Engine status : {"health": "good",
"vm": "up",
"detail": "up"}
Score : 0
stopped : True
Local maintenance : False
crc32 : 3a345b65
Host timestamp : 1458
--== Host 3 status ==--
Status up-to-date : True
Hostname : host03.ovirt.forest.go.th
Host ID : 3
Engine status : {"reason": "vm not running on this
host", "health": "bad", "vm": "down",
"detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 4c34b0ed
Host timestamp : 11958
###A while after host02 was killed:
[root@host03 wees]# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : False
Hostname : host01.ovirt.forest.go.th
Host ID : 1
Engine status : unknown stale-data
Score : 3400
stopped : False
Local maintenance : False
crc32 : 72e4e418
Host timestamp : 4415
--== Host 2 status ==--
Status up-to-date : False
Hostname : host02.ovirt.forest.go.th
Host ID : 2
Engine status : unknown stale-data
Score : 0
stopped : True
Local maintenance : False
crc32 : 3a345b65
Host timestamp : 1458
--== Host 3 status ==--
Status up-to-date : False
Hostname : host03.ovirt.forest.go.th
Host ID : 3
Engine status : unknown stale-data
Score : 3400
stopped : False
Local maintenance : False
crc32 : 4c34b0ed
Host timestamp : 11958
###After host02 was completely up again:
[root@host03 wees]# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : True
Hostname : host01.ovirt.forest.go.th
Host ID : 1
Engine status : {"reason": "vm not running on this
host", "health": "bad", "vm": "down",
"detail": "unknown"}
Score : 0
stopped : False
Local maintenance : False
crc32 : f5728fca
Host timestamp : 5555
--== Host 2 status ==--
Status up-to-date : True
Hostname : host02.ovirt.forest.go.th
Host ID : 2
Engine status : {"health": "good",
"vm": "up",
"detail": "up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : e5284763
Host timestamp : 715
--== Host 3 status ==--
Status up-to-date : True
Hostname : host03.ovirt.forest.go.th
Host ID : 3
Engine status : {"reason": "vm not running on this
host", "health": "bad", "vm": "down",
"detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : bc10c7fc
Host timestamp : 13119
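To compare status dumps like the three above without reading them by eye, here is a minimal Python sketch that parses text shaped like this `hosted-engine --vm-status` output and lists the hosts whose "Status up-to-date" field is not True. The function names (`parse_vm_status`, `engine_state`, `stale_hosts`) are hypothetical, and the parser assumes the field layout seen in this message. Other versions may print the output differently.

```python
import json
import re

# Header line that opens each host block in the output quoted above.
HOST_HEADER = re.compile(r"^--== Host (\d+) status ==--$")


def parse_vm_status(text):
    """Return {host_id: {field: value}} for text in the layout shown above.

    Assumption: every field is printed as "Name : value", and a wrapped
    "Engine status" value continues on lines that contain no " : ".
    """
    hosts = {}
    current = None   # fields of the host block being read
    last_key = None  # most recent field, used for wrapped lines
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HOST_HEADER.match(line)
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            last_key = None
        elif current is None:
            continue  # skip the shell prompt or anything before the first block
        elif " : " in line:
            key, _, value = line.partition(" : ")
            last_key = key.strip()
            current[last_key] = value.strip()
        elif last_key is not None:
            # Continuation of a wrapped value: re-join with a single space.
            current[last_key] += " " + line
    return hosts


def engine_state(fields):
    """Decode the "Engine status" field as JSON, or return None if it is
    plain text such as "unknown stale-data"."""
    try:
        return json.loads(fields.get("Engine status", ""))
    except ValueError:
        return None


def stale_hosts(hosts):
    """Host IDs whose "Status up-to-date" field is anything but "True"."""
    return [host_id for host_id, fields in sorted(hosts.items())
            if fields.get("Status up-to-date") != "True"]
```

Given the second dump above as input, `stale_hosts(parse_vm_status(text))` would list all three hosts, since each of them reports "Status up-to-date : False" there.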
--
Wee