[ovirt-users] [hosted-engine] engine VM doesn't respawn when its host was killed (poweroff)

Martin Sivak msivak at redhat.com
Mon Apr 25 09:19:36 UTC 2016


Hi,

It seems that all nodes lost access to the storage for some reason after
the host was killed: your second status output shows stale data on every
host. Where is your hosted engine storage located?
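
For reference, the stale-data check can be scripted. The following is only a
minimal Python sketch, not part of the hosted-engine tooling: it assumes the
plain-text layout that `hosted-engine --vm-status` printed in the quoted
output, and the names parse_vm_status and all_hosts_stale are hypothetical.

```python
import re

# Section header printed before each host's block in the quoted output.
HOST_HEADER = re.compile(r"--== Host (\d+) status ==--")


def parse_vm_status(text):
    """Parse status text into {host_id: {field_name: value}} (all strings)."""
    hosts = {}
    current = None   # fields of the host block being read
    last_key = None  # last field seen, used to re-join wrapped values
    for raw in text.splitlines():
        # Drop mail quoting ("> ") and surrounding whitespace.
        line = re.sub(r"^\s*>\s?", "", raw).strip()
        if not line:
            last_key = None
            continue
        header = HOST_HEADER.match(line)
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            last_key = None
            continue
        if current is None:
            continue  # text before the first host block
        key, sep, value = line.partition(" : ")
        if sep:
            last_key = key.strip()
            current[last_key] = value.strip()
        elif last_key is not None:
            # Continuation of a value that a mail client wrapped.
            current[last_key] += " " + line
    return hosts


def all_hosts_stale(hosts):
    """True when every parsed host reports 'Status up-to-date : False'."""
    return bool(hosts) and all(
        fields.get("Status up-to-date") == "False" for fields in hosts.values()
    )


# Trimmed excerpt of the second status output quoted in this thread.
SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : False
Hostname                           : host01.ovirt.forest.go.th
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400

--== Host 2 status ==--

Status up-to-date                  : False
Hostname                           : host02.ovirt.forest.go.th
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 0
"""

if __name__ == "__main__":
    print(all_hosts_stale(parse_vm_status(SAMPLE)))
```

If all_hosts_stale returns True for output captured on a surviving host, no
agent has refreshed its metadata recently, which would be consistent with a
storage problem rather than with a single failed host.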

Regards

--
Martin Sivak
SLA / oVirt


On Mon, Apr 25, 2016 at 10:58 AM, Wee Sritippho <wee.s at forest.go.th> wrote:
> Hi,
>
> According to the hosted-engine FAQ, the engine VM should be up and running
> about 5 minutes after its host is forcibly powered off. However, after
> updating oVirt from 3.6.4 to 3.6.5, the engine VM no longer restarts
> automatically, even after 10+ minutes (I already made sure that global
> maintenance mode is set to none). I initially thought it was a time sync
> issue, so I installed and enabled ntp on the hosts and the engine. However,
> the issue still persists.
>
> ###Versions:
> [root@host01 ~]# rpm -qa | grep ovirt
> libgovirt-0.3.3-1.el7_2.1.x86_64
> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
> ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
> ovirt-hosted-engine-ha-1.3.5.3-1.el7.centos.noarch
> ovirt-host-deploy-1.4.1-1.el7.centos.noarch
> ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
> ovirt-hosted-engine-setup-1.3.5.0-1.el7.centos.noarch
> ovirt-release36-007-1.noarch
> ovirt-setup-lib-1.0.1-1.el7.centos.noarch
> [root@host01 ~]# rpm -qa | grep vdsm
> vdsm-infra-4.17.26-0.el7.centos.noarch
> vdsm-jsonrpc-4.17.26-0.el7.centos.noarch
> vdsm-gluster-4.17.26-0.el7.centos.noarch
> vdsm-python-4.17.26-0.el7.centos.noarch
> vdsm-yajsonrpc-4.17.26-0.el7.centos.noarch
> vdsm-4.17.26-0.el7.centos.noarch
> vdsm-cli-4.17.26-0.el7.centos.noarch
> vdsm-xmlrpc-4.17.26-0.el7.centos.noarch
> vdsm-hook-vmfex-dev-4.17.26-0.el7.centos.noarch
>
> ###Log files:
> https://app.box.com/s/fkurmwagogwkv5smkwwq7i4ztmwf9q9r
>
> ###After host02 was killed:
> [root@host03 wees]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : host01.ovirt.forest.go.th
> Host ID                            : 1
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 396766e0
> Host timestamp                     : 4391
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : host02.ovirt.forest.go.th
> Host ID                            : 2
> Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
> Score                              : 0
> stopped                            : True
> Local maintenance                  : False
> crc32                              : 3a345b65
> Host timestamp                     : 1458
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : True
> Hostname                           : host03.ovirt.forest.go.th
> Host ID                            : 3
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 4c34b0ed
> Host timestamp                     : 11958
>
> ###A while after host02 was killed:
> [root@host03 wees]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date                  : False
> Hostname                           : host01.ovirt.forest.go.th
> Host ID                            : 1
> Engine status                      : unknown stale-data
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 72e4e418
> Host timestamp                     : 4415
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : False
> Hostname                           : host02.ovirt.forest.go.th
> Host ID                            : 2
> Engine status                      : unknown stale-data
> Score                              : 0
> stopped                            : True
> Local maintenance                  : False
> crc32                              : 3a345b65
> Host timestamp                     : 1458
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : False
> Hostname                           : host03.ovirt.forest.go.th
> Host ID                            : 3
> Engine status                      : unknown stale-data
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 4c34b0ed
> Host timestamp                     : 11958
>
> ###After host02 was completely up again:
> [root@host03 wees]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : host01.ovirt.forest.go.th
> Host ID                            : 1
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 0
> stopped                            : False
> Local maintenance                  : False
> crc32                              : f5728fca
> Host timestamp                     : 5555
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : host02.ovirt.forest.go.th
> Host ID                            : 2
> Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : e5284763
> Host timestamp                     : 715
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : True
> Hostname                           : host03.ovirt.forest.go.th
> Host ID                            : 3
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : bc10c7fc
> Host timestamp                     : 13119
>
> --
> Wee