Try the VNC console: 'hosted-engine --add-console-password'
Then connect to the IP:port that the command returns and check what is going on.
You may need to boot the VM from a rescue DVD, mount all filesystems, and unmount them again.
After that, just power it off and power it on normally.
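
The steps above could be scripted roughly as follows. This is only a sketch: the helper names (run, he_open_console, he_power_cycle) and the DRY_RUN switch are made up for illustration, and it assumes 'hosted-engine --vm-poweroff' and 'hosted-engine --vm-start' are the way you power-cycle the engine VM on your setup. By default it only prints the commands.

```shell
# Hypothetical helpers; DRY_RUN=1 (the default) prints each command instead of running it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: set a temporary console password; the command reports where to connect.
he_open_console() {
  run hosted-engine --add-console-password
}

# Step 2 (after inspecting or repairing the guest over VNC): power-cycle the engine VM
# and show the resulting status.
he_power_cycle() {
  run hosted-engine --vm-poweroff
  run hosted-engine --vm-start
  run hosted-engine --vm-status
}
```

Call he_open_console first, do the repair work in the VNC session, and only then call he_power_cycle. Set DRY_RUN=0 once the printed commands look right.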

If you can't use a custom engine config, use the XML definition from the VDSM log.
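
One possible way to pull that definition out of the log is sketched below. It assumes the log contains the complete libvirt <domain> ... </domain> element and that the most recent one is the one you want; the function name last_domain_xml is hypothetical, and the exact log layout may differ between VDSM versions, so check the result before using it.

```shell
# Print the last <domain>...</domain> block found in the given log file.
last_domain_xml() {
  awk '
    # Start (or restart) capturing at a <domain ...> tag, dropping any log prefix before it.
    /<domain[ >]/ { sub(/^.*<domain/, "<domain"); buf = ""; on = 1 }
    # While capturing, collect every line.
    on            { buf = buf $0 "\n" }
    # At the closing tag, remember this block as the latest complete one.
    /<\/domain>/  { if (on) { last = buf; on = 0 } }
    END           { printf "%s", last }
  ' "$1"
}
```

For example, 'last_domain_xml /var/log/vdsm/vdsm.log > /root/HostedEngine.xml' (the output path is arbitrary), then review the file before passing it to 'virsh define'.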

You will also need this alias so that you can use virsh freely (define/start/destroy):
alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
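
Aliases are not expanded in non-interactive bash scripts by default, so for scripted use the same connection URI can be wrapped in a function instead. A minimal sketch: the names virsh_he and VIRSH are made up here, and VIRSH exists only so the binary can be swapped out when trying the function.

```shell
# Same connection URI as in the alias above.
HE_VIRSH_URI='qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'

# Run virsh against the local libvirt using the hosted-engine auth file.
virsh_he() {
  "${VIRSH:-virsh}" -c "$HE_VIRSH_URI" "$@"
}
```

Typical calls would be 'virsh_he list --all', 'virsh_he define /root/HostedEngine.xml' (with whatever file holds the XML definition), 'virsh_he start HostedEngine', and 'virsh_he destroy HostedEngine'.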

Best Regards,
Strahil Nikolov

On Apr 15, 2019 22:35, Stefan Wolf <shb256@gmail.com> wrote:

Hello all,

 

After a power loss, the hosted engine won't start up anymore.

I have the current oVirt installed.

Storage is GlusterFS, and it is up and running.

The system keeps trying to start the hosted engine, but it does not succeed, and I can't see where the problem is.

 

[root@kvm320 ~]# hosted-engine --vm-status

 

 

--== Host 1 status ==--

 

conf_on_shared_storage             : True

Status up-to-date                  : True

Hostname                           : kvm380.durchhalten.intern

Host ID                            : 1

Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "Down"}

Score                              : 1800

stopped                            : False

Local maintenance                  : False

crc32                              : 3ad6d0bd

local_conf_timestamp               : 14594

Host timestamp                     : 14594

Extra metadata (valid at timestamp):

        metadata_parse_version=1

        metadata_feature_version=1

        timestamp=14594 (Mon Apr 15 21:25:12 2019)

        host-id=1

        score=1800

        vm_conf_refresh_time=14594 (Mon Apr 15 21:25:12 2019)

        conf_on_shared_storage=True

        maintenance=False

        state=GlobalMaintenance

        stopped=False

 

 

--== Host 2 status ==--

 

conf_on_shared_storage             : True

Status up-to-date                  : True

Hostname                           : kvm320.durchhalten.intern

Host ID                            : 2

Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}

Score                              : 0

stopped                            : False

Local maintenance                  : False

crc32                              : e7d4840d

local_conf_timestamp               : 21500

Host timestamp                     : 21500

Extra metadata (valid at timestamp):

        metadata_parse_version=1

        metadata_feature_version=1

        timestamp=21500 (Mon Apr 15 21:25:22 2019)

        host-id=2

        score=0

        vm_conf_refresh_time=21500 (Mon Apr 15 21:25:22 2019)

        conf_on_shared_storage=True

        maintenance=False

        state=ReinitializeFSM

        stopped=False

 

 

--== Host 3 status ==--

 

conf_on_shared_storage             : True

Status up-to-date                  : True

Hostname                           : kvm360.durchhalten.intern

Host ID                            : 3

Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}

Score                              : 1800

stopped                            : False

Local maintenance                  : False

crc32                              : cf9221cb

local_conf_timestamp               : 22121

Host timestamp                     : 22120

Extra metadata (valid at timestamp):

        metadata_parse_version=1

        metadata_feature_version=1

        timestamp=22120 (Mon Apr 15 21:25:18 2019)

        host-id=3

        score=1800

        vm_conf_refresh_time=22121 (Mon Apr 15 21:25:18 2019)

        conf_on_shared_storage=True

        maintenance=False

        state=GlobalMaintenance

        stopped=False

 

[root@kvm320 ~]# virsh -r list

Id    Name                           State

----------------------------------------------------

6     HostedEngine                   running

 

[root@kvm320 ~]# hosted-engine --console

The engine VM is running on this host

Verbunden mit der Domain: HostedEngine

Escape-Zeichen ist ^]

Fehler: Interner Fehler: Zeichengerät <null> kann nicht gefunden werden

 

In English it should be this:

 

[root@mgmt~]# hosted-engine --console
The engine VM is running on this host
Connected to domain HostedEngine
Escape character is ^]
error: internal error: cannot find character device

 

This is in the log:

 

[root@kvm320 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log

MainThread::INFO::2019-04-15 21:28:33,032::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)

MainThread::INFO::2019-04-15 21:28:43,050::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..

MainThread::INFO::2019-04-15 21:28:43,165::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)

MainThread::INFO::2019-04-15 21:28:53,183::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..

MainThread::INFO::2019-04-15 21:28:53,300::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)

MainThread::INFO::2019-04-15 21:29:03,317::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..

MainThread::INFO::2019-04-15 21:29:03,434::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)

MainThread::INFO::2019-04-15 21:29:13,453::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..

MainThread::INFO::2019-04-15 21:29:13,571::states::136::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Penalizing score by 1600 due to gateway status

MainThread::INFO::2019-04-15 21:29:13,571::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)

MainThread::INFO::2019-04-15 21:29:22,589::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..

MainThread::INFO::2019-04-15 21:29:22,712::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)

 

But the engine VM is not reachable over the network:

 

[root@kvm320 ~]# ping 192.168.200.211

PING 192.168.200.211 (192.168.200.211) 56(84) bytes of data.

From 192.168.200.231 icmp_seq=1 Destination Host Unreachable

From 192.168.200.231 icmp_seq=2 Destination Host Unreachable

From 192.168.200.231 icmp_seq=3 Destination Host Unreachable

From 192.168.200.231 icmp_seq=4 Destination Host Unreachable

 

I tried to stop and start the VM again, but it didn't help.

 

Maybe someone can give me some advice on how to get the hosted engine running again.

 

Thanks and bye, Stefan