ovirt-ha-agent is logging the following:

 

MainThread::INFO::2019-08-19 15:26:13,302::hosted_engine::512::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineDown (score: 3400)

MainThread::INFO::2019-08-19 15:26:13,303::hosted_engine::520::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Best remote host ovirt-sj-01.ictv.com (id: 1, score: 3400)

MainThread::INFO::2019-08-19 15:26:13,314::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}

MainThread::INFO::2019-08-19 15:26:13,314::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host ovirt-sj-01.ictv.com (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=4777 (Mon Aug 19 15:19:31 2019)\nhost-id=1\nscore=3400\nvm_conf_refresh_time=4531 (Mon Aug 19 15:15:26 2019)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineDown\nstopped=False\n', 'hostname': 'ovirt-sj-01.ictv.com', 'alive': True, 'host-id': 1, 'engine-status': {'reason': 'bad vm status', 'health': 'bad', 'vm': 'down', 'detail': 'Down'}, 'score': 3400, 'stopped': False, 'maintenance': False, 'crc32': 'b1380c25', 'local_conf_timestamp': 4531, 'host-ts': 4777}

MainThread::INFO::2019-08-19 15:26:13,314::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'bridge': True, 'network': 1.0, 'mem-free': 191320.0, 'maintenance': False, 'cpu-load': 0.0049, 'storage-domain': True}

MainThread::INFO::2019-08-19 15:26:13,314::states::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down and local host has best score (3400), attempting to start engine VM

MainThread::INFO::2019-08-19 15:26:13,370::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineStart) sent? sent

MainThread::ERROR::2019-08-19 15:26:13,498::config_ovf::42::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm::(_get_vm_conf_content_from_ovf_store) Failed scanning for OVF_STORE due to Command Volume.getInfo with args {'storagepoolID': '00000000-0000-0000-0000-000000000000', 'storagedomainID': 'a03fb743-8004-4d54-823b-9be470a0e87b', 'volumeID': u'3f3ee39f-f687-4586-87bd-e5188958863a', 'imageID': u'8c9279b7-0321-49c9-bdd5-4bb94d863960'} failed:

(code=100, message=(13, 'Sanlock resource read failure', 'Permission denied'))

MainThread::ERROR::2019-08-19 15:26:13,498::config_ovf::84::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm::(_get_vm_conf_content_from_ovf_store) Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs
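To isolate the failing calls from the rest of the agent output, the log can be filtered by level. The following is only a minimal sketch: it assumes the `thread::LEVEL::timestamp::module::line::logger::(function) message` layout visible in the lines above, and the names `AgentLogRecord`, `parse_agent_log` and `errors_only` are hypothetical helpers, not part of ovirt-hosted-engine-ha. Continuation lines such as the `(code=100, ...)` line do not match the layout and are skipped.

```python
from typing import Iterable, Iterator, List, NamedTuple


class AgentLogRecord(NamedTuple):
    thread: str
    level: str
    timestamp: str
    module: str
    line: int
    logger: str
    message: str  # still carries the "(function)" prefix


def parse_agent_log(lines: Iterable[str]) -> Iterator[AgentLogRecord]:
    """Yield one record per line that matches the assumed '::' layout."""
    for raw in lines:
        # Six splits give seven fields; any '::' inside the message is kept.
        parts = raw.rstrip("\n").split("::", 6)
        if len(parts) != 7 or not parts[4].isdigit():
            continue  # blank line or continuation of a previous message
        thread, level, ts, module, lineno, logger, message = parts
        yield AgentLogRecord(thread, level, ts, module, int(lineno), logger, message)


def errors_only(lines: Iterable[str]) -> List[AgentLogRecord]:
    """Return only the records logged at ERROR level."""
    return [rec for rec in parse_agent_log(lines) if rec.level == "ERROR"]
```

It could be run against the agent log (normally /var/log/ovirt-hosted-engine-ha/agent.log) with `errors_only(open(path))`, then printing `rec.timestamp`, `rec.module` and `rec.message` for each record.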

 

Both domains are present:

 

10.210.13.64:/hosted_engine on /rhev/data-center/mnt/10.210.13.64:_hosted__engine type nfs4 (rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=10.210.13.12,local_lock=none,addr=10.210.13.64)

10.210.13.64:/ovirt_production on /rhev/data-center/mnt/10.210.13.64:_ovirt__production type nfs4 (rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.210.13.12,local_lock=none,addr=10.210.13.64)

 

Note: the IP 10.20.20.40 used in the first email was just an example.

 

 

— — —
Met vriendelijke groet / Kind regards,

Marko Vrgotic

 

 

 

From: "Vrgotic, Marko" <M.Vrgotic@activevideo.com>
Date: Monday, 19 August 2019 at 17:19
To: "users@ovirt.org" <users@ovirt.org>
Subject: Re: Issues with oVirt-Engine start - oVirt 4.3.4

 

Additionally,

 

When the agent tries to boot the engine, I get the following status:

 

[root@ovirt-sj-02 images]# hosted-engine --vm-status

 

 

--== Host ovirt-sj-01.ictv.com (id: 1) status ==--

 

conf_on_shared_storage             : True

Status up-to-date                  : True

Hostname                           : ovirt-sj-01.ictv.com

Host ID                            : 1

Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}

Score                              : 3400

stopped                            : False

Local maintenance                  : False

crc32                              : f4f95c83

local_conf_timestamp               : 4285

Host timestamp                     : 4285

Extra metadata (valid at timestamp):

                metadata_parse_version=1

                metadata_feature_version=1

                timestamp=4285 (Mon Aug 19 15:11:19 2019)

                host-id=1

                score=3400

                vm_conf_refresh_time=4285 (Mon Aug 19 15:11:20 2019)

                conf_on_shared_storage=True

                maintenance=False

                state=EngineStarting

                stopped=False

 

 

--== Host ovirt-sj-02.ictv.com (id: 2) status ==--

 

conf_on_shared_storage             : True

Status up-to-date                  : True

Hostname                           : ovirt-sj-02.ictv.com

Host ID                            : 2

Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "Down"}

Score                              : 3400

stopped                            : False

Local maintenance                  : False

crc32                              : c2669fe8

local_conf_timestamp               : 4153

Host timestamp                     : 4153

Extra metadata (valid at timestamp):

                metadata_parse_version=1

                metadata_feature_version=1

                timestamp=4153 (Mon Aug 19 15:09:47 2019)

                host-id=2

                score=3400

                vm_conf_refresh_time=4153 (Mon Aug 19 15:09:47 2019)

                conf_on_shared_storage=True

                maintenance=False

                state=EngineStart

                stopped=False

 

 

 

 

 

From: "Vrgotic, Marko" <M.Vrgotic@activevideo.com>
Date: Monday, 19 August 2019 at 17:17
To: "users@ovirt.org" <users@ovirt.org>
Subject: Issues with oVirt-Engine start - oVirt 4.3.4

 

Dear oVirt,

 

While working on a procedure to get an NFS v4 mount from NetApp working on oVirt, the following steps turned out to be the way to set it up for oVirt SHE and VM guests:

 

 

This works fine, and it needs to be executed on each hypervisor before it is provisioned into oVirt.

 

However, just today I discovered that running chmod -R 755 /mnt/rhevstore on a new, to-be-added hypervisor while oVirt is already running brings the oVirt Engine into a broken state.
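A recursive chmod on a mounted export also rewrites the modes of the storage-domain files that the running hosts share. The sketch below is one possible way to list what looks off afterwards. It rests on assumptions: oVirt expects NFS storage to be owned by vdsm:kvm (36:36), and the check that regular files should keep group write is my own guess at what sanlock needs, not something confirmed in this thread. `find_suspect_entries` is a hypothetical helper name.

```python
import os
import stat


def find_suspect_entries(root, expected_uid=36, expected_gid=36):
    """Return (path, reason) pairs for entries whose owner or mode looks off.

    36:36 is vdsm:kvm on an oVirt host. The group-write check is an
    assumption about what sanlock needs, not a documented rule.
    """
    suspects = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            if (st.st_uid, st.st_gid) != (expected_uid, expected_gid):
                suspects.append((path, "owner %d:%d" % (st.st_uid, st.st_gid)))
            elif stat.S_ISREG(st.st_mode) and not st.st_mode & stat.S_IWGRP:
                mode = oct(stat.S_IMODE(st.st_mode))
                suspects.append((path, "mode %s lacks group write" % mode))
    return suspects
```

Calling `find_suspect_entries("/mnt/rhevstore")` on the affected host and printing each pair gives a list to compare against a storage domain that was not touched by the chmod.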

 

The moment I executed the above on the 3rd hypervisor, before provisioning it into oVirt, the following occurred:

 

 

I am unable to boot the Engine VM; it ends up in status ForceStop.

 

hosted-engine --vm-status shows:

[root@ovirt-sj-02 ~]# hosted-engine --vm-status

The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

However, the storage is mounted and reachable, and ovirt-ha-agent is running:

[root@ovirt-sj-02 ~]# systemctl status ovirt-ha-agent

● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent

   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)

   Active: active (running) since Mon 2019-08-19 14:57:07 UTC; 23s ago

Main PID: 43388 (ovirt-ha-agent)

    Tasks: 2

   CGroup: /system.slice/ovirt-ha-agent.service

           └─43388 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent

 

Can somebody help me with what to do?

 

 

— — —
Met vriendelijke groet / Kind regards,

Marko Vrgotic