MainThread::INFO::2020-04-08 20:56:20,138::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
MainThread::INFO::2020-04-08 20:56:20,138::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2020-04-08 20:56:20,138::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-08 20:56:20,140::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-08 20:56:20,141::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2020-04-08 20:56:20,142::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2020-04-08 20:56:20,143::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2020-04-08 20:56:20,143::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::INFO::2020-04-08 20:56:20,197::storage_backends::373::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2020-04-08 20:56:20,197::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2020-04-08 20:56:20,414::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2020-04-08 20:56:20,628::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::WARNING::2020-04-08 20:56:21,057::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'a6cea67d-dbfb-45cf-a775-b4d0d47b26f2'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=a6cea67d-dbfb-45cf-a775-b4d0d47b26f2',))
MainThread::INFO::2020-04-08 20:56:21,901::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.6 started
MainThread::INFO::2020-04-08 20:56:21,901::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::ERROR::2020-04-08 20:57:00,799::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2020-04-08 20:57:00,799::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
MainThread::INFO::2020-04-08 20:57:11,144::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.3.6 started
MainThread::INFO::2020-04-08 20:57:11,182::hosted_engine::234::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: ovirt-node-01.phoelex.com
MainThread::INFO::2020-04-08 20:57:11,294::hosted_engine::543::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
MainThread::INFO::2020-04-08 20:57:11,296::brokerlink::80::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor network, options {'tcp_t_address': '', 'network_test': 'dns', 'tcp_t_port': '', 'addr': '192.168.1.99'}
MainThread::ERROR::2020-04-08 20:57:11,296::hosted_engine::559::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
MainThread::ERROR::2020-04-08 20:57:11,297::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
return action(he)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
return he.start_monitoring()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 432, in start_monitoring
self._initialize_broker()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 556, in _initialize_broker
m.get('options', {}))
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 89, in start_monitor
).format(t=type, o=options, e=e)
RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options: {'tcp_t_address': '', 'network_test': 'dns', 'tcp_t_port': '', 'addr': '192.168.1.99'}]
MainThread::ERROR::2020-04-08 20:57:11,297::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2020-04-08 20:57:11,297::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
On April 8, 2020 7:47:20 PM GMT+03:00, "Maton, Brett" <matonb@ltresources.co.uk> wrote:
>On the host you tried to restart the engine on:
>
>Add an alias to virsh (authenticates with virsh_auth.conf)
>
>alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
>
>Then run virsh:
>
>virsh
>
>virsh # list
> Id Name State
>----------------------------------------------------
> xx HostedEngine Paused
> xx ********** running
> ...
> xx ********** running
>
>HostedEngine should be in the list, try and resume the engine:
>
>virsh # resume HostedEngine
>
>On Wed, 8 Apr 2020 at 17:28, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>
>> Thanks!
>>
>> The status hangs due to, I guess, the VM being down....
>>
>> [root@ovirt-node-01 ~]# hosted-engine --vm-start
>> VM exists and is down, cleaning up and restarting
>> VM in WaitForLaunch
>>
>> but this doesn't seem to do anything. OK, after a while I get a status of
>> it being barfed...
>>
>> --== Host ovirt-node-00.phoelex.com (id: 1) status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : False
>> Hostname : ovirt-node-00.phoelex.com
>> Host ID : 1
>> Engine status : unknown stale-data
>> Score : 3400
>> stopped : False
>> Local maintenance : False
>> crc32 : 9c4a034b
>> local_conf_timestamp : 523362
>> Host timestamp : 523608
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=523608 (Wed Apr 8 16:17:11 2020)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=523362 (Wed Apr 8 16:13:06 2020)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineDown
>> stopped=False
>>
>>
>> --== Host ovirt-node-01.phoelex.com (id: 2) status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : True
>> Hostname : ovirt-node-01.phoelex.com
>> Host ID : 2
>> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>> Score : 0
>> stopped : False
>> Local maintenance : False
>> crc32 : 5045f2eb
>> local_conf_timestamp : 1737037
>> Host timestamp : 1737283
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=1737283 (Wed Apr 8 16:16:17 2020)
>> host-id=2
>> score=0
>> vm_conf_refresh_time=1737037 (Wed Apr 8 16:12:11 2020)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUnexpectedlyDown
>> stopped=False
>>
>> On Wed, Apr 8, 2020 at 5:09 PM Maton, Brett <matonb@ltresources.co.uk> wrote:
>>
>>> First steps, on one of your hosts as root:
>>>
>>> To get information:
>>> hosted-engine --vm-status
>>>
>>> To start the engine:
>>> hosted-engine --vm-start
>>>
>>>
>>> On Wed, 8 Apr 2020 at 17:00, Shareef Jalloq <shareef@jalloq.co.uk> wrote:
>>>
>>>> So my engine has gone down and I can't ssh into it either. If I try to
>>>> log into the web-ui of the node it is running on, I get redirected because
>>>> the node can't reach the engine.
>>>>
>>>> What are my next steps?
>>>>
>>>> Shareef.
>>>> _______________________________________________
>>>> Users mailing list -- users@ovirt.org
>>>> To unsubscribe send an email to users-leave@ovirt.org
>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>> oVirt Code of Conduct:
>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives:
>>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/W7BP57OCIRSW5CDRQWR5MIKJUH3ISLCQ/
>>>>
>>>
This has to be resolved:
Engine status : unknown stale-data
Run 'hosted-engine --vm-status' again. If the status is still stale, restart ovirt-ha-broker.service and ovirt-ha-agent.service.
Verify that the engine's storage is available, then monitor the broker and agent logs in /var/log/ovirt-hosted-engine-ha.
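When watching those logs, it can help to filter out the INFO noise. Below is a minimal sketch that keeps only WARNING and ERROR records. It assumes every record follows the '::'-separated layout of the lines pasted above (thread, level, timestamp, module, line number, logger, then "(function) message"); that layout is only my reading of these lines, and the names Record, parse_line and problems are made up for this example.

```python
from typing import Iterable, Iterator, NamedTuple, Optional
import re


class Record(NamedTuple):
    level: str
    timestamp: str
    logger: str
    func: str
    message: str


# The last field looks like "(function_name) free-form message".
_REST = re.compile(r"\((?P<func>[^)]*)\)\s?(?P<message>.*)", re.DOTALL)


def parse_line(line: str) -> Optional[Record]:
    """Parse one log line; return None for continuation lines (tracebacks etc.)."""
    # Assumed layout: thread::level::timestamp::module::lineno::logger::rest
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None
    _thread, level, timestamp, _module, _lineno, logger, rest = parts
    match = _REST.match(rest)
    if match is None:
        return None
    return Record(level, timestamp, logger, match.group("func"), match.group("message"))


def problems(lines: Iterable[str], levels=("WARNING", "ERROR")) -> Iterator[Record]:
    """Yield only the records whose level is in `levels`."""
    for line in lines:
        record = parse_line(line)
        if record is not None and record.level in levels:
            yield record
```

To use it, open a log file from /var/log/ovirt-hosted-engine-ha, pass the file object to problems() and print each record's timestamp, level and message. Traceback lines are skipped on purpose, so go back to the raw log for the full stack once you know which timestamp to look at.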
Best Regards,
Strahil Nikolov