The old hypervisor (oVirt 4.1.8) has a problem releasing the HE because of
this exception. The HE is on NFS storage.
MainThread::INFO::2018-01-09
10:40:28,497::upgrade::998::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36)
Host configuration is already up-to-date
MainThread::INFO::2018-01-09
10:40:28,498::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
Reloading vm.conf from the shared storage domain
MainThread::INFO::2018-01-09
10:40:28,498::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2018-01-09
10:40:28,498::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2018-01-09
10:40:28,498::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
OVF_STORE volume path:
/var/run/vdsm/storage/3981424d-a55c-4f07-bff2-aca316a95d1f/3513775f-d6b0-4423-be19-bbeb79c72ad2/7ee3f450-5976-48f8-b667-27b48f6cf778
MainThread::INFO::2018-01-09
10:40:28,517::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
Found an OVF for HE VM, trying to convert
MainThread::ERROR::2018-01-09
10:40:28,523::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 421, in start_monitoring
    self._config.refresh_vm_conf()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 496, in refresh_vm_conf
    content_from_ovf = self._get_vm_conf_content_from_ovf_store()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 438, in _get_vm_conf_content_from_ovf_store
    conf = ovf2VmParams.confFromOvf(heovf)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py", line 283, in confFromOvf
    vmConf = toDict(ovf)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py", line 210, in toDict
    vmParams['vmId'] = tree.find('Content/Section').attrib[OVF_NS + 'id']
  File "lxml.etree.pyx", line 2272, in lxml.etree._Attrib.__getitem__ (src/lxml/lxml.etree.c:55336)
KeyError: '{http://schemas.dmtf.org/ovf/envelope/1/}id'
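For anyone trying to reproduce this outside the agent: the traceback suggests the OVF read from the OVF_STORE has a Content/Section element without the namespaced `id` attribute. Below is a minimal sketch of that failure mode, not the agent's code. It uses the standard-library ElementTree instead of lxml, both sample documents are made up for illustration, and the helper names `vm_id_strict` and `vm_id_or_none` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Same namespace prefix that appears in the KeyError message.
OVF_NS = "{http://schemas.dmtf.org/ovf/envelope/1/}"

# Hypothetical, heavily trimmed OVF documents (assumption: real ones are much larger).
OVF_WITH_ID = """<ovf:Envelope xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1/">
  <Content>
    <Section ovf:id="9a8ea503-f598-433e-9751-93aee3e7b347"/>
  </Content>
</ovf:Envelope>"""

OVF_WITHOUT_ID = """<ovf:Envelope xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1/">
  <Content>
    <Section/>
  </Content>
</ovf:Envelope>"""


def vm_id_strict(ovf_text):
    """Direct attribute lookup, as in the traceback; raises KeyError if ovf:id is missing."""
    tree = ET.fromstring(ovf_text)
    return tree.find("Content/Section").attrib[OVF_NS + "id"]


def vm_id_or_none(ovf_text):
    """Tolerant lookup: returns None when the section or the attribute is absent."""
    section = ET.fromstring(ovf_text).find("Content/Section")
    if section is None:
        return None
    return section.attrib.get(OVF_NS + "id")


if __name__ == "__main__":
    print(vm_id_or_none(OVF_WITH_ID))
    print(vm_id_or_none(OVF_WITHOUT_ID))
    try:
        vm_id_strict(OVF_WITHOUT_ID)
    except KeyError as exc:
        print("KeyError:", exc)
```

Running the tolerant lookup against the OVF extracted from the OVF_STORE volume would show whether the attribute is really missing there.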
On 09/01/2018 10:18, Peter Hudec wrote:
The HA score is flapping between 3400 and 0. ;(
I'm also not able to migrate any other VM to this host.
Logs from the /var/log/ovirt-hosted-engine-ha/agent.log file:
MainThread::INFO::2018-01-08
21:44:45,805::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host dipovirt03.cnc.sk (id 1): {'conf_on_shared_storage': True, 'extra':
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=8232312
(Mon Jan 8 21:44:29
2018)\nhost-id=1\nscore=3400\nvm_conf_refresh_time=8232316 (Mon Jan 8
21:44:33
2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUp\nstopped=False\n',
'hostname': 'dipovirt03.cnc.sk', 'host-id': 1,
'engine-status':
{'health': 'good', 'vm': 'up', 'detail':
'up'}, 'score': 3400,
'stopped': False, 'maintenance': False, 'crc32':
'f28d4648',
'local_conf_timestamp': 8232316, 'host-ts': 8232312}
MainThread::INFO::2018-01-08
21:44:45,805::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host dipovirt02.cnc.sk (id 3): {'conf_on_shared_storage': True, 'extra':
'metadata_parse_version=1\nmetada...skipping...
neVMOVF) OVF_STORE volume path:
/var/run/vdsm/storage/3981424d-a55c-4f07-bff2-aca316a95d1f/3513775f-d6b0-4423-be19-bbeb79c72ad2/7ee3f450-5976-48f8-b667-27b48f6cf778
MainThread::INFO::2018-01-09
10:15:13,904::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Global metadata: {'maintenance': False}
MainThread::INFO::2018-01-09
10:15:13,905::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host dipovirt03.cnc.sk (id 1): {'conf_on_shared_storage': True, 'extra':
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=38598
(Tue Jan 9 09:23:33
2018)\nhost-id=1\nscore=3400\nvm_conf_refresh_time=38598 (Tue Jan 9
09:23:34
2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUp\nstopped=False\n',
'hostname': 'dipovirt03.cnc.sk', 'alive': False,
'host-id': 1,
'engine-status': {'health': 'good', 'vm': 'up',
'detail': 'up'},
'score': 3400, 'stopped': False, 'maintenance': False,
'crc32':
'4c1d1890', 'local_conf_timestamp': 38598, 'host-ts': 38598}
MainThread::INFO::2018-01-09
10:15:13,905::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host dipovirt02.cnc.sk (id 3): {'conf_on_shared_storage': True, 'extra':
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=40677
(Tue Jan 9 09:24:11
2018)\nhost-id=3\nscore=3400\nvm_conf_refresh_time=40677 (Tue Jan 9
09:24:11
2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineDown\nstopped=False\n',
'hostname': 'dipovirt02.cnc.sk', 'alive': False,
'host-id': 3,
'engine-status': {'reason': 'vm not running on this host',
'health':
'bad', 'vm': 'down', 'detail': 'unknown'},
'score': 3400, 'stopped':
False, 'maintenance': False, 'crc32': '3bf104bc',
'local_conf_timestamp': 40677, 'host-ts': 40677}
MainThread::INFO::2018-01-09
10:15:13,905::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Local (id 2): {'engine-health': {'reason': 'vm not running on this
host', 'health': 'bad', 'vm': 'down',
'detail': 'unknown'}, 'bridge':
True, 'mem-free': 39540.0, 'maintenance': False, 'cpu-load':
0.0432,
'gateway': 1.0, 'storage-domain': True}
MainThread::INFO::2018-01-09
10:15:13,905::states::775::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Another host already took over..
MainThread::INFO::2018-01-09
10:15:13,928::state_decorators::88::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
Timeout cleared while transitioning <class
'ovirt_hosted_engine_ha.agent.states.EngineStarting'> -> <class
'ovirt_hosted_engine_ha.agent.states.EngineForceStop'>
MainThread::INFO::2018-01-09
10:15:14,046::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineStarting-EngineForceStop) sent? sent
MainThread::INFO::2018-01-09
10:15:14,464::hosted_engine::494::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
Current state EngineForceStop (score: 3400)
MainThread::INFO::2018-01-09
10:15:14,467::hosted_engine::1002::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
Shutting down vm using `/usr/sbin/hosted-engine --vm-poweroff`
MainThread::INFO::2018-01-09
10:15:15,198::hosted_engine::1007::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
stdout:
MainThread::INFO::2018-01-09
10:15:15,198::hosted_engine::1008::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
stderr: Command VM.destroy with args {'vmID':
'9a8ea503-f598-433e-9751-93aee3e7b347'} failed:
(code=1, message=Virtual machine does not exist: {'vmId':
u'9a8ea503-f598-433e-9751-93aee3e7b347'})
MainThread::ERROR::2018-01-09
10:15:15,199::hosted_engine::1013::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff:
Command VM.destroy with args {'vmID':
'9a8ea503-f598-433e-9751-93aee3e7b347'} failed:
(code=1, message=Virtual machine does not exist: {'vmId':
u'9a8ea503-f598-433e-9751-93aee3e7b347'})
MainThread::ERROR::2018-01-09
10:15:15,199::hosted_engine::1019::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
Failed to stop engine VM: Command VM.destroy with args {'vmID':
'9a8ea503-f598-433e-9751-93aee3e7b347'} failed:
(code=1, message=Virtual machine does not exist: {'vmId':
u'9a8ea503-f598-433e-9751-93aee3e7b347'})
MainThread::INFO::2018-01-09
10:15:15,317::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineForceStop-ReinitializeFSM) sent? sent
MainThread::INFO::2018-01-09
10:15:15,356::hosted_engine::494::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
Current state ReinitializeFSM (score: 0)
MainThread::INFO::2018-01-09
10:15:25,560::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(ReinitializeFSM-EngineDown) sent? sent
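To see the score flapping at a glance, the "Current state ... (score: N)" messages can be pulled out of agent.log. This is only a rough sketch: the regular expression is based on the two messages quoted above, and the function name `score_transitions` is made up for the example.

```python
import re

# Matches messages such as "Current state EngineForceStop (score: 3400)".
STATE_RE = re.compile(r"Current state (\w+) \(score: (\d+)\)")


def score_transitions(log_text):
    """Return (state, score) pairs in the order they appear in the log text."""
    return [(m.group(1), int(m.group(2))) for m in STATE_RE.finditer(log_text)]


# Two messages copied from the log excerpt above.
SAMPLE = (
    "Current state EngineForceStop (score: 3400)\n"
    "Current state ReinitializeFSM (score: 0)\n"
)

if __name__ == "__main__":
    for state, score in score_transitions(SAMPLE):
        print(state, score)
```

On the host, the same function could be fed the contents of /var/log/ovirt-hosted-engine-ha/agent.log.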
Peter
On 09/01/2018 09:35, Yedidyah Bar David wrote:
>
> 3) Hosted Engine HA:
> Hosted Engine HA on upgraded hosts is 3400, the same as on the 4.1
> hosts. Is this good or bad?
>
>
> It's good.
--
*Peter Hudec*
Infrastructure Architect
phudec(a)cnc.sk <mailto:phudec@cnc.sk>
*CNC, a.s.*
Borská 6, 841 04 Bratislava
Reception: +421 2 35 000 100
Mobile: +421 905 997 203
*www.cnc.sk* <http://www.cnc.sk>