
On Tue, Jan 9, 2018 at 12:04 PM, Peter Hudec <phudec@cnc.sk> wrote:
A quick fix is to follow https://gerrit.ovirt.org/#/c/84802/2/backend/manager/modules/utils/src/main/java/org/ovirt/engine/core/utils/ovf/IOvfBuilder.java
and remove the trailing '/' in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py
Adding Denis. Thanks for the logs!
On 09/01/2018 10:45, Peter Hudec wrote:
The old hypervisor (oVirt 4.1.8) has a problem releasing the HE because of this exception. The HE is on NFS storage.
MainThread::INFO::2018-01-09 10:40:28,497::upgrade::998::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36) Host configuration is already up-to-date
MainThread::INFO::2018-01-09 10:40:28,498::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
MainThread::INFO::2018-01-09 10:40:28,498::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2018-01-09 10:40:28,498::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2018-01-09 10:40:28,498::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/3981424d-a55c-4f07-bff2-aca316a95d1f/3513775f-d6b0-4423-be19-bbeb79c72ad2/7ee3f450-5976-48f8-b667-27b48f6cf778
MainThread::INFO::2018-01-09 10:40:28,517::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
MainThread::ERROR::2018-01-09 10:40:28,523::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 421, in start_monitoring
    self._config.refresh_vm_conf()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 496, in refresh_vm_conf
    content_from_ovf = self._get_vm_conf_content_from_ovf_store()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 438, in _get_vm_conf_content_from_ovf_store
    conf = ovf2VmParams.confFromOvf(heovf)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py", line 283, in confFromOvf
    vmConf = toDict(ovf)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py", line 210, in toDict
    vmParams['vmId'] = tree.find('Content/Section').attrib[OVF_NS + 'id']
  File "lxml.etree.pyx", line 2272, in lxml.etree._Attrib.__getitem__ (src/lxml/lxml.etree.c:55336)
KeyError: '{http://schemas.dmtf.org/ovf/envelope/1/}id'
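For illustration only, here is a minimal Python sketch of why the attribute lookup in the traceback above can fail, and how a lookup that accepts both spellings of the namespace would behave. It uses the standard-library xml.etree.ElementTree instead of lxml. The sample OVF and the helper get_ovf_attr are made up for this example and are not part of ovirt_hosted_engine_ha. That the newer engine writes the namespace without the trailing '/' is an assumption, inferred from the quick fix mentioned in this thread.

```python
import xml.etree.ElementTree as ET

# Both spellings of the OVF envelope namespace. Which one a given engine
# version writes is an assumption here, inferred from the KeyError above.
OVF_NS_VARIANTS = (
    '{http://schemas.dmtf.org/ovf/envelope/1/}',
    '{http://schemas.dmtf.org/ovf/envelope/1}',
)


def get_ovf_attr(element, name):
    """Return the ovf-qualified attribute `name`, trying each namespace variant.

    Hypothetical helper, not a function from ovf2VmParams.py.
    """
    for ns in OVF_NS_VARIANTS:
        key = ns + name
        if key in element.attrib:
            return element.attrib[key]
    raise KeyError(name)


# Made-up, heavily trimmed OVF document (not taken from a real OVF_STORE).
# Note the namespace URI has no trailing '/'.
SAMPLE_OVF = (
    '<ovf:Envelope xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1">'
    '<Content><Section ovf:id="9a8ea503-f598-433e-9751-93aee3e7b347"/></Content>'
    '</ovf:Envelope>'
)

tree = ET.fromstring(SAMPLE_OVF)
section = tree.find('Content/Section')

# The strict lookup with the trailing-slash namespace misses the attribute ...
strict_key = OVF_NS_VARIANTS[0] + 'id'
strict_lookup_found = strict_key in section.attrib

# ... while the tolerant lookup finds it under the other spelling.
vm_id = get_ovf_attr(section, 'id')
```

With this sample, the key that includes the trailing '/' is absent from the attribute dictionary, which is the same condition that raises the KeyError in toDict.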
On 09/01/2018 10:18, Peter Hudec wrote:
The HA score is flapping between 3400 and 0. ;( I'm also not able to migrate any other VM to this host.
Logs from the /var/log/ovirt-hosted-engine-ha/agent.log file:
MainThread::INFO::2018-01-08 21:44:45,805::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host dipovirt03.cnc.sk (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=8232312 (Mon Jan 8 21:44:29 2018)\nhost-id=1\nscore=3400\nvm_conf_refresh_time=8232316 (Mon Jan 8 21:44:33 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUp\nstopped=False\n', 'hostname': 'dipovirt03.cnc.sk', 'host-id': 1, 'engine-status': {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 3400, 'stopped': False, 'maintenance': False, 'crc32': 'f28d4648', 'local_conf_timestamp': 8232316, 'host-ts': 8232312}
MainThread::INFO::2018-01-08 21:44:45,805::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host dipovirt02.cnc.sk (id 3): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetada...skipping...
neVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/3981424d-a55c-4f07-bff2-aca316a95d1f/3513775f-d6b0-4423-be19-bbeb79c72ad2/7ee3f450-5976-48f8-b667-27b48f6cf778
MainThread::INFO::2018-01-09 10:15:13,904::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
MainThread::INFO::2018-01-09 10:15:13,905::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host dipovirt03.cnc.sk (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=38598 (Tue Jan 9 09:23:33 2018)\nhost-id=1\nscore=3400\nvm_conf_refresh_time=38598 (Tue Jan 9 09:23:34 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUp\nstopped=False\n', 'hostname': 'dipovirt03.cnc.sk', 'alive': False, 'host-id': 1, 'engine-status': {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 3400, 'stopped': False, 'maintenance': False, 'crc32': '4c1d1890', 'local_conf_timestamp': 38598, 'host-ts': 38598}
MainThread::INFO::2018-01-09 10:15:13,905::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host dipovirt02.cnc.sk (id 3): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=40677 (Tue Jan 9 09:24:11 2018)\nhost-id=3\nscore=3400\nvm_conf_refresh_time=40677 (Tue Jan 9 09:24:11 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineDown\nstopped=False\n', 'hostname': 'dipovirt02.cnc.sk', 'alive': False, 'host-id': 3, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'score': 3400, 'stopped': False, 'maintenance': False, 'crc32': '3bf104bc', 'local_conf_timestamp': 40677, 'host-ts': 40677}
MainThread::INFO::2018-01-09 10:15:13,905::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'bridge': True, 'mem-free': 39540.0, 'maintenance': False, 'cpu-load': 0.0432, 'gateway': 1.0, 'storage-domain': True}
MainThread::INFO::2018-01-09 10:15:13,905::states::775::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Another host already took over..
MainThread::INFO::2018-01-09 10:15:13,928::state_decorators::88::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Timeout cleared while transitioning <class 'ovirt_hosted_engine_ha.agent.states.EngineStarting'> -> <class 'ovirt_hosted_engine_ha.agent.states.EngineForceStop'>
MainThread::INFO::2018-01-09 10:15:14,046::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineStarting-EngineForceStop) sent? sent
MainThread::INFO::2018-01-09 10:15:14,464::hosted_engine::494::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineForceStop (score: 3400)
MainThread::INFO::2018-01-09 10:15:14,467::hosted_engine::1002::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) Shutting down vm using `/usr/sbin/hosted-engine --vm-poweroff`
MainThread::INFO::2018-01-09 10:15:15,198::hosted_engine::1007::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) stdout:
MainThread::INFO::2018-01-09 10:15:15,198::hosted_engine::1008::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) stderr: Command VM.destroy with args {'vmID': '9a8ea503-f598-433e-9751-93aee3e7b347'} failed: (code=1, message=Virtual machine does not exist: {'vmId': u'9a8ea503-f598-433e-9751-93aee3e7b347'})
MainThread::ERROR::2018-01-09 10:15:15,199::hosted_engine::1013::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff: Command VM.destroy with args {'vmID': '9a8ea503-f598-433e-9751-93aee3e7b347'} failed: (code=1, message=Virtual machine does not exist: {'vmId': u'9a8ea503-f598-433e-9751-93aee3e7b347'})
MainThread::ERROR::2018-01-09 10:15:15,199::hosted_engine::1019::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) Failed to stop engine VM: Command VM.destroy with args {'vmID': '9a8ea503-f598-433e-9751-93aee3e7b347'} failed: (code=1, message=Virtual machine does not exist: {'vmId': u'9a8ea503-f598-433e-9751-93aee3e7b347'})
MainThread::INFO::2018-01-09 10:15:15,317::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineForceStop-ReinitializeFSM) sent? sent
MainThread::INFO::2018-01-09 10:15:15,356::hosted_engine::494::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state ReinitializeFSM (score: 0)
MainThread::INFO::2018-01-09 10:15:25,560::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (ReinitializeFSM-EngineDown) sent? sent
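To make the score flapping easier to follow in a long agent.log, the "Current state ... (score: N)" lines can be pulled out with a short script. This is only a sketch: the regular expression is derived from the two _monitoring_loop lines quoted in this message, the function name state_history is made up, and it assumes one log entry per line, as in the agent.log file itself.

```python
import re

# Matches the agent's "Current state <State> (score: <N>)" lines as they
# appear in the log excerpt above; other log formats are not handled.
STATE_RE = re.compile(
    r'::(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+::'
    r'.*\(_monitoring_loop\) Current state (?P<state>\w+) '
    r'\(score: (?P<score>\d+)\)'
)


def state_history(lines):
    """Return (timestamp, state, score) tuples for each monitoring-loop line."""
    history = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            history.append(
                (match.group('ts'), match.group('state'), int(match.group('score')))
            )
    return history


# Two lines taken from the log excerpt above.
SAMPLE_LINES = [
    "MainThread::INFO::2018-01-09 10:15:14,464::hosted_engine::494::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_monitoring_loop) Current state EngineForceStop (score: 3400)",
    "MainThread::INFO::2018-01-09 10:15:15,356::hosted_engine::494::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_monitoring_loop) Current state ReinitializeFSM (score: 0)",
]

history = state_history(SAMPLE_LINES)
for ts, state, score in history:
    print(ts, state, score)
```

On a real host the same function could be fed the lines of /var/log/ovirt-hosted-engine-ha/agent.log, for example with state_history(open(path)).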
Peter
On 09/01/2018 09:35, Yedidyah Bar David wrote:
3) Hosted Engine HA: Hosted Engine HA on upgraded hosts is 3400, the same as on the 4.1 hosts. Is this good or bad?
It's good.
--
Peter Hudec
Infrastructure architect
phudec@cnc.sk

CNC, a.s.
Borská 6, 841 04 Bratislava
Reception: +421 2 35 000 100
Mobile: +421 905 997 203
www.cnc.sk
-- Didi