The guest migration issue was caused by firewalld. This is what I found in the logs:
2018-01-09 10:12:06,700+0100 ERROR (migsrc/d6e3745b) [virt.vm] (vmId='d6e3745b-1444-42a3-8cc0-29eaf59b8520') Failed to migrate (migration:429)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 382, in run
    self._setupVdsConnection()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 219, in _setupVdsConnection
    client = self._createClient(port)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 206, in _createClient
    client_socket = utils.create_connected_socket(host, int(port), sslctx)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 950, in create_connected_socket
    sock.connect((host, port))
  File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 181, in connect
    self.socket.connect(addr)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 113] No route to host
Disabling firewalld solved the issue, so we are back to the original
problem with the firewall ;)
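For anyone who wants to confirm this kind of failure without starting a migration, here is a minimal sketch of a plain TCP probe from the source host to the destination. It is not what VDSM does internally (the traceback shows an SSL connection through M2Crypto); it only checks whether the TCP handshake gets through. The function name `probe_tcp` is hypothetical, and the host and port to test are whatever your source host is trying to reach.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a plain TCP connect and return (ok, detail)."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No answer at all: packets may be silently dropped on the way.
        return False, "timed out"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Errno 113 on Linux, the same error as in the traceback above.
            # A firewall that rejects with an ICMP "host prohibited" reply
            # commonly shows up this way (assumption, check your rules).
            return False, "no route to host"
        return False, "failed: %s" % exc
```

Calling `probe_tcp("dipovirt02.cnc.sk", 54321)` from the source host would be one way to use it; 54321 is assumed here to be the VDSM port and should be replaced with the port your setup actually uses.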
On 09/01/2018 11:08, Yedidyah Bar David wrote:
On Tue, Jan 9, 2018 at 12:04 PM, Peter Hudec <phudec(a)cnc.sk> wrote:
The quick fix is to follow
https://gerrit.ovirt.org/#/c/84802/2/backend/manager/modules/utils/src/ma...
and remove the trailing '/' in
/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py
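To illustrate why a single trailing '/' matters: the failing lookup in the quoted traceback is `attrib[OVF_NS + 'id']`, and a namespaced attribute key only matches if the namespace URI is identical character for character. The sketch below assumes the OVF in the OVF_STORE declares the namespace without the trailing slash, which is what the quick fix implies but which I have not verified against the engine sources. It uses the standard library `xml.etree.ElementTree` instead of lxml, and the helper `get_ovf_attr` and the sample XML are made up for the example.

```python
import xml.etree.ElementTree as ET

# Both spellings of the OVF envelope namespace. Which one a given engine
# version writes is an assumption to verify against a real OVF.
OVF_NS_VARIANTS = (
    "http://schemas.dmtf.org/ovf/envelope/1/",
    "http://schemas.dmtf.org/ovf/envelope/1",
)


def get_ovf_attr(element, name):
    """Return a namespaced attribute, accepting either namespace spelling."""
    for ns in OVF_NS_VARIANTS:
        key = "{%s}%s" % (ns, name)
        if key in element.attrib:
            return element.attrib[key]
    raise KeyError(name)


# Made-up minimal document using the namespace WITHOUT the trailing slash.
SAMPLE = (
    '<Envelope xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1">'
    '<Content><Section ovf:id="9a8ea503-f598-433e-9751-93aee3e7b347"/></Content>'
    '</Envelope>'
)

section = ET.fromstring(SAMPLE).find("Content/Section")
vm_id = get_ovf_attr(section, "id")
```

With this sample, a strict lookup using the trailing-slash namespace would raise the same kind of KeyError as in the agent log, while the tolerant helper finds the id. Editing the constant in ovf2VmParams.py as described above is the simpler workaround; the helper only shows the mechanism.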
Adding Denis. Thanks for the logs!
On 09/01/2018 10:45, Peter Hudec wrote:
> The old hypervisor (oVirt 4.1.8) has a problem releasing the HE because of
> this exception. The HE is on the NFS storage.
>
> MainThread::INFO::2018-01-09
>
10:40:28,497::upgrade::998::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36)
> Host configuration is already up-to-date
> MainThread::INFO::2018-01-09
>
10:40:28,498::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
> Reloading vm.conf from the shared storage domain
> MainThread::INFO::2018-01-09
>
10:40:28,498::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
> Trying to get a fresher copy of vm configuration from the OVF_STORE
> MainThread::INFO::2018-01-09
>
10:40:28,498::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> Extracting Engine VM OVF from the OVF_STORE
> MainThread::INFO::2018-01-09
>
10:40:28,498::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> OVF_STORE volume path:
>
/var/run/vdsm/storage/3981424d-a55c-4f07-bff2-aca316a95d1f/3513775f-d6b0-4423-be19-bbeb79c72ad2/7ee3f450-5976-48f8-b667-27b48f6cf778
> MainThread::INFO::2018-01-09
>
10:40:28,517::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
> Found an OVF for HE VM, trying to convert
> MainThread::ERROR::2018-01-09 10:40:28,523::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>     return action(he)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
>     return he.start_monitoring()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 421, in start_monitoring
>     self._config.refresh_vm_conf()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 496, in refresh_vm_conf
>     content_from_ovf = self._get_vm_conf_content_from_ovf_store()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 438, in _get_vm_conf_content_from_ovf_store
>     conf = ovf2VmParams.confFromOvf(heovf)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py", line 283, in confFromOvf
>     vmConf = toDict(ovf)
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/ovf/ovf2VmParams.py", line 210, in toDict
>     vmParams['vmId'] = tree.find('Content/Section').attrib[OVF_NS + 'id']
>   File "lxml.etree.pyx", line 2272, in lxml.etree._Attrib.__getitem__ (src/lxml/lxml.etree.c:55336)
> KeyError: '{http://schemas.dmtf.org/ovf/envelope/1/}id'
>
>
> On 09/01/2018 10:18, Peter Hudec wrote:
>> The HA score is flapping between 3400 and 0. ;(
>> I am also not able to migrate any other VM to this host.
>>
>> Logs from the /var/log/ovirt-hosted-engine-ha/agent.log file:
>>
>> MainThread::INFO::2018-01-08
>>
21:44:45,805::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host dipovirt03.cnc.sk (id 1): {'conf_on_shared_storage': True, 'extra':
>>
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=8232312
>> (Mon Jan 8 21:44:29
>> 2018)\nhost-id=1\nscore=3400\nvm_conf_refresh_time=8232316 (Mon
Jan 8
>> 21:44:33
>>
2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUp\nstopped=False\n',
>> 'hostname': 'dipovirt03.cnc.sk', 'host-id': 1, 'engine-status':
>> {'health': 'good', 'vm': 'up',
'detail': 'up'}, 'score': 3400,
>> 'stopped': False, 'maintenance': False, 'crc32':
'f28d4648',
>> 'local_conf_timestamp': 8232316, 'host-ts': 8232312}
>> MainThread::INFO::2018-01-08
>>
21:44:45,805::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host dipovirt02.cnc.sk (id 3): {'conf_on_shared_storage': True, 'extra':
>> 'metadata_parse_version=1\nmetada...skipping...
>> neVMOVF) OVF_STORE volume path:
>>
/var/run/vdsm/storage/3981424d-a55c-4f07-bff2-aca316a95d1f/3513775f-d6b0-4423-be19-bbe
>> b79c72ad2/7ee3f450-5976-48f8-b667-27b48f6cf778
>> MainThread::INFO::2018-01-09
>>
10:15:13,904::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Global metadata: {'maintenance': False}
>> MainThread::INFO::2018-01-09
>>
10:15:13,905::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host dipovirt03.cnc.sk (id 1): {'conf_on_shared_storage': True, 'extra':
>>
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=38598
>> (Tue Jan 9 09:23:33
>> 2018)\nhost-id=1\nscore=3400\nvm_conf_refresh_time=38598 (Tue Jan 9
>> 09:23:34
>>
2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUp\nstopped=False\n',
>> 'hostname': 'dipovirt03.cnc.sk', 'alive': False, 'host-id': 1,
>> 'engine-status': {'health': 'good', 'vm':
'up', 'detail': 'up'},
>> 'score': 3400, 'stopped': False, 'maintenance':
False, 'crc32':
>> '4c1d1890', 'local_conf_timestamp': 38598,
'host-ts': 38598}
>> MainThread::INFO::2018-01-09
>>
10:15:13,905::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host dipovirt02.cnc.sk (id 3): {'conf_on_shared_storage': True, 'extra':
>>
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=40677
>> (Tue Jan 9 09:24:11
>> 2018)\nhost-id=3\nscore=3400\nvm_conf_refresh_time=40677 (Tue Jan 9
>> 09:24:11
>>
2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineDown\nstopped=False\n',
>> 'hostname': 'dipovirt02.cnc.sk', 'alive': False, 'host-id': 3,
>> 'engine-status': {'reason': 'vm not running on this
host', 'health':
>> 'bad', 'vm': 'down', 'detail':
'unknown'}, 'score': 3400, 'stopped':
>> False, 'maintenance': False, 'crc32': '3bf104bc',
>> 'local_conf_timestamp': 40677, 'host-ts': 40677}
>> MainThread::INFO::2018-01-09
>>
10:15:13,905::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Local (id 2): {'engine-health': {'reason': 'vm not
running on this
>> host', 'health': 'bad', 'vm': 'down',
'detail': 'unknown'}, 'bridge':
>> True, 'mem-free': 39540.0, 'maintenance': False,
'cpu-load': 0.0432,
>> 'gateway': 1.0, 'storage-domain': True}
>> MainThread::INFO::2018-01-09
>>
10:15:13,905::states::775::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>> Another host already took over..
>> MainThread::INFO::2018-01-09
>>
10:15:13,928::state_decorators::88::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Timeout cleared while transitioning <class
>> 'ovirt_hosted_engine_ha.agent.states.EngineStarting'> ->
<class
>> 'ovirt_hosted_engine_ha.agent.states.EngineForceStop'>
>> MainThread::INFO::2018-01-09
>>
10:15:14,046::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (EngineStarting-EngineForceStop) sent? sent
>> MainThread::INFO::2018-01-09
>>
10:15:14,464::hosted_engine::494::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
>> Current state EngineForceStop (score: 3400)
>> MainThread::INFO::2018-01-09
>>
10:15:14,467::hosted_engine::1002::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
>> Shutting down vm using `/usr/sbin/hosted-engine --vm-poweroff`
>> MainThread::INFO::2018-01-09
>>
10:15:15,198::hosted_engine::1007::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
>> stdout:
>> MainThread::INFO::2018-01-09
>>
10:15:15,198::hosted_engine::1008::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
>> stderr: Command VM.destroy with args {'vmID':
>> '9a8ea503-f598-433e-9751-93aee3e7b347'} failed:
>> (code=1, message=Virtual machine does not exist: {'vmId':
>> u'9a8ea503-f598-433e-9751-93aee3e7b347'})
>>
>> MainThread::ERROR::2018-01-09
>>
10:15:15,199::hosted_engine::1013::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
>> Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff:
>> Command VM.destroy with args {'vmID':
>> '9a8ea503-f598-433e-9751-93aee3e7b347'} failed:
>> (code=1, message=Virtual machine does not exist: {'vmId':
>> u'9a8ea503-f598-433e-9751-93aee3e7b347'})
>>
>> MainThread::ERROR::2018-01-09
>>
10:15:15,199::hosted_engine::1019::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
>> Failed to stop engine VM: Command VM.destroy with args {'vmID':
>> '9a8ea503-f598-433e-9751-93aee3e7b347'} failed:
>> (code=1, message=Virtual machine does not exist: {'vmId':
>> u'9a8ea503-f598-433e-9751-93aee3e7b347'})
>>
>> MainThread::INFO::2018-01-09
>>
10:15:15,317::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (EngineForceStop-ReinitializeFSM) sent? sent
>> MainThread::INFO::2018-01-09
>>
10:15:15,356::hosted_engine::494::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
>> Current state ReinitializeFSM (score: 0)
>> MainThread::INFO::2018-01-09
>>
10:15:25,560::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (ReinitializeFSM-EngineDown) sent? sent
>>
>> Peter
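To see the flapping at a glance rather than scrolling through the agent log, a small filter over the "Current state ... (score: N)" entries may help. This is only a sketch: it assumes each entry sits on a single line in the real agent.log (the wrapping in this mail comes from the mail client), and the function name `score_transitions` is hypothetical.

```python
import re

# Matches the monitoring-loop entries, e.g. "Current state EngineDown (score: 3400)".
STATE_RE = re.compile(r"Current state (\w+) \(score: (\d+)\)")


def score_transitions(lines):
    """Return (state, score) pairs, keeping only entries where the score changed."""
    changes = []
    last_score = None
    for line in lines:
        match = STATE_RE.search(line)
        if not match:
            continue
        state, score = match.group(1), int(match.group(2))
        if score != last_score:
            changes.append((state, score))
            last_score = score
    return changes


# Message parts taken from the log excerpt above, one entry per line.
sample = [
    "Current state EngineForceStop (score: 3400)",
    "Success, was notification of state_transition (EngineForceStop-ReinitializeFSM) sent? sent",
    "Current state ReinitializeFSM (score: 0)",
]
transitions = score_transitions(sample)
```

To run it against the real file, pass an open file object, for example `score_transitions(open("/var/log/ovirt-hosted-engine-ha/agent.log"))`.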
>>
>> On 09/01/2018 09:35, Yedidyah Bar David wrote:
>>>
>>> 3) Hosted Engine HA:
>>> Hosted Engine HA on upgraded hosts is 3400, the same as on the 4.1
>>> hosts. Is this good or bad?
>>>
>>>
>>> It's good.
>>
>>
>
>
--
*Peter Hudec*
Infraštruktúrny architekt
phudec(a)cnc.sk
*CNC, a.s.*
Borská 6, 841 04 Bratislava
Recepcia: +421 2 35 000 100
Mobil: +421 905 997 203
*www.cnc.sk*
--
Didi