[ovirt-users] hosted engine setup on second host fails

Stefan Wendler stefan.wendler at tngtech.com
Wed Sep 24 14:14:59 UTC 2014


Oh well, I think this is fixed. I upgraded to 3.4.4 and the message
seems to be gone. The agents are running :)

Thank you very much !!! :)


On 09/24/2014 15:23, Stefan Wendler wrote:
> Okay, I'm truncating the previous mails here....
> 
> David's hint was the solution. I had already added the oVirt hosts to the
> cluster before trying to run the hosted-engine HA setup on them.
> 
> After removing the hosts from the cluster and putting the data domain into
> maintenance mode, I was able to deploy on all other nodes. I now have an
> HA'd hosted engine, which can also be migrated \o/
> 
> Maybe that is something that could be stated in the documentation more
> clearly?
> 
> Unfortunately, I now have a new problem. The agents crash shortly after
> startup. The error is the following
> (from /var/log/ovirt-hosted-engine-ha/agent.log):
> 
> AttributeError: 'NoneType' object has no attribute 'iteritems'
> 
> Here is the whole output. The agents were started and I tried a
> migration of the hosted engine from ovirt host 1 to host 2, which
> succeeded. But the agents crashed afterwards:
> 
> MainThread::INFO::2014-09-24
> 15:09:24,839::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
> ovirt-hosted-engine-ha agent 1.1.5 started
> MainThread::INFO::2014-09-24
> 15:09:24,871::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
> Found certificate common name: 10.8.2.101
> MainThread::INFO::2014-09-24
> 15:09:25,081::hosted_engine::367::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
> Initializing ha-broker connection
> MainThread::INFO::2014-09-24
> 15:09:25,082::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Starting monitor ping, options {'addr': '10.8.2.1'}
> MainThread::INFO::2014-09-24
> 15:09:25,083::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Success, id 25293072
> MainThread::INFO::2014-09-24
> 15:09:25,083::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name':
> 'ovirtmgmt', 'address': '0'}
> MainThread::INFO::2014-09-24
> 15:09:25,086::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Success, id 25294160
> MainThread::INFO::2014-09-24
> 15:09:25,086::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
> MainThread::INFO::2014-09-24
> 15:09:25,088::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Success, id 25293968
> MainThread::INFO::2014-09-24
> 15:09:25,088::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Starting monitor cpu-load-no-engine, options {'use_ssl': 'true',
> 'vm_uuid': 'e1ca293f-09e0-4d2e-8915-221839af1489', 'address': '0'}
> MainThread::INFO::2014-09-24
> 15:09:25,089::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Success, id 25360400
> MainThread::INFO::2014-09-24
> 15:09:25,089::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid':
> 'e1ca293f-09e0-4d2e-8915-221839af1489', 'address': '0'}
> MainThread::INFO::2014-09-24
> 15:09:25,091::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
> Success, id 25509776
> MainThread::INFO::2014-09-24
> 15:09:25,091::hosted_engine::391::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
> Broker initialized, all submonitors started
> MainThread::INFO::2014-09-24
> 15:09:25,125::hosted_engine::476::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
> Ensuring lease for lockspace hosted-engine, host id 2 is acquired (file:
> /rhev/data-center/mnt/10.8.2.12:_volume1_engine-store/e313da39-594c-46b5-95c9-c445889c745c/ha_agent/hosted-engine.lockspace)
> MainThread::INFO::2014-09-24
> 15:09:25,134::state_machine::153::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Global metadata: {'maintenance': False}
> MainThread::INFO::2014-09-24
> 15:09:25,134::state_machine::158::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.8.2.100 (id 1): {'live-data': True, 'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1411564164
> (Wed Sep 24 15:09:24
> 2014)\nhost-id=1\nscore=2400\nmaintenance=False\nstate=EngineUp\n',
> 'hostname': '10.8.2.100', 'host-id': 1, 'engine-status': {'health':
> 'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400, 'maintenance':
> False, 'host-ts': 1411564164}
> MainThread::INFO::2014-09-24
> 15:09:25,134::state_machine::158::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.8.2.102 (id 3): {'live-data': False, 'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1411562496
> (Wed Sep 24 14:41:36
> 2014)\nhost-id=3\nscore=0\nmaintenance=False\nstate=EngineUnexpectedlyDown\ntimeout=Wed
> Sep 24 14:50:24 2014\n', 'hostname': '10.8.2.102', 'host-id': 3,
> 'engine-status': {'reason': 'vm not running on this host', 'health':
> 'bad', 'vm': 'down', 'detail': 'unknown'}, 'score': 0, 'maintenance':
> False, 'host-ts': 1411562496}
> MainThread::INFO::2014-09-24
> 15:09:25,134::state_machine::161::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Local (id 2): {'engine-health': None, 'bridge': True, 'mem-free': None,
> 'maintenance': False, 'cpu-load': None, 'gateway': True}
> MainThread::INFO::2014-09-24
> 15:09:25,135::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1411564165.14 type=state_transition
> detail=StartState-ReinitializeFSM
> hostname='ovirt-node-mapconv2.int.tngtech.com'
> MainThread::INFO::2014-09-24
> 15:09:25,170::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (StartState-ReinitializeFSM) sent? sent
> MainThread::INFO::2014-09-24
> 15:09:25,383::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state ReinitializeFSM (score: 0)
> MainThread::INFO::2014-09-24
> 15:09:35,409::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1411564175.41 type=state_transition
> detail=ReinitializeFSM-EngineDown
> hostname='ovirt-node-mapconv2.int.tngtech.com'
> MainThread::INFO::2014-09-24
> 15:09:35,410::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (ReinitializeFSM-EngineDown) sent? ignored
> MainThread::INFO::2014-09-24
> 15:09:35,627::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-09-24
> 15:09:45,652::states::441::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> The engine is not running, but we do not have enough data to decide
> which hosts are alive
> MainThread::INFO::2014-09-24
> 15:09:45,653::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1411564185.65 type=state_transition
> detail=EngineDown-EngineDown hostname='ovirt-node-mapconv2.int.tngtech.com'
> MainThread::INFO::2014-09-24
> 15:09:45,653::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineDown-EngineDown)
> sent? ignored
> MainThread::INFO::2014-09-24
> 15:09:45,875::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::CRITICAL::2014-09-24
> 15:09:55,899::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could
> not start ha-agent
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line
> 97, in run
>     self._run_agent()
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line
> 154, in _run_agent
>     hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 307, in start_monitoring
>     for old_state, state, delay in self.fsm:
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py",
> line 125, in next
>     new_data = self.refresh(self._state.data)
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py",
> line 77, in refresh
>     stats.update(self.hosted_engine.collect_stats())
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 700, in collect_stats
>     stats = self.process_remote_metadata(host_id, remote_data)
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 747, in process_remote_metadata
>     md['engine-status'] = engine_status(md["engine-status"])
>   File
> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 79, in engine_status
>     in json.loads(status).iteritems()])
> AttributeError: 'NoneType' object has no attribute 'iteritems'
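
A note on the last frame of the traceback: it calls .iteritems() on the
result of json.loads(status). One way that result can be None is a stored
status equal to the JSON literal null, since json.loads("null") returns
None. Whether that is what happened here is an assumption; the log does not
confirm it. The sketch below is a hypothetical defensive variant of such a
parser, not the project's actual code and not the change shipped in any
release. It is written for Python 3 (items() instead of the Python 2
iteritems() seen in the traceback).

```python
import json


def engine_status(status):
    """Parse an engine-status JSON string into a dict of str -> str.

    Hypothetical defensive variant for illustration only: returns None
    instead of raising when the payload is missing, malformed, or not a
    JSON object.
    """
    if status is None:
        return None
    try:
        parsed = json.loads(status)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(parsed, dict):
        # json.loads("null") returns None, which has no items()/iteritems();
        # calling it unguarded raises the AttributeError from the traceback.
        return None
    return {str(key): str(value) for key, value in parsed.items()}


print(engine_status('{"health": "good", "vm": "up", "detail": "up"}'))
print(engine_status("null"))
```

With a guard like this, a caller would have to treat a None result as
"status unknown" rather than assume a dict is always returned.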