------=_Part_24816046_1946051580.1411575237782
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Seems we should consider not adding the host if it is already there. Please open a bug.
Though I really hope to see this done from the GUI in 3.6.
On Sep 24, 2014 4:23 PM, Stefan Wendler <stefan.wendler(a)tngtech.com> wrote:
Okay, I'm truncating the previous mails here....
David's hint was the solution. I had the ovirt hosts already added to the
cluster and tried to do the hosted-engine-ha setup on them.
After removing the hosts from the cluster and putting the data domain into
maintenance mode I was able to deploy on all other nodes. I now have a
HA'd hosted engine, which can also be migrated \o/
Maybe that is something that could be stated in the documentation more
clearly?
Unfortunately now I have a new problem. The agents crash rapidly after
startup. The error is the following:
(/var/log/ovirt-hosted-engine-ha/agent.log)
AttributeError: 'NoneType' object has no attribute 'iteritems'
Here is the whole output. The agents had been started and I tried a
migration of the hosted engine from ovirt host 1 to host 2, which
succeeded, but the agents crashed afterwards:
MainThread::INFO::2014-09-24
15:09:24,839::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
ovirt-hosted-engine-ha agent 1.1.5 started
MainThread::INFO::2014-09-24
15:09:24,871::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
Found certificate common name: 10.8.2.101
MainThread::INFO::2014-09-24
15:09:25,081::hosted_engine::367::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
Initializing ha-broker connection
MainThread::INFO::2014-09-24
15:09:25,082::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor ping, options {'addr': '10.8.2.1'}
MainThread::INFO::2014-09-24
15:09:25,083::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 25293072
MainThread::INFO::2014-09-24
15:09:25,083::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name':
'ovirtmgmt', 'address': '0'}
MainThread::INFO::2014-09-24
15:09:25,086::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 25294160
MainThread::INFO::2014-09-24
15:09:25,086::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
MainThread::INFO::2014-09-24
15:09:25,088::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 25293968
MainThread::INFO::2014-09-24
15:09:25,088::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor cpu-load-no-engine, options {'use_ssl': 'true',
'vm_uuid': 'e1ca293f-09e0-4d2e-8915-221839af1489', 'address': '0'}
MainThread::INFO::2014-09-24
15:09:25,089::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 25360400
MainThread::INFO::2014-09-24
15:09:25,089::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid':
'e1ca293f-09e0-4d2e-8915-221839af1489', 'address': '0'}
MainThread::INFO::2014-09-24
15:09:25,091::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 25509776
MainThread::INFO::2014-09-24
15:09:25,091::hosted_engine::391::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
Broker initialized, all submonitors started
MainThread::INFO::2014-09-24
15:09:25,125::hosted_engine::476::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
Ensuring lease for lockspace hosted-engine, host id 2 is acquired (file:
/rhev/data-center/mnt/10.8.2.12:_volume1_engine-store/e313da39-594c-46b5-95c9-c445889c745c/ha_agent/hosted-engine.lockspace)
MainThread::INFO::2014-09-24
15:09:25,134::state_machine::153::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Global metadata: {'maintenance': False}
MainThread::INFO::2014-09-24
15:09:25,134::state_machine::158::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host 10.8.2.100 (id 1): {'live-data': True, 'extra':
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1411564164
(Wed Sep 24 15:09:24
2014)\nhost-id=1\nscore=2400\nmaintenance=False\nstate=EngineUp\n',
'hostname': '10.8.2.100', 'host-id': 1, 'engine-status': {'health':
'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400, 'maintenance':
False, 'host-ts': 1411564164}
MainThread::INFO::2014-09-24
15:09:25,134::state_machine::158::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host 10.8.2.102 (id 3): {'live-data': False, 'extra':
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1411562496
(Wed Sep 24 14:41:36
2014)\nhost-id=3\nscore=0\nmaintenance=False\nstate=EngineUnexpectedlyDown\ntimeout=Wed
Sep 24 14:50:24 2014\n', 'hostname': '10.8.2.102', 'host-id': 3,
'engine-status': {'reason': 'vm not running on this host', 'health':
'bad', 'vm': 'down', 'detail': 'unknown'}, 'score': 0, 'maintenance':
False, 'host-ts': 1411562496}
MainThread::INFO::2014-09-24
15:09:25,134::state_machine::161::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Local (id 2): {'engine-health': None, 'bridge': True, 'mem-free': None,
'maintenance': False, 'cpu-load': None, 'gateway': True}
MainThread::INFO::2014-09-24
15:09:25,135::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1411564165.14 type=state_transition
detail=StartState-ReinitializeFSM
hostname='ovirt-node-mapconv2.int.tngtech.com'
MainThread::INFO::2014-09-24
15:09:25,170::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(StartState-ReinitializeFSM) sent? sent
MainThread::INFO::2014-09-24
15:09:25,383::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state ReinitializeFSM (score: 0)
MainThread::INFO::2014-09-24
15:09:35,409::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1411564175.41 type=state_transition
detail=ReinitializeFSM-EngineDown
hostname='ovirt-node-mapconv2.int.tngtech.com'
MainThread::INFO::2014-09-24
15:09:35,410::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(ReinitializeFSM-EngineDown) sent? ignored
MainThread::INFO::2014-09-24
15:09:35,627::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-09-24
15:09:45,652::states::441::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
The engine is not running, but we do not have enough data to decide
which hosts are alive
MainThread::INFO::2014-09-24
15:09:45,653::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1411564185.65 type=state_transition
detail=EngineDown-EngineDown hostname='ovirt-node-mapconv2.int.tngtech.com'
MainThread::INFO::2014-09-24
15:09:45,653::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineDown-EngineDown)
sent? ignored
MainThread::INFO::2014-09-24
15:09:45,875::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::CRITICAL::2014-09-24
15:09:55,899::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could
not start ha-agent
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 97, in run
    self._run_agent()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 154, in _run_agent
    hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 307, in start_monitoring
    for old_state, state, delay in self.fsm:
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 125, in next
    new_data = self.refresh(self._state.data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 77, in refresh
    stats.update(self.hosted_engine.collect_stats())
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 700, in collect_stats
    stats = self.process_remote_metadata(host_id, remote_data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 747, in process_remote_metadata
    md['engine-status'] = engine_status(md["engine-status"])
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 79, in engine_status
    in json.loads(status).iteritems()])
AttributeError: 'NoneType' object has no attribute 'iteritems'
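
For anyone reading this traceback later: the last frame calls json.loads(status).iteritems(), so json.loads must have returned None, which the standard json module does when the input string is the JSON literal null. Why a host's engine-status held null is not visible in this log. Below is a minimal sketch of a guard against that case. It is not the ovirt-hosted-engine-ha code: the helper name parse_engine_status and the placeholder fallback values are made up for illustration, and it is written for Python 3 (items()) rather than the Python 2.6 shown in the traceback (iteritems()).

```python
import json


def parse_engine_status(status):
    """Decode a JSON engine-status string into a plain dict.

    Hypothetical helper for illustration only; it is not the
    ovirt-hosted-engine-ha implementation.
    """
    try:
        decoded = json.loads(status)
    except (TypeError, ValueError):
        # TypeError: status is not a string (for example None).
        # ValueError: status is a string but not valid JSON.
        decoded = None

    if not isinstance(decoded, dict):
        # json.loads("null") returns None, which has no items()/iteritems().
        # The fallback values below are placeholders, not real agent states.
        return {"vm": "unknown", "health": "unknown",
                "detail": "unparsable status"}

    # Normalise keys and values to plain strings.
    return dict((str(key), str(value)) for key, value in decoded.items())
```

With a guard like this, a remote host with a null status would show up as "unknown" instead of stopping the agent. Whether that is the right behaviour for the HA state machine is a separate question for the bug report.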
------=_Part_24816046_1946051580.1411575237782--