[ovirt-users] Self-hosted engine won't start
Jiri Moskovcak
jmoskovc at redhat.com
Thu Jul 24 09:10:04 UTC 2014
Hi, please provide the exact versions of ovirt-hosted-engine-ha and
all logs from /var/log/ovirt-hosted-engine-ha/
Thank you,
Jirka
On 07/24/2014 01:29 AM, John Gardeniers wrote:
> Hi All,
>
> I have created a lab with 2 hypervisors and a self-hosted engine. Today
> I followed the upgrade instructions as described in
> http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I
> didn't really do an upgrade but simply wanted to test what would happen
> when the engine was rebooted.
>
> When the engine didn't restart I re-ran hosted-engine
> --set-maintenance=none and restarted the vdsm, ovirt-ha-agent and
> ovirt-ha-broker services on both nodes. 15 minutes later it still hadn't
> restarted, so I then tried rebooting both hypervisors. After an hour
> there was still no sign of the engine starting. The agent logs don't
> help me much. The following bits are repeated over and over.
>
> ovirt1 (192.168.19.20):
>
> MainThread::INFO::2014-07-24
> 09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1406157520.27 type=state_transition
> detail=EngineDown-EngineDown hostname='ovirt1.om.net'
> MainThread::INFO::2014-07-24
> 09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineDown-EngineDown)
> sent? ignored
> MainThread::INFO::2014-07-24
> 09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-07-24
> 09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 192.168.19.21 (id: 2, score: 2400)
>
> ovirt2 (192.168.19.21):
>
> MainThread::INFO::2014-07-24
> 09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1406157484.01 type=state_transition
> detail=EngineDown-EngineDown hostname='ovirt2.om.net'
> MainThread::INFO::2014-07-24
> 09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineDown-EngineDown)
> sent? ignored
> MainThread::INFO::2014-07-24
> 09:18:04,324::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-07-24
> 09:18:04,324::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 192.168.19.20 (id: 1, score: 2400)
>
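
To spot this pattern without reading the agent log by eye, a small script along these lines might help. It is only a sketch: the two regular expressions are written against the start_monitoring messages quoted above, the helper names (summarize, is_tied) are made up for illustration, and other agent versions may word these messages differently.

```python
import re

# Patterns written against the two start_monitoring messages quoted above.
LOCAL_RE = re.compile(r"Current state (\w+) \(score: (\d+)\)")
REMOTE_RE = re.compile(r"Best remote host (\S+) \(id: (\d+), score: (\d+)\)")


def summarize(log_text):
    """Return the last local state/score and best remote score in log_text."""
    summary = {}
    for match in LOCAL_RE.finditer(log_text):
        summary["state"] = match.group(1)
        summary["local_score"] = int(match.group(2))
    for match in REMOTE_RE.finditer(log_text):
        summary["remote_host"] = match.group(1)
        summary["remote_score"] = int(match.group(3))
    return summary


def is_tied(summary):
    """True when the engine is reported down and both scores are equal."""
    return (
        summary.get("state") == "EngineDown"
        and "local_score" in summary
        and summary["local_score"] == summary.get("remote_score")
    )


# Sample text copied from the ovirt1 excerpt above.
sample = (
    "Current state EngineDown (score: 2400)\n"
    "Best remote host 192.168.19.21 (id: 2, score: 2400)\n"
)
print(is_tied(summarize(sample)))
```

In practice the text of agent.log from /var/log/ovirt-hosted-engine-ha/ on each host would be passed to summarize() in place of the sample string.
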
> From the above information I decided to simply shut down one hypervisor
> and see what happens. The engine did start back up again a few minutes
> later.
>
> The interesting part is that each hypervisor seems to think the other is
> a better host. The two machines are identical, so there's no reason I
> can see for this odd behaviour. In a lab environment this is little more
> than an annoying inconvenience. In a production environment it would be
> completely unacceptable.
>
> May I suggest that this issue be looked into and some means found to
> eliminate this kind of mutual exclusion? e.g. After a few minutes of
> such an issue one hypervisor could be randomly given a slightly higher
> weighting, which should result in it being chosen to start the engine.
>
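
The random-weighting idea suggested above could be sketched roughly as follows. This is not how ovirt-hosted-engine-ha selects a host; it is a toy model that assumes every host can see the adjusted scores of the others (for example through shared metadata), and the function name pick_engine_host and the max_bonus parameter are invented for illustration.

```python
import random


def pick_engine_host(scores, rng=None, max_bonus=10):
    """Return the host that should start the engine.

    scores maps host name to its published score. Hosts tied for the top
    score each receive a small random bonus, redrawn until one host leads.
    """
    rng = rng or random.Random()
    adjusted = dict(scores)
    while True:
        top = max(adjusted.values())
        leaders = [host for host, score in adjusted.items() if score == top]
        if len(leaders) == 1:
            return leaders[0]
        # Still tied: give each tied host a fresh random bonus and retry.
        for host in leaders:
            adjusted[host] = scores[host] + rng.randint(0, max_bonus)


# Both hosts report 2400, as in the logs above; one of them is chosen.
print(pick_engine_host({"ovirt1": 2400, "ovirt2": 2400}))
```

A host with a clearly higher score is returned immediately, so the random bonus only matters when the scores are equal.
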
> regards,
> John