[ovirt-users] repeating EngineUnexpectedlyDown/EngineDown/EngineStart/EngineStarting

Robert Story rstory at tislabs.com
Thu Oct 29 16:37:44 UTC 2015


On Thu, 29 Oct 2015 16:00:27 +0100 Simone wrote:
ST> And indeed ares was host 1 so when it failed it was correctly trying to
ST> get lock for host 1 but it seems that previously it acquired a lock as
ST> a different host.
ST> Could you please check
ST>  grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf
ST> on ares and share vdsm and sanlock logs from that host?

$ for x in ares hera eclipse poseidon apollo; do echo "* $x"; ssh root@$x grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf 2>/dev/null; done
* ares
host_id=1
* hera
host_id=2
* eclipse
host_id=3
* poseidon
host_id=4
* apollo
host_id=5
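
All five IDs above are distinct. On a larger cluster a duplicate is easy to miss by eye, so here is a minimal sketch that flags any host_id claimed by more than one host. The helper name dup_host_ids and the "<host> host_id=<N>" input format are my own assumptions for illustration, not part of any oVirt tool:

```shell
# Hypothetical helper: reads lines of the form "<host> host_id=<N>" on stdin
# and prints each host_id that is claimed by more than one host.
# Lines with no host_id field (e.g. the ssh call failed) are skipped.
dup_host_ids() {
    awk 'NF >= 2 {
             split($2, kv, "=")              # kv[2] holds the numeric id
             seen[kv[2]] = seen[kv[2]] " " $1
             count[kv[2]]++
         }
         END {
             for (id in count)
                 if (count[id] > 1)
                     print "host_id " id ":" seen[id]
         }'
}

# Possible usage, reusing the loop above (prints nothing if all IDs differ):
#   for x in ares hera eclipse poseidon apollo; do
#       echo "$x $(ssh root@$x grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf 2>/dev/null)"
#   done | dup_host_ids
```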

Since I've upgraded, I figured I'd reproduce and send new logs. In that
process, I noticed that the ha-agent was down on 3 hosts, and the 2 other
hosts were the ones generating the messages. So I restarted ha-agent on all
5, disabled global maintenance for 2 minutes, re-enabled it, then ran a
grep on all the logs on all 5 hosts for those 2 minutes. I'll send that to
you directly, as it's rather large to be sending to the list.

On all 3 hosts where the ha-agent had been down, it was down again
afterwards, so I'm guessing that's the issue.
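
For anyone wanting to repeat the check, a rough sketch of how the agent state could be polled on every host. It assumes systemd-based hosts and that the unit is named ovirt-ha-agent; the RUN hook and the function names are made up here so the remote command can be swapped out for a dry run:

```shell
# Sketch only: assumes systemd hosts and a unit named ovirt-ha-agent.
# RUN is a hypothetical hook naming the function used to run a command
# on a host; by default it is ssh as root.
run_ssh() { ssh "root@$1" "$2" 2>/dev/null; }
RUN=${RUN:-run_ssh}

# Print "<host>: <state>" for each host given as an argument.
# An empty reply (host unreachable, command missing) is shown as "unknown".
agent_states() {
    for host in "$@"; do
        state=$("$RUN" "$host" "systemctl is-active ovirt-ha-agent")
        echo "$host: ${state:-unknown}"
    done
}

# Possible usage:
#   agent_states ares hera eclipse poseidon apollo
```

Running it before and after toggling global maintenance would show which agents stopped in between.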

Robert

-- 
Senior Software Engineer @ Parsons