
2 Oct 2014
1:43 p.m.
On 02.10.2014 09:51, Jiri Moskovcak wrote:
> On 10/01/2014 02:39 PM, Daniel Helgenberger wrote:
>> On 01.10.2014 13:33, Jiri Moskovcak wrote:
>>> On 10/01/2014 01:17 PM, Daniel Helgenberger wrote:
>>>> Hello Jirka,
>>>> On 01.10.2014 09:10, Jiri Moskovcak wrote:
>>>>> Hi Daniel,
>>>>> from the logs it seems like you ran into [1]. It should be fixed in
>>>>> ovirt-hosted-engine-ha-1.1.5 (part of oVirt 3.4.2).
>>>> I am running 3.4.4 - and from hosted-engine --vm-status both hosts had a
>>>> score of 2400...
>>> - doesn't seem like it from the logs, I can see the transition from
>>> EngineStart to EngineUp and directly to EngineUpBadHealth, if you have
>>> the latest version it should go to the EngineStarting before it's
>>> EngineUp, are you sure you've restarted the services (broker and agent)
>>> after update? Please provide output of rpm -q ovirt-hosted-engine-ha.
>> here you go:
>> rpm -q ovirt-hosted-engine-ha
>> ovirt-hosted-engine-ha-1.1.5-1.el6.noarch
>>
>>
>> also, I upgraded to 3.4.3 prior to 3.4.4. I cannot recall whether I
>> restarted ovirt-ha-agent; but it is highly likely. Here are the system
>> reboots after kernel updates:
>> reboot system boot 2.6.32-431.29.2. Tue Sep 30 21:46 - 14:36 (16:50)
>> reboot system boot 2.6.32-431.29.2. Mon Sep 29 12:19 - 21:44 (1+09:24)
>> reboot system boot 2.6.32-431.29.2. Fri Sep 12 08:47 - 12:17 (17+03:30)
>> reboot system boot 2.6.32-431.20.3. Mon Sep 1 17:48 - 08:44 (10+14:56)
> ok, so please just to be 100% sure, check the version on both hosts (it
> should be >= 1.1.5) and restart broker and agent and then try to
> reproduce the problem. I went thru the code in 1.1.5 and I don't see any
> code path which could take the agent from EngineStart to EngineUp
> without going thru the EngineStarting state - this was the behavior
> prior to 1.1.5.

Hi Jirka,

sadly I cannot reproduce this atm because yesterday I upgraded to
ovirt-hosted-engine-ha-1.1.6-1.el6.noarch (but at least, I did restart
everything).

This resulted in HA being inoperable; one of my HA hosts quits with:

Exception: Failed to start monitoring domain (sd_uuid=bcfa7ec4-5278-44d8-9f31-682f2d9de91d, host_id=1): timeout during domain acquisition

I might have a lot of issues because of changes I made for resolving
BZ1147148 (which should be reverted by now). I will try to downgrade and,
if I get HA working again, try to reproduce this.

Cheers
>
> Regards,
> Jirka
>
>>> Thanks,
>>> Jirka
>>>
>>>>> --Jirka
>>>>>
>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1093366
>>>>>
>>>>> On 09/27/2014 12:40 PM, Daniel Helgenberger wrote:
>>>>>> Hello,
>>>>>>
>>>>>> before filing a BZ against 3.4 branch I wanted to get some input on the
>>>>>> following issue:
>>>>>>
>>>>>> Steps, root shell on one of the engine-ha hosts, using hosted-engine cmd:
>>>>>> 1. set global maintenance
>>>>>> 2. shutdown hosted-engine vm
>>>>>> (do some work)
>>>>>> 3. disable global maintenance
>>>>>>
>>>>>> Result: My engine was started and immediately powered down again, in a loop.
>>>>>> I could only manually break this with:
>>>>>> 1. enable global mt. again
>>>>>> 2. start engine
>>>>>> 3. disable global mt.
>>>>>>
>>>>>> I attached the hosts' engine-ha broker logs as well as agent logs, from
>>>>>> today 12:00 to 12:27, right after I 'fixed' this.
>>>>>> Note, the engine was started on nodehv02 automatically after I disabled
>>>>>> global mt. @ about 12:05
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users@ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>
>

-- 
Daniel Helgenberger
m box bewegtbild GmbH

P: +49/30/2408781-22
F: +49/30/2408781-10

ACKERSTR. 19
D-10115 BERLIN

www.m-box.de www.monkeymen.tv

Geschäftsführer: Martin Retschitzegger / Michaela Göllner
Handelsregister: Amtsgericht Charlottenburg / HRB 112767
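
Note on the state transitions discussed above: the symptom Jirka describes (EngineStart going straight to EngineUp, with no EngineStarting in between) can be checked mechanically once the state names have been pulled out of the agent log. The following is only a minimal sketch. The function name `skipped_engine_starting` is hypothetical and not part of ovirt-hosted-engine-ha, and it assumes the states are already available as a plain list of names; it does not parse the agent log itself.

```python
# Hypothetical helper, NOT part of ovirt-hosted-engine-ha. It only inspects
# a list of state names that has already been extracted from the agent log.

def skipped_engine_starting(states):
    """Return True if EngineStart is followed directly by EngineUp."""
    # Collapse consecutive repeats, since a state may be reported many times.
    collapsed = [s for i, s in enumerate(states) if i == 0 or s != states[i - 1]]
    # Look at each adjacent pair of distinct states.
    return any(
        prev == "EngineStart" and cur == "EngineUp"
        for prev, cur in zip(collapsed, collapsed[1:])
    )

# Sequence described in the thread as the behavior prior to 1.1.5.
old_sequence = ["EngineStart", "EngineUp", "EngineUpBadHealth"]
# Sequence expected with 1.1.5 or later, per the thread.
new_sequence = ["EngineStart", "EngineStarting", "EngineUp"]

print(skipped_engine_starting(old_sequence))  # True
print(skipped_engine_starting(new_sequence))  # False
```

A True result for a given host would suggest the agent on that host is still running pre-1.1.5 code, for example because the services were not restarted after the update.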