On 02.10.2014 09:51, Jiri Moskovcak wrote:
On 10/01/2014 02:39 PM, Daniel Helgenberger wrote:
> On 01.10.2014 13:33, Jiri Moskovcak wrote:
>> On 10/01/2014 01:17 PM, Daniel Helgenberger wrote:
>>> Hello Jirka,
>>> On 01.10.2014 09:10, Jiri Moskovcak wrote:
>>>> Hi Daniel,
>>>> from the logs it seems like you ran into [1]. It should be fixed in
>>>> ovirt-hosted-engine-ha-1.1.5 (part of oVirt 3.4.2).
>>> I am running 3.4.4 - and from hosted-engine --vm-status both hosts had a
>>> score of 2400...
>> - It doesn't seem like it from the logs: I can see the transition from
>> EngineStart to EngineUp and directly to EngineUpBadHealth. With the
>> latest version it should go through EngineStarting before reaching
>> EngineUp. Are you sure you restarted the services (broker and agent)
>> after the update? Please provide the output of rpm -q ovirt-hosted-engine-ha.
> here you go:
> rpm -q ovirt-hosted-engine-ha
> ovirt-hosted-engine-ha-1.1.5-1.el6.noarch
>
>
> also, I upgraded to 3.4.3 prior to 3.4.4. I cannot recall whether I
> restarted ovirt-ha-agent, but it is highly likely. Here are the system
> reboots after kernel updates:
> reboot system boot 2.6.32-431.29.2. Tue Sep 30 21:46 - 14:36 (16:50)
> reboot system boot 2.6.32-431.29.2. Mon Sep 29 12:19 - 21:44 (1+09:24)
> reboot system boot 2.6.32-431.29.2. Fri Sep 12 08:47 - 12:17 (17+03:30)
> reboot system boot 2.6.32-431.20.3. Mon Sep 1 17:48 - 08:44 (10+14:56)
ok, so just to be 100% sure, please check the version on both hosts (it
should be >= 1.1.5), restart the broker and agent, and then try to
reproduce the problem. I went through the code in 1.1.5 and I don't see
any code path which could take the agent from EngineStart to EngineUp
without going through the EngineStarting state - that was the behavior
prior to 1.1.5.
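
The two checks requested above (package version at least 1.1.5, and no direct jump from EngineStart to EngineUp) can be sketched in Python. This is only a minimal illustration, not part of ovirt-hosted-engine-ha: the helper names `ha_version` and `skipped_starting` are hypothetical, and the state list is assumed to have been extracted from the agent log by hand, in order.

```python
import re

# First release said to contain the fix, per the message above.
MIN_VERSION = (1, 1, 5)


def ha_version(rpm_line):
    """Parse the version out of one line of `rpm -q ovirt-hosted-engine-ha`.

    Example input: 'ovirt-hosted-engine-ha-1.1.5-1.el6.noarch'
    """
    m = re.match(r"ovirt-hosted-engine-ha-(\d+(?:\.\d+)*)-", rpm_line.strip())
    if m is None:
        raise ValueError("unexpected rpm -q output: %r" % rpm_line)
    return tuple(int(part) for part in m.group(1).split("."))


def skipped_starting(states):
    """Return True if EngineStart is followed directly by EngineUp.

    `states` is an ordered list of agent state names (hypothetical input,
    collected manually from the agent log).
    """
    return any(a == "EngineStart" and b == "EngineUp"
               for a, b in zip(states, states[1:]))


if __name__ == "__main__":
    print(ha_version("ovirt-hosted-engine-ha-1.1.5-1.el6.noarch") >= MIN_VERSION)
    print(skipped_starting(["EngineStart", "EngineUp", "EngineUpBadHealth"]))
```

Comparing versions as integer tuples avoids the string-comparison trap where "1.1.10" would sort before "1.1.5".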
Hi Jirka,
sadly I cannot reproduce this at the moment because yesterday I upgraded
to ovirt-hosted-engine-ha-1.1.6-1.el6.noarch (but at least I did restart
everything). This left HA inoperable; one of my HA hosts quits with:
Exception: Failed to start monitoring domain
(sd_uuid=bcfa7ec4-5278-44d8-9f31-682f2d9de91d, host_id=1): timeout
during domain acquisition
I might have a lot of issues because of changes I made to resolve
BZ1147148 (which should be reverted by now). I will try to downgrade and,
if I get HA working again, try to reproduce this.
Cheers
Regards,
Jirka
>> Thanks,
>> Jirka
>>
>>>> --Jirka
>>>>
>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1093366
>>>>
>>>> On 09/27/2014 12:40 PM, Daniel Helgenberger wrote:
>>>>> Hello,
>>>>>
>>>>> before filing a BZ against 3.4 branch I wanted to get some input on
>>>>> the following issue:
>>>>>
>>>>> Steps, in a root shell on one of the engine-ha hosts, using the
>>>>> hosted-engine cmd:
>>>>> 1. set global maintenance
>>>>> 2. shutdown hosted-engine vm
>>>>> (do some work)
>>>>> 3. disable global maintenance
>>>>>
>>>>> Result: My engine was started and immediately powered down again,
>>>>> in a loop.
>>>>> I could only manually break this with:
>>>>> 1. enable global mt. again
>>>>> 2. start engine
>>>>> 3. disable global mt.
>>>>>
>>>>> I attached the hosts' engine-ha broker logs as well as agent logs,
>>>>> from today 12:00 to 12:27, right after I 'fixed' this.
>>>>> Note, the engine was started on nodehv02 automatically after I
>>>>> disabled global mt. @ about 12:05
>>>>>
>>>>> Thanks
>>>>>
--
Daniel Helgenberger
m box bewegtbild GmbH
P: +49/30/2408781-22
F: +49/30/2408781-10
ACKERSTR. 19
D-10115 BERLIN
www.m-box.de www.monkeymen.tv
Managing Directors: Martin Retschitzegger / Michaela Göllner
Commercial register: Amtsgericht Charlottenburg / HRB 112767