[ovirt-users] Slow booting host - restart loop

Eli Mesika emesika at redhat.com
Tue Sep 5 15:14:11 UTC 2017


Hi Bernardo

I would like to suggest a workaround for this problem. Could you please try
the following:

We have a configuration value named FenceQuietTimeBetweenOperationsInSec.
It controls the minimum time to wait between fence operations (stop,
start). It currently defaults to 180 seconds. The key is not exposed to
engine-config, so I would suggest the following:

1) Change this key's value to 900 by running the following from the psql prompt:

update vdc_options set option_value = '900' where option_name =
'FenceQuietTimeBetweenOperationsInSec';

2) Restart the engine

3) Repeat the scenario

Now the engine will wait 15 minutes between fencing operations, so your
host can come up without being fenced again.
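
As a rough sanity check on these numbers, here is a minimal Python sketch.
It is a simplified model and not the engine's actual fencing logic: it
assumes the host is only safe from a repeated fence if the quiet time is at
least as long as its boot time. The function name is made up for
illustration, and the 8-minute boot time is the figure Bernardo reported.

```python
def quiet_time_covers_boot(boot_seconds: int, quiet_seconds: int) -> bool:
    """Return True if the quiet period lasts at least as long as the boot.

    Simplified assumption: if the quiet period ends before the host has
    finished booting, the engine may issue another fence operation.
    """
    return quiet_seconds >= boot_seconds


# About 8 minutes, as reported in this thread (an approximate figure).
BOOT_SECONDS = 8 * 60

# 180 is the current default, 900 is the value suggested in step 1.
for quiet in (180, 900):
    covered = quiet_time_covers_boot(BOOT_SECONDS, quiet)
    print(f"quiet time {quiet}s, boot {BOOT_SECONDS}s, covered: {covered}")
```

Under this assumption the default of 180 seconds is shorter than the boot
time, while 900 seconds leaves some margin on top of the 8 minutes.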

Please let me know whether this workaround works for you.

Thanks

Eli

On Tue, Sep 5, 2017 at 4:20 PM, Bernardo Juanicó <bjuanico at gmail.com> wrote:

> Martin, thanks for your reply. I was aware of bug [1] and the
> implemented solution, but changing ServerRebootTimeout to 1200 didn't
> change a thing.
> Now I know about [2], and I'll test the fix once it gets released.
>
> Regards,
>
> Bernardo
>
> PGP Key <http://pgp.mit.edu/pks/lookup?op=get&search=0x695E5BCE34263F5B>
> Skype: mattraken
>
> 2017-09-05 8:23 GMT-03:00 Martin Perina <mperina at redhat.com>:
>
>> Hi Bernardo,
>>
>> In oVirt 4.1.2 we added a timeout to wait until the host has booted [1].
>> This timeout is 5 minutes by default, but it can be extended using the
>> following command:
>>
>>    engine-config -s ServerRebootTimeout=NNN
>>
>> where NNN is the number of seconds you want to wait until the host has booted.
>>
>> But be aware that you may be affected by [2], which we are currently
>> trying to fix.
>>
>> Regards
>>
>> Martin Perina
>>
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1423657
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1477700
>>
>>
>> On Fri, Sep 1, 2017 at 7:54 PM, Bernardo Juanicó <bjuanico at gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> I installed 2 hosts on a new cluster and the servers take a really long
>>> time to boot up (about 8 minutes).
>>>
>>> When a host crashes or is powered off, the ovirt-manager starts it via
>>> power management. Since the servers take so long to boot up, the
>>> ovirt-manager thinks the start failed and reboots the host several
>>> times before giving up. The server is finally up about 20 minutes
>>> after the failure.
>>>
>>> I changed some engine variables with engine-config trying to set a
>>> higher timeout, but the problem persists.
>>>
>>> Any ideas?
>>>
>>>
>>> Regards,
>>> Bernardo
>>>
>>>
>>> PGP Key <http://pgp.mit.edu/pks/lookup?op=get&search=0x695E5BCE34263F5B>
>>> Skype: mattraken
>>
>