[ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly
Simone Tiraboschi
stirabos at redhat.com
Thu Sep 29 13:01:34 UTC 2016
On Thu, Sep 29, 2016 at 2:51 PM, Gervais de Montbrun <gervais at demontbrun.com> wrote:
> Hi Martin,
>
> The entropy was super low. Somewhere around 140. I installed and
> configured haveged.service to start at bootup, reverted my apache
> changes... After a reboot, my systemctl status still says that there are 7
> services queued (note that I erroneously said degraded in my previous email
> - the services are, in fact, queued), but the oVirt GUI comes up almost
> immediately and everything seems to be great.
>
>
Take care: haveged on a VM should not be considered a good source of
entropy, and the oVirt PKI is managed by the engine.
http://security.stackexchange.com/questions/34523/is-it-appropriate-to-use-haveged-as-a-source-of-entropy-on-virtual-machines

A better approach is the virtio-rng paravirtualised RNG driver, as per patch
https://gerrit.ovirt.org/#/c/62334/
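
To check from inside the engine VM whether entropy is low and whether a hardware RNG is actually selected, something like the following could be used. It is only a rough sketch: the threshold of 200 comes from the "below or around 200" guidance quoted later in this thread, the two paths are the usual Linux ones, and the helper names are made up for illustration. With virtio-rng attached, the reported device name would typically start with virtio_rng.

```python
from pathlib import Path

# Usual Linux locations; the second exists only when the kernel's
# hw_random framework is available.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")
HWRNG_CURRENT = Path("/sys/class/misc/hw_random/rng_current")

# Assumption: "below or around 200" from the thread, taken as <= 200.
LOW_ENTROPY_THRESHOLD = 200


def read_text_or_none(path):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def entropy_is_low(value, threshold=LOW_ENTROPY_THRESHOLD):
    """True when the available entropy is at or below the threshold."""
    return value <= threshold


def main():
    raw = read_text_or_none(ENTROPY_AVAIL)
    if raw is None:
        print("entropy_avail not readable (not Linux?)")
    else:
        value = int(raw)
        state = "LOW" if entropy_is_low(value) else "ok"
        print(f"entropy_avail = {value} ({state})")

    rng = read_text_or_none(HWRNG_CURRENT)
    # The kernel reports "none" when no hardware RNG is selected.
    if rng in (None, "", "none"):
        print("no hardware RNG selected; virtio-rng is probably not attached")
    else:
        print(f"current hardware RNG: {rng}")


if __name__ == "__main__":
    main()
```

This only reports state; it does not replace attaching the virtio-rng device to the VM.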
> Thank you for the tip. You solved my issue.
>
> Cheers,
> Gervais
>
>
>
> On Sep 29, 2016, at 7:47 AM, Martin Perina <mperina at redhat.com> wrote:
>
> Hi,
>
> please take a look at my inline comments:
>
> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <
> gervais at demontbrun.com> wrote:
>
>> Hey All,
>>
>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
>> engine. After some poking around, I think I have figured out my issue and
>> thought I would share to see what others think.
>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in
>> 4.0.4.
>>
>> Description:
>> When my hosted engine starts it reports that it is in a degraded state
>> with 7 or 8 services still not started when I run systemctl status. It
>> takes about 6 or 7 minutes to eventually start all the services and come
>> online. If I don't set my cluster to Global-Maintenance mode it eventually
>> thinks that my hosted-engine needs to be rebooted and restarts it before it
>> can start everything.
>>
>
> Could you please share with us logs gathered by ovirt-log-collector?
>
> It's just a guess, but could you please check whether your HE VM has enough
> entropy?
>
> cat /proc/sys/kernel/random/entropy_avail
>
> If the value is low (below or around 200), you really need to install and
> configure an entropy generator such as haveged.
>
>
>> Solution:
>> I realized that Apache was the culprit and found that the proxy to the
>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super
>> long timeout with many retries. I changed the settings and now everything
>> works for me.
>>
>> -> Before change:
>>
>> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
>>     ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>
>>     <IfModule deflate_module>
>>         AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
>>     </IfModule>
>> </LocationMatch>
>>
>>
>> -> After change:
>>
>> <LocationMatch ^/ovirt-engine($|/)>
>>     ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>>
>>     <IfModule deflate_module>
>>         AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
>>     </IfModule>
>> </LocationMatch>
>>
>>
> This one is correct for 4.0; not sure why it was not updated during the
> upgrade from 3.6. @Simone?
>
>
>
>>
>> If I read the timeout settings correctly, it will wait 60 minutes with 5
>> retries. 5 hours is way too long for my little server to hold onto all
>> those apache processes.
>>
>> The change I made allows for there to be an error, and also releases
>> apache's hold on the process. Once everything is ready, apache is ready to
>> serve requests and everything/everyone is happy. Before making the change,
>> I just got a white screen in my browser and then nothing worked until I
>> restarted Apache (or I ended up in an endless loop of ovirt-ha services
>> restarting my hosted-engine).
>>
>
> Well, if you have an issue with too many apache processes waiting for
> engine to respond, then there's some issue in engine. As I wrote above
> please share the logs with us and check entropy.
>
> Thanks
>
> Martin Perina
>
>
>
>>
>> I noticed that this setting reverts to the original setting, so oVirt
>> must be writing this file. Perhaps these numbers can be changed in oVirt? If
>> not, I will just set up an Ansible play to revert the settings to working
>> values and restart apache on my engine.
>> :-)
>>
>> Cheers,
>> Gervais
>>
>>
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
>