On Thu, Sep 29, 2016 at 12:47 PM, Martin Perina <mperina(a)redhat.com> wrote:
Hi,
please take a look at my inline comments:
On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <gervais(a)demontbrun.com> wrote:
> Hey All,
>
> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
> engine. After some poking around, I think I have figured out my issue and
> thought I would share to see what others think.
> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in
> 4.0.4.
>
> Description:
> When my hosted engine starts it reports that it is in a degraded state
> with 7 or 8 services still not started when I run systemctl status. It
> takes about 6 or 7 minutes to eventually start all the services and come
> online. If I don't set my cluster to Global-Maintenance mode it eventually
> thinks that my hosted-engine needs to be rebooted and restarts it before it
> can start everything.
>
Could you please share with us the logs gathered by ovirt-log-collector?
It's just a guess, but could you please check whether your HE VM has enough
entropy?
cat /proc/sys/kernel/random/entropy_avail
If the value is low (below or around 200), you really need to install and
configure an entropy generator such as haveged.
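
The check above can be wrapped in a small script if you want to run it on several hosts. This is only a sketch: the threshold of 200 is the rough figure mentioned above, and `entropy_is_low` is a hypothetical helper name, not part of any oVirt tooling.

```shell
#!/bin/sh
# Rough threshold taken from the advice above; adjust to taste.
ENTROPY_THRESHOLD=200

# entropy_is_low VALUE (hypothetical helper):
# succeeds (exit status 0) when VALUE is at or below the threshold.
entropy_is_low() {
    [ "$1" -le "$ENTROPY_THRESHOLD" ]
}

ENTROPY_FILE=/proc/sys/kernel/random/entropy_avail
if [ -r "$ENTROPY_FILE" ]; then
    avail=$(cat "$ENTROPY_FILE")
    if entropy_is_low "$avail"; then
        echo "entropy_avail=$avail: low, consider an entropy generator such as haveged"
    else
        echo "entropy_avail=$avail: looks sufficient"
    fi
fi
```

Running it a few times while the engine services are starting should show whether the value stays low during boot.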
> Solution:
> I realized that Apache was the culprit and found that the proxy to the
> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super
> long timeout with many retries. I changed the settings and now everything
> works for me.
>
> -> Before change:
>
> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
>     ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>
>     <IfModule deflate_module>
>         AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
>     </IfModule>
> </LocationMatch>
>
>
> -> After change:
>
> <LocationMatch ^/ovirt-engine($|/)>
>     ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>
>     <IfModule deflate_module>
>         AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
>     </IfModule>
> </LocationMatch>
>
>
This one is correct for 4.0, not sure why it was not updated during the
upgrade from 3.6. @Simone?
Honestly it's
<LocationMatch ^/ovirt-engine($|/)>
    ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
    <IfModule deflate_module>
        AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
    </IfModule>
</LocationMatch>
also on a fresh 4.0 engine from our latest engine-appliance.
> If I read the timeout settings correctly, it will wait 60 minutes with 5
> retries. 5 hours is way too long for my little server to hold onto all
> those apache processes.
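
A side note on reading those values: as far as I know from the mod_proxy documentation, both parameters are in seconds, and `retry` is the interval Apache waits before it sends requests again to a backend worker that is in the error state, not a number of attempts. So timeout=3600 retry=5 would mean a 60 minute timeout with a 5 second retry interval rather than 5 hours in total. The sketch below just pulls the two values out of a ProxyPassMatch line to make that reading explicit; `proxy_param` is a hypothetical helper name, not something shipped with oVirt or Apache.

```shell
#!/bin/sh
# proxy_param LINE NAME (hypothetical helper):
# print the numeric value of NAME=... found on a ProxyPassMatch line.
proxy_param() {
    printf '%s\n' "$1" | sed -n "s/.*[[:space:]]$2=\([0-9][0-9]*\).*/\1/p"
}

line='ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5'
timeout=$(proxy_param "$line" timeout)
retry=$(proxy_param "$line" retry)

echo "timeout: ${timeout}s ($((timeout / 60)) minutes waiting on the backend)"
echo "retry:   ${retry}s before a worker in error state is tried again"
```

Either way, a 60 minute timeout is long enough to keep Apache processes tied up while the engine is still starting.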
>
> The change I made allows for there to be an error, and also releases
> apache's hold on the process. Once everything is ready, apache is ready to
> serve requests and everything/everyone is happy. Before making the change,
> I would just get a white screen in my browser and then nothing worked until
> I restarted Apache (or I ended up in an endless loop of ovirt-ha services
> restarting my hosted-engine).
>
Well, if you have an issue with too many apache processes waiting for the
engine to respond, then there's some issue in the engine. As I wrote above,
please share the logs with us and check the entropy.
Thanks
Martin Perina
>
> I noticed that this setting reverts to the original value, so oVirt
> must be writing this file. Perhaps these numbers can be changed in oVirt? If
> not, I will just set up an Ansible play to restore the working values
> and restart Apache on my engine.
> :-)
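
Until it is clear where these values come from, the workaround described above could also be scripted in plain shell. This is only a sketch of that idea: `set_proxy_timeouts` is a hypothetical helper, the values 5 and 2 are the ones from the "after change" snippet, and the demonstration runs on a scratch copy rather than the live file.

```shell
#!/bin/sh
# set_proxy_timeouts FILE TIMEOUT RETRY (hypothetical helper):
# rewrite timeout= and retry= on ProxyPassMatch lines, keeping FILE.bak.
set_proxy_timeouts() {
    sed -i.bak \
        -e "/ProxyPassMatch/ s/timeout=[0-9][0-9]*/timeout=$2/" \
        -e "/ProxyPassMatch/ s/retry=[0-9][0-9]*/retry=$3/" \
        "$1"
}

# Demonstrate on a scratch copy so nothing under /etc is touched.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
<LocationMatch ^/ovirt-engine($|/)>
    ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
</LocationMatch>
EOF

set_proxy_timeouts "$scratch" 5 2
grep ProxyPassMatch "$scratch"
rm -f "$scratch" "$scratch.bak"
```

On the engine itself this would be run as root against /etc/httpd/conf.d/z-ovirt-engine-proxy.conf, followed by `systemctl restart httpd`. Since the file gets rewritten, the change would have to be reapplied whenever it reverts.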
>
> Cheers,
> Gervais
>