
Hi Simone,

Yes... I guess it was not clear in my original email. I changed the numbers myself to lower the timeout and retries. With them set as they were set by oVirt (timeout=3600 retry=5), things were not working for me.

Cheers,
Gervais

On Sep 29, 2016, at 10:04 AM, Simone Tiraboschi <stirabos@redhat.com> wrote:

> On Thu, Sep 29, 2016 at 12:47 PM, Martin Perina <mperina@redhat.com> wrote:
>
> > Hi,
> >
> > please take a look at my inline comments:
> >
> > On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <gervais@demontbrun.com> wrote:
> >
> > > Hey All,
> > >
> > > Since updating to 4.0.x of oVirt, I have had an issue with my hosted engine. After some poking around, I think I have figured out my issue and thought I would share to see what others think.
> > > The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in 4.0.4.
> > >
> > > Description:
> > > When my hosted engine starts, it reports that it is in a degraded state with 7 or 8 services still not started when I run systemctl status. It takes about 6 or 7 minutes to eventually start all the services and come online. If I don't set my cluster to Global-Maintenance mode, it eventually thinks that my hosted-engine needs to be rebooted and restarts it before it can start everything.
> >
> > Could you please share with us logs gathered by ovirt-log-collector?
> >
> > It's just a guess, but could you please check whether your HE VM has enough entropy?
> >
> >     cat /proc/sys/kernel/random/entropy_avail
> >
> > If the value is low (below or around 200), you really need to install and configure an entropy generator such as haveged.
> >
> > > Solution:
> > > I realized that Apache was the culprit and found that the proxy to the ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super long timeout with many retries. I changed the settings and now everything works for me.
> > >
> > > -> Before change:
> > >     <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
> > >         ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
> > >
> > >         <IfModule deflate_module>
> > >             AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
> > >         </IfModule>
> > >     </LocationMatch>
> > >
> > > -> After change:
> > >     <LocationMatch ^/ovirt-engine($|/)>
> > >         ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
> > >
> > >         <IfModule deflate_module>
> > >             AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
> > >         </IfModule>
> > >     </LocationMatch>
> >
> > This one is correct for 4.0, not sure why it was not updated during upgrade from 3.6. @Simone?
>
> Honestly it's
>
>     <LocationMatch ^/ovirt-engine($|/)>
>         ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>
>         <IfModule deflate_module>
>             AddOutputFilterByType DEFLATE text/javascript text/css text/html text/xml text/json application/xml application/json application/x-yaml
>         </IfModule>
>     </LocationMatch>
>
> also on a fresh 4.0 engine from our latest engine-appliance.
>
> > > If I read the timeout settings correctly, it will wait 60 minutes with 5 retries. 5 hours is way too long for my little server to hold onto all those apache processes.
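
A note on the two parameters, since the 5-hour figure depends on how they are read: as far as the Apache mod_proxy documentation describes them, timeout is the number of seconds httpd waits for data from the backend, and retry is the number of seconds a worker in the error state is left alone before httpd tries it again, not a count of attempts. If that reading is right, the two values do not multiply. Here is a minimal Python sketch that pulls the numbers out of the two ProxyPassMatch lines from this thread and prints them under that interpretation. The function name parse_proxy_params is made up for illustration.

```python
import re


def parse_proxy_params(directive: str) -> dict:
    """Return the integer key=value parameters found on a ProxyPass-style line.

    Hypothetical helper for illustration; it is not part of Apache or oVirt.
    """
    return {key: int(value) for key, value in re.findall(r"\b(\w+)=(\d+)\b", directive)}


# The two directive lines quoted in this thread.
before = "ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5"
after = "ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2"

for label, line in (("before", before), ("after", after)):
    params = parse_proxy_params(line)
    # Assumption: both values are durations in seconds, per the mod_proxy docs.
    print(
        f"{label}: wait up to {params['timeout']} s for the backend; "
        f"leave a failed worker alone for {params['retry']} s before trying it again"
    )
```

Either way, the practical point in the thread stands: with timeout=3600 a request can sit on an unresponsive engine for up to an hour, and with timeout=5 it fails fast and frees the Apache process.
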
> > > The change I made allows for there to be an error, and also releases apache's hold on the process. Once everything is ready, apache is ready to serve requests and everything/everyone is happy. Before making the change, I just get a white screen in my browser and then nothing works until I restart Apache (or I end up in an endless loop of ovirt-ha services restarting my hosted-engine).
> >
> > Well, if you have an issue with too many apache processes waiting for engine to respond, then there's some issue in engine. As I wrote above, please share the logs with us and check entropy.
> >
> > Thanks
> >
> > Martin Perina
> >
> > > I noticed that this setting reverts to the original setting, so oVirt must be writing this file. Perhaps these numbers can be changed in oVirt? If not, I will just set up an Ansible play to revert the settings with working values and restart apache on my engine. :-)
> > >
> > > Cheers,
> > > Gervais
> > >
> > > _______________________________________________
> > > Users mailing list
> > > Users@ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/users
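
For anyone without Ansible handy, the "put the values back" step could be sketched in plain Python as below. This is only a sketch under assumptions: the path and the values 5 and 2 are the ones quoted in this thread, the function names are made up, and it does not address why oVirt rewrites the file. It only touches timeout= and retry= on ProxyPassMatch lines and leaves everything else alone.

```python
import re
from pathlib import Path

# Path taken from the thread; adjust if your engine keeps the file elsewhere.
PROXY_CONF = Path("/etc/httpd/conf.d/z-ovirt-engine-proxy.conf")


def set_proxy_params(text: str, timeout: int = 5, retry: int = 2) -> str:
    """Rewrite timeout= and retry= on ProxyPassMatch lines; other lines are untouched.

    Hypothetical helper; the defaults are the "after change" values from the thread.
    """
    out = []
    for line in text.splitlines(keepends=True):
        if line.lstrip().startswith("ProxyPassMatch"):
            line = re.sub(r"\btimeout=\d+", f"timeout={timeout}", line)
            line = re.sub(r"\bretry=\d+", f"retry={retry}", line)
        out.append(line)
    return "".join(out)


def reapply_proxy_settings(path: Path = PROXY_CONF) -> bool:
    """Update the file in place; return True if it changed and Apache needs a restart."""
    original = path.read_text()
    updated = set_proxy_params(original)
    if updated != original:
        path.write_text(updated)
        return True
    return False
```

Run reapply_proxy_settings() as root on the engine VM. When it returns True, restart Apache (on an EL-based engine that would typically be systemctl restart httpd). Running it again after that is a no-op, so it is safe to schedule.
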