[hosted-engine] engine failed to start after rebooted

Hi,

I was upgrading oVirt from 3.6.4.1 to 3.6.5. The engine-vm was running on host02. These are the steps I took:

1. Set hosted engine maintenance mode to global.
2. Accessed the engine-vm and upgraded oVirt to the latest version.
3. Ran 'reboot' in the engine-vm.
4. After about 10 minutes the engine-vm still hadn't booted, so I set hosted engine maintenance mode back to none.
5. After another 10 minutes the engine-vm still hadn't booted, so I restarted host02, host01 and then host03 before the engine-vm became accessible again. I then had to activate host01 and host03 again.

Here are the log files from the ovirt-hosted-engine-ha folder:
- host01: https://gist.github.com/weeix/d73aa8506b296c27110747464ea33312
- host02: https://gist.github.com/weeix/c1b7033f07fb104fdd483cf7ea3a7852

How do we correctly restart the engine-vm when we need to?

--
Wee

On Fri, Apr 22, 2016 at 9:44 AM, Wee Sritippho <wee.s@forest.go.th> wrote:
Hi,
I was upgrading oVirt from 3.6.4.1 to 3.6.5. The engine-vm was running on host02. These are the steps I took:
1. Set hosted engine maintenance mode to global.
2. Accessed the engine-vm and upgraded oVirt to the latest version.
3. Ran 'reboot' in the engine-vm.
4. After about 10 minutes the engine-vm still hadn't booted, so I set hosted engine maintenance mode back to none.
This is absolutely normal: in global maintenance mode the agent will not bring up the VM.
5. After another 10 minutes the engine-vm still hadn't booted, so I restarted host02, host01 and then host03 before the engine-vm became accessible again. I then had to activate host01 and host03 again.
This, on the other hand, is pretty strange: after exiting maintenance mode, a host should bring up the engine VM.
Here are the log files from the ovirt-hosted-engine-ha folder:
- host01: https://gist.github.com/weeix/d73aa8506b296c27110747464ea33312
- host02: https://gist.github.com/weeix/c1b7033f07fb104fdd483cf7ea3a7852
How do we correctly restart the engine-vm when we need to?
-- Wee
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
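
For reference, the maintenance-mode handling discussed in this message can be sketched as a short host-side command plan. This is only an illustration under stated assumptions: the `hosted-engine` options used (`--set-maintenance --mode=...`, `--vm-shutdown`, `--vm-status`, `--vm-start`) are part of the hosted-engine CLI, but the ordering is an assumed procedure that nobody in this thread has confirmed, and `engine_vm_restart_plan` is a hypothetical helper name. The script only prints the commands; it does not run them.

```python
from typing import List


def engine_vm_restart_plan(manual_start: bool = True) -> List[str]:
    """Return an ASSUMED sequence of host-side commands for restarting the engine VM.

    The individual hosted-engine options are real; the ordering is an
    illustration only and is not confirmed anywhere in this thread.
    """
    plan = [
        # In global maintenance the HA agents stop managing the engine VM.
        "hosted-engine --set-maintenance --mode=global",
        # Ask for a clean shutdown of the engine VM from a host.
        "hosted-engine --vm-shutdown",
        # Repeat until the engine VM is reported as down.
        "hosted-engine --vm-status",
    ]
    if manual_start:
        # While global maintenance is active no agent will start the VM,
        # so it has to be started by hand.
        plan.append("hosted-engine --vm-start")
    # Hand control back to the HA agents.
    plan.append("hosted-engine --set-maintenance --mode=none")
    return plan


if __name__ == "__main__":
    for step, command in enumerate(engine_vm_restart_plan(), start=1):
        print(f"{step}. {command}")
```

With `manual_start=False` the plan leaves the start to the agents once maintenance mode is back to none, which relies on at least one host being out of local maintenance and having a working agent and broker.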

On Fri, Apr 22, 2016 at 9:46 AM, Simone Tiraboschi <stirabos@redhat.com> wrote:
OK, it didn't start on host02 since that host was in local maintenance mode:

MainThread::INFO::2016-04-23 01:08:12,597::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state LocalMaintenance (score: 0)

The issue on host01 is here:

MainThread::INFO::2016-04-23 01:22:14,608::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1461349334.61 type=state_transition detail=GlobalMaintenance-ReinitializeFSM hostname='host01.ovirt.forest.go.th'
MainThread::ERROR::2016-04-23 01:22:44,638::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Connection closed: Connection timed out

The agent failed to talk to the broker service (can you please also attach the broker logs from host01?). Rebooting the host also restarted the broker, and so the engine VM came up. Now the issue is why the broker went down and didn't restart.
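
The agent log excerpts in this message follow a `Thread::LEVEL::timestamp::module::line::logger::(function) message` layout. A minimal Python sketch for pulling the reported state and any ERROR entries out of such lines might look like the following. The field layout is inferred only from the three lines quoted here, and `AgentLogEntry`, `parse_line` and `current_state` are hypothetical names, not part of ovirt-hosted-engine-ha.

```python
import re
from typing import NamedTuple, Optional, Tuple


class AgentLogEntry(NamedTuple):
    thread: str
    level: str
    timestamp: str
    module: str
    lineno: int
    logger: str
    function: str
    message: str


# Matches the "(function) message" tail of a log line.
_TAIL_RE = re.compile(r"\((\w+)\)\s*(.*)")
# Matches e.g. "Current state LocalMaintenance (score: 0)".
_STATE_RE = re.compile(r"Current state (\w+) \(score: (\d+)\)")


def parse_line(line: str) -> Optional[AgentLogEntry]:
    """Split one agent log line into fields; return None if it does not fit."""
    parts = line.strip().split("::", 6)
    if len(parts) != 7 or not parts[4].isdigit():
        return None
    tail = _TAIL_RE.match(parts[6])
    if tail is None:
        return None
    return AgentLogEntry(parts[0], parts[1], parts[2], parts[3],
                         int(parts[4]), parts[5], tail.group(1), tail.group(2))


def current_state(entry: AgentLogEntry) -> Optional[Tuple[str, int]]:
    """Return (state, score) if the entry reports the agent's current state."""
    match = _STATE_RE.search(entry.message)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# The three lines quoted in this message.
SAMPLE = [
    "MainThread::INFO::2016-04-23 01:08:12,597::hosted_engine::462::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(start_monitoring) Current state LocalMaintenance (score: 0)",
    "MainThread::INFO::2016-04-23 01:22:14,608::brokerlink::111::"
    "ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify "
    "time=1461349334.61 type=state_transition "
    "detail=GlobalMaintenance-ReinitializeFSM "
    "hostname='host01.ovirt.forest.go.th'",
    "MainThread::ERROR::2016-04-23 01:22:44,638::brokerlink::279::"
    "ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) "
    "Connection closed: Connection timed out",
]

if __name__ == "__main__":
    for raw in SAMPLE:
        entry = parse_line(raw)
        if entry is None:
            continue
        state = current_state(entry)
        if state is not None:
            print(f"{entry.timestamp} state={state[0]} score={state[1]}")
        if entry.level == "ERROR":
            print(f"{entry.timestamp} ERROR in {entry.function}: {entry.message}")
```

Run against a full agent.log, the same two checks would surface hosts stuck in LocalMaintenance with score 0 and agent-to-broker timeouts like the one on host01.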

Here is host01's broker.log: https://gist.github.com/weeix/d73aa8506b296c27110747464ea33312/raw/e73938f4dce3591006b07e6ea61760831f4a2f18/broker.log

On 22 April 2016 15:04:40 GMT+07:00, Simone Tiraboschi <stirabos@redhat.com> wrote:
OK, it didn't start on host02 since that host was in local maintenance mode:
MainThread::INFO::2016-04-23 01:08:12,597::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state LocalMaintenance (score: 0)
The issue on host01 is here:
MainThread::INFO::2016-04-23 01:22:14,608::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1461349334.61 type=state_transition detail=GlobalMaintenance-ReinitializeFSM hostname='host01.ovirt.forest.go.th'
MainThread::ERROR::2016-04-23 01:22:44,638::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Connection closed: Connection timed out
The agent failed to talk to the broker service (can you please also attach the broker logs from host01?). Rebooting the host also restarted the broker, and so the engine VM came up. Now the issue is why the broker went down and didn't restart.
--
Wee Sritippho
Computer Technical Officer, Practitioner Level
Information Center, Royal Forest Department