
On 22 Feb 2017, at 13:53, Simone Tiraboschi <stirabos@redhat.com> wrote:

> On Wed, Feb 22, 2017 at 1:33 PM, Simone Tiraboschi <stirabos@redhat.com> wrote:
>> When ovirt-ha-agent checks the status of the engine VM we get:
>>
>> 2017-02-21 22:21:14,738-0500 ERROR (jsonrpc/2) [api] FINISH getStats error=Virtual machine does not exist: {'vmId': u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} (api:69)
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 67, in method
>>     ret = func(*args, **kwargs)
>>   File "/usr/share/vdsm/API.py", line 335, in getStats
>>     vm = self.vm
>>   File "/usr/share/vdsm/API.py", line 130, in vm
>>     raise exception.NoSuchVM(vmId=self._UUID)
>> NoSuchVM: Virtual machine does not exist: {'vmId': u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}
>>
>> While in ovirt-ha-agent logs we have:
>>
>> MainThread::INFO::2017-02-21 22:21:18,583::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state UnknownLocalVmState (score: 3400)
>> ...
>> MainThread::INFO::2017-02-21 22:21:31,199::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Unknown local engine vm status no actions taken
>>
>> Probably it's a bug or a regression somewhere on master.
>
> On ovirt-ha-broker side the detection is based on a strict string match on the error message that is expected to be exactly 'Virtual machine does not exist' to set down status otherwise we set unknown status as in this case:
> https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-ha.git;a=blob;f=ovirt_hosted_engine_ha/broker/submonitors/engine_health.py;h=d633cb860b811e84021221771bf706a9a4ac1d63;hb=refs/heads/master#l54
>
> Adding Francesco here to understand if something has recently changed there on vdsm side.

That’s not a very robust code handling.
Yes, the text changed, the vm id was added.
And yes, it may change again any time I guess
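To illustrate why the strict match stops working once the vm id is appended, here is a minimal sketch. It is not the actual engine_health.py code: the function names and the constant are hypothetical, and only the message text and the "down"/"unknown" outcomes come from this thread. A prefix match is shown as one possible, still text-dependent, mitigation.

```python
# Hypothetical sketch: these names are illustrative and are NOT taken from
# ovirt_hosted_engine_ha/broker/submonitors/engine_health.py.
NO_SUCH_VM_TEXT = "Virtual machine does not exist"


def classify_strict(message):
    """Exact comparison: fails as soon as vdsm appends details to the text."""
    return "down" if message == NO_SUCH_VM_TEXT else "unknown"


def classify_tolerant(message):
    """Prefix comparison: still recognises the message when details follow."""
    return "down" if message.startswith(NO_SUCH_VM_TEXT) else "unknown"


# The message as it appears in the vdsm.log excerpt quoted in this thread.
new_message = ("Virtual machine does not exist: "
               "{'vmId': u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}")

print(classify_strict(new_message))    # unknown
print(classify_tolerant(new_message))  # down
```

A prefix match only tolerates text added after the known phrase. It would break again if the wording itself changed, so matching on a stable identifier (assuming the response exposes one) would likely be sturdier.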

>> On Wed, Feb 22, 2017 at 1:02 PM, Sandro Bonazzola <sbonazzo@redhat.com> wrote:
>>> Adding Lev
>>>
>>> On Wed, Feb 22, 2017 at 12:59 PM, Sahina Bose <sabose@redhat.com> wrote:
>>>> Hi all,
>>>>
>>>> On the HC setup, the HE VM is not restarted.
>>>> The agent.log has
>>>>
>>>> MainThread::INFO::2017-02-21 22:09:58,022::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {}
>>>> MainThread::INFO::2017-02-21 22:09:58,023::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 1): {'engine-health': {'reason': 'failed to getVmStats', 'health': 'unknown', 'vm': 'unknown', 'detail': 'unknown'}, 'bridge': True, 'mem-free': 4079.0, 'maintenance': False, 'cpu-load': 0.0491, 'gateway': True}
>>>> ...
>>>> MainThread::INFO::2017-02-21 22:10:29,219::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Unknown local engine vm status no actions taken
>>>> MainThread::INFO::2017-02-21 22:10:29,219::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1487733029.22 type=state_transition detail=ReinitializeFSM-UnknownLocalVmState hostname='lago-hc-basic-suite-master-host0'
>>>> MainThread::INFO::2017-02-21 22:10:29,317::brokerlink::121::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (ReinitializeFSM-UnknownLocalVmState) sent? ignored
>>>>
>>>> and the vdsm.log
>>>>
>>>> 2017-02-21 22:09:11,962-0500 INFO (libvirt/events) [virt.vm] (vmId='2ccc0ef0-cc31-45b8-8e91-a78fa4cad671') Changed state to Down: User shut down from within the guest (code=7) (vm:1269)
>>>> 2017-02-21 22:09:11,962-0500 INFO (libvirt/events) [virt.vm] (vmId='2ccc0ef0-cc31-45b8-8e91-a78fa4cad671') Stopping connection (guestagent:429)
>>>>
>>>> 2017-02-21 22:09:29,727-0500 ERROR (jsonrpc/4) [api] FINISH getStats error=Virtual machine does not exist: {'vmId': u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} (api:69)
>>>> Traceback (most recent call last):
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 67, in method
>>>>     ret = func(*args, **kwargs)
>>>>   File "/usr/share/vdsm/API.py", line 335, in getStats
>>>>     vm = self.vm
>>>>   File "/usr/share/vdsm/API.py", line 130, in vm
>>>>     raise exception.NoSuchVM(vmId=self._UUID)
>>>> NoSuchVM: Virtual machine does not exist: {'vmId': u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'}
>>>>
>>>> What should I be looking for to identify the issue?
>>>>
>>>> The logs are at http://jenkins.ovirt.org/job/ovirt_master_hc-system-tests/lastCompletedBuild/artifact/exported-artifacts/test_logs/hc-basic-suite-master/post-002_bootstrap.py/lago-hc-basic-suite-master-host0
>>>>
>>>> thanks
>>>> sahina
>>>>
>>>> _______________________________________________
>>>> Devel mailing list
>>>> Devel@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>> --
>>> Sandro Bonazzola
>>> Better technology. Faster innovation. Powered by community collaboration.
>>> See how it works at redhat.com
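For anyone checking agent.log for the same symptom, the sketch below pulls the 'engine-health' dictionary out of a "Local (id N)" refresh line. It assumes the line layout shown in the excerpt quoted in this thread. The regular expression and the function name are illustrative and are not part of ovirt-hosted-engine-ha.

```python
import ast
import re

# Matches the "Local (id N): {...}" payload at the end of a refresh line.
# The pattern is an assumption based on the log excerpt in this thread.
LOCAL_RE = re.compile(r"Local \(id (\d+)\): (\{.*\})\s*$")


def engine_health(line):
    """Return (host_id, engine-health dict), or None if the line does not match."""
    match = LOCAL_RE.search(line)
    if match is None:
        return None
    # The payload is a printed Python dict, so literal_eval can parse it safely.
    metadata = ast.literal_eval(match.group(2))
    return int(match.group(1)), metadata.get("engine-health")


line = ("MainThread::INFO::2017-02-21 22:09:58,023::state_machine::177::"
        "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) "
        "Local (id 1): {'engine-health': {'reason': 'failed to getVmStats', "
        "'health': 'unknown', 'vm': 'unknown', 'detail': 'unknown'}, "
        "'bridge': True, 'mem-free': 4079.0, 'maintenance': False, "
        "'cpu-load': 0.0491, 'gateway': True}")

host_id, health = engine_health(line)
print(host_id, health["vm"], health["reason"])  # 1 unknown failed to getVmStats
```

A 'vm' value of 'unknown' together with the reason 'failed to getVmStats' is the state reported in this thread, where the agent takes no action.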