
On Tue, 2014-08-19 at 07:42 +1000, John Gardeniers wrote:
Hi Daniel,

As per my original post, each host believed the *other* is a better candidate, with the result that neither would start the engine. As you may have read by now, the bug has been confirmed and a fix has been proposed.

Indeed! I ran into this bug as well, and I have also applied Jiri's fix.

Your claim that HA is working is incorrect. A system that requires manual intervention when something goes wrong is not HA.

regards,
John

On 18/08/14 19:18, Daniel Helgenberger wrote:
Hello John,

On Wed, 2014-07-23 at 19:47 -0400, Jason Brooks wrote:
----- Original Message -----
From: "John Gardeniers" <jgardeniers@objectmastery.com>
To: "users" <users@ovirt.org>
Sent: Wednesday, July 23, 2014 4:29:45 PM
Subject: [ovirt-users] Self-hosted engine won't start

Hi All,

I have created a lab with 2 hypervisors and a self-hosted engine. Today I followed the upgrade instructions as described in http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I didn't really do an upgrade but simply wanted to test what would happen when the engine was rebooted.

However, for some reason one of my hosts showed a score of 2000; it seems this is why it was working for me.

When the engine didn't restart I re-ran hosted-engine --set-maintenance=none and restarted the vdsm, ovirt-ha-agent and ovirt-ha-broker services on both nodes. 15 minutes later it still hadn't restarted, so I then tried rebooting both hypervisors. After an hour there was still no sign of the engine starting. The agent logs don't help me much. The following bits are repeated over and over.

ovirt1 (192.168.19.20):

MainThread::INFO::2014-07-24 09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1406157520.27 type=state_transition detail=EngineDown-EngineDown hostname='ovirt1.om.net'
MainThread::INFO::2014-07-24 09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
MainThread::INFO::2014-07-24 09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24 09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 192.168.19.21 (id: 2, score: 2400)

ovirt2 (192.168.19.21):

MainThread::INFO::2014-07-24 09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1406157484.01 type=state_transition detail=EngineDown-EngineDown hostname='ovirt2.om.net'
MainThread::INFO::2014-07-24 09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
MainThread::INFO::2014-07-24 09:18:04,324::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-24 09:18:04,324::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 192.168.19.20 (id: 1, score: 2400)

From the above information I decided to simply shut down one hypervisor and see what happens. The engine did start back up again a few minutes later.

I've seen this behavior, too.

Jason

The interesting part is that each hypervisor seems to think the other is a better host.

Where do you get this from? From the line:
'Best remote host 192.168.19.20 (id: 1, score: 2400)' ?

I assume this is not the case; the HA broker is just looking for the best remote candidate.

But I also have trouble with this behavior, especially when I had the cluster in global maintenance. I resolve this by starting the hosted engine manually while in global maintenance, waiting for {"health": "good", "vm": "up", "detail": "up"}, and disabling global maintenance afterwards.

I found the HA feature is indeed working; it is best tried out by manually stopping the engine service (service hosted-engine stop). IIRC this should trigger a failover and reboot of the engine.

The two machines are identical, so there's no reason I can see for this odd behaviour. In a lab environment this is little more than an annoying inconvenience. In a production environment it would be completely unacceptable.

May I suggest that this issue be looked into and some means found to eliminate this kind of mutual exclusion? e.g. After a few minutes of such an issue one hypervisor could be randomly given a slightly higher weighting, which should result in it being chosen to start the engine.

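The tie-break suggested here could be sketched roughly as follows. This is only an illustration of the idea, not how ovirt-hosted-engine-ha actually computes or compares scores; the function name `break_tie` and the parameters `grace_s` and `bonus` are hypothetical. It also assumes that a single party computes the adjusted scores (or that the result is shared between hosts), since two hosts drawing independently could still disagree.

```python
import random


def break_tie(scores, tied_for_s, grace_s=300, bonus=10, rng=None):
    """Return a copy of `scores` with one tied leader nudged ahead.

    scores: dict of host -> HA score, as reported in the agent logs.
    tied_for_s: how long the top scores have been equal, in seconds.
    grace_s, bonus: hypothetical tuning knobs, not real oVirt settings.
    """
    adjusted = dict(scores)
    if not adjusted:
        return adjusted
    top = max(adjusted.values())
    leaders = sorted(h for h, s in adjusted.items() if s == top)
    # Leave the scores alone if there is a clear winner or the tie is recent.
    if len(leaders) < 2 or tied_for_s < grace_s:
        return adjusted
    rng = rng or random.Random()
    # Give exactly one of the tied hosts a slightly higher weighting.
    adjusted[rng.choice(leaders)] += bonus
    return adjusted


# Both lab hosts report 2400, as in the logs above.
scores = {"192.168.19.20": 2400, "192.168.19.21": 2400}
print(break_tie(scores, tied_for_s=600))
```

With the two hosts tied at 2400 for longer than the grace period, exactly one of them ends up at 2410 and would be the one expected to start the engine.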
regards,
John
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

Cheers,
Daniel

--
Daniel Helgenberger
m box bewegtbild GmbH

P: +49/30/2408781-22
F: +49/30/2408781-10

ACKERSTR. 19
D-10115 BERLIN

www.m-box.de www.monkeymen.tv

Managing Directors: Martin Retschitzegger / Michaela Göllner
Commercial register: Amtsgericht Charlottenburg / HRB 112767