--=-X7tdJfzKIr6253m47i4Q
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
On Tue, 2014-08-19 at 07:42 +1000, John Gardeniers wrote:
Hi Daniel,

As per my original post, each host believed the *other* was a better
candidate, with the result that neither would start the engine. As you
may have read by now, the bug has been confirmed and a fix has been
proposed.
Indeed!
I ran into this bug as well, and I have also applied Jiri's fix.
However, for some reason one of my hosts showed a score of 2000; this
seems to be why it was working for me.
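
To illustrate why a lower score on one host would avoid the stalemate,
here is a minimal sketch of the tie. It is not the ovirt-hosted-engine-ha
code: the function names and the strict "greater than" comparison are
assumptions made purely for illustration. Only the scores (2400 and 2000)
come from this thread.

```python
# Illustrative sketch only: NOT the actual ovirt-hosted-engine-ha logic.
# Assumption: a host starts the engine only if its own score is strictly
# higher than the best remote score it sees.


def should_start_engine(local_score, best_remote_score):
    """Hypothetical decision rule; name and comparison are assumptions."""
    return local_score > best_remote_score


def hosts_that_start(scores):
    """Return the hosts that would start the engine under the rule above.

    `scores` maps a host name to its score.
    """
    starters = []
    for host, score in scores.items():
        remote_scores = [s for h, s in scores.items() if h != host]
        # With no remote hosts at all, any non-negative score wins.
        best_remote = max(remote_scores) if remote_scores else -1
        if should_start_engine(score, best_remote):
            starters.append(host)
    return starters


if __name__ == "__main__":
    # Both hosts at 2400, as in John's logs: nobody starts the engine.
    print(hosts_that_start({"ovirt1": 2400, "ovirt2": 2400}))
    # One host at 2000, as on my setup: the other host starts it.
    print(hosts_that_start({"host1": 2400, "host2": 2000}))
```

Under that assumed rule the first call yields no host at all, which
matches what John observed, while the second yields exactly one host.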
Your claim that HA is working is incorrect. A system that requires
manual intervention when something goes wrong is not HA.

regards,
John


On 18/08/14 19:18, Daniel Helgenberger wrote:

> Hello John,
>
>
> On Wed, 2014-07-23 at 19:47 -0400, Jason Brooks wrote:
> > ----- Original Message -----
> > > From: "John Gardeniers" <jgardeniers(a)objectmastery.com>
> > > To: "users" <users(a)ovirt.org>
> > > Sent: Wednesday, July 23, 2014 4:29:45 PM
> > > Subject: [ovirt-users] Self-hosted engine won't start
> > >=20
> > > Hi All,
> > >=20
> > > I have created a lab with 2 hypervisors and a self-hosted engine. T=
oday
> > > I followed the upgrade instructions as described in
> > >
http://www.ovirt.org/Hosted_Engine_Howto and rebooted the engine. I
> > > didn't really do an upgrade but simply wanted to test what would ha=
ppen
> > > when the engine was rebooted.
> > >=20
> > > When the engine didn't restart I re-ran hosted-engine
> > > --set-maintenance=3Dnone and restarted the vdsm, ovirt-ha-agent and
> > > ovirt-ha-broker services on both nodes. 15 minutes later it still h=
adn't
> > > restarted, so I then tried rebooting both hypervisers.
After an hou=
r
> > > there was still no sign of the engine starting. The
agent logs don'=
t
> > > help me much. The following bits are repeated over and
over.
> > >=20
> > > ovirt1 (192.168.19.20):
> > >
> > > MainThread::INFO::2014-07-24
> > > 09:18:40,272::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> > > Trying: notify time=1406157520.27 type=state_transition
> > > detail=EngineDown-EngineDown hostname='ovirt1.om.net'
> > > MainThread::INFO::2014-07-24
> > > 09:18:40,272::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> > > Success, was notification of state_transition (EngineDown-EngineDown)
> > > sent? ignored
> > > MainThread::INFO::2014-07-24
> > > 09:18:40,594::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> > > Current state EngineDown (score: 2400)
> > > MainThread::INFO::2014-07-24
> > > 09:18:40,594::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> > > Best remote host 192.168.19.21 (id: 2, score: 2400)
> > >
> > > ovirt2 (192.168.19.21):
> > >
> > > MainThread::INFO::2014-07-24
> > > 09:18:04,005::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> > > Trying: notify time=1406157484.01 type=state_transition
> > > detail=EngineDown-EngineDown hostname='ovirt2.om.net'
> > > MainThread::INFO::2014-07-24
> > > 09:18:04,006::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> > > Success, was notification of state_transition (EngineDown-EngineDown)
> > > sent? ignored
> > > MainThread::INFO::2014-07-24
> > > 09:18:04,324::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> > > Current state EngineDown (score: 2400)
> > > MainThread::INFO::2014-07-24
> > > 09:18:04,324::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> > > Best remote host 192.168.19.20 (id: 1, score: 2400)
> > >
> > > From the above information I decided to simply shut down one hypervisor
> > > and see what happens. The engine did start back up again a few minutes
> > > later.
> > I've seen this behavior, too.
> >
> > Jason
> >
> > > The interesting part is that each hypervisor seems to think the other is
> > > a better host.
> Where do you get this from? From the line:
> 'Best remote host 192.168.19.20 (id: 1, score: 2400)' ?
>
> I assume this is not the case; the HA broker is just looking for the
> best remote candidate.
>
> But I also have trouble with this behavior, esp. when I had the cluster
> in global maintenance.
> I resolve this by starting the hosted engine manually while in global
> maintenance, waiting for {"health": "good", "vm": "up", "detail":
> "up"} and disabling global maintenance afterwards.
>
> I found that the HA feature is indeed working; it is best tried out by
> manually stopping the engine service (service hosted-engine stop). IIRC
> this should trigger a failover and reboot of the engine.
>
>
> > > The two machines are identical, so there's no reason I
> > > can see for this odd behaviour. In a lab environment this is little more
> > > than an annoying inconvenience. In a production environment it would be
> > > completely unacceptable.
> > >
> > > May I suggest that this issue be looked into and some means found to
> > > eliminate this kind of mutual exclusion? e.g. After a few minutes of
> > > such an issue one hypervisor could be randomly given a slightly higher
> > > weighting, which should result in it being chosen to start the engine.
> > >
> > > regards,
> > > John
> > > _______________________________________________
> > > Users mailing list
> > > Users(a)ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/users
> > >
>
> Cheers,
> Daniel
>
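
For what it is worth, the tie-break John suggested in his original post
(give one hypervisor a slightly higher weighting at random) could look
roughly like the sketch below. The function name, the `bonus` parameter
and the score dictionary are hypothetical and not part of
ovirt-hosted-engine-ha; the "after a few minutes" timing is left out,
and the score dictionary is assumed to be non-empty.

```python
import random

# Illustrative sketch of a random tie-break; all names are hypothetical
# and nothing here is taken from ovirt-hosted-engine-ha.


def break_tie(scores, bonus=1, rng=None):
    """Give one randomly chosen top-scoring host a small bonus.

    `scores` maps a host name to its score and must not be empty.
    Returns a new dict; the input is left unchanged. If a single host
    already has the highest score, nothing is adjusted.
    """
    rng = rng or random.Random()
    top = max(scores.values())
    # Sort so the choice depends only on the RNG, not on dict ordering.
    tied = sorted(host for host, score in scores.items() if score == top)
    adjusted = dict(scores)
    if len(tied) > 1:
        adjusted[rng.choice(tied)] += bonus
    return adjusted


if __name__ == "__main__":
    print(break_tie({"ovirt1": 2400, "ovirt2": 2400}))
```

In a real cluster the hosts would also have to agree on which of them
receives the bonus, which this sketch does not address.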
-- 
Daniel Helgenberger
m box bewegtbild GmbH
P: +49/30/2408781-22
F: +49/30/2408781-10
ACKERSTR. 19
D-10115 BERLIN
www.m-box.de www.monkeymen.tv
Managing Directors: Martin Retschitzegger / Michaela Göllner
Commercial Register: Amtsgericht Charlottenburg / HRB 112767
--=-X7tdJfzKIr6253m47i4Q--