Hi All,
after running solidly for several months, my ovirt-engine started rebooting
on several hosts. I've looked at the output of hosted-engine --vm-status, and
it reports that the engine VM is up on one host but not reachable. At the
same time I can access the GUI and everything is working fine. After some
time the engine shuts down and all hosts try to start it until one of them
wins; at least that is how it looks. Any clues on where to look to find the
issue with the liveliness check?
--------------------------------------------------------------------------------------------------------
--== Host 1 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : ovirt-node01
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 3eb33843
local_conf_timestamp : 17128
Host timestamp : 17113
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=17113 (Fri Jul 14 11:50:23 2017)
host-id=1
score=3400
vm_conf_refresh_time=17128 (Fri Jul 14 11:50:38 2017)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
--== Host 2 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : ovirt-node02.mgmt.lan
Host ID : 2
Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 2a8c86cc
local_conf_timestamp : 523182
Host timestamp : 523167
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=523167 (Fri Jul 14 11:50:25 2017)
host-id=2
score=3400
vm_conf_refresh_time=523182 (Fri Jul 14 11:50:40 2017)
conf_on_shared_storage=True
maintenance=False
state=EngineStarting
stopped=False
--== Host 3 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : ovirt-node03.mgmt.lan
Host ID : 3
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : f8490d79
local_conf_timestamp : 527698
Host timestamp : 527683
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=527683 (Fri Jul 14 11:50:33 2017)
host-id=3
score=3400
vm_conf_refresh_time=527698 (Fri Jul 14 11:50:47 2017)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
----------------------------------------------------------------------------------------------
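To make the three status blocks above easier to compare, here is a minimal sketch in Python that parses the "Engine status" strings exactly as shown and lists the hosts where the VM is reported as up while the health is bad. The function name hosts_failing_liveliness and the dictionary layout are made up for illustration and are not part of hosted-engine; the status strings are copied by hand from the output above.

```python
import json

# "Engine status" values copied from the three host sections above.
ENGINE_STATUS = {
    "ovirt-node01": '{"reason": "vm not running on this host", '
                    '"health": "bad", "vm": "down", "detail": "unknown"}',
    "ovirt-node02.mgmt.lan": '{"reason": "failed liveliness check", '
                             '"health": "bad", "vm": "up", "detail": "up"}',
    "ovirt-node03.mgmt.lan": '{"reason": "vm not running on this host", '
                             '"health": "bad", "vm": "down", "detail": "unknown"}',
}


def hosts_failing_liveliness(status_by_host):
    """Return (host, reason) pairs where the VM is up but health is bad.

    Hypothetical helper for illustration only; it just reads the JSON
    fields that appear in the status output above.
    """
    failing = []
    for host, raw in status_by_host.items():
        status = json.loads(raw)
        if status.get("vm") == "up" and status.get("health") == "bad":
            failing.append((host, status.get("reason")))
    return failing


if __name__ == "__main__":
    for host, reason in hosts_failing_liveliness(ENGINE_STATUS):
        print(f"{host}: VM up, engine health bad ({reason})")
```

With the values above, only ovirt-node02.mgmt.lan is listed, which matches the host whose state is EngineStarting.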
Thank you,
Sven