oVirt 3.5 : "Power Management Health Check" - feature pages

Hi,
The following wiki pages were added for the "Power Management Health Check" feature planned for oVirt 3.5:
http://www.ovirt.org/Features/PMHealthCheck
http://www.ovirt.org/Features/Design/DetailedPMHealthCheck
Your comments/questions are most welcome.
Thanks,
Eli Mesika

Hi Eli,
Here is my comment :) Why does the engine need to perform the status health check itself? Isn't there a third-party tool that already does this, which we could integrate with? If there is one, it probably has fewer (known) bugs and more features, and it is already written, tested and documented, allows further integration, and probably deals with scale.
btw, fixed some typos in your pages :-)
Thanks, Gilad.
----- Original Message -----
From: "Eli Mesika" <emesika@redhat.com> To: "users" <users@ovirt.org> Cc: "Arthur Berezin" <aberezin@redhat.com> Sent: Sunday, May 4, 2014 12:18:47 PM Subject: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
Hi
The following wiki pages were added to the "Power Management Health Check" feature planned for oVirt 3.5
http://www.ovirt.org/Features/PMHealthCheck http://www.ovirt.org/Features/Design/DetailedPMHealthCheck
Your comments/questions are most welcome.
Thanks,
Eli Mesika
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

In this case the engine periodically checks the health of the hosts' power management, as HA relies on it.
Arthur
----- Original Message -----
From: "Gilad Chaplik" <gchaplik@redhat.com> To: "Eli Mesika" <emesika@redhat.com> Cc: "users" <users@ovirt.org>, "Arthur Berezin" <aberezin@redhat.com> Sent: Sunday, May 4, 2014 5:26:45 PM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
Hi Eli,
Here is my comment :) Why does the engine need to perform the status health check itself? Isn't there a third-party tool that already does this, which we could integrate with? If there is one, it probably has fewer (known) bugs and more features, and it is already written, tested and documented, allows further integration, and probably deals with scale.
btw, fixed some typos in your pages :-)
Thanks, Gilad.

----- Original Message -----
From: "Arthur Berezin" <aberezin@redhat.com> To: "Gilad Chaplik" <gchaplik@redhat.com> Cc: "users" <users@ovirt.org> Sent: Sunday, May 4, 2014 5:35:59 PM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
In this case engine periodically checks health of hosts' power management as HA relies on it.
Arthur
Hi, which third party, for example, are you referring to? The PM code already exists in the engine, and you're also using Quartz for scheduling.

----- Original Message -----
From: "Yair Zaslavsky" <yzaslavs@redhat.com> To: "Arthur Berezin" <aberezin@redhat.com> Cc: "Gilad Chaplik" <gchaplik@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 6:39:02 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
Hi, which third party, for example, are you referring to? The PM code already exists in the engine, and you're also using Quartz for scheduling.
Yair,
You're raising some good points, but IMO the entire host monitoring (including getVdsStats, etc.) should be externalized. There are two major issues that we still don't cover:
- No HA for monitoring: who checks the hosts when the engine is down?
- No scale: the engine is a bottleneck in network and compute.
Although the above is a huge architectural change, we need to start somewhere, and this feature sounds like a candidate to introduce it.
About the examples: http://sixrevisions.com/tools/10-free-server-network-monitoring-tools-that-kick-ass/
If my suggestion is taken, the main goal of the feature is to select the most appropriate one.
Thanks, Gilad.

----- Original Message -----
From: "Gilad Chaplik" <gchaplik@redhat.com> To: "Yair Zaslavsky" <yzaslavs@redhat.com> Cc: "Arthur Berezin" <aberezin@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 10:57:01 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
Yair,
You're raising some good points, but IMO the entire host monitoring (including getVdsStats, etc.) should be externalized. There are two major issues that we still don't cover:
- No HA for monitoring: who checks the hosts when the engine is down?
- No scale: the engine is a bottleneck in network and compute.
Although the above is a huge architectural change, we need to start somewhere, and this feature sounds like a candidate to introduce it.
About the examples: http://sixrevisions.com/tools/10-free-server-network-monitoring-tools-that-kick-ass/
If my suggestion is taken, the main goal of the feature is to select the most appropriate one.
Thanks, Gilad.
Well, Nagios is being considered, or is already used, by the Gluster guys. However, it would still require (AFAIK) coding a Nagios plugin to perform the health check. In addition, you would have to somehow report the state change to the engine. IMHO, this is a bit of overkill (consider also how often the check runs: once an hour, so it can't be compared to getVmStats).
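For a sense of what "coding a Nagios plugin" would involve, here is a minimal Python sketch. It relies only on the standard Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function `query_pm_status`, the host names and the canned statuses are hypothetical placeholders, not engine or fence-agent APIs; a real plugin would invoke the host's fence agent at that point.

```python
"""Sketch of a Nagios-style check for a host's power-management (fence) device."""

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def query_pm_status(host):
    """Hypothetical placeholder: ask the fence device for the host's power status.

    A real plugin would call the appropriate fence agent here; this stub
    returns canned values so that the sketch is runnable.
    """
    canned = {"host-a": "on", "host-b": "off", "host-c": None}
    if host not in canned:
        raise LookupError("no power-management device configured for " + host)
    return canned[host]


def check_pm(host, query=query_pm_status):
    """Map the power-management status to an (exit_code, message) pair."""
    try:
        status = query(host)
    except Exception as exc:
        # The check itself could not be carried out.
        return UNKNOWN, "PM UNKNOWN - %s" % exc
    if status in ("on", "off"):
        # The device answered with a valid power state, so it is reachable.
        return OK, "PM OK - %s reports power %s" % (host, status)
    return CRITICAL, "PM CRITICAL - no valid status from %s" % host
```

A wrapper script would print the message and exit with the returned code. Reporting the state change back to the engine is a separate step that this sketch does not cover.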

----- Original Message -----
From: "Yair Zaslavsky" <yzaslavs@redhat.com> To: "Gilad Chaplik" <gchaplik@redhat.com> Cc: "Arthur Berezin" <aberezin@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 11:10:10 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
Well, Nagios is being considered, or is already used, by the Gluster guys. However, it would still require (AFAIK) coding a Nagios plugin to perform the health check. In addition, you would have to somehow report the state change to the engine. IMHO, this is a bit of overkill (consider also how often the check runs: once an hour, so it can't be compared to getVmStats).
+1
These monitoring tools bring a lot of value, and there are some initial integrations that we might want to look into [1][2]. But it's overkill for this RFE, which is to run the "PM Check" periodically, in addition to the initial PM check at the host setup stage.
[1] https://github.com/monitoring-ui-plugin/development
[2] http://exchange.nagios.org/directory/Plugins/Operating-Systems/*-Virtual-Environments/Others/check_rhev3/details
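The in-engine alternative, running the PM check periodically and reacting only when a host's status changes, can be sketched roughly as follows. This is an illustration in Python, not the engine's code: the class `PMHealthMonitor`, its `check` callable and the one-hour default interval (taken from the "once in an hour" remark in the thread) are assumptions, and in the engine the existing Quartz scheduler would take the place of the loop.

```python
import time


class PMHealthMonitor:
    """Runs a PM check per host and records only status transitions."""

    def __init__(self, hosts, check, interval_seconds=3600):
        self.hosts = list(hosts)
        self.check = check                # callable(host) -> bool, supplied by the caller
        self.interval = interval_seconds  # assumed default: once an hour
        self.last = {}                    # host -> last known health (bool)
        self.events = []                  # every (host, healthy) transition seen so far

    def run_once(self):
        """Check every host once and return the transitions seen in this pass.

        On the first pass each host has no previous state, so its initial
        status is reported as a transition.
        """
        changes = []
        for host in self.hosts:
            try:
                healthy = bool(self.check(host))
            except Exception:
                healthy = False           # a check that fails counts as unhealthy
            if self.last.get(host) != healthy:
                changes.append((host, healthy))
            self.last[host] = healthy
        self.events.extend(changes)
        return changes

    def run_forever(self, sleep=time.sleep):
        """Blocking loop for illustration; a real scheduler would call run_once."""
        while True:
            self.run_once()
            sleep(self.interval)
```

Only the transitions returned by `run_once` would need to be turned into alerts or audit-log entries, which keeps an hourly check cheap compared with continuous statistics polling.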

----- Original Message -----
From: "Arthur Berezin" <aberezin@redhat.com> To: "Gilad Chaplik" <gchaplik@redhat.com> Cc: "users" <users@ovirt.org>, "Yair Zaslavsky" <yzaslavs@redhat.com> Sent: Monday, May 5, 2014 11:30:24 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
+1
These monitoring tools bring a lot of value, and there are some initial integrations that we might want to look into [1][2]. But it's overkill for this RFE, which is to run the "PM Check" periodically, in addition to the initial PM check at the host setup stage.
[1] https://github.com/monitoring-ui-plugin/development
[2] http://exchange.nagios.org/directory/Plugins/Operating-Systems/*-Virtual-Environments/Others/check_rhev3/details
-1 on overkill. As I mentioned, proper monitoring is a huge feature; it should be introduced gradually, and IMO this is a good starting point. We can look at it as overkill _or_ as a springboard that will reduce the learning curve and future integration issues.
Gilad.

----- Original Message -----
From: "Gilad Chaplik" <gchaplik@redhat.com> To: "Arthur Berezin" <aberezin@redhat.com> Cc: "users" <users@ovirt.org>, "Yair Zaslavsky" <yzaslavs@redhat.com> Sent: Monday, May 5, 2014 11:52:25 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
----- Original Message -----
From: "Arthur Berezin" <aberezin@redhat.com> To: "Gilad Chaplik" <gchaplik@redhat.com> Cc: "users" <users@ovirt.org>, "Yair Zaslavsky" <yzaslavs@redhat.com> Sent: Monday, May 5, 2014 11:30:24 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
----- Original Message -----
From: "Yair Zaslavsky" <yzaslavs@redhat.com> To: "Gilad Chaplik" <gchaplik@redhat.com> Cc: "Arthur Berezin" <aberezin@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 11:10:10 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
----- Original Message -----
From: "Gilad Chaplik" <gchaplik@redhat.com> To: "Yair Zaslavsky" <yzaslavs@redhat.com> Cc: "Arthur Berezin" <aberezin@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 10:57:01 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
----- Original Message -----
From: "Yair Zaslavsky" <yzaslavs@redhat.com> To: "Arthur Berezin" <aberezin@redhat.com> Cc: "Gilad Chaplik" <gchaplik@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 6:39:02 AM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
----- Original Message -----
From: "Arthur Berezin" <aberezin@redhat.com> To: "Gilad Chaplik" <gchaplik@redhat.com> Cc: "users" <users@ovirt.org> Sent: Sunday, May 4, 2014 5:35:59 PM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
In this case engine periodically checks health of hosts' power management as HA relies on it.
Arthur
----- Original Message -----
From: "Gilad Chaplik" <gchaplik@redhat.com> To: "Eli Mesika" <emesika@redhat.com> Cc: "users" <users@ovirt.org>, "Arthur Berezin" <aberezin@redhat.com> Sent: Sunday, May 4, 2014 5:26:45 PM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
Hi Eli,
Here is my comment :) Why does the engine need to send the status health check? Isn't there any 3rd party that does it, that we can integrate with? If found, it probably has fewer (known) bugs and more features, and it's already written, tested, documented, allows further integration and probably deals with scale.
btw, fixed some typos in your pages :-)
Thanks, Gilad.
Hi, which 3rd party, for example, are you referring to? The PM code already exists in the engine, and you're also using Quartz for scheduling.
Yair,
You are raising some good points, but IMO the entire host monitoring (incl. getVdsStats, etc.) should be externalized. There are 2 major issues that we still don't cover:
- No HA for monitoring: who checks the hosts when the engine is down?
- No scale: the engine is a bottleneck in network and compute.
Although the above is a huge architectural change, we need to start somewhere, and this feature sounds like a candidate to introduce it.
About the examples: http://sixrevisions.com/tools/10-free-server-network-monitoring-tools-that-k... The main goal of the feature, if my suggestion is taken, is to select the most appropriate one.
Thanks, Gilad.
Well, Nagios is being considered for use, or is already used, by the Gluster guys. However, it will still require (AFAIK) coding a Nagios plugin to perform the health check. In addition, you will have to somehow report the state change to the engine. IMHO, this is a bit of an overkill (look also at how often the check is run: once an hour, so it can't be compared to getVmStats).
+1 These monitoring tools bring a lot of value, and there are some initial integrations that we might want to look into [1][2]. But it's overkill for this RFE: run "PM Check" periodically, in addition to the initial PM check at the host setup stage.
[1] https://github.com/monitoring-ui-plugin/development
[2] http://exchange.nagios.org/directory/Plugins/Operating-Systems/*-Virtual-Env...
-1 on overkill. As I mentioned, proper monitoring is a huge feature; it should be introduced gradually, and IMO this is a good starting point. We can look at it as overkill _or_ as a springboard that will reduce the learning curve and future integration issues.
IMHO this will also increase deployment complexity, and require our customers to have another component installed. The chances of bugs here (as you previously mentioned) are IMHO about the same as the chances of bugs from developing a custom Nagios plugin.
Gilad.
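For readers following the thread, the in-engine approach under discussion (run the existing PM check against every host on a fixed schedule, roughly once an hour) can be sketched as below. This is only an illustration under stated assumptions: it is written in Python for brevity, whereas the engine is Java and, per the thread, schedules with Quartz; the names `check_power_management`, `failed_hosts`, `run_periodically`, and the `probe` callable are all hypothetical and do not correspond to oVirt classes or APIs.

```python
import time
from typing import Callable, Dict, Iterable, List

# Hypothetical probe type: takes a host name and returns True if that host's
# power-management (fence) agent answered a status request.
PmProbe = Callable[[str], bool]

# "Once in an hour", as mentioned in the thread; an assumed default.
CHECK_INTERVAL_SECONDS = 60 * 60


def check_power_management(hosts: Iterable[str], probe: PmProbe) -> Dict[str, bool]:
    """Run the PM status probe once per host; a probe error counts as a failure."""
    results: Dict[str, bool] = {}
    for host in hosts:
        try:
            results[host] = bool(probe(host))
        except Exception:
            results[host] = False
    return results


def failed_hosts(results: Dict[str, bool]) -> List[str]:
    """Return the hosts whose PM check failed, sorted by name."""
    return sorted(host for host, ok in results.items() if not ok)


def run_periodically(hosts, probe, rounds,
                     interval=CHECK_INTERVAL_SECONDS, sleep=time.sleep):
    """Repeat the check `rounds` times, sleeping `interval` seconds in between.

    `sleep` is injectable so the loop can be exercised without waiting.
    Returns one list of failed hosts per round.
    """
    history = []
    for i in range(rounds):
        history.append(failed_hosts(check_power_management(hosts, probe)))
        if i < rounds - 1:
            sleep(interval)
    return history
```

The externalized alternative argued for in the thread would move the equivalent of `probe` into a monitoring-tool plugin and add a channel for reporting state changes back to the engine; the sketch does not attempt to model that.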

----- Original Message -----
From: "Yair Zaslavsky" <yzaslavs@redhat.com> To: "Gilad Chaplik" <gchaplik@redhat.com> Cc: "Arthur Berezin" <aberezin@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 12:00:10 PM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
IMHO this will also increase deployment complexity, and require our customers to have another component installed. The chances of bugs here (as you previously mentioned) are IMHO about the same as the chances of bugs from developing a custom Nagios plugin.
I appreciate your involvement and rapid responses :-) I get your point, but the bugs for this specific feature will be nothing compared to the bugs we'll get once we integrate a fully external monitoring process (in terms of priority and severity). This is what I meant by 'learning curve and future integration issues'.
Gilad.

Gilad, this is an integral part of HA. For HA to work properly we rely on the PM mechanism; this feature makes sure PM is available, i.e. by running the existing "Check PM" periodically.
We need to look at integrating a monitoring tool and its benefits separately.
Arthur
----- Original Message -----
From: "Gilad Chaplik" <gchaplik@redhat.com> To: "Yair Zaslavsky" <yzaslavs@redhat.com> Cc: "Arthur Berezin" <aberezin@redhat.com>, "users" <users@ovirt.org> Sent: Monday, May 5, 2014 12:06:28 PM Subject: Re: [ovirt-users] oVirt 3.5 : "Power Management Health Check" - feature pages
participants (4)
- Arthur Berezin
- Eli Mesika
- Gilad Chaplik
- Yair Zaslavsky