--Apple-Mail-2--344998072
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
charset=us-ascii
Dear Doron,
I haven't collected the logs from the tests, but I will gladly redo the
test case and get back to you ASAP.
This feature is the main reason I chose oVirt over other virtualization
environments in the first place.
Could you please tell me which logs I should focus on besides the engine
log: vdsm, maybe, or other relevant logs?
Regards,
Alex
--
Sent from phone.
On 13.01.2013, at 09:56, Doron Fediuck <dfediuck(a)redhat.com> wrote:
From: "Alexandru Vladulescu" <avladulescu(a)bfproject.ro>
To: "users" <users(a)ovirt.org>
Sent: Friday, January 11, 2013 2:47:38 PM
Subject: [Users] Testing High Availability and Power outages
Hi,

Today I started testing the High Availability features and the fencing
mechanism on my oVirt 3.1 installation (from the dreyou repos), running on
3 x CentOS 6.3 hypervisors.

As I reported yesterday in a previous email thread, the priority in the
migration queue cannot be increased in this version (a bug), so I decided
to test what the official documentation says about the High Availability
cases.

It would be a disaster scenario if one hypervisor had a power outage or
hardware problem and the VMs running on it did not migrate to other spare
resources.

The official documentation on ovirt.org states the following:
High availability

Allows critical VMs to be restarted on another host in the event of
hardware failure with three levels of priority, taking into account
resiliency policy.

- Resiliency policy to control high availability VMs at the cluster level.
- Supports application-level high availability with supported fencing
  agents.

As well as in the Architecture description:

High Availability - restart guest VMs from failed hosts automatically on
other hosts
The test went like this: one VM running Linux, with the "High Available"
check box ticked and "Priority for Run/Migration queue:" set to Low. Under
Host, "Any Host in Cluster" is selected and "Allow VM migration only upon
Admin specific request" is not checked.
My environment:

Configuration: 2 x hypervisors (same cluster/hardware configuration); 1 x
hypervisor also acting as a NAS (NFS) server (different cluster/hardware
configuration)

Actions: I cut the power to one of the hypervisors in the 2-node cluster
while the VM was running on it. This simulates a power outage.

Results: The hypervisor node that suffered the outage shows in the Hosts
tab with status Non Responsive, and the VM shows a question mark and
cannot be powered off or otherwise acted on (so it is stuck).

In the log console in the GUI, I get:

Host Hyper01 is non-responsive.
VM Web-Frontend01 was set to the Unknown status.
There was nothing I could do besides clicking "Confirm Host has been
rebooted" on Hyper01; afterwards the VM starts on Hyper02 with a cold
boot.

The log console changes to:

Vm Web-Frontend01 was shut down due to Hyper01 host reboot or manual fence
All VMs' status on Non-Responsive Host Hyper01 were changed to 'Down' by admin@internal
Manual fencing for host Hyper01 was started.
VM Web-Frontend01 was restarted on Host Hyper02
I would like your view on this problem. From reading the documentation
and feature pages on the official website, I assumed this would be an
automatic mechanism driven by some sort of vdsm and engine fencing action.
Am I missing something?

Thank you for your patience in reading this.

Regards,
Alex.
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
Hi Alex,
Can you share with us the engine's log from the relevant time period?
=20
Doron
--Apple-Mail-2--344998072--