<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 16 Sep 2016, at 14:23, Martin Perina <<a href="mailto:mperina@redhat.com" class="">mperina@redhat.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br class=""></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Fri, Sep 16, 2016 at 1:54 PM, Simone Tiraboschi <span dir="ltr" class=""><<a href="mailto:stirabos@redhat.com" target="_blank" class="">stirabos@redhat.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Fri, Sep 16, 2016 at 12:50 PM, Martin Perina <span dir="ltr" class=""><<a href="mailto:mperina@redhat.com" target="_blank" class="">mperina@redhat.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div style="font-family:arial,helvetica,sans-serif" class=""><br class=""></div><div class="gmail_extra"><br class=""><div class="gmail_quote"><span class="">On Fri, Sep 16, 2016 at 9:26 AM, Michal Skrivanek <span dir="ltr" class=""><<a href="mailto:michal.skrivanek@redhat.com" target="_blank" class="">michal.skrivanek@redhat.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="">
> On 16 Sep 2016, at 08:29, <a href="mailto:aleksey.maksimov@it-kb.ru" target="_blank" class="">aleksey.maksimov@it-kb.ru</a> wrote:<br class="">
><br class="">
> Are there any more ideas?<br class="">
><br class="">
> 15.09.2016, 14:40, "<a href="mailto:aleksey.maksimov@it-kb.ru" target="_blank" class="">aleksey.maksimov@it-kb.ru</a>" <<a href="mailto:aleksey.maksimov@it-kb.ru" target="_blank" class="">aleksey.maksimov@it-kb.ru</a>>:<br class="">
>> Martin, I physically turned off the server through the iLO2. See screenshots.<br class="">
>> I did not touch Virtual Machine (KOM-AD01-PBX02) at the same time.<br class="">
>> The virtual machine was running at the time the host shut down.<br class="">
>><br class="">
>> 15.09.2016, 14:27, "Martin Perina" <<a href="mailto:mperina@redhat.com" target="_blank" class="">mperina@redhat.com</a>>:<br class="">
>>> Hi,<br class="">
>>><br class="">
>>> I found this in the log:<br class="">
>>><br class="">
>>> 2016-09-15 12:02:04,661 INFO [org.ovirt.engine.core.vdsbrok<wbr class="">er.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] VM '660bafca-e9c3-4191-99b4-295ff<wbr class="">8553488'(KOM-AD01-PBX02) moved from 'Up' --> 'Down'<br class="">
>>> 2016-09-15 12:02:04,788 INFO [org.ovirt.engine.core.dal.dbb<wbr class="">roker.auditloghandling.AuditLo<wbr class="">gDirector] (ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest<br class="">
<br class="">
Since it shut down cleanly, can you please check the guest's logs to see what triggered the shutdown? In such cases it is considered a user-requested shutdown, and such VMs are not restarted automatically.<br class=""></blockquote></span><div class=""><br class=""><div style="font-family:arial,helvetica,sans-serif;display:inline" class="">That's exactly what I meant by my response. From the log it's obvious that the VM was shut down properly, so the engine will not restart it on a different host. Also, on most modern hosts, if you execute a power management off action, a signal is sent to the OS to execute a </div> <div style="font-family:arial,helvetica,sans-serif;display:inline" class="">regular shutdown, so VMs are also shut down properly.<br class=""></div></div></div></div></div></blockquote><div class=""><br class=""></div><div class="">I understand the reason, but is it really what the user expects?</div><div class=""><br class=""></div><div class="">I mean, if I set HA mode on a VM, I'd expect the engine to keep it up or restart it if needed, regardless of the shutdown reason.</div></div></div></div></blockquote></div></div></div></div></blockquote><div><br class=""></div>No, that’s not how HA works today. When you log into a guest and issue “shutdown” we do not restart the VM under your hands. We can argue how it should or may work, but this has been the defined behavior since the dawn of oVirt.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""><div class="gmail_default" style="font-family:arial,helvetica,sans-serif;display:inline">AFAIK that's correct, we need to be able </div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif;display:inline">to shut down an HA VM</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif;display:inline"> without it being immediately restarted on a different host. 
We want to restart an HA VM only if the host where it is running is non-responsive.<br class=""></div></div></div></div></div></div></blockquote><div><br class=""></div>We try to restart it in all cases other than a user-initiated shutdown, e.g. a QEMU process crash on an otherwise-healthy host.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><div class="gmail_default" style="font-family:arial,helvetica,sans-serif;display:inline"><br class=""></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class="">For instance, on hosted-engine the HA agent, if not in global maintenance mode, will restart the engine VM regardless of who shut it down or why.</div></div></div></div></blockquote><div class=""><br class=""><div class="gmail_default" style="font-family:arial,helvetica,sans-serif;display:inline">Well, the HE VM is definitely not a standard HA VM :-)<br class=""></div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><div class=""> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><div style="font-family:arial,helvetica,sans-serif;display:inline" class=""></div></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
We are aware of a similar issue on specific hw - <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1341106" rel="noreferrer" target="_blank" class="">https://bugzilla.redhat.com/sh<wbr class="">ow_bug.cgi?id=1341106</a><br class="">
<br class="">
>>><br class="">
>>> If I'm not mistaken, this means that the VM was properly shut down from within itself, and in that case it's not restarted automatically. So I'm curious: what actions did you take to make host KOM-AD01-VM31 non-responsive?<br class="">
>>><br class="">
>>> If you want to test fencing properly, then I suggest you either block the connection between the host and the engine on the host side or forcibly stop the ovirtmgmt network interface on the host, and watch fencing being applied.<br class=""></blockquote></span><div class=""><br class=""><div style="font-family:arial,helvetica,sans-serif;display:inline" class="">Try the above if you want to test fencing. Of course, you can always configure a firewall rule to drop all packets between the engine and the host, or unplug the host's network cable.<br class=""><br class=""></div></div><div class=""><div class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
>>><br class="">
>>> Martin<br class="">
>>><br class="">
>>> On Thu, Sep 15, 2016 at 1:16 PM, <<a href="mailto:aleksey.maksimov@it-kb.ru" target="_blank" class="">aleksey.maksimov@it-kb.ru</a>> wrote:<br class="">
>>>> engine.log for this period.<br class="">
>>>><br class="">
>>>> 15.09.2016, 14:01, "Martin Perina" <<a href="mailto:mperina@redhat.com" target="_blank" class="">mperina@redhat.com</a>>:<br class="">
>>>>> On Thu, Sep 15, 2016 at 12:47 PM, <<a href="mailto:aleksey.maksimov@it-kb.ru" target="_blank" class="">aleksey.maksimov@it-kb.ru</a>> wrote:<br class="">
>>>>>> Hi Martin.<br class="">
>>>>>> I have a stupid question. Is a Watchdog device mandatory for a virtual machine to be started automatically during host fencing?<br class="">
>>>>><br class="">
>>>>> AFAIK it's not, but I'm not an expert, adding Arik.<br class="">
>>>>><br class="">
>>>>> You need a correct power management setup for the hosts, and the VM has to be marked as highly available, for sure.<br class="">
>>>>><br class="">
>>>>>> 15.09.2016, 13:43, "Martin Perina" <<a href="mailto:mperina@redhat.com" target="_blank" class="">mperina@redhat.com</a>>:<br class="">
>>>>>>> Hi,<br class="">
>>>>>>><br class="">
>>>>>>> could you please share the whole engine.log?<br class="">
>>>>>>><br class="">
>>>>>>> Thanks<br class="">
>>>>>>><br class="">
>>>>>>> Martin Perina<br class="">
>>>>>>><br class="">
>>>>>>> On Thu, Sep 15, 2016 at 12:01 PM, <<a href="mailto:aleksey.maksimov@it-kb.ru" target="_blank" class="">aleksey.maksimov@it-kb.ru</a>> wrote:<br class="">
>>>>>>>> Hello oVirt gurus!<br class="">
>>>>>>>><br class="">
>>>>>>>> I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.<br class="">
>>>>>>>><br class="">
>>>>>>>> 1. I configured Power Management for the hosts (successfully added a fencing agent for iLO2 on my hosts)<br class="">
>>>>>>>><br class="">
>>>>>>>> 2. I created a new VM (KOM-AD01-PBX02) and installed the guest OS (Ubuntu Server 16.04 LTS) and the oVirt Guest Agent<br class="">
>>>>>>>> (As described herein <a href="https://blog.it-kb.ru/2016/09/14/install-ovirt-4-0-part-2-about-data-center-iso-domain-logical-network-vlan-vm-settings-console-guest-agent-live-migration/" rel="noreferrer" target="_blank" class="">https://blog.it-kb.ru/2016/09/<wbr class="">14/install-ovirt-4-0-part-2-ab<wbr class="">out-data-center-iso-domain-log<wbr class="">ical-network-vlan-vm-settings-<wbr class="">console-guest-agent-live-migra<wbr class="">tion/</a>)<br class="">
>>>>>>>> In the VM settings under "High Availability" I turned on the option "Highly Available" and changed "Priority" to "High"<br class="">
>>>>>>>><br class="">
>>>>>>>> 3. Now I'm trying to test hard fencing by powering off my first host (KOM-AD01-VM31) from its iLO (KOM-AD01-ILO31).<br class="">
>>>>>>>><br class="">
>>>>>>>> Fencing works successfully and the server is automatically turned on, but my HA VM is not started on the second host (KOM-AD01-VM32).<br class="">
>>>>>>>><br class="">
>>>>>>>> These events I see in the oVirt web console:<br class="">
>>>>>>>><br class="">
>>>>>>>> Sep 15, 2016 12:08:13 PM Host KOM-AD01-VM31 power management was verified successfully.<br class="">
>>>>>>>> Sep 15, 2016 12:08:13 PM Status of host KOM-AD01-VM31 was set to Up.<br class="">
>>>>>>>> Sep 15, 2016 12:08:05 PM Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:<a href="http://kom-ad01-ilo31.holding.com/" rel="noreferrer" target="_blank" class="">KOM-AD01-ILO31.holding.com</a><wbr class="">.<br class="">
>>>>>>>> Sep 15, 2016 12:05:48 PM Host KOM-AD01-VM31 is rebooting.<br class="">
>>>>>>>> Sep 15, 2016 12:05:48 PM Host KOM-AD01-VM31 was started by SYSTEM.<br class="">
>>>>>>>> Sep 15, 2016 12:05:48 PM Power management start of Host KOM-AD01-VM31 succeeded.<br class="">
>>>>>>>> Sep 15, 2016 12:05:41 PM Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:<a href="http://kom-ad01-ilo31.holding.com/" rel="noreferrer" target="_blank" class="">KOM-AD01-ILO31.holding.com</a><wbr class="">.<br class="">
>>>>>>>> Sep 15, 2016 12:05:19 PM Executing power management start on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:<a href="http://kom-ad01-ilo31.holding.com/" rel="noreferrer" target="_blank" class="">KOM-AD01-ILO31.holding.com</a><wbr class="">.<br class="">
>>>>>>>> Sep 15, 2016 12:05:19 PM Power management start of Host KOM-AD01-VM31 initiated.<br class="">
>>>>>>>> Sep 15, 2016 12:05:19 PM Auto fence for host KOM-AD01-VM31 was started.<br class="">
>>>>>>>> Sep 15, 2016 12:05:11 PM Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:<a href="http://kom-ad01-ilo31.holding.com/" rel="noreferrer" target="_blank" class="">KOM-AD01-ILO31.holding.com</a><wbr class="">.<br class="">
>>>>>>>> Sep 15, 2016 12:05:04 PM Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:<a href="http://kom-ad01-ilo31.holding.com/" rel="noreferrer" target="_blank" class="">KOM-AD01-ILO31.holding.com</a><wbr class="">.<br class="">
>>>>>>>> Sep 15, 2016 12:05:04 PM Host KOM-AD01-VM31 is non responsive.<br class="">
>>>>>>>> Sep 15, 2016 12:02:32 PM Host KOM-AD01-VM31 is not responding. It will stay in Connecting state for a grace period of 60 seconds and after that an attempt to fence the host will be issued.<br class="">
>>>>>>>> Sep 15, 2016 12:02:32 PM VDSM KOM-AD01-VM31 command failed: Heartbeat exeeded<br class="">
>>>>>>>> Sep 15, 2016 12:02:04 PM VM KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest<br class="">
>>>>>>>><br class="">
>>>>>>>> What am I doing wrong? Why does the HA VM not start on the second host?<br class="">
>>>>>>>> ______________________________<wbr class="">_________________<br class="">
>>>>>>>> Users mailing list<br class="">
>>>>>>>> <a href="mailto:Users@ovirt.org" target="_blank" class="">Users@ovirt.org</a><br class="">
>>>>>>>> <a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank" class="">http://lists.ovirt.org/mailman<wbr class="">/listinfo/users</a><br class="">
><br class="">
><br class="">
<br class="">
</blockquote></div></div></div><br class=""></div></div>
<br class=""></blockquote></div><br class=""></div></div>
</blockquote></div><br class=""></div></div>
</div></blockquote></div><br class=""></body></html>