[ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

Martin Perina mperina at redhat.com
Fri Sep 16 10:50:00 UTC 2016


On Fri, Sep 16, 2016 at 9:26 AM, Michal Skrivanek <
michal.skrivanek at redhat.com> wrote:

>
> > On 16 Sep 2016, at 08:29, aleksey.maksimov at it-kb.ru wrote:
> >
> > Are there any more ideas?
> >
> > 15.09.2016, 14:40, "aleksey.maksimov at it-kb.ru" <
> aleksey.maksimov at it-kb.ru>:
> >> Martin, I physically turned off the server through the iLO2. See the
> >> screenshots.
> >> I did not touch the virtual machine (KOM-AD01-PBX02) at the same time.
> >> The virtual machine was running at the time the host shut down.
> >>
> >> 15.09.2016, 14:27, "Martin Perina" <mperina at redhat.com>:
> >>>  Hi,
> >>>
> >>>  I found this in the log:
> >>>
> >>>  2016-09-15 12:02:04,661 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] VM '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up' --> 'Down'
> >>>  2016-09-15 12:02:04,788 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest
>
> Since it shut down cleanly, can you please check the guest's logs to see
> what triggered the shutdown? In such cases it is considered a
> user-requested shutdown, and such VMs are not restarted automatically.
>

That's exactly what I meant by my response. From the log it's obvious that
the VM was shut down properly, so the engine will not restart it on a
different host. Also, on most modern hosts, if you execute the power
management off action, a signal is sent to the OS to perform a regular
shutdown, so the VMs are also shut down properly.

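To check which case applies, the engine.log lines quoted in this thread can be scanned for the exit message. The following is only a rough sketch: the regular expression is modelled on the single AuditLogDirector line shown here (other engine versions may word it differently), and the helper names find_vm_down_events and looks_like_user_shutdown are made up for illustration, not part of oVirt.

```python
import re

# Pattern modelled on the AuditLogDirector line quoted in this thread
# (assumption: other engine versions may phrase the message differently).
DOWN_RE = re.compile(
    r"Message: VM (?P<vm>\S+) is down\. Exit message: (?P<reason>.+)$"
)

# Exit message that this thread identifies as a clean shutdown from the guest.
CLEAN_SHUTDOWN = "User shut down from within the guest"


def find_vm_down_events(lines):
    """Return (vm_name, exit_message) pairs for every 'VM ... is down' line."""
    events = []
    for line in lines:
        match = DOWN_RE.search(line)
        if match:
            events.append((match.group("vm"), match.group("reason").strip()))
    return events


def looks_like_user_shutdown(exit_message):
    """Hypothetical helper: True if the exit message reports a clean guest
    shutdown, which the engine treats as user-requested (no HA restart)."""
    return exit_message.startswith(CLEAN_SHUTDOWN)


sample = [
    "2016-09-15 12:02:04,788 INFO  [org.ovirt.engine.core.dal.dbbroker."
    "auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-6) [] "
    "Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM "
    "KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest",
]

for vm, reason in find_vm_down_events(sample):
    print(vm, "-> user-requested shutdown:", looks_like_user_shutdown(reason))
```

For the log line from this thread, the check reports a user-requested shutdown, which matches the explanation above of why the VM was not restarted.
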
> We are aware of a similar issue on specific hw -
> https://bugzilla.redhat.com/show_bug.cgi?id=1341106
>
> >>>
> >>>  If I'm not mistaken, this means that the VM was properly shut down
> >>>  from within the guest, and in that case it is not restarted
> >>>  automatically. So I'm curious: what actions did you take to make host
> >>>  KOM-AD01-VM31 non-responsive?
> >>>
> >>>  If you want to test fencing properly, then I suggest that you either
> >>>  block the connection between the host and the engine on the host side
> >>>  or forcibly stop the ovirtmgmt network interface on the host, and then
> >>>  watch whether fencing is applied.
>

Try the above if you want to test fencing. Of course, you can always
configure a firewall rule to drop all packets between the engine and the
host, or unplug the host's network cable.
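
As a rough illustration of the firewall option, the sketch below only builds the iptables command lines; it does not run them. It assumes the host uses plain iptables (or ip6tables) and that the commands would be run as root on the host. The function name fencing_test_rules and the address 192.0.2.10 are placeholders, not values from this setup.

```python
import ipaddress


def fencing_test_rules(engine_ip, remove=False):
    """Build commands that drop (or, with remove=True, stop dropping)
    all traffic between this host and the engine address."""
    ip = ipaddress.ip_address(engine_ip)  # raises ValueError on a malformed address
    tool = "ip6tables" if ip.version == 6 else "iptables"
    # -I inserts the rule at the top of the chain, -D deletes the same rule again.
    action = "-D" if remove else "-I"
    return [
        f"{tool} {action} INPUT -s {ip} -j DROP",
        f"{tool} {action} OUTPUT -d {ip} -j DROP",
    ]


# 192.0.2.10 is a documentation-range placeholder for the engine address.
for cmd in fencing_test_rules("192.0.2.10"):
    print(cmd)
```

Calling the function again with remove=True gives the matching commands to restore connectivity once the test is finished.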

> >>>
> >>>  Martin
> >>>
> >>>  On Thu, Sep 15, 2016 at 1:16 PM, <aleksey.maksimov at it-kb.ru> wrote:
> >>>>  engine.log for this period.
> >>>>
> >>>>  15.09.2016, 14:01, "Martin Perina" <mperina at redhat.com>:
> >>>>>  On Thu, Sep 15, 2016 at 12:47 PM, <aleksey.maksimov at it-kb.ru>
> wrote:
> >>>>>>  Hi Martin.
> >>>>>>  I have a stupid question. Is a Watchdog device mandatory for a
> >>>>>>  virtual machine to be started automatically during host fencing?
> >>>>>
> >>>>>  AFAIK it's not, but I'm not an expert, adding Arik.
> >>>>>
> >>>>>  You need a correct power management setup for the hosts, and the VM
> >>>>>  has to be marked as highly available for sure.
> >>>>>
> >>>>>>  15.09.2016, 13:43, "Martin Perina" <mperina at redhat.com>:
> >>>>>>>  Hi,
> >>>>>>>
> >>>>>>>  could you please share the whole engine.log?
> >>>>>>>
> >>>>>>>  Thanks
> >>>>>>>
> >>>>>>>  Martin Perina
> >>>>>>>
> >>>>>>>  On Thu, Sep 15, 2016 at 12:01 PM, <aleksey.maksimov at it-kb.ru>
> wrote:
> >>>>>>>>  Hello oVirt gurus!
> >>>>>>>>
> >>>>>>>>  I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2
> >>>>>>>>  hosts (HP ProLiant DL 360 G5) connected to shared FC SAN storage.
> >>>>>>>>
> >>>>>>>>  1. I configured Power Management for the hosts (successfully
> >>>>>>>>     added a fencing agent for the iLO2 of my hosts).
> >>>>>>>>
> >>>>>>>>  2. I created a new VM (KOM-AD01-PBX02) and installed the guest OS
> >>>>>>>>     (Ubuntu Server 16.04 LTS) and the oVirt Guest Agent, as described at
> >>>>>>>>     https://blog.it-kb.ru/2016/09/14/install-ovirt-4-0-part-2-about-data-center-iso-domain-logical-network-vlan-vm-settings-console-guest-agent-live-migration/
> >>>>>>>>     In the VM settings, under "High Availability", I turned on the
> >>>>>>>>     "Highly Available" option and changed "Priority" to "High".
> >>>>>>>>
> >>>>>>>>  3. Now I'm trying to check hard fencing, so I powered off my first
> >>>>>>>>     host (KOM-AD01-VM31) from its iLO (KOM-AD01-ILO31).
> >>>>>>>>
> >>>>>>>>  Fencing works successfully and the server is automatically turned
> >>>>>>>>  back on, but my HA VM was not started on the second host
> >>>>>>>>  (KOM-AD01-VM32).
> >>>>>>>>
> >>>>>>>>  I see these events in the oVirt web console:
> >>>>>>>>
> >>>>>>>>  Sep 15, 2016 12:08:13 PM        Host KOM-AD01-VM31 power management was verified successfully.
> >>>>>>>>  Sep 15, 2016 12:08:13 PM        Status of host KOM-AD01-VM31 was set to Up.
> >>>>>>>>  Sep 15, 2016 12:08:05 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>>  Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 is rebooting.
> >>>>>>>>  Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 was started by SYSTEM.
> >>>>>>>>  Sep 15, 2016 12:05:48 PM        Power management start of Host KOM-AD01-VM31 succeeded.
> >>>>>>>>  Sep 15, 2016 12:05:41 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>>  Sep 15, 2016 12:05:19 PM        Executing power management start on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>>  Sep 15, 2016 12:05:19 PM        Power management start of Host KOM-AD01-VM31 initiated.
> >>>>>>>>  Sep 15, 2016 12:05:19 PM        Auto fence for host KOM-AD01-VM31 was started.
> >>>>>>>>  Sep 15, 2016 12:05:11 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>>  Sep 15, 2016 12:05:04 PM        Executing power management status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>>  Sep 15, 2016 12:05:04 PM        Host KOM-AD01-VM31 is non responsive.
> >>>>>>>>  Sep 15, 2016 12:02:32 PM        Host KOM-AD01-VM31 is not responding. It will stay in Connecting state for a grace period of 60 seconds and after that an attempt to fence the host will be issued.
> >>>>>>>>  Sep 15, 2016 12:02:32 PM        VDSM KOM-AD01-VM31 command failed: Heartbeat exeeded
> >>>>>>>>  Sep 15, 2016 12:02:04 PM        VM KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest
> >>>>>>>>
> >>>>>>>>  What am I doing wrong? Why does the HA VM not start on the second host?
> >>>>>>>>  _______________________________________________
> >>>>>>>>  Users mailing list
> >>>>>>>>  Users at ovirt.org
> >>>>>>>>  http://lists.ovirt.org/mailman/listinfo/users