[ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.
Martin Perina
mperina at redhat.com
Fri Sep 16 10:50:00 UTC 2016
On Fri, Sep 16, 2016 at 9:26 AM, Michal Skrivanek <
michal.skrivanek at redhat.com> wrote:
>
> > On 16 Sep 2016, at 08:29, aleksey.maksimov at it-kb.ru wrote:
> >
> > Are there any more ideas?
> >
> > 15.09.2016, 14:40, "aleksey.maksimov at it-kb.ru" <
> aleksey.maksimov at it-kb.ru>:
> >> Martin, I physically turned off the server through the iLO2. See
> screenshots.
> >> I did not touch the virtual machine (KOM-AD01-PBX02) at the same time.
> >> The virtual machine was running at the time the host shut
> down.
> >>
> >> 15.09.2016, 14:27, "Martin Perina" <mperina at redhat.com>:
> >>> Hi,
> >>>
> >>> I found this in the log:
> >>>
> >>> 2016-09-15 12:02:04,661 INFO [org.ovirt.engine.core.
> vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-6) [] VM
> '660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up'
> --> 'Down'
> >>> 2016-09-15 12:02:04,788 INFO [org.ovirt.engine.core.dal.
> dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-6) []
> Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM
> KOM-AD01-PBX02 is down. Exit message: User shut down from within the guest
>
> since it shut down cleanly, can you please check the guest's logs to see
> what triggered the shutdown? In such cases it is considered a user
> requested shutdown and such VMs are not restarted automatically
>
That's exactly what I meant by my response. From the log it's obvious that
the VM was shut down properly, so the engine will not restart it on a different
host. Also, on most modern hosts, executing a power management off action
sends a signal to the OS to perform a regular shutdown, so the VMs are also
shut down properly.
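
To check which case applies, the audit message quoted in this thread can be matched against the "clean shutdown" exit message. The following is only a minimal sketch: the regular expression and the function name classify_vm_down are hypothetical, and treating this one exit message as the only sign of a guest-initiated shutdown is an assumption, not how the engine decides internally.

```python
import re

# Matches the audit-log text quoted in this thread:
#   "VM <name> is down. Exit message: <reason>"
VM_DOWN_RE = re.compile(r"VM (?P<vm>\S+) is down\. Exit message: (?P<reason>.+)")

# Exit message copied verbatim from the quoted log. Treating only this
# string as a clean shutdown is an assumption of this sketch.
CLEAN_SHUTDOWN = "User shut down from within the guest"


def classify_vm_down(message):
    """Return (vm_name, exit_reason, clean_shutdown) or None if no match."""
    match = VM_DOWN_RE.search(message)
    if match is None:
        return None
    reason = match.group("reason").strip()
    # Per this thread, a clean guest shutdown is considered user-requested,
    # so the HA VM is not restarted automatically in that case.
    return match.group("vm"), reason, reason == CLEAN_SHUTDOWN


if __name__ == "__main__":
    sample = ("Message: VM KOM-AD01-PBX02 is down. "
              "Exit message: User shut down from within the guest")
    print(classify_vm_down(sample))
```

If the third value is True, the missing HA restart is expected behaviour and the guest's own logs are the place to look for what triggered the shutdown.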
> We are aware of a similar issue on specific hw -
> https://bugzilla.redhat.com/show_bug.cgi?id=1341106
>
> >>>
> >>> If I'm not mistaken, this means that the VM was properly shut down
> from within itself, and in that case it's not restarted automatically. So
> I'm curious what actions you took to make host KOM-AD01-VM31
> non-responsive?
> >>>
> >>> If you want to test fencing properly, then I suggest you either
> block the connection between host and engine on the host side or forcibly stop
> the ovirtmgmt network interface on the host, and watch fencing being applied.
>
Try the above if you want to test fencing. Of course you can always configure
a firewall rule to drop all packets between engine and host, or unplug the host
network cable.
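
As a rough illustration of the firewall option, the sketch below only builds the iptables command lines and prints them; it does not run anything. The engine address 192.0.2.10 is a placeholder from the documentation range, the function names are hypothetical, and the assumption is that the host uses plain IPv4 iptables and that the commands are run as root on the host.

```python
import ipaddress


def build_block_rules(engine_ip):
    """Return iptables commands that drop all traffic to and from engine_ip."""
    # Validate the address first; raises ValueError for anything that is
    # not a valid IPv4 address.
    ip = str(ipaddress.IPv4Address(engine_ip))
    return [
        # -I inserts the rule at the top of the chain so it takes effect
        # before any existing ACCEPT rules.
        ["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"],
        ["iptables", "-I", "OUTPUT", "-d", ip, "-j", "DROP"],
    ]


def build_unblock_rules(engine_ip):
    """Return the matching iptables commands that delete the drop rules."""
    ip = str(ipaddress.IPv4Address(engine_ip))
    return [
        ["iptables", "-D", "INPUT", "-s", ip, "-j", "DROP"],
        ["iptables", "-D", "OUTPUT", "-d", ip, "-j", "DROP"],
    ]


if __name__ == "__main__":
    engine = "192.0.2.10"  # placeholder, replace with the real engine IP
    for cmd in build_block_rules(engine) + build_unblock_rules(engine):
        print(" ".join(cmd))
```

Apply the first two commands on the host to simulate the outage, watch whether the engine fences the host, and apply the last two afterwards to restore connectivity.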
> >>>
> >>> Martin
> >>>
> >>> On Thu, Sep 15, 2016 at 1:16 PM, <aleksey.maksimov at it-kb.ru> wrote:
> >>>> engine.log for this period.
> >>>>
> >>>> 15.09.2016, 14:01, "Martin Perina" <mperina at redhat.com>:
> >>>>> On Thu, Sep 15, 2016 at 12:47 PM, <aleksey.maksimov at it-kb.ru>
> wrote:
> >>>>>> Hi Martin.
> >>>>>> I have a stupid question. Is using a Watchdog device mandatory to
> automatically start a virtual machine during the host fencing process?
> >>>>>
> >>>>> AFAIK it's not, but I'm not an expert, adding Arik.
> >>>>>
> >>>>> You definitely need a correct power management setup for the hosts,
> and the VM has to be marked as highly available.
> >>>>>
> >>>>>> 15.09.2016, 13:43, "Martin Perina" <mperina at redhat.com>:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> could you please share whole engine.log?
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> Martin Perina
> >>>>>>>
> >>>>>>> On Thu, Sep 15, 2016 at 12:01 PM, <aleksey.maksimov at it-kb.ru>
> wrote:
> >>>>>>>> Hello oVirt gurus!
> >>>>>>>>
> >>>>>>>> I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2
> hosts (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
> >>>>>>>>
> >>>>>>>> 1. I configured Power Management for the Hosts (successfully
> added Fencing Agent for iLO2 from my hosts)
> >>>>>>>>
> >>>>>>>> 2. I created new VM (KOM-AD01-PBX02) and installed Guest OS
> (Ubuntu Server 16.04 LTS) and oVirt Guest Agent
> >>>>>>>> (As described herein https://blog.it-kb.ru/2016/09/
> 14/install-ovirt-4-0-part-2-about-data-center-iso-domain-
> logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
> >>>>>>>> In the VM settings under "High Availability" I turned on the option
> "Highly Available" and changed "Priority" to "High"
> >>>>>>>>
> >>>>>>>> 3. Now I'm trying to check Hard-Fencing by powering off my first
> host (KOM-AD01-VM31) from its iLO (KOM-AD01-ILO31).
> >>>>>>>>
> >>>>>>>> Fencing works successfully and the server is automatically turned
> on, but my HA VM is not started on the second host (KOM-AD01-VM32).
> >>>>>>>>
> >>>>>>>> These events I see in the oVirt web console:
> >>>>>>>>
> >>>>>>>> Sep 15, 2016 12:08:13 PM Host KOM-AD01-VM31 power
> management was verified successfully.
> >>>>>>>> Sep 15, 2016 12:08:13 PM Status of host KOM-AD01-VM31 was
> set to Up.
> >>>>>>>> Sep 15, 2016 12:08:05 PM Executing power management
> status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent
> ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>> Sep 15, 2016 12:05:48 PM Host KOM-AD01-VM31 is rebooting.
> >>>>>>>> Sep 15, 2016 12:05:48 PM Host KOM-AD01-VM31 was started
> by SYSTEM.
> >>>>>>>> Sep 15, 2016 12:05:48 PM Power management start of Host
> KOM-AD01-VM31 succeeded.
> >>>>>>>> Sep 15, 2016 12:05:41 PM Executing power management
> status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent
> ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>> Sep 15, 2016 12:05:19 PM Executing power management start
> on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> >>>>>>>> Sep 15, 2016 12:05:19 PM Power management start of Host
> KOM-AD01-VM31 initiated.
> >>>>>>>> Sep 15, 2016 12:05:19 PM Auto fence for host
> KOM-AD01-VM31 was started.
> >>>>>>>> Sep 15, 2016 12:05:11 PM Executing power management
> status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent
> ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>> Sep 15, 2016 12:05:04 PM Executing power management
> status on Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent
> ilo:KOM-AD01-ILO31.holding.com.
> >>>>>>>> Sep 15, 2016 12:05:04 PM Host KOM-AD01-VM31 is non
> responsive.
> >>>>>>>> Sep 15, 2016 12:02:32 PM Host KOM-AD01-VM31 is not
> responding. It will stay in Connecting state for a grace period of 60
> seconds and after that an attempt to fence the host will be issued.
> >>>>>>>> Sep 15, 2016 12:02:32 PM VDSM KOM-AD01-VM31 command
> failed: Heartbeat exeeded
> >>>>>>>> Sep 15, 2016 12:02:04 PM VM KOM-AD01-PBX02 is down. Exit
> message: User shut down from within the guest
> >>>>>>>>
> >>>>>>>> What am I doing wrong? Why does the HA VM not start on the second host?
> >>>>>>>> _______________________________________________
> >>>>>>>> Users mailing list
> >>>>>>>> Users at ovirt.org
> >>>>>>>> http://lists.ovirt.org/mailman/listinfo/users