[ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

Martin Perina mperina at redhat.com
Thu Sep 15 11:27:15 UTC 2016


Hi,

I found this in the log:

2016-09-15 12:02:04,661 INFO
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(ForkJoinPool-1-worker-6) [] VM
'660bafca-e9c3-4191-99b4-295ff8553488'(KOM-AD01-PBX02) moved from 'Up' -->
'Down'
2016-09-15 12:02:04,788 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, Custom
Event ID: -1, Message: VM KOM-AD01-PBX02 is down. Exit message: User shut
down from within the guest

If I'm not mistaken, this means that the VM was properly shut down from
within the guest, and in that case it is not restarted automatically. So I'm
curious what actions you took to make host KOM-AD01-VM31
non-responsive?
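
To make the check above repeatable, here is a minimal sketch that scans
engine.log text for "VM ... is down" audit messages and flags whether an
automatic HA restart would be expected. It assumes each log entry is on a
single line (the entries quoted above are wrapped by the mail client), and
the function and constant names are hypothetical, not part of oVirt. The only
exit message treated as a clean shutdown is the one seen in this thread.

```python
import re

# Exit messages after which, per the explanation above, the engine is assumed
# NOT to restart an HA VM. Only this one message is taken from the thread.
CLEAN_SHUTDOWN_MESSAGES = ("User shut down from within the guest",)

# Matches the audit log text "VM <name> is down. Exit message: <message>".
VM_DOWN_RE = re.compile(r"VM (?P<vm>\S+) is down\. Exit message: (?P<msg>.+)")


def classify_vm_down(line):
    """Return (vm_name, exit_message, restart_expected) or None if no match."""
    match = VM_DOWN_RE.search(line)
    if match is None:
        return None
    message = match.group("msg").strip()
    clean = any(message.startswith(m) for m in CLEAN_SHUTDOWN_MESSAGES)
    return match.group("vm"), message, not clean


# The audit log entry from this thread, joined onto one line.
sample = (
    "2016-09-15 12:02:04,788 INFO "
    "[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] "
    "(ForkJoinPool-1-worker-6) [] Correlation ID: null, Call Stack: null, "
    "Custom Event ID: -1, Message: VM KOM-AD01-PBX02 is down. "
    "Exit message: User shut down from within the guest"
)

if __name__ == "__main__":
    print(classify_vm_down(sample))
```

For the entry above, the third element of the result is False, i.e. no
automatic restart is expected because the guest was shut down cleanly.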

If you want to test fencing properly, I suggest that you either block the
connection between the host and the engine on the host side, or forcibly stop
the ovirtmgmt network interface on the host, and then watch whether fencing is
applied.
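
As a rough illustration of the first option, the sketch below only builds
(it does not run) the commands that would drop incoming traffic from the
engine on the host and later remove that rule again. It assumes the host
filters traffic with iptables and that the engine has an IPv4 address; the
address 192.0.2.10 is a documentation placeholder and the function name is
hypothetical. Review the printed commands before running them as root on a
test host.

```python
import ipaddress


def build_block_commands(engine_ip):
    """Return (block, unblock) iptables argument lists for the engine address.

    Assumption: the host uses iptables and the engine is reachable over IPv4.
    """
    # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
    ip = str(ipaddress.IPv4Address(engine_ip))
    # Insert a rule at the top of INPUT that drops packets coming from the engine.
    block = ["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"]
    # Delete the same rule once the fencing test is finished.
    unblock = ["iptables", "-D", "INPUT", "-s", ip, "-j", "DROP"]
    return block, unblock


if __name__ == "__main__":
    block, unblock = build_block_commands("192.0.2.10")  # placeholder address
    print(" ".join(block))
    print(" ".join(unblock))
```

Keeping the removal command next to the blocking one makes it easier to
restore connectivity after the host has been fenced and rebooted.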


Martin


On Thu, Sep 15, 2016 at 1:16 PM, <aleksey.maksimov at it-kb.ru> wrote:

> engine.log for this period.
>
> 15.09.2016, 14:01, "Martin Perina" <mperina at redhat.com>:
> > On Thu, Sep 15, 2016 at 12:47 PM, <aleksey.maksimov at it-kb.ru> wrote:
> >> Hi Martin.
> >> I have a stupid question. Is a Watchdog device mandatory for
> automatically starting a virtual machine during the host fencing process?
> >
> > AFAIK it's not, but I'm not an expert, adding Arik.
> >
> > You need a correct power management setup for the hosts, and the VM has to be
> marked as highly available for sure.
> >
> >> 15.09.2016, 13:43, "Martin Perina" <mperina at redhat.com>:
> >>> Hi,
> >>>
> >>> could you please share the whole engine.log?
> >>>
> >>> Thanks
> >>>
> >>> Martin Perina
> >>>
> >>> On Thu, Sep 15, 2016 at 12:01 PM, <aleksey.maksimov at it-kb.ru> wrote:
> >>>> Hello oVirt gurus!
> >>>>
> >>>> I have oVirt Hosted Engine 4.0.3-1.el7.centos on two CentOS 7.2 hosts
> (HP ProLiant DL 360 G5) connected to shared FC SAN Storage.
> >>>>
> >>>> 1. I configured Power Management for the hosts (successfully added
> a Fencing Agent for the iLO2 of my hosts)
> >>>>
> >>>> 2. I created a new VM (KOM-AD01-PBX02) and installed a guest OS (Ubuntu
> Server 16.04 LTS) and the oVirt Guest Agent
> >>>> (as described here: https://blog.it-kb.ru/2016/09/14/install-ovirt-4-0-part-2-about-data-center-iso-domain-logical-network-vlan-vm-settings-console-guest-agent-live-migration/)
> >>>>    In the VM settings, under "High Availability", I turned on the option
> "Highly Available" and changed "Priority" to "High".
> >>>>
> >>>> 3. Now I'm trying to test hard fencing by powering off my first host
> (KOM-AD01-VM31) from its iLO (KOM-AD01-ILO31).
> >>>>
> >>>> Fencing works successfully and the server is automatically turned on, but
> my HA VM is not started on the second host (KOM-AD01-VM32).
> >>>>
> >>>> These events I see in the oVirt web console:
> >>>>
> >>>> Sep 15, 2016 12:08:13 PM        Host KOM-AD01-VM31 power management
> was verified successfully.
> >>>> Sep 15, 2016 12:08:13 PM        Status of host KOM-AD01-VM31 was set
> to Up.
> >>>> Sep 15, 2016 12:08:05 PM        Executing power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> >>>> Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 is rebooting.
> >>>> Sep 15, 2016 12:05:48 PM        Host KOM-AD01-VM31 was started by
> SYSTEM.
> >>>> Sep 15, 2016 12:05:48 PM        Power management start of Host
> KOM-AD01-VM31 succeeded.
> >>>> Sep 15, 2016 12:05:41 PM        Executing power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> >>>> Sep 15, 2016 12:05:19 PM        Executing power management start on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> >>>> Sep 15, 2016 12:05:19 PM        Power management start of Host
> KOM-AD01-VM31 initiated.
> >>>> Sep 15, 2016 12:05:19 PM        Auto fence for host KOM-AD01-VM31 was
> started.
> >>>> Sep 15, 2016 12:05:11 PM        Executing power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> >>>> Sep 15, 2016 12:05:04 PM        Executing power management status on
> Host KOM-AD01-VM31 using Proxy Host KOM-AD01-VM32 and Fence Agent ilo:
> KOM-AD01-ILO31.holding.com.
> >>>> Sep 15, 2016 12:05:04 PM        Host KOM-AD01-VM31 is non responsive.
> >>>> Sep 15, 2016 12:02:32 PM        Host KOM-AD01-VM31 is not responding.
> It will stay in Connecting state for a grace period of 60 seconds and after
> that an attempt to fence the host will be issued.
> >>>> Sep 15, 2016 12:02:32 PM        VDSM KOM-AD01-VM31 command failed:
> Heartbeat exeeded
> >>>> Sep 15, 2016 12:02:04 PM        VM KOM-AD01-PBX02 is down. Exit
> message: User shut down from within the guest
> >>>>
> >>>> What am I doing wrong? Why does the HA VM not start on the second host?