Hi,
I looked at the logs. Host vmh-02 wasn't restarted because the PM
(power management) restart failed through both of the other hosts
(vmh-01 and vmh-03) with this error:
Test Failed, [Powering off machine @ IPMI:10.9.1.11...Failed
So we couldn't restart the HA VMs on other hosts, because we could
not be sure that host vmh-02 was really down.
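The reasoning above can be sketched as a small decision rule. This is only an illustrative model, not the engine's actual code: the names FenceAttempt and can_restart_ha_vms are hypothetical, and the rule assumes a single confirmed power-off is sufficient.

```python
from dataclasses import dataclass


@dataclass
class FenceAttempt:
    """Result of one power-off attempt made through a proxy host (hypothetical type)."""
    proxy: str
    powered_off: bool


def can_restart_ha_vms(attempts):
    """Return True only if at least one proxy confirmed the host is powered off.

    Without such a confirmation the host may still be running its VMs, so
    starting them elsewhere could run the same VM twice.
    """
    return any(a.powered_off for a in attempts)


# Situation described above: both proxies failed to power off vmh-02.
attempts = [
    FenceAttempt(proxy="vmh-01", powered_off=False),
    FenceAttempt(proxy="vmh-03", powered_off=False),
]
print(can_restart_ha_vms(attempts))
```

With both attempts failed, the rule refuses the restart, which matches what was observed for vmh-02.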
I also noticed that even getting the PM status of host vmh-02 is
problematic. The fence agent returned this message:
Power Management test failed for Host vmh-02.Done
but it also reported the operation as successful. This looks very
suspicious!
Could you please run the following command from vmh-01 or vmh-03
to test the PM agent on vmh-02?
fence_ipmilan -a <IP> -l <USER> -p <PASSWORD> -o status -v -P
Here <IP>, <USER> and <PASSWORD> are the values valid for vmh-02.
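To make a mismatch between the agent's message and its exit status easier to spot, the command could be wrapped in a small script. This is a minimal sketch, assuming Python is available on the proxy host. The helper names and the string match on "failed" are assumptions for illustration, and only the fence_ipmilan options already shown above are used.

```python
import subprocess


def classify(returncode, output):
    """Classify one fence agent run from its exit code and printed output.

    Assumption: output containing "failed" together with exit code 0 is
    treated as inconsistent; the real agent's messages may differ.
    """
    says_failed = "failed" in output.lower()
    if returncode == 0 and says_failed:
        return "inconsistent"
    if returncode == 0:
        return "ok"
    return "failed"


def run_status_check(ip, user, password):
    """Run the status command from this thread and classify the result."""
    cmd = ["fence_ipmilan", "-a", ip, "-l", user, "-p", password,
           "-o", "status", "-v", "-P"]
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    print(proc.stdout)
    print("exit code:", proc.returncode)
    return classify(proc.returncode, proc.stdout)


# Example with the message quoted above and a successful exit code:
print(classify(0, "Power Management test failed for Host vmh-02.Done"))
```

Calling run_status_check with the values for vmh-02 from vmh-01 or vmh-03 would print the verbose agent output and the exit code side by side.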
Could you also send us vdsm.log from vmh-01 and vmh-03 so that we
can investigate the details of the fence agent execution
failures?
Thanks a lot
Martin Perina
----- Original Message -----
From: "Siddharth Patil" <siddharth(a)patil.co.uk>
To: users(a)ovirt.org
Sent: Wednesday, February 11, 2015 5:17:28 PM
Subject: Re: [ovirt-users] Fenced hosts VM's never migrate
On 11/02/15, 4:34 PM, Omer Frenkel wrote:
>
>
> ----- Original Message -----
>> From: "Tim Macy" <macytd(a)gmail.com>
>> To: users(a)ovirt.org
>> Sent: Tuesday, February 10, 2015 6:55:31 PM
>> Subject: [ovirt-users] Fenced hosts VM's never migrate
>>
> >> I have a 3-host cluster set up with HA and fencing enabled, and it
> >> appears to be working properly. Power management stop, start, and
> >> restart all work, as does host shutdown/restart following a simulated
> >> crash. When the network is pulled, a proxy is chosen; it powers off the
> >> downed host and then restarts it. Since the network is still down, the
> >> following event repeats:
> >> "Host kvm01 is not responding. It will stay in Connecting state for a
> >> grace period of 162 seconds and after that an attempt to fence the host
> >> will be issued."
>>
> >> The real problem here is that the VMs on the failed host never
> >> migrate to a new host and remain down until the network is reconnected.
>>
>
> > Once the host is powered off by the proxy, HA VMs will be started (not
> > migrated) on another host, if there are resources for them.
> > If you have HA VMs that are not started although another host is
> > available for them, it might be a bug. Can you please attach engine.log
> > from the time of the failure?
>
> >> We have tested this with back-end storage on Gluster and NFS with the same
> >> result. This is on oVirt Engine Version: 3.5.1.1-1.el6. Hosts are on
> >> CentOS 7 and the Engine is standalone on CentOS 6.6.
>>
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users(a)ovirt.org
>>
http://lists.ovirt.org/mailman/listinfo/users
>>
I had the exact same problem during testing yesterday. The HA VMs
never restarted on the other available hosts. The only difference is
that we're using an iSCSI storage backend.
oVirt Engine Version: 3.5.1.1-1.el6 (hosted engine)
Host: CentOS 6.6
Engine logs are attached.
Thanks,
Siddharth