[ovirt-users] VM failover with ovirt3.5

Artyom Lukianov alukiano at redhat.com
Thu Jan 8 16:46:08 UTC 2015


So, the behavior for a non-HE HA VM is:
1) If the VM crashes (for some reason), it is restarted automatically on another host in the same cluster.
2) If something happens to the host where the HA VM runs (network problem, power outage), the VM drops to an unknown state. If you want the engine to start this VM on another host, you need to click "Confirm Host has been Rebooted" in the problematic host's menu; once you confirm this, the engine will start the VM on another host and also release the SPM role from the problematic host (if it is the SPM, of course).
 

----- Original Message -----
From: "Cong Yue" <Cong_Yue at alliedtelesis.com>
To: "Artyom Lukianov" <alukiano at redhat.com>
Cc: "cong yue" <yuecong1104 at gmail.com>, stirabos at redhat.com, users at ovirt.org, "Jiri Moskovcak" <jmoskovc at redhat.com>, "Yedidyah Bar David" <didi at redhat.com>, "Sandro Bonazzola" <sbonazzo at redhat.com>
Sent: Wednesday, January 7, 2015 3:00:26 AM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

For case 1, I got the advice that I need to change the 'migration_max_time_per_gib_mem' value inside vdsm.conf. I am doing it, and when I get the result, I will share it with you as well. Thanks.
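
Just so I understand what I am tuning, here is my rough sketch of how that value seems to control the timeout (only my assumption about the relationship; the real logic lives in /usr/share/vdsm/virt/migration.py, and I took 64 s/GiB as the default purely for illustration):

def migration_timeout(mem_size_mib, max_time_per_gib=64):
    # 'migration_max_time_per_gib_mem' from vdsm.conf: seconds allowed per GiB of guest RAM
    mem_gib = mem_size_mib / 1024.0
    return max(mem_gib, 1.0) * max_time_per_gib

# e.g. a 16 GiB guest would get about 16 * 64 = 1024 seconds before the
# migration is given up on, so raising the value widens the window
print(migration_timeout(16 * 1024))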

For case 2, do you mean that I tested normal VM failover the wrong way? Right now, although I shut down host 3 forcibly, the VM running on top of it does not fail over.
What is your advice for this?

Thanks,
Cong



-----Original Message-----
From: Artyom Lukianov [mailto:alukiano at redhat.com]
Sent: Tuesday, January 06, 2015 12:34 AM
To: Yue, Cong
Cc: cong yue; stirabos at redhat.com; users at ovirt.org; Jiri Moskovcak; Yedidyah Bar David; Sandro Bonazzola
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Case 1:
In vdsm.log I can see this one:
Thread-674407::ERROR::2015-01-05 12:09:43,264::migration::259::vm.Vm::(run) vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 245, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/virt/migration.py", line 324, in _startUnderlyingMigration
    None, maxBandwidth)
  File "/usr/share/vdsm/virt/vm.py", line 670, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1264, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: operation aborted: migration job: canceled by client
I see that this kind of thing can happen because the migration time exceeded the configured maximum time for migrations, but anyway we need help from the devs, so I added some to CC.
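
To illustrate what "canceled by client" means here, this is roughly what the monitoring side does as far as I understand it (a simplified sketch, not the actual vdsm code; dom stands for a libvirt.virDomain object):

import time

def watch_migration(dom, timeout, poll=10):
    # jobInfo()[0] is non-zero while the migration job is still active
    start = time.time()
    while dom.jobInfo()[0] != 0:
        if time.time() - start > timeout:
            # aborting the job on the source side is what libvirt reports as
            # "operation aborted: migration job: canceled by client"
            dom.abortJob()
            return False
        time.sleep(poll)
    return True

So giving the migration more time (or more bandwidth) keeps this abort from kicking in.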

Case 2:
An HA VM migrates only in case of some failure on host3, so if your host_3 is OK the VM will continue to run on it.


----- Original Message -----
From: "Cong Yue" <Cong_Yue at alliedtelesis.com>
To: "Artyom Lukianov" <alukiano at redhat.com>
Cc: "cong yue" <yuecong1104 at gmail.com>, stirabos at redhat.com, users at ovirt.org
Sent: Monday, January 5, 2015 7:38:08 PM
Subject: RE: [ovirt-users] VM failover with ovirt3.5

I collected the agent.log and vdsm.log in 2 cases.

Case 1: HE VM failover trial
What I did:
1. Make all hosts engine-up.
2. Set host1 to local maintenance mode. The HE VM is on host1.
3. The HE VM then tries to migrate, but it finally fails. This can be seen in agent.log_hosted_engine_1.
As the logs are very large, I uploaded them to Google Drive. The link is:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdRGJhUXUwejNGRHc
The logs are for the 3 hosts in my environment.

Case 2: non-HE VM failover trial
1. Make all hosts engine-up.
2. Set host2 to local maintenance mode. On host3, there is one VM with HA enabled. Also, for the cluster, "Enable HA reservation" is checked and the resilience policy is set to "Migrating Virtual Machines".
3. But the VM running on top of host3 does not migrate at all.
The logs are uploaded to Google Drive:
https://drive.google.com/drive/#folders/0B9Pi5vvimKTdNU82bWVpZDhDQmM/0B9Pi5vvimKTdd3MzTXZBbmxpNmc


Thanks,
Cong




-----Original Message-----
From: Artyom Lukianov [mailto:alukiano at redhat.com]
Sent: Sunday, January 04, 2015 3:22 AM
To: Yue, Cong
Cc: cong yue; stirabos at redhat.com; users at ovirt.org
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Can you provide vdsm logs:
1) for the HE VM case
2) for the non-HE VM case
Thanks

----- Original Message -----
From: "Cong Yue" <Cong_Yue at alliedtelesis.com>
To: "Artyom Lukianov" <alukiano at redhat.com>
Cc: "cong yue" <yuecong1104 at gmail.com>, stirabos at redhat.com, users at ovirt.org
Sent: Thursday, January 1, 2015 2:32:18 AM
Subject: Re: [ovirt-users] VM failover with ovirt3.5

Thanks for the advice. I applied the patch for clientIF.py as follows:
- port = config.getint('addresses', 'management_port')
+ port = config.get('addresses', 'management_port')
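
To convince myself why that one-character change matters, I reproduced the failure outside vdsm (a toy example on my side; the real expression is the one from /usr/lib/python2.7/site-packages/vdsm/vdscli.py that shows up in the traceback further down the thread):

# cannonizeHostPort() joins the address and the port with plain string concatenation,
# so the port has to be a str; config.getint() handed it over as an int
def cannonize_host_port(addr, port):
    return addr + ':' + port

print(cannonize_host_port('10.0.0.93', '54321'))  # fine: '10.0.0.93:54321'
print(cannonize_host_port('10.0.0.93', 54321))    # TypeError: cannot concatenate 'str' and 'int' objects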

Now there is no fatal error in beam.log, and migration starts to happen when I set the host where the HE VM is to local maintenance mode. But it finally fails with the following log. Also, the HE VM cannot do live migration in my environment.

MainThread::INFO::2014-12-31
19:08:06,197::states::759::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Continuing to monitor migration
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineMigratingAway (score: 2000)
MainThread::INFO::2014-12-31
19:08:06,430::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::ERROR::2014-12-31
19:08:16,490::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
Failed to migrate
Traceback (most recent call last):
 File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 863, in _monitor_migration
   vm_id,
 File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py",
line 85, in run_vds_client_cmd
   response['status']['message'])
DetailedError: Error 47 from migrateStatus: Migration canceled
MainThread::INFO::2014-12-31
19:08:16,501::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1420070896.5 type=state_transition
detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
MainThread::INFO::2014-12-31
19:08:16,502::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition
(EngineMigratingAway-ReinitializeFSM) sent? ignored
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state ReinitializeFSM (score: 0)
MainThread::INFO::2014-12-31
19:08:16,805::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)

Besides, I also tried with other VMs instead of the HE VM, but failover does not happen (migration does not even start to be attempted). I set HA for those VMs. Is there some log I can check for this?

Please kindly advise.

Thanks,
Cong


> On 2014/12/31, at 0:14, "Artyom Lukianov" <alukiano at redhat.com> wrote:
>
> Ok I found this one:
> Thread-1807180::ERROR::2014-12-30 13:02:52,164::migration::165::vm.Vm::(_recover) vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to destroy remote VM
> Traceback (most recent call last):
> File "/usr/share/vdsm/virt/migration.py", line 163, in _recover
>  self.destServer.destroy(self._vm.id)
> AttributeError: 'SourceThread' object has no attribute 'destServer'
> Thread-1807180::ERROR::2014-12-30 13:02:52,165::migration::259::vm.Vm::(run) vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Failed to migrate
> Traceback (most recent call last):
> File "/usr/share/vdsm/virt/migration.py", line 229, in run
>  self._setupVdsConnection()
> File "/usr/share/vdsm/virt/migration.py", line 92, in _setupVdsConnection
>  self._dst, self._vm.cif.bindings['xmlrpc'].serverPort)
> File "/usr/lib/python2.7/site-packages/vdsm/vdscli.py", line 91, in cannonizeHostPort
>  return addr + ':' + port
> TypeError: cannot concatenate 'str' and 'int' objects
>
> We have a bug that is already verified for this one: https://bugzilla.redhat.com/show_bug.cgi?id=1163771, so the patch should be included in the latest builds, but you can also take a look at the patch, edit the files yourself on all your machines, and restart vdsm.
>
> ----- Original Message -----
> From: "cong yue" <yuecong1104 at gmail.com>
> To: alukiano at redhat.com, stirabos at redhat.com, users at ovirt.org
> Cc: "Cong Yue" <Cong_Yue at alliedtelesis.com>
> Sent: Tuesday, December 30, 2014 8:22:47 PM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> The vdsm.log just after I set the host where the HE VM is to local maintenance.
>
> In the log, there is a part like this:
>
> ---
> GuestMonitor-HostedEngine::DEBUG::2014-12-30
> 13:01:03,988::vm::486::vm.Vm::(_getUserCpuTuneInfo)
> vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
> set
> GuestMonitor-HostedEngine::DEBUG::2014-12-30
> 13:01:03,989::vm::486::vm.Vm::(_getUserCpuTuneInfo)
> vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
> set
> GuestMonitor-HostedEngine::DEBUG::2014-12-30
> 13:01:03,990::vm::486::vm.Vm::(_getUserCpuTuneInfo)
> vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
> set
> JsonRpc (StompReactor)::DEBUG::2014-12-30
> 13:01:04,675::stompReactor::98::Broker.StompAdapter::(handle_frame)
> Handling message <StompFrame command='SEND'>
> JsonRpcServer::DEBUG::2014-12-30
> 13:01:04,676::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
> Waiting for request
> Thread-1806995::DEBUG::2014-12-30
> 13:01:04,677::stompReactor::163::yajsonrpc.StompServer::(send) Sending
> response
> JsonRpc (StompReactor)::DEBUG::2014-12-30
> 13:01:04,678::stompReactor::98::Broker.StompAdapter::(handle_frame)
> Handling message <StompFrame command='SEND'>
> JsonRpcServer::DEBUG::2014-12-30
> 13:01:04,679::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
> Waiting for request
> Thread-1806996::DEBUG::2014-12-30
> 13:01:04,681::vm::486::vm.Vm::(_getUserCpuTuneInfo)
> vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
> set
> ---
>
> Is there something wrong with this?
>
> Thanks,
> Cong
>
>
>> From: Artyom Lukianov <alukiano at redhat.com>
>> Date: December 29, 2014 23:13:45 GMT-8
>> To: "Yue, Cong" <Cong_Yue at alliedtelesis.com>
>> Cc: Simone Tiraboschi <stirabos at redhat.com>, "users at ovirt.org"
>> <users at ovirt.org>
>> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>>
>> The HE VM is migrated only by ovirt-ha-agent and not by the engine, but the FatalError is
>> more interesting; can you provide the vdsm.log for this one, please?
>>
>> ----- Original Message -----
>> From: "Cong Yue" <Cong_Yue at alliedtelesis.com>
>> To: "Artyom Lukianov" <alukiano at redhat.com>
>> Cc: "Simone Tiraboschi" <stirabos at redhat.com>, users at ovirt.org
>> Sent: Monday, December 29, 2014 8:29:04 PM
>> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>>
>> I disabled local maintenance mode for all hosts, and then set only the host
>> where the HE VM is to local maintenance mode. The logs are as follows.
>> During the migration of the HE VM, a fatal error occurs. By the
>> way, the HE VM also cannot do live migration, while other VMs can do
>> live migration.
>>
>> ---
>> [root at compute2-3 ~]# hosted-engine --set-maintenance --mode=local
>> You have new mail in /var/spool/mail/root
>> [root at compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
>> MainThread::INFO::2014-12-29
>> 13:16:12,435::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.92 (id: 3, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:22,711::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:22,711::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.92 (id: 3, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:32,978::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:32,978::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:43,272::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:43,272::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:53,316::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>> Engine vm running on localhost
>> MainThread::INFO::2014-12-29
>> 13:16:53,562::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:16:53,562::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:17:03,600::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-29
>> 13:17:03,611::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Trying: notify time=1419877023.61 type=state_transition
>> detail=EngineUp-LocalMaintenanceMigrateVm hostname='compute2-3'
>> MainThread::INFO::2014-12-29
>> 13:17:03,672::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (EngineUp-LocalMaintenanceMigrateVm) sent? sent
>> MainThread::INFO::2014-12-29
>> 13:17:03,911::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>> Score is 0 due to local maintenance mode
>> MainThread::INFO::2014-12-29
>> 13:17:03,912::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenanceMigrateVm (score: 0)
>> MainThread::INFO::2014-12-29
>> 13:17:03,912::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:17:03,960::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Trying: notify time=1419877023.96 type=state_transition
>> detail=LocalMaintenanceMigrateVm-EngineMigratingAway
>> hostname='compute2-3'
>> MainThread::INFO::2014-12-29
>> 13:17:03,980::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (LocalMaintenanceMigrateVm-EngineMigratingAway) sent? sent
>> MainThread::INFO::2014-12-29
>> 13:17:04,218::states::66::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_penalize_memory)
>> Penalizing score by 400 due to low free memory
>> MainThread::INFO::2014-12-29
>> 13:17:04,218::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineMigratingAway (score: 2000)
>> MainThread::INFO::2014-12-29
>> 13:17:04,219::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::ERROR::2014-12-29
>> 13:17:14,251::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
>> Failed to migrate
>> Traceback (most recent call last):
>> File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>> line 863, in _monitor_migration
>> vm_id,
>> File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py",
>> line 85, in run_vds_client_cmd
>> response['status']['message'])
>> DetailedError: Error 12 from migrateStatus: Fatal error during migration
>> MainThread::INFO::2014-12-29
>> 13:17:14,262::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Trying: notify time=1419877034.26 type=state_transition
>> detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
>> MainThread::INFO::2014-12-29
>> 13:17:14,263::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (EngineMigratingAway-ReinitializeFSM) sent? ignored
>> MainThread::INFO::2014-12-29
>> 13:17:14,496::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state ReinitializeFSM (score: 0)
>> MainThread::INFO::2014-12-29
>> 13:17:14,496::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:17:24,536::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-29
>> 13:17:24,547::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Trying: notify time=1419877044.55 type=state_transition
>> detail=ReinitializeFSM-LocalMaintenance hostname='compute2-3'
>> MainThread::INFO::2014-12-29
>> 13:17:24,574::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (ReinitializeFSM-LocalMaintenance) sent? sent
>> MainThread::INFO::2014-12-29
>> 13:17:24,812::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-29
>> 13:17:24,812::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:17:34,851::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-29
>> 13:17:35,095::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-29
>> 13:17:35,095::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-29
>> 13:17:45,130::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-29
>> 13:17:45,368::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-29
>> 13:17:45,368::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> ^C
>> [root at compute2-3 ~]#
>>
>>
>> [root at compute2-3 ~]# hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.94
>> Host ID                            : 1
>> Engine status                      : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score                              : 0
>> Local maintenance                  : True
>> Host timestamp                     : 1014956
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=1014956 (Mon Dec 29 13:20:19 2014)
>> host-id=1
>> score=0
>> maintenance=True
>> state=LocalMaintenance
>>
>>
>> --== Host 2 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.93
>> Host ID                            : 2
>> Engine status                      : {"reason": "vm not running on
>> this host", "health": "bad", "vm": "down", "detail": "unknown"}
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 866019
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=866019 (Mon Dec 29 10:19:45 2014)
>> host-id=2
>> score=2400
>> maintenance=False
>> state=EngineDown
>>
>>
>> --== Host 3 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.92
>> Host ID                            : 3
>> Engine status                      : {"reason": "vm not running on
>> this host", "health": "bad", "vm": "down", "detail": "unknown"}
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 860493
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=860493 (Mon Dec 29 10:20:35 2014)
>> host-id=3
>> score=2400
>> maintenance=False
>> state=EngineDown
>> [root at compute2-3 ~]#
>> ---
>> Thanks,
>> Cong
>>
>>
>>
>> On 2014/12/29, at 8:43, "Artyom Lukianov"
>> <alukiano at redhat.com> wrote:
>>
>> I see that the HE VM runs on the host with IP 10.0.0.94, and the two other hosts are in
>> "Local Maintenance" state, so the VM will not migrate to either of them. Can you
>> try disabling local maintenance on all hosts in the HE environment, then
>> enable "local maintenance" on the host where the HE VM runs, and also provide the output
>> of hosted-engine --vm-status.
>> Failover works in the following way (see the sketch below):
>> 1) if the host where the HE VM runs has a score lower by 800 than some other host in the HE
>> environment, the HE VM will migrate to the host with the best score
>> 2) if something happens to the VM (kernel panic, crash of a service...), the agent will
>> restart the HE VM on another host in the HE environment with a positive score
>> 3) if the host with the HE VM is put into local maintenance, the VM will migrate to another
>> host with a positive score
>> Thanks.
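>> P.S. A tiny sketch of rule 1, just to make the numbers concrete (my own
>> illustration, not the agent's actual code):
>>
>> def should_migrate(local_score, best_remote_score, threshold=800):
>>     # the HE VM only moves when another host beats the local score by the threshold
>>     return best_remote_score - local_score >= threshold
>>
>> print(should_migrate(2400, 2400))   # False - equal scores, the VM stays where it is
>> print(should_migrate(1600, 2400))   # True - an 800-point gap, the VM migrates to the better host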
>>
>> ----- Original Message -----
>> From: "Cong Yue"
>> <Cong_Yue at alliedtelesis.com>
>> To: "Artyom Lukianov" <alukiano at redhat.com>
>> Cc: "Simone Tiraboschi" <stirabos at redhat.com>,
>> users at ovirt.org
>> Sent: Monday, December 29, 2014 6:30:42 PM
>> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>>
>> Thanks and the --vm-status log is as follows:
>> [root at compute2-2 ~]# hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.94
>> Host ID                            : 1
>> Engine status                      : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 1008087
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=1008087 (Mon Dec 29 11:25:51 2014)
>> host-id=1
>> score=2400
>> maintenance=False
>> state=EngineUp
>>
>>
>> --== Host 2 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.93
>> Host ID                            : 2
>> Engine status                      : {"reason": "vm not running on
>> this host", "health": "bad", "vm": "down", "detail": "unknown"}
>> Score                              : 0
>> Local maintenance                  : True
>> Host timestamp                     : 859142
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=859142 (Mon Dec 29 08:25:08 2014)
>> host-id=2
>> score=0
>> maintenance=True
>> state=LocalMaintenance
>>
>>
>> --== Host 3 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.92
>> Host ID                            : 3
>> Engine status                      : {"reason": "vm not running on
>> this host", "health": "bad", "vm": "down", "detail": "unknown"}
>> Score                              : 0
>> Local maintenance                  : True
>> Host timestamp                     : 853615
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=853615 (Mon Dec 29 08:25:57 2014)
>> host-id=3
>> score=0
>> maintenance=True
>> state=LocalMaintenance
>> You have new mail in /var/spool/mail/root
>> [root at compute2-2 ~]#
>>
>> Could you please explain how VM failover works inside ovirt? Is there any
>> other debug option I can enable to check the problem?
>>
>> Thanks,
>> Cong
>>
>>
>> On 2014/12/29, at 1:39, "Artyom Lukianov"
>> <alukiano at redhat.com>
>> wrote:
>>
>> Can you also provide the output of hosted-engine --vm-status please? Last
>> time it was useful, because I do not see anything unusual.
>> Thanks
>>
>> ----- Original Message -----
>> From: "Cong Yue"
>> <Cong_Yue at alliedtelesis.com>
>> To: "Artyom Lukianov"
>> <alukiano at redhat.com>
>> Cc: "Simone Tiraboschi"
>> <stirabos at redhat.com>,
>> users at ovirt.org
>> Sent: Monday, December 29, 2014 7:15:24 AM
>> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>>
>> I also changed the maintenance mode to local on another host. But the VM
>> on this host cannot be migrated either. The logs are as follows.
>>
>> [root at compute2-2 ~]# hosted-engine --set-maintenance --mode=local
>> [root at compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
>> MainThread::INFO::2014-12-28
>> 21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:14,603::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:14,603::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:24,903::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:24,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:35,026::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>> Engine vm is running on host 10.0.0.94 (id 1)
>> MainThread::INFO::2014-12-28
>> 21:09:35,236::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:35,236::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:45,604::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:45,604::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-28
>> 21:09:55,691::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-28
>> 21:09:55,701::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Trying: notify time=1419829795.7 type=state_transition
>> detail=EngineDown-LocalMaintenance hostname='compute2-2'
>> MainThread::INFO::2014-12-28
>> 21:09:55,761::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
>> Success, was notification of state_transition
>> (EngineDown-LocalMaintenance) sent? sent
>> MainThread::INFO::2014-12-28
>> 21:09:55,990::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>> Score is 0 due to local maintenance mode
>> MainThread::INFO::2014-12-28
>> 21:09:55,990::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-28
>> 21:09:55,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> ^C
>> You have new mail in /var/spool/mail/root
>> [root at compute2-2 ~]# ps -ef | grep qemu
>> root     18420  2777  0 21:10 pts/0
>> 00:00:00 grep --color=auto qemu
>> qemu     29809     1  0 Dec19 ?        01:17:20 /usr/libexec/qemu-kvm
>> -name testvm2-2 -S -machine rhel6.5.0,accel=kvm,usb=off -cpu Nehalem
>> -m 500 -realtime mlock=off -smp
>> 1,maxcpus=16,sockets=16,cores=1,threads=1 -uuid
>> c31e97d0-135e-42da-9954-162b5228dce3 -smbios
>> type=1,manufacturer=oVirt,product=oVirt
>> Node,version=7-0.1406.el7.centos.2.5,serial=4C4C4544-0059-3610-8033-B4C04F395931,uuid=c31e97d0-135e-42da-9954-162b5228dce3
>> -no-user-config -nodefaults -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/testvm2-2.monitor,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc
>> base=2014-12-19T20:17:17,driftfix=slew
>> -no-kvm-pit-reinjection
>> -no-hpet -no-shutdown -boot strict=on -device
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
>> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
>> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
>> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
>> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
>> -drive
>> file=/rhev/data-center/00000002-0002-0002-0002-0000000001e4/1dc71096-27c4-4256-b2ac-bd7265525c69/images/5cbeb8c9-4f04-48d0-a5eb-78c49187c550/a0570e8c-9867-4ec4-818f-11e102fc4f9b,if=none,id=drive-virtio-disk0,format=qcow2,serial=5cbeb8c9-4f04-48d0-a5eb-78c49187c550,cache=none,werror=stop,rerror=stop,aio=threads
>> -device
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=29 -device
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:db:94:00,bus=pci.0,addr=0x3
>> -chardev
>> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/c31e97d0-135e-42da-9954-162b5228dce3.com.redhat.rhevm.vdsm,server,nowait
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
>> -chardev
>> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/c31e97d0-135e-42da-9954-162b5228dce3.org.qemu.guest_agent.0,server,nowait
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
>> -chardev spicevmc,id=charchannel2,name=vdagent -device
>> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
>> -spice
>> tls-port=5901,addr=10.0.0.93,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on
>> -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global
>> qxl-vga.vram_size=33554432 -incoming tcp:[::]:49152 -device
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
>> [root at compute2-2 ~]#
>>
>> Thanks,
>> Cong
>>
>>
>> On 2014/12/28, at 20:53, "Yue, Cong"
>> <Cong_Yue at alliedtelesis.com>
>> wrote:
>>
>> I checked it again and confirmed that there is one guest VM running on top
>> of this host. The log is as follows:
>>
>> [root at compute2-1 vdsm]# ps -ef | grep qemu
>> qemu      2983   846  0 Dec19 ?        00:00:00
>> [supervdsmServer] <defunct>
>> root      5489  3053  0 20:49 pts/0
>> 00:00:00 grep --color=auto qemu
>> qemu     26128     1  0 Dec19 ?        01:09:19 /usr/libexec/qemu-kvm
>> -name testvm2 -S -machine rhel6.5.0,accel=kvm,usb=off -cpu Nehalem -m
>> 500 -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1
>> -uuid e46bca87-4df5-4287-844b-90a26fccef33 -smbios
>> type=1,manufacturer=oVirt,product=oVirt
>> Node,version=7-0.1406.el7.centos.2.5,serial=4C4C4544-0030-3310-8059-B8C04F585231,uuid=e46bca87-4df5-4287-844b-90a26fccef33
>> -no-user-config -nodefaults -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/testvm2.monitor,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc
>> base=2014-12-19T20:18:01,driftfix=slew
>> -no-kvm-pit-reinjection
>> -no-hpet -no-shutdown -boot strict=on -device
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
>> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
>> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
>> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
>> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
>> -drive
>> file=/rhev/data-center/00000002-0002-0002-0002-0000000001e4/1dc71096-27c4-4256-b2ac-bd7265525c69/images/b4b5426b-95e3-41af-b286-da245891cdaf/0f688d49-97e3-4f1d-84d4-ac1432d903b3,if=none,id=drive-virtio-disk0,format=qcow2,serial=b4b5426b-95e3-41af-b286-da245891cdaf,cache=none,werror=stop,rerror=stop,aio=threads
>> -device
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:db:94:01,bus=pci.0,addr=0x3
>> -chardev
>> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/e46bca87-4df5-4287-844b-90a26fccef33.com.redhat.rhevm.vdsm,server,nowait
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
>> -chardev
>> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/e46bca87-4df5-4287-844b-90a26fccef33.org.qemu.guest_agent.0,server,nowait
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
>> -chardev spicevmc,id=charchannel2,name=vdagent -device
>> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
>> -spice
>> tls-port=5900,addr=10.0.0.92,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on
>> -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global
>> qxl-vga.vram_size=33554432 -incoming tcp:[::]:49152 -device
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
>> [root at compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
>> MainThread::INFO::2014-12-28
>> 20:49:27,315::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-28
>> 20:49:27,646::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-28
>> 20:49:27,646::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-28
>> 20:49:37,732::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-28
>> 20:49:37,961::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-28
>> 20:49:37,961::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-28
>> 20:49:48,048::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-28
>> 20:49:48,319::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>> Score is 0 due to local maintenance mode
>> MainThread::INFO::2014-12-28
>> 20:49:48,319::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-28
>> 20:49:48,319::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>>
>> Thanks,
>> Cong
>>
>>
>> On 2014/12/28, at 3:46, "Artyom Lukianov"
>> <alukiano at redhat.com>
>> wrote:
>>
>> I see that you set local maintenance on host3, which does not have the engine VM on
>> it, so there is nothing to migrate from this host.
>> If you set local maintenance on host1, the VM should migrate to another host with
>> a positive score.
>> Thanks
>>
>> ----- Original Message -----
>> From: "Cong Yue"
>> <Cong_Yue at alliedtelesis.com>
>> To: "Simone Tiraboschi"
>> <stirabos at redhat.com>
>> Cc:
>> users at ovirt.org
>> Sent: Saturday, December 27, 2014 6:58:32 PM
>> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>>
>> Hi
>>
>> I had a try with "hosted-engine --set-maintenance --mode=local" on
>> compute2-1, which is host 3 in my cluster. From the log, it shows that
>> maintenance mode is detected, but migration does not happen.
>>
>> The logs are as follows. Is there any other config I need to check?
>>
>> [root at compute2-1 vdsm]# hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.94
>> Host ID                            : 1
>> Engine status                      : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 836296
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=836296 (Sat Dec 27 11:42:39 2014)
>> host-id=1
>> score=2400
>> maintenance=False
>> state=EngineUp
>>
>>
>> --== Host 2 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.93
>> Host ID                            : 2
>> Engine status                      : {"reason": "vm not running on
>> this host", "health": "bad", "vm": "down", "detail": "unknown"}
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 687358
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=687358 (Sat Dec 27 08:42:04 2014)
>> host-id=2
>> score=2400
>> maintenance=False
>> state=EngineDown
>>
>>
>> --== Host 3 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.92
>> Host ID                            : 3
>> Engine status                      : {"reason": "vm not running on
>> this host", "health": "bad", "vm": "down", "detail": "unknown"}
>> Score                              : 0
>> Local maintenance                  : True
>> Host timestamp                     : 681827
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=681827 (Sat Dec 27 08:42:40 2014)
>> host-id=3
>> score=0
>> maintenance=True
>> state=LocalMaintenance
>> [root at compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
>> MainThread::INFO::2014-12-27
>> 08:42:41,109::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:42:51,198::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-27
>> 08:42:51,420::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-27
>> 08:42:51,420::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:43:01,507::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-27
>> 08:43:01,773::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-27
>> 08:43:01,773::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:43:11,859::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
>> Local maintenance detected
>> MainThread::INFO::2014-12-27
>> 08:43:12,072::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state LocalMaintenance (score: 0)
>> MainThread::INFO::2014-12-27
>> 08:43:12,072::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>>
>>
>>
>> [root at compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
>> MainThread::INFO::2014-12-27
>> 11:36:28,855::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:36:39,130::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:36:39,130::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:36:49,449::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:36:49,449::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:36:59,739::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:36:59,739::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:37:09,779::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>> Engine vm running on localhost
>> MainThread::INFO::2014-12-27
>> 11:37:10,026::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:37:10,026::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:37:20,331::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-27
>> 11:37:20,331::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>>
>>
>> [root at compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
>> MainThread::INFO::2014-12-27
>> 08:36:12,462::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:22,797::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:22,798::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:32,876::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>> Engine vm is running on host 10.0.0.94 (id 1)
>> MainThread::INFO::2014-12-27
>> 08:36:33,169::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:33,169::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:43,567::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:43,567::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:53,858::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:36:53,858::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:37:04,028::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Global metadata: {'maintenance': False}
>> MainThread::INFO::2014-12-27
>> 08:37:04,028::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host 10.0.0.94 (id 1): {'extra':
>> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=835987
>> (Sat Dec 27 11:37:30
>> 2014)\nhost-id=1\nscore=2400\nmaintenance=False\nstate=EngineUp\n',
>> 'hostname': '10.0.0.94', 'alive': True, 'host-id': 1, 'engine-status':
>> {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400,
>> 'maintenance': False, 'host-ts': 835987}
>> MainThread::INFO::2014-12-27
>> 08:37:04,028::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host 10.0.0.92 (id 3): {'extra':
>> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=681528
>> (Sat Dec 27 08:37:41
>> 2014)\nhost-id=3\nscore=0\nmaintenance=True\nstate=LocalMaintenance\n',
>> 'hostname': '10.0.0.92', 'alive': True, 'host-id': 3, 'engine-status':
>> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
>> 'down', 'detail': 'unknown'}, 'score': 0, 'maintenance': True,
>> 'host-ts': 681528}
>> MainThread::INFO::2014-12-27
>> 08:37:04,028::state_machine::168::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Local (id 2): {'engine-health': {'reason': 'vm not running on this
>> host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'bridge':
>> True, 'mem-free': 15300.0, 'maintenance': False, 'cpu-load': 0.0215,
>> 'gateway': True}
>> MainThread::INFO::2014-12-27
>> 08:37:04,265::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-27
>> 08:37:04,265::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>>
>> Thanks,
>> Cong
>>
>> On 2014/12/22, at 5:29, "Simone Tiraboschi"
>> <stirabos at redhat.com>
>> wrote:
>>
>>
>>
>> ----- Original Message -----
>> From: "Cong Yue"
>> <Cong_Yue at alliedtelesis.com>
>> To: "Simone Tiraboschi"
>> <stirabos at redhat.com>
>> Cc:
>> users at ovirt.org
>> Sent: Friday, December 19, 2014 7:22:10 PM
>> Subject: RE: [ovirt-users] VM failover with ovirt3.5
>>
>> Thanks for the information. These are the logs for my three oVirt nodes.
>> The output of hosted-engine --vm-status shows that the engine state for
>> my 2nd and 3rd oVirt nodes is DOWN.
>> Is this the reason why VM failover does not work in my environment?
>>
>> No, they look OK: you can run the engine VM on a single host at a time.
>>
>> How can I make the engine also work for
>> my 2nd and 3rd oVirt nodes?
>>
>> If you put the host 1 in local maintenance mode ( hosted-engine
>> --set-maintenance --mode=local ) the VM should migrate to host 2; if you
>> reactivate host 1 ( hosted-engine --set-maintenance --mode=none ) and put
>> host 2 in local maintenance mode the VM should migrate again.
>>
>> Can you please try that and post the logs if something goes wrong?
>>
>>
>> --
>> --== Host 1 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.94
>> Host ID                            : 1
>> Engine status                      : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 150475
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=150475 (Fri Dec 19 13:12:18 2014)
>> host-id=1
>> score=2400
>> maintenance=False
>> state=EngineUp
>>
>>
>> --== Host 2 status ==--
>>
>> Status up-to-date                  : True
>> Hostname                           : 10.0.0.93
>> Host ID                            : 2
>> Engine status                      : {"reason": "vm not running on
>> this host", "health": "bad", "vm": "down", "detail": "unknown"}
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 1572
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=1572 (Fri Dec 19 10:12:18 2014)
>> host-id=2
>> score=2400
>> maintenance=False
>> state=EngineDown
>>
>>
>> --== Host 3 status ==--
>>
>> Status up-to-date                  : False
>> Hostname                           : 10.0.0.92
>> Host ID                            : 3
>> Engine status                      : unknown stale-data
>> Score                              : 2400
>> Local maintenance                  : False
>> Host timestamp                     : 987
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=987 (Fri Dec 19 10:09:58 2014)
>> host-id=3
>> score=2400
>> maintenance=False
>> state=EngineDown
>>
>> --
>> And the /var/log/ovirt-hosted-engine-ha/agent.log for the three oVirt nodes is
>> as follows:
>> --
>> 10.0.0.94(hosted-engine-1)
>> ---
>> MainThread::INFO::2014-12-19
>> 13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>> Engine vm running on localhost
>> MainThread::INFO::2014-12-19
>> 13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:14,657::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Global metadata: {'maintenance': False}
>> MainThread::INFO::2014-12-19
>> 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host 10.0.0.93 (id 2): {'extra':
>> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1448
>> (Fri Dec 19 10:10:14
>> 2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
>> 'hostname': '10.0.0.93', 'alive': True, 'host-id': 2, 'engine-status':
>> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
>> 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False,
>> 'host-ts': 1448}
>> MainThread::INFO::2014-12-19
>> 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Host 10.0.0.92 (id 3): {'extra':
>> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=987
>> (Fri Dec 19 10:09:58
>> 2014)\nhost-id=3\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
>> 'hostname': '10.0.0.92', 'alive': True, 'host-id': 3, 'engine-status':
>> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
>> 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False,
>> 'host-ts': 987}
>> MainThread::INFO::2014-12-19
>> 13:10:14,658::state_machine::168::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
>> Local (id 1): {'engine-health': {'health': 'good', 'vm': 'up',
>> 'detail': 'up'}, 'bridge': True, 'mem-free': 1079.0, 'maintenance':
>> False, 'cpu-load': 0.0269, 'gateway': True}
>> MainThread::INFO::2014-12-19
>> 13:10:14,904::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:14,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:25,210::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:25,210::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:35,499::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:35,499::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:45,784::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:45,785::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:56,070::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:10:56,070::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:06,109::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>> Engine vm running on localhost
>> MainThread::INFO::2014-12-19
>> 13:11:06,359::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:06,359::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:16,658::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:16,658::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:26,991::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:26,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:37,341::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineUp (score: 2400)
>> MainThread::INFO::2014-12-19
>> 13:11:37,341::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.93 (id: 2, score: 2400)
>> ----
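
The 'extra' blob in the state_machine refresh lines above is simply a
newline-separated list of key=value pairs. As a minimal sketch (purely
illustrative, not part of the ovirt-hosted-engine-ha code), it can be turned
into a dict like this:

    def parse_extra(extra):
        """Parse the newline-separated key=value metadata seen in agent.log."""
        fields = {}
        for line in extra.strip().split('\n'):
            if '=' in line:
                key, _, value = line.partition('=')
                fields[key] = value
        return fields

    # Example taken from the host 10.0.0.93 entry above:
    extra = ('metadata_parse_version=1\nmetadata_feature_version=1\n'
             'timestamp=1448 (Fri Dec 19 10:10:14 2014)\nhost-id=2\n'
             'score=2400\nmaintenance=False\nstate=EngineDown\n')
    print(parse_extra(extra)['state'])   # -> EngineDown
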
>>
>> 10.0.0.93 (hosted-engine-2)
>> MainThread::INFO::2014-12-19
>> 10:12:18,339::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:18,339::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:28,651::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:28,652::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:39,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:39,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:49,338::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:49,338::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:59,642::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:12:59,642::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:13:10,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Current state EngineDown (score: 2400)
>> MainThread::INFO::2014-12-19
>> 10:13:10,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>> Best remote host 10.0.0.94 (id: 1, score: 2400)
>>
>>
>> 10.0.0.92 (hosted-engine-3)
>> same as 10.0.0.93
>> --
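
Since all three agents just repeat the same monitoring loop, the interesting
question for the failover test is whether the reported state ever changes on
10.0.0.93 or 10.0.0.92 (for example from EngineDown to a start attempt). A
small Python sketch for scanning an agent.log for state changes; illustrative
only, and it assumes the line format shown in the excerpts above:

    import re

    STATE_RE = re.compile(r'Current state (\w+) \(score: (\d+)\)')

    def state_changes(path):
        """Yield (timestamp, state, score) each time the reported state changes."""
        last_state = None
        with open(path) as log:
            for line in log:
                match = STATE_RE.search(line)
                if not match:
                    continue
                state, score = match.group(1), int(match.group(2))
                if state != last_state:
                    parts = line.split('::')
                    timestamp = parts[2] if len(parts) > 2 else ''
                    yield timestamp, state, score
                    last_state = state

    # Example usage on one host:
    # for ts, state, score in state_changes('/var/log/ovirt-hosted-engine-ha/agent.log'):
    #     print(ts, state, score)
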
>>
>> -----Original Message-----
>> From: Simone Tiraboschi [mailto:stirabos at redhat.com]
>> Sent: Friday, December 19, 2014 12:28 AM
>> To: Yue, Cong
>> Cc:
>> users at ovirt.org
>> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>>
>>
>>
>> ----- Original Message -----
>> From: "Cong Yue"
>> <Cong_Yue at alliedtelesis.com>
>> To:
>> users at ovirt.org
>> Sent: Friday, December 19, 2014 2:14:33 AM
>> Subject: [ovirt-users] VM failover with ovirt3.5
>>
>>
>>
>> Hi
>>
>>
>>
>> In my environment, I have 3 oVirt nodes in one cluster, and on top of
>> host-1 there is one VM hosting the oVirt engine.
>>
>> I also have external storage for the cluster to use as the data domain
>> for both the engine and the data.
>>
>> I confirmed live migration works well in my environment.
>>
>> But VM failover seems very buggy if I force one oVirt node to shut
>> down. Sometimes the VM on the node that was shut down can migrate to
>> another host, but it takes more than several minutes.
>>
>> Sometimes it cannot migrate at all, and sometimes the VM only begins to
>> move once the host is back.
>>
>> Can you please check or share the logs under /var/log/ovirt-hosted-engine-ha/ ?
>>
>> Is there any documentation that explains how VM failover works? And are
>> there any reported bugs related to this?
>>
>> http://www.ovirt.org/Features/Self_Hosted_Engine#Agent_State_Diagram
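
In short, each ha-agent publishes a score for its own host, and when the host
running the engine VM becomes unavailable the engine VM should be started on
the best-scoring remaining host. A rough sketch of that selection, using data
shaped like the agent.log metadata above (the real agent implements the full
state machine from the linked diagram, not this simple filter):

    def pick_restart_host(hosts):
        """Pick the host that should try to start the engine VM (simplified)."""
        candidates = [h for h in hosts
                      if h['alive'] and not h['maintenance']
                      and h['state'] == 'EngineDown']
        return max(candidates, key=lambda h: h['score']) if candidates else None

    hosts = [
        {'hostname': '10.0.0.93', 'alive': True, 'maintenance': False,
         'state': 'EngineDown', 'score': 2400},
        {'hostname': '10.0.0.92', 'alive': True, 'maintenance': False,
         'state': 'EngineDown', 'score': 2400},
    ]
    print(pick_restart_host(hosts)['hostname'])   # -> 10.0.0.93 (ties keep list order)
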
>>
>> Thanks in advance,
>>
>> Cong
>>
>>
>>
>>



More information about the Users mailing list