[ovirt-users] VM failover with ovirt3.5

cong yue yuecong1104 at gmail.com
Tue Dec 30 18:22:47 UTC 2014


This is the vdsm.log from just after I set the host running the HE VM to local
maintenance.

The log contains entries like the following:

---
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,988::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,989::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,990::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
JsonRpc (StompReactor)::DEBUG::2014-12-30
13:01:04,675::stompReactor::98::Broker.StompAdapter::(handle_frame)
Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2014-12-30
13:01:04,676::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
Waiting for request
Thread-1806995::DEBUG::2014-12-30
13:01:04,677::stompReactor::163::yajsonrpc.StompServer::(send) Sending
response
JsonRpc (StompReactor)::DEBUG::2014-12-30
13:01:04,678::stompReactor::98::Broker.StompAdapter::(handle_frame)
Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2014-12-30
13:01:04,679::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
Waiting for request
Thread-1806996::DEBUG::2014-12-30
13:01:04,681::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
---

Is something wrong with this?

Thanks,
Cong


> From: Artyom Lukianov <alukiano at redhat.com>
> Date: December 29, 2014 23:13:45 GMT-8
> To: "Yue, Cong" <Cong_Yue at alliedtelesis.com>
> Cc: Simone Tiraboschi <stirabos at redhat.com>, "users at ovirt.org"
> <users at ovirt.org>
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> The HE VM is migrated only by ovirt-ha-agent, not by the engine. The FatalError is
> more interesting, though; can you provide the vdsm.log for this one, please?
>
> ----- Original Message -----
> From: "Cong Yue" <Cong_Yue at alliedtelesis.com>
> To: "Artyom Lukianov" <alukiano at redhat.com>
> Cc: "Simone Tiraboschi" <stirabos at redhat.com>, users at ovirt.org
> Sent: Monday, December 29, 2014 8:29:04 PM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> I disabled local maintenance mode on all hosts, and then set only the host
> running the HE VM to local maintenance mode. The logs are as follows.
> A fatal error is reported during the migration of the HE VM. By the
> way, live migration also does not work for the HE VM, although other VMs can
> be live migrated.
>
> ---
> [root@compute2-3 ~]# hosted-engine --set-maintenance --mode=local
> You have new mail in /var/spool/mail/root
> [root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-29
> 13:16:12,435::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.92 (id: 3, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:22,711::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:22,711::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.92 (id: 3, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:32,978::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:32,978::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:43,272::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:43,272::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:53,316::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-29
> 13:16:53,562::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:53,562::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:03,600::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:03,611::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877023.61 type=state_transition
> detail=EngineUp-LocalMaintenanceMigrateVm hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:03,672::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (EngineUp-LocalMaintenanceMigrateVm) sent? sent
> MainThread::INFO::2014-12-29
> 13:17:03,911::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Score is 0 due to local maintenance mode
> MainThread::INFO::2014-12-29
> 13:17:03,912::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenanceMigrateVm (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:03,912::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:03,960::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877023.96 type=state_transition
> detail=LocalMaintenanceMigrateVm-EngineMigratingAway
> hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:03,980::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (LocalMaintenanceMigrateVm-EngineMigratingAway) sent? sent
> MainThread::INFO::2014-12-29
> 13:17:04,218::states::66::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_penalize_memory)
> Penalizing score by 400 due to low free memory
> MainThread::INFO::2014-12-29
> 13:17:04,218::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineMigratingAway (score: 2000)
> MainThread::INFO::2014-12-29
> 13:17:04,219::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::ERROR::2014-12-29
> 13:17:14,251::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
> Failed to migrate
> Traceback (most recent call last):
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 863, in _monitor_migration
>   vm_id,
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py",
> line 85, in run_vds_client_cmd
>   response['status']['message'])
> DetailedError: Error 12 from migrateStatus: Fatal error during migration
> MainThread::INFO::2014-12-29
> 13:17:14,262::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877034.26 type=state_transition
> detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:14,263::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (EngineMigratingAway-ReinitializeFSM) sent? ignored
> MainThread::INFO::2014-12-29
> 13:17:14,496::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state ReinitializeFSM (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:14,496::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:24,536::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:24,547::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877044.55 type=state_transition
> detail=ReinitializeFSM-LocalMaintenance hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:24,574::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (ReinitializeFSM-LocalMaintenance) sent? sent
> MainThread::INFO::2014-12-29
> 13:17:24,812::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:24,812::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:34,851::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:35,095::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:35,095::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:45,130::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:45,368::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:45,368::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> ^C
> [root@compute2-3 ~]#
>
>
> [root@compute2-3 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.94
> Host ID                            : 1
> Engine status                      : {"health": "good", "vm": "up",
> "detail": "up"}
> Score                              : 0
> Local maintenance                  : True
> Host timestamp                     : 1014956
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1014956 (Mon Dec 29 13:20:19 2014)
> host-id=1
> score=0
> maintenance=True
> state=LocalMaintenance
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.93
> Host ID                            : 2
> Engine status                      : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 866019
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=866019 (Mon Dec 29 10:19:45 2014)
> host-id=2
> score=2400
> maintenance=False
> state=EngineDown
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.92
> Host ID                            : 3
> Engine status                      : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 860493
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=860493 (Mon Dec 29 10:20:35 2014)
> host-id=3
> score=2400
> maintenance=False
> state=EngineDown
> [root@compute2-3 ~]#
> ---
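The agent.log excerpt above records each state change as a `detail=From-To` token on the notify lines. As a rough aid for reading such logs, here is a minimal Python sketch that lists the transitions in order. It assumes the token always has that shape; the function name `state_transitions` and the regular expression are illustrative and are not part of ovirt-hosted-engine-ha.

```python
import re

# Matches the "detail=From-To" token seen in the notify lines above.
# Assumption: state names contain only letters, digits and underscores.
TRANSITION = re.compile(r"detail=(\w+)-(\w+)")


def state_transitions(log_text):
    """Return (from_state, to_state) pairs in the order they were logged."""
    return TRANSITION.findall(log_text)


# Lines copied from the agent.log excerpt in this message.
sample = """
Trying: notify time=1419877023.61 type=state_transition
detail=EngineUp-LocalMaintenanceMigrateVm hostname='compute2-3'
Trying: notify time=1419877023.96 type=state_transition
detail=LocalMaintenanceMigrateVm-EngineMigratingAway
hostname='compute2-3'
Trying: notify time=1419877034.26 type=state_transition
detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
Trying: notify time=1419877044.55 type=state_transition
detail=ReinitializeFSM-LocalMaintenance hostname='compute2-3'
"""

for src, dst in state_transitions(sample):
    print(src, "->", dst)
```

Read this way, the excerpt shows the agent going from EngineUp into the migration states, falling back to ReinitializeFSM after the failed migration, and then settling in LocalMaintenance.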
> Thanks,
> Cong
>
>
>
> On 2014/12/29, at 8:43, "Artyom Lukianov"
> <alukiano at redhat.com> wrote:
>
> I see that the HE VM runs on the host with IP 10.0.0.94, and the two other
> hosts are in the "Local Maintenance" state, so the VM will not migrate to
> either of them. Can you try disabling local maintenance on all hosts in the
> HE environment, then enabling "local maintenance" on the host where the HE VM
> runs, and also provide the output of hosted-engine --vm-status?
> Failover works in the following way:
> 1) if the host running the HE VM has a score lower by 800 than some other
> host in the HE environment, the HE VM will migrate to the host with the best score
> 2) if something happens to the VM (kernel panic, crash of a service...), the
> agent will restart the HE VM on another host in the HE environment with a positive score
> 3) if the host with the HE VM is put into local maintenance, the VM will
> migrate to another host with a positive score
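The three failover rules listed above can be sketched as a small decision function. This is only an illustration of the rules as described in this message, not the agent's actual code. The names `Host`, `pick_target` and `SCORE_GAP` are hypothetical, and two details are assumptions: the 800-point comparison is treated as inclusive, and the best-scoring host is preferred in all three cases.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold quoted in the message above; whether the comparison is
# inclusive is an assumption of this sketch.
SCORE_GAP = 800


@dataclass
class Host:
    name: str
    score: int
    local_maintenance: bool = False


def pick_target(current: Host, others: List[Host],
                vm_healthy: bool = True) -> Optional[Host]:
    """Return the host the HE VM should move to, or None to stay put."""
    # Only hosts with a positive score are eligible (rules 2 and 3).
    candidates = [h for h in others if h.score > 0]
    if not candidates:
        return None
    # Assumption: prefer the best-scoring candidate in every case.
    best = max(candidates, key=lambda h: h.score)
    if current.local_maintenance or not vm_healthy:
        return best  # rules 3 and 2
    if best.score - current.score >= SCORE_GAP:
        return best  # rule 1
    return None
```

With 10.0.0.94 in local maintenance and the other two hosts at score 2400, the sketch picks one of them as the target. With both other hosts at score 0 it returns None, which matches the situation where nothing could migrate.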
> Thanks.
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue at alliedtelesis.com>
> To: "Artyom Lukianov" <alukiano at redhat.com>
> Cc: "Simone Tiraboschi" <stirabos at redhat.com>,
> users at ovirt.org
> Sent: Monday, December 29, 2014 6:30:42 PM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> Thanks and the --vm-status log is as follows:
> [root@compute2-2 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.94
> Host ID                            : 1
> Engine status                      : {"health": "good", "vm": "up",
> "detail": "up"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 1008087
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1008087 (Mon Dec 29 11:25:51 2014)
> host-id=1
> score=2400
> maintenance=False
> state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.93
> Host ID                            : 2
> Engine status                      : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 0
> Local maintenance                  : True
> Host timestamp                     : 859142
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=859142 (Mon Dec 29 08:25:08 2014)
> host-id=2
> score=0
> maintenance=True
> state=LocalMaintenance
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.92
> Host ID                            : 3
> Engine status                      : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 0
> Local maintenance                  : True
> Host timestamp                     : 853615
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=853615 (Mon Dec 29 08:25:57 2014)
> host-id=3
> score=0
> maintenance=True
> state=LocalMaintenance
> You have new mail in /var/spool/mail/root
> [root@compute2-2 ~]#
>
> Could you please explain how VM failover works inside ovirt? Is there any
> other debug option I can enable to check the problem?
>
> Thanks,
> Cong
>
>
> On 2014/12/29, at 1:39, "Artyom Lukianov"
> <alukiano at redhat.com>
> wrote:
>
> Can you also provide the output of hosted-engine --vm-status, please? It was
> useful last time, because I do not see anything unusual here.
> Thanks
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue at alliedtelesis.com>
> To: "Artyom Lukianov"
> <alukiano at redhat.com>
> Cc: "Simone Tiraboschi"
> <stirabos at redhat.com>,
> users at ovirt.org
> Sent: Monday, December 29, 2014 7:15:24 AM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> I also changed the maintenance mode to local on another host, but the VM
> on that host cannot be migrated either. The logs are as follows.
>
> [root@compute2-2 ~]# hosted-engine --set-maintenance --mode=local
> [root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-28
> 21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:14,603::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:14,603::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:24,903::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:24,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:35,026::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm is running on host 10.0.0.94 (id 1)
> MainThread::INFO::2014-12-28
> 21:09:35,236::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:35,236::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:45,604::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:45,604::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:55,691::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 21:09:55,701::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419829795.7 type=state_transition
> detail=EngineDown-LocalMaintenance hostname='compute2-2'
> MainThread::INFO::2014-12-28
> 21:09:55,761::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (EngineDown-LocalMaintenance) sent? sent
> MainThread::INFO::2014-12-28
> 21:09:55,990::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Score is 0 due to local maintenance mode
> MainThread::INFO::2014-12-28
> 21:09:55,990::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 21:09:55,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> ^C
> You have new mail in /var/spool/mail/root
> [root@compute2-2 ~]# ps -ef | grep qemu
> root     18420  2777  0 21:10 pts/0
> 00:00:00 grep --color=auto qemu
> qemu     29809     1  0 Dec19 ?        01:17:20 /usr/libexec/qemu-kvm
> -name testvm2-2 -S -machine rhel6.5.0,accel=kvm,usb=off -cpu Nehalem
> -m 500 -realtime mlock=off -smp
> 1,maxcpus=16,sockets=16,cores=1,threads=1 -uuid
> c31e97d0-135e-42da-9954-162b5228dce3 -smbios
> type=1,manufacturer=oVirt,product=oVirt
> Node,version=7-0.1406.el7.centos.2.5,serial=4C4C4544-0059-3610-8033-B4C04F395931,uuid=c31e97d0-135e-42da-9954-162b5228dce3
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/testvm2-2.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=2014-12-19T20:17:17,driftfix=slew
> -no-kvm-pit-reinjection
> -no-hpet -no-shutdown -boot strict=on -device
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
> -drive
> file=/rhev/data-center/00000002-0002-0002-0002-0000000001e4/1dc71096-27c4-4256-b2ac-bd7265525c69/images/5cbeb8c9-4f04-48d0-a5eb-78c49187c550/a0570e8c-9867-4ec4-818f-11e102fc4f9b,if=none,id=drive-virtio-disk0,format=qcow2,serial=5cbeb8c9-4f04-48d0-a5eb-78c49187c550,cache=none,werror=stop,rerror=stop,aio=threads
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=29 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:db:94:00,bus=pci.0,addr=0x3
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/c31e97d0-135e-42da-9954-162b5228dce3.com.redhat.rhevm.vdsm,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
> -chardev
> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/c31e97d0-135e-42da-9954-162b5228dce3.org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel2,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
> -spice
> tls-port=5901,addr=10.0.0.93,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on
> -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global
> qxl-vga.vram_size=33554432 -incoming tcp:[::]:49152 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
> [root@compute2-2 ~]#
>
> Thanks,
> Cong
>
>
> On 2014/12/28, at 20:53, "Yue, Cong"
> <Cong_Yue at alliedtelesis.com>
> wrote:
>
> I checked again and confirmed that one guest VM is running on
> this host. The log is as follows:
>
> [root@compute2-1 vdsm]# ps -ef | grep qemu
> qemu      2983   846  0 Dec19 ?        00:00:00
> [supervdsmServer] <defunct>
> root      5489  3053  0 20:49 pts/0
> 00:00:00 grep --color=auto qemu
> qemu     26128     1  0 Dec19 ?        01:09:19 /usr/libexec/qemu-kvm
> -name testvm2 -S -machine rhel6.5.0,accel=kvm,usb=off -cpu Nehalem -m
> 500 -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1
> -uuid e46bca87-4df5-4287-844b-90a26fccef33 -smbios
> type=1,manufacturer=oVirt,product=oVirt
> Node,version=7-0.1406.el7.centos.2.5,serial=4C4C4544-0030-3310-8059-B8C04F585231,uuid=e46bca87-4df5-4287-844b-90a26fccef33
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/testvm2.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=2014-12-19T20:18:01,driftfix=slew
> -no-kvm-pit-reinjection
> -no-hpet -no-shutdown -boot strict=on -device
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
> -drive
> file=/rhev/data-center/00000002-0002-0002-0002-0000000001e4/1dc71096-27c4-4256-b2ac-bd7265525c69/images/b4b5426b-95e3-41af-b286-da245891cdaf/0f688d49-97e3-4f1d-84d4-ac1432d903b3,if=none,id=drive-virtio-disk0,format=qcow2,serial=b4b5426b-95e3-41af-b286-da245891cdaf,cache=none,werror=stop,rerror=stop,aio=threads
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:db:94:01,bus=pci.0,addr=0x3
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/e46bca87-4df5-4287-844b-90a26fccef33.com.redhat.rhevm.vdsm,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
> -chardev
> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/e46bca87-4df5-4287-844b-90a26fccef33.org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel2,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
> -spice
> tls-port=5900,addr=10.0.0.92,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on
> -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global
> qxl-vga.vram_size=33554432 -incoming tcp:[::]:49152 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
> [root@compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-28
> 20:49:27,315::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 20:49:27,646::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 20:49:27,646::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 20:49:37,732::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 20:49:37,961::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 20:49:37,961::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 20:49:48,048::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 20:49:48,319::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Score is 0 due to local maintenance mode
> MainThread::INFO::2014-12-28
> 20:49:48,319::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 20:49:48,319::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
>
> Thanks,
> Cong
>
>
> On 2014/12/28, at 3:46, "Artyom Lukianov"
> <alukiano at redhat.com>
> wrote:
>
> I see that you set local maintenance on host3, which does not have the engine
> VM on it, so there is nothing to migrate from this host.
> If you set local maintenance on host1, the VM must migrate to another host
> with a positive score.
> Thanks
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue at alliedtelesis.com>
> To: "Simone Tiraboschi"
> <stirabos at redhat.com>
> Cc:
> users at ovirt.org
> Sent: Saturday, December 27, 2014 6:58:32 PM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> Hi
>
> I tried "hosted-engine --set-maintenance --mode=local" on
> compute2-1, which is host 3 in my cluster. The log shows that
> maintenance mode is detected, but migration does not happen.
>
> The logs are as follows. Is there any other config I need to check?
>
> [root@compute2-1 vdsm]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.94
> Host ID                            : 1
> Engine status                      : {"health": "good", "vm": "up",
> "detail": "up"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 836296
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=836296 (Sat Dec 27 11:42:39 2014)
> host-id=1
> score=2400
> maintenance=False
> state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.93
> Host ID                            : 2
> Engine status                      : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 687358
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=687358 (Sat Dec 27 08:42:04 2014)
> host-id=2
> score=2400
> maintenance=False
> state=EngineDown
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.92
> Host ID                            : 3
> Engine status                      : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 0
> Local maintenance                  : True
> Host timestamp                     : 681827
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=681827 (Sat Dec 27 08:42:40 2014)
> host-id=3
> score=0
> maintenance=True
> state=LocalMaintenance
> [root@compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-27
> 08:42:41,109::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:42:51,198::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-27
> 08:42:51,420::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-27
> 08:42:51,420::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:43:01,507::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-27
> 08:43:01,773::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-27
> 08:43:01,773::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:43:11,859::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-27
> 08:43:12,072::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-27
> 08:43:12,072::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
>
>
>
> [root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-27
> 11:36:28,855::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:39,130::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:39,130::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:49,449::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:49,449::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:59,739::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:59,739::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:09,779::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-27
> 11:37:10,026::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:10,026::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:20,331::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:20,331::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
>
>
> [root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-27
> 08:36:12,462::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:22,797::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:22,798::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:32,876::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm is running on host 10.0.0.94 (id 1)
> MainThread::INFO::2014-12-27
> 08:36:33,169::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:33,169::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:43,567::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:43,567::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:53,858::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:53,858::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Global metadata: {'maintenance': False}
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.94 (id 1): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=835987
> (Sat Dec 27 11:37:30
> 2014)\nhost-id=1\nscore=2400\nmaintenance=False\nstate=EngineUp\n',
> 'hostname': '10.0.0.94', 'alive': True, 'host-id': 1, 'engine-status':
> {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400,
> 'maintenance': False, 'host-ts': 835987}
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.92 (id 3): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=681528
> (Sat Dec 27 08:37:41
> 2014)\nhost-id=3\nscore=0\nmaintenance=True\nstate=LocalMaintenance\n',
> 'hostname': '10.0.0.92', 'alive': True, 'host-id': 3, 'engine-status':
> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
> 'down', 'detail': 'unknown'}, 'score': 0, 'maintenance': True,
> 'host-ts': 681528}
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::168::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Local (id 2): {'engine-health': {'reason': 'vm not running on this
> host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'bridge':
> True, 'mem-free': 15300.0, 'maintenance': False, 'cpu-load': 0.0215,
> 'gateway': True}
> MainThread::INFO::2014-12-27
> 08:37:04,265::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:37:04,265::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
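The (refresh) records in the log above dump each remote host's metadata as a Python-style dict literal. As a rough sketch only (this is not part of ovirt-hosted-engine-ha), such a record could be decoded as follows. It assumes the record has the shape pasted in this thread and that mail wrapping only added line breaks where the original log line had spaces. The names HOST_DUMP and parse_host_dumps are hypothetical.

```python
import ast
import re

# Matches "Host <name> (id <n>): {...}" up to the next log record or end of text.
# Hypothetical helper pattern, written against the log shape quoted in this thread.
HOST_DUMP = re.compile(
    r"\bHost (\S+) \(id (\d+)\): (\{.*?\})(?=\s*MainThread::|\s*\Z)", re.S
)


def parse_host_dumps(text):
    """Return {hostname: summary} for every per-host metadata dump in text."""
    text = re.sub(r"^>[ \t]?", "", text, flags=re.M)  # drop mail quoting, if any
    hosts = {}
    for hostname, host_id, literal in HOST_DUMP.findall(text):
        # Undo mail wrapping, then read the dump as a Python literal.
        data = ast.literal_eval(" ".join(literal.split()))
        # 'extra' holds key=value lines separated by "\n".
        extra = dict(
            line.split("=", 1) for line in data["extra"].splitlines() if "=" in line
        )
        hosts[hostname] = {
            "id": int(host_id),
            "score": data["score"],
            "maintenance": data["maintenance"],
            "vm": data["engine-status"]["vm"],
            "state": extra.get("state"),
        }
    return hosts


# One record copied from the compute2-2 log above (raw string keeps "\n" literal).
SAMPLE = r"""
MainThread::INFO::2014-12-27
08:37:04,028::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
Host 10.0.0.94 (id 1): {'extra':
'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=835987
(Sat Dec 27 11:37:30
2014)\nhost-id=1\nscore=2400\nmaintenance=False\nstate=EngineUp\n',
'hostname': '10.0.0.94', 'alive': True, 'host-id': 1, 'engine-status':
{'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400,
'maintenance': False, 'host-ts': 835987}
"""

print(parse_host_dumps(SAMPLE))
```

Reading the dump with ast.literal_eval rather than eval keeps the sketch safe to run on untrusted log text, since only literals are accepted.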
>
> Thanks,
> Cong
>
> On 2014/12/22, at 5:29, "Simone Tiraboschi"
> <stirabos at redhat.com>
> wrote:
>
>
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue at alliedtelesis.com>
> To: "Simone Tiraboschi"
> <stirabos at redhat.com>
> Cc:
> users at ovirt.org
> Sent: Friday, December 19, 2014 7:22:10 PM
> Subject: RE: [ovirt-users] VM failover with ovirt3.5
>
> Thanks for the information. These are the logs for my three oVirt nodes.
> The output of hosted-engine --vm-status shows that the engine state for
> my 2nd and 3rd oVirt nodes is DOWN.
> Is this the reason why VM failover does not work in my environment?
>
> No, they look OK: the engine VM can run on only a single host at a time.
>
> How can I make
> the engine also work on my 2nd and 3rd oVirt nodes?
>
> If you put the host 1 in local maintenance mode ( hosted-engine
> --set-maintenance --mode=local ) the VM should migrate to host 2; if you
> reactivate host 1 ( hosted-engine --set-maintenance --mode=none ) and put
> host 2 in local maintenance mode the VM should migrate again.
>
> Can you please try that and post the logs if something goes wrong?
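To check the result of the maintenance test described above, the output of hosted-engine --vm-status can be inspected for the host that reports the engine VM as up. The following is a minimal sketch, assuming the output has the layout pasted in this thread (including the mail-wrapped "Engine status" value). The function names parse_vm_status and engine_hosts are made up for illustration and are not part of any oVirt tool.

```python
import json
import re

# Single-line "Key : value" fields worth keeping from each host block.
FIELD = re.compile(
    r"^(Hostname|Host ID|Score|Local maintenance)[ \t]+:[ \t]+(.*)$", re.M
)
# The engine status is a flat JSON object that may be wrapped over two lines.
ENGINE = re.compile(r"^Engine status[ \t]+:\s+(\{.*?\})", re.M | re.S)


def parse_vm_status(text):
    """Split `hosted-engine --vm-status` text into one dict per host."""
    text = re.sub(r"^>[ \t]?", "", text, flags=re.M)  # drop mail quoting, if any
    hosts = []
    # Every host block starts with a "--== Host N status ==--" banner.
    for block in re.split(r"--== Host \d+ status ==--", text)[1:]:
        host = {key: value.strip() for key, value in FIELD.findall(block)}
        match = ENGINE.search(block)
        # Non-JSON values such as "unknown stale-data" are stored as None.
        host["engine"] = (
            json.loads(" ".join(match.group(1).split())) if match else None
        )
        hosts.append(host)
    return hosts


def engine_hosts(hosts):
    """Hostnames whose engine status reports the VM as up."""
    return [
        h["Hostname"] for h in hosts if h["engine"] and h["engine"].get("vm") == "up"
    ]


# Two host blocks copied (shortened) from the status output in this thread.
SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : 10.0.0.94
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up",
"detail": "up"}
Score                              : 2400
Local maintenance                  : False

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : 10.0.0.93
Host ID                            : 2
Engine status                      : {"reason": "vm not running on
this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 2400
Local maintenance                  : False
"""

print(engine_hosts(parse_vm_status(SAMPLE)))
```

After a successful migration, the list should contain exactly one hostname, and it should differ from the host that was put into local maintenance.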
>
>
> --
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.94
> Host ID                            : 1
> Engine status                      : {"health": "good", "vm": "up",
> "detail": "up"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 150475
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=150475 (Fri Dec 19 13:12:18 2014)
> host-id=1
> score=2400
> maintenance=False
> state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.93
> Host ID                            : 2
> Engine status                      : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 1572
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1572 (Fri Dec 19 10:12:18 2014)
> host-id=2
> score=2400
> maintenance=False
> state=EngineDown
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : False
> Hostname                           : 10.0.0.92
> Host ID                            : 3
> Engine status                      : unknown stale-data
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 987
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=987 (Fri Dec 19 10:09:58 2014)
> host-id=3
> score=2400
> maintenance=False
> state=EngineDown
>
> --
> And the /var/log/ovirt-hosted-engine-ha/agent.log files for the three oVirt
> nodes are as follows:
> --
> 10.0.0.94(hosted-engine-1)
> ---
> MainThread::INFO::2014-12-19
> 13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-19
> 13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:14,657::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Global metadata: {'maintenance': False}
> MainThread::INFO::2014-12-19
> 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.93 (id 2): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1448
> (Fri Dec 19 10:10:14
> 2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
> 'hostname': '10.0.0.93', 'alive': True, 'host-id': 2, 'engine-status':
> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
> 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False,
> 'host-ts': 1448}
> MainThread::INFO::2014-12-19
> 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.92 (id 3): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=987
> (Fri Dec 19 10:09:58
> 2014)\nhost-id=3\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
> 'hostname': '10.0.0.92', 'alive': True, 'host-id': 3, 'engine-status':
> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
> 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False,
> 'host-ts': 987}
> MainThread::INFO::2014-12-19
> 13:10:14,658::state_machine::168::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Local (id 1): {'engine-health': {'health': 'good', 'vm': 'up',
> 'detail': 'up'}, 'bridge': True, 'mem-free': 1079.0, 'maintenance':
> False, 'cpu-load': 0.0269, 'gateway': True}
> MainThread::INFO::2014-12-19
> 13:10:14,904::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:14,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:25,210::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:25,210::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:35,499::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:35,499::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:45,784::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:45,785::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:56,070::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:56,070::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:06,109::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-19
> 13:11:06,359::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:06,359::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:16,658::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:16,658::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:26,991::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:26,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:37,341::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:37,341::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> ----
>
> 10.0.0.93 (hosted-engine-2)
> MainThread::INFO::2014-12-19
> 10:12:18,339::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:18,339::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:28,651::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:28,652::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:39,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:39,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:49,338::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:49,338::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:59,642::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:59,642::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:13:10,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:13:10,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
>
>
> 10.0.0.92(hosted-engine-3)
> same as 10.0.0.93
> --
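The agent.log excerpts above repeat the same "Current state" line every ten seconds or so, which makes an actual state change easy to miss. A small sketch that collapses such a log into its state changes is shown below. It assumes the record layout seen in this thread (MainThread::INFO::date time,ms::module::line::logger::(function) message) and tolerates the line wrapping and "> " quoting added by mail. RECORD, STATE and state_changes are hypothetical names.

```python
import re

# One agent.log record, possibly wrapped over several lines by the mail client.
RECORD = re.compile(
    r"MainThread::INFO::(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}),\d+::\S+?::\((?P<func>\w+)\)\s+"
    r"(?P<msg>.*?)(?=\s*MainThread::|\s*\Z)",
    re.S,
)
STATE = re.compile(r"Current state (\w+) \(score: (\d+)\)")


def state_changes(text):
    """Return (timestamp, state, score) each time the state or score changes."""
    text = re.sub(r"^>[ \t]?", "", text, flags=re.M)  # drop mail quoting, if any
    changes = []
    for record in RECORD.finditer(text):
        match = STATE.match(" ".join(record.group("msg").split()))
        if match is None:
            continue  # not a "Current state" record
        state, score = match.group(1), int(match.group(2))
        # Keep only the first record of each run of identical state and score.
        if not changes or changes[-1][1:] != (state, score):
            stamp = f"{record.group('date')} {record.group('time')}"
            changes.append((stamp, state, score))
    return changes


# Three records copied from the 10.0.0.94 log above.
SAMPLE = """\
MainThread::INFO::2014-12-19
13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-12-19
13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host 10.0.0.93 (id: 2, score: 2400)
MainThread::INFO::2014-12-19
13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
"""

print(state_changes(SAMPLE))
```

Run against the full log of each host, a failover would be expected to show up as a short list of entries rather than pages of identical lines.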
>
> -----Original Message-----
> From: Simone Tiraboschi [mailto:stirabos at redhat.com]
> Sent: Friday, December 19, 2014 12:28 AM
> To: Yue, Cong
> Cc:
> users at ovirt.org
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
>
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue at alliedtelesis.com>
> To:
> users at ovirt.org
> Sent: Friday, December 19, 2014 2:14:33 AM
> Subject: [ovirt-users] VM failover with ovirt3.5
>
>
>
> Hi
>
>
>
> In my environment, I have 3 oVirt nodes in one cluster. On host-1
> there is one VM that hosts the oVirt engine.
>
> I also have one external storage system that the cluster uses as the
> data domain for the engine and for data.
>
> I confirmed live migration works well in my environment.
>
> But VM failover seems very buggy when I force one oVirt node to shut
> down. Sometimes the VM on the node that was shut down migrates to
> another host, but it takes several minutes or more.
>
> Sometimes it cannot migrate at all. Sometimes the VM only begins to
> move once the host is back.
>
> Can you please check or share the logs under
> /var/log/ovirt-hosted-engine-ha/
> ?
>
> Is there some documentation that explains how VM failover works? And
> have any bugs related to this been reported?
>
> http://www.ovirt.org/Features/Self_Hosted_Engine#Agent_State_Diagram
>
> Thanks in advance,
>
> Cong
>
>
>
>
> This e-mail message is for the sole use of the intended recipient(s)
> and may contain confidential and privileged information. Any
> unauthorized review, use, disclosure or distribution is prohibited. If
> you are not the intended recipient, please contact the sender by reply
> e-mail and destroy all copies of the original message. If you are the
> intended recipient, please be advised that the content of this message
> is subject to access, review and disclosure by the sender's e-mail System
> Administrator.
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vdsm.log-123014
Type: application/octet-stream
Size: 1719518 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20141230/5e8cd790/attachment.obj>

