Hi,
We have a failure in OST on test 006_migrations.migrate_vm.
From what I can see, the migration succeeded but because there was a
KeyError: 'cpuUsage' engine assume the v is down and tries to retry
migration.
the re-try fails with vm already exists.
*Link and headline of suspected patches: *Failed change:
*https://gerrit.ovirt.org/#/c/88432/2
<
https://gerrit.ovirt.org/#/c/88432/2> - *
*engine : Events coming too soon on refresh caps*
*CQ reported this as root ca*use but I don't think its related as well:
*https://gerrit.ovirt.org/#/c/88404/ <
https://gerrit.ovirt.org/#/c/88404/>
- *
*core: Auto SD selection on template version update*
*Link to
Job:http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1021/
<
http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1021/>Link to
all
logs:http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1021/arti...
<
http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1021/artifact/...
error snippet from the log: <error>*
*migration starts and finishs *but fails on vm state on eyError: 'cpuUsage'
*2018-03-05 05:21:33,269-0500 INFO (jsonrpc/3) [api.virt] START
migrate(params={u'incomingLimit': 2, u'tunneled': u'false',
u'dstqemu':
u'192.0.3.2', u'autoConverge': u'false', u'src':
u'lago-basic-suite-4-2-host-0', u'enableGuestEvents': False,
u'dst':
u'lago-basic-suite-4-2-host-1:54321', u'vmId':
u'a80596a2-f57b-4878-adc3-772363e42783', u'abortOnError':
u'true',
u'outgoingLimit': 2, u'compressed': u'false', u'method':
u'online'})
from=::ffff:192.168.200.2,42452,
flow_id=123f5fd3-81a4-4ba9-9034-525674696629,
vmId=a80596a2-f57b-4878-adc3-772363e42783 (api:46)2018-03-05
05:21:33,271-0500 INFO (jsonrpc/3) [api.virt] FINISH migrate
return={'status': {'message': 'Migration in progress',
'code': 0},
'progress': 0} from=::ffff:192.168.200.2,42452,
flow_id=123f5fd3-81a4-4ba9-9034-525674696629,
vmId=a80596a2-f57b-4878-adc3-772363e42783 (api:52)2018-03-05
05:21:33,271-0500 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call
VM.migrate succeeded in 0.00 seconds (__init__:573)2018-03-05
05:21:33,473-0500 INFO (monitor/bd2f874) [IOProcessClient] Closing client
ioprocess-0 (__init__:583)2018-03-05 05:21:34,936-0500 INFO
(migsrc/a80596a2) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783')
Creation of destination VM took: 1 seconds (migration:473)2018-03-05
05:21:34,936-0500 INFO (migsrc/a80596a2) [virt.vm]
(vmId='a80596a2-f57b-4878-adc3-772363e42783') starting migration to
qemu+tls://lago-basic-suite-4-2-host-1/system with miguri tcp://192.0.3.2
<
http://192.0.3.2> (migration:502)2018-03-05 05:21:36,356-0500 INFO
(jsonrpc/7) [api.host] START getAllVmStats()
from=::ffff:192.168.200.2,42452 (api:46)2018-03-05 05:21:36,359-0500 INFO
(jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message':
'Done', 'code': 0}, 'statsList': (suppressed)}
from=::ffff:192.168.200.2,42452 (api:52)2018-03-05 05:21:36,361-0500 INFO
(jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded
in 0.00 seconds (__init__:573)2018-03-05 05:21:38,861-0500 INFO
(libvirt/events) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783')
CPU stopped: onSuspend (vm:6063)2018-03-05 05:21:40,109-0500 WARN
(periodic/0) [virt.periodic.VmDispatcher] could not run <class
'vdsm.virt.periodic.DriveWatermarkMonitor'> on
['a80596a2-f57b-4878-adc3-772363e42783'] (periodic:323)2018-03-05
05:21:40,109-0500 INFO (migsrc/a80596a2) [virt.vm]
(vmId='a80596a2-f57b-4878-adc3-772363e42783') migration took 6 seconds to
complete (migration:514)2018-03-05 05:21:40,110-0500 INFO
(migsrc/a80596a2) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783')
Changed state to Down: Migration succeeded (code=4) (vm:1677)2018-03-05
05:21:40,116-0500 INFO (migsrc/a80596a2) [virt.vm]
(vmId='a80596a2-f57b-4878-adc3-772363e42783') Stopping connection
(guestagent:438)2018-03-05 05:21:41,118-0500 WARN (periodic/3)
[virt.periodic.VmDispatcher] could not run <class
'vdsm.virt.periodic.DriveWatermarkMonitor'> on
['a80596a2-f57b-4878-adc3-772363e42783'] (periodic:323)2018-03-05
05:21:42,063-0500 WARN (periodic/2) [virt.vmstats] Missing stat:
'balloon.current' for vm a80596a2-f57b-4878-adc3-772363e42783
(vmstats:552)2018-03-05 05:21:42,064-0500 ERROR (periodic/2) [virt.vmstats]
VM metrics collection failed (vmstats:264)Traceback (most recent call
last): File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line
197, in send_metrics data[prefix + '.cpu.usage'] =
stat['cpuUsage']KeyError: 'cpuUsage'*
*</error>*