Hi,

We have a failure in OST on test 006_migrations.migrate_vm.

From what I can see, the migration succeeded but because there was a KeyError: 'cpuUsage' engine assume the v is down and tries to retry migration.
the re-try fails with vm already exists.


Link and headline of suspected patches:

Failed change:
https://gerrit.ovirt.org/#/c/88432/2 -
engine : Events coming too soon on refresh caps

CQ reported this as root cause but I don't think its related as well:
https://gerrit.ovirt.org/#/c/88404/ -

core: Auto SD selection on template version update

Link to Job:

http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1021/

Link to all logs:

http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1021/artifact/

(Relevant) error snippet from the log:

<error>


migration starts and finishs but fails on vm state on eyError: 'cpuUsage'

2018-03-05 05:21:33,269-0500 INFO  (jsonrpc/3) [api.virt] START migrate(params={u'incomingLimit': 2, u'tunneled': u'false', u'dstqemu': u'192.0.3.2', u'autoConverge': u'false', u'src': u'lago-basic-suite-4-2-host-0', u'enableGuestEvents':
 False, u'dst': u'lago-basic-suite-4-2-host-1:54321', u'vmId': u'a80596a2-f57b-4878-adc3-772363e42783', u'abortOnError': u'true', u'outgoingLimit': 2, u'compressed': u'false', u'method': u'online'}) from=::ffff:192.168.200.2,42452, flow_i
d=123f5fd3-81a4-4ba9-9034-525674696629, vmId=a80596a2-f57b-4878-adc3-772363e42783 (api:46)
2018-03-05 05:21:33,271-0500 INFO  (jsonrpc/3) [api.virt] FINISH migrate return={'status': {'message': 'Migration in progress', 'code': 0}, 'progress': 0} from=::ffff:192.168.200.2,42452, flow_id=123f5fd3-81a4-4ba9-9034-525674696629, vmId
=a80596a2-f57b-4878-adc3-772363e42783 (api:52)
2018-03-05 05:21:33,271-0500 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call VM.migrate succeeded in 0.00 seconds (__init__:573)
2018-03-05 05:21:33,473-0500 INFO  (monitor/bd2f874) [IOProcessClient] Closing client ioprocess-0 (__init__:583)
2018-03-05 05:21:34,936-0500 INFO  (migsrc/a80596a2) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783') Creation of destination VM took: 1 seconds (migration:473)
2018-03-05 05:21:34,936-0500 INFO  (migsrc/a80596a2) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783') starting migration to qemu+tls://lago-basic-suite-4-2-host-1/system with miguri tcp://192.0.3.2 (migration:502)
2018-03-05 05:21:36,356-0500 INFO  (jsonrpc/7) [api.host] START getAllVmStats() from=::ffff:192.168.200.2,42452 (api:46)
2018-03-05 05:21:36,359-0500 INFO  (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::ffff:192.168.200.2,42452 (api:52)
2018-03-05 05:21:36,361-0500 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:573)
2018-03-05 05:21:38,861-0500 INFO  (libvirt/events) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783') CPU stopped: onSuspend (vm:6063)
2018-03-05 05:21:40,109-0500 WARN  (periodic/0) [virt.periodic.VmDispatcher] could not run <class 'vdsm.virt.periodic.DriveWatermarkMonitor'> on ['a80596a2-f57b-4878-adc3-772363e42783'] (periodic:323)
2018-03-05 05:21:40,109-0500 INFO  (migsrc/a80596a2) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783') migration took 6 seconds to complete (migration:514)
2018-03-05 05:21:40,110-0500 INFO  (migsrc/a80596a2) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783') Changed state to Down: Migration succeeded (code=4) (vm:1677)
2018-03-05 05:21:40,116-0500 INFO  (migsrc/a80596a2) [virt.vm] (vmId='a80596a2-f57b-4878-adc3-772363e42783') Stopping connection (guestagent:438)
2018-03-05 05:21:41,118-0500 WARN  (periodic/3) [virt.periodic.VmDispatcher] could not run <class 'vdsm.virt.periodic.DriveWatermarkMonitor'> on ['a80596a2-f57b-4878-adc3-772363e42783'] (periodic:323)
2018-03-05 05:21:42,063-0500 WARN  (periodic/2) [virt.vmstats] Missing stat: 'balloon.current' for vm a80596a2-f57b-4878-adc3-772363e42783 (vmstats:552)
2018-03-05 05:21:42,064-0500 ERROR (periodic/2) [virt.vmstats] VM metrics collection failed (vmstats:264)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 197, in send_metrics
    data[prefix + '.cpu.usage'] = stat['cpuUsage']
KeyError: 'cpuUsage'


</error>