This is a multi-part message in MIME format.
--------------D695925BEA0466FA018AF9F5
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Hi,
We had a failure for test 006_migrations.migrate_vm on master.
There was a libvirt disruption in the migration src and after that vdsm
reported the migration as failed because the vm already exists which
makes me suspect a split bran case.
The patch reported has nothing to do with this issue and I think we
simply stumbled on a race condition which can cause a split brain.
Please note that there are several metrics related issues reported in
vdsm logs as well.
**
*Link and headline of suspected patches: *
*
*Not related *
Link to Job:
*
**http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/**
*
*
*Link to all logs:*
*
*
**http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/**
*
*
*(Relevant) error snippet from the log: *
*
*<error>*
**
**Engine: * *
*2017-12-05 06:26:48,546-05 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-385) [] EVENT_ID:
VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed (VM: vm0, Source:
lago- basic-suite-master-host-1, Destination:
lago-basic-suite-master-host-0). *
*dst:**
**
*2017-12-05 06:26:46,615-0500 WARN (jsonrpc/6) [vds] vm
b3962e5c-08e3-444e-910e-ea675fa1a5c7 already exists (API:179)
2017-12-05 06:26:46,615-0500 ERROR (jsonrpc/6) [api] FINISH create
error=Virtual machine already exists (api:124)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 117,
in method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 180, in create
raise exception.VMExists()
VMExists: Virtual machine already exists
2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [api.virt] FINISH create
return={'status': {'message': 'Virtual machine already exists',
'code':
4}} from=::ffff:192.168.201.3,50394 (api:52)
2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [api.virt] FINISH
migrationCreate return={'status': {'message': 'Virtual machine
already
exists', 'code': 4}} from=::ffff:192.168.201.3,50394 (api:52)
2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer]
RPC call VM.migrationCreate failed (error 4) in 0.03 seconds (__init__:573)
2017-12-05 06:26:46,624-0500 INFO (jsonrpc/2) [api.virt] START
destroy(gracefulAttempts=1) from=::ffff:192.168.201.3,50394 (api:46)
2017-12-05 06:26:46,624-0500 INFO (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM resources (vm:4967)
2017-12-05 06:26:46,625-0500 WARN (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') trying to set state to
Powering down when already Down (vm:575)
2017-12-05 06:26:46,625-0500 INFO (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping connection
(guestagent:435)
2017-12-05 06:26:46,625-0500 INFO (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') _destroyVmGraceful attempt
#0 (vm:5004)
2017-12-05 06:26:46,626-0500 WARN (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') VM
'b3962e5c-08e3-444e-910e-ea675fa1a5c7' couldn't be destroyed in libvirt:
Requested operation is not valid: domain is not running (vm
:5025)
2017-12-05 06:26:46,627-0500 INFO (jsonrpc/2) [vdsm.api] START
teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70',
spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261',
imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None) from
=::ffff:192.168.201.3,50394,
task_id=2da93725-5533-4354-a369-751eb44f9ef2 (api:46)
*
**scr *
*
*2017-12-05 06:26:46,623-0500 ERROR (migsrc/b3962e5c) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') migration destination
error: Virtual machine already exists (migration:290)
*
**disruption on **dst**: *
*
**2017-12-05 06:20:04,662-0500 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer]
RPC call VM.shutdown succeeded in 0.00 seconds (__init__:573)
2017-12-05 06:20:04,676-0500 ERROR (Thread-1) [root] Shutdown by QEMU
Guest Agent failed (vm:5097)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5088, in
qemuGuestAgentShutdown
self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT)
File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line
98, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py",
line 126, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 512, in
wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2403, in
shutdownFlags
if ret == -1: raise libvirtError ('virDomainShutdownFlags() failed',
dom=self)
libvirtError: Guest agent is not responding: QEMU guest agent is not
connected
2017-12-05 06:20:04,697-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') block threshold 1 exceeded
on 'vda'
(/rhev/data-center/mnt/blockSD/952bb427-b88c-4fbe-99ef-49970d3aaf70/images/e6eadbae-ec7a-48f4-
8832-64a622a12bef/1f063c73-f5cd-44e8-a84f-7810857f82df) (drivemonitor:162)
2017-12-05 06:20:04,698-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') drive 'vda' threshold
exceeded (storage:872)
2017-12-05 06:20:05,889-0500 INFO (jsonrpc/7) [api.host] START
getAllVmStats() from=::1,41118 (api:46)
2017-12-05 06:20:05,891-0500 INFO (jsonrpc/7) [api.host] FINISH
getAllVmStats return={'status': {'message': 'Done',
'code': 0},
'statsList': (suppressed)} from=::1,41118 (api:52)
2017-12-05 06:20:05,892-0500 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer]
RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:573)
2017-12-05 06:20:07,466-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') underlying process
disconnected (vm:1024)
2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM resources (vm:4967)
2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping connection
(guestagent:435)
2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [vdsm.api] START
teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70',
spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261',
imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None)
from=internal, task_id=9efedf46-d3be-4e41-b7f9-a074ed6344f6 (api:46)
*
*
*</error>*
**
--------------D695925BEA0466FA018AF9F5
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 8bit
<html>
<head>
<meta http-equiv="content-type" content="text/html;
charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi, <br>
</p>
<p>We had a failure for test 006_migrations.migrate_vm on master. </p>
<p>There was a libvirt disruption in the migration src and after
that vdsm reported the migration as failed because the vm already
exists which makes me suspect a split bran case. <br>
</p>
<p>The patch reported has nothing to do with this issue and I think
we simply stumbled on a race condition which can cause a split
brain. <br>
</p>
<p>Please note that there are several metrics related issues
reported in vdsm logs as well. <br>
</p>
<p><b style="font-weight:normal;"
id="docs-internal-guid-5859b7a1-269d-3fbb-6886-821a181fd550">
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Link
and headline of suspected patches:
</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">
</span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"><b>Not
related </b>
</span></p>
<br>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Link
to Job:</span></p>
<br>
</b></p>
<p><b style="font-weight:normal;"
id="docs-internal-guid-5859b7a1-269d-3fbb-6886-821a181fd550"><b><a
class="moz-txt-link-freetext"
href="http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/...
<p><b style="font-weight:normal;"
id="docs-internal-guid-5859b7a1-269d-3fbb-6886-821a181fd550"><br>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">Link
to all logs:</span></p>
<br>
</b></p>
<p><b style="font-weight:normal;"
id="docs-internal-guid-5859b7a1-269d-3fbb-6886-821a181fd550"><b><a
class="moz-txt-link-freetext"
href="http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/...
<p><b style="font-weight:normal;"
id="docs-internal-guid-5859b7a1-269d-3fbb-6886-821a181fd550"><br>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">(Relevant)
error snippet from the log: </span></p>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><b><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"><error></span></b></p>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><b><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">
</span></b></p>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><b><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"><b>Engine:
</b>
</span></b></p>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><b><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;">2017-12-05
06:26:48,546-05 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-385) [] EVENT_ID:
VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed (VM: vm0, Source: lago-
basic-suite-master-host-1, Destination: lago-basic-suite-master-host-0).
</span></b></p>
<br>
<b>dst:</b><b><br>
</b></b></p>
<p><b><span style="font-weight:normal;">2017-12-05
06:26:46,615-0500
WARN (jsonrpc/6) [vds] vm
b3962e5c-08e3-444e-910e-ea675fa1a5c7 already exists (API:179)<br>
2017-12-05 06:26:46,615-0500 ERROR (jsonrpc/6) [api] FINISH
create error=Virtual machine already exists (api:124)<br>
Traceback (most recent call last):<br>
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py",
line 117, in method<br>
ret = func(*args, **kwargs)<br>
File "/usr/lib/python2.7/site-packages/vdsm/API.py", line
180, in create<br>
raise exception.VMExists()<br>
VMExists: Virtual machine already exists<br>
2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [api.virt]
FINISH create return={'status': {'message': 'Virtual
machine
already exists', 'code': 4}} from=::ffff:192.168.201.3,50394
(api:52)<br>
2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6) [api.virt]
FINISH migrationCreate return={'status': {'message':
'Virtual
machine already exists', 'code': 4}}
from=::ffff:192.168.201.3,50394 (api:52)<br>
2017-12-05 06:26:46,620-0500 INFO (jsonrpc/6)
[jsonrpc.JsonRpcServer] RPC call VM.migrationCreate failed
(error 4) in 0.03 seconds (__init__:573)<br>
2017-12-05 06:26:46,624-0500 INFO (jsonrpc/2) [api.virt]
START destroy(gracefulAttempts=1)
from=::ffff:192.168.201.3,50394 (api:46)<br>
2017-12-05 06:26:46,624-0500 INFO (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM
resources (vm:4967)<br>
2017-12-05 06:26:46,625-0500 WARN (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') trying to set
state to Powering down when already Down (vm:575)<br>
2017-12-05 06:26:46,625-0500 INFO (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping
connection (guestagent:435)<br>
2017-12-05 06:26:46,625-0500 INFO (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7')
_destroyVmGraceful attempt #0 (vm:5004)<br>
2017-12-05 06:26:46,626-0500 WARN (jsonrpc/2) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') VM
'b3962e5c-08e3-444e-910e-ea675fa1a5c7' couldn't be destroyed
in libvirt: Requested operation is not valid: domain is not
running (vm<br>
:5025)<br>
2017-12-05 06:26:46,627-0500 INFO (jsonrpc/2) [vdsm.api]
START
teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70',
spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261',
imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None)
from<br>
=::ffff:192.168.201.3,50394,
task_id=2da93725-5533-4354-a369-751eb44f9ef2 (api:46)<br>
</span></b></p>
<p><b><span style="font-weight:normal;"><b>scr
</b><br>
</span></b></p>
<p><b><span style="font-weight:normal;">2017-12-05
06:26:46,623-0500
ERROR (migsrc/b3962e5c) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') migration
destination error: Virtual machine already exists
(migration:290)<br>
</span></b></p>
<p><b><span
style="font-weight:normal;"><b>disruption on
</b><b>dst</b><b>:
</b><br>
</span></b></p>
<p><b style="font-weight:normal;"
id="docs-internal-guid-5859b7a1-269d-3fbb-6886-821a181fd550"><b>2017-12-05
06:20:04,662-0500 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer]
RPC call VM.shutdown succeeded in 0.00 seconds (__init__:573)<br>
2017-12-05 06:20:04,676-0500 ERROR (Thread-1) [root] Shutdown
by QEMU Guest Agent failed (vm:5097)<br>
Traceback (most recent call last):<br>
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py",
line 5088, in qemuGuestAgentShutdown<br>
self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT)<br>
File
"/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py",
line 98, in f<br>
ret = attr(*args, **kwargs)<br>
File
"/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py",
line 126, in wrapper<br>
ret = f(*args, **kwargs)<br>
File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line
512, in wrapper<br>
return func(inst, *args, **kwargs)<br>
File "/usr/lib64/python2.7/site-packages/libvirt.py", line
2403, in shutdownFlags<br>
if ret == -1: raise libvirtError
('virDomainShutdownFlags() failed', dom=self)<br>
libvirtError: Guest agent is not responding: QEMU guest agent
is not connected<br>
2017-12-05 06:20:04,697-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') block threshold
1 exceeded on 'vda'
(/rhev/data-center/mnt/blockSD/952bb427-b88c-4fbe-99ef-49970d3aaf70/images/e6eadbae-ec7a-48f4-<br>
8832-64a622a12bef/1f063c73-f5cd-44e8-a84f-7810857f82df)
(drivemonitor:162)<br>
2017-12-05 06:20:04,698-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') drive 'vda'
threshold exceeded (storage:872)<br>
2017-12-05 06:20:05,889-0500 INFO (jsonrpc/7) [api.host]
START getAllVmStats() from=::1,41118 (api:46)<br>
2017-12-05 06:20:05,891-0500 INFO (jsonrpc/7) [api.host]
FINISH getAllVmStats return={'status': {'message':
'Done',
'code': 0}, 'statsList': (suppressed)} from=::1,41118
(api:52)<br>
2017-12-05 06:20:05,892-0500 INFO (jsonrpc/7)
[jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded
in 0.00 seconds (__init__:573)<br>
2017-12-05 06:20:07,466-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') underlying
process disconnected (vm:1024)<br>
2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM
resources (vm:4967)<br>
2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [virt.vm]
(vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping
connection (guestagent:435)<br>
2017-12-05 06:20:07,469-0500 INFO (libvirt/events) [vdsm.api]
START
teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70',
spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261',
imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None)<br>
from=internal, task_id=9efedf46-d3be-4e41-b7f9-a074ed6344f6
(api:46)<br>
</b><br>
<p dir="ltr"
style="line-height:1.38;margin-top:0pt;margin-bottom:0pt;"><span
style="font-size:11pt;font-family:Arial;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap;"></error></span></p>
</b><br class="Apple-interchange-newline">
</p>
<p><br>
</p>
<p><br>
</p>
</body>
</html>
--------------D695925BEA0466FA018AF9F5--