[ovirt-devel] [ OST Failure Report ] [ oVirt Master ] [ 05-12-2017 ] [ 006_migrations.migrate_vm ]

Michal Skrivanek michal.skrivanek at redhat.com
Tue Dec 5 15:26:56 UTC 2017


Milan,

log [1], VM b3962e5c-08e3-444e-910e-ea675fa1a5c7
migration away finished at 2017-12-05 06:26:24,515-0500
incoming migration of the same VM back, at 2017-12-05 06:26:46,614-0500

seems to me the migration away didn’t really properly clean up the VM. Milane, can you check that if logs matches? Perhaps orphaned libvirt’s xml?
Otherwise it would need reproduction in CI and some more logs….

[1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log>

> On 5 Dec 2017, at 13:26, Dafna Ron <dron at redhat.com> wrote:
> 
> Hi, 
> We had a failure for test 006_migrations.migrate_vm on master. 
> 
> There was a libvirt disruption in the migration src  and after that vdsm reported the migration as failed because the vm already exists which makes me suspect a split bran case. 
> The patch reported has nothing to do with this issue and I think we simply stumbled on a race condition which can cause a split brain. 
> Please note that there are several metrics related issues reported in vdsm logs as well. 
> 
> Link and headline of suspected patches: 
> 
> Not related 
> 
> Link to Job:
> 
> 
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/>
> 
> Link to all logs:
> 
> 
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4278/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/>
> 
> (Relevant) error snippet from the log: 
> <error>
> 
> Engine: 
> 2017-12-05 06:26:48,546-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-385) [] EVENT_ID: VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed  (VM: vm0, Source: lago-
> basic-suite-master-host-1, Destination: lago-basic-suite-master-host-0).
> 
> 
> dst:
> 
> 2017-12-05 06:26:46,615-0500 WARN  (jsonrpc/6) [vds] vm b3962e5c-08e3-444e-910e-ea675fa1a5c7 already exists (API:179)
> 2017-12-05 06:26:46,615-0500 ERROR (jsonrpc/6) [api] FINISH create error=Virtual machine already exists (api:124)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 117, in method
>     ret = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 180, in create
>     raise exception.VMExists()
> VMExists: Virtual machine already exists
> 2017-12-05 06:26:46,620-0500 INFO  (jsonrpc/6) [api.virt] FINISH create return={'status': {'message': 'Virtual machine already exists', 'code': 4}} from=::ffff:192.168.201.3,50394 (api:52)
> 2017-12-05 06:26:46,620-0500 INFO  (jsonrpc/6) [api.virt] FINISH migrationCreate return={'status': {'message': 'Virtual machine already exists', 'code': 4}} from=::ffff:192.168.201.3,50394 (api:52)
> 2017-12-05 06:26:46,620-0500 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call VM.migrationCreate failed (error 4) in 0.03 seconds (__init__:573)
> 2017-12-05 06:26:46,624-0500 INFO  (jsonrpc/2) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:192.168.201.3,50394 (api:46)
> 2017-12-05 06:26:46,624-0500 INFO  (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM resources (vm:4967)
> 2017-12-05 06:26:46,625-0500 WARN  (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') trying to set state to Powering down when already Down (vm:575)
> 2017-12-05 06:26:46,625-0500 INFO  (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping connection (guestagent:435)
> 2017-12-05 06:26:46,625-0500 INFO  (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') _destroyVmGraceful attempt #0 (vm:5004)
> 2017-12-05 06:26:46,626-0500 WARN  (jsonrpc/2) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') VM 'b3962e5c-08e3-444e-910e-ea675fa1a5c7' couldn't be destroyed in libvirt: Requested operation is not valid: domain is not running (vm
> :5025)
> 2017-12-05 06:26:46,627-0500 INFO  (jsonrpc/2) [vdsm.api] START teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70', spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261', imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None) from
> =::ffff:192.168.201.3,50394, task_id=2da93725-5533-4354-a369-751eb44f9ef2 (api:46)
> 
> scr 
> 
> 2017-12-05 06:26:46,623-0500 ERROR (migsrc/b3962e5c) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') migration destination error: Virtual machine already exists (migration:290)
> 
> disruption on dst: 
> 
> 2017-12-05 06:20:04,662-0500 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call VM.shutdown succeeded in 0.00 seconds (__init__:573)
> 2017-12-05 06:20:04,676-0500 ERROR (Thread-1) [root] Shutdown by QEMU Guest Agent failed (vm:5097)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5088, in qemuGuestAgentShutdown
>     self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT)
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
>     ret = attr(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper
>     ret = f(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 512, in wrapper
>     return func(inst, *args, **kwargs)
>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2403, in shutdownFlags
>     if ret == -1: raise libvirtError ('virDomainShutdownFlags() failed', dom=self)
> libvirtError: Guest agent is not responding: QEMU guest agent is not connected
> 2017-12-05 06:20:04,697-0500 INFO  (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') block threshold 1 exceeded on 'vda' (/rhev/data-center/mnt/blockSD/952bb427-b88c-4fbe-99ef-49970d3aaf70/images/e6eadbae-ec7a-48f4-
> 8832-64a622a12bef/1f063c73-f5cd-44e8-a84f-7810857f82df) (drivemonitor:162)
> 2017-12-05 06:20:04,698-0500 INFO  (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') drive 'vda' threshold exceeded (storage:872)
> 2017-12-05 06:20:05,889-0500 INFO  (jsonrpc/7) [api.host] START getAllVmStats() from=::1,41118 (api:46)
> 2017-12-05 06:20:05,891-0500 INFO  (jsonrpc/7) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,41118 (api:52)
> 2017-12-05 06:20:05,892-0500 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:573)
> 2017-12-05 06:20:07,466-0500 INFO  (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') underlying process disconnected (vm:1024)
> 2017-12-05 06:20:07,469-0500 INFO  (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Release VM resources (vm:4967)
> 2017-12-05 06:20:07,469-0500 INFO  (libvirt/events) [virt.vm] (vmId='b3962e5c-08e3-444e-910e-ea675fa1a5c7') Stopping connection (guestagent:435)
> 2017-12-05 06:20:07,469-0500 INFO  (libvirt/events) [vdsm.api] START teardownImage(sdUUID='952bb427-b88c-4fbe-99ef-49970d3aaf70', spUUID='9dcfeaaf-96b7-4e26-a327-5570e0e39261', imgUUID='e6eadbae-ec7a-48f4-8832-64a622a12bef', volUUID=None)
>  from=internal, task_id=9efedf46-d3be-4e41-b7f9-a074ed6344f6 (api:46)
> 
> 
> </error>
> 
> 
> 
> 
> _______________________________________________
> Devel mailing list
> Devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/devel/attachments/20171205/3c7bb93b/attachment.html>


More information about the Devel mailing list