[ovirt-devel] Strange concurrency error on VM creation

Marc Young 3vilpenguin at gmail.com
Tue Mar 7 16:42:47 UTC 2017


I've been fighting this for roughly two days and I'm starting to think that
possibly it's not my code but an interaction with the server.

I'm using test-kitchen[1] with the kitchen-vagrant[2] driver to spin up
vagrant machines and run tests against them. I'm using Jenkins to run
kitchen in containers in parallel.

Basically, Jenkins runs a docker container with ruby + vagrant 1.9.2 and
runs kitchen test, all at the same time as another container with ruby +
vagrant 1.9.1.

If I run these in parallel, on some occasions the server seems to respond
with the wrong creation information. If you look at the logs here:
http://home.blindrage.us:8080/job/myoung34/job/vagrant-ovirt4/view/change-requests/job/PR-79/41/console


The container for vagrant 1.9.1 created a VM `vagrant-dynamic-1.9.1`:

[vagrant-1.9.1]        Bringing machine 'default' up with 'ovirt4' provider...

[vagrant-1.9.1]        ==> default: Creating VM with the following settings...

[vagrant-1.9.1]        ==> default:  -- Name:          dynamic-1.9.1


And the container for vagrant 1.9.2 (nearly the same time) created a VM
`vagrant-dynamic-1.9.2`:

[vagrant-1.9.2]        ==> default: Creating VM with the following settings...

[vagrant-1.9.2]        ==> default:  -- Name:          dynamic-1.9.2

[vagrant-1.9.2]        ==> default:  -- Cluster:       Default


If you look at the screenshot:

the container 1.9.1 will wait for dynamic-1.9.1 and try to contact it at
192.168.2.54

the container 1.9.2 will wait for dynamic-1.9.2 and try to contact it at
192.168.2.55

But if you look at the logs, the 1.9.1 container started trying to work
with 192.168.2.55 by creating a new key then talking to it:

[vagrant-1.9.1]            default: Key inserted! Disconnecting and reconnecting using new SSH key...

[vagrant-1.9.1]        Waiting for SSH service on 192.168.2.55:22, retrying in 3 seconds


Because 1.9.1 inserted a generated key into that box, the 1.9.2 container,
which _should_ be talking to it, no longer can:

[vagrant-1.9.2]        ==> default: Rsyncing folder: /home/jenkins/.kitchen/cache/ => /tmp/omnibus/cache
[vagrant-1.9.2]        SSH authentication failed! This is typically caused by the public/private
[vagrant-1.9.2]        keypair for the SSH user not being properly set on the guest VM. Please
[vagrant-1.9.2]        verify that the guest VM is setup with the proper public key, and that
[vagrant-1.9.2]        the private key path for Vagrant is setup properly as well.



Via the ruby sdk I create the VM and store the ID it responded with.
Then to get the IP:

server = env[:vms_service].vm_service(env[:machine].id)
nics_service = server.nics_service
nics = nics_service.list
# Collect every IPv4 address reported on this VM's NICs and take the first one.
ip_addr = nics.collect { |nic_attachment|
  env[:connection].follow_link(nic_attachment).reported_devices.collect { |dev|
    dev.ips.collect { |ip| ip.address if ip.version == 'v4' }
  }
}.flatten.reject { |ip| ip.nil? }.first rescue nil

Given this code, I can't think of any way that I would get the wrong IP
unless the server somehow responded incorrectly, since the NICs I've
scanned and collected from are tied directly to the server I created.
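
To rule out the extraction logic itself, the same "first IPv4 address"
selection can be exercised without a server. This is only a minimal sketch:
Ip and Device are hypothetical Struct stand-ins that mimic the attributes
used in the snippet above (they are not the SDK's classes), and first_ipv4
is an illustrative helper name, not part of the plugin.

```ruby
# Hypothetical stand-ins for the SDK objects; only the attributes that the
# snippet above reads (ips, address, version) are modelled here.
Ip     = Struct.new(:address, :version)
Device = Struct.new(:ips)

# Takes one array of reported devices per NIC and returns the first IPv4
# address found, or nil if no device reports one.
def first_ipv4(devices_per_nic)
  devices_per_nic
    .flatten
    .flat_map { |dev| dev.ips || [] } # a device may not report any IPs yet
    .select { |ip| ip.version == 'v4' }
    .map(&:address)
    .first
end

nic_a = [Device.new([Ip.new('fe80::1', 'v6'), Ip.new('192.168.2.54', 'v4')])]
nic_b = [Device.new(nil)]

puts first_ipv4([nic_b, nic_a]).inspect
```

If this selection behaves as expected on known input, the remaining place
for a wrong address to come from is the data returned for the NICs, which
is where logging the VM ID next to each address it reports might help.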

Any thoughts? This only happens randomly, and it seems to happen when I
bombard the server with a bunch of VM creations simultaneously.

[1] https://github.com/test-kitchen/test-kitchen
[2] https://github.com/test-kitchen/kitchen-vagrant