[ovirt-devel] Strange concurrency error on VM creation
Juan Hernández
jhernand at redhat.com
Tue Mar 7 17:01:02 UTC 2017
On 03/07/2017 05:42 PM, Marc Young wrote:
> I've been fighting this for roughly two days and I'm starting to think
> that possibly it's not my code but an interaction with the server.
>
> I'm using test-kitchen[1] with the kitchen-vagrant[2] driver to spin up
> vagrant machines and run tests against them. I'm using Jenkins to run
> kitchen in containers in parallel.
>
> Basically Jenkins runs a docker container with ruby + vagrant 1.9.2 and
> runs kitchen test all at the same time as another container with ruby +
> vagrant 1.9.1.
>
> If I run these in parallel, on some occasions the server seems to
> respond with the wrong creation information. If you look at the logs
> here: http://home.blindrage.us:8080/job/myoung34/job/vagrant-ovirt4/view/change-requests/job/PR-79/41/console
>
>
> the container for vagrant 1.9.1 created a VM `vagrant-dynamic-1.9.1`:
>
> [vagrant-1.9.1] Bringing machine 'default' up with 'ovirt4' provider...
>
> [vagrant-1.9.1] ==> default: Creating VM with the following settings...
>
> [vagrant-1.9.1] ==> default: -- Name: dynamic-1.9.1
>
>
> And the container for vagrant 1.9.2 (nearly the same time) created a VM
> `vagrant-dynamic-1.9.2`:
>
> [vagrant-1.9.2] ==> default: Creating VM with the following settings...
>
> [vagrant-1.9.2] ==> default: -- Name: dynamic-1.9.2
>
> [vagrant-1.9.2] ==> default: -- Cluster: Default
>
>
> If you look at the ss:
>
> the container 1.9.1 will wait for dynamic-1.9.1 and try to contact it at
> 192.168.2.54
>
> the container 1.9.2 will wait for dynamic-1.9.2 and try to contact it at
> 192.168.2.55
>
> But if you look at the logs, the 1.9.1 container started trying to work
> with 192.168.2.55 by creating a new key then talking to it:
>
> [vagrant-1.9.1] default: Key inserted! Disconnecting and reconnecting using new SSH key...
>
> [vagrant-1.9.1] Waiting for SSH service on 192.168.2.55:22, retrying in 3 seconds
>
>
> Because 1.9.1 inserted a generated key into that box, the 1.9.2
> container which _should_ be talking to it cannot now:
>
> [vagrant-1.9.2] ==> default: Rsyncing folder: /home/jenkins/.kitchen/cache/ => /tmp/omnibus/cache
> [vagrant-1.9.2] SSH authentication failed! This is typically caused by the public/private
> [vagrant-1.9.2] keypair for the SSH user not being properly set on the guest VM. Please
> [vagrant-1.9.2] verify that the guest VM is setup with the proper public key, and that
> [vagrant-1.9.2] the private key path for Vagrant is setup properly as well.
>
>
>
> Via the ruby sdk I create the VM and store the ID it responded with.
> Then to get the IP:
>
> server = env[:vms_service].vm_service(env[:machine].id)
> nics_service = server.nics_service
> nics = nics_service.list
> ip_addr = nics.collect { |nic_attachment|
>   env[:connection].follow_link(nic_attachment).reported_devices.collect { |dev|
>     dev.ips.collect { |ip| ip.address if ip.version == 'v4' }
>   }
> }.flatten.reject { |ip| ip.nil? }.first rescue nil
>
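A side note on the lookup quoted above: the trailing `rescue nil` swallows any error raised while following links, and `.first` takes the first IPv4 address found on any NIC. Below is a minimal sketch of the same filtering without the blanket rescue. It assumes plain Structs (`Ip`, `ReportedDevice`, `Nic`) as hypothetical stand-ins for the objects the Ruby SDK returns, and `first_ipv4` is a name invented for this example, so it runs without a server:

```ruby
# Hypothetical stand-ins for the SDK objects; the real ones come from the oVirt Ruby SDK.
Ip = Struct.new(:address, :version)
ReportedDevice = Struct.new(:ips)
Nic = Struct.new(:reported_devices)

# Returns the first IPv4 address reported on any of the given NICs, or nil.
# Missing device or IP lists are treated as empty instead of being rescued.
def first_ipv4(nics)
  ip = nics
       .flat_map { |nic| nic.reported_devices || [] }
       .flat_map { |dev| dev.ips || [] }
       .find { |candidate| candidate.version == 'v4' }
  ip && ip.address
end

# Sample data: one NIC reporting an IPv6 and an IPv4 address.
sample_nics = [
  Nic.new([ReportedDevice.new([Ip.new('fe80::1', 'v6'),
                               Ip.new('192.168.2.54', 'v4')])])
]
sample_ip = first_ipv4(sample_nics)
```

With the rescue gone, an unexpected failure surfaces as an exception rather than as a silent `nil`, which may make an intermittent problem like this one easier to trace.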
Is this code running inside the same Ruby process for both virtual
machines? In multiple threads?
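
To illustrate why this matters, here is a minimal sketch. It assumes a hypothetical `run_machine` helper and a plain hash standing in for `env`, and it makes no SDK calls. If two machines in one process shared state like this, one could end up using the other's ID, and therefore the other's IP:

```ruby
# Hypothetical stand-in for the create-then-lookup flow; no server is contacted.
def run_machine(env, name)
  env[:machine_id] = "id-of-#{name}" # store the ID returned by VM creation
  sleep 0.05                         # simulate waiting for the VM to boot
  env[:machine_id]                   # ID later used to look up NICs and IPs
end

names = ['dynamic-1.9.1', 'dynamic-1.9.2']

# Isolated state: each thread gets its own env hash, so IDs cannot mix.
isolated = names.map { |n| Thread.new { run_machine({}, n) } }.map(&:value)

# Shared state: both threads write to the same hash. Whichever thread
# writes last wins, so both threads will typically read back the same ID.
shared_env = {}
shared = names.map { |n| Thread.new { run_machine(shared_env, n) } }.map(&:value)

puts "isolated: #{isolated.inspect}"
puts "shared:   #{shared.inspect}"
```

If each container runs its own Ruby process, as the Jenkins setup described above suggests, this particular failure mode would not apply and the cause would have to be elsewhere.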
> Given this code I can't think of any way that I would get the wrong IP
> unless somehow the server responded incorrectly, since the NICs I've
> scanned and compiled across are tied directly to the server I created.
>
> Any thoughts? This only happens randomly and it seems to happen if I
> bombard the server with a bunch of VM creations simultaneously.
>
> [1] https://github.com/test-kitchen/test-kitchen
> [2] https://github.com/test-kitchen/kitchen-vagrant
>
>