Strange concurrency error on VM creation

I've been fighting this for roughly two days and I'm starting to think that possibly it's not my code but an interaction with the server.

I'm using test-kitchen[1] with the kitchen-vagrant[2] driver to spin up vagrant machines and run tests against them. I'm using Jenkins to run kitchen in containers in parallel. Basically Jenkins runs a docker container with ruby + vagrant 1.9.2 and runs kitchen test all at the same time as another container with ruby + vagrant 1.9.1.

If I run these in parallel, on some occasions the server seems to respond with the wrong creation information. If you look at the logs here: http://home.blindrage.us:8080/job/myoung34/job/vagrant-ovirt4/view/change-requests/job/PR-79/41/console

The container for vagrant 1.9.1 created a VM `vagrant-dynamic-1.9.1`:

    [vagrant-1.9.1] Bringing machine 'default' up with 'ovirt4' provider...
    [vagrant-1.9.1] ==> default: Creating VM with the following settings...
    [vagrant-1.9.1] ==> default:  -- Name: dynamic-1.9.1

And the container for vagrant 1.9.2 (at nearly the same time) created a VM `vagrant-dynamic-1.9.2`:

    [vagrant-1.9.2] ==> default: Creating VM with the following settings...
    [vagrant-1.9.2] ==> default:  -- Name: dynamic-1.9.2
    [vagrant-1.9.2] ==> default:  -- Cluster: Default

If you look at the screenshots:

    the 1.9.1 container will wait for dynamic-1.9.1 and try to contact it at 192.168.2.54
    the 1.9.2 container will wait for dynamic-1.9.2 and try to contact it at 192.168.2.55

But if you look at the logs, the 1.9.1 container started trying to work with 192.168.2.55 by inserting a new key and then talking to it:

    [vagrant-1.9.1]     default: Key inserted! Disconnecting and reconnecting using new SSH key...
    [vagrant-1.9.1] Waiting for SSH service on 192.168.2.55:22, retrying in 3 seconds

Because 1.9.1 inserted a generated key into that box, the 1.9.2 container, which _should_ be talking to it, now cannot:

    [vagrant-1.9.2] ==> default: Rsyncing folder: /home/jenkins/.kitchen/cache/ => /tmp/omnibus/cache
    [vagrant-1.9.2] SSH authentication failed! This is typically caused by the public/private
    [vagrant-1.9.2] keypair for the SSH user not being properly set on the guest VM. Please
    [vagrant-1.9.2] verify that the guest VM is setup with the proper public key, and that
    [vagrant-1.9.2] the private key path for Vagrant is setup properly as well.

Via the ruby sdk I create the VM and store the ID it responded with. Then, to get the IP:

    server = env[:vms_service].vm_service(env[:machine].id)
    nics_service = server.nics_service
    nics = nics_service.list
    ip_addr = nics.collect { |nic_attachment|
      env[:connection].follow_link(nic_attachment).reported_devices.collect { |dev|
        dev.ips.collect { |ip| ip.address if ip.version == 'v4' }
      }
    }.flatten.reject { |ip| ip.nil? }.first rescue nil

Given this code I can't think of any way that I would get the wrong IP unless the server somehow responded incorrectly, since the NICs I've scanned and collected are tied directly to the VM I created.

Any thoughts? This only happens randomly, and it seems to occur when I bombard the server with a bunch of simultaneous VM creations.

[1] https://github.com/test-kitchen/test-kitchen
[2] https://github.com/test-kitchen/kitchen-vagrant
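For debugging, here is a minimal sketch of the same lookup that also prints each NIC's MAC next to the reported v4 addresses, so a wrong IP can be traced back to whichever NIC (and VM id) it actually came from. This is not the plugin's actual code: `log_debug` is a hypothetical helper, and the attribute names simply mirror the snippet above.

    # Debugging sketch only: dump everything the engine reports for the VM id
    # we stored, so a mixed-up address can be traced back to its NIC.
    server = env[:vms_service].vm_service(env[:machine].id)
    log_debug "resolving IP for VM id=#{env[:machine].id}"   # hypothetical logger
    server.nics_service.list.each do |nic_attachment|
      nic = env[:connection].follow_link(nic_attachment)
      v4_addresses = Array(nic.reported_devices)
                     .flat_map { |dev| Array(dev.ips) }
                     .select { |ip| ip.version == 'v4' }
                     .map(&:address)
      log_debug "  nic=#{nic.name} mac=#{nic.mac && nic.mac.address} v4=#{v4_addresses.inspect}"
    end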

On 03/07/2017 05:42 PM, Marc Young wrote:
> Via the ruby sdk I create the VM and store the ID it responded with. Then to get the IP:
>
>     server = env[:vms_service].vm_service(env[:machine].id)
>     nics_service = server.nics_service
>     nics = nics_service.list
>     ip_addr = nics.collect { |nic_attachment|
>       env[:connection].follow_link(nic_attachment).reported_devices.collect { |dev|
>         dev.ips.collect { |ip| ip.address if ip.version == 'v4' }
>       }
>     }.flatten.reject { |ip| ip.nil? }.first rescue nil
Is this code running inside the same Ruby process for both virtual machines? In multiple threads?

Completely isolated docker containers. Jenkins basically runs two separate calls to docker:

    [vagrant-1.9.1] $ docker run -t -d -u 997:994 -v /opt/gemcache:/opt/gemcache -w /var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ -v /var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ:/var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ:rw -v /var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ@tmp:/var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ@tmp:rw -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** --entrypoint cat myoung34/vagrant:1.9.1
    [Pipeline] [vagrant-1.9.1] {
    [Pipeline] [vagrant-1.9.2] withDockerContainer
    [vagrant-1.9.2] $ docker run -t -d -u 997:994 -v /opt/gemcache:/opt/gemcache -w /var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ -v /var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ:/var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ:rw -v /var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ@tmp:/var/lib/jenkins/workspace/oung34_vagrant-ovirt4_PR-79-7BRKVM5TQ5BGPECFMXYIEOYZOICCET4GY37WXT4D65NSV4F5TADQ@tmp:rw -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** --entrypoint cat myoung34/vagrant:1.9.2

Each of those containers in turn runs:

    + gem build *.gemspec
    + /usr/bin/vagrant plugin install *.gem
    + bundle install --path /opt/gemcache --without development plugins
    + bundle exec kitchen destroy all
    + rm -rf .kitchen
    + sleep \$(shuf -i 0-10 -n 1)   # I did this to see if maybe I could stagger the creates
    + export VAGRANT_VERSION=\$(echo ${vagrantVersion} | sed 's/\\.//g')
    + bundle exec kitchen test ^[^singleton-]

On Tue, Mar 7, 2017 at 11:01 AM, Juan Hernández <jhernand@redhat.com> wrote:
> Is this code running inside the same Ruby process for both virtual machines? In multiple threads?

On 03/07/2017 06:06 PM, Marc Young wrote:
> Via the ruby sdk I create the VM and store the ID it responded with.
> Then to get the IP:
Can you share this ^ code that creates and stores the ID of the virtual machine?

This is where the ID is retrieved and stored: https://github.com/myoung34/vagrant-ovirt4/blob/master/lib/vagrant-ovirt4/action/create_vm.rb#L79

The workflow is: create, wait until the disks are OK, wait until the VM status is "down", create network interfaces[1], start the VM[2], wait until it is up[3].

[1] https://github.com/myoung34/vagrant-ovirt4/blob/master/lib/vagrant-ovirt4/action/create_network_interfaces.rb#L53
[2] https://github.com/myoung34/vagrant-ovirt4/blob/master/lib/vagrant-ovirt4/action/start_vm.rb#L83
[3] https://github.com/myoung34/vagrant-ovirt4/blob/master/lib/vagrant-ovirt4/action/wait_till_up.rb#L43

On Tue, Mar 7, 2017 at 11:34 AM, Juan Hernández <jhernand@redhat.com> wrote:
> Can you share this ^ code that creates and stores the ID of the virtual machine?
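For readers who just want the shape of that sequence, here is a condensed sketch of the create -> wait -> NIC -> start -> wait flow using plain ovirt-engine-sdk-ruby calls. It is not the plugin's actual code; the connection details, the template and vNIC profile names, and the 2-second polling interval are assumptions for illustration.

    require 'ovirtsdk4'

    # Condensed sketch of the workflow described above (not the plugin's code).
    connection = OvirtSDK4::Connection.new(
      url:      'https://engine.example.com/ovirt-engine/api',
      username: 'admin@internal',
      password: 'secret',
      insecure: true
    )
    vms_service = connection.system_service.vms_service

    # 1. Create the VM and store only the ID the engine returns.
    vm = vms_service.add(
      OvirtSDK4::Vm.new(
        name:     'dynamic-1.9.1',
        cluster:  { name: 'Default' },
        template: { name: 'some-template' }
      )
    )
    vm_service = vms_service.vm_service(vm.id)

    # 2. Wait until every disk is OK and the VM reports DOWN.
    loop do
      disks_ok = vm_service.disk_attachments_service.list.all? do |att|
        connection.follow_link(att.disk).status == OvirtSDK4::DiskStatus::OK
      end
      break if disks_ok && vm_service.get.status == OvirtSDK4::VmStatus::DOWN
      sleep 2
    end

    # 3. Create a network interface, then start the VM.
    profile = connection.system_service.vnic_profiles_service.list
                        .detect { |p| p.name == 'ovirtmgmt' }
    vm_service.nics_service.add(
      OvirtSDK4::Nic.new(name: 'nic1', vnic_profile: { id: profile.id })
    )
    vm_service.start

    # 4. Wait until the VM is up before asking for its reported IP addresses.
    sleep 2 until vm_service.get.status == OvirtSDK4::VmStatus::UP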

I just had the "aha" moment, and I'm sorry. I think I'll take a nap.

In Jenkins the local directory is mounted into the container in order to get the code there. This includes the `.kitchen` dir, which has all the state information. So the 2nd run of `kitchen ...` will see that dir and try to use the other run's vagrant state information.

Disregard; it's one of those stupid problems you have when you run things in parallel/isolation and don't question how isolated it actually is.
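For anyone who hits the same thing: one way to keep the two runs from sharing that state is to give each container a private copy of the checkout before running kitchen. This is only a sketch under assumptions (the temp path, the copy step, and reusing VAGRANT_VERSION from the script above); it is not what the job currently does.

    # Sketch: give each parallel run its own copy of the checkout so the two
    # containers never share the same .kitchen state directory.
    set -eu

    workdir="$(mktemp -d "/tmp/kitchen-${VAGRANT_VERSION}.XXXXXX")"
    cp -a . "${workdir}/"
    cd "${workdir}"

    rm -rf .kitchen   # never reuse another run's vagrant/kitchen state
    bundle install --path /opt/gemcache --without development plugins
    bundle exec kitchen test '^[^singleton-]'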
Participants (2):
- Juan Hernández
- Marc Young