[ OST Failure Report ] [ oVirt master ] [ 24/12/2017 ] [use_ovn_provider]

Test failed: [ 098_ovirt_provider_ovn.use_ovn_provider ]

Link to suspected patches:
- Linked test failed on: https://gerrit.ovirt.org/#/c/85703/3
- It seems OVN patches had been failing tests ever since: https://gerrit.ovirt.org/#/c/85645/2

Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4492/

Link to all logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4492/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-098_ovirt_provider_ovn.py/

Error snippet from log:

<error>
Fault reason is "Operation Failed". Fault detail is "Failed to communicate with the external provider, see log for additional details.". HTTP response code is 400.

-------------------- >> begin captured logging << --------------------
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
py.warnings: WARNING: * Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
requests.packages.urllib3.connectionpool: DEBUG: "POST /v2.0/tokens/ HTTP/1.1" 200 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "GET /v2.0/networks/ HTTP/1.1" 200 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "GET /v2.0/ports/ HTTP/1.1" 200 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "GET /v2.0/subnets/ HTTP/1.1" 200 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "POST /v2.0/networks/ HTTP/1.1" 201 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "POST /v2.0/subnets/ HTTP/1.1" 201 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "POST /v2.0/ports/ HTTP/1.1" 201 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "GET /v2.0/networks/ HTTP/1.1" 200 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "GET /v2.0/ports/ HTTP/1.1" 200 None
requests.packages.urllib3.connectionpool: INFO: * Starting new HTTPS connection (1): 192.168.201.4
requests.packages.urllib3.connectionpool: DEBUG: "GET /v2.0/subnets/ HTTP/1.1" 200 None
--------------------- >> end captured logging << ---------------------
</error>

--
Barak Korren
RHV DevOps team, RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted

Adding ovn maintainers.

On Sun, Dec 24, 2017 at 9:25 AM, Barak Korren <bkorren@redhat.com> wrote:
Test failed: [ 098_ovirt_provider_ovn.use_ovn_provider ]
--
Eyal Edri
Manager, RHV DevOps
EMEA Virtualization R&D
Red Hat EMEA

A helpful hint is in
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4492/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-098_ovirt_provider_ovn.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log :

Caused by: org.jboss.resteasy.spi.ReaderException: org.codehaus.jackson.map.JsonMappingException: Can not construct instance of java.util.Calendar from String value '2017-12-27 13:19:51Z': not a valid representation (error: Can not parse date "2017-12-27 13:19:51Z": not compatible with any of standard forms ("yyyy-MM-dd'T'HH:mm:ss.SSSZ", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", "EEE, dd MMM yyyy HH:mm:ss zzz", "yyyy-MM-dd"))
 at [Source: org.jboss.resteasy.client.core.BaseClientResponse$InputStreamWrapper@72c184c5; line: 1, column: 23] (through reference chain: com.woorea.openstack.keystone.model.Access["token"]->com.woorea.openstack.keystone.model.Token["expires"])

This problem was introduced by https://gerrit.ovirt.org/#/c/85702/
I created a fix: https://gerrit.ovirt.org/85734
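For reference, the rejected value '2017-12-27 13:19:51Z' lacks the 'T' separator and the millisecond field that the accepted forms expect. A minimal sketch of emitting a token expiry in the "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" form (illustrative only, assuming the provider builds the expiry from a Python datetime; not necessarily what the actual fix does):

    from datetime import datetime, timedelta

    def token_expiry(hours=24):
        # Produces e.g. '2017-12-27T13:19:51.000Z', which matches Jackson's
        # "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" form, unlike the rejected
        # space-separated value without milliseconds.
        expires = datetime.utcnow() + timedelta(hours=hours)
        return expires.strftime('%Y-%m-%dT%H:%M:%S.%f')[:-3] + 'Z'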

On Mon, Dec 25, 2017 at 2:09 PM, Dominik Holler <dholler@redhat.com> wrote:
This problem was introduced by https://gerrit.ovirt.org/#/c/85702/
I created a fix: https://gerrit.ovirt.org/85734
Thanks for the quick fix. Is the new format acceptable to other users of the keystone-like API (such as the neutron cli)?

On Mon, 25 Dec 2017 14:14:36 +0200 Dan Kenigsberg <danken@redhat.com> wrote:
Thanks for the quick fix.
Is the new format acceptable to other users of the keystone-like API (such as the neutron cli)?
Yes, I verified ovirt-engine via webadmin, and neutron CLI and ansible on command line:

[user@fedora-25-gui ovirt-system-tests]$ cat createNetwok.yml
---
- hosts: localhost
  tasks:
    - os_network:
        auth:
          auth_url: http://0.0.0.0:35357/v2.0
          username: admin@internal
          password: 123456
        state: present
        name: myNewAnsibleNet

[user@fedora-25-gui ovirt-system-tests]$ ansible-playbook createNetwok.yml
 [WARNING]: Could not match supplied host pattern, ignoring: all
 [WARNING]: provided hosts list is empty, only localhost is available

PLAY [localhost] ***************************************************************
TASK [Gathering Facts] *********************************************************
ok: [localhost]
TASK [os_network] **************************************************************
changed: [localhost]
PLAY RECAP *********************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0

[user@fedora-25-gui ovirt-system-tests]$ ansible-playbook createNetwok.yml
 [WARNING]: Could not match supplied host pattern, ignoring: all
 [WARNING]: provided hosts list is empty, only localhost is available

PLAY [localhost] ***************************************************************
TASK [Gathering Facts] *********************************************************
ok: [localhost]
TASK [os_network] **************************************************************
ok: [localhost]
PLAY RECAP *********************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0

[user@fedora-25-gui ovirt-system-tests]$ OS_USERNAME=admin@internal OS_PASSWORD=123456 OS_AUTH_URL=http://0.0.0.0:35357/v2.0 neutron net-list
Failed to discover available identity versions when contacting http://0.0.0.0:35357/v2.0. Attempting to parse version from URL.
+--------------------------------------+-----------------+
| id                                   | name            |
+--------------------------------------+-----------------+
| 97b653b0-623e-4b5d-a7a0-e05c6d95fdf2 | ansibleNet2     |
| e1f36f9b-bfb2-4779-880f-d8b8f8d9c64a | myNewAnsibleNet |
| 31172fec-1d6e-42eb-acb4-ab5bf77a1296 | osnet           |
| 05e680b8-544a-4278-9ac0-403fb5e83af2 | test.json       |
| 60f74925-adb9-4ae2-9751-2a3f1315bd2e | net877          |
| 18687d84-0923-4e1a-b349-4030c6f9c11e | net111          |
| c20b5484-dde1-4729-bae4-5f073c3e14ef | net1114         |
| ddd9741b-6874-4075-abba-615fb1777b62 | ansibleNet      |
| 2b913120-260f-4750-9fc2-c0e44f3d51e9 | net11149        |
| a3db332f-5b2b-478c-a90e-73ee5fbee3ce | net412          |
+--------------------------------------+-----------------+

It seems the fix patch itself failed as well:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4539/

The failed test is: 006_migrations.prepare_migration_attachments_ipv6
It seems engine has lost the ability to talk to the host.

Logs are here:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4539/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/

--
Barak Korren

The error seems unrelated to the patch, since - as you say - the error is about host networking, long before OVN got involved.

I see that
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4539/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/lago-basic-suite-master-host-0/_var_log/vdsm/supervdsm.log/*view*/
has a worrying failure to ifup a bridge, which might be more related:

ifup/oncae8adf7ba944::ERROR::2017-12-27 05:03:06,001::concurrent::201::root::(run) FINISH thread <Thread(ifup/oncae8adf7ba944, started daemon 140108891563776)> failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 925, in _exec_ifup
    _exec_ifup_by_name(iface.name, cgroup)
  File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 911, in _exec_ifup_by_name
    raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '')
ConfigNetworkError: (29, '\n')

On Wed, Dec 27, 2017 at 12:53 PM, Dan Kenigsberg <danken@redhat.com> wrote:
I see that
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4539/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/lago-basic-suite-master-host-0/_var_log/vdsm/supervdsm.log/*view*/
has a worrying failure to ifup a bridge, which might be more related.
Does not seem to be a problem; it is an outcome of an ancient command on a device that does not exist anymore. We are not cleaning up all the spawned threads at the end of a transaction, so we see such things from time to time.
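As a toy illustration of that pattern (a sketch only, not vdsm code): an ifup worker that nobody joins can outlive the transaction and fail against a device that was removed in the meantime, producing exactly this kind of late, harmless error:

    import threading
    import time

    devices = {'oncae8adf7ba944'}  # the device exists when the worker is spawned

    def ifup(name):
        time.sleep(0.1)            # the transaction has already finished by now
        if name not in devices:    # ...but the device has been removed
            raise RuntimeError('failed to ifup %s' % name)

    worker = threading.Thread(target=ifup, args=('oncae8adf7ba944',))
    worker.daemon = True
    worker.start()                 # the thread is never joined or cancelled
    devices.clear()                # the device goes away; a late error follows
    time.sleep(0.2)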

The failed test is: 006_migrations.prepare_migration_attachments_ipv6

Any updates on this failure?
Y.

On Thu, Dec 28, 2017 at 5:24 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
Any updates on this failure?
There's a versioning mess in ovirt-provider-ovn, which might cause the fix not to be tested. Fixing the master branch with https://gerrit.ovirt.org/#/c/85797 (and possibly other patches) is our utmost task for today.

Dan.

I was able to run OST's successfully with: https://gerrit.ovirt.org/#/c/85797
(http://jenkins.ovirt.org/job/ovirt-provider-ovn_master_build-artifacts-el7-x...)
both locally and on jenkins:
http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_...

Yet http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4559/ (which is the gating job for https://gerrit.ovirt.org/#/c/85797/2) still fails. Could you look into why, Marcin?

The failure seems unrelated to ovn, as it is about a *host* losing connectivity. But it reproduces too much, so we need to get to the bottom of it.

On Thu, Dec 28, 2017 at 7:22 PM, Marcin Mirecki <mmirecki@redhat.com> wrote:
I was able to run OST's successfully with: https://gerrit.ovirt.org/#/c/85797 (http://jenkins.ovirt.org/job/ovirt-provider-ovn_master_build-artifacts-el7-x...)
both locally and on jenkins: http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_...

On 28 December 2017 at 20:02, Dan Kenigsberg <danken@redhat.com> wrote:
Yet http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4559/ (which is the gating job for https://gerrit.ovirt.org/#/c/85797/2) still fails. Could you look into why, Marcin? The failure seems unrelated to ovn, as it is about a *host* losing connectivity. But it reproduces too much, so we need to get to the bottom of it.
Re-sending the change through the gate yielded a different error:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4563/

If this is still unrelated, we need to think seriously about what is raising this large amount of unrelated failures. We cannot do any accurate reporting when failures are sporadic.

--
Barak Korren

And here is yet another host connectivity issue failing a test for a change that should have no effect whatsoever (it's a tox patch for vdsm):
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4565/

--
Barak Korren

I've added a fair number of changes this week. I doubt they are related, but the one that stands out is the addition of a fence-agent to one of the hosts. https://gerrit.ovirt.org/#/c/85817/ disables this specific test, just in case. I don't think it causes an issue, but looking at the git log it's the only one I can suspect.

Y.

Trying to rebuild Barak's build resulted in another fail:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4571/
(with the same problem as Dan's build)

Engine log contains a few of "IOException: Broken pipe" which seem to correspond to a vdsm restart: "[vds] Exiting (vdsmd:170)"
Yet looking at my local successful run, I see the same issues in the log. I don't see any other obvious reasons for the problem so far.

Top posting is evil.

On Fri, Dec 29, 2017 at 1:00 PM, Marcin Mirecki <mmirecki@redhat.com> wrote:
Trying to rebuild Barak's build resulted in another fail: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4571/ (with the same problem as Dan's build)
Engine log contains a few of "IOException: Broken pipe" which seem to correspond to a vdsm restart: "[vds] Exiting (vdsmd:170)" yet looking at my local successful run, I see the same issues in the log. I don't see any other obvious reasons for the problem so far.
This actually points back to ykaul's fencing patch. And indeed,
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4571/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log
has

2017-12-29 05:26:07,712-05 DEBUG [org.ovirt.engine.core.uutils.ssh.SSHClient] (EE-ManagedThreadFactory-engine-Thread-417) [1a4f9963] Executed: '/usr/bin/vdsm-tool service-restart vdsmd'

which means that Engine decided that it wants to kill vdsm. There are multiple communication errors prior to the soft fencing, but maybe waiting a bit longer would have kept the host alive.
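To see which communication errors preceded the soft fencing, a minimal sketch for pulling the surrounding context out of engine.log (the marker string is the line quoted above; the file path and the size of the window are arbitrary choices):

    import sys

    MARKER = "Executed: '/usr/bin/vdsm-tool service-restart vdsmd'"

    def context_before(path, window=40):
        # Print the log lines leading up to each soft-fence restart, to see
        # which errors the engine logged before giving up on the host.
        with open(path) as log:
            lines = log.readlines()
        for i, line in enumerate(lines):
            if MARKER in line:
                sys.stdout.writelines(lines[max(0, i - window):i + 1])
                print('-' * 60)

    if __name__ == '__main__':
        context_before(sys.argv[1])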

On Fri, Dec 29, 2017 at 2:21 PM, Dan Kenigsberg <danken@redhat.com> wrote:
2017-12-29 05:26:07,712-05 DEBUG [org.ovirt.engine.core.uutils.ssh.SSHClient] (EE-ManagedThreadFactory-engine-Thread-417) [1a4f9963] Executed: '/usr/bin/vdsm-tool service-restart vdsmd'
which means that Engine decided that it wants to kill vdsm. There are multiple communication errors prior to the soft fencing, but maybe waiting a bit longer would have kept the host alive.
Note that there's a test called vdsm recovery, where we actually stop and start VDSM - perhaps it's there?

Anyway, I disabled the test that adds fencing. I don't think this is the cause, but let's see.
Y.
participants (7)
- Barak Korren
- Dan Kenigsberg
- Dominik Holler
- Edward Haas
- Eyal Edri
- Marcin Mirecki
- Yaniv Kaul