OST Failure - Weekly update [17/03/2018-23/03/2018]

Hello,

I would like to give an update on this week's failures and the current OST status.

On 19-03-2018, the CI team reported 3 different failures.

On the master branch, the failed changes reported were:
  core: fix removal of vm-host device - https://gerrit.ovirt.org/#/c/89145/
  core: USB in osinfo configuration depends on chipset - https://gerrit.ovirt.org/#/c/88777/

On the 4.2 branch, the reported change was:
  core: Call endAction() of all child commands in ImportVmCommand - https://gerrit.ovirt.org/#/c/89165/

The fixes for the regressions were merged the following day (20-03-2018):
  https://gerrit.ovirt.org/#/c/89250/ - core: Replace generic unlockVm() logic in ImportVmCommand
  https://gerrit.ovirt.org/#/c/89187/ - core: Fix NPE when creating an instance type

On 20-03-2018, the CI team discovered an issue in the job's cleanup: a failure in docker cleanup caused random failures when testing changes. There is an open Jira on the issue: https://ovirt-jira.atlassian.net/browse/OVIRT-1939

Below you can see the chart for this week's resolved issues by cause of failure:
  Code = regression of working components/functionalities
  Configurations = package related issues
  Other = failed build artifacts
  Infra = infrastructure/OST/Lago related issues

Below is a chart of resolved failures based on oVirt version.

Below is a chart showing failures by suite type.

Thank you,
Dafna

Hi,

Is there an ongoing engine master OST failure blocking?

[ INFO ] Stage: Misc configuration
[ INFO ] Stage: Package installation
[ INFO ] Stage: Misc configuration
[ ERROR ] Failed to execute stage 'Misc configuration': Failed to start service 'openvswitch'
[ INFO ] Yum Performing yum transaction rollback

These are unrelated code changes:
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4644/
https://gerrit.ovirt.org/#/c/89347/
and
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4647/
https://gerrit.ovirt.org/67166

But they both die in 001, with exactly 1.24MB in the log and 'Failed to start service openvswitch':
001_initialize_engine.py.junit.xml 1.24 MB

Full file: http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4644/artifact/exported-artifacts/basic-suite-master__logs/001_initialize_engine.py.junit.xml
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- GREG SHEREMETA SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX Red Hat NA <https://www.redhat.com/> gshereme@redhat.com IRC: gshereme <https://red.ht/sig>

The performance suite fails with the same openvswitch error.
http://jenkins.ovirt.org/job/ovirt-system-tests_performance-suite-master/152/

<snip>
-----------------
Failed Tests:
-----------------
1 tests failed.
FAILED: 001_initialize_engine.initialize_engine
-- GREG SHEREMETA SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX Red Hat NA <https://www.redhat.com/> gshereme@redhat.com IRC: gshereme <https://red.ht/sig>

+ Network team.
I'm not sure if we've moved to ovs 2.9 already?
Y.
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

The basic suite failed for me too.

/var/log/messages has [1]:

Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job openvswitch.service/start failed with result 'dependency'.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch Forwarding Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job ovs-vswitchd.service/start failed with result 'dependency'.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 14.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Created slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: New session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Started Session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 2 (TCP-like)
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: sctp: Hash tables configured (bind 256/256)
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 17.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1

[1] http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4651/artifact/exported-artifacts/basic-suite-master__logs/test_logs/basic-suite-master/post-001_initialize_engine.py/lago-basic-suite-master-engine/_var_log/messages/*view*/
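For anyone triaging a similar /var/log/messages dump, here is a minimal Python sketch that counts which units systemd reported as "Failed to start". The regular expression and the helper name failed_units are my own assumptions about the syslog line layout seen in this thread, not part of OST or Lago, and the sample holds only a few lines copied from the log above.

```python
import re
from collections import Counter

# Assumed layout: "<Mon DD HH:MM:SS> <host> <tag>[pid]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[\d+\])?: "
    r"(?P<msg>.*)$"
)

PREFIX = "Failed to start "


def failed_units(lines):
    """Count 'Failed to start <unit>.' messages logged by systemd."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group("tag") != "systemd":
            continue
        msg = match.group("msg")
        if msg.startswith(PREFIX):
            counts[msg[len(PREFIX):].rstrip(".")] += 1
    return counts


# A few lines copied from the log quoted in this thread.
SAMPLE = [
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...",
    "Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error",
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.",
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch.",
]

if __name__ == "__main__":
    for unit, count in failed_units(SAMPLE).items():
        print(f"{count}x {unit}")
```

Run against the full file, this should make the restart loop of ovsdb-server easy to spot without reading every line.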
-- Didi

Talked with danken, he asked to check if it's a SELinux issue.

It is. The audit log has:

type=AVC msg=audit(1521967325.146:675): avc: denied { create } for pid=3787 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.146:675): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffc4e12b930 items=0 ppid=3786 pid=3787 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.146:675): proctitle=72756E75736572002D2D75736572006F70656E76737769746368002D2D006F767364622D746F6F6C002D76636F6E736F6C653A6F666600736368656D612D76657273696F6E002F7573722F73686172652F6F70656E767377697463682F767377697463682E6F7673736368656D61
type=AVC msg=audit(1521967325.150:676): avc: denied { create } for pid=3789 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.150:676): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffe03060130 items=0 ppid=3755 pid=3789 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.150:676):

http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...

And it's 2.9:

Mar 25 04:38:39 lago-basic-suite-master-engine yum[1183]: Installed: 1:python2-openvswitch-2.9.0-3.el7.noarch
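To make the audit records above easier to read, here is a small Python sketch that pulls the key fields out of an AVC denial and decodes the hex-encoded proctitle (the command line, with arguments separated by NUL bytes). The helper names parse_avc and decode_proctitle are hypothetical, and the regular expression only assumes the field order seen in the records quoted here.

```python
import re

# Field order as it appears in the AVC records quoted in this thread.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+"
    r"pid=(?P<pid>\d+)\s+"
    r'comm="(?P<comm>[^"]+)"\s+'
    r"scontext=(?P<scontext>\S+)\s+"
    r"tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def parse_avc(record):
    """Return the main fields of an AVC denial, or None if it does not match."""
    match = AVC_RE.search(record)
    if match is None:
        return None
    fields = match.groupdict()
    fields["perms"] = fields["perms"].split()
    return fields


def decode_proctitle(hex_value):
    """Decode an audit proctitle value into its argument list."""
    raw = bytes.fromhex(hex_value)
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\x00")]


AVC_RECORD = (
    'type=AVC msg=audit(1521967325.146:675): avc: denied { create } for '
    'pid=3787 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 '
    'tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket'
)

# The proctitle value from the first PROCTITLE record above, split at the
# NUL separators only to keep the lines short.
PROCTITLE_HEX = (
    "72756E75736572" "00"
    "2D2D75736572" "00"
    "6F70656E76737769746368" "00"
    "2D2D" "00"
    "6F767364622D746F6F6C" "00"
    "2D76636F6E736F6C653A6F6666" "00"
    "736368656D612D76657273696F6E" "00"
    "2F7573722F73686172652F6F70656E767377697463682F"
    "767377697463682E6F7673736368656D61"
)

if __name__ == "__main__":
    print(parse_avc(AVC_RECORD))
    print(" ".join(decode_proctitle(PROCTITLE_HEX)))
```

Decoded, the proctitle reads runuser --user openvswitch -- ovsdb-tool -vconsole:off schema-version /usr/share/openvswitch/vswitch.ovsschema, which matches the "runuser: System error" lines from ovs-ctl in /var/log/messages.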
-- Didi

Which version of selinux-policy do we have on the Engine image?

Bug 1538936 <https://bugzilla.redhat.com/show_bug.cgi?id=1538936> - Open vSwitch selinux policy needs updating [rhel-7.4.z]
was fixed in selinux-policy-3.13.1-166.el7_4.9, which is available in
http://mirror.centos.org/centos-7/7/updates/x86_64/Packages/selinux-policy-t...
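As a rough way to answer that question once the installed version is known (for example from `rpm -q selinux-policy` on the image), here is a Python sketch that compares a version-release string against the fixed build. It is a simplified approximation of RPM's segment-wise comparison and ignores epochs and the special handling of `~` and `^`; the function names are hypothetical and the installed versions in the usage lines are made-up examples, not values taken from the Engine image.

```python
import re

FIXED_BUILD = "3.13.1-166.el7_4.9"  # from Bug 1538936, as quoted above


def _segments(label):
    # Runs of digits or ASCII letters; everything else acts as a separator.
    return re.findall(r"[0-9]+|[A-Za-z]+", label)


def compare_labels(a, b):
    """Return 1, 0 or -1 if label a is newer than, equal to or older than b."""
    seg_a, seg_b = _segments(a), _segments(b)
    for x, y in zip(seg_a, seg_b):
        if x.isdigit() and y.isdigit():
            if int(x) != int(y):
                return 1 if int(x) > int(y) else -1
        elif x.isdigit():
            return 1   # a numeric segment sorts newer than an alphabetic one
        elif y.isdigit():
            return -1
        elif x != y:
            return 1 if x > y else -1
    # All shared segments are equal: the label with extra segments is newer.
    return (len(seg_a) > len(seg_b)) - (len(seg_a) < len(seg_b))


def has_fix(installed, fixed=FIXED_BUILD):
    """True if 'version-release' string installed is at least the fixed build."""
    inst_ver, inst_rel = installed.split("-", 1)
    fix_ver, fix_rel = fixed.split("-", 1)
    return (compare_labels(inst_ver, fix_ver)
            or compare_labels(inst_rel, fix_rel)) >= 0


if __name__ == "__main__":
    # Hypothetical installed versions, for illustration only.
    for candidate in ("3.13.1-166.el7_4.7", "3.13.1-166.el7_4.9"):
        print(candidate, "has fix:", has_fix(candidate))
```

For an authoritative answer, RPM's own comparison on the image itself is the better source; this sketch is only meant for quickly scanning package lists in job logs.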
On Sun, Mar 25, 2018 at 12:04 PM, Yedidyah Bar David <didi@redhat.com> wrote:
basic suite failed for me too.
/var/log/messages has[1]:
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job openvswitch.service/start failed with result 'dependency'.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch Forwarding Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job ovs-vswitchd.service/start failed with result 'dependency'.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 14.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Created slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: New session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Started Session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 2 (TCP-like)
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: sctp: Hash tables configured (bind 256/256)
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 17.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
[1] http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4651/artifact/exported-artifacts/basic-suite-master__logs/test_logs/basic-suite-master/post-001_initialize_engine.py/lago-basic-suite-master-engine/_var_log/messages/*view*/
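For anyone who wants to skim a long /var/log/messages like the one above, here is a rough Python sketch that counts the systemd "Failed to start" and "Dependency failed for" messages per unit. The helper name summarize_failures and the regular expression are my own assumptions, matched only against the lines quoted in this thread, not a general syslog parser.

```python
import re
from collections import Counter

# Matches only the two systemd failure messages that appear in the excerpt above.
FAILURE_RE = re.compile(r"systemd: (Failed to start|Dependency failed for) (.+?)\.$")


def summarize_failures(lines):
    """Count (message kind, unit description) pairs in syslog-style lines."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines copied from the log excerpt in this thread.
SAMPLE = [
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.",
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch.",
    "Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]",
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.",
]

if __name__ == "__main__":
    for (kind, unit), n in summarize_failures(SAMPLE).most_common():
        print(f"{n}x {kind}: {unit}")
```

On the full log this would make it easier to see that ovsdb-server is the unit that fails first and that openvswitch only fails as a dependency.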
Talked with danken, who asked me to check whether it's an SELinux issue. It is. The audit log has:
type=AVC msg=audit(1521967325.146:675): avc: denied { create } for pid=3787 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.146:675): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffc4e12b930 items=0 ppid=3786 pid=3787 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.146:675): proctitle=72756E75736572002D2D75736572006F70656E76737769746368002D2D006F767364622D746F6F6C002D76636F6E736F6C653A6F666600736368656D612D76657273696F6E002F7573722F73686172652F6F70656E767377697463682F767377697463682E6F7673736368656D61
type=AVC msg=audit(1521967325.150:676): avc: denied { create } for pid=3789 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.150:676): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffe03060130 items=0 ppid=3755 pid=3789 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.150:676):
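The proctitle field in the audit records above is the hex-encoded command line with NUL-separated arguments, so it can be decoded to see which command was denied. A small Python sketch follows; the helper names decode_proctitle and parse_avc are made up for illustration, and the AVC regular expressions are only checked against the record quoted here.

```python
import re

# Copied verbatim from the first AVC / PROCTITLE records quoted above.
AVC_LINE = (
    'type=AVC msg=audit(1521967325.146:675): avc: denied { create } for pid=3787 '
    'comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 '
    'tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket'
)
PROCTITLE_HEX = (
    "72756E75736572002D2D75736572006F70656E76737769746368002D2D00"
    "6F767364622D746F6F6C002D76636F6E736F6C653A6F666600"
    "736368656D612D76657273696F6E00"
    "2F7573722F73686172652F6F70656E767377697463682F767377697463682E6F7673736368656D61"
)


def decode_proctitle(hex_value):
    """Turn a hex-encoded, NUL-separated proctitle into a list of arguments."""
    raw = bytes.fromhex(hex_value)
    return [arg.decode("utf-8", errors="replace") for arg in raw.split(b"\x00")]


def parse_avc(line):
    """Extract the denied permissions and the key=value fields of an AVC record."""
    perms = re.search(r"avc:\s+denied\s+\{\s*([^}]*?)\s*\}", line)
    fields = {k: v.strip('"') for k, v in re.findall(r'(\w+)=("[^"]*"|\S+)', line)}
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


if __name__ == "__main__":
    print(" ".join(decode_proctitle(PROCTITLE_HEX)))
    avc = parse_avc(AVC_LINE)
    print(avc["comm"], avc["permissions"], avc["tclass"])
```

Decoded, the denied command is runuser switching to the openvswitch user to run ovsdb-tool against /usr/share/openvswitch/vswitch.ovsschema, which fits the "runuser: System error" lines in /var/log/messages.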
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
And it's 2.9:
Mar 25 04:38:39 lago-basic-suite-master-engine yum[1183]: Installed: 1:python2-openvswitch-2.9.0-3.el7.noarch
On Sun, Mar 25, 2018 at 10:07 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
+ Network team. I'm not sure if we've moved to ovs 2.9 already? Y.
On Sat, Mar 24, 2018 at 8:19 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
Hi,
Is there an ongoing engine master OST failure blocking?
[ INFO ] Stage: Misc configuration
[ INFO ] Stage: Package installation
[ INFO ] Stage: Misc configuration
[ ERROR ] Failed to execute stage 'Misc configuration': Failed to start service 'openvswitch'
[ INFO ] Yum Performing yum transaction rollback
These are unrelated code changes:
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4644/ https://gerrit.ovirt.org/#/c/89347/
and http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4647/ https://gerrit.ovirt.org/67166
But they both die in 001, with exactly 1.24MB in the log and 'Failed to start service openvswitch'
001_initialize_engine.py.junit.xml 1.24 MB
Full file: http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4644/artifact/exported-artifacts/basic-suite-master__logs/001_initialize_engine.py.junit.xml
On Fri, Mar 23, 2018 at 12:14 PM, Dafna Ron <dron@redhat.com> wrote:
Hello,
I would like to give an update on this week's failures and the current OST status.
On 19-03-2018, the CI team reported 3 different failures.
On the master branch, the reported failed changes were:
core: fix removal of vm-host device - https://gerrit.ovirt.org/#/c/89145/
core: USB in osinfo configuration depends on chipset - https://gerrit.ovirt.org/#/c/88777/
On the 4.2 branch, the reported change was:
core: Call endAction() of all child commands in ImportVmCommand - https://gerrit.ovirt.org/#/c/89165/
The fixes for the regressions were merged the following day (20-03-2018):
https://gerrit.ovirt.org/#/c/89250/ - core: Replace generic unlockVm() logic in ImportVmCommand
https://gerrit.ovirt.org/#/c/89187/ - core: Fix NPE when creating an instance type
On 20-03-2018, the CI team discovered an issue in the job's cleanup that caused random failures when testing changes, due to a failure in docker cleanup. There is an open Jira ticket for the issue: https://ovirt-jira.atlassian.net/browse/OVIRT-1939
Below you can see the chart for this week's resolved issues by cause of failure:
Code = regression of working components/functionalities
Configurations = package related issues
Other = failed build artifacts
Infra = infrastructure/OST/Lago related issues
Below is a chart of resolved failures based on ovirt version
Below is a chart showing failures by suite type:
Thank you,
Dafna
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
--
GREG SHEREMETA
SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
Red Hat NA
gshereme@redhat.com IRC: gshereme <https://red.ht/sig>
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Didi

Currently: selinux-policy-3.13.1-166.el7_4.4.noarch
Updating selinux-policy on the engine gets me past 001, and then 002 dies: http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
15:05:45 # initialize_engine: rpm -qa:
15:05:46 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.4.noarch\n', err='')
15:06:13 <snip> Package selinux-policy.noarch 0:3.13.1-166.el7_4.9 will be an update <snip> Updated: selinux-policy.noarch 0:3.13.1-166.el7_4.9 rpm -qa:
15:06:13 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.9.noarch\n', err='')
But later in 002:
15:08:47 RuntimeError: 1 hosts failed installation:
15:08:47 lago-basic-suite-master-host-0: install_failed
Perhaps selinux-policy needs to be updated on the hosts too? Not my area of expertise :)
Greg
On Sun, Mar 25, 2018 at 7:22 AM, Dan Kenigsberg <danken@redhat.com> wrote:
Which version of selinux-policy do we have on the Engine image?
Bug 1538936 <https://bugzilla.redhat.com/show_bug.cgi?id=1538936> - Open vSwitch selinux policy needs updating [rhel-7.4.z]
was fixed in selinux-policy-3.13.1-166.el7_4.9, which is available at http://mirror.centos.org/centos-7/7/updates/x86_64/Packages/selinux-policy-targeted-3.13.1-166.el7_4.9.noarch.rpm
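To make the version check concrete, here is a simplified Python sketch that compares an installed selinux-policy version-release string against the one carrying the fix. It is only an approximation of RPM's version ordering (it ignores epochs, tildes and carets), and compare_versions and needs_update are hypothetical helper names, so treat it as an illustration rather than a replacement for rpm's own comparison.

```python
import re


def split_segments(version):
    """Split a version-release string into runs of digits and runs of letters."""
    return re.findall(r"\d+|[a-zA-Z]+", version)


def compare_versions(a, b):
    """Return -1, 0 or 1 if a is older than, equal to, or newer than b.

    Simplified approximation of RPM ordering: numeric segments compare as
    integers, alphabetic segments compare as strings, and a numeric segment
    counts as newer than an alphabetic one.
    """
    seg_a, seg_b = split_segments(a), split_segments(b)
    for x, y in zip(seg_a, seg_b):
        if x.isdigit() and y.isdigit():
            if int(x) != int(y):
                return -1 if int(x) < int(y) else 1
        elif x.isdigit() != y.isdigit():
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    # All shared segments are equal: the string with more segments is newer.
    return (len(seg_a) > len(seg_b)) - (len(seg_a) < len(seg_b))


def needs_update(installed, fixed_in):
    """True if the installed version-release is older than the fixed one."""
    return compare_versions(installed, fixed_in) < 0


if __name__ == "__main__":
    # Version-release strings taken from this thread.
    print(needs_update("3.13.1-166.el7_4.4", "3.13.1-166.el7_4.9"))
```

With the el7_4.4 build reported for the engine and the el7_4.9 build named in the bug, this reports that an update is needed.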
--
GREG SHEREMETA
SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
Red Hat NA <https://www.redhat.com/>
gshereme@redhat.com IRC: gshereme <https://red.ht/sig>

Indeed, updating selinux-policy on both engine and hosts passes.
Change If671d938: [test do not merge] test selinux-policy update on engine and hosts
https://gerrit.ovirt.org/#/c/89427/
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
So I guess the images need updates.
On Sun, Mar 25, 2018 at 11:20 AM, Greg Sheremeta <gshereme@redhat.com> wrote:
currently selinux-policy-3.13.1-166.el7_4.4.noarch
updating selinux-policy on engine gets me past 001, and then 002 dies: http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4663/console
15:05:45 # initialize_engine: rpm -qa:
15:05:46 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.4.noarch\n', err='')
15:06:13 <snip> Package selinux-policy.noarch 0:3.13.1-166.el7_4.9 will be an update <snip> Updated: selinux-policy.noarch 0:3.13.1-166.el7_4.9 rpm -qa:
15:06:13 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.9.noarch\n', err='')
But later in 002
15:08:47 RuntimeError: 1 hosts failed installation:
15:08:47 lago-basic-suite-master-host-0: install_failed
Perhaps selinux-policy needs to be updated on the hosts too? Not my area of expertise :)
Greg
On Sun, Mar 25, 2018 at 7:22 AM, Dan Kenigsberg <danken@redhat.com> wrote:
Which version of selinux-policy do we have on the Engine image?
*Bug 1538936* <https://bugzilla.redhat.com/show_bug.cgi?id=1538936> - Open vSwitch selinux policy needs updating [rhel-7.4.z]
was fixed in selinux-policy-3.13.1-166.el7_4.9 which is available in http://mirror.centos.org/centos-7/7/updates/x86_64/Packages/ selinux-policy-targeted-3.13.1-166.el7_4.9.noarch.rpm
On Sun, Mar 25, 2018 at 12:32 PM, Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Mar 25, 2018 at 12:04 PM, Yedidyah Bar David <didi@redhat.com> wrote:
basic suite failed for me too.
/var/log/messages has[1]:
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit... Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning). Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED] Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1 Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job openvswitch.service/start failed with result 'dependency'. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch Forwarding Unit. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job ovs-vswitchd.service/start failed with result 'dependency'. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit... Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 14. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root. 
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning). Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED] Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1 Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit... Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning). Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED] Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1 Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed. 
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Created slice User Slice of root. Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: New session 17 of user root. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting User Slice of root. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Started Session 17 of user root. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Session 17 of user root. Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 2 (TCP-like) Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 3 (TCP-Friendly Rate Control) Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit... Mar 25 04:42:05 lago-basic-suite-master-engine kernel: sctp: Hash tables configured (bind 256/256) Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 17. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root. Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root. Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning). Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED] Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
[1] http://jenkins.ovirt.org/job/ovirt-system-tests_master_check -patch-el7-x86_64/4651/artifact/exported-artifacts/basic-sui te-master__logs/test_logs/basic-suite-master/post-001_initia lize_engine.py/lago-basic-suite-master-engine/_var_log/messages/*view*/
Talked with danken, he asked to check if it's an selinux issue. It is. audit lot has:
type=AVC msg=audit(1521967325.146:675): avc: denied { create } for pid=3787 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket type=SYSCALL msg=audit(1521967325.146:675): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffc4e12b930 items=0 ppid=3786 pid=3787 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null) type=PROCTITLE msg=audit(1521967325.146:675): proctitle=72756E75736572002D2D75736572006F70656E76737769746368002D2D006F767364622D746F6F6C002D76636F6E736F6C653A6F666600736368656D612D76657273696F6E002F7573722F73686172652F6F70656E767377697463682F767377697463682E6F7673736368656D61 type=AVC msg=audit(1521967325.150:676): avc: denied { create } for pid=3789 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket type=SYSCALL msg=audit(1521967325.150:676): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffe03060130 items=0 ppid=3755 pid=3789 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null) type=PROCTITLE msg=audit(1521967325.150:676):
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
And it's 2.9:
Mar 25 04:38:39 lago-basic-suite-master-engine yum[1183]: Installed: 1:python2-openvswitch-2.9.0-3.el7.noarch
On Sun, Mar 25, 2018 at 10:07 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
+ Network team. I'm not sure if we've moved to ovs 2.9 already? Y.
On Sat, Mar 24, 2018 at 8:19 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
Hi,
Is there an ongoing engine master OST failure blocking?
[ INFO ] Stage: Misc configuration [ INFO ] Stage: Package installation [ INFO ] Stage: Misc configuration [ ERROR ] Failed to execute stage \'Misc configuration\': Failed to start service \'openvswitch\' [ INFO ] Yum Performing yum transaction rollback
These are unrelated code changes:
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check -patch-el7-x86_64/4644/ https://gerrit.ovirt.org/#/c/89347/
and http://jenkins.ovirt.org/job/ovirt-system-tests_master_check -patch-el7-x86_64/4647/ https://gerrit.ovirt.org/67166
But they both die in 001, with exactly 1.24MB in the log and 'Failed to start service openvswitch' 001_initialize_engine.py.junit.xml 1.24 MB
Full file: http://jenkins.ovirt.org/job/ovirt-system-tests_master _check-patch-el7-x86_64/4644/artifact/exported-artifacts/bas ic-suite-master__logs/001_initialize_engine.py.junit.xml
--
GREG SHEREMETA
SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
Red Hat NA
gshereme@redhat.com IRC: gshereme <https://red.ht/sig>
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- Didi

Just wondering, if the bug requires an updated selinux-policy, shouldn't the ovs package enforce it in the spec file?
On Mar 25, 2018 20:12, "Greg Sheremeta" <gshereme@redhat.com> wrote:
Indeed, updating selinux-policy on both engine and hosts passes.
Change If671d938: [test do not merge] test selinux-policy update on engine and hosts | gerrit.ovirt Code Review
https://gerrit.ovirt.org/#/c/89427/
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4676/consoleFull
So I guess the images need updates.
On Sun, Mar 25, 2018 at 11:20 AM, Greg Sheremeta <gshereme@redhat.com> wrote:
currently selinux-policy-3.13.1-166.el7_4.4.noarch
updating selinux-policy on engine gets me past 001, and then 002 dies:
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4663/console
15:05:45 # initialize_engine:
rpm -qa:
15:05:46 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.4.noarch\n', err='')
15:06:13 <snip>
Package selinux-policy.noarch 0:3.13.1-166.el7_4.9 will be an update
<snip>
Updated: selinux-policy.noarch 0:3.13.1-166.el7_4.9
rpm -qa:
15:06:13 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.9.noarch\n', err='')
But later in 002
15:08:47 RuntimeError: 1 hosts failed installation:
15:08:47 lago-basic-suite-master-host-0: install_failed
Perhaps selinux-policy needs to be updated on the hosts too? Not my area of expertise :)
Greg
On Sun, Mar 25, 2018 at 7:22 AM, Dan Kenigsberg <danken@redhat.com> wrote:
Which version of selinux-policy do we have on the Engine image?
*Bug 1538936* <https://bugzilla.redhat.com/show_bug.cgi?id=1538936> - Open vSwitch selinux policy needs updating [rhel-7.4.z]
was fixed in selinux-policy-3.13.1-166.el7_4.9 which is available in http://mirror.centos.org/centos-7/7/updates/x86_64/Packages/selinux-policy-targeted-3.13.1-166.el7_4.9.noarch.rpm
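To check whether an image already carries the fixed policy, the installed version string can be compared against 3.13.1-166.el7_4.9. The following is a rough Python sketch under loud assumptions: it is a simplified segment-wise comparison, not the full RPM version algorithm (no epoch, no tilde handling), and the names `compare_versions` and `has_fix` are made up for illustration.

```python
import re

# Version-release in which Bug 1538936 is reported fixed (from the message above).
FIXED = "3.13.1-166.el7_4.9"


def _tokens(version):
    """Split a version-release string into numeric and alphabetic segments."""
    return [int(t) if t.isdigit() else t for t in re.findall(r"\d+|[A-Za-z]+", version)]


def compare_versions(a, b):
    """Return -1, 0 or 1. Simplified sketch, NOT a full rpmvercmp."""
    ta, tb = _tokens(a), _tokens(b)
    for x, y in zip(ta, tb):
        if x == y:
            continue
        if isinstance(x, int) != isinstance(y, int):
            # Mixed segment types: treat the numeric segment as newer.
            return 1 if isinstance(x, int) else -1
        return -1 if x < y else 1
    # All shared segments equal: the string with more segments is newer.
    return (len(ta) > len(tb)) - (len(ta) < len(tb))


def has_fix(installed):
    """True if the installed selinux-policy is at least the fixed build."""
    return compare_versions(installed, FIXED) >= 0


if __name__ == "__main__":
    for candidate in ("3.13.1-166.el7_4.4", "3.13.1-166.el7_4.9"):
        print(candidate, "has fix:", has_fix(candidate))
```

The input would be the version-release part of what `rpm -q selinux-policy` reports, with the package name and architecture stripped.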
On Sun, Mar 25, 2018 at 12:32 PM, Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Mar 25, 2018 at 12:04 PM, Yedidyah Bar David <didi@redhat.com> wrote:
basic suite failed for me too.
/var/log/messages has[1]:
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job openvswitch.service/start failed with result 'dependency'.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch Forwarding Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job ovs-vswitchd.service/start failed with result 'dependency'.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 14.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Created slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: New session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Started Session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Session 17 of user root.
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 2 (TCP-like)
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
Mar 25 04:42:05 lago-basic-suite-master-engine kernel: sctp: Hash tables configured (bind 256/256)
Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 17.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
[1] http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4651/artifact/exported-artifacts/basic-suite-master__logs/test_logs/basic-suite-master/post-001_initialize_engine.py/lago-basic-suite-master-engine/_var_log/messages/*view*/
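The restart loop in the /var/log/messages excerpt above is easier to see when the systemd "Failed to start" messages are tallied per unit. A minimal Python sketch follows, assuming the classic syslog layout shown in the excerpt (timestamp, host, program, message); `failed_units` and the sample list are illustrative names, not part of OST or Lago.

```python
import re
from collections import Counter

# Matches lines such as:
# "Mar 25 04:42:05 <host> systemd: Failed to start Open vSwitch Database Unit."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s(?P<prog>[^:\s]+):\s(?P<msg>.*)$"
)
PREFIX = "Failed to start "


def failed_units(lines):
    """Count systemd 'Failed to start <unit>.' messages per unit description."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("prog") != "systemd":
            continue
        msg = match.group("msg")
        if msg.startswith(PREFIX):
            counts[msg[len(PREFIX):].rstrip(".")] += 1
    return counts


# A few lines copied from the excerpt above.
SAMPLE_LINES = [
    "Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error",
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.",
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch.",
    "Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.",
]

if __name__ == "__main__":
    for unit, count in failed_units(SAMPLE_LINES).items():
        print(count, unit)
```

Run against the full file, this would show how many times ovsdb-server was retried before the engine setup gave up on openvswitch.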
Talked with danken, he asked to check if it's an selinux issue. It is. The audit log has:
type=AVC msg=audit(1521967325.146:675): avc: denied { create } for pid=3787 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.146:675): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffc4e12b930 items=0 ppid=3786 pid=3787 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.146:675): proctitle=72756E75736572002D2D75736572006F70656E76737769746368002D2D006F767364622D746F6F6C002D76636F6E736F6C653A6F666600736368656D612D76657273696F6E002F7573722F73686172652F6F70656E767377697463682F767377697463682E6F7673736368656D61
type=AVC msg=audit(1521967325.150:676): avc: denied { create } for pid=3789 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.150:676): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffe03060130 items=0 ppid=3755 pid=3789 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.150:676):
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
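Confirming that a failure is an selinux issue, as done above, comes down to finding `avc: denied` records for the service's domain in the audit log. A small Python sketch for pulling the relevant fields out of such records follows; the regular expression covers only the fields present in the records quoted above, and `parse_denials` is an illustrative name rather than an existing tool.

```python
import re

# Covers the AVC fields seen in the records above; other record layouts
# (for example with path= or name= fields) are skipped over by ".*?".
AVC_RE = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+pid=(?P<pid>\d+)\s+'
    r'comm="(?P<comm>[^"]+)"\s+.*?scontext=(?P<scontext>\S+)\s+'
    r'tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)'
)


def parse_denials(text):
    """Return one dict per AVC denial found in the given audit log text."""
    return [m.groupdict() for m in AVC_RE.finditer(text)]


def source_domain(denial):
    """Extract the SELinux type (third field) from the source context."""
    return denial["scontext"].split(":")[2]


# The two AVC records from the audit log above.
SAMPLE = (
    'type=AVC msg=audit(1521967325.146:675): avc: denied { create } for pid=3787 '
    'comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 '
    'tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket\n'
    'type=AVC msg=audit(1521967325.150:676): avc: denied { create } for pid=3789 '
    'comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 '
    'tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket\n'
)

if __name__ == "__main__":
    for d in parse_denials(SAMPLE):
        print(source_domain(d), d["comm"], d["perms"], d["tclass"])
```

Here both denials come from the `openvswitch_t` domain trying to create a `netlink_audit_socket`, which matches the "runuser: System error" lines in /var/log/messages.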
And it's 2.9:
Mar 25 04:38:39 lago-basic-suite-master-engine yum[1183]: Installed: 1:python2-openvswitch-2.9.0-3.el7.noarch
On Sun, Mar 25, 2018 at 10:07 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
+ Network team.
I'm not sure if we've moved to ovs 2.9 already?
Y.

Eyal, danken told me that ovn cannot require selinux, so the images must be updated.
Looks like Gal is updating the images here: https://gerrit.ovirt.org/#/c/89430/ (ng: Update the CentOS image, just merged)
Gal, will this fix it?
On Sun, Mar 25, 2018 at 1:15 PM, Eyal Edri <eedri@redhat.com> wrote:
Just wondering, if the bug requires an updated selinux-policy, shouldn't the ovs package enforce it in the spec file?

On Mon, Mar 26, 2018 at 3:29 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
Eyal, danken told me that ovn cannot require selinux, so the images must be updated.
Yeah, I got that, and Gal indeed went ahead and updated the image; I just wondered why it wasn't also fixed on the OVN side.
Looks like Gal is updating the images here: https://gerrit.ovirt.org/#/c/89430/ (ng: Update the CentOS image, just merged)
Gal, will this fix it?
I think Gal verified it, so it should be working now; try rebasing your patch.
On Sun, Mar 25, 2018 at 1:15 PM, Eyal Edri <eedri@redhat.com> wrote:
Just wondering, if the bug requires updated selinux, shouldn't the ovs PKG enforce it I'm the spec file?
On Mar 25, 2018 20:12, "Greg Sheremeta" <gshereme@redhat.com> wrote:
Indeed, updating selinux-policy on both engine and hosts passes.
Change If671d938: [test do not merge] test selinux-policy update on engine and hosts | gerrit.ovirt Code Review https://gerrit.ovirt.org/#/c/89427/ http://jenkins.ovirt.org/job/ovirt-system-tests_master_check -patch-el7-x86_64/4676/consoleFull
So I guess the images need updates.
On Sun, Mar 25, 2018 at 11:20 AM, Greg Sheremeta <gshereme@redhat.com> wrote:
currently selinux-policy-3.13.1-166.el7_4.4.noarch
updating selinux-policy on engine gets me past 001, and then 002 dies:
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4663/console

15:05:45 # initialize_engine:
rpm -qa:
15:05:46 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.4.noarch\n', err='')
15:06:13 <snip>
Package selinux-policy.noarch 0:3.13.1-166.el7_4.9 will be an update
<snip>
Updated: selinux-policy.noarch 0:3.13.1-166.el7_4.9
rpm -qa:
15:06:13 CommandStatus(code=0, out='selinux-policy-3.13.1-166.el7_4.9.noarch\n', err='')

But later in 002
15:08:47 RuntimeError: 1 hosts failed installation:
15:08:47 lago-basic-suite-master-host-0: install_failed
Perhaps selinux-policy needs to be updated on the hosts too? Not my area of expertise :)
Greg
On Sun, Mar 25, 2018 at 7:22 AM, Dan Kenigsberg <danken@redhat.com> wrote:
Which version of selinux-policy do we have on the Engine image?
*Bug 1538936* <https://bugzilla.redhat.com/show_bug.cgi?id=1538936> - Open vSwitch selinux policy needs updating [rhel-7.4.z]
was fixed in selinux-policy-3.13.1-166.el7_4.9 which is available in http://mirror.centos.org/centos-7/7/updates/x86_64/Packages/selinux-policy-targeted-3.13.1-166.el7_4.9.noarch.rpm
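To illustrate the check being discussed (is the installed selinux-policy build at least the one that carries the fix?), here is a minimal Python sketch. The helper names (parse_nvra, segments, compare, has_fix) are hypothetical and not part of OST or Lago, and the comparison is a simplified approximation of RPM's version ordering rather than the real rpmvercmp algorithm; it is only meant to be good enough for builds of the same package stream such as el7_4.4 vs el7_4.9.

```python
import re

# Fixed build named in the thread (Bug 1538936).
FIXED_NVRA = "selinux-policy-3.13.1-166.el7_4.9.noarch"


def parse_nvra(nvra):
    """Split 'name-version-release.arch' into (name, version, release, arch)."""
    rest, arch = nvra.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


def segments(text):
    """Break a version or release string into runs of digits and letters."""
    return [int(s) if s.isdigit() else s
            for s in re.findall(r"\d+|[A-Za-z]+", text)]


def compare(a, b):
    """Return -1, 0 or 1; a simplified stand-in for RPM's version ordering."""
    seg_a, seg_b = segments(a), segments(b)
    for x, y in zip(seg_a, seg_b):
        if x == y:
            continue
        if isinstance(x, int) != isinstance(y, int):
            # Mixed kinds: treat the numeric segment as the newer one.
            return 1 if isinstance(x, int) else -1
        return -1 if x < y else 1
    # All shared segments are equal: the string with more segments wins.
    return (len(seg_a) > len(seg_b)) - (len(seg_a) < len(seg_b))


def has_fix(installed_nvra, fixed_nvra=FIXED_NVRA):
    """True if the installed build is at least the fixed build."""
    _, ver, rel, _ = parse_nvra(installed_nvra)
    _, fixed_ver, fixed_rel, _ = parse_nvra(fixed_nvra)
    by_version = compare(ver, fixed_ver)
    if by_version != 0:
        return by_version > 0
    return compare(rel, fixed_rel) >= 0


if __name__ == "__main__":
    # The two builds reported in the thread.
    for nvra in ("selinux-policy-3.13.1-166.el7_4.4.noarch",
                 "selinux-policy-3.13.1-166.el7_4.9.noarch"):
        print(nvra, "has fix" if has_fix(nvra) else "needs update")
```

The input strings are in the form that `rpm -q selinux-policy` prints, so the same check could be run against both the engine and the host images.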
On Sun, Mar 25, 2018 at 12:32 PM, Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Mar 25, 2018 at 12:04 PM, Yedidyah Bar David <didi@redhat.com> wrote:
> basic suite failed for me too.
>
> /var/log/messages has[1]:
>
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job openvswitch.service/start failed with result 'dependency'.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Dependency failed for Open vSwitch Forwarding Unit.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Job ovs-vswitchd.service/start failed with result 'dependency'.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 14.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Failed to start Open vSwitch Database Unit.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Unit ovsdb-server.service entered failed state.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service failed.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Assertion failed for Open vSwitch Delete Transient Ports.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Created slice User Slice of root.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: New session 17 of user root.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting User Slice of root.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Started Session 17 of user root.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Session 17 of user root.
> Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 2 (TCP-like)
> Mar 25 04:42:05 lago-basic-suite-master-engine kernel: DCCP: Activated CCID 3 (TCP-Friendly Rate Control)
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service holdoff time over, scheduling restart.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Starting Open vSwitch Database Unit...
> Mar 25 04:42:05 lago-basic-suite-master-engine kernel: sctp: Hash tables configured (bind 256/256)
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd-logind: Removed session 17.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Removed slice User Slice of root.
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: Stopping User Slice of root.
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: /etc/openvswitch/conf.db does not exist ... (warning).
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: Creating empty database /etc/openvswitch/conf.db runuser: System error
> Mar 25 04:42:05 lago-basic-suite-master-engine ovs-ctl: [FAILED]
> Mar 25 04:42:05 lago-basic-suite-master-engine systemd: ovsdb-server.service: control process exited, code=exited status=1
>
> [1] http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4651/artifact/exported-artifacts/basic-suite-master__logs/test_logs/basic-suite-master/post-001_initialize_engine.py/lago-basic-suite-master-engine/_var_log/messages/*view*/

Talked with danken, he asked to check if it's an selinux issue. It is. The audit log has:
type=AVC msg=audit(1521967325.146:675): avc: denied { create } for pid=3787 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.146:675): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffc4e12b930 items=0 ppid=3786 pid=3787 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.146:675): proctitle=72756E75736572002D2D75736572006F70656E76737769746368002D2D006F767364622D746F6F6C002D76636F6E736F6C653A6F666600736368656D612D76657273696F6E002F7573722F73686172652F6F70656E767377697463682F767377697463682E6F7673736368656D61
type=AVC msg=audit(1521967325.150:676): avc: denied { create } for pid=3789 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket
type=SYSCALL msg=audit(1521967325.150:676): arch=c000003e syscall=41 success=no exit=-13 a0=10 a1=3 a2=9 a3=7ffe03060130 items=0 ppid=3755 pid=3789 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="runuser" exe="/usr/sbin/runuser" subj=system_u:system_r:openvswitch_t:s0 key=(null)
type=PROCTITLE msg=audit(1521967325.150:676):
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
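For anyone reading audit records like the ones quoted above, a small Python sketch may help. It pulls the denied permission and contexts out of an AVC record and decodes the PROCTITLE field, which auditd stores as the hex-encoded command line with NUL bytes between arguments. The function names parse_avc and decode_proctitle are hypothetical helpers written for this illustration, not part of any audit tooling, and the regular expressions only cover the record shapes seen in this thread.

```python
import re

# Records copied from the audit log quoted in this thread.
AVC_LINE = (
    'type=AVC msg=audit(1521967325.146:675): avc: denied { create } for '
    'pid=3787 comm="runuser" scontext=system_u:system_r:openvswitch_t:s0 '
    'tcontext=system_u:system_r:openvswitch_t:s0 tclass=netlink_audit_socket'
)
PROCTITLE_HEX = (
    "72756E75736572002D2D75736572006F70656E76737769746368002D2D00"
    "6F767364622D746F6F6C002D76636F6E736F6C653A6F666600"
    "736368656D612D76657273696F6E00"
    "2F7573722F73686172652F6F70656E767377697463682F"
    "767377697463682E6F7673736368656D61"
)


def parse_avc(line):
    """Return the denied permissions and key fields of one AVC record."""
    match = re.search(r"avc:\s+denied\s+\{\s*([^}]*?)\s*\}", line)
    if match is None:
        return None  # not an AVC denial
    fields = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    return {
        "permissions": match.group(1).split(),
        "comm": fields.get("comm", "").strip('"'),
        "scontext": fields.get("scontext"),
        "tcontext": fields.get("tcontext"),
        "tclass": fields.get("tclass"),
    }


def decode_proctitle(hex_value):
    """Decode a hex PROCTITLE value into its list of arguments."""
    raw = bytes.fromhex(hex_value)
    return raw.decode("utf-8", errors="replace").split("\x00")


if __name__ == "__main__":
    print(parse_avc(AVC_LINE))
    print(" ".join(decode_proctitle(PROCTITLE_HEX)))
```

Decoded, the PROCTITLE shows that the denied process was runuser launching ovsdb-tool as the openvswitch user, which matches the "runuser: System error" lines from ovs-ctl in /var/log/messages.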
And it's 2.9:
Mar 25 04:38:39 lago-basic-suite-master-engine yum[1183]: Installed: 1:python2-openvswitch-2.9.0-3.el7.noarch
> On Sun, Mar 25, 2018 at 10:07 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
>
>> + Network team.
>> I'm not sure if we've moved to ovs 2.9 already?
>> Y.
>>
>> On Sat, Mar 24, 2018 at 8:19 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
>>
>>> Hi,
>>>
>>> Is there an ongoing engine master OST failure blocking?
>>>
>>> [ INFO ] Stage: Misc configuration
>>> [ INFO ] Stage: Package installation
>>> [ INFO ] Stage: Misc configuration
>>> [ ERROR ] Failed to execute stage 'Misc configuration': Failed to start service 'openvswitch'
>>> [ INFO ] Yum Performing yum transaction rollback
>>>
>>> These are unrelated code changes:
>>>
>>> http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4644/
>>> https://gerrit.ovirt.org/#/c/89347/
>>>
>>> and
>>> http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4647/
>>> https://gerrit.ovirt.org/67166
>>>
>>> But they both die in 001, with exactly 1.24MB in the log and 'Failed to start service openvswitch'
>>> 001_initialize_engine.py.junit.xml 1.24 MB
>>>
>>> Full file: http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_64/4644/artifact/exported-artifacts/basic-suite-master__logs/001_initialize_engine.py.junit.xml
>>>
>>> On Fri, Mar 23, 2018 at 12:14 PM, Dafna Ron <dron@redhat.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I would like to update on this week's failures and OST current status.
>>>>
>>>> On 19-03-2018 - the CI team reported 3 different failures.
>>>>
>>>> On Master branch the failed changes reported were:
>>>>
>>>> *core: fix removal of vm-host device - https://gerrit.ovirt.org/#/c/89145/ <https://gerrit.ovirt.org/#/c/89145/>*
>>>>
>>>> *core: USB in osinfo configuration depends on chipset - https://gerrit.ovirt.org/#/c/88777/ <https://gerrit.ovirt.org/#/c/88777/>*
>>>>
>>>> *On 4.2 *branch, the reported change was:
>>>>
>>>> *core: Call endAction() of all child commands in ImportVmCommand - https://gerrit.ovirt.org/#/c/89165/ <https://gerrit.ovirt.org/#/c/89165/>*
>>>>
>>>> The fix's for the regressions were merged the following day (20-03-2018)
>>>>
>>>> https://gerrit.ovirt.org/#/c/89250/- core: Replace generic unlockVm() logic in ImportVmCommand
>>>> https://gerrit.ovirt.org/#/c/89187/ - core: Fix NPE when creating an instance type
>>>>
>>>> On 20-03-2018 - the CI team discovered an issue on the job's cleanup which caused random failures on changes testing due to failure in docker cleanup. There is an open Jira on the issue: https://ovirt-jira.atlassian.net/browse/OVIRT-1939
>>>>
>>>> *Below you can see the chart for this week's resolved issues but cause of failure:*
>>>> Code = regression of working components/functionalities
>>>> Configurations = package related issues
>>>> Other = failed build artifacts
>>>> Infra = infrastructure/OST/Lago related issues
>>>>
>>>> *Below is a chart of resolved failures based on ovirt version*
>>>>
>>>> *Below is a chart showing failures by suite type: *
>>>>
>>>> Thank you,
>>>> Dafna
>>>>
>>>> _______________________________________________
>>>> Infra mailing list
>>>> Infra@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/infra
>>>
>>> --
>>> GREG SHEREMETA
>>> SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
>>> Red Hat NA
>>> <https://www.redhat.com/>
>>> gshereme@redhat.com IRC: gshereme
>>> <https://red.ht/sig>
>>>
>>> _______________________________________________
>>> Devel mailing list
>>> Devel@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/devel
>>
>> _______________________________________________
>> Infra mailing list
>> Infra@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/infra
>
> --
> Didi
-- Didi
-- Eyal edri MANAGER RHV DevOps EMEA VIRTUALIZATION R&D Red Hat EMEA <https://www.redhat.com/> <https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted> phone: +972-9-7692018 irc: eedri (on #tlv #rhev-dev #rhev-integ)

On Mon, Mar 26, 2018 at 3:32 PM, Eyal Edri <eedri@redhat.com> wrote:
On Mon, Mar 26, 2018 at 3:29 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
Eyal, danken told me that ovn cannot require selinux, so the images must be updated.
Yeah, I got that, and Gal did go ahead and update the image; I just wondered why it wasn't also fixed on the OVN side.
They are. https://bugzilla.redhat.com/show_bug.cgi?id=1549673 fixes this in ovs-2.9.0-6.el7fdn but that is not released yet.
Looks like Gal is updating the images here: https://gerrit.ovirt.org/#/c/89430/ (ng: Update the CentOS image, just merged)
Gal, will this fix it?
I think Gal verified it, so it should be working now; try rebasing your patch.
participants (6)
- Dafna Ron
- Dan Kenigsberg
- Eyal Edri
- Greg Sheremeta
- Yaniv Kaul
- Yedidyah Bar David