[ OST Failure Report ] [ oVirt Master ] [ 06-11-2017 ] [ 002_bootstrap.verify_add_hosts ]

Hi,
We failed test 002_bootstrap.verify_add_hosts.
I can see we only tried to install one of the hosts (host-0) and failed. The second host has no log, which means we did not try to deploy it.
The error suggests that ovirt-imageio-daemon failed to start. However, there is another message, about conflicting vdsm and libvirt configurations, that I think should be addressed.
Link to suspected patches: https://gerrit.ovirt.org/#/c/83612/
Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/
Link to all logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20171106025647-lago-basic-suite-master-host-0-5530ab1f.log
(Relevant) error snippet from the log:
<error>
2017-11-06 02:56:46,526-0500 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/usr/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...
abrt is not configured for vdsm
WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
lvm requires configuration
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
multipath requires configuration
2017-11-06 02:56:47,551-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service') stderr:
Job for ovirt-imageio-daemon.service failed because the control process exited with error code. See "systemctl status ovirt-imageio-daemon.service" and "journalctl -xe" for details.
2017-11-06 02:56:47,552-0500 DEBUG otopi.context context._executeMethod:143 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-R4R8gZhaQI/pythonlib/otopi/context.py", line 133, in _executeMethod
    method['method']()
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 179, in _start
    self.services.state('ovirt-imageio-daemon', True)
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/otopi/services/systemd.py", line 141, in state
    service=name,
RuntimeError: Failed to start service 'ovirt-imageio-daemon'
2017-11-06 02:56:47,553-0500 ERROR otopi.context context._executeMethod:152 Failed to execute stage 'Closing up': Failed to start service 'ovirt-imageio-daemon'
</error>
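The vdsm-tool output above lists the settings it expects in libvirtd.conf and qemu.conf when vdsm.conf has ssl=True. As a rough illustration only, and not a description of how vdsm-tool performs its own check, a comparison of that kind could be sketched in Python as below. The names `REQUIRED`, `parse_conf` and `find_conflicts` and the sample file contents are hypothetical, and the parser assumes plain `key = value` lines.

```python
# Settings quoted in the vdsm-tool message for vdsm.conf with ssl=True.
REQUIRED = {
    "libvirtd.conf": {"listen_tcp": "0", "auth_tcp": '"sasl"', "listen_tls": "1"},
    "qemu.conf": {"spice_tls": "1"},
}


def parse_conf(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def find_conflicts(name, text):
    """Return {key: (found, expected)} for every setting that differs from REQUIRED[name]."""
    found = parse_conf(text)
    return {
        key: (found.get(key), expected)
        for key, expected in REQUIRED[name].items()
        if found.get(key) != expected
    }


if __name__ == "__main__":
    # Hypothetical libvirtd.conf contents, for illustration only.
    sample = 'listen_tcp = 1\nauth_tcp = "sasl"\n# listen_tls = 1\n'
    print(find_conflicts("libvirtd.conf", sample))
```

A setting that is missing or commented out is reported with `None` as the found value, so it shows up as a conflict in the same way as a wrong value.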

Suspected patch (https://gerrit.ovirt.org/#/c/83612/) is about cold merge and has nothing to do with host deploy.

adding Didi.

In /var/log/messages of the host [1], there is:
Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: Starting oVirt ImageIO Daemon...
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: detected unhandled Python exception in '/usr/bin/ovirt-imageio-daemon'
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: can't communicate with ABRT daemon, is it running? [Errno 2] No such file or directory
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: Traceback (most recent call last):
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: server.main(sys.argv)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 57, in main
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: start(config)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 85, in start
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: WSGIRequestHandler)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.server_bind()
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: HTTPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: SocketServer.TCPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.socket.bind(self.server_address)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/socket.py", line 224, in meth
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: return getattr(self._sock,name)(*args)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: socket.error: [Errno 98] Address already in use
Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: ovirt-imageio-daemon.service: main process exited, code=exited, status=1/FAILURE
ovirt-host-deploy stops it, and immediately tries to start it:
2017-11-06 02:56:47,203-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'stop', 'ovirt-imageio-daemon.service'), rc=0
...
2017-11-06 02:56:47,550-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service'), rc=1
Also, imageio-daemon's log [2] looks a bit weird to me - it has 5 'Starting' lines, but none of the other lines I would have expected from reading its source, and which I can see in another run that did finish successfully [3].
Adding Idan, but not sure it's a bug in the daemon.
[1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[2] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[3] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3628/artifact/...
-- Didi
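The `socket.error: [Errno 98] Address already in use` in the traceback above is what bind() raises when another socket still holds the same address. That would be consistent with the stop followed by an immediate start noted above, although nothing in this thread confirms it as the cause. A minimal Python 3 sketch that reproduces the error and polls until the port is free is below. It uses an ephemeral loopback port rather than the daemon's real port, `try_bind` and `wait_until_free` are hypothetical helpers, and none of this is taken from ovirt-host-deploy or ovirt-imageio.

```python
import errno
import socket
import time


def try_bind(port, host="127.0.0.1"):
    """Try to bind a TCP socket; return it, or None if the address is in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as e:
        sock.close()
        if e.errno == errno.EADDRINUSE:
            return None
        raise
    return sock


def wait_until_free(port, timeout=5.0, interval=0.1):
    """Poll until the port can be bound, or return False once the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        sock = try_bind(port)
        if sock is not None:
            sock.close()
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for a previous process that still holds the port.
    old = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    old.bind(("127.0.0.1", 0))  # port 0 lets the kernel pick a free port
    old.listen(1)
    port = old.getsockname()[1]

    print(try_bind(port))         # the bind is refused while 'old' holds the port
    old.close()
    print(wait_until_free(port))  # the port can be bound again after the close
```

A caller that restarts a service could run a check like `wait_until_free` between the stop and the start, which turns a hard failure into a bounded wait.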

On Mon, Nov 6, 2017 at 4:16 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Mon, Nov 6, 2017 at 1:57 PM, Dafna Ron <dron@redhat.com> wrote:
adding Didi.
On 11/06/2017 11:51 AM, Ala Hino wrote:
Suspected patch (https://gerrit.ovirt.org/#/c/83612/) is about cold merge and has nothing to do with host deploy.
On Mon, Nov 6, 2017 at 1:39 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We failed test 002_bootstrap.verify_add_hosts
I can see we only tried to install one of the hosts (host-0) and failed. The second host has no log, which means we did not try to deploy it.
The error suggests that ovirt-imageio-daemon failed to start. However, there is another message, about conflicting vdsm and libvirt configurations, that I think should be addressed.
Link to suspected patches: https://gerrit.ovirt.org/#/c/83612/
Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/
Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
(Relevant) error snippet from the log:
<error>
\
2017-11-06 02:56:46,526-0500 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/usr/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...
abrt is not configured for vdsm
WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
lvm requires configuration
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
multipath requires configuration
2017-11-06 02:56:47,551-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service') stderr:
Job for ovirt-imageio-daemon.service failed because the control process exited with error code. See "systemctl status ovirt-imageio-daemon.service" and "journalctl -xe" for details.
2017-11-06 02:56:47,552-0500 DEBUG otopi.context context._executeMethod:143 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-R4R8gZhaQI/pythonlib/otopi/context.py", line 133, in _executeMethod
    method['method']()
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 179, in _start
    self.services.state('ovirt-imageio-daemon', True)
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/otopi/services/systemd.py", line 141, in state
    service=name,
RuntimeError: Failed to start service 'ovirt-imageio-daemon'
2017-11-06 02:56:47,553-0500 ERROR otopi.context context._executeMethod:152 Failed to execute stage 'Closing up': Failed to start service 'ovirt-imageio-daemon'
In /var/log/messages of the host [1], there is:
Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: Starting oVirt ImageIO Daemon...
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: detected unhandled Python exception in '/usr/bin/ovirt-imageio-daemon'
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: can't communicate with ABRT daemon, is it running? [Errno 2] No such file or directory
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: Traceback (most recent call last):
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: server.main(sys.argv)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 57, in main
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: start(config)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 85, in start
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: WSGIRequestHandler)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.server_bind()
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: HTTPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: SocketServer.TCPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.socket.bind(self.server_address)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/socket.py", line 224, in meth
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: return getattr(self._sock,name)(*args)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: socket.error: [Errno 98] Address already in use
Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: ovirt-imageio-daemon.service: main process exited, code=exited, status=1/FAILURE
ovirt-host-deploy stops it, and immediately tries to start it:
2017-11-06 02:56:47,203-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'stop', 'ovirt-imageio-daemon.service'), rc=0
...
2017-11-06 02:56:47,550-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service'), rc=1
Also, imageio-daemon's log [2] looks a bit weird to me: it has 5 'Starting' lines, but none of the other lines I would expect from reading its source, which I do see in another run that finished successfully [3].
Adding Idan, but I'm not sure it's a bug in the daemon.
[1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[2] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[3] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3628/artifact/...
Looks like the daemon is already running on this host. Maybe host deploy is trying to start the service twice?
We have not changed the startup code in a couple of years, so this must be caused by a change in another component.
This patch will make it easier to detect future issues by logging any startup error to the daemon log: https://gerrit.ovirt.org/83670/
Nir
</error>
-- Didi _______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
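The `[Errno 98] Address already in use` in the traceback is what `bind()` reports when another socket still holds the address, which fits the reading that a daemon instance was still listening. A minimal sketch of that condition follows, written for Python 3 (the daemon in the traceback runs on Python 2.7, where the same error surfaces as `socket.error`). The function name `second_bind_errno` and the use of a kernel-chosen loopback port are illustrative assumptions; nothing here is taken from the daemon's code or configuration.

```python
import errno
import socket


def second_bind_errno(host="127.0.0.1"):
    """Bind a listener, then try to bind a second socket to the same port.

    Returns the errno of the failed second bind, or None if it succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))  # port 0: let the kernel pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same address while `first` is still open and listening.
            second.bind((host, port))
        except OSError as e:  # socket.error is an alias of OSError on Python 3
            return e.errno
        return None
    finally:
        second.close()
        first.close()


if __name__ == "__main__":
    err = second_bind_errno()
    print(err == errno.EADDRINUSE, errno.errorcode.get(err))
```

The sketch only shows that a live listener on the same port is enough to produce this error; it does not establish which process held the port on host-0.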

We had the same failure this morning:
Failed build: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/
All Logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/artifact/
engine log: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/artifact/...
host logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/artifact/...
On 11/06/2017 08:26 PM, Nir Soffer wrote:
Looks like the daemon is already running on this host - maybe host deploy is trying to start the service twice?
We did not change the startup code couple of years, so this must be some change in another component.
This patch will make it easier to detect future issues, logging any error to the daemon log during startup: https://gerrit.ovirt.org/83670/
Nir

This run still uses the older daemon; the patch improving logging was merged today at 13:02.
Please check again with the current version.
On Tue, Nov 7, 2017 at 11:54 AM Dafna Ron <dron@redhat.com> wrote:
we had the same failure this morning:
Failed build:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/
All Logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/artifact/
engine log:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/artifact/...
host logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3646/artifact/...
On 11/06/2017 08:26 PM, Nir Soffer wrote:
On Mon, Nov 6, 2017 at 4:16 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Mon, Nov 6, 2017 at 1:57 PM, Dafna Ron <dron@redhat.com> wrote:
adding Didi.
On 11/06/2017 11:51 AM, Ala Hino wrote:
Suspected patch (https://gerrit.ovirt.org/#/c/83612/) is about cold merge and has nothing to do with host deploy.
On Mon, Nov 6, 2017 at 1:39 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We failed test 002_bootstrap.verify_add_hosts
I can see we only tried to install one of the hosts (host-0) and
failed.
the second host has no log which means we did not try to deploy it.
The error suggests that we ovirt-imageio-daemon failed to start. However, there is another message that I think should be addressed about conflicting vdsm and libvirt configurations.
Link to suspected patches: https://gerrit.ovirt.org/#/c/83612/
Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/
Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
(Relevant) error snippet from the log:
<error>
\
2017-11-06 02:56:46,526-0500 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/usr/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...
abrt is not configured for vdsm WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based
on
vdsm configuration lvm requires configuration libvirt is not configured for vdsm yet FAILED: conflicting vdsm and libvirt-qemu tls configuration. vdsm.conf with ssl=True requires the following changes: libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1 qemu.conf: spice_tls=1. multipath requires configuration
2017-11-06 02:56:47,551-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service') stderr: Job for ovirt-imageio-daemon.service failed because the control process exited with error code. See "systemctl status ovirt-imageio-daemon.service" and "journalctl -xe" for details.
2017-11-06 02:56:47,552-0500 DEBUG otopi.context context._executeMethod:143 method exception Traceback (most recent call last): File "/tmp/ovirt-R4R8gZhaQI/pythonlib/otopi/context.py", line 133, in _executeMethod method['method']() File
"/tmp/ovirt-R4R8gZhaQI/otopi-plugins/ovirt-host-deploy/vdsm/packages.py",
line 179, in _start self.services.state('ovirt-imageio-daemon', True) File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/otopi/services/systemd.py", line 141, in state service=name, RuntimeError: Failed to start service 'ovirt-imageio-daemon' 2017-11-06 02:56:47,553-0500 ERROR otopi.context context._executeMethod:152 Failed to execute stage 'Closing up': Failed to start service 'ovirt-imageio-daemon'
In /var/log/messages of the host [1], there is:
Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: Starting oVirt ImageIO Daemon... Nov 6 02:56:47 lago-basic-suite-master-host-0 python: detected unhandled Python exception in '/usr/bin/ovirt-imageio-daemon' Nov 6 02:56:47 lago-basic-suite-master-host-0 python: can't communicate with ABRT daemon, is it running? [Errno 2] No such file or directory Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: Traceback (most recent call last): Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module> Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: server.main(sys.argv) Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 57, in main Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: start(config) Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 85, in start Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: WSGIRequestHandler) Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__ Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.server_bind() Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: HTTPServer.server_bind(self) Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: SocketServer.TCPServer.server_bind(self) Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 430, in 
server_bind Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.socket.bind(self.server_address) Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/socket.py", line 224, in meth Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: return getattr(self._sock,name)(*args) Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: socket.error: [Errno 98] Address already in use Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: ovirt-imageio-daemon.service: main process exited, code=exited, status=1/FAILURE
ovirt-host-deploy stops it, and immediately tries to start it:
2017-11-06 02:56:47,203-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'stop', 'ovirt-imageio-daemon.service'), rc=0
...
2017-11-06 02:56:47,550-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:863 execute-result: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service'), rc=1
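If the stop-then-start sequence above really is racing against the old process releasing its port, one possible mitigation is to wait until the address can be bound before starting the service again. The following is only a sketch under that assumption: `wait_until_port_free` is a hypothetical helper, not part of otopi or ovirt-host-deploy, and it is written for Python 3 although the host in the log runs Python 2.7.

```python
import errno
import socket
import time


def wait_until_port_free(host, port, timeout=5.0, interval=0.1):
    """Return True once (host, port) can be bound, False if timeout expires.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # bind() fails with EADDRINUSE (errno 98 on Linux) while
            # another socket still holds the address.
            probe.bind((host, port))
            return True
        except OSError as e:
            if e.errno != errno.EADDRINUSE:
                raise  # some other problem; do not hide it
        finally:
            probe.close()
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would run this between the 'stop' and the 'start' and fail with a clear message if it returns False. It does not explain who holds the port, it only avoids starting into a known conflict.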
Also, imageio-daemon's log [2] looks a bit odd to me: it has 5 'Starting' lines, but none of the other lines I would expect from reading its source, and which I do see in another run that finished successfully [3].
Adding Idan, but not sure it's a bug in the daemon.
[1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[2] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/...
[3] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3628/artifact/...
Looks like the daemon is already running on this host - maybe host deploy is trying to start the service twice?
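One quick way to test the "already running" theory on the host is to check whether anything accepts TCP connections on the daemon's port just before the start attempt. This is a minimal sketch, not taken from any oVirt code: `is_listening` is a hypothetical helper, and the host and port to probe are left as parameters because the thread does not state them.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex() returns 0 on success and an errno value otherwise,
        # instead of raising on a refused connection.
        return s.connect_ex((host, port)) == 0
```

If this returns True right after 'systemctl stop' reports rc=0, some process is still serving on that port.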
We have not changed the startup code in a couple of years, so this must be caused by a change in another component.
This patch will make it easier to detect future issues, logging any error to the daemon log during startup: https://gerrit.ovirt.org/83670/
Nir
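For readers unfamiliar with the idea behind logging startup errors to the daemon log, the general pattern looks roughly like the sketch below. It is not the content of the patch linked above; the logger name `daemon` and the `run` wrapper are hypothetical names chosen for illustration.

```python
import logging

log = logging.getLogger("daemon")  # hypothetical logger name


def run(start):
    """Call start(); on failure, log the traceback and re-raise."""
    try:
        start()
    except Exception:
        # log.exception() records the message at ERROR level together
        # with the current traceback, so the failure reaches the
        # service's own log and not only the journal.
        log.exception("Service failed to start")
        raise
```

Re-raising keeps the non-zero exit status, so systemd still reports the unit as failed.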
</error>
-- Didi _______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

On Mon, Nov 6, 2017 at 1:39 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We failed test 002_bootstrap.verify_add_hosts
I can see we only tried to install one of the hosts (host-0) and failed. The second host has no log, which means we did not try to deploy it.
The error suggests that ovirt-imageio-daemon failed to start. However, there is another message that I think should be addressed about conflicting vdsm and libvirt configurations.
Link to suspected patches: https://gerrit.ovirt.org/#/c/83612/
Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/
Link to all logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3626/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20171106025647-lago-basic-suite-master-host-0-5530ab1f.log
*(Relevant) error snippet from the log: *
* <error> \*
2017-11-06 02:56:46,526-0500 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages plugin.execute:921 execute-output: ('/usr/bin/vdsm-tool', 'configure', '--force') stdout:
Checking configuration status...
abrt is not configured for vdsm
WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration
lvm requires configuration
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
multipath requires configuration
2017-11-06 02:56:47,551-0500 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:926 execute-output: ('/usr/bin/systemctl', 'start', 'ovirt-imageio-daemon.service') stderr: Job for ovirt-imageio-daemon.service failed because the control process exited with error code. See "systemctl status ovirt-imageio-daemon.service" and "journalctl -xe" for details.
2017-11-06 02:56:47,552-0500 DEBUG otopi.context context._executeMethod:143 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-R4R8gZhaQI/pythonlib/otopi/context.py", line 133, in _executeMethod
    method['method']()
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 179, in _start
    self.services.state('ovirt-imageio-daemon', True)
  File "/tmp/ovirt-R4R8gZhaQI/otopi-plugins/otopi/services/systemd.py", line 141, in state
    service=name,
RuntimeError: Failed to start service 'ovirt-imageio-daemon'
2017-11-06 02:56:47,553-0500 ERROR otopi.context context._executeMethod:152 Failed to execute stage 'Closing up': Failed to start service 'ovirt-imageio-daemon'
*</error>*
The problem:
Nov 6 02:56:47 lago-basic-suite-master-host-0 systemd: Starting oVirt ImageIO Daemon...
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: detected unhandled Python exception in '/usr/bin/ovirt-imageio-daemon'
Nov 6 02:56:47 lago-basic-suite-master-host-0 python: can't communicate with ABRT daemon, is it running? [Errno 2] No such file or directory
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: Traceback (most recent call last):
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: server.main(sys.argv)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 57, in main
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: start(config)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 85, in start
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: WSGIRequestHandler)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.server_bind()
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: HTTPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: SocketServer.TCPServer.server_bind(self)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: self.socket.bind(self.server_address)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: File "/usr/lib64/python2.7/socket.py", line 224, in meth
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: return getattr(self._sock,name)(*args)
Nov 6 02:56:47 lago-basic-suite-master-host-0 ovirt-imageio-daemon: socket.error: [Errno 98] Address already in use
participants (5)
- Ala Hino
- Dafna Ron
- Nir Soffer
- Yaniv Kaul
- Yedidyah Bar David