
Didi - we just saw a report of a similar failure on engine-4.1's OST. Could you please backport these patches there too?

On Mon, Nov 27, 2017 at 2:57 PM, Yedidyah Bar David <didi@redhat.com> wrote:
On Mon, Nov 27, 2017 at 10:38 AM, Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, Nov 26, 2017 at 7:24 PM, Nir Soffer <nsoffer@redhat.com> wrote:
I think we need to check and report which process is listening on a port when starting a server on that port fails.
How do you know that a server was "started on that port", and that it failed specifically because it failed to bind?
There is no standardized (Unix) way to mark that a service wants to listen on a specific port, or that it failed because a specific port was bound by some other process.
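To illustrate the point: the only place that reliably knows the failure was a bind failure is the process that called bind(). A minimal Python 3 sketch of how such a process (or a pre-flight check) could tell "address already in use" apart from other errors. The function names are made up for illustration and are not taken from otopi or ovirt-imageio; the traceback in this thread is Python 2, where the same error surfaces as socket.error.

```python
import errno
import socket


def try_bind(host, port):
    """Return None if (host, port) can be bound, else the errno of the failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as e:
        # In Python 3, socket.error is an alias of OSError.
        return e.errno
    finally:
        sock.close()
    return None


def port_in_use(host, port):
    """True only if binding failed with EADDRINUSE (errno 98 on Linux)."""
    return try_bind(host, port) == errno.EADDRINUSE
```

Note that such a check is racy when done from outside the server: the port can be taken or released between the check and the real start, so it is a debugging aid at best.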
There are various classical *inetd* daemons, and modern systemd.socket, that listen *instead* of some service. Then they can manage the port resources and perhaps do something intelligent about them.
Didi, do you think we can integrate this in the deploy code, or should this be implemented in each server?
It should be quite easy to patch otopi's services.state to run something if start fails, e.g. 'ss -anp' or whatever you want.
It should even be not too hard to do this in a self-contained plugin, so it can be part of otopi-debug-plugins.
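Roughly, the "run something if start fails" idea could look like the Python 3 sketch below. This is not otopi's services.state API and not the code of the patches mentioned later in this thread; report_listeners and start_with_diagnostics are hypothetical names, and the only real external command used is 'ss -anp'.

```python
import subprocess


def report_listeners(command=("ss", "-anp"), run=subprocess.run):
    """Return the output of `command` as text, or a note on why it could not run."""
    try:
        result = run(
            list(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError) as e:
        # ss may be missing or may hang; never let diagnostics mask the real error.
        return "could not run %s: %s" % (" ".join(command), e)
    return result.stdout


def start_with_diagnostics(start, log=print):
    """Call start(); if it raises, log the socket table and re-raise."""
    try:
        start()
    except Exception:
        log(report_listeners())
        raise
```

The dump is taken as soon as the start fails, because a process that held the port only briefly may already be gone by the time someone looks at the machine.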
If we decide that something needs to be implemented by each server, perhaps that "something" should be for it to be controlled by a systemd.socket unit. I didn't try it, though, to see what this actually buys us.
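For reference, with socket activation systemd binds the port itself and passes the already-open socket to the service, so the service never calls bind(). A minimal Python 3 sketch of the receiving side, following the environment protocol documented in sd_listen_fds(3) (LISTEN_PID, LISTEN_FDS, descriptors starting at 3). The function name is made up for illustration and nothing here is taken from ovirt-imageio, which as far as this thread shows binds the port on its own.

```python
import os

# First inherited file descriptor, as defined by sd_listen_fds(3).
SD_LISTEN_FDS_START = 3


def inherited_fds(environ=None, pid=None):
    """Return the list of socket fds passed by systemd, or [] if there are none."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The variables only apply to the process they were set for.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    if count < 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

A server would then wrap the first descriptor with socket.socket(fileno=fd) instead of creating and binding its own socket. A port conflict would presumably then show up when the socket unit is started, not inside the service.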
Maybe when deployment fails, the deploy code can report all the listening sockets and the processes bound to these sockets?
Pushed now:
https://gerrit.ovirt.org/84699 core: Name TRANSACTION_INIT
https://gerrit.ovirt.org/84700 plugins: debug: Add debug_failure
https://gerrit.ovirt.org/84701 automation: Test failure
Will merge soon, if all goes well.
Merged them.
Pushed to OST:
https://gerrit.ovirt.org/84710
Dafna - thanks for opening the bug on ovirt-imageio, but I am not sure anyone can do much about it without more info, such as might be provided by the above patches. When I suggested below to open a BZ, I meant one on otopi or host-deploy asking for more debugging info, not one on imageio. Obviously there is no harm in opening it, and it's good to have it even if only for reference.
Feel free to open BZ for other things discussed above, if relevant.
Nir
On Sun, Nov 26, 2017 at 7:11 PM Gal Ben Haim <gbenhaim@redhat.com> wrote:
The failure is not consistent.
On Sun, Nov 26, 2017 at 5:33 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
On Sun, Nov 26, 2017 at 4:53 PM, Gal Ben Haim <gbenhaim@redhat.com> wrote:
We still see this issue on the upgrade suite from latest release to master [1]. I don't see any evidence in "/var/log/messages" [2] that "ovirt-imageio-proxy" was started twice.
Since it's a high port and not a registered one, could it be used by something else (what are the odds, though)? Is it consistent? Y.
[1] http://jenkins.ovirt.org/blue/rest/organizations/jenkins/pipelines/ovirt-master_change-queue-tester/runs/4153/nodes/123/steps/241/log/?start=0
[2] http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/4153/artifact/exported-artifacts/upgrade-from-release-suit-master-el7/test_logs/upgrade-from-release-suite-master/post-001_initialize_engine.py/lago-upgrade-from-release-suite-master-engine/_var_log/messages/*view*/
On Fri, Nov 24, 2017 at 8:16 PM, Dafna Ron <dron@redhat.com> wrote:
>
> there were two different patches reported as failing cq today with the
> ovirt-imageio-proxy service failing to start.
>
> Here is the latest failure:
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4130/artifact
>
>
> On 11/23/2017 03:39 PM, Allon Mureinik wrote:
>
> Daniel/Nir?
>
> On Thu, Nov 23, 2017 at 5:29 PM, Dafna Ron <dron@redhat.com> wrote:
>>
>> Hi,
>>
>> We have a failing on test
>> 001_initialize_engine.test_initialize_engine.
>>
>> This is failing with error Failed to start service
>> 'ovirt-imageio-proxy
>>
>>
>> Link and headline of suspected patches:
>>
>> build: Make resulting RPMs architecture-specific -
>> https://gerrit.ovirt.org/#/c/84534/
>>
>>
>> Link to Job:
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4055
>>
>>
>> Link to all logs:
>>
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4055/artifact/
>>
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4055/artifact/exported-artifacts/upgrade-from-release-suit-master-el7/test_logs/upgrade-from-release-suite-master/post-001_initialize_engine.py/lago-upgrade-from-release-suite-master-engine/_var_log/messages/*view*/
>>
>>
>> (Relevant) error snippet from the log:
>>
>> <error>
>>
>>
>> from lago log:
>>
>> Failed to start service 'ovirt-imageio-proxy
>>
>> messages logs:
>>
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd:
>> Starting Session 8 of user root.
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: Traceback (most recent call last):
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/bin/ovirt-imageio-proxy", line 85, in
>> <module>
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: status = image_proxy.main(args, config)
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File
>> "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/image_proxy.py", line
>> 21, in main
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: image_server.start(config)
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File
>> "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/server.py", line 45,
>> in start
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: WSGIRequestHandler)
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 419,
>> in __init__
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: self.server_bind()
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/wsgiref/simple_server.py",
>> line 48, in server_bind
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: HTTPServer.server_bind(self)
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/BaseHTTPServer.py", line
>> 108, in server_bind
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: SocketServer.TCPServer.server_bind(self)
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 430,
>> in server_bind
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: self.socket.bind(self.server_address)
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/socket.py", line 224, in
>> meth
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: return getattr(self._sock,name)(*args)
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: socket.error: [Errno 98] Address already in use
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd:
>> ovirt-imageio-proxy.service: main process exited, code=exited,
>> status=1/FAILURE
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd:
>> Failed to start oVirt ImageIO Proxy.
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd:
>> Unit ovirt-imageio-proxy.service entered failed state.
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd:
>> ovirt-imageio-proxy.service failed.
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd:
>> ovirt-imageio-proxy.service holdoff time over, scheduling restart.
>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd:
>> Starting oVirt ImageIO Proxy...
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: Traceback (most recent call last):
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/bin/ovirt-imageio-proxy", line 85, in
>> <module>
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: status = image_proxy.main(args, config)
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File
>> "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/image_proxy.py", line
>> 21, in main
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: image_server.start(config)
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File
>> "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/server.py", line 45,
>> in start
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: WSGIRequestHandler)
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 419,
>> in __init__
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: self.server_bind()
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/wsgiref/simple_server.py",
>> line 48, in server_bind
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: HTTPServer.server_bind(self)
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/BaseHTTPServer.py", line
>> 108, in server_bind
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: SocketServer.TCPServer.server_bind(self)
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 430,
>> in server_bind
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: self.socket.bind(self.server_address)
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: File "/usr/lib64/python2.7/socket.py", line 224, in
>> meth
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: return getattr(self._sock,name)(*args)
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine
>> ovirt-imageio-proxy: socket.error: [Errno 98] Address already in use
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> ovirt-imageio-proxy.service: main process exited, code=exited,
>> status=1/FAILURE
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> Failed to start oVirt ImageIO Proxy.
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> Unit ovirt-imageio-proxy.service entered failed state.
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> ovirt-imageio-proxy.service failed.
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> ovirt-imageio-proxy.service holdoff time over, scheduling restart.
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> start request repeated too quickly for ovirt-imageio-proxy.service
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> Failed to start oVirt ImageIO Proxy.
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> Unit ovirt-imageio-proxy.service entered failed state.
>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd:
>> ovirt-imageio-proxy.service failed.
>>
>> </error>
>>
>>
>> _______________________________________________
>> Infra mailing list
>> Infra@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/infra
>>
>
>
> _______________________________________________
> Devel mailing list
> Devel@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
-- GAL bEN HAIM RHV DEVOPS
-- Didi