[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 23-11-2017 ] [ 001_initialize_engine.test_initialize_engine ]

Allon Mureinik amureini at redhat.com
Mon Nov 27 15:37:24 UTC 2017


Didi - we just saw a report of a similar failure on engine-4.1's OST.
Could you please backport these patches there too?

On Mon, Nov 27, 2017 at 2:57 PM, Yedidyah Bar David <didi at redhat.com> wrote:

> On Mon, Nov 27, 2017 at 10:38 AM, Yedidyah Bar David <didi at redhat.com>
> wrote:
>
>> On Sun, Nov 26, 2017 at 7:24 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>> > I think we need to check and report which process is listening on a
>> > port when starting a server on that port fails.
>>
>> How do you know that a server was "started on that port", and that
>> it failed specifically because it failed to bind?
>>
>> There is no standardized (Unix) way to mark that a service wants to
>> listen on a specific port, or that it failed because a specific port
>> was bound by some other process.
>>
>> There are various classical *inetd* daemons, and modern systemd.socket,
>> that listen *instead* of some service. Then they can manage the port
>> resources and perhaps do something intelligent about them.
>>
>> >
>> > Didi, do you think we can integrate this in the deploy code, or this
>> > should be implemented in each server?
>>
>> It should be quite easy to patch otopi's services.state to run something
>> if start fails, e.g. 'ss -anp' or whatever you want.
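
A minimal sketch of what such a "run something if start fails" hook might collect, assuming a Python 3 helper and the narrower `ss -ltnp` (listening TCP sockets only) rather than `ss -anp`. The function names and the sample output are hypothetical and are not part of otopi:

```python
import subprocess


def listeners_on_port(ss_output, port):
    """Return the `ss -ltnp` rows whose local address ends in :<port>."""
    matches = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Data rows start with the socket state; skip the header and blanks.
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        local = fields[3]  # e.g. "*:22", "0.0.0.0:22" or "[::]:22"
        if local.rsplit(":", 1)[-1] == str(port):
            matches.append(line.strip())
    return matches


def report_listeners(port):
    """Run ss and return the listeners on `port`.

    Process names only show up for sockets the caller may inspect,
    so this is most useful when run as root.
    """
    out = subprocess.run(
        ["ss", "-ltnp"], capture_output=True, text=True, check=True
    ).stdout
    return listeners_on_port(out, port)


# Hypothetical sample output, not taken from the failing host.
SAMPLE = """\
State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      5      *:12345            *:*   users:(("python",pid=42,fd=3))
LISTEN 0      128    [::]:22            [::]:*
"""

for row in listeners_on_port(SAMPLE, 12345):
    print(row)
```

Keeping the parsing separate from the `ss` call means the filter can be exercised on captured output, for example from a log saved after a failed run.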
>>
>> It should even be not-too-hard to do this in a self-contained plugin,
>> so it can be part of otopi-debug-plugins.
>>
>> If we decide that something needs to be implemented by each server,
>> perhaps that "something" should be to have it controlled by a
>> systemd.socket unit. I didn't try it, though, to see what this actually
>> buys us.
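
For reference, a rough sketch of what a server would have to do to be controlled by a systemd.socket unit. With socket activation, systemd binds the port itself and hands the listening socket to the service as file descriptor 3 onwards, announcing it through the `LISTEN_PID` and `LISTEN_FDS` environment variables. The helper names below are hypothetical, the code assumes Python 3.7 or later, and it is not how ovirt-imageio-proxy is implemented:

```python
import os
import socket

# First inherited descriptor in the systemd socket-activation protocol.
SD_LISTEN_FDS_START = 3


def inherited_fds(environ=None, pid=None):
    """Return the descriptors passed by systemd, or [] if there are none."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The variables only apply to the process they were set for.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def listening_socket(port):
    """Use the socket passed by systemd, else bind the port ourselves."""
    fds = inherited_fds()
    if fds:
        # Family and type are detected from the descriptor (Python 3.7+).
        return socket.socket(fileno=fds[0])
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("", port))
    sock.listen(5)
    return sock
```

In the activated case the service never calls bind(), so a port conflict would surface when systemd starts the socket unit rather than inside the service's own traceback.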
>>
>> >
>> > Maybe when deployment fails, the deploy code can report all the
>> > listening sockets and the processes bound to these sockets?
>>
>> Pushed now:
>>
>> https://gerrit.ovirt.org/84699 core: Name TRANSACTION_INIT
>> https://gerrit.ovirt.org/84700 plugins: debug: Add debug_failure
>> https://gerrit.ovirt.org/84701 automation: Test failure
>>
>> Will merge soon, if all goes well.
>>
>
> Merged them.
>
> Pushed to OST:
>
> https://gerrit.ovirt.org/84710
>
> Dafna - thanks for opening the bug on ovirt-imageio, but I am not
> sure anyone can do much about it without more info, such as might
> be provided by above patches. When I suggested below to open BZ
> I meant on otopi or host-deploy to provide more debugging info,
> not for imageio - obviously no harm in opening it, and it's good
> to have it even if only for reference.
>
>
>>
>> Feel free to open BZ for other things discussed above, if relevant.
>>
>> >
>> > Nir
>> >
>> > On Sun, Nov 26, 2017 at 7:11 PM Gal Ben Haim <gbenhaim at redhat.com> wrote:
>> >>
>> >> The failure is not consistent.
>> >>
>> >> On Sun, Nov 26, 2017 at 5:33 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Sun, Nov 26, 2017 at 4:53 PM, Gal Ben Haim <gbenhaim at redhat.com>
>> >>> wrote:
>> >>>>
>> >>>> We still see this issue on the upgrade suite from latest release to
>> >>>> master [1].
>> >>>> I don't see any evidence in "/var/log/messages" [2] that
>> >>>> "ovirt-imageio-proxy" was started twice.
>> >>>
>> >>>
>> >>> Since it's not a registered port and a high port, could it be used by
>> >>> something else (what are the odds, though)?
>> >>> Is it consistent?
>> >>> Y.
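
One way a high, unregistered port could be "used by something else" is if it falls inside the kernel's ephemeral port range, from which local ports for outgoing connections are picked. This is only a possible explanation, not something confirmed in this thread. A small sketch for checking it on a Linux host follows; the function names and the port number 50000 are hypothetical, and the range is tunable, so it should be read from the host rather than assumed:

```python
def parse_port_range(text):
    """Parse the contents of ip_local_port_range, e.g. '32768\t60999'."""
    low, high = text.split()
    return int(low), int(high)


def in_ephemeral_range(port, port_range):
    """True if `port` may be handed out as a local port for outgoing connections."""
    low, high = port_range
    return low <= port <= high


def local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the current ephemeral range from the running Linux kernel."""
    with open(path) as f:
        return parse_port_range(f.read())


# Example with a hypothetical port and a commonly seen range.
example_range = parse_port_range("32768\t60999\n")
print(in_ephemeral_range(50000, example_range))
```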
>> >>>
>> >>>>
>> >>>>
>> >>>> [1]
>> >>>> http://jenkins.ovirt.org/blue/rest/organizations/jenkins/pipelines/ovirt-master_change-queue-tester/runs/4153/nodes/123/steps/241/log/?start=0
>> >>>>
>> >>>> [2]
>> >>>> http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/4153/artifact/exported-artifacts/upgrade-from-release-suit-master-el7/test_logs/upgrade-from-release-suite-master/post-001_initialize_engine.py/lago-upgrade-from-release-suite-master-engine/_var_log/messages/*view*/
>> >>>>
>> >>>> On Fri, Nov 24, 2017 at 8:16 PM, Dafna Ron <dron at redhat.com> wrote:
>> >>>>>
>> >>>>> There were two different patches reported as failing CQ today with
>> >>>>> the ovirt-imageio-proxy service failing to start.
>> >>>>>
>> >>>>> Here is the latest failure:
>> >>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4130/artifact
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On 11/23/2017 03:39 PM, Allon Mureinik wrote:
>> >>>>>
>> >>>>> Daniel/Nir?
>> >>>>>
>> >>>>> On Thu, Nov 23, 2017 at 5:29 PM, Dafna Ron <dron at redhat.com> wrote:
>> >>>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> We have a failure in test
>> >>>>>> 001_initialize_engine.test_initialize_engine.
>> >>>>>>
>> >>>>>> It is failing with the error: Failed to start service
>> >>>>>> 'ovirt-imageio-proxy
>> >>>>>>
>> >>>>>>
>> >>>>>> Link and headline of suspected patches:
>> >>>>>>
>> >>>>>> build: Make resulting RPMs architecture-specific -
>> >>>>>> https://gerrit.ovirt.org/#/c/84534/
>> >>>>>>
>> >>>>>>
>> >>>>>> Link to Job:
>> >>>>>>
>> >>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4055
>> >>>>>>
>> >>>>>>
>> >>>>>> Link to all logs:
>> >>>>>>
>> >>>>>>
>> >>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4055/artifact/
>> >>>>>>
>> >>>>>>
>> >>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4055/artifact/exported-artifacts/upgrade-from-release-suit-master-el7/test_logs/upgrade-from-release-suite-master/post-001_initialize_engine.py/lago-upgrade-from-release-suite-master-engine/_var_log/messages/*view*/
>> >>>>>>
>> >>>>>>
>> >>>>>> (Relevant) error snippet from the log:
>> >>>>>>
>> >>>>>> <error>
>> >>>>>>
>> >>>>>>
>> >>>>>> from lago log:
>> >>>>>>
>> >>>>>> Failed to start service 'ovirt-imageio-proxy
>> >>>>>>
>> >>>>>> messages logs:
>> >>>>>>
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd: Starting Session 8 of user root.
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: Traceback (most recent call last):
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/bin/ovirt-imageio-proxy", line 85, in <module>
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: status = image_proxy.main(args, config)
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/image_proxy.py", line 21, in main
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: image_server.start(config)
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/server.py", line 45, in start
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: WSGIRequestHandler)
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: self.server_bind()
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: HTTPServer.server_bind(self)
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: SocketServer.TCPServer.server_bind(self)
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: self.socket.bind(self.server_address)
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/socket.py", line 224, in meth
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: return getattr(self._sock,name)(*args)
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: socket.error: [Errno 98] Address already in use
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd: ovirt-imageio-proxy.service: main process exited, code=exited, status=1/FAILURE
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd: Failed to start oVirt ImageIO Proxy.
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd: Unit ovirt-imageio-proxy.service entered failed state.
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd: ovirt-imageio-proxy.service failed.
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd: ovirt-imageio-proxy.service holdoff time over, scheduling restart.
>> >>>>>> Nov 23 07:30:47 lago-upgrade-from-release-suite-master-engine systemd: Starting oVirt ImageIO Proxy...
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: Traceback (most recent call last):
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/bin/ovirt-imageio-proxy", line 85, in <module>
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: status = image_proxy.main(args, config)
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/image_proxy.py", line 21, in main
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: image_server.start(config)
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib/python2.7/site-packages/ovirt_imageio_proxy/server.py", line 45, in start
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: WSGIRequestHandler)
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: self.server_bind()
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: HTTPServer.server_bind(self)
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: SocketServer.TCPServer.server_bind(self)
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: self.socket.bind(self.server_address)
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: File "/usr/lib64/python2.7/socket.py", line 224, in meth
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: return getattr(self._sock,name)(*args)
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine ovirt-imageio-proxy: socket.error: [Errno 98] Address already in use
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: ovirt-imageio-proxy.service: main process exited, code=exited, status=1/FAILURE
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: Failed to start oVirt ImageIO Proxy.
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: Unit ovirt-imageio-proxy.service entered failed state.
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: ovirt-imageio-proxy.service failed.
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: ovirt-imageio-proxy.service holdoff time over, scheduling restart.
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: start request repeated too quickly for ovirt-imageio-proxy.service
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: Failed to start oVirt ImageIO Proxy.
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: Unit ovirt-imageio-proxy.service entered failed state.
>> >>>>>> Nov 23 07:30:48 lago-upgrade-from-release-suite-master-engine systemd: ovirt-imageio-proxy.service failed.
>> >>>>>>
>> >>>>>> </error>
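
The traceback above ends in bind() failing with "Address already in use". As an illustration only, the same condition can be reproduced in a few lines of Python 3 (the log itself comes from Python 2.7). The helper name `try_bind` is hypothetical, and the port is whatever free port the kernel picks, not the proxy's port:

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Return None if bind() succeeds, or the errno if it fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as e:
        return e.errno
    finally:
        sock.close()
    return None


# Occupy a free port chosen by the kernel, then try to bind it again.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
holder.listen(1)
port = holder.getsockname()[1]

result = try_bind(port)  # second bind on the same address and port
holder.close()

# On Linux errno.EADDRINUSE is 98, the number shown in the log.
print(result == errno.EADDRINUSE)
```

The error only says that the address is taken. It does not say by whom, which is why the thread asks for a report of the listening processes at failure time.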
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> _______________________________________________
>> >>>>>> Infra mailing list
>> >>>>>> Infra at ovirt.org
>> >>>>>> http://lists.ovirt.org/mailman/listinfo/infra
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Devel mailing list
>> >>>>> Devel at ovirt.org
>> >>>>> http://lists.ovirt.org/mailman/listinfo/devel
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> GAL bEN HAIM
>> >>>> RHV DEVOPS
>> >>>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> GAL bEN HAIM
>> >> RHV DEVOPS
>>
>>
>>
>> --
>> Didi
>>
>
>
>
> --
> Didi
>
>