Uh, this really is weird.
The situation is clear though:
The broker dies when it tries to initialize logging (missing /dev/stdout?).
The agent dies because it can't connect to the broker.
My /dev/stdout looks like this:
lrwxrwxrwx. 1 root root 15 Mar 30 17:29 /dev/stdout -> /proc/self/fd/1
And /proc/self/fd/1 is obviously specific to each process. But I have an idea.
Can you check whether /proc/self/fd/1 is there? It might be missing if
the broker closed its stdout while daemonizing.
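
One way to run that check from inside the failing process, rather than from an interactive shell, is sketched below. It is only an illustration: describe_fd is a hypothetical helper, not part of ovirt-hosted-engine-ha, and it would have to run in the same context as the broker because /proc/self is different for every process. It reports whether a descriptor is closed and, if it is open, what kind of object it points to. A socket is worth telling apart because Linux refuses to reopen a socket through /proc and fails with ENXIO (errno 6), so a closed stdout may not be the only way to end up with an unusable /dev/stdout.

```python
import os
import stat


def describe_fd(fd):
    """Return a short label for what file descriptor `fd` refers to.

    Hypothetical diagnostic helper, not taken from the oVirt sources.
    """
    try:
        mode = os.fstat(fd).st_mode
    except OSError:
        # fstat fails with EBADF when the descriptor is closed
        return "missing"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"
```

Calling describe_fd(1) early in the service's startup and writing the result to stderr or to a file would show what stdout looks like at the moment logging is initialized.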
--
Martin Sivák
msivak(a)redhat.com
Red Hat Czech
RHEV-M SLA / Brno, CZ
----- Original Message -----
Yes,
-----------------------
[root@ovirt-node2 ~]# systemctl start ovirt-ha-broker.service && systemctl start ovirt-ha-agent.service
Job for ovirt-ha-broker.service failed. See 'systemctl status ovirt-ha-broker.service' and 'journalctl -xn' for details.
[root@ovirt-node2 ~]# journalctl -xn
-- Logs begin at Sun 2015-04-26 09:14:15 CDT, end at Mon 2015-04-27 02:49:33 CDT. --
Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: stream = open(self.baseFilename, self.mode)
Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: IOError: [Errno 6] No such device or address: '/dev/stdout'
Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: [FAILED]
Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd[1]: ovirt-ha-broker.service: control process exited, code=exited status=1
Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd[1]: Failed to start oVirt Hosted Engine High Availability Communications Broker.
-- Subject: Unit ovirt-ha-broker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit ovirt-ha-broker.service has failed.
--
-- The result is failed.
Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd[1]: Unit ovirt-ha-broker.service entered failed state.
Apr 27 02:49:33 ovirt-node2.mgmt.asl.local vdsm[3309]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (
Apr 27 02:49:33 ovirt-node2.mgmt.asl.local vdsm[3309]: vdsm vds ERROR failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1703, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 97, in get_all_stats
    with broker.connection():
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (5)
Apr 27 02:49:33 ovirt-node2.mgmt.asl.local libvirtd[1678]: metadata not found: Requested metadata element is not present
--------------------------------
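
The broker traceback in the journal output above ends in logging's FileHandler calling open() on '/dev/stdout'. Below is a minimal sketch of one way to avoid that kind of failure, assuming the only goal is to log to standard output: fall back to a StreamHandler, which writes to the already-open sys.stdout instead of reopening the path. make_handler is a hypothetical helper for illustration; it is not the code ovirt-hosted-engine-ha uses, and not necessarily how the project would fix this.

```python
import logging
import sys


def make_handler(path="/dev/stdout"):
    """Return a log handler for `path`, falling back to sys.stdout.

    Hypothetical helper: FileHandler opens `path` itself, which fails
    if the path cannot be opened; StreamHandler reuses the open stream.
    """
    try:
        return logging.FileHandler(path)
    except (IOError, OSError):
        # Opening the path failed (as in the traceback), so write to
        # the stream the process already has instead.
        return logging.StreamHandler(sys.stdout)
```

A logger would then be set up with something like logging.getLogger().addHandler(make_handler()). If stdout itself is closed, the fallback cannot help either, so this only covers the case where the stream is open but the path cannot be reopened.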
-----Original Message-----
From: Yedidyah Bar David [mailto:didi@redhat.com]
Sent: Monday, April 27, 2015 09:46
To: Sven Achtelik; Martin Sivak
Cc: Roy Golan; users(a)ovirt.org
Subject: Re: AW: AW: [ovirt-users] Hosted Engine-Setup issue additional host
----- Original Message -----
> From: "Sven Achtelik" <Sven.Achtelik(a)mailpool.us>
> To: "Yedidyah Bar David" <didi(a)redhat.com>
> Cc: "Roy Golan" <rgolan(a)redhat.com>, users(a)ovirt.org
> Sent: Monday, April 27, 2015 10:34:13 AM
> Subject: AW: AW: [ovirt-users] Hosted Engine-Setup issue additional
> host
>
> Hi Didi,
>
> results are
> ---------------------------
> [root@ovirt-node2 ~]# ls -l /dev/stdout
> lrwxrwxrwx 1 root root 15 Apr 26 09:14 /dev/stdout -> /proc/self/fd/1
> [root@ovirt-node2 ~]# echo test > /dev/stdout
> test
> ---------------------------
> Looks like everything is working fine.
And it still fails with the same message when you restart the HA daemons?
Adding Martin.
Weird.
>
> Sven
>
>
> -----Original Message-----
> From: Yedidyah Bar David [mailto:didi@redhat.com]
> Sent: Monday, April 27, 2015 08:57
> To: Sven Achtelik
> Cc: Roy Golan; users(a)ovirt.org
> Subject: Re: AW: [ovirt-users] Hosted Engine-Setup issue additional host
>
>
>
> ----- Original Message -----
> > From: "Sven Achtelik" <Sven.Achtelik(a)mailpool.us>
> > To: "Roy Golan" <rgolan(a)redhat.com>, users(a)ovirt.org, "Yedidyah Bar David" <didi(a)redhat.com>
> > Sent: Sunday, April 26, 2015 6:57:06 PM
> > Subject: AW: [ovirt-users] Hosted Engine-Setup issue additional host
> >
> > On the node that fails to start the ha-broker and ha-agent I'm using:
> >
> > ovirt-engine-sdk-python.noarch    3.5.2.1-1.el7.centos  @ovirt-3.5-pre
> > ovirt-host-deploy.noarch          1.3.1-1.el7           @ovirt-3.5
> > ovirt-hosted-engine-ha.noarch     1.2.5-1.el7.centos    @ovirt-3.5
> > ovirt-hosted-engine-setup.noarch  1.2.3-1.el7.centos    @ovirt-3.5-pre
> > ovirt-release35.noarch            003-1                 @/ovirt-release35
> >
> >
> > From: users-bounces(a)ovirt.org [mailto:users-bounces@ovirt.org] On Behalf Of Roy Golan
> > Sent: Sunday, April 26, 2015 16:59
> > To: users(a)ovirt.org; Yedidyah Bar David
> > Subject: Re: [ovirt-users] Hosted Engine-Setup issue additional host
> >
> > On 04/26/2015 05:38 PM, Sven Achtelik wrote:
> > Hi All,
> >
> > after a successful setup of hosted-engine on the first node I'm
> > having trouble completing it on an additional node. The Setup fails with:
> > ---------------------------------
> > [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes...
> > [ INFO ] Still waiting for VDSM host to become operational...
> > [ INFO ] The VDSM Host is now operational
> > [ INFO ] Enabling and starting HA services
> > [ ERROR ] Failed to execute stage 'Closing up': Command '/bin/systemctl' failed to execute
> > [ INFO ] Stage: Clean up
> > [ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150426080028.conf'
> > [ INFO ] Stage: Pre-termination
> > [ INFO ] Stage: Termination
> > ---------------------------------
> > After that the node is added to the cluster and is operational from
> > the GUI, but the hosted engine broker and agent fail to start with
> > these error messages:
> > ----------------------------------
> > [root@ovirt-node2 ~]# systemctl status ovirt-ha-agent.service -l
> > ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
> > Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled)
> > Active: failed (Result: exit-code) since Sun 2015-04-26 08:00:28 CDT; 20min ago
> > Process: 5373 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-agent start (code=exited, status=1/FAILURE)
> >
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: hdlr = FileHandler(filename, mode)
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: StreamHandler.__init__(self, self._open())
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: stream = open(self.baseFilename, self.mode)
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: IOError: [Errno 6] No such device or address: '/dev/stdout'
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: [FAILED]
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: ovirt-ha-agent.service: control process exited, code=exited status=1
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Failed to start oVirt Hosted Engine High Availability Monitoring Agent.
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Unit ovirt-ha-agent.service entered failed state.
> > -------------------------------------
> > And
> > -------------------------------------
> > [root@ovirt-node2 ~]# systemctl status ovirt-ha-broker
> > ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
> > Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled)
> > Active: failed (Result: exit-code) since Sun 2015-04-26 08:00:28 CDT; 21min ago
> > Process: 5359 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start (code=exited, status=1/FAILURE)
> >
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: hdlr = FileHandler(filename, mode)
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: File "/usr/lib64/python2.7/logging/__init__.py", line ...it__
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: StreamHandler.__init__(self, self._open())
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: File "/usr/lib64/python2.7/logging/__init__.py", line ...open
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: stream = open(self.baseFilename, self.mode)
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: IOError: [Errno 6] No such device or address: '/dev/stdout'
> >
> > Didi, any clue?
> > The log says it runs as root, so I can rule that out.
> >
>
> That's weird. Please check/post:
>
> ls -l /dev/stdout
> echo test > /dev/stdout
>
> It should be a symlink to /proc/self/fd/1 .
>
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: [FAILED]
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: ovirt-ha-broker.service: control process exited, code=exited status=1
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Failed to start oVirt Hosted Engine High Availability Communications Broker.
> > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Unit ovirt-ha-broker.service entered failed state.
> > ----------------------------------------
> >
> > The system is a CentOS 7 setup with SELinux switched off, no
> > firewall or iptables. How can I find out which version of oVirt I'm
> > running exactly?
> > I've had a look at the logs and read through old bug reports.
> >
> >
> > The rpm versions of the ovirt* packages will be enough, I guess.
> >
> >
> > Thank you,
> >
> > Sven
> >
> >
> >
> >
> >
>
> Thanks,
> --
> Didi
>
>
Best,
--
Didi