[ovirt-users] Hosted Engine-Setup issue additional host

Yedidyah Bar David didi at redhat.com
Mon Apr 27 08:55:26 UTC 2015


----- Original Message -----
> From: "Martin Sivak" <msivak at redhat.com>
> To: "Sven Achtelik" <Sven.Achtelik at mailpool.us>
> Cc: "Yedidyah Bar David" <didi at redhat.com>, "Roy Golan" <rgolan at redhat.com>, users at ovirt.org
> Sent: Monday, April 27, 2015 11:50:30 AM
> Subject: Re: AW: AW: AW: [ovirt-users] Hosted Engine-Setup issue additional host
> 
> Uh this really is weird.
> 
> The situation is clear though:
> 
> Broker dies when it tries to initialize logging (missing /dev/stdout ???)
> Agent dies because it can't connect to the broker.
> 
> My /dev/stdout looks like this:
> 
> lrwxrwxrwx. 1 root root 15 Mar 30 17:29 /dev/stdout -> /proc/self/fd/1
> 
> And /proc/self/fd/1 is obviously related to the process. But I have an idea.
> 
> Can you check whether the /proc/self/fd/1 is there? It might be missing if
> the broker closed its stdout during daemonizing.

If that's the problem, he can't see it from his shell: there, /proc/self
refers to the shell's own process, not the broker's.

Not sure how else to check without some small patch...
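
For illustration, a patch along those lines could be as small as the sketch
below. It is only an assumption about what such a check might look like and is
not taken from ovirt-hosted-engine-ha; describe_fd and open_via_proc are
hypothetical helper names. Run inside the daemon just before logging is set up,
it would report what fd 1 actually is in that process and whether it can be
reopened through /proc, which is what opening /dev/stdout amounts to. One case
worth ruling out on Linux: if fd 1 is a socket rather than a closed descriptor,
reopening it through /proc fails with ENXIO (errno 6), whereas a descriptor
that was closed gives ENOENT.

```python
import errno
import os
import stat


def describe_fd(fd=1):
    """Return a short description of what file descriptor `fd` refers to."""
    try:
        mode = os.fstat(fd).st_mode
    except OSError as exc:
        # EBADF here means the descriptor is not open in this process.
        return "not open (%s)" % errno.errorcode.get(exc.errno, exc.errno)
    kinds = [
        (stat.S_ISSOCK, "socket"),
        (stat.S_ISFIFO, "pipe"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISREG, "regular file"),
    ]
    for check, name in kinds:
        if check(mode):
            return name
    return "other"


def open_via_proc(fd=1):
    """Try to reopen `fd` for appending through /proc/self/fd (Linux only).

    Returns "ok" or the symbolic errno name of the failure. May block if
    `fd` is a pipe whose read end is closed.
    """
    path = "/proc/self/fd/%d" % fd
    try:
        new_fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(new_fd)
    return "ok"
```

Writing the two return values for fd 1 to stderr or syslog at daemon startup
would show which case applies on the failing host.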

> 
> --
> Martin Sivák
> msivak at redhat.com
> Red Hat Czech
> RHEV-M SLA / Brno, CZ
> 
> ----- Original Message -----
> > Yes,
> > 
> > -----------------------
> > [root at ovirt-node2 ~]# systemctl start ovirt-ha-broker.service && systemctl start ovirt-ha-agent.service
> > Job for ovirt-ha-broker.service failed. See 'systemctl status ovirt-ha-broker.service' and 'journalctl -xn' for details.
> > [root at ovirt-node2 ~]# journalctl -xn
> > -- Logs begin at Sun 2015-04-26 09:14:15 CDT, end at Mon 2015-04-27 02:49:33 CDT. --
> > Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
> > Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: stream = open(self.baseFilename, self.mode)
> > Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: IOError: [Errno 6] No such device or address: '/dev/stdout'
> > Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[29068]: [FAILED]
> > Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd[1]: ovirt-ha-broker.service: control process exited, code=exited status=1
> > Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd[1]: Failed to start oVirt Hosted Engine High Availability Communications Broker.
> > -- Subject: Unit ovirt-ha-broker.service has failed
> > -- Defined-By: systemd
> > -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> > --
> > -- Unit ovirt-ha-broker.service has failed.
> > --
> > -- The result is failed.
> > Apr 27 02:49:27 ovirt-node2.mgmt.asl.local systemd[1]: Unit ovirt-ha-broker.service entered failed state.
> > Apr 27 02:49:33 ovirt-node2.mgmt.asl.local vdsm[3309]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (
> > Apr 27 02:49:33 ovirt-node2.mgmt.asl.local vdsm[3309]: vdsm vds ERROR failed to retrieve Hosted Engine HA info
> >     Traceback (most recent call last):
> >       File "/usr/share/vdsm/API.py", line 1703, in _getHaInfo
> >         stats = instance.get_all_stats()
> >       File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 97, in get_all_stats
> >         with broker.connection():
> >       File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
> >         return self.gen.next()
> >       File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
> >         self.connect()
> >       File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
> >         raise BrokerConnectionError(error_msg)
> >     BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (5)
> > Apr 27 02:49:33 ovirt-node2.mgmt.asl.local libvirtd[1678]: metadata not found: Requested metadata element is not present
> > --------------------------------
> > 
> > 
> > -----Original Message-----
> > From: Yedidyah Bar David [mailto:didi at redhat.com]
> > Sent: Monday, 27 April 2015 09:46
> > To: Sven Achtelik; Martin Sivak
> > Cc: Roy Golan; users at ovirt.org
> > Subject: Re: AW: AW: [ovirt-users] Hosted Engine-Setup issue additional host
> > 
> > ----- Original Message -----
> > > From: "Sven Achtelik" <Sven.Achtelik at mailpool.us>
> > > To: "Yedidyah Bar David" <didi at redhat.com>
> > > Cc: "Roy Golan" <rgolan at redhat.com>, users at ovirt.org
> > > Sent: Monday, April 27, 2015 10:34:13 AM
> > > Subject: AW: AW: [ovirt-users] Hosted Engine-Setup issue additional
> > > host
> > > 
> > > Hi Didi,
> > > 
> > > results are
> > > ---------------------------
> > > [root at ovirt-node2 ~]# ls -l /dev/stdout
> > > lrwxrwxrwx 1 root root 15 Apr 26 09:14 /dev/stdout -> /proc/self/fd/1
> > > [root at ovirt-node2 ~]# echo test > /dev/stdout
> > > test
> > > ---------------------------
> > > Looks like everything is working fine.
> > 
> > And it still fails with the same message when you restart ha daemons?
> > 
> > Adding Martin.
> > 
> > Weird.
> > 
> > > 
> > > Sven
> > > 
> > > 
> > > -----Original Message-----
> > > From: Yedidyah Bar David [mailto:didi at redhat.com]
> > > Sent: Monday, 27 April 2015 08:57
> > > To: Sven Achtelik
> > > Cc: Roy Golan; users at ovirt.org
> > > Subject: Re: AW: [ovirt-users] Hosted Engine-Setup issue additional host
> > > 
> > > 
> > > 
> > > ----- Original Message -----
> > > > From: "Sven Achtelik" <Sven.Achtelik at mailpool.us>
> > > > To: "Roy Golan" <rgolan at redhat.com>, users at ovirt.org, "Yedidyah Bar
> > > > David" <didi at redhat.com>
> > > > Sent: Sunday, April 26, 2015 6:57:06 PM
> > > > Subject: AW: [ovirt-users] Hosted Engine-Setup issue additional host
> > > > 
> > > > On the node that fails to start the ha-broker and ha-agent I'm using:
> > > > 
> > > > ovirt-engine-sdk-python.noarch     3.5.2.1-1.el7.centos   @ovirt-3.5-pre
> > > > ovirt-host-deploy.noarch           1.3.1-1.el7            @ovirt-3.5
> > > > ovirt-hosted-engine-ha.noarch      1.2.5-1.el7.centos     @ovirt-3.5
> > > > ovirt-hosted-engine-setup.noarch   1.2.3-1.el7.centos     @ovirt-3.5-pre
> > > > ovirt-release35.noarch             003-1                  @/ovirt-release35
> > > > 
> > > > 
> > > > From: users-bounces at ovirt.org [mailto:users-bounces at ovirt.org] On Behalf Of Roy Golan
> > > > Sent: Sunday, 26 April 2015 16:59
> > > > To: users at ovirt.org; Yedidyah Bar David
> > > > Subject: Re: [ovirt-users] Hosted Engine-Setup issue additional host
> > > > 
> > > > On 04/26/2015 05:38 PM, Sven Achtelik wrote:
> > > > Hi All,
> > > > 
> > > > after a successful setup of hosted-engine on the first node I'm
> > > > having trouble completing it on an additional node. The Setup fails
> > > > with:
> > > > ---------------------------------
> > > > [ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
> > > > [ INFO  ] Still waiting for VDSM host to become operational...
> > > > [ INFO  ] The VDSM Host is now operational
> > > > [ INFO  ] Enabling and starting HA services
> > > > [ ERROR ] Failed to execute stage 'Closing up': Command '/bin/systemctl' failed to execute
> > > > [ INFO  ] Stage: Clean up
> > > > [ INFO  ] Generating answer file
> > > > '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150426080028.conf'
> > > > [ INFO  ] Stage: Pre-termination
> > > > [ INFO  ] Stage: Termination
> > > > ---------------------------------
> > > > After that the node is added to the cluster and shows as operational in
> > > > the GUI, but the hosted-engine broker and agent fail to start with these
> > > > error messages:
> > > > ----------------------------------
> > > > [root at ovirt-node2 ~]# systemctl status ovirt-ha-agent.service -l
> > > > ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
> > > >    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled)
> > > >    Active: failed (Result: exit-code) since Sun 2015-04-26 08:00:28 CDT; 20min ago
> > > >   Process: 5373 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-agent start (code=exited, status=1/FAILURE)
> > > > 
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: hdlr = FileHandler(filename, mode)
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: StreamHandler.__init__(self, self._open())
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: stream = open(self.baseFilename, self.mode)
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: IOError: [Errno 6] No such device or address: '/dev/stdout'
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-agent[5373]: [FAILED]
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: ovirt-ha-agent.service: control process exited, code=exited status=1
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Failed to start oVirt Hosted Engine High Availability Monitoring Agent.
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Unit ovirt-ha-agent.service entered failed state.
> > > > -------------------------------------
> > > > And
> > > > -------------------------------------
> > > > [root at ovirt-node2 ~]# systemctl status ovirt-ha-broker
> > > > ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
> > > >    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled)
> > > >    Active: failed (Result: exit-code) since Sun 2015-04-26 08:00:28 CDT; 21min ago
> > > >   Process: 5359 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start (code=exited, status=1/FAILURE)
> > > > 
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: hdlr = FileHandler(filename, mode)
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: File "/usr/lib64/python2.7/logging/__init__.py", line ...it__
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: StreamHandler.__init__(self, self._open())
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: File "/usr/lib64/python2.7/logging/__init__.py", line ...open
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: stream = open(self.baseFilename, self.mode)
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: IOError: [Errno 6] No such device or address: '/dev/stdout'
> > > > 
> > > > Didi, any clue?
> > > > The log says it runs as root, so I can rule that out.
> > > > 
> > > 
> > > That's weird. Please check/post:
> > > 
> > > ls -l /dev/stdout
> > > echo test > /dev/stdout
> > > 
> > > It should be a symlink to /proc/self/fd/1 .
> > > 
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd-ovirt-ha-broker[5359]: [FAILED]
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: ovirt-ha-broker.service: control process exited, code=exited status=1
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Failed to start oVirt Hosted Engine High Availability Communications Broker.
> > > > Apr 26 08:00:28 ovirt-node2.mgmt.asl.local systemd[1]: Unit ovirt-ha-broker.service entered failed state.
> > > > ----------------------------------------
> > > > 
> > > > The system is a CentOS 7 setup with SELinux switched off and no
> > > > firewall or iptables. How can I find out exactly which version of oVirt
> > > > I'm running?
> > > > I've had a look at the logs and read through old bug reports.
> > > > 
> > > > 
> > > > The rpm versions of ovirt* will be enough, I guess.
> > > > 
> > > > 
> > > > Thank you,
> > > > 
> > > > Sven
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > > Thanks,
> > > --
> > > Didi
> > > 
> > > 
> > 
> > Best,
> > --
> > Didi
> > 
> > 
> 

-- 
Didi



