Re: [ovirt-users] Adding another host to my cluster

13 May 2016


Hi Gervais,

   Okay, I see two problems: there are some leftover directories causing 
issues, and for some reason VDSM seems to be trying to bind to a port 
that something is already listening on (probably an older version of VDSM).  
Try removing the duplicate dirs (rmdir 
/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286 and 
/rhev/data-center/mnt).  If they aren't empty, don't rm -rf them, because 
they might be mounted from your production servers.  Just mv -i them to 
/root or somewhere.
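
For what it's worth, here is a rough sketch of how one might check those 
directories before touching them.  The helper name dir_state is made up 
for this example, and the /proc/mounts check in the comments assumes a 
Linux host:

```shell
#!/bin/sh
# dir_state DIR: print "missing", "empty", or "not-empty" for DIR.
# Only an "empty" directory is a safe candidate for rmdir; anything
# else should be inspected (and moved aside with mv -i) instead.
dir_state() {
    if [ ! -d "$1" ]; then
        echo missing
    elif [ -n "$(ls -A "$1" 2>/dev/null)" ]; then
        echo not-empty
    else
        echo empty
    fi
}

# Example use on the host (paths are the two named above):
#   for d in /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286 \
#            /rhev/data-center/mnt; do
#       echo "$d: $(dir_state "$d")"
#   done
#   grep /rhev/data-center /proc/mounts   # anything listed is still mounted
```

If a directory reports "not-empty" or shows up in /proc/mounts, leave it 
alone until you know what is mounted there.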

   Next, shut down the vdsm service with "service vdsm stop" (I think; 
it might be "service stop vdsm", I don't use CentOS much) and kill any 
running vdsm processes (ps ax | grep vdsm).  The error that I saw was:

MainThread::ERROR::2016-05-13 
08:58:38,262::clientIF::128::vds::(__init__) failed to init clientIF, 
shutting down storage dispatcher
MainThread::ERROR::2016-05-13 08:58:38,289::vdsm::171::vds::(run) 
Exception raised
Traceback (most recent call last):
   File "/usr/share/vdsm/vdsm", line 169, in run
     serve_clients(log)
   File "/usr/share/vdsm/vdsm", line 102, in serve_clients
     cif = clientIF.getInstance(irs, log, scheduler)
   File "/usr/share/vdsm/clientIF.py", line 193, in getInstance
     cls._instance = clientIF(irs, log, scheduler)
   File "/usr/share/vdsm/clientIF.py", line 123, in __init__
     self._createAcceptor(host, port)
   File "/usr/share/vdsm/clientIF.py", line 201, in _createAcceptor
     port, sslctx)
   File "/usr/share/vdsm/protocoldetector.py", line 170, in __init__
     sock = _create_socket(host, port)
   File "/usr/share/vdsm/protocoldetector.py", line 40, in _create_socket
     server_socket.bind(addr[0][4])
   File "/usr/lib64/python2.7/socket.py", line 224, in meth
     return getattr(self._sock,name)(*args)
error: [Errno 98] Address already in use

If you get the same error, do a netstat -lnp and compare it to the same 
from a working box to see if something else is running on the VDSM port.
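
To narrow the netstat output down to the VDSM port, something like this 
could work.  It assumes the port is 54321, which is what the "Listening 
at 0.0.0.0:54321" line in the quoted output below reports, and 
filter_port is just a name picked for this sketch:

```shell
#!/bin/sh
# Port vdsm reported in the quoted output; override if yours differs.
VDSM_PORT="${VDSM_PORT:-54321}"

# filter_port PORT: read `netstat -lnp` output on stdin and print only
# the rows whose local address (4th column) ends in ":PORT".
filter_port() {
    awk -v port="$1" '$4 ~ (":" port "$") { print }'
}

# Example use (run as root so the PID/Program column is filled in):
#   netstat -lnp | filter_port "$VDSM_PORT"
```

Any row it prints names the process that already holds the port, which 
is the one to stop before starting VDSM again.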

On 2016-05-13 09:37 AM, Gervais de Montbrun wrote:
...
Hi Charles,
I think the problem I am having is due to the setup failing and not 
something in the vdsm configs, as I have never gotten this server to start 
up properly and the BRIDGE ethernet interface + ovirt routes are not 
set up.
I put the logs here: 
https://www.dropbox.com/sh/5ugyykqh1lgru9l/AACXxRYWr3tgd0WbBVFW5twHa?dl=0
hosted-engine--deploy-logs.zip  # Logs from when I tried to deploy and it failed
vdsm.tar.gz                     # /var/log/vdsm
Output from running vdsm from the command line:
    [root@cultivar2 log]# su -s /bin/bash vdsm
    [vdsm@cultivar2 log]$ python /usr/share/vdsm/vdsm
    (PID: 6521) I am the actual vdsm 4.17.26-1.el7
    cultivar2.grove.silverorange.com (3.10.0-327.el7.x86_64)
    VDSM will run with cpu affinity: frozenset([1])
    /usr/bin/taskset --all-tasks --pid --cpu-list 1 6521 (cwd None)
    SUCCESS: <err> = ''; <rc> = 0
    Starting scheduler vdsm.Scheduler
    started
    Run and protect: registerDomainStateChangeCallback(callbackFunc=<functools.partial object at 0x381b158>)
    Run and protect: registerDomainStateChangeCallback, Return response: None
    Trying to connect to Super Vdsm
    Preparing MOM interface
    Using named unix socket /var/run/vdsm/mom-vdsm.sock
    Unregistering all secrests
    trying to connect libvirt
    recovery: started
    Setting channels' timeout to 30 seconds.
    Starting VM channels listener thread.
    Listening at 0.0.0.0:54321
    Adding detector <rpc.bindingxmlrpc.XmlDetector instance at 0x3b4ecb0>
    recovery: completed in 0s
    Adding detector <yajsonrpc.stompreactor.StompDetector instance at 0x382e5a8>
    Starting executor
    Starting worker jsonrpc.Executor/0
    Worker started
    Starting worker jsonrpc.Executor/1
    Worker started
    Starting worker jsonrpc.Executor/2
    Worker started
    Starting worker jsonrpc.Executor/3
    Worker started
    Starting worker jsonrpc.Executor/4
    Worker started
    Starting worker jsonrpc.Executor/5
    Worker started
    Starting worker jsonrpc.Executor/6
    Worker started
    Starting worker jsonrpc.Executor/7
    Worker started
    XMLRPC server running
    Starting executor
    Starting worker periodic/0
    Worker started
    Starting worker periodic/1
    Worker started
    Starting worker periodic/2
    Worker started
    Starting worker periodic/3
    Worker started
    trying to connect libvirt
    Panic: Connect to supervdsm service failed: [Errno 2] No such file or directory
    Traceback (most recent call last):
      File "/usr/share/vdsm/supervdsm.py", line 78, in _connect
        utils.retry(self._manager.connect, Exception, timeout=60, tries=3)
      File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 959, in retry
        return func()
      File "/usr/lib64/python2.7/multiprocessing/managers.py", line 500, in connect
        conn = Client(self._address, authkey=self._authkey)
      File "/usr/lib64/python2.7/multiprocessing/connection.py", line 173, in Client
        c = SocketClient(address)
      File "/usr/lib64/python2.7/multiprocessing/connection.py", line 308, in SocketClient
        s.connect(address)
      File "/usr/lib64/python2.7/socket.py", line 224, in meth
        return getattr(self._sock,name)(*args)
    error: [Errno 2] No such file or directory
    Killed
Thanks for the help. It's really appreciated.
Cheers,
Gervais
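
The panic in the output above ([Errno 2] No such file or directory while 
connecting to supervdsm) suggests the unix socket that vdsm expects from 
the supervdsm service was not there, which is one plausible result of 
starting vdsm by hand without supervdsm running.  A minimal sketch for 
checking that follows.  The socket path /var/run/vdsm/svdsm.sock and the 
unit names supervdsmd and vdsmd are assumptions (they are not stated 
anywhere in this thread), and socket_state is a name invented for the 
example:

```shell
#!/bin/sh
# Assumed default location of the supervdsm socket; adjust if your
# vdsm build uses a different path.
SVDSM_SOCK="${SVDSM_SOCK:-/var/run/vdsm/svdsm.sock}"

# socket_state PATH: print "socket" if PATH is a unix socket,
# "not-a-socket" if something else exists there, "missing" otherwise.
socket_state() {
    if [ -S "$1" ]; then
        echo socket
    elif [ -e "$1" ]; then
        echo not-a-socket
    else
        echo missing
    fi
}

# Example use on the failing host:
#   socket_state "$SVDSM_SOCK"
#   systemctl status supervdsmd vdsmd   # assumed unit names
```

If it prints "missing", starting the supervdsm service first and then 
retrying vdsm would be the next thing to try.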
On Fri, May 13, 2016 at 12:55 AM, Charles Tassell <ctassell@gmail.com> wrote:
Hi Gervais,
Hmm, can you tar up the logfiles (/var/log/vdsm/* on the host
    you are installing on) and put them somewhere to look at?  Also, I
    found that starting VDSM from the command line is useful as it
    sometimes spits out error messages that don't show up in the
    logs.  I think the command I used was:
    su -s /bin/bash vdsm
    python /usr/share/vdsm/vdsm
My problem was that I customized the logging settings in
    /etc/vdsm/*conf to try and tone down the debugging stuff and had a
    syntax error.
On 16-05-12 10:24 PM, Gervais de Montbrun wrote:
...
Hi Charles,
Thanks for the suggestion.
I cleaned up again using the bash script from the
    recoving-from-failed-install link below, then reinstalled (yum
    install ovirt-hosted-engine-setup).
I enabled NetworkManager and firewalld as you suggested. The
    install stops very early on with an error:
    [ ERROR ] Failed to execute stage 'Programs detection':
    hosted-engine cannot be deployed while NetworkManager is running,
    please stop and disable it before proceeding
I disabled and stopped NetworkManager and tried again. Same
    result. :(
Any more guesses?
Cheers,
    Gervais
...
On May 12, 2016, at 9:08 PM, Charles Tassell <ctassell@gmail.com> wrote:
Hey Gervais,
Try enabling NetworkManager and firewalld before doing the
    hosted-engine --deploy.  I have run into problems with oVirt
    trying to perform tasks on hosts where firewalld is disabled, so
    maybe you are running into a similar problem.  Also, I think the
    setup script will disable NetworkManager if it needs to.  I know
    I didn't manually disable it on any of the boxes I installed on.
On 16-05-12 04:49 PM, users-request@ovirt.org wrote:
...
Message: 1
    Date: Thu, 12 May 2016 14:22:12 -0300
    From: Gervais de Montbrun <gervais@demontbrun.com>
    To: Wee Sritippho <wee.s@forest.go.th>
    Cc: users <users@ovirt.org>
    Subject: Re: [ovirt-users] Adding another host to my cluster
    Message-ID: <28B7FC74-5C52-4F60-B9F3-39A36621A7CA@demontbrun.com>
    Content-Type: text/plain; charset="utf-8"
Hi Wee
    (and others)
Thanks for the reply. I tried what you suggested, but I am in
    the exact same state. :-(
I don't want to completely remove my hosted engine setup as it
    is working on the two other hosts in my cluster. I did not run
    the rm -rf steps listed here
    (https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install)
    that would wipe my hosted_engine nfs mount. If you know that
    this is 100% necessary, please let me know.
I did:
    ran "hosted-engine --clean-metadata --force-cleanup --host-id=3"
    ran the bash script to remove all of the ovirt packages and
    config files
    reinstalled ovirt-hosted-engine-setup
    ran "hosted-engine --deploy"
I'm back exactly where I started. Is there a way to run just
    the network configuration part of the deploy?
Since the last attempt, I did upgrade my hosted engine and my
    cluster is now running oVirt 3.6.5.
Cheers,
    Gervais
...
On May 12, 2016, at 11:50 AM, Wee Sritippho
    <wee.s@forest.go.th <mailto:wee.s@forest.go.th>> wrote:
Hi,
I used to have a similar problem where one of my hosts couldn't be
    deployed due to the absence of the ovirtmgmt bridge. Simone said
    it's a bug (https://bugzilla.redhat.com/1323465) which would be
    fixed in 3.6.6.
This is what I've done to solve it:
    1. In the web UI, set the failed host to maintenance.
    2. Remove it.
    3. On that host, run the script from
       https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install
    4. Install ovirt-hosted-engine-setup again.
    5. Redeploy.
Hope that helps
On 11 May 2016 at 22:48:58 GMT+07:00,
    Gervais de Montbrun <gervais@demontbrun.com> wrote:
    Hi Folks,
I hate to reply to my own message, but I'm really hoping
    someone can help me with my issue:
    http://lists.ovirt.org/pipermail/users/2016-May/039690.html
Does anyone have a suggestion for me? If there is any more
    information that I can provide that would help you to help me,
    please advise.
Cheers,
    Gervais
...
On May 9, 2016, at 1:42 PM, Gervais de Montbrun
    <gervais@demontbrun.com> wrote:
Hi All,
I'm trying to add a third host to my oVirt cluster. I have
    hosted engine set up on the first two. It's failing to finish
    the hosted-engine --deploy on this third host. I wiped the
    server, did a CentOS 7 minimal install, and ran it again to
    have a clean machine.
My setup:
    CentOS 7 clean install
    yum install -y http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
    yum install -y ovirt-hosted-engine-setup
    yum upgrade -y && reboot
    systemctl disable NetworkManager ; systemctl stop NetworkManager ; systemctl disable firewalld ; systemctl stop firewalld
    hosted-engine --deploy
hosted-engine --deploy always throws an error:
    [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
    [ ERROR ] Unable to add Cultivar2 to the manager
    and then echoes
    [ INFO  ] Waiting for VDSM hardware info
    ...
    [ ERROR ] Failed to execute stage 'Closing up': VDSM did not start within 120 seconds
    [ INFO  ] Stage: Clean up
    [ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160509131103.conf'
    [ INFO  ] Stage: Pre-termination
    [ INFO  ] Stage: Termination
    [ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue, fix and redeploy
        Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160509130658-qb8ev0.log
The full output of hosted-engine --deploy is included in the
    attached zip file.
    I've also included vdsm.log (there is more than one attempt's
    worth of output in there).
    You'll also find the
    ovirt-hosted-engine-setup-20160509130658-qb8ev0.log listed above.
This is my "test" setup. Cultivar0 is my first host and my
    nfs server for storage. I have two hosts in the setup already
    and everything is working fine. The host does show up in the
    oVirt admin, but shows "Installed Failed"
    <PastedGraphic-1.png>
Trying to reinstall from within the interface just fails again.
The ovirt bridge interface is not configured and there are no
    config files in /etc/sysconfig/network-scripts related to ovirt.
OS:
    [root@cultivar2 ovirt-hosted-engine-setup]# cat /etc/redhat-release
    CentOS Linux release 7.2.1511 (Core)
    [root@cultivar2 ovirt-hosted-engine-setup]# uname -a
    Linux cultivar2.grove.silverorange.com 3.10.0-327.13.1.el7.x86_64 #1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Versions:
    [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i ovirt
    libgovirt-0.3.3-1.el7_2.1.x86_64
    ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
    ovirt-host-deploy-1.4.1-1.el7.centos.noarch
    ovirt-vmconsole-1.0.0-1.el7.centos.noarch
    ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
    ovirt-release36-007-1.noarch
    ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
    ovirt-setup-lib-1.0.1-1.el7.centos.noarch
    ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
    [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i virt
    libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64
    virt-viewer-2.0-6.el7.x86_64
    libgovirt-0.3.3-1.el7_2.1.x86_64
    libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64
    ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
    fence-virt-0.3.2-2.el7.x86_64
    virt-what-1.13-6.el7.x86_64
    libvirt-python-1.2.17-2.el7.x86_64
    libvirt-daemon-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.4.x86_64
    libvirt-lock-sanlock-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-network-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-storage-1.2.17-13.el7_2.4.x86_64
    ovirt-host-deploy-1.4.1-1.el7.centos.noarch
    virt-v2v-1.28.1-1.55.el7.centos.2.x86_64
    ovirt-vmconsole-1.0.0-1.el7.centos.noarch
    ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
    libvirt-client-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.4.x86_64
    ovirt-release36-007-1.noarch
    libvirt-daemon-driver-interface-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
    ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
    ovirt-setup-lib-1.0.1-1.el7.centos.noarch
    ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
I also have a series of stuck tasks that I can't clear
    related to the host that can't be added... This is a
    secondary issue and I don't want to get off track, but they
    look like this:
    <PastedGraphic-2.png>
I'd appreciate any help that can be offered.
Cheers,
    Gervais
Gervais de Montbrun
    Systems Administrator  / silverorange Inc.
Phone +1 902 367 4532 ext. 104
    Mobile +1 902 978 0009
<hosted-engine--deploy-logs.zip>
Users mailing list
    Users@ovirt.org
    http://lists.ovirt.org/mailman/listinfo/users
-- 
    Wee
--------------030000020408000408030009
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 8bit

<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi Gervais,<br>
      <br>
        Okay, I see two problems: there are some leftover directories
      causing issues, and for some reason VDSM seems to be trying to bind
      to a port that something is already listening on (probably an older
      version of VDSM).  Try removing the duplicate dirs (rmdir
      /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286 and
      /rhev/data-center/mnt).  If they aren't empty, don't rm -rf them,
      because they might be mounted from your production servers.  Just
      mv -i them to /root or somewhere.<br>
      <br>
        Next, shut down the vdsm service with "service vdsm stop" (I
      think; it might be "service stop vdsm", I don't use CentOS much) and
      kill any running vdsm processes (ps ax | grep vdsm).  The error that
      I saw was:<br>
      <br>
      MainThread::ERROR::2016-05-13
      08:58:38,262::clientIF::128::vds::(__init__) failed to init
      clientIF, shutting down storage dispatcher<br>
      MainThread::ERROR::2016-05-13 08:58:38,289::vdsm::171::vds::(run)
      Exception raised<br>
      Traceback (most recent call last):<br>
        File "/usr/share/vdsm/vdsm", line 169, in run<br>
          serve_clients(log)<br>
        File "/usr/share/vdsm/vdsm", line 102, in serve_clients<br>
          cif = clientIF.getInstance(irs, log, scheduler)<br>
        File "/usr/share/vdsm/clientIF.py", line 193, in getInstance<br>
          cls._instance = clientIF(irs, log, scheduler)<br>
        File "/usr/share/vdsm/clientIF.py", line 123, in __init__<br>
          self._createAcceptor(host, port)<br>
        File "/usr/share/vdsm/clientIF.py", line 201, in _createAcceptor<br>
          port, sslctx)<br>
        File "/usr/share/vdsm/protocoldetector.py", line 170, in
      __init__<br>
          sock = _create_socket(host, port)<br>
        File "/usr/share/vdsm/protocoldetector.py", line 40, in
      _create_socket<br>
          server_socket.bind(addr[0][4])<br>
        File "/usr/lib64/python2.7/socket.py", line 224, in meth<br>
          return getattr(self._sock,name)(*args)<br>
      error: [Errno 98] Address already in use<br>
      <br>
      If you get the same error, do a netstat -lnp and compare it to the
      same from a working box to see if something else is running on the
      VDSM port.<br>
      <br>
      <br>
      On 2016-05-13 09:37 AM, Gervais de Montbrun wrote:<br>
    </div>
    <blockquote
cite="mid:CAESCRhO-o2my2MqO0-GXzW=y0eyK0YdUpGkJN8Mq_wnXxn_QBA@mail.gmail.com"
      type="cite">
      <div dir="ltr"><span
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">Hi
          Charles,</span>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br
            class="">
        </div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">I
          think the problem I am having is due to the setup failing and
          not something in vdsm configs as I have never gotten this
          server to start up properly and the BRIDGE ethernet interface
          + ovirt routes are not setup.</div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br
            class="">
        </div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">I
          put the logs here: <a moz-do-not-send="true"
href="https://www.dropbox.com/sh/5ugyykqh1lgru9l/AACXxRYWr3tgd0WbBVFW5twHa?dl=0"
            class="">https://www.dropbox.com/sh/5ugyykqh1lgru9l/AACXxRYWr3tgd0WbBVFW5twHa?dl=0</a></div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br
            class="">
        </div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">hosted-engine--deploy-logs.zip<span class="" style="white-space:pre">	</span>#
          Logs from when I tried to deploy and it failed</div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">vdsm.tar.gz<span class="" style="white-space:pre">					</span>#
          /var/log/vdsm</div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br
            class="">
        </div>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">Output
          from running vdsm from the command line:</div>
        <blockquote class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px;margin:0px
          0px 0px 40px;border:none;padding:0px">
          <div class="">[root@cultivar2 log]# su -s /bin/bash vdsm</div>
          <div class="">[vdsm@cultivar2 log]$ python
            /usr/share/vdsm/vdsm</div>
          <div class="">(PID: 6521) I am the actual vdsm 4.17.26-1.el7 <a
              moz-do-not-send="true"
              href="http://cultivar2.grove.silverorange.com/" class="">cultivar2.grove.silverorange.com</a> (3.10.0-327.el7.x86_64)</div>
          <div class="">VDSM will run with cpu affinity: frozenset([1])</div>
          <div class="">/usr/bin/taskset --all-tasks --pid --cpu-list 1
            6521 (cwd None)</div>
          <div class="">SUCCESS: <err> = ''; <rc> = 0</div>
          <div class="">Starting scheduler vdsm.Scheduler</div>
          <div class="">started</div>
          <div class="">Run and protect:
            registerDomainStateChangeCallback(callbackFunc=<functools.partial
            object at 0x381b158>)</div>
          <div class="">Run and protect:
            registerDomainStateChangeCallback, Return response: None</div>
          <div class="">Trying to connect to Super Vdsm</div>
          <div class="">Preparing MOM interface</div>
          <div class="">Using named unix socket
            /var/run/vdsm/mom-vdsm.sock</div>
          <div class="">Unregistering all secrests</div>
          <div class="">trying to connect libvirt</div>
          <div class="">recovery: started</div>
          <div class="">Setting channels' timeout to 30 seconds.</div>
          <div class="">Starting VM channels listener thread.</div>
          <div class="">Listening at <a moz-do-not-send="true"
              href="http://0.0.0.0:54321">0.0.0.0:54321</a></div>
          <div class="">Adding detector
            <rpc.bindingxmlrpc.XmlDetector instance at 0x3b4ecb0></div>
          <div class="">recovery: completed in 0s</div>
          <div class="">Adding detector
            <yajsonrpc.stompreactor.StompDetector instance at
            0x382e5a8></div>
          <div class="">Starting executor</div>
          <div class="">Starting worker jsonrpc.Executor/0</div>
          <div class="">Worker started</div>
          <div class="">Starting worker jsonrpc.Executor/1</div>
          <div class="">Worker started</div>
          <div class="">Starting worker jsonrpc.Executor/2</div>
          <div class="">Worker started</div>
          <div class="">Starting worker jsonrpc.Executor/3</div>
          <div class="">Worker started</div>
          <div class="">Starting worker jsonrpc.Executor/4</div>
          <div class="">Worker started</div>
          <div class="">Starting worker jsonrpc.Executor/5</div>
          <div class="">Worker started</div>
          <div class="">Starting worker jsonrpc.Executor/6</div>
          <div class="">Worker started</div>
          <div class="">Starting worker jsonrpc.Executor/7</div>
          <div class="">Worker started</div>
          <div class="">XMLRPC server running</div>
          <div class="">Starting executor</div>
          <div class="">Starting worker periodic/0</div>
          <div class="">Worker started</div>
          <div class="">Starting worker periodic/1</div>
          <div class="">Worker started</div>
          <div class="">Starting worker periodic/2</div>
          <div class="">Worker started</div>
          <div class="">Starting worker periodic/3</div>
          <div class="">Worker started</div>
          <div class="">trying to connect libvirt</div>
          <div class="">Panic: Connect to supervdsm service failed:
            [Errno 2] No such file or directory</div>
          <div class="">Traceback (most recent call last):</div>
          <div class="">  File "/usr/share/vdsm/supervdsm.py", line 78,
            in _connect</div>
          <div class="">    utils.retry(self._manager.connect,
            Exception, timeout=60, tries=3)</div>
          <div class="">  File
            "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 959,
            in retry</div>
          <div class="">    return func()</div>
          <div class="">  File
            "/usr/lib64/python2.7/multiprocessing/managers.py", line
            500, in connect</div>
          <div class="">    conn = Client(self._address,
            authkey=self._authkey)</div>
          <div class="">  File
            "/usr/lib64/python2.7/multiprocessing/connection.py", line
            173, in Client</div>
          <div class="">    c = SocketClient(address)</div>
          <div class="">  File
            "/usr/lib64/python2.7/multiprocessing/connection.py", line
            308, in SocketClient</div>
          <div class="">    s.connect(address)</div>
          <div class="">  File "/usr/lib64/python2.7/socket.py", line
            224, in meth</div>
          <div class="">    return getattr(self._sock,name)(*args)</div>
          <div class="">error: [Errno 2] No such file or directory</div>
          <div class="">Killed</div>
        </blockquote>
        <div class=""
          style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">
          <div class=""><br class="">
          </div>
          <div class="">Thanks for the help. It's really appreciated.</div>
          <div class="">
            <div id="signature" class=""><br class="">
              Cheers,<br class="">
              Gervais</div>
          </div>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Fri, May 13, 2016 at 12:55 AM,
          Charles Tassell <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:ctassell@gmail.com" target="_blank">ctassell@gmail.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>Hi Gervais,<br>
                <br>
                  Hmm, can you tar up the logfiles (/var/log/vdsm/* on
                the host you are installing on) and put them somewhere
                to look at?  Also, I found that starting VDSM from the
                command line is useful as it sometimes spits out error
                messages that don't show up in the logs.  I think the
                command I used was:<br>
                su -s /bin/bash vdsm<br>
                python /usr/share/vdsm/vdsm<br>
                <br>
                My problem was that I customized the logging settings in
                /etc/vdsm/*conf to try and tone down the debugging stuff
                and had a syntax error.
                <div>
                  <div class="h5"><br>
                    <br>
                    On 16-05-12 10:24 PM, Gervais de Montbrun wrote:<br>
                  </div>
                </div>
              </div>
              <div>
                <div class="h5">
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div dir="auto" style="word-wrap:break-word">Hi
                        Charles,<br>
                        <br>
                        Thanks for the suggestion.<br>
                        <br>
                        I cleaned up again using the bash script from
                        the recoving-from-failed-install link below,
                        then reinstalled (yum install
                        ovirt-hosted-engine-setup).<br>
                        <br>
                        I enabled NetworkManager and firewalld as you
                        suggested. The install stops very early on with
                        an error:<br>
                        <span style="white-space:pre-wrap">	</span>[
                        ERROR ] Failed to execute stage 'Programs
                        detection': hosted-engine cannot be deployed
                        while NetworkManager is running, please stop and
                        disable it before proceeding <br>
                        <br>
                        I disabled and stopped NetworkManager and tried
                        again. Same result. :(<br>
                        <br>
                        Any more guesses?<br>
                        <br>
                        Cheers,<br>
                        Gervais<br>
                        <br>
                        <br>
                        <br>
                        <blockquote type="cite">On May 12, 2016, at 9:08
                          PM, Charles Tassell <<a
                            moz-do-not-send="true"
                            href="mailto:ctassell@gmail.com"
                            target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:ctassell@gmail.com">ctassell@gmail.com</a></a>>
                          wrote:<br>
                          <br>
                          Hey Gervais,<br>
                          <br>
                          Try enabling NetworkManager and firewalld
                          before doing the hosted-engine --deploy.  I
                          have run into problems with oVirt trying to
                          perform tasks on hosts where firewalld is
                          disabled, so maybe you are running into a
                          similar problem.  Also, I think the setup
                          script will disable NetworkManager if it needs
                          to.  I know I didn't manually disable it on
                          any of the boxes I installed on.<br>
                          <br>
                          On 16-05-12 04:49 PM, <a
                            moz-do-not-send="true"
                            href="mailto:users-request@ovirt.org"
                            target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:users-request@ovirt.org">users-request@ovirt.org</a></a>
                          wrote:<br>
                          <blockquote type="cite">Message: 1<br>
                            Date: Thu, 12 May 2016 14:22:12 -0300<br>
                            From: Gervais de Montbrun <<a
                              moz-do-not-send="true"
                              href="mailto:gervais@demontbrun.com"
                              target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:gervais@demontbrun.com">gervais@demontbrun.com</a></a>><br>
                            To: Wee Sritippho <<a
                              moz-do-not-send="true"
                              href="mailto:wee.s@forest.go.th"
                              target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:wee.s@forest.go.th">wee.s@forest.go.th</a></a>><br>
                            Cc: users <<a moz-do-not-send="true"
                              href="mailto:users@ovirt.org"
                              target="_blank">users@ovirt.org</a>><br>
                            Subject: Re: [ovirt-users] Adding another
                            host to my cluster<br>
                            Message-ID: <<a moz-do-not-send="true"
                              href="mailto:28B7FC74-5C52-4F60-B9F3-39A36621A7CA@demontbrun.com"
                              target="_blank">28B7FC74-5C52-4F60-B9F3-39A36621A7CA@demontbrun.com</a>><br>
                            Content-Type: text/plain; charset="utf-8"<br>
                            <br>
                            Hi Wee<br>
                            (and others)<br>
                            <br>
                            Thanks for the reply. I tried what you
                            suggested, but I am in the exact same state.
                            :-(<br>
                            <br>
                            I don't want to completely remove my hosted
                            engine setup as it is working on the two
                            other hosts in my cluster. I did not run the
                            rm -rf steps listed here (<a
                              moz-do-not-send="true"
href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..."
                              target="_blank"><a class="moz-txt-link-freetext" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a></a>
                            <<a moz-do-not-send="true"
href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..."
                              target="_blank">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a>>)

                            that would wipe my hosted_engine nfs mount.
                            If you know that this is 100% necessary,
                            please let me know.<br>
                            <br>
                            I did:<br>
                            hosted-engine --clean-metadata
                            --force-cleanup --host-id=3<br>
                            run the bash script to remove all of the
                            ovirt packages and config files<br>
                            reinstalled ovirt-hosted-engine-setup<br>
                            ran "hosted-engine --deploy"<br>
                            <br>
                            I'm back exactly where I started. Is there a
                            way to run just the network configuration
                            part of the deploy?<br>
                            <br>
                            Since the last attempt, I did upgrade my
                            hosted engine and my cluster is now running
                            oVirt 3.6.5.<br>
                            <br>
                            Cheers,<br>
                            Gervais<br>
                            <br>
                            <br>
                            <br>
                            <blockquote type="cite">On May 12, 2016, at
                              11:50 AM, Wee Sritippho <<a
                                moz-do-not-send="true"
                                href="mailto:wee.s@forest.go.th"
                                target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:wee.s@forest.go.th">wee.s@forest.go.th</a></a>>

                              wrote:<br>
                              <br>
                              Hi,<br>
                              <br>
                              I used to have a similar problem where one
                              of my hosts couldn't be deployed due to the
                              absence of the ovirtmgmt bridge. Simone said
                              it's a bug ( <a moz-do-not-send="true"
                                href="https://bugzilla.redhat.com/1323465"
                                target="_blank">https://bugzilla.redhat.com/1323465</a>
                              <<a moz-do-not-send="true"
                                href="https://bugzilla.redhat.com/1323465"
                                target="_blank">https://bugzilla.redhat.com/1323465</a>>

                              ) which would be fixed in 3.6.6.<br>
                              <br>
                              This is what I've done to solve it:<br>
                              <br>
                              1. In the web UI, set the failed host to
                              maintenance.<br>
                              2. Remove it.<br>
                              3. In that host, run a script from <a
                                moz-do-not-send="true"
href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..."
                                target="_blank"><a class="moz-txt-link-freetext" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a></a>
                              <<a moz-do-not-send="true"
href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..."
                                target="_blank">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a>><br>
                              4. Install ovirt-hosted-engine-setup
                              again.<br>
                              5. Redeploy again.<br>
                              <br>
                              Hope that helps<br>
                              <br>
                              On 11 May 2016 at 22:48:58
                              GMT+07:00, Gervais de Montbrun <<a
                                moz-do-not-send="true"
                                href="mailto:gervais@demontbrun.com"
                                target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:gervais@demontbrun.com">gervais@demontbrun.com</a></a>>

                              wrote:<br>
                              Hi Folks,<br>
                              <br>
                              I hate to reply to my own message, but I'm
                              really hoping someone can help me with my
                              issue<br>
                              <a moz-do-not-send="true"
                                href="http://lists.ovirt.org/pipermail/users/2016-May/039690.html"
                                target="_blank">http://lists.ovirt.org/pipermail/users/2016-May/039690.html</a>
                              <<a moz-do-not-send="true"
                                href="http://lists.ovirt.org/pipermail/users/2016-May/039690.html"
                                target="_blank">http://lists.ovirt.org/pipermail/users/2016-May/039690.html</a>><br>
                              <br>
                              Does anyone have a suggestion for me? If
                              there is any more information that I can
                              provide that would help you to help me,
                              please advise.<br>
                              <br>
                              Cheers,<br>
                              Gervais<br>
                              <br>
                              <br>
                              <br>
                              <blockquote type="cite">On May 9, 2016, at
                                1:42 PM, Gervais de Montbrun <<a
                                  moz-do-not-send="true"
                                  href="mailto:gervais@demontbrun.com"
                                  target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:gervais@demontbrun.com">gervais@demontbrun.com</a></a>
                                <mailto:<a moz-do-not-send="true"
                                  href="mailto:gervais@demontbrun.com"
                                  target="_blank">gervais@demontbrun.com</a>>>

                                wrote:<br>
                                <br>
                                Hi All,<br>
                                <br>
                                I'm trying to add a third host to my
                                oVirt cluster. I have the hosted engine
                                set up on the first two. The
                                hosted-engine --deploy fails to finish on
                                this third host. I wiped the server, did
                                a CentOS 7 minimal install, and ran it
                                again to have a clean machine.<br>
                                <br>
                                My setup:<br>
                                CentOS 7 clean install<br>
                                yum install -y <a
                                  moz-do-not-send="true"
                                  href="http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm"
                                  target="_blank"><a class="moz-txt-link-freetext" href="http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm">http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm</a></a>
                                <<a moz-do-not-send="true"
                                  href="http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm"
                                  target="_blank">http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm</a>><br>
                                yum install -y ovirt-hosted-engine-setup<br>
                                yum upgrade -y && reboot<br>
                                systemctl disable NetworkManager ;
                                systemctl stop NetworkManager ;
                                systemctl disable firewalld ; systemctl
                                stop firewalld<br>
                                hosted-engine --deploy<br>
                                <br>
                                hosted-engine --deploy always throws an
                                error:<br>
                                [ ERROR ] The VDSM host was found in a
                                failed state. Please check engine and
                                bootstrap installation logs.<br>
                                [ ERROR ] Unable to add Cultivar2 to the
                                manager<br>
                                and then echoes<br>
                                [ INFO  ] Waiting for VDSM hardware info<br>
                                ...<br>
                                [ ERROR ] Failed to execute stage
                                'Closing up': VDSM did not start within
                                120 seconds<br>
                                [ INFO  ] Stage: Clean up<br>
                                [ INFO  ] Generating answer file
                                '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160509131103.conf'<br>
                                [ INFO  ] Stage: Pre-termination<br>
                                [ INFO  ] Stage: Termination<br>
                                [ ERROR ] Hosted Engine deployment
                                failed: this system is not reliable,
                                please check the issue, fix and redeploy<br>
                                    Log file is located at
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160509130658-qb8ev0.log<br>
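                                <br>
                                For anyone who wants to pull just the
                                failures out of a saved copy of that
                                output, here is a small Python sketch. The
                                function name error_messages is
                                hypothetical, and it assumes every message
                                sits on a single line that starts with the
                                "[ ERROR ]" tag exactly as printed above.<br>
<pre>
```python
import re

# Matches lines tagged "[ ERROR ]" the way the deploy output prints them.
ERROR_LINE = re.compile(r"^\[ ERROR \]\s*(.*)$")


def error_messages(output):
    """Return the text of every "[ ERROR ]" line in the output string."""
    messages = []
    for line in output.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            messages.append(match.group(1))
    return messages
```
</pre>
                                Feeding it the text of the deploy run
                                gives a short list of the error messages,
                                without the INFO lines in between.<br>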
                                <br>
                                The full output of hosted-engine --deploy
                                is included in the attached zip file.<br>
                                I've also included vdsm.log (it covers
                                more than one attempt).<br>
                                You'll also find the
                                ovirt-hosted-engine-setup-20160509130658-qb8ev0.log
                                listed above.<br>
                                <br>
                                This is my "test" setup. Cultivar0 is my
                                first host and my nfs server for
                                storage. I have two hosts in the setup
                                already and everything is working fine.
                                The host does show up in the oVirt
                                admin, but shows "Installed Failed"<br>
                                <PastedGraphic-1.png><br>
                                <br>
                                Trying to reinstall from within the
                                interface just fails again.<br>
                                <br>
                                The ovirt bridge interface is not
                                configured and there are no config files
                                in /etc/sysconfig/network-scripts related
                                to ovirt.<br>
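                                <br>
                                A rough way to script that check is
                                sketched below in Python. It assumes the
                                management bridge is called ovirtmgmt and
                                that the network configs follow the usual
                                ifcfg-* naming; the function name
                                bridge_status and its root parameter are
                                hypothetical, not anything oVirt ships.<br>
<pre>
```python
import glob
import os


def bridge_status(bridge="ovirtmgmt", root="/"):
    """Return (device_present, config_files) for the given bridge name.

    root only exists so the check can be pointed at a copy of the
    filesystem; on a live host leave it as "/".
    """
    device = os.path.join(root, "sys", "class", "net", bridge)
    pattern = os.path.join(
        root, "etc", "sysconfig", "network-scripts", "ifcfg-*"
    )
    configs = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue
        # Keep any ifcfg file that mentions the bridge by name.
        with open(path, errors="replace") as handle:
            if bridge in handle.read():
                configs.append(path)
    return os.path.exists(device), configs
```
</pre>
                                On this host I would expect it to report
                                no device and an empty list of config
                                files.<br>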
                                <br>
                                OS:<br>
                                [root@cultivar2
                                ovirt-hosted-engine-setup]# cat
                                /etc/redhat-release<br>
                                CentOS Linux release 7.2.1511 (Core)<br>
                                <br>
                                [root@cultivar2
                                ovirt-hosted-engine-setup]# uname -a<br>
                                Linux <a moz-do-not-send="true"
                                  href="http://cultivar2.grove.silverorange.com"
                                  target="_blank">cultivar2.grove.silverorange.com</a>
                                <<a moz-do-not-send="true"
                                  href="http://cultivar2.grove.silverorange.com/"
                                  target="_blank">http://cultivar2.grove.silverorange.com/</a>>

                                3.10.0-327.13.1.el7.x86_64 #1 SMP Thu
                                Mar 31 16:04:38 UTC 2016 x86_64 x86_64
                                x86_64 GNU/Linux<br>
                                <br>
                                Versions:<br>
                                [root@cultivar2
                                ovirt-hosted-engine-setup]# rpm -qa |
                                grep -i ovirt<br>
                                libgovirt-0.3.3-1.el7_2.1.x86_64<br>
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch<br>
ovirt-host-deploy-1.4.1-1.el7.centos.noarch<br>
ovirt-vmconsole-1.0.0-1.el7.centos.noarch<br>
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch<br>
                                ovirt-release36-007-1.noarch<br>
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch<br>
ovirt-setup-lib-1.0.1-1.el7.centos.noarch<br>
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch<br>
                                [root@cultivar2
                                ovirt-hosted-engine-setup]# rpm -qa |
                                grep -i virt<br>
libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64<br>
                                virt-viewer-2.0-6.el7.x86_64<br>
                                libgovirt-0.3.3-1.el7_2.1.x86_64<br>
libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64<br>
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch<br>
                                fence-virt-0.3.2-2.el7.x86_64<br>
                                virt-what-1.13-6.el7.x86_64<br>
                                libvirt-python-1.2.17-2.el7.x86_64<br>
                                libvirt-daemon-1.2.17-13.el7_2.4.x86_64<br>
libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.4.x86_64<br>
libvirt-lock-sanlock-1.2.17-13.el7_2.4.x86_64<br>
libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64<br>
libvirt-daemon-driver-network-1.2.17-13.el7_2.4.x86_64<br>
libvirt-daemon-driver-storage-1.2.17-13.el7_2.4.x86_64<br>
ovirt-host-deploy-1.4.1-1.el7.centos.noarch<br>
                                virt-v2v-1.28.1-1.55.el7.centos.2.x86_64<br>
ovirt-vmconsole-1.0.0-1.el7.centos.noarch<br>
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch<br>
                                libvirt-client-1.2.17-13.el7_2.4.x86_64<br>
libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.4.x86_64<br>
                                ovirt-release36-007-1.noarch<br>
libvirt-daemon-driver-interface-1.2.17-13.el7_2.4.x86_64<br>
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64<br>
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch<br>
ovirt-setup-lib-1.0.1-1.el7.centos.noarch<br>
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch<br>
                                <br>
                                I also have a series of stuck tasks that
                                I can't clear related to the host that
                                can't be added... This is a secondary
                                issue and I don't want to get off track,
                                but they look like this:<br>
                                <PastedGraphic-2.png><br>
                                <br>
                                I'd appreciate any help that can be
                                offered.<br>
                                <br>
                                Cheers,<br>
                                Gervais<br>
                                <br>
                                <br>
                                Gervais de Montbrun<br>
                                Systems Administrator  / silverorange
                                Inc.<br>
                                <br>
                                Phone                                   <span style="white-space:pre-wrap">	</span><a
                                  moz-do-not-send="true"
                                  href="tel:%2B1%20902%20367%204532%20ext.%20104"
                                  value="+19023674532" target="_blank">+1
                                  902 367 4532 ext. 104</a> <tel:<a
                                  moz-do-not-send="true"
                                  href="tel:%2B1%20902%20367%204532%20ext.%20104"
                                  value="+19023674532" target="_blank">+1
                                  902 367 4532 ext. 104</a>><br>
                                Mobile
                                                                  <span style="white-space:pre-wrap">	</span><a
                                  moz-do-not-send="true"
                                  href="tel:%2B1%20902%20978%200009"
                                  value="+19029780009" target="_blank">+1
                                  902 978 0009</a> <tel:<a
                                  moz-do-not-send="true"
                                  href="tel:%2B1%20902%20978%200009"
                                  value="+19029780009" target="_blank">+1
                                  902 978 0009</a>><br>
                                <br>
                                <hosted-engine--deploy-logs.zip><br>
                              </blockquote>
                              <br>
                              <br>
                              Users mailing list<br>
                              <a moz-do-not-send="true"
                                href="mailto:Users@ovirt.org"
                                target="_blank">Users@ovirt.org</a><br>
                              <a moz-do-not-send="true"
                                href="http://lists.ovirt.org/mailman/listinfo/users"
                                target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a>
                              <<a moz-do-not-send="true"
                                href="http://lists.ovirt.org/mailman/listinfo/users"
                                target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a>><br>
                              <br>
                              -- <br>
                              Wee<br>
                            </blockquote>
                            <br>
                          </blockquote>
                          <br>
                        </blockquote>
                        <br>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>

--------------030000020408000408030009--