
Hi Gervais,
Okay, I see two problems: there are some leftover directories causing issues, and for some reason VDSM seems to be trying to bind to a port something is already running on (probably an older version of VDSM).
Try removing the duplicate dirs (rmdir /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286 and /rhev/data-center/mnt). If they aren't empty, don't rm -rf them, because they might be mounted from your production servers. Just mv -i them to /root or somewhere.
Next, shut down the vdsm service with "service vdsm stop" (I think, might be service stop vdsm, I don't use CentOS much) and kill any running vdsm processes (ps ax |grep vdsm).
The error that I saw was:
    MainThread::ERROR::2016-05-13 08:58:38,262::clientIF::128::vds::(__init__) failed to init clientIF, shutting down storage dispatcher
    MainThread::ERROR::2016-05-13 08:58:38,289::vdsm::171::vds::(run) Exception raised
    Traceback (most recent call last):
      File "/usr/share/vdsm/vdsm", line 169, in run
        serve_clients(log)
      File "/usr/share/vdsm/vdsm", line 102, in serve_clients
        cif = clientIF.getInstance(irs, log, scheduler)
      File "/usr/share/vdsm/clientIF.py", line 193, in getInstance
        cls._instance = clientIF(irs, log, scheduler)
      File "/usr/share/vdsm/clientIF.py", line 123, in __init__
        self._createAcceptor(host, port)
      File "/usr/share/vdsm/clientIF.py", line 201, in _createAcceptor
        port, sslctx)
      File "/usr/share/vdsm/protocoldetector.py", line 170, in __init__
        sock = _create_socket(host, port)
      File "/usr/share/vdsm/protocoldetector.py", line 40, in _create_socket
        server_socket.bind(addr[0][4])
      File "/usr/lib64/python2.7/socket.py", line 224, in meth
        return getattr(self._sock,name)(*args)
    error: [Errno 98] Address already in use
If you get the same error, do a netstat -lnp and compare it to the same from a working box to see if something else is running on the VDSM port.
On 2016-05-13 09:37 AM, Gervais de Montbrun wrote:
Hi Charles,
I think the problem I am having is due to the setup failing and not something in the vdsm configs, as I have never gotten this server to start up properly, and the bridge Ethernet interface and oVirt routes are not set up.
I put the logs here: https://www.dropbox.com/sh/5ugyykqh1lgru9l/AACXxRYWr3tgd0WbBVFW5twHa?dl=0
    hosted-engine--deploy-logs.zip   # Logs from when I tried to deploy and it failed
    vdsm.tar.gz                      # /var/log/vdsm
Output from running vdsm from the command line:
    [root@cultivar2 log]# su -s /bin/bash vdsm
    [vdsm@cultivar2 log]$ python /usr/share/vdsm/vdsm
    (PID: 6521) I am the actual vdsm 4.17.26-1.el7 cultivar2.grove.silverorange.com (3.10.0-327.el7.x86_64)
    VDSM will run with cpu affinity: frozenset([1])
    /usr/bin/taskset --all-tasks --pid --cpu-list 1 6521 (cwd None)
    SUCCESS: <err> = ''; <rc> = 0
    Starting scheduler vdsm.Scheduler
    started
    Run and protect: registerDomainStateChangeCallback(callbackFunc=<functools.partial object at 0x381b158>)
    Run and protect: registerDomainStateChangeCallback, Return response: None
    Trying to connect to Super Vdsm
    Preparing MOM interface
    Using named unix socket /var/run/vdsm/mom-vdsm.sock
    Unregistering all secrests
    trying to connect libvirt
    recovery: started
    Setting channels' timeout to 30 seconds.
    Starting VM channels listener thread.
    Listening at 0.0.0.0:54321
    Adding detector <rpc.bindingxmlrpc.XmlDetector instance at 0x3b4ecb0>
    recovery: completed in 0s
    Adding detector <yajsonrpc.stompreactor.StompDetector instance at 0x382e5a8>
    Starting executor
    Starting worker jsonrpc.Executor/0
    Worker started
    Starting worker jsonrpc.Executor/1
    Worker started
    Starting worker jsonrpc.Executor/2
    Worker started
    Starting worker jsonrpc.Executor/3
    Worker started
    Starting worker jsonrpc.Executor/4
    Worker started
    Starting worker jsonrpc.Executor/5
    Worker started
    Starting worker jsonrpc.Executor/6
    Worker started
    Starting worker jsonrpc.Executor/7
    Worker started
    XMLRPC server running
    Starting executor
    Starting worker periodic/0
    Worker started
    Starting worker periodic/1
    Worker started
    Starting worker periodic/2
    Worker started
    Starting worker periodic/3
    Worker started
    trying to connect libvirt
    Panic: Connect to supervdsm service failed: [Errno 2] No such file or directory
    Traceback (most recent call last):
      File "/usr/share/vdsm/supervdsm.py", line 78, in _connect
        utils.retry(self._manager.connect, Exception, timeout=60, tries=3)
      File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 959, in retry
        return func()
      File "/usr/lib64/python2.7/multiprocessing/managers.py", line 500, in connect
        conn = Client(self._address, authkey=self._authkey)
      File "/usr/lib64/python2.7/multiprocessing/connection.py", line 173, in Client
        c = SocketClient(address)
      File "/usr/lib64/python2.7/multiprocessing/connection.py", line 308, in SocketClient
        s.connect(address)
      File "/usr/lib64/python2.7/socket.py", line 224, in meth
        return getattr(self._sock,name)(*args)
    error: [Errno 2] No such file or directory
    Killed
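The output above gets as far as "Listening at 0.0.0.0:54321" and then dies connecting to the supervdsm socket with Errno 2. A minimal Python 3 sketch for probing both kinds of endpoint by hand might look like the following. It is not part of VDSM; the function names are hypothetical, and the supervdsm socket path mentioned in the usage note is an assumption, since the thread only names /var/run/vdsm/mom-vdsm.sock.

```python
import errno
import os
import socket


def tcp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding host:port fails with EADDRINUSE (Errno 98 on Linux)."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # any other failure (e.g. permissions) is not what we are testing
    finally:
        probe.close()
    return False


def unix_socket_state(path):
    """Classify a Unix socket path as 'missing', 'refused' or 'ok'."""
    if not os.path.exists(path):
        return "missing"  # connect() would fail with Errno 2, as in the panic above
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except OSError:
        return "refused"  # the file exists but nothing is accepting on it
    finally:
        probe.close()
    return "ok"
```

For example, `tcp_port_in_use(54321)` checks the port shown in the output, and `unix_socket_state("/var/run/vdsm/svdsm.sock")` checks what is assumed here to be the supervdsm socket path. A "missing" result would suggest the supervdsm service never started, rather than a problem in vdsm itself.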
Thanks for the help. It's really appreciated.
Cheers, Gervais
On Fri, May 13, 2016 at 12:55 AM, Charles Tassell <ctassell@gmail.com> wrote:
Hi Gervais,
Hmm, can you tar up the logfiles (/var/log/vdsm/* on the host you are installing on) and put them somewhere to look at? Also, I found that starting VDSM from the command line is useful as it sometimes spits out error messages that don't show up in the logs. I think the command I used was:
    su -s /bin/bash vdsm
    python /usr/share/vdsm/vdsm
My problem was that I customized the logging settings in /etc/vdsm/*conf to try and tone down the debugging stuff and had a syntax error.
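A syntax error of the kind described above can be caught before starting VDSM by parsing the config files. The following is a minimal sketch in Python 3 (the host in this thread runs Python 2.7, where the module is named ConfigParser), assuming the files under /etc/vdsm are INI-style. The function name is hypothetical, and the check only finds parse errors, not invalid logger or handler settings.

```python
import configparser
import glob


def find_broken_configs(pattern):
    """Return {path: error message} for INI-style files that fail to parse."""
    broken = {}
    for path in sorted(glob.glob(pattern)):
        # interpolation=None: logging format strings contain '%(name)s' tokens
        parser = configparser.ConfigParser(interpolation=None)
        try:
            with open(path) as handle:
                parser.read_file(handle)
        except configparser.Error as exc:
            broken[path] = str(exc)
    return broken
```

Calling `find_broken_configs("/etc/vdsm/*conf")` uses the same glob as the message above; an empty result means every matching file parsed cleanly.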
On 16-05-12 10:24 PM, Gervais de Montbrun wrote:
Hi Charles,
Thanks for the suggestion.
I cleaned up again using the bash script from the recoving-from-failed-install link below, then reinstalled (yum install ovirt-hosted-engine-setup).
I enabled NetworkManager and firewalld as you suggested. The install stops very early on with an error:
    [ ERROR ] Failed to execute stage 'Programs detection': hosted-engine cannot be deployed while NetworkManager is running, please stop and disable it before proceeding
I disabled and stopped NetworkManager and tried again. Same result. :(
Any more guesses?
Cheers, Gervais
On May 12, 2016, at 9:08 PM, Charles Tassell <ctassell@gmail.com> wrote:
Hey Gervais,
Try enabling NetworkManager and firewalld before doing the hosted-engine --deploy. I have run into problems with oVirt trying to perform tasks on hosts where firewalld is disabled, so maybe you are running into a similar problem. Also, I think the setup script will disable NetworkManager if it needs to. I know I didn't manually disable it on any of the boxes I installed on.
On 16-05-12 04:49 PM, users-request@ovirt.org wrote:
Message: 1
Date: Thu, 12 May 2016 14:22:12 -0300
From: Gervais de Montbrun <gervais@demontbrun.com>
To: Wee Sritippho <wee.s@forest.go.th>
Cc: users <users@ovirt.org>
Subject: Re: [ovirt-users] Adding another host to my cluster
Message-ID: <28B7FC74-5C52-4F60-B9F3-39A36621A7CA@demontbrun.com>
Content-Type: text/plain; charset="utf-8"
Hi Wee (and others)
Thanks for the reply. I tried what you suggested, but I am in the exact same state. :-(
I don't want to completely remove my hosted engine setup as it is working on the two other hosts in my cluster. I did not run the rm -rf steps listed here (https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install) that would wipe my hosted_engine nfs mount. If you know that this is 100% necessary, please let me know.
I did:
    hosted-engine --clean-metadata --force-cleanup --host-id=3
    ran the bash script to remove all of the ovirt packages and config files
    reinstalled ovirt-hosted-engine-setup
    ran "hosted-engine --deploy"
I'm back exactly where I started. Is there a way to run just the network configuration part of the deploy?
Since the last attempt, I did upgrade my hosted engine and my cluster is now running oVirt 3.6.5.
Cheers, Gervais
On May 12, 2016, at 11:50 AM, Wee Sritippho <wee.s@forest.go.th> wrote:
Hi,
I used to have a similar problem where one of my hosts couldn't be deployed due to the absence of the ovirtmgmt bridge. Simone said it's a bug ( https://bugzilla.redhat.com/1323465 ) which would be fixed in 3.6.6.
This is what I've done to solve it:
1. In the web UI, set the failed host to maintenance.
2. Remove it.
3. On that host, run a script from https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install
4. Install ovirt-hosted-engine-setup again.
5. Redeploy.
Hope that helps
On 11 May 2016 22:48:58 GMT+07:00, Gervais de Montbrun <gervais@demontbrun.com> wrote:
Hi Folks,
I hate to reply to my own message, but I'm really hoping someone can help me with my issue http://lists.ovirt.org/pipermail/users/2016-May/039690.html
Does anyone have a suggestion for me? If there is any more information that I can provide that would help you to help me, please advise.
Cheers, Gervais
On May 9, 2016, at 1:42 PM, Gervais de Montbrun <gervais@demontbrun.com> wrote:
Hi All,
I'm trying to add a third host into my oVirt cluster. I have hosted engine set up on the first two. It's failing to finish the hosted-engine --deploy on this third host. I wiped the server and did a CentOS 7 minimal install and ran it again to have a clean machine.
My setup:
    CentOS 7 clean install
    yum install -y http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
    yum install -y ovirt-hosted-engine-setup
    yum upgrade -y && reboot
    systemctl disable NetworkManager ; systemctl stop NetworkManager ; systemctl disable firewalld ; systemctl stop firewalld
    hosted-engine --deploy
hosted-engine --deploy always throws an error:
    [ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
    [ ERROR ] Unable to add Cultivar2 to the manager
and then echoes
    [ INFO ] Waiting for VDSM hardware info
    ...
    [ ERROR ] Failed to execute stage 'Closing up': VDSM did not start within 120 seconds
    [ INFO ] Stage: Clean up
    [ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160509131103.conf'
    [ INFO ] Stage: Pre-termination
    [ INFO ] Stage: Termination
    [ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue, fix and redeploy
    Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160509130658-qb8ev0.log
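When the deploy output runs to many screens, it can help to pull out only the error lines. The following is a minimal Python sketch, assuming each message sits on its own line with a "[ LEVEL ]" prefix as in the excerpt above. The function name is hypothetical and this is not part of the hosted-engine tooling.

```python
import re

# Matches lines such as "[ ERROR ] Unable to add Cultivar2 to the manager"
LEVEL_LINE = re.compile(r"^\[\s*(?P<level>[A-Z]+)\s*\]\s*(?P<message>.*)$")


def error_lines(output):
    """Return the message text of every '[ ERROR ]' line in the given output."""
    errors = []
    for line in output.splitlines():
        match = LEVEL_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            errors.append(match.group("message"))
    return errors
```

The saved console output, read into a string, can be passed straight to `error_lines`. Lines without a level prefix, such as the "Log file is located at" line, are skipped.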
Full output of hosted-engine --deploy is included in the attached zip file. I've also included vdsm.log (there is more than one attempt's worth of output in there). You'll also find the ovirt-hosted-engine-setup-20160509130658-qb8ev0.log listed above.
This is my "test" setup. Cultivar0 is my first host and my nfs server for storage. I have two hosts in the setup already and everything is working fine. The host does show up in the oVirt admin, but shows "Installed Failed" <PastedGraphic-1.png>
Trying to reinstall from within the interface just fails again.
The ovirt bridge interface is not configured and there are no config files in /etc/sysconfig/network-scripts related to ovirt.
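The check described above can be scripted so it is easy to repeat after each deploy attempt. The following is a minimal Python sketch, assuming the bridge is named ovirtmgmt (the name used elsewhere in this thread) and that its config file would be called ifcfg-ovirtmgmt. The file name and the function name are assumptions, not something the thread confirms.

```python
import os


def bridge_state(name="ovirtmgmt", sysfs_net="/sys/class/net",
                 scripts_dir="/etc/sysconfig/network-scripts"):
    """Report whether the interface exists, is a bridge, and has an ifcfg file."""
    iface_dir = os.path.join(sysfs_net, name)
    return {
        # /sys/class/net/<name> exists for every interface the kernel knows about
        "interface_present": os.path.isdir(iface_dir),
        # bridge devices expose a 'bridge' subdirectory in sysfs
        "is_bridge": os.path.isdir(os.path.join(iface_dir, "bridge")),
        # assumed file name: ifcfg-<name>
        "ifcfg_present": os.path.isfile(os.path.join(scripts_dir, "ifcfg-" + name)),
    }
```

On a host in the state described here, `bridge_state()` would be expected to report all three values as False.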
OS:
    [root@cultivar2 ovirt-hosted-engine-setup]# cat /etc/redhat-release
    CentOS Linux release 7.2.1511 (Core)
    [root@cultivar2 ovirt-hosted-engine-setup]# uname -a
    Linux cultivar2.grove.silverorange.com 3.10.0-327.13.1.el7.x86_64 #1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Versions:
    [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i ovirt
    libgovirt-0.3.3-1.el7_2.1.x86_64
    ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
    ovirt-host-deploy-1.4.1-1.el7.centos.noarch
    ovirt-vmconsole-1.0.0-1.el7.centos.noarch
    ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
    ovirt-release36-007-1.noarch
    ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
    ovirt-setup-lib-1.0.1-1.el7.centos.noarch
    ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
    [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i virt
    libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64
    virt-viewer-2.0-6.el7.x86_64
    libgovirt-0.3.3-1.el7_2.1.x86_64
    libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64
    ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
    fence-virt-0.3.2-2.el7.x86_64
    virt-what-1.13-6.el7.x86_64
    libvirt-python-1.2.17-2.el7.x86_64
    libvirt-daemon-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.4.x86_64
    libvirt-lock-sanlock-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-network-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-storage-1.2.17-13.el7_2.4.x86_64
    ovirt-host-deploy-1.4.1-1.el7.centos.noarch
    virt-v2v-1.28.1-1.55.el7.centos.2.x86_64
    ovirt-vmconsole-1.0.0-1.el7.centos.noarch
    ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
    libvirt-client-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.4.x86_64
    ovirt-release36-007-1.noarch
    libvirt-daemon-driver-interface-1.2.17-13.el7_2.4.x86_64
    libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
    ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
    ovirt-setup-lib-1.0.1-1.el7.centos.noarch
    ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
I also have a series of stuck tasks that I can't clear related to the host that can't be added... This is a secondary issue and I don't want to get off track, but they look like this: <PastedGraphic-2.png>
I'd appreciate any help that can be offered.
Cheers, Gervais
Gervais de Montbrun
Systems Administrator / silverorange Inc.
Phone +1 902 367 4532 ext. 104
Mobile +1 902 978 0009
<hosted-engine--deploy-logs.zip>
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
-- Wee
--------------030000020408000408030009 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit <html> <head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"> </head> <body bgcolor="#FFFFFF" text="#000000"> <div class="moz-cite-prefix">Hi Gervais,<br> <br> Okay, I see two problems: there are some leftover direcyories causing issues and for some reason VDSM seems to be trying to bind to a port something is already running on (probably an older version of VDSM.) Try removing the duplicate dirs (rmdir /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286 and /rhev/data-center/mnt - if they aren't empty don't rm -rf them because they might be mounted from your production servers. Just mv -i them to /root or somewhere.)<br> <br> Next shutdown the vdsm service with "service vdsm stop" (I think, might be service stop vdsm, I don't use CentOS much) and kill any running vdsm processes (ps ax |grep vdsm) The error that I saw was:<br> <br> MainThread::ERROR::2016-05-13 08:58:38,262::clientIF::128::vds::(__init__) failed to init clientIF, shutting down storage dispatcher<br> MainThread::ERROR::2016-05-13 08:58:38,289::vdsm::171::vds::(run) Exception raised<br> Traceback (most recent call last):<br> File "/usr/share/vdsm/vdsm", line 169, in run<br> serve_clients(log)<br> File "/usr/share/vdsm/vdsm", line 102, in serve_clients<br> cif = clientIF.getInstance(irs, log, scheduler)<br> File "/usr/share/vdsm/clientIF.py", line 193, in getInstance<br> cls._instance = clientIF(irs, log, scheduler)<br> File "/usr/share/vdsm/clientIF.py", line 123, in __init__<br> self._createAcceptor(host, port)<br> File "/usr/share/vdsm/clientIF.py", line 201, in _createAcceptor<br> port, sslctx)<br> File "/usr/share/vdsm/protocoldetector.py", line 170, in __init__<br> sock = _create_socket(host, port)<br> File "/usr/share/vdsm/protocoldetector.py", line 40, in _create_socket<br> server_socket.bind(addr[0][4])<br> File "/usr/lib64/python2.7/socket.py", line 224, in 
meth<br> return getattr(self._sock,name)(*args)<br> error: [Errno 98] Address already in use<br> <br> If you get the same error, do a netstat -lnp and compare it to the same from a working box to see if something else is running on the VDSM port.<br> <br> <br> On 2016-05-13 09:37 AM, Gervais de Montbrun wrote:<br> </div> <blockquote cite="mid:CAESCRhO-o2my2MqO0-GXzW=y0eyK0YdUpGkJN8Mq_wnXxn_QBA@mail.gmail.com" type="cite"> <div dir="ltr"><span style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">Hi Charles,</span> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br class=""> </div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">I think the problem I am having is due to the setup failing and not something in vdsm configs as I have never gotten this server to start up properly and the BRIDGE ethernet interface + ovirt routes are not setup.</div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br class=""> </div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">I put the logs here: <a moz-do-not-send="true" href="https://www.dropbox.com/sh/5ugyykqh1lgru9l/AACXxRYWr3tgd0WbBVFW5twHa?dl=0" class="">https://www.dropbox.com/sh/5ugyykqh1lgru9l/AACXxRYWr3tgd0WbBVFW5twHa?dl=0</a></div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br class=""> </div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">hosted-engine--deploy-logs.zip<span class="" style="white-space:pre"> </span># Logs from when I tried to deploy and it failed</div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">vdsm.tar.gz<span class="" style="white-space:pre"> </span># /var/log/vdsm</div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"><br class=""> </div> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px">Output from running vdsm from the command 
line:</div> <blockquote class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px;margin:0px 0px 0px 40px;border:none;padding:0px"> <div class="">[root@cultivar2 log]# su -s /bin/bash vdsm</div> <div class="">[vdsm@cultivar2 log]$ python /usr/share/vdsm/vdsm</div> <div class="">(PID: 6521) I am the actual vdsm 4.17.26-1.el7 <a moz-do-not-send="true" href="http://cultivar2.grove.silverorange.com/" class="">cultivar2.grove.silverorange.com</a> (3.10.0-327.el7.x86_64)</div> <div class="">VDSM will run with cpu affinity: frozenset([1])</div> <div class="">/usr/bin/taskset --all-tasks --pid --cpu-list 1 6521 (cwd None)</div> <div class="">SUCCESS: <err> = ''; <rc> = 0</div> <div class="">Starting scheduler vdsm.Scheduler</div> <div class="">started</div> <div class="">Run and protect: registerDomainStateChangeCallback(callbackFunc=<functools.partial object at 0x381b158>)</div> <div class="">Run and protect: registerDomainStateChangeCallback, Return response: None</div> <div class="">Trying to connect to Super Vdsm</div> <div class="">Preparing MOM interface</div> <div class="">Using named unix socket /var/run/vdsm/mom-vdsm.sock</div> <div class="">Unregistering all secrests</div> <div class="">trying to connect libvirt</div> <div class="">recovery: started</div> <div class="">Setting channels' timeout to 30 seconds.</div> <div class="">Starting VM channels listener thread.</div> <div class="">Listening at <a moz-do-not-send="true" href="http://0.0.0.0:54321">0.0.0.0:54321</a></div> <div class="">Adding detector <rpc.bindingxmlrpc.XmlDetector instance at 0x3b4ecb0></div> <div class="">recovery: completed in 0s</div> <div class="">Adding detector <yajsonrpc.stompreactor.StompDetector instance at 0x382e5a8></div> <div class="">Starting executor</div> <div class="">Starting worker jsonrpc.Executor/0</div> <div class="">Worker started</div> <div class="">Starting worker jsonrpc.Executor/1</div> <div class="">Worker started</div> <div class="">Starting worker 
jsonrpc.Executor/2</div> <div class="">Worker started</div> <div class="">Starting worker jsonrpc.Executor/3</div> <div class="">Worker started</div> <div class="">Starting worker jsonrpc.Executor/4</div> <div class="">Worker started</div> <div class="">Starting worker jsonrpc.Executor/5</div> <div class="">Worker started</div> <div class="">Starting worker jsonrpc.Executor/6</div> <div class="">Worker started</div> <div class="">Starting worker jsonrpc.Executor/7</div> <div class="">Worker started</div> <div class="">XMLRPC server running</div> <div class="">Starting executor</div> <div class="">Starting worker periodic/0</div> <div class="">Worker started</div> <div class="">Starting worker periodic/1</div> <div class="">Worker started</div> <div class="">Starting worker periodic/2</div> <div class="">Worker started</div> <div class="">Starting worker periodic/3</div> <div class="">Worker started</div> <div class="">trying to connect libvirt</div> <div class="">Panic: Connect to supervdsm service failed: [Errno 2] No such file or directory</div> <div class="">Traceback (most recent call last):</div> <div class=""> File "/usr/share/vdsm/supervdsm.py", line 78, in _connect</div> <div class=""> utils.retry(self._manager.connect, Exception, timeout=60, tries=3)</div> <div class=""> File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 959, in retry</div> <div class=""> return func()</div> <div class=""> File "/usr/lib64/python2.7/multiprocessing/managers.py", line 500, in connect</div> <div class=""> conn = Client(self._address, authkey=self._authkey)</div> <div class=""> File "/usr/lib64/python2.7/multiprocessing/connection.py", line 173, in Client</div> <div class=""> c = SocketClient(address)</div> <div class=""> File "/usr/lib64/python2.7/multiprocessing/connection.py", line 308, in SocketClient</div> <div class=""> s.connect(address)</div> <div class=""> File "/usr/lib64/python2.7/socket.py", line 224, in meth</div> <div class=""> return 
getattr(self._sock,name)(*args)</div> <div class="">error: [Errno 2] No such file or directory</div> <div class="">Killed</div> </blockquote> <div class="" style="color:rgb(0,0,0);font-family:Helvetica;font-size:12px"> <div class=""><br class=""> </div> <div class="">Thanks for the help. It's really appreciated.</div> <div class=""> <div id="signature" class=""><br class=""> Cheers,<br class=""> Gervais</div> </div> </div> </div> <div class="gmail_extra"><br> <div class="gmail_quote">On Fri, May 13, 2016 at 12:55 AM, Charles Tassell <span dir="ltr"><<a moz-do-not-send="true" href="mailto:ctassell@gmail.com" target="_blank">ctassell@gmail.com</a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div text="#000000" bgcolor="#FFFFFF"> <div>Hi Gervais,<br> <br> Hmm, can you tar up the logfiles (/var/log/vdsm/* on the host you are installing on) and put them somewhere to look at? Also, I found that starting VDSM from the command line is useful as it sometimes spits out error messages that don't show up in the logs. I think the command I used was:<br> su -s /bin/bash vdsm<br> python /usr/share/vdsm/vdsm<br> <br> My problem was that I customized the logging settings in /etc/vdsm/*conf to try and tone down the debugging stuff and had a syntax error. <div> <div class="h5"><br> <br> On 16-05-12 10:24 PM, Gervais de Montbrun wrote:<br> </div> </div> </div> <div> <div class="h5"> <blockquote type="cite"> <div dir="ltr"> <div dir="auto" style="word-wrap:break-word">Hi Charles,<br> <br> Thanks for the suggestion.<br> <br> I cleaned up again using the bash script from the recoving-from-failed-install link below, then reinstalled (yum install ovirt-hosted-engine-setup).<br> <br> I enabled NetworkManager and firewalld as you suggested. 
The install stops very early on with an error:<br> <span style="white-space:pre-wrap"> </span>[ ERROR ] Failed to execute stage 'Programs detection': hosted-engine cannot be deployed while NetworkManager is running, please stop and disable it before proceeding <br> <br> I disabled and stopped NetworkManager and tried again. Same result. :(<br> <br> Any more guesses?<br> <br> Cheers,<br> Gervais<br> <br> <br> <br> <blockquote type="cite">On May 12, 2016, at 9:08 PM, Charles Tassell <<a moz-do-not-send="true" href="mailto:ctassell@gmail.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:ctassell@gmail.com">ctassell@gmail.com</a></a>> wrote:<br> <br> Hey Gervais,<br> <br> Try enabling NetworkManager and firewalld before doing the hosted-engine --deploy. I have run into problems with oVirt trying to perform tasks on hosts where firewalld is disabled, so maybe you are running into a similar problem. Also, I think the setup script will disable NetworkManager if it needs to. 
I know I didn't manually disable it on any of the boxes I installed on.<br> <br> On 16-05-12 04:49 PM, <a moz-do-not-send="true" href="mailto:users-request@ovirt.org" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:users-request@ovirt.org">users-request@ovirt.org</a></a> wrote:<br> <blockquote type="cite">Message: 1<br> Date: Thu, 12 May 2016 14:22:12 -0300<br> From: Gervais de Montbrun <<a moz-do-not-send="true" href="mailto:gervais@demontbrun.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:gervais@demontbrun.com">gervais@demontbrun.com</a></a>><br> To: Wee Sritippho <<a moz-do-not-send="true" href="mailto:wee.s@forest.go.th" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:wee.s@forest.go.th">wee.s@forest.go.th</a></a>><br> Cc: users <<a moz-do-not-send="true" href="mailto:users@ovirt.org" target="_blank">users@ovirt.org</a>><br> Subject: Re: [ovirt-users] Adding another host to my cluster<br> Message-ID: <<a moz-do-not-send="true" href="mailto:28B7FC74-5C52-4F60-B9F3-39A36621A7CA@demontbrun.com" target="_blank">28B7FC74-5C52-4F60-B9F3-39A36621A7CA@demontbrun.com</a>><br> Content-Type: text/plain; charset="utf-8"<br> <br> Hi Wee<br> (and others)<br> <br> Thanks for the reply. I tried what you suggested, but I am in the exact same state. :-(<br> <br> I don't want to completely remove my hosted engine setup as it is working on the two other hosts in my cluster. I did not run the rm -rf stes listed here (<a moz-do-not-send="true" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..." target="_blank"><a class="moz-txt-link-freetext" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a></a> <<a moz-do-not-send="true" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..." 
target="_blank">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a>>) that would wipe my hosted_engine nfs mount. If you know that this is 100% necessary, please let me know.<br> <br> I did:<br> hosted-engine --clean-metadata --force-cleanup --host-id=3<br> run the bash script to remove all of the ovirt packages and config files<br> reinstalled ovirt-hosted-engine-setup<br> ran "hosted-engine --deploy"<br> <br> I'm back exactly where I started. Is there a way to run just the network configuration part of the deploy?<br> <br> Since the last attempt, I did upgrade my hosted engine and my cluster is now running oVirt 3.6.5.<br> <br> Cheers,<br> Gervais<br> <br> <br> <br> <blockquote type="cite">On May 12, 2016, at 11:50 AM, Wee Sritippho <<a moz-do-not-send="true" href="mailto:wee.s@forest.go.th" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:wee.s@forest.go.th">wee.s@forest.go.th</a></a>> wrote:<br> <br> Hi,<br> <br> I used to have a similar problem where one of my host can't be deployed due to the absence of ovirtmgmt bridge. Simone said it's a bug ( <a moz-do-not-send="true" href="https://bugzilla.redhat.com/1323465" target="_blank">https://bugzilla.redhat.com/1323465</a> <<a moz-do-not-send="true" href="https://bugzilla.redhat.com/1323465" target="_blank">https://bugzilla.redhat.com/1323465</a>> ) which would be fixed in 3.6.6.<br> <br> This is what I've done to solve it:<br> <br> 1. In the web UI, set the failed host to maintenance.<br> 2. Remove it.<br> 3. In that host, run a script from <a moz-do-not-send="true" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..." 
target="_blank"><a class="moz-txt-link-freetext" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a></a> <<a moz-do-not-send="true" href="https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-fail..." target="_blank">https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-failed-install</a>><br> 4. Install ovirt-hosted-engine-setup again.<br> 5. Redeploy again.<br> <br> Hope that helps<br> <br> On 11 ??????? 2016 22 ?????? 48 ???? 58 ?????? GMT+07:00, Gervais de Montbrun <<a moz-do-not-send="true" href="mailto:gervais@demontbrun.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:gervais@demontbrun.com">gervais@demontbrun.com</a></a>> wrote:<br> Hi Folks,<br> <br> I hate to reply to my own message, but I'm really hoping someone can help me with my issue<br> <a moz-do-not-send="true" href="http://lists.ovirt.org/pipermail/users/2016-May/039690.html" target="_blank">http://lists.ovirt.org/pipermail/users/2016-May/039690.html</a> <<a moz-do-not-send="true" href="http://lists.ovirt.org/pipermail/users/2016-May/039690.html" target="_blank">http://lists.ovirt.org/pipermail/users/2016-May/039690.html</a>><br> <br> Does anyone have a suggestion for me? If there is any more information that I can provide that would help you to help me, please advise.<br> <br> Cheers,<br> Gervais<br> <br> <br> <br> <blockquote type="cite">On May 9, 2016, at 1:42 PM, Gervais de Montbrun <<a moz-do-not-send="true" href="mailto:gervais@demontbrun.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:gervais@demontbrun.com">gervais@demontbrun.com</a></a> <mailto:<a moz-do-not-send="true" href="mailto:gervais@demontbrun.com" target="_blank">gervais@demontbrun.com</a>>> wrote:<br> <br> Hi All,<br> <br> I'm trying to add a third host into my oVirt cluster. I have hosted engine setup on the first two. 
It's failing to finish the hosted-engine --deploy on this third host. I wiped the server and did a CentOS 7 minimum install and ran it again to have a clean machine.<br> <br> My setup:<br> CentOS 7 clean install<br> yum install -y <a moz-do-not-send="true" href="http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm" target="_blank"><a class="moz-txt-link-freetext" href="http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm">http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm</a></a> <<a moz-do-not-send="true" href="http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm" target="_blank">http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm</a>><br> yum install -y ovirt-hosted-engine-setup<br> yum upgrade -y && reboot<br> systemctl disable NetworkManager ; systemctl stop NetworkManager ; systemctl disable firewalld ; systemctl stop firewalld<br> hosted-engine --deploy<br> <br> hosted-engine --deploy always throws an error:<br> [ ERROR ] The VDSM host was found in a failed state. 
Please check engine and bootstrap installation logs.<br> [ ERROR ] Unable to add Cultivar2 to the manager<br> and then echoes<br> [ INFO ] Waiting for VDSM hardware info<br> ...<br> [ ERROR ] Failed to execute stage 'Closing up': VDSM did not start within 120 seconds<br> [ INFO ] Stage: Clean up<br> [ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160509131103.conf'<br> [ INFO ] Stage: Pre-termination<br> [ INFO ] Stage: Termination<br> [ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue, fix and redeploy<br> Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160509130658-qb8ev0.log<br> <br> Full output of hosted-engine --deploy is included in the attached zip file.<br> I've also included vdsm.log (it covers more than one attempt).<br> You'll also find the ovirt-hosted-engine-setup-20160509130658-qb8ev0.log listed above.<br> <br> This is my "test" setup. Cultivar0 is my first host and my NFS server for storage. I have two hosts in the setup already and everything is working fine.
The host does show up in the oVirt admin, but shows "Installed Failed"<br> <PastedGraphic-1.png><br> <br> Trying to reinstall from within the interface just fails again.<br> <br> The ovirt bridge interface is not configured and there are no config files in /etc/sysconfig/network-scripts related to ovirt.<br> <br> OS:<br> [root@cultivar2 ovirt-hosted-engine-setup]# cat /etc/redhat-release<br> CentOS Linux release 7.2.1511 (Core)<br> <br> [root@cultivar2 ovirt-hosted-engine-setup]# uname -a<br> Linux <a moz-do-not-send="true" href="http://cultivar2.grove.silverorange.com" target="_blank">cultivar2.grove.silverorange.com</a> <<a moz-do-not-send="true" href="http://cultivar2.grove.silverorange.com/" target="_blank">http://cultivar2.grove.silverorange.com/</a>> 3.10.0-327.13.1.el7.x86_64 #1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux<br> <br> Versions:<br> [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i ovirt<br> libgovirt-0.3.3-1.el7_2.1.x86_64<br> ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch<br> ovirt-host-deploy-1.4.1-1.el7.centos.noarch<br> ovirt-vmconsole-1.0.0-1.el7.centos.noarch<br> ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch<br> ovirt-release36-007-1.noarch<br> ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch<br> ovirt-setup-lib-1.0.1-1.el7.centos.noarch<br> ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch<br> [root@cultivar2 ovirt-hosted-engine-setup]#<br> [root@cultivar2 ovirt-hosted-engine-setup]#<br> [root@cultivar2 ovirt-hosted-engine-setup]#<br> [root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i virt<br> libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64<br> virt-viewer-2.0-6.el7.x86_64<br> libgovirt-0.3.3-1.el7_2.1.x86_64<br> libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64<br> ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch<br> fence-virt-0.3.2-2.el7.x86_64<br> virt-what-1.13-6.el7.x86_64<br> libvirt-python-1.2.17-2.el7.x86_64<br> libvirt-daemon-1.2.17-13.el7_2.4.x86_64<br>
libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.4.x86_64<br> libvirt-lock-sanlock-1.2.17-13.el7_2.4.x86_64<br> libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64<br> libvirt-daemon-driver-network-1.2.17-13.el7_2.4.x86_64<br> libvirt-daemon-driver-storage-1.2.17-13.el7_2.4.x86_64<br> ovirt-host-deploy-1.4.1-1.el7.centos.noarch<br> virt-v2v-1.28.1-1.55.el7.centos.2.x86_64<br> ovirt-vmconsole-1.0.0-1.el7.centos.noarch<br> ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch<br> libvirt-client-1.2.17-13.el7_2.4.x86_64<br> libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.4.x86_64<br> ovirt-release36-007-1.noarch<br> libvirt-daemon-driver-interface-1.2.17-13.el7_2.4.x86_64<br> libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64<br> ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch<br> ovirt-setup-lib-1.0.1-1.el7.centos.noarch<br> ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch<br> <br> I also have a series of stuck tasks that I can't clear related to the host that can't be added... This is a secondary issue and I don't want to get off track, but they look like this:<br> <PastedGraphic-2.png><br> <br> I'd appreciate any help that can be offered.<br> <br> Cheers,<br> Gervais<br> <br> <br> Gervais de Montbrun<br> Systems Administrator / silverorange Inc.<br> <br> Phone <span style="white-space:pre-wrap"> </span><a moz-do-not-send="true" href="tel:%2B1%20902%20367%204532%20ext.%20104" value="+19023674532" target="_blank">+1 902 367 4532 ext. 104</a> <tel:<a moz-do-not-send="true" href="tel:%2B1%20902%20367%204532%20ext.%20104" value="+19023674532" target="_blank">+1 902 367 4532 ext. 
104</a>><br> Mobile <span style="white-space:pre-wrap"> </span><a moz-do-not-send="true" href="tel:%2B1%20902%20978%200009" value="+19029780009" target="_blank">+1 902 978 0009</a> <tel:<a moz-do-not-send="true" href="tel:%2B1%20902%20978%200009" value="+19029780009" target="_blank">+1 902 978 0009</a>><br> <br> <hosted-engine--deploy-logs.zip><br> </blockquote> <br> <br> Users mailing list<br> <a moz-do-not-send="true" href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a><br> <a moz-do-not-send="true" href="http://lists.ovirt.org/mailman/listinfo/users" target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a> <<a moz-do-not-send="true" href="http://lists.ovirt.org/mailman/listinfo/users" target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a>><br> <br> -- <br> Wee<br> </blockquote> <br> </blockquote> <br> _______________________________________________<br> Users mailing list<br> <a moz-do-not-send="true" href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a><br> <a moz-do-not-send="true" href="http://lists.ovirt.org/mailman/listinfo/users" target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a><br> </blockquote> <br> </div> </div> </blockquote> <br> </div> </div> </div> </blockquote> </div> <br> </div> </blockquote> <br> </body> </html> --------------030000020408000408030009--