Hi Charles,
I think the problem I am having is due to the setup failing rather than
something in the vdsm configs, as I have never gotten this server to start up
properly and the bridge ethernet interface and oVirt routes are not set up.
I put the logs here:
https://www.dropbox.com/sh/5ugyykqh1lgru9l/AACXxRYWr3tgd0WbBVFW5twHa?dl=0
hosted-engine--deploy-logs.zip # Logs from when I tried to deploy and it
failed
vdsm.tar.gz # /var/log/vdsm
Output from running vdsm from the command line:
[root@cultivar2 log]# su -s /bin/bash vdsm
[vdsm@cultivar2 log]$ python /usr/share/vdsm/vdsm
(PID: 6521) I am the actual vdsm 4.17.26-1.el7
cultivar2.grove.silverorange.com (3.10.0-327.el7.x86_64)
VDSM will run with cpu affinity: frozenset([1])
/usr/bin/taskset --all-tasks --pid --cpu-list 1 6521 (cwd None)
SUCCESS: <err> = ''; <rc> = 0
Starting scheduler vdsm.Scheduler
started
Run and protect:
registerDomainStateChangeCallback(callbackFunc=<functools.partial object at
0x381b158>)
Run and protect: registerDomainStateChangeCallback, Return response: None
Trying to connect to Super Vdsm
Preparing MOM interface
Using named unix socket /var/run/vdsm/mom-vdsm.sock
Unregistering all secrests
trying to connect libvirt
recovery: started
Setting channels' timeout to 30 seconds.
Starting VM channels listener thread.
Listening at 0.0.0.0:54321
Adding detector <rpc.bindingxmlrpc.XmlDetector instance at 0x3b4ecb0>
recovery: completed in 0s
Adding detector <yajsonrpc.stompreactor.StompDetector instance at 0x382e5a8>
Starting executor
Starting worker jsonrpc.Executor/0
Worker started
Starting worker jsonrpc.Executor/1
Worker started
Starting worker jsonrpc.Executor/2
Worker started
Starting worker jsonrpc.Executor/3
Worker started
Starting worker jsonrpc.Executor/4
Worker started
Starting worker jsonrpc.Executor/5
Worker started
Starting worker jsonrpc.Executor/6
Worker started
Starting worker jsonrpc.Executor/7
Worker started
XMLRPC server running
Starting executor
Starting worker periodic/0
Worker started
Starting worker periodic/1
Worker started
Starting worker periodic/2
Worker started
Starting worker periodic/3
Worker started
trying to connect libvirt
Panic: Connect to supervdsm service failed: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsm.py", line 78, in _connect
    utils.retry(self._manager.connect, Exception, timeout=60, tries=3)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 959, in retry
    return func()
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 500, in connect
    conn = Client(self._address, authkey=self._authkey)
  File "/usr/lib64/python2.7/multiprocessing/connection.py", line 173, in Client
    c = SocketClient(address)
  File "/usr/lib64/python2.7/multiprocessing/connection.py", line 308, in SocketClient
    s.connect(address)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 2] No such file or directory
Killed
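For anyone reading along: the Errno 2 raised from s.connect() above means the UNIX socket file that vdsm tried to reach does not exist. Below is a minimal sketch for checking that by hand. It assumes the supervdsm socket lives at /var/run/vdsm/svdsm.sock, which is a guess on my part since the path is not shown in the output, and check_unix_socket is a made-up helper name. It is written for Python 3, not the Python 2.7 that vdsm itself runs under here.

```python
import os
import socket
import stat

# Assumed location of the supervdsm socket; not shown in the output above.
SOCKET_PATH = "/var/run/vdsm/svdsm.sock"


def check_unix_socket(path):
    """Return a short status string describing a UNIX socket path."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Same condition as the Errno 2 in the traceback.
        return "missing"
    if not stat.S_ISSOCK(mode):
        return "not a socket"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError as exc:
        # The file exists but nothing is accepting connections on it.
        return "connect failed: %s" % exc
    finally:
        sock.close()
    return "ok"


if __name__ == "__main__":
    print(SOCKET_PATH, "->", check_unix_socket(SOCKET_PATH))
```

A result of "missing" would suggest the supervdsm service is not running at all, rather than vdsm itself being misconfigured.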
Thanks for the help. It's really appreciated.
Cheers,
Gervais
On Fri, May 13, 2016 at 12:55 AM, Charles Tassell <ctassell(a)gmail.com>
wrote:
Hi Gervais,
Hmm, can you tar up the logfiles (/var/log/vdsm/* on the host you are
installing on) and put them somewhere to look at? Also, I found that
starting VDSM from the command line is useful as it sometimes spits out
error messages that don't show up in the logs. I think the command I used
was:
su -s /bin/bash vdsm
python /usr/share/vdsm/vdsm
My problem was that I customized the logging settings in /etc/vdsm/*conf
to try and tone down the debugging stuff and had a syntax error.
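A rough way to catch that kind of syntax error before starting VDSM is to run the files through Python's INI parser. This is only a sketch: it assumes the files under /etc/vdsm are INI-style (the logging config is read by Python's logging machinery, which uses the same parser), find_config_errors is a hypothetical helper name, and it will only flag structural problems such as a missing section header, not bad values. It is written for Python 3.

```python
import configparser
import glob


def find_config_errors(pattern):
    """Parse every file matching pattern; return {path: error message}."""
    errors = {}
    for path in sorted(glob.glob(pattern)):
        # RawConfigParser: no interpolation, so logging format strings
        # such as %(asctime)s are left alone.
        parser = configparser.RawConfigParser()
        try:
            with open(path) as handle:
                parser.read_file(handle)
        except (configparser.Error, OSError) as exc:
            errors[path] = str(exc)
    return errors


if __name__ == "__main__":
    problems = find_config_errors("/etc/vdsm/*conf")
    for path, message in problems.items():
        print(path, ":", message)
    if not problems:
        print("no INI syntax errors found")
```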
On 16-05-12 10:24 PM, Gervais de Montbrun wrote:
Hi Charles,
Thanks for the suggestion.
I cleaned up again using the bash script from the
recoving-from-failed-install link below, then reinstalled (yum install
ovirt-hosted-engine-setup).
I enabled NetworkManager and firewalld as you suggested. The install stops
very early on with an error:
[ ERROR ] Failed to execute stage 'Programs detection': hosted-engine
cannot be deployed while NetworkManager is running, please stop and disable
it before proceeding
I disabled and stopped NetworkManager and tried again. Same result. :(
Any more guesses?
Cheers,
Gervais
On May 12, 2016, at 9:08 PM, Charles Tassell <ctassell(a)gmail.com> wrote:
Hey Gervais,
Try enabling NetworkManager and firewalld before doing the hosted-engine
--deploy. I have run into problems with oVirt trying to perform tasks on
hosts where firewalld is disabled, so maybe you are running into a similar
problem. Also, I think the setup script will disable NetworkManager if it
needs to. I know I didn't manually disable it on any of the boxes I
installed on.
On 16-05-12 04:49 PM, users-request(a)ovirt.org wrote:
Message: 1
Date: Thu, 12 May 2016 14:22:12 -0300
From: Gervais de Montbrun <gervais(a)demontbrun.com>
To: Wee Sritippho <wee.s(a)forest.go.th>
Cc: users <users(a)ovirt.org>
Subject: Re: [ovirt-users] Adding another host to my cluster
Message-ID: <28B7FC74-5C52-4F60-B9F3-39A36621A7CA(a)demontbrun.com>
Content-Type: text/plain; charset="utf-8"
Hi Wee
(and others)
Thanks for the reply. I tried what you suggested, but I am in the exact
same state. :-(
I don't want to completely remove my hosted engine setup as it is working
on the two other hosts in my cluster. I did not run the rm -rf steps listed
here
(https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-f...)
that would wipe my hosted_engine nfs mount. If you know that this is 100%
necessary, please let me know.
I did:
hosted-engine --clean-metadata --force-cleanup --host-id=3
ran the bash script to remove all of the ovirt packages and config files
reinstalled ovirt-hosted-engine-setup
ran "hosted-engine --deploy"
I'm back exactly where I started. Is there a way to run just the network
configuration part of the deploy?
Since the last attempt, I did upgrade my hosted engine and my cluster is
now running oVirt 3.6.5.
Cheers,
Gervais
On May 12, 2016, at 11:50 AM, Wee Sritippho <wee.s(a)forest.go.th> wrote:
Hi,
I used to have a similar problem where one of my hosts couldn't be deployed
due to the absence of the ovirtmgmt bridge. Simone said it's a bug
(https://bugzilla.redhat.com/1323465) which would be fixed in 3.6.6.
This is what I've done to solve it:
1. In the web UI, set the failed host to maintenance.
2. Remove it.
3. On that host, run the script from
https://www.ovirt.org/documentation/how-to/hosted-engine/#recoving-from-f...
4. Install ovirt-hosted-engine-setup again.
5. Redeploy again.
Hope that helps
On 11 May 2016 at 22:48:58 GMT+07:00, Gervais de
Montbrun <gervais(a)demontbrun.com> wrote:
Hi Folks,
I hate to reply to my own message, but I'm really hoping someone can help
me with my issue
http://lists.ovirt.org/pipermail/users/2016-May/039690.html
Does anyone have a suggestion for me? If there is any more information
that I can provide that would help you to help me, please advise.
Cheers,
Gervais
On May 9, 2016, at 1:42 PM, Gervais de Montbrun <gervais(a)demontbrun.com>
wrote:
Hi All,
I'm trying to add a third host to my oVirt cluster. I have hosted engine
set up on the first two. The hosted-engine --deploy fails to finish
on this third host. I wiped the server, did a CentOS 7 minimal install,
and ran it again to have a clean machine.
My setup:
CentOS 7 clean install
yum install -y http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
yum install -y ovirt-hosted-engine-setup
yum upgrade -y && reboot
systemctl disable NetworkManager ; systemctl stop NetworkManager ;
systemctl disable firewalld ; systemctl stop firewalld
hosted-engine --deploy
hosted-engine --deploy always throws an error:
[ ERROR ] The VDSM host was found in a failed state. Please check engine
and bootstrap installation logs.
[ ERROR ] Unable to add Cultivar2 to the manager
and then echoes
[ INFO ] Waiting for VDSM hardware info
...
[ ERROR ] Failed to execute stage 'Closing up': VDSM did not start within
120 seconds
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file
'/var/lib/ovirt-hosted-engine-setup/answers/answers-20160509131103.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: this system is not reliable,
please check the issue, fix and redeploy
Log file is located at
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160509130658-qb8ev0.log
Full output of hosted-engine --deploy included in the attached zip file.
I've also included vdsm.log (there is more than one attempt's worth of
logs in there).
You'll also find the ovirt-hosted-engine-setup-20160509130658-qb8ev0.log
listed above.
This is my "test" setup. Cultivar0 is my first host and my nfs server for
storage. I have two hosts in the setup already and everything is working
fine. The host does show up in the oVirt admin, but shows "Installed Failed"
<PastedGraphic-1.png>
Trying to reinstall from within the interface just fails again.
The oVirt bridge interface is not configured and there are no config files
in /etc/sysconfig/network-scripts related to oVirt.
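For reference, here is a small sketch of how that check can be scripted. It assumes the management bridge is named ovirtmgmt (the name used elsewhere in this thread) and that its config file would be called ifcfg-ovirtmgmt under /etc/sysconfig/network-scripts; the file name is an assumption, and bridge_status is a made-up helper. It relies on the Linux sysfs layout, where a bridge device has a "bridge" subdirectory under /sys/class/net/<name>. Written for Python 3.

```python
import os


def bridge_status(name,
                  sysfs_root="/sys/class/net",
                  scripts_dir="/etc/sysconfig/network-scripts"):
    """Report whether a bridge device and its ifcfg file are present."""
    iface_dir = os.path.join(sysfs_root, name)
    return {
        # The kernel knows about an interface with this name.
        "interface_exists": os.path.isdir(iface_dir),
        # Bridge devices expose a "bridge" directory in sysfs.
        "is_bridge": os.path.isdir(os.path.join(iface_dir, "bridge")),
        # Assumed file name for the persistent network config.
        "ifcfg_exists": os.path.isfile(
            os.path.join(scripts_dir, "ifcfg-" + name)),
    }


if __name__ == "__main__":
    for key, value in sorted(bridge_status("ovirtmgmt").items()):
        print(key, "=", value)
```

On a host where the deploy never got as far as network setup, all three values would be expected to come back False.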
OS:
[root@cultivar2 ovirt-hosted-engine-setup]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@cultivar2 ovirt-hosted-engine-setup]# uname -a
Linux cultivar2.grove.silverorange.com 3.10.0-327.13.1.el7.x86_64 #1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Versions:
[root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i ovirt
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
[root@cultivar2 ovirt-hosted-engine-setup]#
[root@cultivar2 ovirt-hosted-engine-setup]#
[root@cultivar2 ovirt-hosted-engine-setup]#
[root@cultivar2 ovirt-hosted-engine-setup]# rpm -qa | grep -i virt
libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64
virt-viewer-2.0-6.el7.x86_64
libgovirt-0.3.3-1.el7_2.1.x86_64
libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
fence-virt-0.3.2-2.el7.x86_64
virt-what-1.13-6.el7.x86_64
libvirt-python-1.2.17-2.el7.x86_64
libvirt-daemon-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.4.x86_64
libvirt-lock-sanlock-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-network-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-storage-1.2.17-13.el7_2.4.x86_64
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
virt-v2v-1.28.1-1.55.el7.centos.2.x86_64
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
libvirt-client-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.4.x86_64
ovirt-release36-007-1.noarch
libvirt-daemon-driver-interface-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
I also have a series of stuck tasks that I can't clear related to the host
that can't be added... This is a secondary issue and I don't want to get
off track, but they look like this:
<PastedGraphic-2.png>
I'd appreciate any help that can be offered.
Cheers,
Gervais
Gervais de Montbrun
Systems Administrator / silverorange Inc.
Phone +1 902 367 4532 ext. 104
Mobile +1 902 978 0009
<hosted-engine--deploy-logs.zip>
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
--
Wee