
We have found a resolution to this issue. It is a bit convoluted, and it still does not explain why the problem started in the first place.

Once we have prepped an HV server to be added (all the NICs are ready; SELinux, NetworkManager, firewalld, etc. have been disabled), we have to do the following:

1. Install librbd1:

    yum install librbd1

2. Delete the base repo (yes... delete the base repo):

    rm /etc/yum.repos.d/CentOS-Base.repo

3. Edit the vault repo:

    vi /etc/yum.repos.d/CentOS-Vault.repo

   and add the following:

    [vault]
    name=CentOS-$releasever - Extras
    #mirrorlist=http://vault.centos.org/?release=$releasever&arch=$basearch&repo=extras
    baseurl=http://vault.centos.org/centos/7.0.1406/os/x86_64/
    gpgcheck=0
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

4. Edit the oVirt dependencies repo:

    vi /etc/yum.repos.d/ovirt-3.5-dependencies.repo

   and change the baseurls as shown below:

    [ovirt-3.5-glusterfs-epel]
    name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
    baseurl=https://download.gluster.org/pub/gluster/glusterfs/old-releases/3.6/3.6.1/RH...
    enabled=1
    skip_if_unavailable=1
    gpgcheck=0

    [ovirt-3.5-glusterfs-noarch-epel]
    name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
    baseurl=http://download.gluster.org/pub/gluster/glusterfs/old-releases/3.6/3.6.1/RHE...
    #baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$rel...
    enabled=1
    skip_if_unavailable=1
    gpgcheck=0

5. Remove the existing packages:

    yum remove ovirt-release35
    yum remove vdsm
    yum remove libvirt

6. Install the specific versions that match our working hosts:

    yum install ovirt-release35-002-1
    yum install libvirt-1.1.1-29.el7
    yum install vdsm-4.16.7-1.gitdb83943.el7

We can then successfully add the HV into the cluster.

This is using CentOS 7.0.1406 (Core) as the host OS and oVirt 3.5.0.1-1.el6.

If anyone has any questions, please feel free to ask.

***
*Mark Steele*
CIO / VP Technical Operations | TelVue Corporation
TelVue - We Share Your Vision
16000 Horizon Way, Suite 100 | Mt. Laurel, NJ 08054
800.885.8886 x128 | msteele@telvue.com | http://www.telvue.com
twitter: http://twitter.com/telvue | facebook: https://www.facebook.com/telvue

On Tue, Feb 20, 2018 at 11:24 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
On Tue, Feb 20, 2018 at 12:52 PM, Mark Steele <msteele@telvue.com> wrote:
Is it possible that the HostedEngine became corrupted somehow and that is preventing us from adding hosts?
I doubt that. I still suspect the libvirt authentication issue. Nevertheless, as commented more than once, you are running a somewhat old oVirt version with a recent CentOS version. I am not sure this combination is tested or that anyone is running it.
Is creating a new hosted engine an option?
You could back up and restore to a new HE.
Y.
On Mon, Feb 19, 2018 at 9:55 AM, Mark Steele <msteele@telvue.com> wrote:
At this point I'm wondering if there is anyone in the community that freelances and would be willing to provide remote support to resolve this issue?
We are running with half our normal hosts, and not being able to add any more back into the cluster is a serious problem.
Best regards,
On Sat, Feb 17, 2018 at 12:53 PM, Mark Steele <msteele@telvue.com> wrote:
Yaniv,
I have one of my developers assisting me and we are continuing to run into issues. This is a note from him:
Hi, I'm trying to add a host to oVirt, but I'm running into package dependency problems. I have existing hosts that are working and integrated properly. Inspecting those, I am able to match the packages between the new host and the existing ones, but when I then try to add the new host to oVirt, it fails on reinstall because it tries to install later package versions. Does the installation run list from ovirt-release35 002-1 have unspecified versions? The working hosts use libvirt-1.1.1-29 and vdsm-4.16.7, but the install tries to pull in vdsm-4.16.30, which requires a higher version of libvirt, at which point the installation fails. Is there some way I can specify which package versions the oVirt install procedure uses? Or, better yet, skip the package management step entirely?
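[Editor's sketch] One way to approach the version question above is to name the exact builds when installing and then verify what actually landed on the host. The snippet below is a minimal sketch, not part of the oVirt install procedure: the function names `pinned_install_cmd` and `matches_pin` are hypothetical, and the versions are the ones reported elsewhere in this thread as working. It does not by itself stop the engine's host-deploy step from pulling newer packages.

```shell
#!/bin/sh
# Package builds reported in this thread as working together
# (assumption: these exact builds are still reachable from the configured repos).
PINNED="ovirt-release35-002-1 libvirt-1.1.1-29.el7 vdsm-4.16.7-1.gitdb83943.el7"

# Print (do not run) the yum command that installs exactly the pinned builds.
pinned_install_cmd() {
    printf 'yum install -y %s\n' "$PINNED"
}

# Return 0 if an installed package string, as printed by `rpm -q NAME`,
# begins with the pinned name-version-release; return 1 otherwise.
matches_pin() {
    installed=$1
    pin=$2
    case "$installed" in
        "$pin"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `matches_pin "$(rpm -q vdsm)" vdsm-4.16.7-1.gitdb83943.el7` succeeds only when the installed vdsm is the pinned build. The yum versionlock plugin (package yum-plugin-versionlock, command `yum versionlock add`) is another option for holding packages at a version, though whether the oVirt host installation honours it has not been confirmed in this thread.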
On Sat, Feb 17, 2018 at 2:32 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
On Fri, Feb 16, 2018 at 11:14 PM, Mark Steele <msteele@telvue.com> wrote:
We are using CentOS Linux release 7.0.1406 (Core) and oVirt Engine Version: 3.5.0.1-1.el6
You are seeing https://bugzilla.redhat.com/show_bug.cgi?id=1444426 , which is the result of a change in a libvirt default and was fixed in later versions of oVirt than the one you are using. See patch https://gerrit.ovirt.org/#/c/76934/ for how it was fixed; you can probably configure it manually.
Y.
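[Editor's sketch] Before configuring anything manually, it may help to see which SASL mechanism libvirt is currently set to offer. The snippet below is a diagnostic sketch only. It assumes the setting lives in /etc/sasl2/libvirt.conf as a `mech_list:` line, which is where libvirt's SASL configuration is normally kept; the function name `active_mech_list` is hypothetical, and the exact change made by the referenced patch is not reproduced here.

```shell
#!/bin/sh
# Print the active (uncommented) mech_list line from a SASL config file.
# Defaults to libvirt's SASL config; pass another path as $1 to inspect a copy.
active_mech_list() {
    conf=${1:-/etc/sasl2/libvirt.conf}
    # Commented-out lines start with '#' and therefore do not match.
    # If several active lines exist, the last one is printed.
    grep -E '^[[:space:]]*mech_list[[:space:]]*:' "$conf" | tail -n 1
}
```

Running `active_mech_list` on the failing host and on a working one, and comparing the result against the installed SASL plugin packages (`rpm -qa 'cyrus-sasl*'`), should show whether the configured mechanism is actually available. A "no mechanism available" error is consistent with a mismatch there, but that is an inference, not something established in this thread.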
We have four other hosts that are running this same configuration already. I took one host out of the cluster (forcefully) that was working and now it will not add back in either - throwing the same SASL error.
We are looking at downgrading libvirt as I've seen that somewhere else - is there another version of RH I should be trying? I have a host I can put it on.
On Fri, Feb 16, 2018 at 3:31 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
> On Feb 16, 2018 6:47 PM, "Mark Steele" <msteele@telvue.com> wrote:
>
> > Hello all,
> >
> > We recently had a network event where we lost access to our storage for a period of time. The Cluster basically shut down all our VM's and in the process we had three HV's that went offline and would not communicate properly with the cluster.
> >
> > We have since completely reinstalled CentOS on the hosts and attempted to install them into the cluster with no joy. We've gotten to the point where we generally get an error message in the web gui:
>
> Which EL release and which oVirt release are you using? My guess would be latest EL, with an older oVirt?
> Y.
>
> > Stage: Misc Configuration
> > Host hv-ausa-02 installation failed. Command returned failure code 1 during SSH session 'root@10.1.90.154'.
> >
> > the following is what we are seeing in the messages log:
> >
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: libvirt: XML-RPC error : authentication failed: authentication failed
> > Feb 16 11:39:53 hv-ausa-02 libvirtd: 2018-02-16 16:39:53.761+0000: 15231: error : virNetSASLSessionListMechanisms:390 : internal error: cannot list SASL mechanisms -4 (SASL(-4): no mechanism available: Internal Error -4 in server.c near line 1757)
> > Feb 16 11:39:53 hv-ausa-02 libvirtd: 2018-02-16 16:39:53.761+0000: 15231: error : remoteDispatchAuthSaslInit:3411 : authentication failed: authentication failed
> > Feb 16 11:39:53 hv-ausa-02 libvirtd: 2018-02-16 16:39:53.761+0000: 15226: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: libvirt: XML-RPC error : authentication failed: authentication failed
> > Feb 16 11:39:53 hv-ausa-02 libvirtd: 2018-02-16 16:39:53.962+0000: 15233: error : virNetSASLSessionListMechanisms:390 : internal error: cannot list SASL mechanisms -4 (SASL(-4): no mechanism available: Internal Error -4 in server.c near line 1757)
> > Feb 16 11:39:53 hv-ausa-02 libvirtd: 2018-02-16 16:39:53.963+0000: 15233: error : remoteDispatchAuthSaslInit:3411 : authentication failed: authentication failed
> > Feb 16 11:39:53 hv-ausa-02 libvirtd: 2018-02-16 16:39:53.963+0000: 15226: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: libvirt: XML-RPC error : authentication failed: authentication failed
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: Traceback (most recent call last):
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: File "/usr/bin/vdsm-tool", line 219, in main
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: return tool_command[cmd]["command"](*args)
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/tool/upgrade_300_networks.py", line 83, in upgrade_networks
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: networks = netinfo.networks()
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/netinfo.py", line 112, in networks
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: conn = libvirtconnection.get()
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 159, in get
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: conn = _open_qemu_connection()
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 95, in _open_qemu_connection
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: return utils.retry(libvirtOpen, timeout=10, sleep=0.2)
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1108, in retry
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: return func()
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 105, in openAuth
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: if ret is None:raise libvirtError('virConnectOpenAuth() failed')
> > Feb 16 11:39:53 hv-ausa-02 vdsm-tool: libvirtError: authentication failed: authentication failed
> > Feb 16 11:39:53 hv-ausa-02 systemd: vdsm-network.service: control process exited, code=exited status=1
> > Feb 16 11:39:53 hv-ausa-02 systemd: Failed to start Virtual Desktop Server Manager network restoration.
> > Feb 16 11:39:53 hv-ausa-02 systemd: Dependency failed for Virtual Desktop Server Manager.
> > Feb 16 11:39:53 hv-ausa-02 systemd: Job vdsmd.service/start failed with result 'dependency'.
> > Feb 16 11:39:53 hv-ausa-02 systemd: Unit vdsm-network.service entered failed state.
> > Feb 16 11:39:53 hv-ausa-02 systemd: vdsm-network.service failed.
> > Feb 16 11:40:01 hv-ausa-02 systemd: Started Session 10 of user root.
> > Feb 16 11:40:01 hv-ausa-02 systemd: Starting Session 10 of user root.
> > Feb 16 11:40:01 hv-ausa-02 systemd: Started Session 11 of user root.
> > Feb 16 11:40:01 hv-ausa-02 systemd: Starting Session 11 of user root.
> >
> > Can someone point me in the right direction to resolve this - it seems to be a SASL issue perhaps?
> >
> > _______________________________________________
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
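[Editor's sketch] The SASL failures in the messages log quoted above are easy to lose among the vdsm-tool traceback and systemd lines. The snippet below is a small sketch for pulling out only the libvirtd SASL errors; the function name `sasl_errors` is hypothetical, and the default path /var/log/messages is assumed from the "messages log" mentioned in the original post.

```shell
#!/bin/sh
# Print only the libvirtd lines that report SASL setup failures.
# The two function names matched are the ones that appear in the quoted log.
sasl_errors() {
    log=${1:-/var/log/messages}
    grep -E 'libvirtd: .*(virNetSASLSessionListMechanisms|remoteDispatchAuthSaslInit)' "$log"
}
```

Running `sasl_errors | tail -n 5` right after a failed host installation shows whether the same "no mechanism available" error is still being logged, which makes it easier to tell whether a configuration or package change had any effect.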