Hello,
Today I updated my oVirt engine v3.5 and all the hosts in one datacenter (the CentOS 7.4 ones),
and suddenly the vdsm and vdsm-network services stopped working.
By the way, my other DC is CentOS 6 based (managed from the same oVirt engine), and everything works just fine there.
vdsm fails because it depends on the vdsm-network service, which fails with lots of RPC errors.
I tried running vdsm-tool configure --force, then deleted everything (vdsm-libvirt) and reinstalled.
I could not make it work.
Sep 18 23:06:02 node6 vdsm-tool[5340]: libvirt: XML-RPC error : authentication failed: Failed to start SASL negotiation: -1 (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No Kerberos credent
Sep 18 23:06:02 node6 vdsm-tool[5340]: Traceback (most recent call last):
Sep 18 23:06:02 node6 vdsm-tool[5340]: File "/usr/bin/vdsm-tool", line 219, in main
Sep 18 23:06:02 node6 libvirtd[4312]: 2017-09-18 20:06:02.558+0000: 4312: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
Sep 18 23:06:02 node6 vdsm-tool[5340]: return tool_command[cmd]["command"](*args)
Sep 18 23:06:02 node6 vdsm-tool[5340]: File "/usr/lib/python2.7/site-packages/vdsm/tool/upgrade_300_networks.py", line 83, in upgrade_networks
Sep 18 23:06:02 node6 vdsm-tool[5340]: networks = netinfo.networks()
Sep 18 23:06:02 node6 vdsm-tool[5340]: File "/usr/lib/python2.7/site-packages/vdsm/netinfo.py", line 112, in networks
Sep 18 23:06:02 node6 vdsm-tool[5340]: conn = libvirtconnection.get()
Sep 18 23:06:02 node6 vdsm-tool[5340]: File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 159, in get
Sep 18 23:06:02 node6 vdsm-tool[5340]: conn = _open_qemu_connection()
Sep 18 23:06:02 node6 vdsm-tool[5340]: File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 95, in _open_qemu_connection
Sep 18 23:06:02 node6 vdsm-tool[5340]: return utils.retry(libvirtOpen, timeout=10, sleep=0.2)
Sep 18 23:06:02 node6 vdsm-tool[5340]: File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1108, in retry
Sep 18 23:06:02 node6 vdsm-tool[5340]: return func()
Sep 18 23:06:02 node6 vdsm-tool[5340]: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 105, in openAuth
Sep 18 23:06:02 node6 vdsm-tool[5340]: if ret is None:raise libvirtError('virConnectOpenAuth() failed')
Sep 18 23:06:02 node6 vdsm-tool[5340]: libvirtError: authentication failed: Failed to start SASL negotiation: -1 (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No Kerberos credentials availa
Sep 18 23:06:02 node6 systemd[1]: vdsm-network.service: control process exited, code=exited status=1
Sep 18 23:06:02 node6 systemd[1]: Failed to start Virtual Desktop Server Manager network restoration.
-----
libvirt is running but throws some errors.
[root@node6 ~]# systemctl status libvirtd
● libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/libvirtd.service.d
└─unlimited-core.conf
Active: active (running) since Mon 2017-09-18 23:15:47 +03; 19min ago
Docs: man:libvirtd(8)
http://libvirt.org
Main PID: 6125 (libvirtd)
CGroup: /system.slice/libvirtd.service
└─6125 /usr/sbin/libvirtd --listen
Sep 18 23:15:56 node6 libvirtd[6125]: 2017-09-18 20:15:56.195+0000: 6125: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
Sep 18 23:15:56 node6 libvirtd[6125]: 2017-09-18 20:15:56.396+0000: 6125: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
Sep 18 23:15:56 node6 libvirtd[6125]: 2017-09-18 20:15:56.597+0000: 6125: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
----------------
[root@node6 ~]# virsh
Welcome to virsh, the virtualization interactive terminal.
Type: 'help' for help with commands
'quit' to quit
virsh # list
error: failed to connect to the hypervisor
error: authentication failed: Failed to start SASL negotiation: -1 (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No Kerberos credentials available (default cache: KEYRING:persistent:0)))
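The GSSAPI message above suggests that libvirt's SASL layer is attempting Kerberos authentication. One thing that may be worth checking is which mechanism the SASL configuration for libvirt lists. The sketch below is only a diagnostic aid: it assumes the configuration lives in /etc/sasl2/libvirt.conf and uses a `mech_list:` line, and the helper names `sasl_mechanisms` and `uses_gssapi` are made up for illustration. It does not confirm the cause and changes nothing on the host.

```python
import re


def sasl_mechanisms(conf_text):
    """Return the mechanisms on the first active (uncommented) mech_list line.

    Returns an empty list when no active mech_list line is present.
    """
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        match = re.match(r"mech_list\s*:\s*(.*)$", stripped)
        if match:
            return match.group(1).lower().split()
    return []


def uses_gssapi(conf_text):
    """True when the active mech_list includes gssapi (Kerberos)."""
    return "gssapi" in sasl_mechanisms(conf_text)


if __name__ == "__main__":
    # Assumed location of libvirt's SASL config on the host.
    path = "/etc/sasl2/libvirt.conf"
    try:
        with open(path) as handle:
            text = handle.read()
    except (IOError, OSError) as exc:
        print("could not read %s: %s" % (path, exc))
    else:
        print("active mechanisms: %s" % sasl_mechanisms(text))
        print("gssapi configured: %s" % uses_gssapi(text))
```

If this reports gssapi on a host that has no Kerberos setup, that would be consistent with the "No Kerberos credentials available" error, though other causes are possible.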
=================
I do not want to lose all my virtual servers. Is there any way to recover them? Currently everything is down. I am OK with installing a new oVirt engine if I can somehow restore my virtual servers. I can also split the CentOS 6 and CentOS 7 datacenters across separate oVirt engines.