The 2nd node in my cluster showed an upgrade option in oVirt.
I put it in maintenance mode and ran the upgrade. It got partway through, but at some
point the node lost its internet connection or its connection within the Gluster network.
It never reached the reboot step and simply lost its connection to the engine from there.
I can see Gluster is still running and it was able to keep all 3 Gluster nodes syncing,
but it seems VDSM may be the culprit here.
ovirt-ha-agent won't start, and hosted-engine --connect-storage returns:
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/connect_storage_server.py", line 30, in <module>
    timeout=ohostedcons.Const.STORAGE_SERVER_TIMEOUT,
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 312, in connect_storage_server
    sserver.connect_storage_server(timeout=timeout)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_server.py", line 411, in connect_storage_server
    timeout=timeout,
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 474, in connect_vdsm_json_rpc
    __vdsm_json_rpc_connect(logger, timeout)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 415, in __vdsm_json_rpc_connect
    timeout=VDSM_MAX_RETRY * VDSM_DELAY
RuntimeError: Couldn't connect to VDSM within 60 seconds
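To separate "VDSM is not listening at all" from "VDSM is up but not answering", a quick port check along these lines might help. This is only a sketch: it assumes VDSM is on its default TCP port 54321 on this host, and wait_for_port is a hypothetical helper of my own, not anything shipped with oVirt. It only tests whether a TCP connection can be opened; it does not speak JSON-RPC.

```python
import socket
import time


def wait_for_port(host, port, retries=5, delay=1.0, connect_timeout=2.0):
    """Return True as soon as a TCP connection to (host, port) succeeds.

    Tries up to `retries` times, sleeping `delay` seconds between attempts,
    and returns False if every attempt fails.
    """
    for attempt in range(retries):
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # when nothing is listening or the host is unreachable.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            if attempt < retries - 1:
                time.sleep(delay)
    return False


if __name__ == "__main__":
    # Assumption: 54321 is the port VDSM listens on for this host.
    print(wait_for_port("localhost", 54321, retries=2, delay=1.0))
```

If this prints False while vdsmd claims to be running, the service is most likely dying before it ever opens its socket, which would match the restart loop.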
VDSM just keeps restarting and failing in a loop, and vdsm-tool configure --force throws this:
[root@ovirt-2 ~]# vdsm-tool configure --force
Checking configuration status...
sanlock is configured for vdsm
abrt is already configured for vdsm
Current revision of multipath.conf detected, preserving
lvm is configured for vdsm
Managed volume database is already configured
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Running configure...
libsepol.context_from_record: type insights_client_var_lib_t is not defined
libsepol.context_from_record: could not create context structure
libsepol.context_from_string: could not create context structure
libsepol.sepol_context_to_sid: could not convert system_u:object_r:insights_client_var_lib_t:s0 to sid
invalid context system_u:object_r:insights_client_var_lib_t:s0
libsemanage.semanage_validate_and_compile_fcontexts: setfiles returned error code 255.
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 209, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib/python3.6/site-packages/vdsm/tool/__init__.py", line 40, in wrapper
    func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/tool/configurator.py", line 145, in configure
    _configure(c)
  File "/usr/lib/python3.6/site-packages/vdsm/tool/configurator.py", line 92, in _configure
    getattr(module, 'configure', lambda: None)()
  File "/usr/lib/python3.6/site-packages/vdsm/tool/configurators/sebool.py", line 88, in configure
    _setup_booleans(True)
  File "/usr/lib/python3.6/site-packages/vdsm/tool/configurators/sebool.py", line 60, in _setup_booleans
    sebool_obj.finish()
  File "/usr/lib/python3.6/site-packages/seobject.py", line 340, in finish
    self.commit()
  File "/usr/lib/python3.6/site-packages/seobject.py", line 330, in commit
    rc = semanage_commit(self.sh)
OSError: [Errno 0] Error
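The libsepol lines in that output look like the real failure, with the Python traceback only being the fallout from semanage_commit. To make it easier to see which SELinux type is missing, here is a minimal sketch that pulls the undefined type names and invalid contexts out of the captured text. summarize_selinux_errors and the two regexes are my own hypothetical helpers; they only parse output like the lines pasted above and do not touch the SELinux policy.

```python
import re

# Patterns match the libsepol/semanage wording seen in the output above.
UNDEFINED_TYPE = re.compile(r"type (\S+) is not defined")
INVALID_CONTEXT = re.compile(r"invalid context (\S+)")


def summarize_selinux_errors(output):
    """Collect unique undefined types and invalid contexts from tool output."""
    return {
        "undefined_types": sorted(set(UNDEFINED_TYPE.findall(output))),
        "invalid_contexts": sorted(set(INVALID_CONTEXT.findall(output))),
    }


# Sample taken verbatim from the vdsm-tool output in this post.
SAMPLE = """\
libsepol.context_from_record: type insights_client_var_lib_t is not defined
libsepol.context_from_record: could not create context structure
invalid context system_u:object_r:insights_client_var_lib_t:s0
"""

if __name__ == "__main__":
    print(summarize_selinux_errors(SAMPLE))
```

On the output above this reports insights_client_var_lib_t as the only undefined type, which suggests the policy modules were left in a half-updated state by the interrupted upgrade, though that is a guess on my part.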
Does anyone have ideas on how I could recover this? I am not sure whether something got
corrupted during the update or on a reboot. I would prefer updating nodes from the CLI
next time, but unfortunately I have not looked into that yet; it would have made it much
easier to see what failed and where.