oVirt 3.5 / CentOS 7 / GlusterFS - Host loses network config on reboot

I have a three-node cluster that worked fine after setup, but intermittently (it has happened three times now) a host loses its network config when it is rebooted. I backed up the ifcfg*, route*, and rule* files so I can quickly put things back in place, but even then vdsm fails to start, and a host reinstall from the engine errors out on vdsm startup.

The only way I have found to recover the node is to run yum remove vdsm* libvirt* and then rm -Rf on /etc/vdsm and /var/lib/vdsm. After that, I can run host reinstall from the engine.

This is where the error occurs in the host-deploy log:

2015-01-20 01:21:58 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:866 execute-output: ('/bin/systemctl', 'start', 'vdsmd.service') stderr:
Job for vdsmd.service failed. See 'systemctl status vdsmd.service' and 'journalctl -xn' for details.
2015-01-20 01:21:58 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-6z1CuMPhsV/pythonlib/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/tmp/ovirt-6z1CuMPhsV/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 219, in _start
    self.services.state('vdsmd', True)
  File "/tmp/ovirt-6z1CuMPhsV/otopi-plugins/otopi/services/systemd.py", line 138, in state
    'start' if state else 'stop'
  File "/tmp/ovirt-6z1CuMPhsV/otopi-plugins/otopi/services/systemd.py", line 77, in _executeServiceCommand
    raiseOnError=raiseOnError
  File "/tmp/ovirt-6z1CuMPhsV/pythonlib/otopi/plugin.py", line 871, in execute
    command=args[0],
RuntimeError: Command '/bin/systemctl' failed to execute
2015-01-20 01:21:58 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Closing up': Command '/bin/systemctl' failed to execute
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:490 ENVIRONMENT DUMP - BEGIN
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/error=bool:'True'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>, RuntimeError("Command '/bin/systemctl' failed to execute",), <traceback object at 0x259cbd8>)]'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:504 ENVIRONMENT DUMP - END
2015-01-20 01:21:58 INFO otopi.context context.runSequence:417 Stage: Pre-termination
2015-01-20 01:21:58 DEBUG otopi.context context.runSequence:421 STAGE pre-terminate
2015-01-20 01:21:58 DEBUG otopi.context context._executeMethod:138 Stage pre-terminate METHOD otopi.plugins.otopi.core.misc.Plugin._preTerminate
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:490 ENVIRONMENT DUMP - BEGIN
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/aborted=bool:'False'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/debug=int:'0'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/error=bool:'True'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/exceptionInfo=list:'[(<type 'exceptions.RuntimeError'>, RuntimeError("Command '/bin/systemctl' failed to execute",), <traceback object at 0x259cbd8>)]'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/executionDirectory=str:'/root'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/log=bool:'True'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/pluginGroups=str:'otopi:ovirt-host-deploy'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/pluginPath=str:'/tmp/ovirt-6z1CuMPhsV/otopi-plugins'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV BASE/suppressEnvironmentKeys=list:'[]'
2015-01-20 01:21:58 DEBUG otopi.context context.dumpEnvironment:500 ENV COMMAND/chkconfig=str:'/sbin/chkconfig'
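
For reference, the backup of the ifcfg*, route and rule* files mentioned at the top of this message could look roughly like the sketch below. This is only an illustration, not the exact commands used on these hosts: the function name backup_netcfg and the destination directory are hypothetical, and the source path assumes the stock CentOS 7 location /etc/sysconfig/network-scripts. It does nothing about the vdsmd startup failure itself.

```shell
#!/bin/sh
# Sketch only: copy ifcfg-*, route-* and rule-* files from a source
# directory into a backup directory. backup_netcfg is a hypothetical
# helper name; both paths are supplied by the caller.
backup_netcfg() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/ifcfg-* "$src"/route-* "$src"/rule-*; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        # -p keeps mode and timestamps so a restore matches the original.
        cp -p "$f" "$dest"/ || return 1
    done
}
```

Usage, with an assumed backup location: backup_netcfg /etc/sysconfig/network-scripts /root/netcfg-backup. Restoring is the same call with the two arguments swapped.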
participants (1)
-
Tim Macy