On 03/03/2014 11:33, René Koch wrote:
Hi,
I have some issues with hosted engine (oVirt 3.4 prerelease repo on CentOS 6.5).
My setup is the following:
2 hosts (will be 4 in the future) with 4 GlusterFS shares:
- engine (for hosted engine)
- iso (for ISO domain)
- ovirt (oVirt storage domain)
I had a split-brain situation today (after rebooting both nodes) on the
hosted-engine.lockspace file on the engine GlusterFS volume, which I resolved.
How did you solve it? By switching to NFS only?
The hosted engine uses the engine share via NFS (TCP), as GlusterFS isn't
supported for oVirt hosted engine yet. I'll switch to GlusterFS as soon as oVirt
supports it (I hope this will be soon, as RHEV 3.3 already supports GlusterFS for
hosted engine).
First of all, ovirt-ha-agent fails to start on both nodes:
# service ovirt-ha-agent start
Starting ovirt-ha-agent: [ OK ]
# service ovirt-ha-agent status
ovirt-ha-agent dead but subsys locked
MainThread::INFO::2014-03-03 11:20:39,539::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 1.1.0 started
MainThread::INFO::2014-03-03 11:20:39,590::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: 10.0.200.101
MainThread::CRITICAL::2014-03-03 11:20:39,590::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could not start ha-agent
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 97, in run
    self._run_agent()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 154, in _run_agent
    hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 152, in __init__
    "STOP_VM": self._stop_engine_vm
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 56, in __init__
    logger, actions)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 93, in __init__
    self._logger = FSMLoggerAdapter(logger, self)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 16, in __init__
    super(FSMLoggerAdapter, self).__init__(logger, None)
TypeError: super() argument 1 must be type, not classobj
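The TypeError in this traceback is what Python 2 raises when super() is given an old-style class. As far as I know, logging.LoggerAdapter was still an old-style class in Python 2.6 (the version in the paths above), so a subclass of it cannot use super(). One possible workaround, sketched below, is to call the base initializer directly. The class name comes from the traceback; the fsm attribute and the process() override are illustrative assumptions, not the actual ovirt-hosted-engine-ha code or its eventual fix.

```python
import logging


class FSMLoggerAdapter(logging.LoggerAdapter):
    """Illustrative stand-in for the adapter named in the traceback."""

    def __init__(self, logger, fsm):
        # Call the base initializer explicitly instead of going through
        # super(); this works for old-style and new-style base classes alike.
        logging.LoggerAdapter.__init__(self, logger, None)
        self.fsm = fsm  # hypothetical attribute, kept only for illustration

    def process(self, msg, kwargs):
        # Hypothetical behaviour: prefix each message with the FSM's name.
        return "(%s) %s" % (self.fsm, msg), kwargs


adapter = FSMLoggerAdapter(logging.getLogger("demo"), "EngineStateMachine")
print(adapter.process("agent started", {}))
```

The explicit base-class call keeps the same arguments as the failing line (logger, None), so only the dispatch mechanism changes.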
If I want to start my hosted engine, I receive the following error in the vdsm logs,
which makes absolutely no sense to me, as there is plenty of disk space available:
Thread-62::DEBUG::2014-03-03 11:24:46,282::libvirtconnection::124::root::(wrapper) Unknown libvirterror: ecode: 38 edom: 42 level: 2 message: Failed to acquire lock: No space left on device
This looks like vdsm failing to start monitoring the hosted engine storage domain.
Can you attach the vdsm logs?
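Note that "No space left on device" is just the standard description of errno 28 (ENOSPC), so the text alone does not prove that the filesystem is full; the "Failed to acquire lock" prefix points at the locking layer instead. A small sketch for pulling the fields out of such a vdsm log line follows. The regular expression and the function name are my own assumptions, modelled on the single line quoted in this thread, and are not part of any vdsm or libvirt API.

```python
import errno
import os
import re

# Pattern modelled on the one log line quoted in this thread (an assumption).
LIBVIRT_ERR = re.compile(
    r"Unknown libvirterror: ecode: (?P<ecode>\d+) edom: (?P<edom>\d+) "
    r"level: (?P<level>\d+) message: (?P<message>.*)$"
)


def parse_libvirt_error(line):
    """Return the error fields of an 'Unknown libvirterror' line, or None."""
    match = LIBVIRT_ERR.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("ecode", "edom", "level"):
        fields[key] = int(fields[key])
    return fields


line = ("Thread-62::DEBUG::2014-03-03 11:24:46,282::libvirtconnection::124::"
        "root::(wrapper) Unknown libvirterror: ecode: 38 edom: 42 level: 2 "
        "message: Failed to acquire lock: No space left on device")

info = parse_libvirt_error(line)
print(info)
# The trailing text is the generic description of ENOSPC on this platform.
print(errno.ENOSPC, os.strerror(errno.ENOSPC))
```

Running this over the full vdsm log would show whether every failure carries the same lock-related message, which is more telling than the df output.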
Thread-62::DEBUG::2014-03-03 11:24:46,282::vm::2252::vm.Vm::(_startUnderlyingVm) vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::_ongoingCreations released
Thread-62::ERROR::2014-03-03 11:24:46,283::vm::2278::vm.Vm::(_startUnderlyingVm) vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 2238, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/vm.py", line 3159, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 92, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2665, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Failed to acquire lock: No space left on device
Thread-62::DEBUG::2014-03-03 11:24:46,286::vm::2720::vm.Vm::(setDownStatus) vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::Changed state to Down: Failed to acquire lock: No space left on device
# df -h | grep engine
ovirt-host01:/engine 281G 21G 261G 8% /rhev/data-center/mnt/ovirt-host01:_engine
# sudo -u vdsm dd if=/dev/zero of=/rhev/data-center/mnt/ovirt-host01:_engine/2851af27-8744-445d-9fb1-a0d083c8dc82/images/0e4d270f-2f7e-4b2b-847f-f114a4ba9bdc/test bs=512 count=100
100+0 records in
100+0 records out
51200 bytes (51 kB) copied, 0.0230566 s, 2.2 MB/s
Could you give me some information on how to fix the ovirt-ha-agent and then the
hosted-engine storage issue? Thanks a lot.
Btw, I had some issues during installation which I will explain in separate emails.
--
Sandro Bonazzola