[Users] hosted engine issues

Hi,

I have some issues with hosted engine (oVirt 3.4 prerelease repo on CentOS 6.5). My setup is the following: 2 hosts (will be 4 in the future) with 4 GlusterFS shares:
- engine (for hosted engine)
- iso (for ISO domain)
- ovirt (oVirt storage domain)

I had a split-brain situation today (after rebooting both nodes) on the hosted-engine.lockspace file on the engine GlusterFS volume, which I resolved.

Hosted engine uses the engine share via NFS (TCP), as GlusterFS isn't supported for oVirt hosted engine yet. I'll switch to GlusterFS as soon as oVirt supports it (I hope this will be soon, as RHEV 3.3 already supports GlusterFS for hosted engine).

First of all, ovirt-ha-agent fails to start on both nodes:

# service ovirt-ha-agent start
Starting ovirt-ha-agent: [ OK ]
# service ovirt-ha-agent status
ovirt-ha-agent dead but subsys locked

MainThread::INFO::2014-03-03 11:20:39,539::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 1.1.0 started
MainThread::INFO::2014-03-03 11:20:39,590::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: 10.0.200.101
MainThread::CRITICAL::2014-03-03 11:20:39,590::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could not start ha-agent
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 97, in run
    self._run_agent()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 154, in _run_agent
    hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 152, in __init__
    "STOP_VM": self._stop_engine_vm
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 56, in __init__
    logger, actions)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 93, in __init__
    self._logger = FSMLoggerAdapter(logger, self)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 16, in __init__
    super(FSMLoggerAdapter, self).__init__(logger, None)
TypeError: super() argument 1 must be type, not classobj

If I try to start my hosted engine, I receive the following error in the vdsm logs, which makes absolutely no sense to me, as there is plenty of disk space available:

Thread-62::DEBUG::2014-03-03 11:24:46,282::libvirtconnection::124::root::(wrapper) Unknown libvirterror: ecode: 38 edom: 42 level: 2 message: Failed to acquire lock: No space left on device
Thread-62::DEBUG::2014-03-03 11:24:46,282::vm::2252::vm.Vm::(_startUnderlyingVm) vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::_ongoingCreations released
Thread-62::ERROR::2014-03-03 11:24:46,283::vm::2278::vm.Vm::(_startUnderlyingVm) vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 2238, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/vm.py", line 3159, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 92, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2665, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Failed to acquire lock: No space left on device
Thread-62::DEBUG::2014-03-03 11:24:46,286::vm::2720::vm.Vm::(setDownStatus) vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::Changed state to Down: Failed to acquire lock: No space left on device

# df -h | grep engine
ovirt-host01:/engine  281G  21G  261G  8%  /rhev/data-center/mnt/ovirt-host01:_engine

# sudo -u vdsm dd if=/dev/zero of=/rhev/data-center/mnt/ovirt-host01:_engine/2851af27-8744-445d-9fb1-a0d083c8dc82/images/0e4d270f-2f7e-4b2b-847f-f114a4ba9bdc/test bs=512 count=100
100+0 records in
100+0 records out
51200 bytes (51 kB) copied, 0.0230566 s, 2.2 MB/s

Could you give me some information on how to fix the ovirt-ha-agent and then the hosted-engine storage issue?

Thanks a lot.

Btw, I had some issues during installation which I will explain in separate emails.

--
Best Regards

René Koch
Senior Solution Architect

============================================
LIS-Linuxland GmbH
Brünner Straße 163, A-1210 Vienna

Phone: +43 1 236 91 60
Mobile: +43 660 / 512 21 31
E-Mail: rkoch@linuxland.at
============================================

On 03/03/2014 11:33, René Koch wrote:
> I had a split-brain situation today (after rebooting both nodes) on the
> hosted-engine.lockspace file on the engine GlusterFS volume, which I resolved.

How did you solve it? By switching to NFS only?

> Thread-62::DEBUG::2014-03-03 11:24:46,282::libvirtconnection::124::root::(wrapper) Unknown libvirterror: ecode: 38 edom: 42 level: 2 message: Failed to acquire lock: No space left on device

This looks like a vdsm failure when starting to monitor the hosted engine storage domain. Can you attach the vdsm logs?

--
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com

Hi René,

thanks for the report.

> TypeError: super() argument 1 must be type, not classobj

What Python version are you using?

You can debug a crash of this version of ha-agent using:

/usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon --pdb

But this exception is trying to tell you that FSMLoggerAdapter(logging.LoggerAdapter) does not have object in the ancestor list. And that is very weird.

It can be related to the disk space issues.

> libvirtError: Failed to acquire lock: No space left on device

Check the free space on all your devices, including /tmp and /var. Or post the output of the "df -h" command here.

Regards

--
Martin Sivák
msivak@redhat.com
Red Hat Czech
RHEV-M SLA / Brno, CZ
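
As a minimal sketch of what the TypeError in this thread means: on Python 2, super() only accepts new-style classes, that is, classes with object among their ancestors. Assuming logging.LoggerAdapter is an old-style class on Python 2.6 (my understanding is that it only became new-style in 2.7), calling the base initializer explicitly sidesteps super() altogether. The class body below is hypothetical: only the name FSMLoggerAdapter and the __init__(logger, None) call come from the traceback, while is_new_style and the process() prefix are illustrative assumptions, not the actual ovirt-hosted-engine-ha code.

```python
import logging


def is_new_style(cls):
    # On Python 2, old-style classes have type 'classobj', so this is False
    # for them; on Python 3 every class is new-style and this is always True.
    return isinstance(cls, type)


class FSMLoggerAdapter(logging.LoggerAdapter):
    """Hypothetical stand-in for the adapter named in the traceback."""

    def __init__(self, logger, fsm):
        # Explicit base-class call instead of super(FSMLoggerAdapter, self):
        # this form does not require LoggerAdapter to inherit from object.
        logging.LoggerAdapter.__init__(self, logger, None)
        self._fsm = fsm

    def process(self, msg, kwargs):
        # Illustrative only: prefix each message with the owner's class name.
        return "(%s) %s" % (type(self._fsm).__name__, msg), kwargs


if __name__ == "__main__":
    adapter = FSMLoggerAdapter(logging.getLogger("fsm-demo"), object())
    print(is_new_style(logging.LoggerAdapter))
    print(adapter.process("state changed", {})[0])
```

Running is_new_style(logging.LoggerAdapter) under the interpreter that the agent actually uses would be a quick way to confirm or rule out this explanation.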

On 03/03/2014 12:05 PM, Martin Sivak wrote:
> Hi René,
>
> thanks for the report.
>
>> TypeError: super() argument 1 must be type, not classobj
>
> What Python version are you using?

# python --version
Python 2.6.6

> You can debug a crash of this version of ha-agent using:
>
> /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon --pdb

This gives me the same information as in vdsm.log

> But this exception is trying to tell you that FSMLoggerAdapter(logging.LoggerAdapter) does not have object in the ancestor list. And that is very weird.
>
> It can be related to the disk space issues.
>
>> libvirtError: Failed to acquire lock: No space left on device
>
> Check the free space on all your devices, including /tmp and /var. Or post the output of "df -h" command here

I can't see a full filesystem here:

# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/vg0-lv_root  5.0G  1.1G  3.6G  24% /
tmpfs                     16G     0   16G   0% /dev/shm
/dev/sda1                243M   45M  185M  20% /boot
/dev/mapper/vg0-lv_data  281G   21G  261G   8% /data
/dev/mapper/vg0-lv_tmp   2.0G   69M  1.9G   4% /tmp
/dev/mapper/vg0-lv_var   5.0G  384M  4.3G   9% /var
ovirt-host01:/engine     281G   21G  261G   8% /rhev/data-center/mnt/ovirt-host01:_engine

Thanks,
René

> Regards
>
> --
> Martin Sivák
> msivak@redhat.com
> Red Hat Czech
> RHEV-M SLA / Brno, CZ

On 03/03/2014 11:47 AM, Sandro Bonazzola wrote:
> On 03/03/2014 11:33, René Koch wrote:
>> I had a split-brain situation today (after rebooting both nodes) on the
>> hosted-engine.lockspace file on the engine GlusterFS volume, which I resolved.
>
> How did you solve it? By switching to NFS only?

I removed the file on host1 (directly on the brick) and ran "gluster volume heal engine full", which synced the file from host2 to host1.

>> Thread-62::DEBUG::2014-03-03 11:24:46,282::libvirtconnection::124::root::(wrapper) Unknown libvirterror: ecode: 38 edom: 42 level: 2 message: Failed to acquire lock: No space left on device
>
> This looks like a vdsm failure when starting to monitor the hosted engine storage domain. Can you attach the vdsm logs?

The logs are quite big for an email (6.8 MB). I attached the last entries, which show the information for the VM start.
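
Since the full vdsm.log is too large to mail, one way to cut it down is to keep only the ERROR and CRITICAL entries together with the traceback lines that follow them. This is a rough sketch, assuming every entry starts with the thread::LEVEL::timestamp::module::line::logger::(function) prefix seen in the excerpts in this thread; the regular expression and the iter_entries name are my own and are not part of vdsm.

```python
import re

# Prefix layout inferred from the log excerpts quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<func>[^)]*)\) (?P<message>.*)$"
)


def iter_entries(lines, levels=("ERROR", "CRITICAL")):
    """Yield one dict per matching entry; continuation lines go in 'extra'."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            # A new entry starts here, so flush the one being collected.
            if current is not None:
                yield current
            if match.group("level") in levels:
                current = dict(match.groupdict(), extra=[])
            else:
                current = None
        elif current is not None:
            # Lines without the prefix (tracebacks) belong to the last entry.
            current["extra"].append(line)
    if current is not None:
        yield current


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for entry in iter_entries(handle):
            print("%(time)s %(level)s %(logger)s (%(func)s) %(message)s" % entry)
            for extra_line in entry["extra"]:
                print(extra_line)
```

Saved under a name of your choice, it takes the log path as its only argument (typically /var/log/vdsm/vdsm.log on a host); the filtered output is usually small enough to attach.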
participants (3)
- Martin Sivak
- René Koch
- Sandro Bonazzola