
I am running vdsm from packages, as my interest is in developing for the engine, not vdsm. I updated the vdsm package in an attempt to solve this; now I have:

# rpm -q vdsm
vdsm-4.10.3-10.fc18.x86_64

I noticed that when the storage domain crashes I can't even run "df -h" (it hangs). I'm also getting some errors in /var/log/messages:

Mar 24 19:57:44 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:45 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:46 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:47 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:48 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:49 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:50 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:51 bufferoverflow sanlock[1208]: 2013-03-24 19:57:51+0200 7412 [4759]: 1083422e close_task_aio 0 0x7ff3740008c0 busy
Mar 24 19:57:51 bufferoverflow sanlock[1208]: 2013-03-24 19:57:51+0200 7412 [4759]: 1083422e close_task_aio 1 0x7ff374000910 busy
Mar 24 19:57:51 bufferoverflow sanlock[1208]: 2013-03-24 19:57:51+0200 7412 [4759]: 1083422e close_task_aio 2 0x7ff374000960 busy
Mar 24 19:57:51 bufferoverflow sanlock[1208]: 2013-03-24 19:57:51+0200 7412 [4759]: 1083422e close_task_aio 3 0x7ff3740009b0 busy
Mar 24 19:57:51 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:52 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:53 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:54 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:55 bufferoverflow vdsm SuperVdsmProxy WARNING Connect to svdsm failed [Errno 2] No such file or directory
Mar 24 19:57:55 bufferoverflow vdsm Storage.Misc ERROR Panic: Couldn't connect to supervdsm
Mar 24 19:57:55 bufferoverflow respawn: slave '/usr/share/vdsm/vdsm' died, respawning slave
Mar 24 19:57:55 bufferoverflow vdsm fileUtils WARNING Dir /rhev/data-center/mnt already exists
Mar 24 19:57:58 bufferoverflow vdsm vds WARNING Unable to load the json rpc server module. Please make sure it is installed.
Mar 24 19:57:58 bufferoverflow vdsm vm.Vm WARNING vmId=`4d3d81b3-d083-4569-acc2-8e631ed51843`::Unknown type found, device: '{'device': u'unix', 'alias': u'channel0', 'type': u'channel', 'address': {u'bus': u'0', u'controller': u'0', u'type': u'virtio-serial', u'port': u'1'}}' found
Mar 24 19:57:58 bufferoverflow vdsm vm.Vm WARNING vmId=`4d3d81b3-d083-4569-acc2-8e631ed51843`::Unknown type found, device: '{'device': u'unix', 'alias': u'channel1', 'type': u'channel', 'address': {u'bus': u'0', u'controller': u'0', u'type': u'virtio-serial', u'port': u'2'}}' found
Mar 24 19:57:58 bufferoverflow vdsm vm.Vm WARNING vmId=`4d3d81b3-d083-4569-acc2-8e631ed51843`::_readPauseCode unsupported by libvirt vm
Mar 24 19:57:58 bufferoverflow kernel: [ 7402.688177] ata1: hard resetting link
Mar 24 19:57:59 bufferoverflow kernel: [ 7402.994510] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Mar 24 19:57:59 bufferoverflow kernel: [ 7403.005510] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20120711/psargs-359)
Mar 24 19:57:59 bufferoverflow kernel: [ 7403.005517] ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT0.SPT0._GTF] (Node ffff880407c74d48), AE_NOT_FOUND (20120711/psparse-536)
Mar 24 19:57:59 bufferoverflow kernel: [ 7403.015485] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20120711/psargs-359)
Mar 24 19:57:59 bufferoverflow kernel: [ 7403.015493] ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT0.SPT0._GTF] (Node ffff880407c74d48), AE_NOT_FOUND (20120711/psparse-536)
Mar 24 19:57:59 bufferoverflow kernel: [ 7403.016061] ata1.00: configured for UDMA/133
Mar 24 19:57:59 bufferoverflow kernel: [ 7403.016066] ata1: EH complete
Mar 24 19:58:01 bufferoverflow sanlock[1208]: 2013-03-24 19:58:01+0200 7422 [4759]: 1083422e close_task_aio 0 0x7ff3740008c0 busy
Mar 24 19:58:01 bufferoverflow sanlock[1208]: 2013-03-24 19:58:01+0200 7422 [4759]: 1083422e close_task_aio 1 0x7ff374000910 busy
Mar 24 19:58:01 bufferoverflow sanlock[1208]: 2013-03-24 19:58:01+0200 7422 [4759]: 1083422e close_task_aio 2 0x7ff374000960 busy
Mar 24 19:58:01 bufferoverflow sanlock[1208]: 2013-03-24 19:58:01+0200 7422 [4759]: 1083422e close_task_aio 3 0x7ff3740009b0 busy
Mar 24 19:58:01 bufferoverflow kernel: [ 7405.714145] device-mapper: table: 253:0: multipath: error getting device
Mar 24 19:58:01 bufferoverflow kernel: [ 7405.714148] device-mapper: ioctl: error adding target to table
Mar 24 19:58:01 bufferoverflow kernel: [ 7405.715051] device-mapper: table: 253:0: multipath: error getting device
Mar 24 19:58:01 bufferoverflow kernel: [ 7405.715053] device-mapper: ioctl: error adding target to table

ata1 is a 500 GB SSD (the only SATA device on the system apart from a DVD drive).
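The "df -h" hang is typical of an unresponsive NFS mount: df calls statvfs() on every mount point, and a dead mount blocks that call indefinitely. A suspect mount point can be probed without freezing the shell; the sketch below is a minimal example, assuming only the Python standard library, with the path (taken from the fileUtils log line above) and the timeout as placeholders:

    import os
    import threading

    def mount_responds(path, timeout=5.0):
        """Return True if statvfs(path) completes within timeout seconds.

        A hung NFS mount blocks statvfs() forever, which is why plain
        "df -h" appears to freeze; running the call on a daemon thread
        lets us give up instead of blocking with it.
        """
        result = []

        def probe():
            try:
                os.statvfs(path)
                result.append(True)
            except OSError:
                result.append(False)  # reachable but broken

        t = threading.Thread(target=probe)
        t.daemon = True
        t.start()
        t.join(timeout)
        return bool(result) and result[0]  # empty result means still hung

    if __name__ == "__main__":
        # /rhev/data-center/mnt is where vdsm keeps storage domain mounts
        print(mount_responds("/rhev/data-center/mnt"))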
Yuval

On Sun, Mar 24, 2013 at 2:52 PM, Dan Kenigsberg <danken@redhat.com> wrote:

On Fri, Mar 22, 2013 at 08:24:35PM +0200, Limor Gavish wrote:
Hello,
I am using oVirt 3.2 on Fedora 18:

[wil@bufferoverflow ~]$ rpm -q vdsm
vdsm-4.10.3-7.fc18.x86_64
(the engine is built from sources).
I seem to have hit this bug: https://bugzilla.redhat.com/show_bug.cgi?id=922515
This bug is only one part of the problem, but it's nasty enough that I have just suggested a fix for it on the ovirt-3.2 branch of vdsm: http://gerrit.ovirt.org/13303
Could you test whether, with it, vdsm relinquishes its SPM role and recovers as operational?
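One way to watch that from the host is to poll getSpmStatus through vdsClient, which ships with vdsm. This is a minimal sketch; the pool UUID is a placeholder, and can be read with "vdsClient -s 0 getConnectedStoragePoolsList":

    import subprocess
    import time

    # Placeholder: replace with your storage pool UUID, as printed by
    # "vdsClient -s 0 getConnectedStoragePoolsList" on the host.
    SPUUID = "REPLACE-WITH-POOL-UUID"

    while True:
        # getSpmStatus reports the host's SPM state (SPM/Contend/Free)
        # for the given pool; watch it change while the domain recovers.
        out = subprocess.check_output(
            ["vdsClient", "-s", "0", "getSpmStatus", SPUUID]).decode()
        print(time.strftime("%H:%M:%S"), out.strip())
        time.sleep(10)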
in the following configuration:
- Single host (no migrations)
- Created a VM, installed an OS (Fedora 18) inside, and stopped the VM.
- Created a template from it.
- Created an additional VM from the template using thin provisioning.
- Started the second VM.
In addition to the errors in the logs, the storage domains (both data and ISO) crashed, i.e., they went to the "unknown" and "inactive" states respectively (see the attached engine.log).
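For reference, the same states can be read outside the UI. The sketch below is a minimal example using the Python SDK, assuming ovirt-engine-sdk-python against a source-built engine listening at http://localhost:8080/api; the URL and credentials are placeholders:

    from ovirtsdk.api import API

    # Placeholders: point at your engine and use real credentials.
    api = API(url="http://localhost:8080/api",
              username="admin@internal",
              password="PASSWORD",
              insecure=True)

    # Attached domains report their state per data center, so walk
    # the data centers rather than the top-level collection.
    for dc in api.datacenters.list():
        for sd in dc.storagedomains.list():
            print(dc.get_name(), sd.get_name(), sd.get_status().get_state())

    api.disconnect()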
I attached the VDSM and engine logs.
Is there a way to work around this problem? It happens repeatedly.
Yuval Meir