----- Original Message -----
From: "Neil" <nwilson123(a)gmail.com>
To: dron(a)redhat.com
Cc: "users" <users(a)ovirt.org>
Sent: Monday, February 24, 2014 5:02:04 PM
Subject: Re: [Users] Vm's being paused
Hi Dafna,
My sincere apologies for not coming back to you sooner on this. I've
finally had a chance to start investigating, but in between my last
discussion and now, updates have been done on both the hosts and the
engine, so perhaps something there has fixed it, as I haven't had a
pause happen in quite a long time.
When trying to gather the info you requested above I think I've found
what is causing all the excessive logging...that I sent through
previously...
I have a VM called Proxy, which a few years back ran out of disk
space, and wouldn't boot, as it required an fsck, but we'd get an
unknown storage error when doing an fsck on the image, so we had to
attach a new LUN and dd out the entire image, then run an fsck, and
then re-import the image, which got the VM operational again. A while
back we tried to remove the old disk image, and received a storage
error, and looking at this now I see that it appears the old image
was never successfully removed. If I look at the VM under Disks I can see
the old disk still attached in place, but there is an hourglass
instead of a green arrow showing. Also right clicking on the Disk the
only option you can choose is Add, so something seems to still have
this locked.
In the logs I have the same error showing over and over...
AttributeError: 'Drive' object has no attribute 'format'
Thread-313::DEBUG::2014-02-24 16:44:30,056::libvirtconnection::108::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 8 edom: 10 level: 2 message: invalid argument: invalid path /rhev/data-center/mnt/blockSD/0e6991ae-6238-4c61-96d2-ca8fed35161e/images/6128b18f-eee9-422e-bc8a-f3b9fe331b09/38ac4afa-22e9-4359-ac16-3ff5d7b3b6db not assigned to domain
Thread-313::ERROR::2014-02-24 16:44:30,057::sampling::355::vm.Vm::(collect) vmId=`23b9212c-1e25-4003-aa18-b1e819bf6bb1`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x1c9de30>
Traceback (most recent call last):
  File "/usr/share/vdsm/sampling.py", line 351, in collect
    statsFunction()
  File "/usr/share/vdsm/sampling.py", line 226, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/vm.py", line 528, in _highWrite
    self._vm.extendDrivesIfNeeded()
  File "/usr/share/vdsm/vm.py", line 2288, in extendDrivesIfNeeded
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/share/vdsm/vm.py", line 841, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1814, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: invalid argument: invalid path /rhev/data-center/mnt/blockSD/0e6991ae-6238-4c61-96d2-ca8fed35161e/images/6128b18f-eee9-422e-bc8a-f3b9fe331b09/38ac4afa-22e9-4359-ac16-3ff5d7b3b6db not assigned to domain
Any ideas on how to get rid of the "corrupt" disk finally?
This may happen when you migrate a VM between hosts running different versions of vdsm.
Vdsm recently changed the path to the disk, so after a migration some disks are not
found where vdsm thinks they should be. This leads to the missing format attribute and
to libvirt errors when vdsm tries to check the status of such disks.
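To see which disks are affected, the "invalid path ... not assigned to domain"
messages can be pulled out of the vdsm log. The following is only a rough
sketch, not part of vdsm: the function names are made up for illustration, and
the path layout (storage domain, image and volume UUIDs) is assumed from the
path quoted in the log above.

```python
import re

# Matches the libvirt message quoted in this thread. \s+ also matches
# newlines, so the pattern still works if the log line was wrapped.
INVALID_PATH = re.compile(r"invalid path\s+(\S+)\s+not assigned to domain")


def find_unassigned_paths(log_text):
    """Return the distinct disk paths reported as not assigned to a domain."""
    seen = []
    for path in INVALID_PATH.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


def split_block_path(path):
    """Split a block-domain path into (storage domain, image, volume) UUIDs.

    Assumed layout, taken from the path in the log:
    .../blockSD/<sd_uuid>/images/<image_uuid>/<volume_uuid>
    """
    parts = path.rstrip("/").split("/")
    return parts[-4], parts[-2], parts[-1]
```

Feeding it the contents of the vdsm log on the host gives one entry per broken
disk instead of the same traceback repeated over and over.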
This issue is fixed upstream:
http://gerrit.ovirt.org/24202
And in ovirt-3.4:
http://gerrit.ovirt.org/24324
I think the best way to avoid this issue is to run the same vdsm version on all
hosts in the same cluster.
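One way to spot a mixed cluster is to collect the installed vdsm package
version from each host (for example with "rpm -q vdsm") and compare them. A
minimal sketch, assuming you already have the versions in a dict; the host
names and version strings below are placeholders, not real data:

```python
from collections import defaultdict


def group_hosts_by_version(host_versions):
    """Map each vdsm version string to the sorted list of hosts running it."""
    groups = defaultdict(list)
    for host, version in sorted(host_versions.items()):
        groups[version].append(host)
    return dict(groups)


def cluster_is_uniform(host_versions):
    """True when every host reports the same vdsm version."""
    return len(set(host_versions.values())) <= 1


if __name__ == "__main__":
    # Hypothetical input: host name -> output of "rpm -q vdsm" on that host.
    versions = {
        "host-a": "vdsm-version-1",
        "host-b": "vdsm-version-2",
    }
    if not cluster_is_uniform(versions):
        for version, hosts in group_hosts_by_version(versions).items():
            print(version, hosts)
```

If more than one version shows up, upgrading the older hosts before migrating
VMs between them should avoid the problem.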
Nir