
----- Original Message -----
From: "Neil" <nwilson123@gmail.com>
To: dron@redhat.com
Cc: "users" <users@ovirt.org>
Sent: Monday, February 24, 2014 5:02:04 PM
Subject: Re: [Users] Vm's being paused

Hi Dafna,
My sincere apologies for not coming back to you sooner on this. I've finally had a chance to start investigating, but in between my last discussion and now, updates have been done on both the hosts and the engine, so perhaps something there has fixed it, as I haven't had a pause happen in quite a long time.
When trying to gather the info you requested above, I think I've found what is causing all the excessive logging that I sent through previously.
I have a VM called Proxy which ran out of disk space a few years back and wouldn't boot, as it required an fsck. We got an unknown storage error when running fsck on the image, so we had to attach a new LUN, dd out the entire image, run an fsck, and then re-import the image, which got the VM operational again. A while back we tried to remove the old disk image and received a storage error; looking at this now, it appears the old image was never successfully removed. If I look at the VM under Disks, I can see the old disk still attached, but it shows an hourglass instead of a green arrow. Also, when right-clicking on the disk, the only option available is Add, so something still seems to have it locked.
In the logs I have the same error showing over and over...
AttributeError: 'Drive' object has no attribute 'format'
Thread-313::DEBUG::2014-02-24 16:44:30,056::libvirtconnection::108::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 8 edom: 10 level: 2 message: invalid argument: invalid path /rhev/data-center/mnt/blockSD/0e6991ae-6238-4c61-96d2-ca8fed35161e/images/6128b18f-eee9-422e-bc8a-f3b9fe331b09/38ac4afa-22e9-4359-ac16-3ff5d7b3b6db not assigned to domain
Thread-313::ERROR::2014-02-24 16:44:30,057::sampling::355::vm.Vm::(collect) vmId=`23b9212c-1e25-4003-aa18-b1e819bf6bb1`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x1c9de30>
Traceback (most recent call last):
  File "/usr/share/vdsm/sampling.py", line 351, in collect
    statsFunction()
  File "/usr/share/vdsm/sampling.py", line 226, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/vm.py", line 528, in _highWrite
    self._vm.extendDrivesIfNeeded()
  File "/usr/share/vdsm/vm.py", line 2288, in extendDrivesIfNeeded
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/share/vdsm/vm.py", line 841, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1814, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: invalid argument: invalid path /rhev/data-center/mnt/blockSD/0e6991ae-6238-4c61-96d2-ca8fed35161e/images/6128b18f-eee9-422e-bc8a-f3b9fe331b09/38ac4afa-22e9-4359-ac16-3ff5d7b3b6db not assigned to domain
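As an aside, a rough sketch of how the repeated "invalid path ... not assigned to domain" message could be tallied from a saved copy of the vdsm log, to confirm that it always points at the same disk. The function name and the reading of the three UUIDs as storage domain, image and volume are assumptions inferred from the path above, not something taken from vdsm itself.

```python
import re
from collections import Counter

# Assumed layout, inferred from the path in the log above:
# /rhev/data-center/mnt/blockSD/<storage domain>/images/<image>/<volume>
PATH_RE = re.compile(
    r"invalid path /rhev/data-center/mnt/blockSD/"
    r"(?P<domain>[0-9a-f-]{36})/images/"
    r"(?P<image>[0-9a-f-]{36})/(?P<volume>[0-9a-f-]{36})"
    r" not assigned to domain"
)


def summarize_invalid_paths(log_text):
    """Count how often each (domain, image, volume) triple is reported.

    Hypothetical helper: it only scans text for the error message shown
    above and does not talk to vdsm or libvirt.
    """
    counts = Counter()
    for match in PATH_RE.finditer(log_text):
        key = (match.group("domain"), match.group("image"), match.group("volume"))
        counts[key] += 1
    return counts
```

If every hit carries the same image and volume UUID, the noise most likely comes from a single disk, such as the one that failed to be removed.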
Any ideas on how to get rid of the "corrupt" disk finally?
This may happen when you migrate a VM between hosts running different versions of vdsm. Vdsm recently changed the path to the disk, so when you migrate a VM, some disks are not found where vdsm thinks they should be. This leads to the missing format attribute and to libvirt errors when trying to check the status of such disks.

This issue is fixed upstream: http://gerrit.ovirt.org/24202
And in ovirt-3.4: http://gerrit.ovirt.org/24324

I think the best way to avoid this issue is to have the same vdsm version on all hosts in the same cluster.

Nir
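
To illustrate the advice about keeping one vdsm version per cluster, here is a minimal sketch of a consistency check. It assumes the version strings have already been collected by hand (for example by running `rpm -q vdsm` on each host); the host names, version strings and function names below are made up for illustration.

```python
from collections import defaultdict


def group_hosts_by_version(host_versions):
    """Map each reported vdsm version to the sorted list of hosts running it."""
    groups = defaultdict(list)
    for host, version in sorted(host_versions.items()):
        groups[version].append(host)
    return dict(groups)


def versions_match(host_versions):
    """Return True when every host reports the same vdsm version string."""
    return len(set(host_versions.values())) <= 1


if __name__ == "__main__":
    # Hypothetical cluster: host names and versions are placeholders.
    cluster = {
        "node01": "vdsm-4.13.3-3",
        "node02": "vdsm-4.13.3-3",
        "node03": "vdsm-4.13.0-11",
    }
    if not versions_match(cluster):
        for version, hosts in group_hosts_by_version(cluster).items():
            print(version, "->", ", ".join(hosts))
```

A mismatch reported by a check like this would be a hint to update the lagging hosts before migrating VMs between them.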