Re: [Users] Vm's being paused

Monday, 24 February 2014

----- Original Message -----
...
 From: "Neil" <nwilson123(a)gmail.com>
 To: dron(a)redhat.com
 Cc: "users" <users(a)ovirt.org>
 Sent: Monday, February 24, 2014 5:02:04 PM
 Subject: Re: [Users] Vm's being paused

 Hi Dafna,

 My sincere apologies for not coming back to you sooner on this. I've
 finally had a chance to start investigating, but in between my last
 discussion and now, updates have been done on both the hosts and the
 engine, so perhaps something there has fixed it, as I haven't had a
 pause happen in quite a long time.

 When trying to gather the info you requested above I think I've found
 what is causing all the excessive logging...that I sent through
 previously...

 I have a VM called Proxy, which a few years back ran out of disk
 space, and wouldn't boot, as it required an fsck, but we'd get an
 unknown storage error when doing an fsck on the image, so we had to
 attach a new LUN and dd out the entire image, then run an fsck, and
 then re-import the image, which got the VM operational again. A while
 back we tried to remove the old disk image, and received a storage
 error, and looking at this now I see that it appears the old image
 never successfully removed.  If I look at the VM under Disks I can see
 the old disk still attached in place, but there is an hourglass
 instead of a green arrow showing. Also right clicking on the Disk the
 only option you can choose is Add, so something seems to still have
 this locked.

 In the logs I have the same error showing over and over...

 AttributeError: 'Drive' object has no attribute 'format'
 Thread-313::DEBUG::2014-02-24
 16:44:30,056::libvirtconnection::108::libvirtconnection::(wrapper)
 Unknown libvirterror: ecode: 8 edom: 10 level: 2 message: invalid
 argument: invalid path

/rhev/data-center/mnt/blockSD/0e6991ae-6238-4c61-96d2-ca8fed35161e/images/6128b18f-eee9-422e-bc8a-f3b9fe331b09/38ac4afa-22e9-4359-ac16-3ff5d7b3b6db
 not assigned to domain
 Thread-313::ERROR::2014-02-24
 16:44:30,057::sampling::355::vm.Vm::(collect)
 vmId=`23b9212c-1e25-4003-aa18-b1e819bf6bb1`::Stats function failed:
 <AdvancedStatsFunction _highWrite at 0x1c9de30>
 Traceback (most recent call last):
   File "/usr/share/vdsm/sampling.py", line 351, in collect
     statsFunction()
   File "/usr/share/vdsm/sampling.py", line 226, in __call__
     retValue = self._function(*args, **kwargs)
   File "/usr/share/vdsm/vm.py", line 528, in _highWrite
     self._vm.extendDrivesIfNeeded()
   File "/usr/share/vdsm/vm.py", line 2288, in extendDrivesIfNeeded
     capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
   File "/usr/share/vdsm/vm.py", line 841, in f
     ret = attr(*args, **kwargs)
   File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py",
 line 76, in wrapper
     ret = f(*args, **kwargs)
   File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1814, in
   blockInfo
     if ret is None: raise libvirtError ('virDomainGetBlockInfo()
 failed', dom=self)
 libvirtError: invalid argument: invalid path

/rhev/data-center/mnt/blockSD/0e6991ae-6238-4c61-96d2-ca8fed35161e/images/6128b18f-eee9-422e-bc8a-f3b9fe331b09/38ac4afa-22e9-4359-ac16-3ff5d7b3b6db
 not assigned to domain

 Any ideas on how to get rid of the "corrupt" disk finally?

This may happen when you migrate a VM between hosts running different versions of vdsm.
Vdsm recently changed the path to the disk, so after such a migration some disks are not
found where vdsm thinks they should be. This leads to the missing 'format' attribute and
the libvirt errors when vdsm tries to check the status of such disks.
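
To see which VMs are hitting this, you can count the "Stats function failed" entries per VM id in the vdsm log. This is only a rough sketch: the regular expression is based on the log lines quoted above, and count_failed_stats is a made-up helper name, not part of vdsm.

```python
import re
from collections import Counter

# Pattern taken from the quoted log line:
#   vmId=`<uuid>`::Stats function failed: ...
FAILED_STATS = re.compile(r"vmId=`([0-9a-f-]+)`::Stats function failed")


def count_failed_stats(log_text):
    """Return a Counter mapping VM id -> number of failed stats collections."""
    return Counter(FAILED_STATS.findall(log_text))


# Sample entry copied from the log excerpt quoted above.
sample = (
    "Thread-313::ERROR::2014-02-24 16:44:30,057::sampling::355::vm.Vm::(collect) "
    "vmId=`23b9212c-1e25-4003-aa18-b1e819bf6bb1`::Stats function failed: "
    "<AdvancedStatsFunction _highWrite at 0x1c9de30>\n"
)

for vm_id, hits in count_failed_stats(sample).items():
    print("%s: %d" % (vm_id, hits))
```

In practice you would pass the contents of the host's vdsm log instead of the sample string.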

This issue is fixed in upstream:
http://gerrit.ovirt.org/24202

And in ovirt-3.4:
http://gerrit.ovirt.org/24324

I think the best way to avoid this issue is to run the same vdsm version on all
hosts in the same cluster.
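
A quick way to check this could look like the sketch below. It assumes you have already collected the installed vdsm package version from each host (for example with "rpm -q vdsm"); the host names and version strings here are invented for illustration, and the helper names are hypothetical.

```python
from collections import defaultdict


def group_hosts_by_version(versions):
    """Group host names by the vdsm version string they report."""
    by_version = defaultdict(list)
    for host, version in sorted(versions.items()):
        by_version[version].append(host)
    return dict(by_version)


def has_mixed_versions(versions):
    """Return True if the hosts do not all report the same vdsm version."""
    return len(set(versions.values())) > 1


# Hypothetical data: host names and versions are made up for this example.
cluster = {
    "host01": "vdsm-4.13.0-11.el6",
    "host02": "vdsm-4.13.3-3.el6",
    "host03": "vdsm-4.13.3-3.el6",
}

if has_mixed_versions(cluster):
    for version, hosts in sorted(group_hosts_by_version(cluster).items()):
        print("%s: %s" % (version, ", ".join(hosts)))
```

If this reports more than one version, update the hosts that lag behind before migrating VMs between them.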

Nir
