[Users] Starting VM gets paused

Nicolas Ecarnot nicolas at ecarnot.net
Sun Apr 21 08:02:10 UTC 2013


On 20/04/2013 22:55, Itamar Heim wrote:
> On 03/27/2013 10:38 AM, Nicolas Ecarnot wrote:
>> On 26/03/2013 12:17, Nicolas Ecarnot wrote:
>>> On 25/03/2013 12:10, Nicolas Ecarnot wrote:
>>>> On 24/03/2013 09:53, Dafna Ron wrote:
>>>>> is the vm preallocated or thin provision disk type?
>>>>
>>>> This VM has 3 disks :
>>>> - first disk to host the windows system : Thin provision
>>>> - second disk to store some data : Preallocated
>>>> - third disk to store some more data : Thin provision
>>>>
>>>> I'm realizing that amongst the 15 VMs, only this one and another one
>>>> that is stopped are using preallocated disks.
>>>> I'm regularly migrating some VMs (and stopping and starting and playing
>>>> with them) with no issue, and they all are using thin provisioned
>>>> disks!
>>>>
>>>> Could this be a common factor of the problem?
>>>>
>>>>>
>>>>> also, can you please attach engine, vdsm, libvirt and the vm's qemu
>>>>> logs?
>>>>
>>>> Relevant logs :
>>>>
>>>> ############
>>>>
>>>> Ok, I'm in the process of collecting the logs and posting them in a
>>>> usable manner.
>>>>
>>>> More to come.
>>>
>>> Ok, once again, I ran a test and observed the relevant logs.
>>> I tried to isolate the time frames, but it may be long for vdsm.log
>>>
>>> Here they are :
>>> * /var/log/libvirt/qemu/serv-chk-adm3.log
>>> http://pastebin.com/JVKMSmxD
>>> * /var/log/libvirtd.log
>>> http://pastebin.com/sWGDCqNh
>>> * /var/log/vdsm/vdsm.log (the BIG one)
>>> http://pastebin.com/bevTEhym
>>>
>>> What I can add to help you help me is that:
>>> - I saw that all my VMs appear as tainted. I did not know what that
>>> meant (but RTFMed since), and this does not appear to disturb the
>>> other VMs
>>> - Many VMs, including the problematic one, have been imported from
>>> ovirt-v2v with no such issue.
>>> - This particular VM was also imported, but the starting point was a
>>> single vmdk or ova file.
>>> - Two additional data disks were added
>>> - As I said, this is the only running VM stored as preallocated.
>>>
>>> Regards,
>>>
>>
>> One suggestion : I see no obvious errors in the log files. Could this
>> paused state happen due to a VM's kernel panic?
>>
>
> is this still relevant?

It is!
Further investigation by my colleague has shown the following facts:
- This VM has 3 disks. Only one of those disks is responsible for the
problem.
- On this disk, my coworker has found only 3 files (database files) that
he cannot touch in any way without triggering the freeze.
- He tried to cat them into /dev/null, and this leads to the freeze.
- He tried to copy them to another disk -> freeze!

We see absolutely no evidence of a kernel panic.
Rather, this seems to be related to a network bottleneck between the
node and the iSCSI SAN, leaving oVirt unable to sustain sufficient
bandwidth, which results in the VM being frozen.
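
For anyone wanting to repeat the read test with timing, here is a
minimal sketch, assuming Python 3 is available wherever the disk is
mounted. It is not what we actually ran (we only used cat); the names
timed_read, throughput_mib_s, CHUNK_SIZE and REPORT_EVERY are made up
for illustration. It reads a file sequentially, discards the data, and
prints progress so that the last offset reached before a stall is
visible.

```python
import sys
import time

CHUNK_SIZE = 1024 * 1024          # 1 MiB per read (arbitrary choice)
REPORT_EVERY = 64 * 1024 * 1024   # print progress every 64 MiB (arbitrary)


def timed_read(path, chunk_size=CHUNK_SIZE, report_every=REPORT_EVERY):
    """Read `path` sequentially, discard the data,
    and return (bytes_read, elapsed_seconds)."""
    total = 0
    next_report = report_every
    start = time.monotonic()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            if total >= next_report:
                # flush so the last offset is visible if the VM pauses
                print(f"  {path}: {total} bytes read so far", flush=True)
                next_report += report_every
    return total, time.monotonic() - start


def throughput_mib_s(total_bytes, seconds):
    """Average throughput in MiB/s; infinite if no time elapsed."""
    if seconds <= 0:
        return float("inf")
    return total_bytes / (1024 * 1024) / seconds


if __name__ == "__main__":
    # usage: python3 readtest.py FILE [FILE ...]
    for path in sys.argv[1:]:
        n, secs = timed_read(path)
        rate = throughput_mib_s(n, secs)
        print(f"{path}: {n} bytes in {secs:.2f} s ({rate:.1f} MiB/s)")
```

Running it against one of the 3 problematic files and against a healthy
file of similar size would give a rough comparison, although it cannot
by itself confirm or rule out the bottleneck hypothesis.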

Since then, we have moved to another solution, but for the sake of
open-source debugging, we have kept the faulty VM for your eyes only :)

-- 
Nicolas Ecarnot


