[ovirt-users] Disk migration eats all CPU, vms running in SPM become unresponsive

Markus Stockhausen stockhausen at collogia.de
Thu Jul 24 13:02:52 EDT 2014


> From: users-bounces at ovirt.org [users-bounces at ovirt.org] on behalf of Federico Alberto Sayd [fsayd at uncu.edu.ar]
> Sent: Thursday, July 24, 2014 18:16
> To: users at ovirt.org
> Subject: [ovirt-users] Disk migration eats all CPU, vms running in SPM become unresponsive
> 
> Hello:
> 
> I am experiencing some troubles with ovirt nodes:
> 
> When a node is selected as SPM and I move a disk between storage
> domains, it seems that the migration process eats all the CPU; some VMs
> (running on the SPM) hang and others lose network connectivity. The events
> tab in the oVirt Engine reports the CPU exceeding the defined threshold and
> then reports that the VMs on that host (the SPM) are not responding.
> 
> How can I debug this? Why do the VMs become unresponsive or lose network
> connectivity when the host CPU usage goes too high?
> 
> I have attached a screenshot of the ovirt-engine events, and the
> relevant engine.log
> 
> 
> My setup:
> 
> oVirt Engine Version:
>      3.4.0-1.el6 (Centos 6.5)
> Nodes:
>      Centos 6.5
>      vdsm-4.14.6-0.el6
>      libvirt-0.10.2-29.el6_5.9
>      KVM: 0.12.1.2-2.415.el6_5.10 (jenkins build)
> 

Hi, 

just a quick guess:

I would start the analysis by checking whether the SPM is swapping during
the disk move. We had similar issues during snapshot deletion. More details at
https://bugzilla.redhat.com/show_bug.cgi?id=1116558
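
A quick way to watch for that on the SPM host while the move runs would be
something like the following (nothing oVirt-specific, just the standard
tools on a CentOS 6 node):

    # watch swap-in/swap-out (the si/so columns) once per second
    vmstat 1

    # or compare the kernel's swap counters and free memory before/after
    grep -E 'pswpin|pswpout' /proc/vmstat
    free -m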

If you encounter BSODs on Windows VMs during that operation, have 
a look at https://bugzilla.redhat.com/show_bug.cgi?id=1110305

Lessons learned so far for us:

- Run the SPM on the smallest host (with the fewest running VMs)
to mitigate the side effects of high CPU/disk utilization.

- Clear the page cache if nodes run out of free memory (see below).
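
For that last point, what we run is roughly the following (as root on the
node; drop_caches only discards clean cache pages, it does not free dirty
pages or application memory):

    # flush dirty pages to disk first, then drop the clean page cache
    sync
    echo 1 > /proc/sys/vm/drop_caches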

Markus