[ovirt-users] delete snapshots or create a template make a huge impact on VMs on SPM-node

Daniel Helgenberger daniel.helgenberger at m-box.de
Wed Oct 15 14:51:03 UTC 2014


Hi Ricky,
On 15.10.2014 14:05, Ricky Schneberger wrote:
>
> On 2014-09-25 13:57, Ricky Schneberger wrote:
>> Hi,
>>
>> When we perform tasks in oVirt such as deleting a snapshot, cloning a
>> machine, or creating a machine template, the impact on VMs residing
>> on the SPM node is huge. Is there a way to prevent this? These are
>> normal tasks that we do all the time.
>>
>
> I am bumping this thread to find out whether this is a common issue
> or whether there is some configuration we must do.
I think this is a common issue and one of the reasons why SPM goes away
in 3.6 [1].

Please see the thread from some time ago, which I have appended at the end.
Markus Stockhausen summarizes my experiences quite well:

    Lessons learned so far for us:

    - Run SPM on the smallest host (with the fewest running VMs)
    to mitigate side effects of high CPU/disk utilization.

    - Clear page cache if nodes run out of free memory.


Note Allon's comment #9:
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1116558#c9
>
>
> Regards
> -- 
> Ricky Schneberger
>
> ------------------------------------
> DANGER! Human at keyboard!

From: users-bounces at ovirt.org [users-bounces at ovirt.org] on behalf of Federico Alberto Sayd [fsayd at uncu.edu.ar]
Sent: Thursday, 24 July 2014 18:16
To: users at ovirt.org
Subject: [ovirt-users] Disk migration eats all CPU, vms running in SPM become unresponsive

Hello:

I am experiencing some trouble with oVirt nodes:

When a node is selected as SPM and I move a disk between storage
domains, the migration process seems to eat all CPU. Some VMs
(running on the SPM) hang, and others lose network connectivity. The events
tab in oVirt Engine reports the CPU exceeding the defined threshold and
then reports that VMs on that host (the SPM) are not responding.

How can I debug this? Why do the VMs become unresponsive or lose network
connectivity when the host CPU load goes too high?

I have attached a screenshot of the ovirt-engine events, and the
relevant engine.log


My setup:

oVirt Engine Version:
     3.4.0-1.el6 (Centos 6.5)
Nodes:
     Centos 6.5
     vdsm-4.14.6-0.el6
     libvirt-0.10.2-29.el6_5.9
     KVM: 0.12.1.2 - 2.415.el6_5.10 (jenkins build)

Hi, 

just a quick guess:

I would start the analysis by checking whether the SPM host is swapping
during the disk move. We had such issues during snapshot deletion. More details at
https://bugzilla.redhat.com/show_bug.cgi?id=1116558
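
One possible way to check for swapping on the SPM host while a disk move or snapshot deletion is running is to sample the kernel's cumulative swap counters. The following is only a minimal sketch, assuming a Linux host that exposes `pswpin` and `pswpout` in /proc/vmstat; the function names, the sampling interval, and the number of samples are illustrative choices, not part of oVirt or VDSM.

```python
import time


def read_swap_counters(path="/proc/vmstat"):
    """Return (pswpin, pswpout): pages swapped in/out since boot."""
    counters = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
                counters[parts[0]] = int(parts[1])
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swap_rate(before, after, seconds):
    """Pages swapped in and out per second between two samples."""
    return ((after[0] - before[0]) / seconds,
            (after[1] - before[1]) / seconds)


def watch(interval=5.0, samples=12, path="/proc/vmstat"):
    """Print the swap rate every `interval` seconds, `samples` times."""
    prev = read_swap_counters(path)
    for _ in range(samples):
        time.sleep(interval)
        cur = read_swap_counters(path)
        rate_in, rate_out = swap_rate(prev, cur, interval)
        print(f"swap-in {rate_in:.1f} pages/s, swap-out {rate_out:.1f} pages/s")
        prev = cur
```

Calling `watch()` on the SPM host during the storage operation prints one line per interval. Sustained non-zero rates would suggest the host is swapping; rates that stay at zero point away from memory pressure as the cause.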

If you encounter BSODs in Windows VMs during that operation, have
a look at https://bugzilla.redhat.com/show_bug.cgi?id=1110305

Lessons learned so far for us:

- Run SPM on the smallest host (with the fewest running VMs)
to mitigate side effects of high CPU/disk utilization.

- Clear page cache if nodes run out of free memory.
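
As a rough illustration of the page cache point, the Linux kernel frees clean page cache when `1` is written to /proc/sys/vm/drop_caches (`2` frees dentries and inodes, `3` frees both). The sketch below assumes it is run as root on the affected node; the helper name `drop_page_cache` is hypothetical and not something oVirt provides.

```python
import os


def drop_page_cache(path="/proc/sys/vm/drop_caches", level="1"):
    """Ask the kernel to drop clean caches. Requires root on a real host.

    level "1" = page cache, "2" = dentries and inodes, "3" = both.
    """
    if level not in ("1", "2", "3"):
        raise ValueError("level must be '1', '2' or '3'")
    # Flush dirty pages first; only clean pages can be dropped.
    os.sync()
    with open(path, "w") as f:
        f.write(level + "\n")
```

Dropping caches does not lose data, but subsequent reads have to go back to disk, so it may be best used only when a node is actually short on free memory.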

Markus

-- 
Daniel Helgenberger
m box bewegtbild GmbH

P: +49/30/2408781-22
F: +49/30/2408781-10

ACKERSTR. 19
D-10115 BERLIN


www.m-box.de  www.monkeymen.tv

Managing Directors: Martin Retschitzegger / Michaela Göllner
Commercial Register: Amtsgericht Charlottenburg / HRB 112767



