Re: [ovirt-users] Ovirt VM Performance and CPU times

On 29.10.2014 11:48, Xavier Naveira wrote:
On 10/29/2014 11:47 AM, Xavier Naveira wrote:
On 10/29/2014 11:40 AM, Daniel Helgenberger wrote:
On 29.10.2014 10:21, Xavier Naveira wrote:
Hi,
We are migrating our infrastructure from kvm+libvirt hypervisors to oVirt.
Everything is working fine but we're noticing that all the qemu-kvm processes on the hypervisors take a lot of CPU.
Without further details of the workload this is hard to tell. One reason I can think of might be KSM [1]. Is it enabled on your cluster(s)? What is your mem over-commitment setting?
Note, IIRC the KSM policy is currently hard coded; it will start at 80% host mem usage.
[1] http://www.ovirt.org/Sla/host-mom-policy
The typical example is an idle machine: running top from inside the machine itself reports CPU usage percentages below 10% and load averages (with 2 processors) of 0.0x. The process running that machine on the hypervisor reports CPU usage in the order of 80-100%.
Should the values look like this? Why are the idle machines eating up so much CPU time?
Thank you. Xavier
Hi, thank you for the answer.
I've been trying to work out some pattern and realized that the VMs using that much CPU are all Red Hat 5.x; the Red Hat 6.x ones don't exhibit this kind of high CPU use (we run only RedHat/CentOS 5.x/6.x on the cluster).
What OS are the hosts running? In case of EL6, make sure you have tuned-0.2.19-13.el6.noarch installed [1].
To further investigate please post Engine, VDSM, libvirt and kernel versions from the hosts. [1] https://access.redhat.com/solutions/358033
I'll take a look at the KSM config.
Cheers,
Xavier
-- Daniel Helgenberger m box bewegtbild GmbH P: +49/30/2408781-22 F: +49/30/2408781-10 ACKERSTR. 19 D-10115 BERLIN www.m-box.de www.monkeymen.tv Geschäftsführer: Martin Retschitzegger / Michaela Göllner Handelsregister: Amtsgericht Charlottenburg / HRB 112767
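For reference, a quick sketch of how one might check whether KSM is actually active on an EL6 host (on oVirt hosts MOM/VDSM normally drive these services, so this is just for inspection):

  # 1 = KSM is merging pages right now, 0 = stopped
  cat /sys/kernel/mm/ksm/run
  # non-zero values here mean KSM is actually sharing pages (i.e. ksmd is spending CPU scanning/merging)
  cat /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing
  # the EL6 services that control KSM; their policy is adjusted by MOM on oVirt hosts
  service ksm status
  service ksmtuned status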

On 10/29/2014 01:26 PM, Daniel Helgenberger wrote:
> What OS are the hosts running? In case of EL6, make sure you have tuned-0.2.19-13.el6.noarch installed [1].
That's exactly the version we have on the hypervisors.
> To further investigate please post Engine, VDSM, libvirt and kernel versions from the hosts.
vdsm-xmlrpc-4.14.11.2-0.el6.noarch
vdsm-cli-4.14.11.2-0.el6.noarch
vdsm-python-4.14.11.2-0.el6.x86_64
vdsm-4.14.11.2-0.el6.x86_64
vdsm-python-zombiereaper-4.14.11.2-0.el6.noarch
libvirt-client-0.10.2-29.el6_5.12.x86_64
libvirt-0.10.2-29.el6_5.12.x86_64
libvirt-lock-sanlock-0.10.2-29.el6_5.12.x86_64
libvirt-python-0.10.2-29.el6_5.12.x86_64
2.6.32-431.23.3.el6.x86_64 #1 SMP Wed Jul 16 06:12:23 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
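(A sketch of the usual commands for collecting this kind of information on an EL6 host, in case anyone wants to post the same details:)

  rpm -qa 'vdsm*' 'libvirt*' 'qemu*' 'tuned*' | sort   # package versions
  uname -a                                             # running kernel
  tuned-adm active                                     # active tuned profile, if tuned is installed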

On 10/29/2014 03:07 PM, Xavier Naveira wrote:
I checked the value of kernel.sched_migration_cost and it is already 5000000, but it was a good shot :)
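For anyone following along, the value from that Red Hat solution can be checked and, if needed, raised like this (a sketch; the virtual-host tuned profile normally takes care of it):

  sysctl kernel.sched_migration_cost                    # EL6 default is 500000 (0.5 ms)
  sysctl -w kernel.sched_migration_cost=5000000         # raise it at runtime
  echo "kernel.sched_migration_cost = 5000000" >> /etc/sysctl.conf   # persist across reboots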

On 10/29/2014 03:07 PM, Xavier Naveira wrote:
Actually, this seems to be it, but I'm already on a newer kernel: https://bugzilla.redhat.com/show_bug.cgi?id=705082

On 29.10.2014 15:57, Xavier Naveira wrote:
> Actually, this seems to be it, but I'm already on a newer kernel: https://bugzilla.redhat.com/show_bug.cgi?id=705082
Well, I do not have such hardware, so I never ran into the issue. You could disable HT, as I suspect you have fewer than 64 physical cores?
Your workload might differ, but my VMs usually do not benefit from 'threaded' cores and I want HT disabled anyway. Also, you can check the cluster settings and disable 'count threads as cores' if enabled, but I think this might not make any difference.
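If it helps, a quick sketch for checking from the running host how many of those "CPUs" are real cores versus HT siblings, without going into the BIOS:

  # sockets, cores per socket and threads per core as seen by the kernel
  lscpu | egrep '^CPU\(s\)|Thread|Core|Socket'
  # a per-CPU sibling list with two entries means HT is enabled for that core
  cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list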

On 10/29/2014 04:06 PM, Daniel Helgenberger wrote:
> You could disable HT, as I suspect you have fewer than 64 physical cores?
Yeah, these are machines with 4 sockets, 6 cores per socket and HT enabled, so 48 "CPUs" in total. So, are you implying that the problem is the number of "CPUs"? We were hoping to add some more hypervisors to the cluster next week that have even more cores... I can probably try to disable HT when we add the next hypervisor next week, but it feels like it'd be just a workaround?

On 10/29/2014 04:29 PM, Xavier Naveira wrote:
I opened a bug at Red Hat just in case: https://bugzilla.redhat.com/show_bug.cgi?id=1158547

On 29.10.2014 16:44, Xavier Naveira wrote:
> Yeah, these are machines with 4 sockets, 6 cores per socket and HT enabled, so 48 "CPUs" in total.
Good to know; yet the largest host I have has 32 CPUs (2 sockets, 8 cores, HT enabled) and it is not showing this issue (at least I just looked and everything seems fine).
> I can probably try to disable HT when we add the next hypervisor next week, but it feels like it'd be just a workaround?
Maybe, but not a bad one, as you should not have any disadvantages.
> I opened a bug at Red Hat just in case: https://bugzilla.redhat.com/show_bug.cgi?id=1158547
I have to ask, as I cannot see the BZ because I no longer have a subscription. Against which component did you open it?

On 10/29/2014 05:15 PM, Daniel Helgenberger wrote:
> Against which component did you open it?
I did as in the original bug: kernel, and then I took KVM as the subsystem.

On our clusters used for pure computation we always keep HT off; HT was slowing down scientific calculations. I'm just curious whether there is any benefit to enabling HT on an oVirt host. At least on my system (host: CentOS 6.5, guests: a CentOS 7 LAMP-stack VM with a separate CentOS 7 MariaDB VM) I did not see any performance boost when switching HT on/off on a 2650v2.
a.

I haven't tested the performance differences with or without HT. We were running pure kvm+libvirt hosts on these machines, and we're migrating them to oVirt; that's what triggered the problem with the Red Hat 5 VMs. I'll probably give it a try and disable HT on the next hypervisor that we'll be adding to the cluster next week, and see if that solves the problem or just mitigates it.
X

On Thu, Oct 30, 2014 at 10:46 AM, Xavier Naveira <xnaveira@gmail.com> wrote:
I haven't tested the performance differences with or without HT. We were running pure kvm-libvirt hosts on these machines and we're migrating them to oVirt and that's what triggered the problem with the Redhat 5 vms.
So one point is that the hardware is the same, the problem is only related to RHEL 5.x VMs, and there was no change in BIOS settings. Did you reinstall from scratch? What about the software: which versions of Qemu/KVM and libvirt were used previously, and which versions of Qemu/KVM and libvirt now with oVirt? Also, did you compare the qemu-kvm command line generated by plain Qemu/KVM with the one instantiated by oVirt?
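A sketch of how to capture the full qemu-kvm command line on both kinds of host for that comparison (the VM name below is just a placeholder):

  # command lines of all running qemu-kvm processes
  ps -eo args | grep [q]emu-kvm
  # libvirt also logs the exact command line it built for each domain
  less /var/log/libvirt/qemu/<vm-name>.log
  # and the domain XML handed to libvirt (read-only access works without credentials)
  virsh -r dumpxml <vm-name>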

The hardware is exactly the same, no changes in BIOS, reinstalled from scratch.
The versions on the "old" ones are:
RHEL 5.6
kvm-83-224.el5
kmod-kvm-83-224.el5
kvm-tools-83-224.el5
libvirt-0.8.2-15.el5_6.1
Kernel: 2.6.18-238.1.1.el5 #1 SMP Tue Jan 4 13:32:19 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
On the oVirt "new" ones:
RHEL 6.5
qemu-img-rhev-0.12.1.2-2.415.el6_5.14.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.415.el6_5.14.x86_64
libvirt-0.10.2-29.el6_5.12.x86_64
libvirt-client-0.10.2-29.el6_5.12.x86_64
Kernel: 2.6.32-431.23.3.el6.x86_64 #1 SMP Wed Jul 16 06:12:23 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
I haven't gone into the details, but the command line generated by our old software is about 2 lines of text, while on the oVirt nodes it is about 10 for the same type of machine :)
X

On Thu, Oct 30, 2014 at 12:25 PM, Xavier Naveira <xnaveira@gmail.com> wrote:
> The hardware is exactly the same, no changes in BIOS, reinstalled from scratch.
So one "important" thing is that the hypervisor OS was RHEL 5 with plain Qemu/KVM and is now RHEL 6 with oVirt... You could check whether plain Qemu/KVM on RHEL 6.5 generates the same problems for RHEL 5.x VMs. Not that it would solve the problem itself, but it would help to pin down the different hypervisor OS version as a possible cause.

Yeah, that's definitely worth a try. The problem is that we have been running on oVirt for some months now and hadn't noticed the problem with the RHEL 5.x VMs until recently, when we began to import them (until now we mainly ran RHEL 6.x VMs), so the oVirt hypervisors are production machines and I don't have a lot of margin for doing tests on them. Hopefully next week we'll be able to decommission another of the "old" ones and then we can do some testing before adding it to the oVirt cluster... The other thing that we're going to test is disabling HT.
X

Hi,
We have installed and added a new hypervisor into the oVirt cluster, but this time with HT disabled.
I migrated a Red Hat 5.10 machine to it and immediately the qemu-kvm process running the VM (freshly installed, just basic packages) began to consume 20-40% CPU, as shown by running top on the hypervisor.
Now that I have a hypervisor to run tests on, what would you suggest the next step is?
Thank you.
Xavier
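For the tests, one way to quantify what a single idle guest costs the host is to sample its qemu-kvm process for a while (a sketch; the VM name and PID are placeholders, and pidstat needs the sysstat package):

  pgrep -f 'qemu-kvm.*<vm-name>'       # find the qemu-kvm PID for the guest
  pidstat -p <pid> 1 60                # %CPU once per second for a minute
  # or compare accumulated CPU time over a minute with plain ps
  ps -o pid,time,pcpu -p <pid>; sleep 60; ps -o pid,time,pcpu -p <pid>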

On Fri, Nov 7, 2014 at 10:48 AM, Xavier Naveira <xnaveira@gmail.com> wrote:
> Now that I have a hypervisor to run tests on, what would you suggest the next step is?
If I remember correctly, you still had to test plain Qemu/KVM on CentOS 6.5 and see whether the difference is made by oVirt itself or by the hypervisor OS changing from 5.x to 6.x... And also compare the command line (eventually both on 5.x and 6.x) between plain Qemu/KVM and oVirt-spawned VMs.
Gianluca

We tried a minimal installation from CD of Red Hat 5.10 and it is the same. This should be fairly easy to reproduce:
- Install a Red Hat 6.5 hypervisor
- Install a Red Hat 5.10 guest on it
- Enjoy your overused CPU
Is there someone with a similar setup out there?
Xavier

Ok, so we found a "solution" that I thought I'd share in case this thread pops up in a future search.
The problem was that our Red Hat 5.x guests still run a kernel that lacks the "tickless" feature, meaning that they poke the host 1000 times per second even when they have nothing to do. Disabling hyperthreading and, most importantly, adding the parameter "divider=10" to the kernel boot line in grub does indeed lower the CPU utilization on the hosts to almost 0% when the guests are idling.
In our case we use the command
sudo /sbin/grubby --update-kernel=ALL --args="divider=10"
to update the /boot/grub/grub.conf file, and then we restart the VM.
Thank you everyone for the help.
Xavier
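In case it helps someone verify the same fix, inside a RHEL 5 guest you can check the compiled-in timer frequency, confirm the divider argument is in place, and watch how fast the timer interrupt counters grow (a rough sketch):

  # RHEL 5 kernels are built with HZ=1000; divider=10 drops the effective tick rate to about 100/s
  grep CONFIG_HZ /boot/config-$(uname -r)
  # did grubby add the parameter, and did the running kernel boot with it?
  grep divider /boot/grub/grub.conf
  cat /proc/cmdline
  # timer interrupts over a 10 second window: the counts should grow roughly 10x slower with divider=10
  grep timer /proc/interrupts; sleep 10; grep timer /proc/interrupts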
participants (4):
- Arman Khalatyan
- Daniel Helgenberger
- Gianluca Cecchi
- Xavier Naveira