[Engine-devel] Mom Balloon policy issue
Yaniv Kaul
ykaul at redhat.com
Wed Oct 10 09:33:20 UTC 2012
On 10/10/2012 11:23 AM, Noam Slomianko wrote:
> Regarding qemu-kvm memory allocation behaviour, this is the clearest explanation I found:
>
> "Unless using explicit hugepages for the guest, KVM will
> *not* initially allocate all RAM from the host OS. Each
> page of RAM is only allocated when the guest OS touches
> that page. So if you set memory=8G and currentMemory=4G,
> although you might expect resident usage of KVM on the
> host to be 4 G, it can in fact be pretty much any value
> between 0 and 4 GB depending on whether the guest OS
> has touched the memory pages, or as much as 8 GB if the
> guest OS has no balloon driver and touches all memory
> immediately."[1]
Windows zeros out all memory on startup, so it essentially touches all
of its pages right away.
Y.
>
> And as far as I've seen, it behaves the same after the guest is ballooned and then deflated.
>
> Conclusion: the time at which a guest is allocated its resident memory is not deterministic; it depends on usage and on the guest OS.
> For example, Windows 7 uses all available memory as cache while Fedora does not.
> So an inactive guest running Windows 7 will consume all possible resident memory right away, while an inactive guest running Fedora might never grow in memory size.
>
> [1] http://www.redhat.com/archives/virt-tools-list/2011-January/msg00001.html
>
> ------------------
> Noam Slomianko
> Red Hat Enterprise Virtualization, SLA team
>
> ----- Original Message -----
> From: "Adam Litke" <agl at us.ibm.com>
> To: "Noam Slomianko" <nslomian at redhat.com>
> Cc: "Doron Fediuck" <dfediuck at redhat.com>, vdsm-devel at lists.fedorahosted.org, engine-devel at ovirt.org
> Sent: Tuesday, October 9, 2012 8:06:02 PM
> Subject: Re: Mom Balloon policy issue
>
> Thanks for writing this. Some thoughts inline, below. Also, cc'ing some lists
> in case other folks want to participate in the discussion.
>
> On Tue, Oct 09, 2012 at 01:12:30PM -0400, Noam Slomianko wrote:
>> Greetings,
>>
>> I've fiddled around with ballooning and wanted to raise a question for debate.
>>
>> Currently, as long as the host is under memory pressure, MOM will try to reclaim memory from every guest with more free memory than a given threshold.
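>>
>> A minimal sketch of that selection rule in Python (the guest attribute names are illustrative, not MOM's actual stats):
>>
>>     def balloon_candidates(guests, free_threshold):
>>         # Under host memory pressure, MOM targets every guest whose
>>         # free memory exceeds the configured threshold.
>>         return [g for g in guests if g.free_mem > free_threshold]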
>>
>> Main issue: guest allocated memory is not the same as the resident (physical) memory used by qemu.
>> This means that when memory is reclaimed (the balloon is inflated) we might not get as much memory back as planned (or none at all).
>>
>> *Example 1: no memory is reclaimed:
>> name | allocated memory | used by the VM | resident memory used on the host by qemu
>> Vm1  | 4G               | 4G             | 4G
>> Vm2  | 4G               | 1G             | 1G
>> - MOM will inflate the balloon in Vm2 (as Vm1 has no free memory) and will gain no memory, since Vm2's qemu RSS is already only 1G
> One thing to keep in mind is that VMs having less host RSS than their memory
> allocation is a temporary condition. All VMs will eventually consume their full
> allocation if allowed to run. I'd be curious to know how long this process
> takes in general.
>
> We might be able to handle this case by refusing to inflate the balloon if:
> (VM free memory - planned balloon inflation) > host RSS
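>
> A minimal sketch of that check in Python (variable names are illustrative):
>
>     def should_inflate(vm_free, planned_inflation, host_rss):
>         # If the guest's free memory minus the planned inflation still
>         # exceeds the qemu RSS on the host, the pages we would reclaim
>         # are largely not resident anyway, so refuse to inflate.
>         return (vm_free - planned_inflation) <= host_rss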
>
>
>> *Example 2: memory is reclaimed only partially:
>> name | allocated memory | used by the VM | resident memory used on the host by qemu
>> Vm1  | 4G               | 4G             | 4G
>> Vm2  | 4G               | 1G             | 1G
>> Vm3  | 4G               | 1G             | 4G
>> - MOM will inflate the balloons in Vm2 and Vm3, slowly gaining memory only from Vm3
> The above rule extension may help here too.
>
>> This behaviour might cause us to:
>> * spend time reclaiming memory from many guests when we can actually reclaim only from a subset
>> * be under the impression that we have more potential memory to reclaim than we actually do
>> * bring inactive VMs dangerously low on memory as they are constantly reclaimed from (I've had guests crash from kernel out-of-memory errors)
>>
>>
>> To address this I suggest that we collect guest memory stats from libvirt as well, so we have the option to use them in our calculations.
>> This can be achieved with the command "virsh dommemstat <domain>", which returns (values in KiB):
>> actual 3915372 (allocated)
>> rss 2141580 (resident memory used by qemu)
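>>
>> The same two numbers are also available programmatically through the libvirt Python bindings, e.g. (the domain name here is only an example):
>>
>>     import libvirt
>>
>>     conn = libvirt.open('qemu:///system')
>>     dom = conn.lookupByName('Vm2')
>>     stats = dom.memoryStats()    # values are reported in KiB
>>     actual = stats['actual']     # balloon size, e.g. 3915372
>>     rss = stats.get('rss')       # resident memory used by qemu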
> I would suggest adding these two fields to the VmStats that are collected by
> vdsm. Then, to try it out, add the fields to the GuestMemory Collector. (Note:
> MOM does have a collector that gathers RSS for VMs; it's called GuestQemuProc.)
> You can then extend the Balloon policy with a snippet that checks whether the
> proposed balloon adjustment should be carried out. You could add the logic to
> the change_big_enough function.
>
>> additional topic:
>> * should we include per guest config (for example a hard minimum memory cap, this vm cannot run effectively with less then 1G memory)
> Yes. This is probably something we want to do. There is a whole topic around
> VM tagging that we should consider. In the future we will want to be able to do
> many different things in policy based on a VM's tag. For example, some VMs may
> be completely exempt from ballooning. Others may have a minimum limit.
>
> I want to avoid passing in the raw guest configuration because MOM needs to work
> with direct libvirt VMs and with ovirt/vdsm VMs. Therefore, we want to think
> carefully about the abstractions we use when presenting VM properties to MOM.
>