Hi All,
A question regarding memory management with oVirt. I know memory
accounting can be complicated, hence I'm asking the experts. :)
Below are two examples where, to me, oVirt's view of host memory looks
incorrect. As a result we are not getting as much out of a host as
we'd expect.
## Example 1:
host: dev-cluster-04
I understand the mem on the host to be:
128G total (physical)
68G used
53G available
56G buff/cache
I understand therefore 53G should still be available to allocate
(approximately, minus a few things).
```
DEV [root@dev-cluster-04:~] # free -m
              total        used        free      shared  buff/cache   available
Mem:         128741       68295        4429        4078       56016       53422
Swap:         12111        1578       10533
DEV [root@dev-cluster-04:~] # cat /proc/meminfo
MemTotal: 131831292 kB
MemFree: 4540852 kB
MemAvailable: 54709832 kB
Buffers: 3104 kB
Cached: 5174136 kB
SwapCached: 835012 kB
Active: 66943552 kB
Inactive: 5980340 kB
Active(anon): 66236968 kB
Inactive(anon): 5713972 kB
Active(file): 706584 kB
Inactive(file): 266368 kB
Unevictable: 50036 kB
Mlocked: 54132 kB
SwapTotal: 12402684 kB
SwapFree: 10786688 kB
Dirty: 812 kB
Writeback: 0 kB
AnonPages: 67068548 kB
Mapped: 143880 kB
Shmem: 4176328 kB
Slab: 52183680 kB
SReclaimable: 49822156 kB
SUnreclaim: 2361524 kB
KernelStack: 20000 kB
PageTables: 213628 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 78318328 kB
Committed_AS: 110589076 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 859104 kB
VmallocChunk: 34291324976 kB
HardwareCorrupted: 0 kB
AnonHugePages: 583680 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 621088 kB
DirectMap2M: 44439552 kB
DirectMap1G: 91226112 kB
```
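To make the comparison concrete, here is a rough sketch of the arithmetic using the /proc/meminfo values above. The names `meminfo_kb`, `to_mb` and the candidate definitions of "free" are mine, for illustration only; I am not claiming any of them is what the engine or VDSM actually computes.

```python
# Values copied from the /proc/meminfo output above (kB).
meminfo_kb = {
    "MemTotal": 131831292,
    "MemFree": 4540852,
    "MemAvailable": 54709832,
    "Buffers": 3104,
    "Cached": 5174136,
    "SReclaimable": 49822156,
}

ENGINE_FREE_MB = 9012  # "free" figure shown by the engine for this host


def to_mb(kb):
    """Convert kB to whole MB using integer division by 1024."""
    return kb // 1024


# Candidate definitions of "free" (hypothetical, for comparison only).
candidates = {
    "MemFree": meminfo_kb["MemFree"],
    "MemFree + Buffers + Cached": (
        meminfo_kb["MemFree"] + meminfo_kb["Buffers"] + meminfo_kb["Cached"]
    ),
    "MemAvailable": meminfo_kb["MemAvailable"],
}

for name, kb in candidates.items():
    mb = to_mb(kb)
    diff = ENGINE_FREE_MB - mb
    print(f"{name:28s} {mb:7d} MB  (engine figure minus this: {diff:+d} MB)")

slab_mb = to_mb(meminfo_kb["SReclaimable"])
print(f"{'SReclaimable (slab)':28s} {slab_mb:7d} MB")
```

On these numbers the engine's 9012 MB sits much closer to MemFree + Buffers + Cached than to MemAvailable, and most of the gap to MemAvailable is about the size of the reclaimable slab. The samples were taken at slightly different moments, so an exact match is not expected either way.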
The oVirt engine's Compute -> Hosts view shows dev-cluster-04 as 93%
memory utilised.
Clicking on the node says:
Physical Memory: 128741 MB total, 119729 MB used, 9012 MB free
So ovirt engine says 9G free. The OS reports 4G free but 53G
available. Surely ovirt should be looking at available memory?
This becomes a problem when, for instance, I try to run a VM called
dev-cassandra-01 (memory size 24576 MB, max memory 24576 MB, memory
guarantee 10240 MB) on this host. It fails with:
```
Cannot run VM. There is no host that satisfies current scheduling
constraints. See below for details:
The host dev-cluster-04.fnb.co.za did not satisfy internal filter
Memory because its available memory is too low (19884 MB) to run the
VM.
```
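As a small sanity check on that message, the sketch below compares the three memory settings of the VM against the figure the filter reported and against what the OS calls available. The variable names are mine, and this only restates the numbers above; it does not model how the scheduler arrives at 19884 MB.

```python
# Values from the VM definition and the scheduler message above (MB).
filter_available_mb = 19884
os_available_mb = 53422  # "available" column from free -m on the host

vm_settings_mb = {
    "memory size": 24576,
    "max memory": 24576,
    "memory guarantee": 10240,
}

for name, mb in vm_settings_mb.items():
    fits_filter = mb <= filter_available_mb
    fits_os = mb <= os_available_mb
    print(
        f"{name:17s} {mb:6d} MB  "
        f"fits filter figure: {fits_filter}  fits OS available: {fits_os}"
    )

# How far the configured memory size exceeds the filter's figure.
shortfall_mb = vm_settings_mb["memory size"] - filter_available_mb
print(f"shortfall against filter figure: {shortfall_mb} MB")
```

The memory size exceeds the filter's figure while the guarantee is well under it, so the filter does not appear to be checking the guarantee alone. All three settings fit comfortably within what the OS reports as available.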
To me this looks blatantly wrong. The host has 53G available according
to free -m.
I'm guessing I'm missing something, unless this is some sort of bug?
Versions:
```
engine: 4.3.7.2-1.el7
host:
OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0 - 957.12.1.el7.x86_64
KVM Version: 2.12.0 - 18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.7
VDSM Version: vdsm-4.30.13-1.el7
SPICE Version: 0.14.0 - 6.el7_6.1
GlusterFS Version: [N/A]
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1, SSBD: 3
VNC Encryption: Disabled
```
## Example 2:
An oVirt host with two VMs.
According to the host, it has 128G of physical memory, of which 56G is
used, 69G is buff/cache and 65G is available, as shown here:
```
LIVE [root@prod-cluster-01:~] # cat /proc/meminfo
MemTotal: 131326836 kB
MemFree: 2630812 kB
MemAvailable: 66573596 kB
Buffers: 2376 kB
Cached: 5670628 kB
SwapCached: 151072 kB
Active: 59106140 kB
Inactive: 2744176 kB
Active(anon): 58099732 kB
Inactive(anon): 2327428 kB
Active(file): 1006408 kB
Inactive(file): 416748 kB
Unevictable: 40004 kB
Mlocked: 42052 kB
SwapTotal: 4194300 kB
SwapFree: 3579492 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 56085040 kB
Mapped: 121816 kB
Shmem: 4231808 kB
Slab: 65143868 kB
SReclaimable: 63145684 kB
SUnreclaim: 1998184 kB
KernelStack: 25296 kB
PageTables: 148336 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 69857716 kB
Committed_AS: 76533164 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 842296 kB
VmallocChunk: 34291404724 kB
HardwareCorrupted: 0 kB
AnonHugePages: 55296 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 722208 kB
DirectMap2M: 48031744 kB
DirectMap1G: 87031808 kB
LIVE [root@prod-cluster-01:~] # free -m
              total        used        free      shared  buff/cache   available
Mem:         128248       56522        2569        4132       69157       65013
Swap:          4095         600        3495
```
However, the Compute -> Hosts screen in oVirt shows this node at 94%
memory utilisation.
Clicking compute -> hosts -> prod-cluster-01 -> general says:
Physical Memory: 128248 MB total, 120553 MB used, 7695 MB free
Swap Size: 4095 MB total, 600 MB used, 3495 MB free
The physical memory figures above make no sense to me, unless they
include caches, which I would think they shouldn't.
This host has just two VMs:
```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf list
 Id    Name                           State
----------------------------------------------------
 35    prod-box-18                    running
 36    prod-box-11                    running
```
Moreover, each VM has 32G of memory set in every possible place, from
what I can see.
```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf dumpxml prod-box-11|grep -i mem
<ovirt-vm:memGuaranteedSize type="int">32768</ovirt-vm:memGuaranteedSize>
<ovirt-vm:minGuaranteedMemoryMb type="int">32768</ovirt-vm:minGuaranteedMemoryMb>
<memory unit='KiB'>33554432</memory>
<currentMemory unit='KiB'>33554432</currentMemory>
<cell id='0' cpus='0-27' memory='33554432' unit='KiB'/>
<suspend-to-mem enabled='no'/>
<model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
<memballoon model='virtio'>
</memballoon>
```
prod-box-11 is, however, configured as a High Performance VM, which
might be relevant.
Same for the other VM:
```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf dumpxml prod-box-18|grep -i mem
<ovirt-vm:memGuaranteedSize type="int">32768</ovirt-vm:memGuaranteedSize>
<ovirt-vm:minGuaranteedMemoryMb type="int">32768</ovirt-vm:minGuaranteedMemoryMb>
<memory unit='KiB'>33554432</memory>
<currentMemory unit='KiB'>33554432</currentMemory>
<cell id='0' cpus='0-27' memory='33554432' unit='KiB'/>
<suspend-to-mem enabled='no'/>
<model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
<memballoon model='virtio'>
</memballoon>
```
So I understand that two VMs, each allocated 32G of RAM, should
consume approximately 64G of RAM on the host. The host has 128G of RAM,
so usage should be around 50%. However, oVirt is reporting 94% usage.
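Spelled out as a quick calculation, using only figures from the output above (the variable names and the `percent` helper are just mine, for illustration):

```python
# Figures from the host above, in MB unless noted.
total_mb = 128248               # physical memory per free -m and the engine
engine_used_mb = 120553         # "used" according to the engine
vm_memory_mb = [32768, 32768]   # prod-box-18 and prod-box-11, from dumpxml

anon_pages_kb = 56085040        # AnonPages from /proc/meminfo
sreclaimable_kb = 63145684      # SReclaimable from /proc/meminfo


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


expected_pct = percent(sum(vm_memory_mb), total_mb)
engine_pct = percent(engine_used_mb, total_mb)
anon_pct = percent(anon_pages_kb // 1024, total_mb)
slab_pct = percent(sreclaimable_kb // 1024, total_mb)

print(f"expected from VM sizes:  {expected_pct}%")
print(f"reported by the engine:  {engine_pct}%")
print(f"AnonPages share:         {anon_pct}%")
print(f"SReclaimable share:      {slab_pct}%")
```

The two VMs account for roughly half of the host, as expected, and anonymous memory is in the same ballpark. Reclaimable slab alone is close to another half of the host, which would be enough to explain the engine's 94% if it is being counted as used. That last part is my guess, not something the output confirms.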
Versions:
```
engine: 4.3.5.5-1.el7
host:
OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0 - 957.10.1.el7.x86_64
KVM Version: 2.12.0 - 18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.6
VDSM Version: vdsm-4.30.11-1.el7
SPICE Version: 0.14.0 - 6.el7_6.1
GlusterFS Version: [N/A]
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1
VNC Encryption: Disabled
```
Thanks for any insights!
--
Divan Santana
https://divansantana.com