
Hi all,

A question regarding memory management with oVirt. I know memory can be complicated, hence I'm asking the experts. :)

Below are two examples where, to me, oVirt's memory accounting looks incorrect. As a result we are not getting as much out of a host as we'd expect.

## Example 1:

host: dev-cluster-04

I understand the memory on the host to be: 128G total (physical), 68G used, 53G available, 56G buff/cache.

I therefore understand that approximately 53G should still be available to allocate (minus a few things).

```
DEV [root@dev-cluster-04:~] # free -m
              total        used        free      shared  buff/cache   available
Mem:         128741       68295        4429        4078       56016       53422
Swap:         12111        1578       10533
DEV [root@dev-cluster-04:~] # cat /proc/meminfo
MemTotal: 131831292 kB
MemFree: 4540852 kB
MemAvailable: 54709832 kB
Buffers: 3104 kB
Cached: 5174136 kB
SwapCached: 835012 kB
Active: 66943552 kB
Inactive: 5980340 kB
Active(anon): 66236968 kB
Inactive(anon): 5713972 kB
Active(file): 706584 kB
Inactive(file): 266368 kB
Unevictable: 50036 kB
Mlocked: 54132 kB
SwapTotal: 12402684 kB
SwapFree: 10786688 kB
Dirty: 812 kB
Writeback: 0 kB
AnonPages: 67068548 kB
Mapped: 143880 kB
Shmem: 4176328 kB
Slab: 52183680 kB
SReclaimable: 49822156 kB
SUnreclaim: 2361524 kB
KernelStack: 20000 kB
PageTables: 213628 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 78318328 kB
Committed_AS: 110589076 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 859104 kB
VmallocChunk: 34291324976 kB
HardwareCorrupted: 0 kB
AnonHugePages: 583680 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 621088 kB
DirectMap2M: 44439552 kB
DirectMap1G: 91226112 kB
```

The oVirt engine's Compute -> Hosts view shows s4-dev-cluster-01 as 93% memory utilised. Clicking on the node says:

Physical Memory: 128741 MB total, 119729 MB used, 9012 MB free

So the oVirt engine says 9G free. The OS reports 4G free but 53G available.
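
To put numbers on that gap, here is a minimal Python sketch that recomputes two candidate "free" figures from the /proc/meminfo values of dev-cluster-04 shown above. The function names are made up for illustration. The idea that the engine's figure is derived from something like MemFree + Buffers + Cached is only an assumption, not something confirmed anywhere in this post.

```python
# Values (in kB) copied from the /proc/meminfo output of dev-cluster-04.
MEMINFO_KB = {
    "MemTotal": 131831292,
    "MemFree": 4540852,
    "MemAvailable": 54709832,
    "Buffers": 3104,
    "Cached": 5174136,
    "SReclaimable": 49822156,
}


def kb_to_mb(kb: int) -> int:
    """Convert kB to whole MB the way `free -m` does (divide by 1024, round down)."""
    return kb // 1024


def free_plus_cache_mb(meminfo: dict) -> int:
    """Hypothetical 'free' figure: MemFree + Buffers + Cached.

    ASSUMPTION: this is a guess at how a tool might compute free memory.
    Note that it does not count reclaimable slab (SReclaimable).
    """
    return kb_to_mb(meminfo["MemFree"] + meminfo["Buffers"] + meminfo["Cached"])


def kernel_available_mb(meminfo: dict) -> int:
    """The kernel's own estimate of allocatable memory (MemAvailable)."""
    return kb_to_mb(meminfo["MemAvailable"])


if __name__ == "__main__":
    print("MemFree + Buffers + Cached:", free_plus_cache_mb(MEMINFO_KB), "MB")
    print("MemAvailable:              ", kernel_available_mb(MEMINFO_KB), "MB")
    print("SReclaimable (slab):       ", kb_to_mb(MEMINFO_KB["SReclaimable"]), "MB")
```

With these values the first figure is about 9490 MB, close to the 9012 MB free that the engine shows, while MemAvailable is about 53427 MB. Most of the difference is the roughly 48654 MB of reclaimable slab. That is suggestive, but the engine and OS snapshots were not taken at the same moment, so it proves nothing by itself.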
Surely oVirt should be looking at available memory?

This is a problem, for instance, when trying to run a VM called dev-cassandra-01, with memory size 24576, max memory 24576 and memory guarantee set to 10240. On this host it fails with:

```
Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:
The host dev-cluster-04.fnb.co.za did not satisfy internal filter Memory because its available memory is too low (19884 MB) to run the VM.
```

To me this looks blatantly wrong. The host has 53G available according to free -m.

I'm guessing I'm missing something, unless this is some sort of bug?

Versions:

```
engine: 4.3.7.2-1.el7

host:
OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0 - 957.12.1.el7.x86_64
KVM Version: 2.12.0 - 18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.7
VDSM Version: vdsm-4.30.13-1.el7
SPICE Version: 0.14.0 - 6.el7_6.1
GlusterFS Version: [N/A]
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1, SSBD: 3
VNC Encryption: Disabled
```

## Example 2:

An oVirt host with two VMs:

According to the host, it has 128G of physical memory, of which 56G is used, 69G is buff/cache and 65G is available.
As is shown here:

```
LIVE [root@prod-cluster-01:~] # cat /proc/meminfo
MemTotal: 131326836 kB
MemFree: 2630812 kB
MemAvailable: 66573596 kB
Buffers: 2376 kB
Cached: 5670628 kB
SwapCached: 151072 kB
Active: 59106140 kB
Inactive: 2744176 kB
Active(anon): 58099732 kB
Inactive(anon): 2327428 kB
Active(file): 1006408 kB
Inactive(file): 416748 kB
Unevictable: 40004 kB
Mlocked: 42052 kB
SwapTotal: 4194300 kB
SwapFree: 3579492 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 56085040 kB
Mapped: 121816 kB
Shmem: 4231808 kB
Slab: 65143868 kB
SReclaimable: 63145684 kB
SUnreclaim: 1998184 kB
KernelStack: 25296 kB
PageTables: 148336 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 69857716 kB
Committed_AS: 76533164 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 842296 kB
VmallocChunk: 34291404724 kB
HardwareCorrupted: 0 kB
AnonHugePages: 55296 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 722208 kB
DirectMap2M: 48031744 kB
DirectMap1G: 87031808 kB
LIVE [root@prod-cluster-01:~] # free -m
              total        used        free      shared  buff/cache   available
Mem:         128248       56522        2569        4132       69157       65013
Swap:          4095         600        3495
```

However, the Compute -> Hosts oVirt screen shows this node as 94% memory. Clicking Compute -> Hosts -> prod-cluster-01 -> General says:

Physical Memory: 128248 MB total, 120553 MB used, 7695 MB free
Swap Size: 4095 MB total, 600 MB used, 3495 MB free

The physical memory figure above makes no sense to me, unless it includes caches, which I would think it shouldn't.

This host has just two VMs:

    LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf list
     Id    Name                           State
    ----------------------------------------------------
     35    prod-box-18                    running
     36    prod-box-11                    running

Moreover, each VM has 32G of memory set in every possible place, from what I can see.
```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf dumpxml prod-box-11|grep -i mem
    <ovirt-vm:memGuaranteedSize type="int">32768</ovirt-vm:memGuaranteedSize>
    <ovirt-vm:minGuaranteedMemoryMb type="int">32768</ovirt-vm:minGuaranteedMemoryMb>
  <memory unit='KiB'>33554432</memory>
  <currentMemory unit='KiB'>33554432</currentMemory>
      <cell id='0' cpus='0-27' memory='33554432' unit='KiB'/>
    <suspend-to-mem enabled='no'/>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
    <memballoon model='virtio'>
    </memballoon>
```

prod-box-11 is, however, set as a high performance VM. That could cause a problem.

Same for the other VM:

```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf dumpxml prod-box-18|grep -i mem
    <ovirt-vm:memGuaranteedSize type="int">32768</ovirt-vm:memGuaranteedSize>
    <ovirt-vm:minGuaranteedMemoryMb type="int">32768</ovirt-vm:minGuaranteedMemoryMb>
  <memory unit='KiB'>33554432</memory>
  <currentMemory unit='KiB'>33554432</currentMemory>
      <cell id='0' cpus='0-27' memory='33554432' unit='KiB'/>
    <suspend-to-mem enabled='no'/>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
    <memballoon model='virtio'>
    </memballoon>
```

So I understand that two VMs, each allocated 32G of RAM, should consume approximately 64G of RAM on the host. The host has 128G of RAM, so usage should be at approximately 50%. However, oVirt is reporting 94% usage.
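
The arithmetic behind that expectation can be written out as a short Python sketch. It uses only the figures quoted for prod-cluster-01 in this example. The helper name `percent_of_total` is made up for illustration, and the sketch deliberately ignores per-VM overhead (QEMU process, video RAM and so on), which is an assumption that makes the expected figure slightly optimistic.

```python
# Figures (in MB) quoted above for prod-cluster-01.
HOST_TOTAL_MB = 128248    # "total" from free -m and from the engine
ENGINE_USED_MB = 120553   # "used" on the engine's General tab
OS_USED_MB = 56522        # "used" from free -m

# Memory configured per VM: 33554432 KiB = 32768 MB each (from dumpxml).
VM_MEMORY_MB = {"prod-box-18": 32768, "prod-box-11": 32768}


def percent_of_total(used_mb: int, total_mb: int) -> float:
    """Return used/total as a percentage, rounded to one decimal place."""
    return round(100 * used_mb / total_mb, 1)


# Expected usage if the host only had to back the two VMs' configured memory.
# ASSUMPTION: per-VM overhead is ignored.
expected_vm_mb = sum(VM_MEMORY_MB.values())

if __name__ == "__main__":
    print("Expected from VM sizes:", percent_of_total(expected_vm_mb, HOST_TOTAL_MB), "%")
    print("Used according to OS:  ", percent_of_total(OS_USED_MB, HOST_TOTAL_MB), "%")
    print("Used according to oVirt:", percent_of_total(ENGINE_USED_MB, HOST_TOTAL_MB), "%")
```

On these numbers the VM allocations account for about 51% of the host, free -m reports about 44% used, and the engine's figure works out to 94%, which matches the percentage shown in the hosts view.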
Versions:

```
engine: 4.3.5.5-1.el7

host:
OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0 - 957.10.1.el7.x86_64
KVM Version: 2.12.0 - 18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.6
VDSM Version: vdsm-4.30.11-1.el7
SPICE Version: 0.14.0 - 6.el7_6.1
GlusterFS Version: [N/A]
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1
VNC Encryption: Disabled
```

Thanks for any insights!

--
Divan Santana
https://divansantana.com