Question about Huge Pages

Hello, I'm testing virtualization of some Oracle servers. I have oVirt 4.1.1 with CentOS 7.3 servers as hypervisors. Typically on physical Oracle servers I configure huge pages for the Oracle memory areas. In particular I disable Transparent Huge Pages, because they are known to conflict with Oracle performance, both in RAC and in standalone configurations. On RHEL systems I configure the "transparent_hugepage=never" boot parameter, while in Oracle Linux UEK kernels it is already disabled by default. I notice that in CentOS 7.3, by default, transparent huge pages are enabled:

[root@ov300 ~]# cat /proc/meminfo | grep -i huge
AnonHugePages:  17006592 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
[root@ov300 ~]#

I'm going to configure a VM with 64Gb of RAM and with an Oracle RDBMS that would have 16Gb of SGA. I suspect that I could have problems if I don't change the configuration at the hypervisor level...
What do you think about this subject? Is there any drawback if I manually configure the hypervisors to boot with the "transparent_hugepage=never" boot parameter?
Thanks in advance,
Gianluca
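For reference, a minimal sketch of checking the current THP mode and persisting "transparent_hugepage=never" on an EL7 host. This assumes grub2 with the grubby tool and the standard sysfs path; verify against your own distribution before relying on it.

```shell
# Current THP mode; the bracketed word is the active setting,
# e.g. "[always] madvise never"
cat /sys/kernel/mm/transparent_hugepage/enabled

# Persist the boot parameter for all installed kernels (grub2/grubby)
grubby --update-kernel=ALL --args="transparent_hugepage=never"

# Optionally disable THP immediately, without waiting for a reboot
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

After the next reboot, `cat /proc/cmdline` should show the parameter.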

On 18 Apr 2017, at 18:03, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, I'm testing virtualization of some Oracle servers. I have oVirt 4.1.1 with CentOS 7.3 servers as hypervisors. Typically on physical Oracle servers I configure huge pages for the Oracle memory areas. In particular I disable Transparent Huge Pages, because they are known to conflict with Oracle performance, both in RAC and in standalone configurations. On RHEL systems I configure the "transparent_hugepage=never" boot parameter, while in Oracle Linux UEK kernels it is already disabled by default. I notice that in CentOS 7.3, by default, transparent huge pages are enabled:
[root@ov300 ~]# cat /proc/meminfo | grep -i huge
AnonHugePages:  17006592 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
[root@ov300 ~]#
I'm going to configure a VM with 64Gb of RAM and with an Oracle RDBMS that would have 16Gb of SGA. I suspect that I could have problems if I don't change the configuration at the hypervisor level...
What do you think about this subject? Is there any drawback if I manually configure the hypervisors to boot with the "transparent_hugepage=never" boot parameter?
Why not reserve regular hugepages for VMs on boot? Then you can use them with the vdsm hook for that Oracle VM. It improves VM performance in general; the only drawback is less flexibility, since that memory can't be used by others unless they specifically ask for hugepages. Also, I suppose you disable KSM, and I'm not sure about ballooning; unless you need it I'd disable it too. The hook is being improved right now in master, but it should be usable in stable too.
Thanks,
michal
Thanks in advance, Gianluca _______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

On Wed, Apr 19, 2017 at 8:03 AM, Michal Skrivanek <mskrivan@redhat.com> wrote:
Why not reserve regular hugepages for VMs on boot?
Do you mean at the hypervisor level? If so, that is what I normally do on physical servers where I install Oracle RDBMS.
Then you can use them with the vdsm hook for that Oracle VM.
Which hook are you referring to? This one: http://www.ovirt.org/develop/developer-guide/vdsm/hook/hugepages/ ?
If so, is it still current? In the sense that I need to mount the hugetlbfs virtual file system at the host level? The hook description seems rather sparse...
Normally, if I want the oracle user to be able to use huge pages on a physical server, I have to specify:

#
# Huge pages
#
vm.hugetlb_shm_group = 2000
# 18GB allocatable
vm.nr_hugepages = 9216
#

where 2000 is the group id of the dba group, the main group of the oracle user.

How does this map to virtualization? E.g.:
1) vm.hugetlb_shm_group at the hypervisor side should be set to the group of the qemu user, as the qemu-kvm process runs with it?
2) Then I have to set, VM by VM, the hugepages=xxx value in the hook, and that will bypass the sysctl.conf configuration in the guest?
3) I presume I have to set the vm.hugetlb_shm_group parameter at guest level....
Thanks,
Gianluca
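The arithmetic behind those sysctl numbers can be written out as a small sketch (the variable names are illustrative; the page size comes from Hugepagesize in /proc/meminfo, 2048 kB on x86_64 by default):

```shell
# Number of 2048 kB huge pages needed to back a given pool size.
pool_mb=18432      # desired hugepage pool: 18 GiB, expressed in MiB
pagesize_kb=2048   # from Hugepagesize in /proc/meminfo

echo $(( pool_mb * 1024 / pagesize_kb ))   # prints 9216
```

which matches the vm.nr_hugepages = 9216 value above.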
It improves VM performance in general, the only drawback is less flexibility since that memory can't be used by others unless they specifically ask for hugepages.
This seems to confirm that I have to set a static sysctl.conf entry at the hypervisor level, such as vm.nr_hugepages = YYYY
Also, I suppose you disable KSM, and I'm not sure about ballooning, unless you need it I'd disable it too.
I kept the defaults at the moment, which I suppose means:
a) KSM disabled
ksm is configured to start by default, but ksmtuned has been disabled:

[g.cecchi@ov300 ~]$ sudo systemctl status ksm
● ksm.service - Kernel Samepage Merging
   Loaded: loaded (/usr/lib/systemd/system/ksm.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2017-04-11 11:07:28 CEST; 1 weeks 1 days ago
  Process: 976 ExecStart=/usr/libexec/ksmctl start (code=exited, status=0/SUCCESS)
 Main PID: 976 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ksm.service

Apr 11 11:07:28 ov300.datacenter.polimi.it systemd[1]: Starting Kernel Samepage Merging...
Apr 11 11:07:28 ov300.datacenter.polimi.it systemd[1]: Started Kernel Samepage Merging.

[g.cecchi@ov300 ~]$ sudo systemctl status ksmtuned
● ksmtuned.service - Kernel Samepage Merging (KSM) Tuning Daemon
   Loaded: loaded (/usr/lib/systemd/system/ksmtuned.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
[g.cecchi@ov300 ~]$

b) ballooning enabled for a newly created VM unless I explicitly disable it (at least I see this happens in 4.1.1)
What should I do for a) and b) so they don't interfere with huge pages?
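If you do decide to turn KSM off entirely on the hypervisor (as suggested above), a sketch of the host-side commands, assuming the systemd unit names shown in the status output:

```shell
# Stop and disable both KSM services on the hypervisor
systemctl stop ksmtuned ksm
systemctl disable ksmtuned ksm

# Confirm the KSM kernel thread is no longer running (0 = off)
cat /sys/kernel/mm/ksm/run
```

Ballooning, by contrast, is a per-VM setting in the engine UI, not a host service.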
The hook is being improved right now in master, but it should be usable in stable too.
I will be happy to test, verify, and contribute to its description, as soon as I understand its usage....
Gianluca

On 19/04/17 14:01 +0200, Gianluca Cecchi wrote:
On Wed, Apr 19, 2017 at 8:03 AM, Michal Skrivanek <mskrivan@redhat.com> wrote:
Why not reserve regular hugepages for VMs on boot?
Do you mean at the hypervisor level? If so, that is what I normally do on physical servers where I install Oracle RDBMS.
Then you can use them with the vdsm hook for that Oracle VM.
Which hook are you referring to? This one: http://www.ovirt.org/develop/developer-guide/vdsm/hook/hugepages/ ? If so, is it still current? In the sense that I need to mount the hugetlbfs virtual file system at the host level? The hook description seems rather sparse... Normally, if I want the oracle user to be able to use huge pages on a physical server, I have to specify:
#
# Huge pages
#
vm.hugetlb_shm_group = 2000
# 18GB allocatable
vm.nr_hugepages = 9216
#
where 2000 is the group id of the dba group, the main group of the oracle user.
How does this map to virtualization? E.g.:
1) vm.hugetlb_shm_group at the hypervisor side should be set to the group of the qemu user, as the qemu-kvm process runs with it?
2) Then I have to set, VM by VM, the hugepages=xxx value in the hook, and that will bypass the sysctl.conf configuration in the guest?
3) I presume I have to set the vm.hugetlb_shm_group parameter at guest level....
If you are using recent CentOS (or I guess Fedora), there isn't any extra setup required. Just create the custom property. On the host where the engine is running:

$ engine-config -s "UserDefinedVMProperties=hugepages=^.*$"
$ service ovirt-engine restart

and you should see 'hugepages' when editing a VM under custom properties. Set the number to (desired memory / 2048) and you're good to go. The VM will run with its memory backed by hugepages.
If you need hugepages even inside the VM, do whatever you would do on a physical host.
mpolednik
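A quick way to confirm the property was registered is to read it back; engine-config supports -g to get a value (exact output format may vary by engine version):

```shell
# On the engine host, after the restart:
engine-config -g UserDefinedVMProperties
# The hugepages=^.*$ definition should appear in the output
```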
Thanks, Gianluca
It improves VM performance in general, the only drawback is less flexibility since that memory can't be used by others unless they specifically ask for hugepages.
This seems to confirm that I have to set a static sysctl.conf entry at the hypervisor level, such as vm.nr_hugepages = YYYY
Also, I suppose you disable KSM, and I'm not sure about ballooning, unless you need it I'd disable it too.
I kept the defaults at the moment, which I suppose means:
a) KSM disabled
ksm is configured to start by default, but ksmtuned has been disabled:
[g.cecchi@ov300 ~]$ sudo systemctl status ksm
● ksm.service - Kernel Samepage Merging
   Loaded: loaded (/usr/lib/systemd/system/ksm.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2017-04-11 11:07:28 CEST; 1 weeks 1 days ago
  Process: 976 ExecStart=/usr/libexec/ksmctl start (code=exited, status=0/SUCCESS)
 Main PID: 976 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ksm.service
Apr 11 11:07:28 ov300.datacenter.polimi.it systemd[1]: Starting Kernel Samepage Merging...
Apr 11 11:07:28 ov300.datacenter.polimi.it systemd[1]: Started Kernel Samepage Merging.
[g.cecchi@ov300 ~]$ sudo systemctl status ksmtuned
● ksmtuned.service - Kernel Samepage Merging (KSM) Tuning Daemon
   Loaded: loaded (/usr/lib/systemd/system/ksmtuned.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
[g.cecchi@ov300 ~]$
b) ballooning enabled for a newly created VM unless I explicitly disable it (at least I see this happens in 4.1.1)
What should I do for a) and b) so they don't interfere with huge pages?
The hook is being improved right now in master, but it should be usable in stable too.
I will be happy to test, verify, and contribute to its description, as soon as I understand its usage....
Gianluca

On Wed, Apr 19, 2017 at 3:44 PM, Martin Polednik <mpolednik@redhat.com> wrote:
If you are using recent CentOS (or I guess Fedora), there isn't any extra setup required. Just create the custom property:
Both my engine and my hosts are CentOS 7.3 + updates
On the host where engine is running:
$ engine-config -s "UserDefinedVMProperties=hugepages=^.*$"
$ service ovirt-engine restart
and you should see 'hugepages' when editing a VM under custom properties.
So no vdsm hook at all to install?
Set the number to (desired memory / 2048) and you're good to go. The VM will run with its memory backed by hugepages.
As in sysctl.conf? So that if I want 4Gb of Huge Pages I have to set 2048?
If you need hugepages even inside the VM, do whatever you would do on a physical host.
mpolednik
yes, the main subject is to have Huge Pages inside the guest, so that the Oracle RDBMS at startup detects them and uses them
Gianluca


On Thu, Apr 20, 2017 at 10:35 AM, Michal Skrivanek <mskrivan@redhat.com> wrote:
On 19 Apr 2017, at 16:28, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Wed, Apr 19, 2017 at 3:44 PM, Martin Polednik <mpolednik@redhat.com> wrote:
If you are using recent CentOS (or I guess Fedora), there isn't any extra setup required. Just create the custom property:
Both my engine and my hosts are CentOS 7.3 + updates
that’s good
On the host where engine is running:
$ engine-config -s "UserDefinedVMProperties=hugepages=^.*$"
$ service ovirt-engine restart
and you should see 'hugepages' when editing a VM under custom properties.
So no vdsm hook at all to install?
today you still need the hook.
Set the number to (desired memory / 2048) and you're good to go. The VM will run with its memory backed by hugepages.
As in sysctl.conf? So that if I want 4Gb of Huge Pages I have to set 2048?
yes. there might be some
If you need hugepages even inside the VM, do whatever you would do on a physical host.
mpolednik
yes, the main subject is to have Huge Pages inside the guest, so that the Oracle RDBMS at startup detects them and uses them
yes, so if you do that via sysctl.conf on real HW just do the same here, or modify kernel cmdline.
Note that those are two separate things:
the hook makes the QEMU process use hugepages memory in the host - that improves performance of any VM.
Then how it looks in the guest is no concern to oVirt; that's guest-side hugepages. You can enable/set them regardless of the previous step, which may be fine if you just want to expose the capability to some app - e.g. in testing that the guest-side Oracle can work with hugepages in the guest.
But you probably want Oracle both to see hugepages and to actually use them - then you need to both reserve them on the host for the qemu process and then reserve them inside the guest for oracle. I.e. you need to add a "buffer" on the host side to accommodate the non-hugepages parts of the guest: e.g. on a 24GB host you can reserve 20GB of hugepages for VMs to use, and then run a VM with 20GB memory, reserving 16GB of hugepages inside the guest for oracle to use.
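The 24/20/16 example above can be written out as a small sizing sketch (the numbers are illustrative, not a recommendation; 2048 kB is the default x86_64 hugepage size):

```shell
pagesize_kb=2048   # Hugepagesize from /proc/meminfo

# Host: reserve a 20 GiB hugepage pool out of 24 GiB total RAM,
# leaving ~4 GiB of normal pages for the host itself and the
# non-hugepage parts of the guests.
pool_gb=20
host_pages=$(( pool_gb * 1024 * 1024 / pagesize_kb ))
echo "host  vm.nr_hugepages = $host_pages"    # prints 10240

# Guest: a 20 GiB VM reserving 16 GiB of hugepages for Oracle,
# again leaving a buffer of normal pages inside the guest.
oracle_gb=16
guest_pages=$(( oracle_gb * 1024 * 1024 / pagesize_kb ))
echo "guest vm.nr_hugepages = $guest_pages"   # prints 8192
```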
Thanks, michal
Gianluca
I'm making some tests right now. Steps done:
- configure huge pages on the hypervisor

[root@ractor ~]# cat /etc/sysctl.d/huge-pages.conf
# 20/04/2017 8Gb
vm.nr_hugepages = 4096
[root@ractor ~]#

rebooted the host (in the meantime I also updated it to the latest 4.1.1 packages with vdsm-4.19.10.1-1.el7.centos.x86_64 and vdsm-hook-hugepages-4.19.10.1-1.el7.centos.noarch)
I also set the "transparent_hugepage=never" boot parameter because I know that Transparent Huge Pages conflict with Huge Pages.

So the situation is:

[root@ractor ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-514.16.1.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=en_US.UTF-8 transparent_hugepage=never
[root@ractor ~]#

[root@ractor ~]# cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
HugePages_Total:    4096
HugePages_Free:     4096
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
[root@ractor ~]#

I edited a pre-existing CentOS 6 VM, setting 8Gb of RAM for it and 2048 pages (4Gb) in the custom property for hugepages.

When I power it on I get this addition in the qemu-kvm process definition, as expected:

-mem-path /dev/hugepages/libvirt/qemu

I noticed that now I have on the host....

[root@ractor vdsm]# cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
HugePages_Total:    6144
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
[root@ractor vdsm]#

So apparently it did allocate 2048 new huge pages...
Does it mean that I actually don't have to pre-allocate huge pages on the host at all, and it will eventually increase them (but not be able to remove them, I suppose)?

Anyway the count doesn't seem correct... because it seems that a total of 4096 pages are in use/locked (HugePages_Total - HugePages_Free + HugePages_Rsvd), while they should be 2048.....
[root@ractor vdsm]# ll /dev/hugepages/libvirt/qemu/
total 0
[root@ractor vdsm]# ll /hugetlbfs/libvirt/qemu/
total 0
[root@ractor vdsm]#

If I power off the VM:

[root@ractor vdsm]# cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
HugePages_Total:    4096
HugePages_Free:     4096
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
[root@ractor vdsm]#

Does this mean that in CentOS 7.3 Huge Pages can be reclaimed....???

Nevertheless, when I configure huge pages in the guest, it seems to work as expected:

[root@dbtest ~]# cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

Going into Oracle DB initialization, after configuring its dedicated memory (SGA) to 2354Mb, I get this confirmation inside its log file:

Thu Apr 20 17:16:27 2017
Per process system memlock (soft) limit = 4096M
Thu Apr 20 17:16:27 2017
Expected per process system memlock (soft) limit to lock
SHARED GLOBAL AREA (SGA) into memory: 2354M
Thu Apr 20 17:16:27 2017
Available system pagesizes:
 4K, 2048K
Thu Apr 20 17:16:27 2017
Supported system pagesize(s):
Thu Apr 20 17:16:27 2017
 PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)
Thu Apr 20 17:16:27 2017
        4K       Configured               3                3  NONE
Thu Apr 20 17:16:27 2017
     2048K             2048            1177             1177  NONE

Gianluca
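To cross-check on the host side whether the qemu-kvm process memory is actually backed by huge pages, one can inspect /proc/<pid>/smaps: hugepage-backed mappings report a KernelPageSize of 2048 (kB) there. A sketch, assuming a single qemu-kvm process (the pid lookup is illustrative):

```shell
# Pick the VM's qemu process
pid=$(pgrep -f qemu-kvm | head -n 1)

# Count mappings whose kernel page size is 2048 kB (i.e. 2 MiB pages);
# a nonzero count means part of the process memory uses huge pages
awk '/KernelPageSize/ && $2 == 2048 { n++ } END { print n + 0 }' /proc/"$pid"/smaps
```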

On 20 Apr 2017, at 17:39, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Does this mean that in CentOS 7.3 Huge Pages could be reclaimed....???
it tries to… well, as I said, the hook is being improved right now and in 4.2 it will likely be more consumable
Nevertheless, when I configure huge pages in guest it seems to work as expected
Going into Oracle DB initialization, after configuring its dedicated memory (SGA) to 2354Mb, I get this confirmation inside its log file
Yes, but that would always work even without setting things up on host. The “only” difference would be the actual performance.
in testing that the = guest-side Oracle can work with hugepages in the guest.</div><div = class=3D"">But you probably want both Oracle to see hugepages and also = actually use them - then you need both reserve that on host for qemu = process and then inside guest reserve that for oracle. I.e. you need to = add a =E2=80=9Cbuffer=E2=80=9D on host side to accommodate the = non-hugepages parts of the guest e.g. on 24GB host you can reserve 20GB = hugepages for VMs to use, and then run a VM with 20GB memory, reserving = 16GB hugepages inside the guest for oracle to use.</div><div = class=3D""><br class=3D""></div><div class=3D"">Thanks,</div><div = class=3D"">michal</div><div class=3D""><br class=3D""><blockquote = type=3D"cite" class=3D""><div class=3D""><div dir=3D"ltr" class=3D""><div = class=3D"gmail_extra"><div class=3D"gmail_quote"><div class=3D""><br = class=3D""></div><div class=3D"">Gianluca </div></div></div></div> </div></blockquote></div><br class=3D""></div></blockquote><div = class=3D""><br class=3D""></div><div class=3D"">I'm making some tests = right now. 
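Martin's rule of thumb (custom-property value = desired memory / 2048, with 2048 kB pages) and Michal's host-sizing example reduce to simple arithmetic; a quick sketch using the figures from the thread, illustrative rather than prescriptive:

```shell
# Hugepage sizing arithmetic for 2048 kB pages (figures from the thread).
page_kb=2048

# 4Gb of guest memory backed by hugepages -> value for the VM custom property:
vm_mem_kb=$((4 * 1024 * 1024))
echo "pages for a 4Gb VM: $((vm_mem_kb / page_kb))"        # prints 2048

# Michal's host-side example: 24GB host, 20GB reserved as hugepages for VMs,
# a 20GB VM, and 16GB of hugepages inside the guest for Oracle.
host_hp_kb=$((20 * 1024 * 1024))
echo "host vm.nr_hugepages:  $((host_hp_kb / page_kb))"    # prints 10240
guest_hp_kb=$((16 * 1024 * 1024))
echo "guest vm.nr_hugepages: $((guest_hp_kb / page_kb))"   # prints 8192
```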
> Steps done:
> - configure huge pages on the hypervisor
>
> [root@ractor ~]# cat /etc/sysctl.d/huge-pages.conf
> # 20/04/2017 8Gb
> vm.nr_hugepages = 4096
> [root@ractor ~]#
>
> rebooted the host (I also updated it in the meantime to the latest 4.1.1
> packages, with vdsm-4.19.10.1-1.el7.centos.x86_64 and
> vdsm-hook-hugepages-4.19.10.1-1.el7.centos.noarch)
> I also set the "transparent_hugepage=never" boot parameter because I know
> that transparent huge pages are in conflict with Huge Pages
>
> So the situation is:
>
> [root@ractor ~]# cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-3.10.0-514.16.1.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=en_US.UTF-8 transparent_hugepage=never
> [root@ractor ~]#
>
> [root@ractor ~]# cat /proc/meminfo | grep -i huge
> AnonHugePages:         0 kB
> HugePages_Total:    4096
> HugePages_Free:     4096
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> [root@ractor ~]#
>
> I edited a pre-existing CentOS 6 VM, setting 8Gb of RAM for it and 2048
> pages (4Gb) in the hugepages custom property.
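The effect of the "transparent_hugepage=never" parameter can also be double-checked at runtime; the sysfs path below is the standard kernel interface, where the bracketed entry is the active mode:

```shell
# Show the active THP mode; with transparent_hugepage=never on the kernel
# cmdline this file reads e.g. "always madvise [never]".
thp=/sys/kernel/mm/transparent_hugepage/enabled
mode=$(grep -o '\[[a-z]*\]' "$thp" 2>/dev/null || echo '[unknown]')
echo "active THP mode: $mode"
```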
> When I power on the VM I get this addition to the qemu-kvm process
> definition, as expected:
>
> -mem-path /dev/hugepages/libvirt/qemu
>
> I noticed that now I have on the host...
>
> [root@ractor vdsm]# cat /proc/meminfo | grep -i huge
> AnonHugePages:         0 kB
> HugePages_Total:    6144
> HugePages_Free:     2048
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> [root@ractor vdsm]#
>
> So apparently it did allocate 2048 new huge pages...
> Does that mean I don't actually have to pre-allocate huge pages on the
> host at all, and it will increase them as needed (but not be able to
> remove them afterwards, I suppose)?
>
> Anyway, the count doesn't seem correct... because it seems that a total of
> 4096 pages are in use/locked...
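The accounting applied here can be reproduced directly; a sketch using the values observed above while the VM was running (the awk line reads the same fields live from /proc/meminfo where available):

```shell
# Hugepages in use/locked = HugePages_Total - HugePages_Free + HugePages_Rsvd.
# With the VM running, the host showed Total=6144, Free=2048, Rsvd=0:
total=6144; free=2048; rsvd=0
echo "pages in use: $((total - free + rsvd))"   # prints 4096, twice the expected 2048

# The same figure computed live (Linux only):
if [ -r /proc/meminfo ]; then
  awk '/^HugePages_Total/{t=$2} /^HugePages_Free/{f=$2} /^HugePages_Rsvd/{r=$2}
       END{print "live pages in use:", t - f + r}' /proc/meminfo
fi
```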
> (HugePages_Total - HugePages_Free + HugePages_Rsvd)
> while they should be 2048...
>
> [root@ractor vdsm]# ll /dev/hugepages/libvirt/qemu/
> total 0
> [root@ractor vdsm]# ll /hugetlbfs/libvirt/qemu/
> total 0
> [root@ractor vdsm]#
>
> If I power off the VM:
>
> [root@ractor vdsm]# cat /proc/meminfo | grep -i huge
> AnonHugePages:         0 kB
> HugePages_Total:    4096
> HugePages_Free:     4096
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> [root@ractor vdsm]#
>
> Does this mean that in CentOS 7.3 Huge Pages can be reclaimed...???

it tries to…well, as I said, the hook is being improved right now and in 4.2 it will likely be more consumable

> Nevertheless, when I configure huge pages in the guest it seems to work
> as expected:
>
> [root@dbtest ~]# cat /proc/meminfo | grep -i huge
> AnonHugePages:         0 kB
> HugePages_Total:    2048
> HugePages_Free:     2048
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
>
> Going into Oracle DB initialization, after configuring its dedicated
> memory (SGA) to 2354Mb, I get this confirmation inside its log file:

Yes, but that would always work even without setting things up on the host. The "only" difference would be the actual performance.

> Thu Apr 20 17:16:27 2017
>  Per process system memlock (soft) limit = 4096M
> Thu Apr 20 17:16:27 2017
>  Expected per process system memlock (soft) limit to lock
>  SHARED GLOBAL AREA (SGA) into memory: 2354M
> Thu Apr 20 17:16:27 2017
>  Available system pagesizes:
>   4K, 2048K
> Thu Apr 20 17:16:27 2017
>  Supported system pagesize(s):
> Thu Apr 20 17:16:27 2017
>   PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)
> Thu Apr 20 17:16:27 2017
>         4K       Configured                3                3  NONE
> Thu Apr 20 17:16:27 2017
>      2048K             2048             1177             1177  NONE
>
> Gianluca
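For reference, the guest-side setup the thread converges on (hugepages reserved via sysctl plus a memlock limit above the SGA) could look like the sketch below. The file names and values are illustrative, sized to the 2354Mb SGA mentioned above; adapt them to the real SGA size:

```shell
# Inside the guest, as root. 1200 x 2048 kB pages = 2400Mb, just above the
# 2354Mb SGA from the alert log; size this to your own SGA.
echo "vm.nr_hugepages = 1200" > /etc/sysctl.d/99-oracle-hugepages.conf
sysctl -p /etc/sysctl.d/99-oracle-hugepages.conf

# Memlock limit for the oracle user, in kB; 4194304 kB = 4096M, matching the
# soft limit reported in the alert log. It must be at least the SGA size.
cat >> /etc/security/limits.conf <<'EOF'
oracle soft memlock 4194304
oracle hard memlock 4194304
EOF
```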
participants (3)
- Gianluca Cecchi
- Martin Polednik
- Michal Skrivanek