vdsm memory consumption (ovirt 4.0)

I have issues with my oVirt setup related to memory consumption. After upgrading to 4.0 I noticed considerable growth in vdsm memory consumption. I suspect that the growth is related to a memory leak.

When I boot up the system and activate the host, the memory consumption is about 600 MB. After 5 days running, and with the host in maintenance mode, the memory consumption is about 1.4 GB. I need to put my hosts in maintenance and reboot to free memory.

Can anyone help me to debug this problem?

OS Version: RHEL - 7 - 2.1511.el7.centos.2.10
Kernel Version: 3.10.0 - 327.22.2.el7.x86_64
KVM Version: 2.3.0 - 31.el7.16.1
LIBVIRT Version: libvirt-1.2.17-13.el7_2.5
VDSM Version: vdsm-4.18.11-1.el7.centos

Thank you

On Tue, Aug 30, 2016 at 1:30 AM, Federico Alberto Sayd <fsayd@uncu.edu.ar> wrote:
I have issues with my oVirt setup related to memory consumption. After upgrading to 4.0 I noticed considerable growth in vdsm memory consumption. I suspect that the growth is related to a memory leak.
We need more details, see below...
When I boot up the system and activate the host, the memory consumption is about 600 MB. After 5 days running, and with the host in maintenance mode, the memory consumption is about 1.4 GB.
I need to put my hosts in maintenance and reboot to free memory.
You can restart vdsm (systemctl restart vdsmd) instead; running vms are not affected by this.
Can anyone help me to debug this problem?
We had a memory leak in vdsm-4.18.5, fixed in vdsm-4.18.11. Since you are running 4.18.11, there may be another leak.

Please enable health monitoring by creating /etc/vdsm/vdsm.conf.d/50-health.conf with this content:

[devel]
health_monitor_enable = true

And restart vdsm.

Please run with this setting for a couple of hours, maybe one day, and then share the vdsm logs from this timeframe.

You may disable health monitoring by setting

[devel]
health_monitor_enable = false

Or by deleting this configuration file, or renaming it to:

/etc/vdsm/vdsm.conf.d/50-health.conf.disabled

Nir
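For anyone who prefers to script the step above, here is a minimal Python sketch that renders the drop-in file and parses it back. The path, the [devel] section and the health_monitor_enable option come from the message above; the function names are only illustrative and are not part of vdsm. Writing to the real path needs root, and vdsm still has to be restarted afterwards.

```python
import configparser
from pathlib import Path

# Path and option name are taken from the message above.
DROPIN_PATH = Path("/etc/vdsm/vdsm.conf.d/50-health.conf")


def render_health_conf(enabled):
    """Return the text of the drop-in file for the given setting."""
    value = "true" if enabled else "false"
    return "[devel]\nhealth_monitor_enable = {}\n".format(value)


def write_health_conf(enabled, path=DROPIN_PATH):
    """Write the drop-in file, creating the directory if it is missing."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_health_conf(enabled))


def read_health_setting(text):
    """Parse drop-in text and return the boolean it sets."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getboolean("devel", "health_monitor_enable")
```

Calling write_health_conf(True) as root and then running systemctl restart vdsmd should be equivalent to the manual steps; read_health_setting() is only a sanity check on the file contents.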

Hello Nir:

I followed your instructions, added the config file, restarted vdsm, and today I have the vdsm logs from a host:

https://drive.google.com/file/d/0ByrwZ1AkYuyeR1hmRm90a1R6MEk/view?usp=sharing

Please tell me if you see anything related to the memory issue.

Thanks

Federico

On Wed, Aug 31, 2016 at 8:06 PM, Federico Alberto Sayd <fsayd@uncu.edu.ar> wrote:
Hello Nir:
I followed your instructions, added the config file, restarted vdsm, and today I have the vdsm logs from a host:
https://drive.google.com/file/d/0ByrwZ1AkYuyeR1hmRm90a1R6MEk/view?usp=sharing
Please tell me if you see anything related to the memory issue.
These logs start when vdsm is using 567640 kB (554 MiB) - very unusual. The memory usage grew by 18 MiB during one day. No garbage collection issues. This smells like we keep some data forever for no reason.

$ grep rss= vdsm-leak.log | head -n 1
Thread-33::DEBUG::2016-08-30 12:01:43,845::health::122::health::(_check_resources) user=1.73%, sys=1.65%, rss=567640 kB (+44), threads=57

$ grep rss= vdsm-leak.log | tail -n 1
Thread-33::DEBUG::2016-08-31 13:00:36,913::health::122::health::(_check_resources) user=4.18%, sys=1.87%, rss=586584 kB (+0), threads=52

I would like to see the logs since vdsm was started - do you have them?

Also, can you describe the workload on this hypervisor?

- how many vms are running at the same time
- how many vms are started and stopped per hour
- using default vdsm.conf? if not, please attach your conf

Nir
Thanks
Federico
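The first/last comparison done with grep above can also be sketched in Python. This is only an illustration: the regular expression assumes the "rss=<number> kB" form seen in the two quoted health lines, the function names are made up for this example, and kB is converted to MiB by dividing by 1024, as in the message above.

```python
import re

# Matches the rss field of the health monitor lines quoted above,
# e.g. "... rss=567640 kB (+44), threads=57".
RSS_RE = re.compile(r"rss=(\d+) kB")

# The two log lines quoted in the thread, used as sample input.
SAMPLE = [
    "Thread-33::DEBUG::2016-08-30 12:01:43,845::health::122::health::"
    "(_check_resources) user=1.73%, sys=1.65%, rss=567640 kB (+44), threads=57",
    "Thread-33::DEBUG::2016-08-31 13:00:36,913::health::122::health::"
    "(_check_resources) user=4.18%, sys=1.87%, rss=586584 kB (+0), threads=52",
]


def rss_samples(lines):
    """Yield the rss value in kB from every line that carries one."""
    for line in lines:
        match = RSS_RE.search(line)
        if match:
            yield int(match.group(1))


def rss_growth(lines):
    """Return (first_kb, last_kb, growth_mib), or None without samples."""
    samples = list(rss_samples(lines))
    if not samples:
        return None
    first, last = samples[0], samples[-1]
    return first, last, (last - first) / 1024.0


if __name__ == "__main__":
    print(rss_growth(SAMPLE))  # (567640, 586584, 18.5)
```

To run it on a real log, pass the open file instead of SAMPLE, for example rss_growth(open("vdsm-leak.log")). Lines without an rss field are skipped.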

On Wed, Aug 31, 2016 at 9:06 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Aug 31, 2016 at 8:06 PM, Federico Alberto Sayd <fsayd@uncu.edu.ar> wrote:
Hello Nir:
I followed your instructions, added the config file, restarted vdsm, and today I have the vdsm logs from a host:
https://drive.google.com/file/d/0ByrwZ1AkYuyeR1hmRm90a1R6MEk/view?usp=sharing
Please tell me if you see anything related to the memory issue.
These logs start when vdsm is using 567640 kB (554 MiB) - very unusual.
The memory usage grew by 18 MiB during one day. No garbage collection issues. This smells like we keep some data forever for no reason.
$ grep rss= vdsm-leak.log | head -n 1
Thread-33::DEBUG::2016-08-30 12:01:43,845::health::122::health::(_check_resources) user=1.73%, sys=1.65%, rss=567640 kB (+44), threads=57
$ grep rss= vdsm-leak.log | tail -n 1
Thread-33::DEBUG::2016-08-31 13:00:36,913::health::122::health::(_check_resources) user=4.18%, sys=1.87%, rss=586584 kB (+0), threads=52
I would like to see the logs since vdsm was started - do you have them?
Also, can you describe the workload on this hypervisor?
- how many vms are running at the same time
- how many vms are started and stopped per hour
- using default vdsm.conf? if not, please attach your conf
I could reproduce a similar leak in master - it seems that we leak about 1 MiB for each vm started and stopped. I opened this bug:

https://bugzilla.redhat.com/1372205

Please check if this bug matches your issue. If it does, please add your logs and other info to this bug.

Thanks,
Nir
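As a rough way to check whether this bug matches a given host, the observed growth can be compared with the number of vms started and stopped in the same period. The sketch below only does that arithmetic. It assumes the whole growth comes from this one leak, uses the "about 1 MiB" per cycle figure from the message above as a default, and the function name is made up for the example.

```python
KIB_PER_MIB = 1024


def estimated_cycles(growth_kb, leak_mib_per_cycle=1.0):
    """Estimate the vm start/stop cycles that would explain the growth.

    Assumes every leaked byte comes from the start/stop leak, which is
    a simplification; treat the result as an order of magnitude only.
    """
    return growth_kb / KIB_PER_MIB / leak_mib_per_cycle


# Growth between the first and last health samples quoted in the thread.
growth_kb = 586584 - 567640
print(estimated_cycles(growth_kb))  # 18.5
```

If the host started and stopped roughly that many vms during the logged day, the numbers are at least consistent with the bug; a very different count would suggest another cause.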
Participants (2): Federico Alberto Sayd, Nir Soffer