[ovirt-users] memory leak in 3.5.6 - not vdsm

Charles Kozler charles at fixflyer.com
Fri Jan 22 21:53:11 UTC 2016


Sandro -

Do you have any documentation available that covers upgrading a self-hosted setup?
I followed this
http://community.redhat.com/blog/2014/10/up-and-running-with-ovirt-3-5/

Would it be as easy as installing the RPM and then running yum upgrade?
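Something like this is what I have in mind, purely a guess on my part,
assuming the 3.6 release RPM follows the same naming as ovirt-release35 and
that the hosted engine needs to be in global maintenance for the upgrade:

  hosted-engine --set-maintenance --mode=global    # on one of the hosts
  # then on the engine VM (my assumption of the rough flow):
  yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm
  yum update "ovirt-engine-setup*"
  engine-setup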

Thanks

On Fri, Jan 22, 2016 at 4:42 PM, Sandro Bonazzola <sbonazzo at redhat.com>
wrote:

>
> On 22 Jan 2016 at 22:31, "Charles Kozler" <charles at fixflyer.com> wrote:
> >
> > Hi Nir -
> >
> > Do you have a target release date for 3.5.8? Any estimate would help.
> >
>
> There won't be any supported release after 3.5.6. Please update to 3.6.2
> next week.
>
> > If it's not VDSM, what is it exactly? Sorry, I understood from the ticket
> > that it was something inside vdsm; was I mistaken?
> >
> > The servers are CentOS 6, 6.7 to be exact.
> >
> > I have done all forms of flushing that I can (page cache, inodes,
> > dentries, etc.).
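> > Specifically, the flushing was along these lines (the standard
> > drop_caches interface):
> >
> > [compute[root at node1 ~]$ sync
> > [compute[root at node1 ~]$ echo 3 > /proc/sys/vm/drop_caches   # page cache + dentries/inodes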
> > I also moved VMs around to other nodes, and nothing changed the memory.
> > How can I find the leak? Where is the leak? RES shows the following, and
> > the totals don't add up to 20GB:
> >
> >    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >  19044 qemu      20   0 8876m 4.0g 5680 S  3.6 12.9   1571:44 qemu-kvm
> >  26143 qemu      20   0 5094m 1.1g 5624 S  9.2  3.7   6012:12 qemu-kvm
> >   5837 root       0 -20  964m 624m 3664 S  0.0  2.0  85:22.09 glusterfs
> >  14328 root       0 -20  635m 169m 3384 S  0.0  0.5  43:15.23 glusterfs
> >   5134 vdsm       0 -20 4368m 111m  10m S  5.9  0.3   3710:50 vdsm
> >   4095 root      15  -5  727m  43m  10m S  0.0  0.1   0:02.00 supervdsmServer
> >
> > 4.0G + 1.1G + 624M + 169M + 111M + 43M = ~6GB
> >
> > This was top sorted by RES, highest to lowest.
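> > A quick way to double-check that total, summing RSS across every process:
> >
> > [compute[root at node1 ~]$ ps -eo rss= | awk '{s+=$1} END {printf "%.1f GB\n", s/1024/1024}'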
> >
> > At that point I wouldn't know where else to look except slab / kernel
> > structures, of which slab shows:
> >
> > [compute[root at node1 ~]$ cat /proc/meminfo | grep -i slab
> > Slab:            2549748 kB
> >
> > So roughly 2-3GB of slab. Adding that to the ~6GB of process RSS above,
> > we still have roughly 11GB unaccounted for.
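> > If it helps, slabtop breaks that down per cache (one-shot, sorted by
> > cache size):
> >
> > [compute[root at node1 ~]$ slabtop -o -s c | head -15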
> >
> > On Fri, Jan 22, 2016 at 4:24 PM, Nir Soffer <nsoffer at redhat.com> wrote:
> >>
> >> On Fri, Jan 22, 2016 at 11:08 PM, Charles Kozler <charles at fixflyer.com>
> wrote:
> >> > Hi Nir -
> >> >
> >> > Thanks for getting back to me. Will the patches targeting 3.6 be
> >> > backported to 3.5?
> >>
> >> We plan to include them in 3.5.8.
> >>
> >> > As you can tell from the images, it takes days and days for it to
> >> > increase over time. I also wasn't sure that was the right bug, because
> >> > VDSM memory looks normal in top ...
> >> >
> >> >    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >> >   5134 vdsm       0 -20 4368m 111m  10m S  2.0  0.3   3709:28 vdsm
> >>
> >> As you wrote, this issue is not related to vdsm.
> >>
> >> >
> >> > RES is only 111M. This is from node1, which is currently showing 20GB
> >> > of 32GB used with only 2 VMs running on it: one with 4GB and another
> >> > with ~1GB of RAM configured.
> >> >
> >> > The images are from Nagios, and the value shown corresponds directly to
> >> > what you would see in the free command output. See below for an example
> >> > from node 1 and node 2.
> >> >
> >> > [compute[root at node1 ~]$ free
> >> >              total       used       free     shared    buffers     cached
> >> > Mem:      32765316   20318156   12447160        252      30884     628948
> >> > -/+ buffers/cache:   19658324   13106992
> >> > Swap:     19247100          0   19247100
> >> > [compute[root at node1 ~]$ free -m
> >> >              total       used       free     shared    buffers     cached
> >> > Mem:         31997      19843      12153          0         30        614
> >> > -/+ buffers/cache:      19199      12798
> >> > Swap:        18795          0      18795
> >> >
> >> > And the corresponding graph: http://i.imgur.com/PZLEgyx.png (~19GB used)
> >> >
> >> > And as a control, node 2, which I just restarted today:
> >> >
> >> > [compute[root at node2 ~]$ free
> >> >              total       used       free     shared    buffers     cached
> >> > Mem:      32765316    1815324   30949992        212      35784     717320
> >> > -/+ buffers/cache:    1062220   31703096
> >> > Swap:     19247100          0   19247100
> >>
> >> Is this RHEL/CentOS 6?
> >>
> >> > [compute[root at node2 ~]$ free -m
> >> >              total       used       free     shared    buffers     cached
> >> > Mem:         31997       1772      30225          0         34        700
> >> > -/+ buffers/cache:       1036      30960
> >> > Swap:        18795          0      18795
> >> >
> >> > And the corresponding graph: http://i.imgur.com/8ldPVqY.png (~2GB used).
> >> > Note how the 1772 in the graph is exactly what is registered under
> >> > 'used' in the free output.
> >>
> >> I guess you should start looking at the processes running on these nodes.
> >>
> >> Maybe try to collect memory usage per process using ps?
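> >> For example, something like this, sorted by resident size:
> >>
> >>     ps -eo pid,user,rss,vsz,comm --sort=-rss | head -20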
> >>
> >> >
> >> > On Fri, Jan 22, 2016 at 3:59 PM, Nir Soffer <nsoffer at redhat.com>
> wrote:
> >> >>
> >> >> On Fri, Jan 22, 2016 at 9:25 PM, Charles Kozler <
> charles at fixflyer.com>
> >> >> wrote:
> >> >> > Here is a screenshot of my three nodes and their increased memory
> >> >> > usage over 30 days. Note that node #2 had a single VM with 4GB of RAM
> >> >> > assigned to it. I had since shut it down and saw no memory reclamation
> >> >> > occur. Further, I flushed page caches and inodes and ran 'sync'. I
> >> >> > tried everything, but nothing brought the memory usage down. vdsm was
> >> >> > low too (a couple hundred MB)
> >> >>
> >> >> Note that there is an old leak in vdsm, which will be fixed in the
> >> >> next 3.6 build:
> >> >> https://bugzilla.redhat.com/1269424
> >> >>
> >> >> > and there was no qemu-kvm process running, so I'm at a loss
> >> >> >
> >> >> > http://imgur.com/a/aFPcK
> >> >> >
> >> >> > Please advise on what I can do to debug this. Note I have restarted
> >> >> > node 2 (which is why you see the drop) to see if its memory use rises
> >> >> > over time even with no VMs running
> >> >>
> >> >> I'm not sure what the "memory" you show in the graphs is. Theoretically
> >> >> this may be normal memory usage: Linux using free memory for the buffer
> >> >> cache.
> >> >>
> >> >> Can you instead show the output of "free" over one day, maybe run once
> >> >> per hour?
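> >> >> A throwaway cron entry would do, e.g. in /etc/cron.d (the log path is
> >> >> just an example):
> >> >>
> >> >>     0 * * * * root (date; free -m) >> /var/tmp/free-hourly.log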
> >> >>
> >> >> You may also like to install sysstat for collecting and monitoring
> >> >> resource usage.
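> >> >> With sysstat installed, sar keeps a memory history you can look back
> >> >> at later, e.g.:
> >> >>
> >> >>     yum install -y sysstat
> >> >>     sar -r     # memory utilization samples for today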
> >> >>
> >> >> >
> >> >> > [compute[root at node2 log]$ rpm -qa | grep -i ovirt
> >> >> > libgovirt-0.3.2-1.el6.x86_64
> >> >> > ovirt-release35-006-1.noarch
> >> >> > ovirt-hosted-engine-ha-1.2.8-1.el6.noarch
> >> >> > ovirt-hosted-engine-setup-1.2.6.1-1.el6.noarch
> >> >> > ovirt-engine-sdk-python-3.5.6.0-1.el6.noarch
> >> >> > ovirt-host-deploy-1.3.2-1.el6.noarch
> >> >> >
> >> >> >



-- 

*Charles Kozler*
*Vice President, IT Operations*

FIX Flyer, LLC
225 Broadway | Suite 1600 | New York, NY 10007
1-888-349-3593
http://www.fixflyer.com

NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT ONLY FOR THE INTENDED
RECIPIENT(S) OF THE TRANSMISSION, AND CONTAINS CONFIDENTIAL INFORMATION
WHICH IS PROPRIETARY TO FIX FLYER LLC.  ANY UNAUTHORIZED USE, COPYING,
DISTRIBUTION, OR DISSEMINATION IS STRICTLY PROHIBITED.  ALL RIGHTS TO THIS
INFORMATION IS RESERVED BY FIX FLYER LLC.  IF YOU ARE NOT THE INTENDED
RECIPIENT, PLEASE CONTACT THE SENDER BY REPLY E-MAIL AND PLEASE DELETE THIS
E-MAIL FROM YOUR SYSTEM AND DESTROY ANY COPIES.