[ovirt-users] memory leak in 3.5.6 - not vdsm
Nir Soffer
nsoffer at redhat.com
Fri Jan 22 22:42:58 UTC 2016
On Fri, Jan 22, 2016 at 11:30 PM, Charles Kozler <charles at fixflyer.com> wrote:
> Hi Nir -
>
> do you have a release target date for 3.5.8? Any estimate would help.
>
> If it's not VDSM, what is it exactly? Sorry, I understood from the ticket
> that it was something inside vdsm. Was I mistaken?
The bug I mentioned in my previous mail *is* a vdsm leak. This issue is not.
>
> The servers are CentOS 6, 6.7 to be exact.
>
> I have done all forms of flushing that I can (page cache, inodes, dentries,
> etc.) and also moved VMs around to other nodes, and nothing changes the
> memory. How can I find the leak? Where is the leak? RES shows the following,
> and the totals don't add up to 20GB
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
> 19044 qemu  20   0 8876m 4.0g 5680 S  3.6 12.9  1571:44 qemu-kvm
> 26143 qemu  20   0 5094m 1.1g 5624 S  9.2  3.7  6012:12 qemu-kvm
>  5837 root   0 -20  964m 624m 3664 S  0.0  2.0 85:22.09 glusterfs
> 14328 root   0 -20  635m 169m 3384 S  0.0  0.5 43:15.23 glusterfs
>  5134 vdsm   0 -20 4368m 111m  10m S  5.9  0.3  3710:50 vdsm
>  4095 root  15  -5  727m  43m  10m S  0.0  0.1  0:02.00 supervdsmServer
>
> 4.0G + 1.1G + 624M + 169M + 111M + 43M = ~7GB
>
> This was top sorted by RES from highest to lowest
Can you list *all* processes and sum the RSS of all of them?
Use something like this (the result is in kB):

    for status in /proc/*/status; do egrep '^VmRSS' "$status"; done |
        awk '{sum+=$2} END {print sum}'
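
If it helps, here is a slightly extended sketch of the same idea. It assumes a Linux /proc layout, prints the ten largest processes by RSS and gives the total in MiB, so the number can be compared with the "used" column of free -m. The function name sum_rss and its optional directory argument are only illustrative, they are not part of any existing tool. Keep in mind that summing RSS counts shared pages once per process, so the total is an upper bound rather than an exact figure.

```shell
# Sum VmRSS over all processes found under a proc-style directory
# (default: /proc). Kernel threads have no VmRSS line and are skipped.
sum_rss() {
    procdir="${1:-/proc}"
    for status in "$procdir"/[0-9]*/status; do
        # A process may exit between the glob and the read; ignore errors.
        awk '/^Name:/ {name=$2} /^VmRSS:/ {print $2, name}' "$status" 2>/dev/null
    done | sort -rn | awk '
        NR <= 10 {printf "%8d kB  %s\n", $1, $2}   # ten largest processes
        {sum += $1}                                # VmRSS values are in kB
        END {printf "total RSS: %d MiB\n", sum / 1024}'
}
```

Run it as root with a plain `sum_rss` so that every process is visible.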
> At that point I wouldn't know where else to look except slab / kernel
> structures, of which slab shows:
>
> [compute[root at node1 ~]$ cat /proc/meminfo | grep -i slab
> Slab: 2549748 kB
>
> So roughly 2-3GB. Adding that to the other use of 7GB, we still have about
> 10GB unaccounted for
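
To make this accounting repeatable, a rough sketch like the one below may help. It reads a meminfo-style file and prints the same "used" figure that free shows on its "-/+ buffers/cache" line (MemTotal - MemFree - Buffers - Cached), then subtracts Slab. Whatever remains should be covered by process RSS or by other kernel allocations. The function name meminfo_breakdown and its optional file argument are made up for illustration.

```shell
# Break down "used" memory from a meminfo-style file (default: /proc/meminfo).
meminfo_breakdown() {
    awk '
        # Store every "Key: value kB" line as kb["Key"] = value.
        { gsub(":", "", $1); kb[$1] = $2 }
        END {
            used = kb["MemTotal"] - kb["MemFree"] - kb["Buffers"] - kb["Cached"]
            printf "used (-/+ buffers/cache): %d kB\n", used
            printf "slab:                     %d kB\n", kb["Slab"]
            printf "left for processes/other: %d kB\n", used - kb["Slab"]
        }' "${1:-/proc/meminfo}"
}
```

Comparing the last line with the summed RSS of all processes shows how much memory is really unaccounted for.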
>
> On Fri, Jan 22, 2016 at 4:24 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>>
>> On Fri, Jan 22, 2016 at 11:08 PM, Charles Kozler <charles at fixflyer.com>
>> wrote:
>> > Hi Nir -
>> >
>> > Thanks for getting back to me. Will the patch to 3.6 be backported to
>> > 3.5?
>>
>> We plan to include them in 3.5.8.
>>
>> > As you can tell from the images, it takes days and days for it to
>> > increase over time. I also wasn't sure if that was the right bug
>> > because VDSM memory shows normal from top ...
>> >
>> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>> >  5134 vdsm   0 -20 4368m 111m  10m S  2.0  0.3  3709:28 vdsm
>>
>> As you wrote, this issue is not related to vdsm.
>>
>> >
>> > Res is only 111M. This is from node1, which currently shows 20GB of
>> > 32GB used with only 2 VMs running on it - 1 with 4G and another with
>> > ~1 GB of RAM configured
>> >
>> > The images are from nagios and the value here correlates directly to
>> > what you would see in the free command output. See below an example
>> > from node 1 and node 2
>> >
>> > [compute[root at node1 ~]$ free
>> >              total       used       free     shared    buffers     cached
>> > Mem:      32765316   20318156   12447160        252      30884     628948
>> > -/+ buffers/cache:   19658324   13106992
>> > Swap:     19247100          0   19247100
>> > [compute[root at node1 ~]$ free -m
>> >              total       used       free     shared    buffers     cached
>> > Mem:         31997      19843      12153          0         30        614
>> > -/+ buffers/cache:      19199      12798
>> > Swap:        18795          0      18795
>> >
>> > And its correlated image http://i.imgur.com/PZLEgyx.png (~19GB used)
>> >
>> > And as a control, node 2 that I just restarted today
>> >
>> > [compute[root at node2 ~]$ free
>> >              total       used       free     shared    buffers     cached
>> > Mem:      32765316    1815324   30949992        212      35784     717320
>> > -/+ buffers/cache:    1062220   31703096
>> > Swap:     19247100          0   19247100
>>
>> Is this rhel/centos 6?
>>
>> > [compute[root at node2 ~]$ free -m
>> >              total       used       free     shared    buffers     cached
>> > Mem:         31997       1772      30225          0         34        700
>> > -/+ buffers/cache:       1036      30960
>> > Swap:        18795          0      18795
>> >
>> > And its correlated image http://i.imgur.com/8ldPVqY.png (~2GB used).
>> > Note how 1772 in the image is exactly what is registered under 'used'
>> > in the free command
>>
>> I guess you should start looking at the processes running on these nodes.
>>
>> Maybe try to collect memory usage per process using ps?
>>
>> >
>> > On Fri, Jan 22, 2016 at 3:59 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>> >>
>> >> On Fri, Jan 22, 2016 at 9:25 PM, Charles Kozler <charles at fixflyer.com>
>> >> wrote:
>> >> > Here is a screenshot of my three nodes and their increased memory
>> >> > usage over 30 days. Note that node #2 had a single VM with 4GB of
>> >> > RAM assigned to it. I have since shut it down and saw no memory
>> >> > reclamation occur. Further, I flushed page caches and inodes and ran
>> >> > 'sync'. I tried everything but nothing brought the memory usage
>> >> > down. vdsm was low too (a couple hundred MB)
>> >>
>> >> Note that there is an old leak in vdsm, which will be fixed in the
>> >> next 3.6 build:
>> >> https://bugzilla.redhat.com/1269424
>> >>
>> >> > and there was no qemu-kvm process running so I'm at a loss
>> >> >
>> >> > http://imgur.com/a/aFPcK
>> >> >
>> >> > Please advise on what I can do to debug this. Note I have restarted
>> >> > node 2 (which is why you see the drop) to see if its memory use
>> >> > rises over time even with no VMs running
>> >>
>> >> Not sure what the "memory" shown in the graphs is. Theoretically this
>> >> may be normal memory usage, with Linux using free memory for the
>> >> buffer cache.
>> >>
>> >> Can you instead show the output of "free" during one day, maybe run
>> >> once per hour?
>> >>
>> >> You may also like to install sysstat for collecting and monitoring
>> >> resource usage.
>> >>
>> >> >
>> >> > [compute[root at node2 log]$ rpm -qa | grep -i ovirt
>> >> > libgovirt-0.3.2-1.el6.x86_64
>> >> > ovirt-release35-006-1.noarch
>> >> > ovirt-hosted-engine-ha-1.2.8-1.el6.noarch
>> >> > ovirt-hosted-engine-setup-1.2.6.1-1.el6.noarch
>> >> > ovirt-engine-sdk-python-3.5.6.0-1.el6.noarch
>> >> > ovirt-host-deploy-1.3.2-1.el6.noarch
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > Charles Kozler
>> >> > Vice President, IT Operations
>> >> >
>> >> > FIX Flyer, LLC
>> >> > 225 Broadway | Suite 1600 | New York, NY 10007
>> >> > 1-888-349-3593
>> >> > http://www.fixflyer.com
>> >> >
>> >> > NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT ONLY FOR THE INTENDED
>> >> > RECIPIENT(S)
>> >> > OF THE TRANSMISSION, AND CONTAINS CONFIDENTIAL INFORMATION WHICH IS
>> >> > PROPRIETARY TO FIX FLYER LLC. ANY UNAUTHORIZED USE, COPYING,
>> >> > DISTRIBUTION,
>> >> > OR DISSEMINATION IS STRICTLY PROHIBITED. ALL RIGHTS TO THIS
>> >> > INFORMATION
>> >> > IS
>> >> > RESERVED BY FIX FLYER LLC. IF YOU ARE NOT THE INTENDED RECIPIENT,
>> >> > PLEASE
>> >> > CONTACT THE SENDER BY REPLY E-MAIL AND PLEASE DELETE THIS E-MAIL FROM
>> >> > YOUR
>> >> > SYSTEM AND DESTROY ANY COPIES.
>> >> >
>> >> > _______________________________________________
>> >> > Users mailing list
>> >> > Users at ovirt.org
>> >> > http://lists.ovirt.org/mailman/listinfo/users
>> >> >