Hi,
I wrote the tracemalloc module, which is easy to use on Python 3.4 and
newer. If you take tracemalloc snapshots while the memory usage is
growing and comparing the snapshots doesn't show anything obvious, you
can suspect memory fragmentation. But you're talking about 4 GB of
memory usage, and I don't think memory fragmentation can explain that.
Do you need my help to use tracemalloc?
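In case it helps, here is a minimal sketch of the snapshot comparison I
mean. The top_growth() helper and the list of bytes objects standing in
for a leak are made up for illustration; in a real daemon you would take
the two snapshots some time apart while the memory usage is growing.

```python
import tracemalloc


def top_growth(before, after, limit=10):
    """Return the `limit` largest per-line allocation differences.

    Hypothetical helper: compare_to() already sorts its result by the
    absolute size difference, so we only truncate the list.
    """
    return after.compare_to(before, "lineno")[:limit]


# Store up to 25 frames per allocation so tracebacks stay useful.
tracemalloc.start(25)

before = tracemalloc.take_snapshot()
# Stand-in for the leaking workload (an assumption, not vdsmd code).
leak = [bytes(1000) for _ in range(1000)]
after = tracemalloc.take_snapshot()

top = top_growth(before, after)
for stat in top:
    # Each entry shows file:line, size, size difference and block counts.
    print(stat)
```

tracemalloc can also be enabled at startup with the
PYTHONTRACEMALLOC=25 environment variable or "python3 -X tracemalloc=25",
so that allocations made before the first snapshot are traced as well.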
Quick tutorial in the official documentation:
Victor
On Fri, Nov 12, 2021 at 3:51 PM David Malcolm <dmalcolm(a)redhat.com> wrote:
On Fri, 2021-11-12 at 09:54 +0100, Sandro Bonazzola wrote:
> On Fri, Nov 12, 2021 at 09:50, Sandro Bonazzola
> <sbonazzo(a)redhat.com> wrote:
>
> >
> >
> > On Fri, Nov 12, 2021 at 09:47, Sandro Bonazzola
> > <sbonazzo(a)redhat.com> wrote:
> >
> > >
> > >
> > > On Wed, Nov 10, 2021 at 15:45, Chris Adams
> > > <cma(a)cmadams.net> wrote:
> > >
> > > > I have seen vdsmd leak memory for years (I've been running
> > > > oVirt since
> > > > version 3.5), but never been able to nail it down. I've
> > > > upgraded a
> > > > cluster to oVirt 4.4.9 (reloading the hosts with CentOS 8-
> > > > stream), and I
> > > > still see it happen. One host in the cluster, which has been
> > > > up 8 days,
> > > > has vdsmd with 4.3 GB resident memory. On a couple of other
> > > > hosts, it's
> > > > around half a gigabyte.
> > > >
> > > > In the past, it seemed more likely to happen on the hosted
> > > > engine hosts
> > > > and/or the SPM host... but the host with the 4.3 GB vdsmd is
> > > > not either
> > > > of those.
> > > >
> > > > I'm not sure what I do that would make my setup "special"
> > > > compared to
> > > > others; I loaded a pretty minimal install of CentOS 8-stream,
> > > > with the
> > > > only extra thing being I add the core parts of the Dell
> > > > PowerEdge
> > > > OpenManage tools (so I can get remote SNMP hardware
> > > > monitoring).
> > > >
> > > > When I run "pmap $(pidof -x vdsmd)", the bulk of the RAM use is
> > > > a single
> > > > anonymous block (which I'm guessing is just the python general
> > > > memory
> > > > allocator).
> > > >
> > > > I thought maybe the switch to CentOS 8 and python 3 might clear
> > > > something up, but obviously not. Any ideas?
> > > >
> > >
> > > I guess we still have the reproducibility issue (
> > > https://lists.ovirt.org/archives/list/devel@ovirt.org/thread/KO5SEPAZMLBW...
> > > ).
> > > But maybe in the meanwhile there's a new way to track things
> > > down. +Marcin
> > > Sobczyk <msobczyk(a)redhat.com> ?
> > >
> > >
> > >
> > Perhaps https://docs.python.org/3.6/library/tracemalloc.html ?
> >
>
> +David Malcolm <dmalcolm(a)redhat.com> I saw your slides on python
> memory
> leak debugging, maybe you can give some suggestions here.
I haven't worked on Python itself in > 8 years, so my knowledge is out-
of-date here.
Adding in Victor Stinner, who has worked on the CPython memory
allocators more recently, and, in particular, implemented the
tracemalloc library linked to above.
Dave