[ovirt-users] VDSM memory consumption

Dan Kenigsberg danken at redhat.com
Thu Mar 26 07:42:13 EDT 2015


On Wed, Mar 25, 2015 at 01:29:25PM -0500, Darrell Budic wrote:
> 
> > On Mar 25, 2015, at 5:34 AM, Dan Kenigsberg <danken at redhat.com> wrote:
> > 
> > On Tue, Mar 24, 2015 at 02:01:40PM -0500, Darrell Budic wrote:
> >> 
> >>> On Mar 24, 2015, at 4:33 AM, Dan Kenigsberg <danken at redhat.com> wrote:
> >>> 
> >>> On Mon, Mar 23, 2015 at 04:00:14PM -0400, John Taylor wrote:
> >>>> Chris Adams <cma at cmadams.net> writes:
> >>>> 
> >>>>> Once upon a time, Sven Kieske <s.kieske at mittwald.de> said:
> >>>>>> On 13/03/15 12:29, Kapetanakis Giannis wrote:
> >>>>>>> We also face this problem since 3.5 in two different installations...
> >>>>>>> Hope it's fixed soon
> >>>>>> 
> >>>>>> Nothing will get fixed if no one bothers to
> >>>>>> open BZs and send relevant log files to help
> >>>>>> track down the problems.
> >>>>> 
> >>>>> There's already an open BZ:
> >>>>> 
> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1158108
> >>>>> 
> >>>>> I'm not sure if that is exactly the same problem I'm seeing or not; my
> >>>>> vdsm process seems to be growing faster (RSS grew 952K in a 5 minute
> >>>>> period just now; VSZ didn't change).
> >>>> 
> >>>> For those following this I've added a comment on the bz [1], although in
> >>>> my case the memory leak is, like Chris Adams, a lot more than the 300KiB/h
> >>>> in the original bug report by Daniel Helgenberger .
> >>>> 
> >>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1158108
> >>> 
> >>> That's interesting (and worrying).
> >>> Could you check your suggestion by editing sampling.py so that
> >>> _get_interfaces_and_samples() returns the empty dict immediately?
> >>> Would this make the leak disappear?
> >> 
> >> Looks like you’ve got something there. Just a quick test for now, watching RSS in top. I’ll let it go this way for a while and see what it looks like in a few hours.
> >> 
> >> System 1: 13 VMs w/ 24 interfaces between them
> >> 
> >> 11:47 killed a vdsm @ 9.116G RSS (after maybe a week and a half running)
> >> 
> >> 11:47: 97xxx
> >> 11:57 135544 and climbing
> >> 12:00 136400
> >> 
> >> restarted with sampling.py modified to just return an empty dict:
> >> 
> >> def _get_interfaces_and_samples():
> >>    links_and_samples = {}
> >>    return links_and_samples
> > 
> > Thanks for the input. Just to be a little more certain that the culprit
> > is _get_interfaces_and_samples() per se, would you please decorate it
> > with memoized, and add a log line in the end
> > 
> > @utils.memoized   # add this line
> > def _get_interfaces_and_samples():
> >    ...
> >    logging.debug('LINKS %s', links_and_samples)  ## and this line
> >    return links_and_samples
> > 
> > I'd like to see what happens when the function is run only once, and
> > returns a non-empty reasonable dictionary of links and samples.
> 
> Looks similar, I modified my second server for this test:

Thanks again. Would you be kind enough to dig a bit further?
Does the following script leak memory on your host, when placed in your
/usr/share/vdsm?

    #!/usr/bin/python

    from time import sleep
    from virt.sampling import _get_interfaces_and_samples

    while True:
        _get_interfaces_and_samples()
        sleep(0.2)
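
If watching top by hand is tedious, the same loop could report its own RSS
as it goes. This is only a rough sketch, not part of vdsm: the helper names
(parse_vm_rss, read_rss_kib, watch_rss, main) are made up for illustration,
and it assumes a Linux host where /proc/self/status has a VmRSS line.

```python
#!/usr/bin/python
from __future__ import print_function

import time


def parse_vm_rss(status_text):
    """Return the VmRSS value (in kB) from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith('VmRSS:'):
            return int(line.split()[1])
    return None


def read_rss_kib():
    """Read the resident set size of the current process (Linux only)."""
    with open('/proc/self/status') as f:
        return parse_vm_rss(f.read())


def watch_rss(func, iterations, interval=0.2, report_every=100,
              read_rss=read_rss_kib):
    """Call func repeatedly, yielding (call_count, rss) every report_every calls."""
    yield 0, read_rss()
    for i in range(1, iterations + 1):
        func()
        time.sleep(interval)
        if i % report_every == 0:
            yield i, read_rss()


def main():
    # Assumption: started from /usr/share/vdsm, so that this import resolves.
    from virt.sampling import _get_interfaces_and_samples

    # 3000 calls at 0.2s each is roughly ten minutes of sampling.
    for calls, rss in watch_rss(_get_interfaces_and_samples, iterations=3000):
        print('%6d calls: RSS %s kB' % (calls, rss))
```

Calling main() on the host should print one line per 100 calls; a steadily
climbing RSS column would point at _get_interfaces_and_samples() itself,
while a flat one would suggest the leak is elsewhere in vdsm.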

A somewhat more involved test would be to run vdsm under valgrind:
# service vdsmd stop
# su - vdsm -s /bin/bash
# cd /usr/share/vdsm
# valgrind --leak-check=full --log-file=/tmp/your.log vdsm

as suggested by Thomas on
https://bugzilla.redhat.com/show_bug.cgi?id=1158108#c6

Regards,
Dan.

