[ovirt-users] VDSM memory consumption
Dan Kenigsberg
danken at redhat.com
Thu Mar 26 11:42:13 UTC 2015
On Wed, Mar 25, 2015 at 01:29:25PM -0500, Darrell Budic wrote:
>
> > On Mar 25, 2015, at 5:34 AM, Dan Kenigsberg <danken at redhat.com> wrote:
> >
> > On Tue, Mar 24, 2015 at 02:01:40PM -0500, Darrell Budic wrote:
> >>
> >>> On Mar 24, 2015, at 4:33 AM, Dan Kenigsberg <danken at redhat.com> wrote:
> >>>
> >>> On Mon, Mar 23, 2015 at 04:00:14PM -0400, John Taylor wrote:
> >>>> Chris Adams <cma at cmadams.net> writes:
> >>>>
> >>>>> Once upon a time, Sven Kieske <s.kieske at mittwald.de> said:
> >>>>>> On 13/03/15 12:29, Kapetanakis Giannis wrote:
> >>>>>>> We also face this problem since 3.5 in two different installations...
> >>>>>>> Hope it's fixed soon
> >>>>>>
> >>>>>> Nothing will get fixed if no one bothers to
> >>>>>> open BZs and send relevant log files to help
> >>>>>> track down the problems.
> >>>>>
> >>>>> There's already an open BZ:
> >>>>>
> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1158108
> >>>>>
> >>>>> I'm not sure if that is exactly the same problem I'm seeing or not; my
> >>>>> vdsm process seems to be growing faster (RSS grew 952K in a 5 minute
> >>>>> period just now; VSZ didn't change).
> >>>>
> >>>> For those following this I've added a comment on the bz [1], although in
> >>>> my case the memory leak is, like Chris Adams, a lot more than the 300KiB/h
> >>>> in the original bug report by Daniel Helgenberger.
> >>>>
> >>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1158108
> >>>
> >>> That's interesting (and worrying).
> >>> Could you check your suggestion by editing sampling.py so that
> >>> _get_interfaces_and_samples() returns the empty dict immediately?
> >>> Would this make the leak disappear?
> >>
> >> Looks like you’ve got something there. Just a quick test for now, watching RSS in top. I’ll let it run this way for a while and see what it looks like in a few hours.
> >>
> >> System 1: 13 VMs w/ 24 interfaces between them
> >>
> >> 11:47 killed a vdsm @ 9.116G RSS (after maybe a week and a half running)
> >>
> >> 11:47: 97xxx
> >> 11:57 135544 and climbing
> >> 12:00 136400
> >>
> >> restarted with sampling.py modified to just return an empty dict:
> >>
> >> def _get_interfaces_and_samples():
> >>     links_and_samples = {}
> >>     return links_and_samples
> >
> > Thanks for the input. Just to be a little more certain that the culprit
> > is _get_interfaces_and_samples() per se, would you please decorate it
> > with memoized, and add a log line at the end
> >
> > @utils.memoized  # add this line
> > def _get_interfaces_and_samples():
> >     ...
> >     logging.debug('LINKS %s', links_and_samples)  ## and this line
> >     return links_and_samples
> >
> > I'd like to see what happens when the function is run only once, and
> > returns a non-empty reasonable dictionary of links and samples.
>
> Looks similar, I modified my second server for this test:
Thanks again. Would you be so kind as to search further?
Does the following script leak anything on your host when placed in your
/usr/share/vdsm?
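#!/usr/bin/python
from time import sleep
from virt.sampling import _get_interfaces_and_samples

# Call the suspect function in a tight loop, outside of vdsm itself,
# and watch whether this process's RSS keeps growing.
while True:
    _get_interfaces_and_samples()
    sleep(0.2)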
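To put numbers on the result rather than eyeballing top, the same loop could record its own memory use as it goes. This is only a rough sketch: watch_growth() and peak_rss_kib() are names made up here, not anything in vdsm, and it reports peak RSS via the standard resource module (KiB on Linux), which is good enough to see a steady climb but is not the current RSS that top shows.

```python
import resource
from time import sleep


def peak_rss_kib():
    """Peak resident set size of this process (KiB on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def watch_growth(func, iterations=1000, report_every=100, delay=0.0):
    """Call func repeatedly; return a list of (iteration, peak RSS) pairs."""
    readings = []
    for i in range(1, iterations + 1):
        func()
        if i % report_every == 0:
            readings.append((i, peak_rss_kib()))
        if delay:
            sleep(delay)
    return readings


if __name__ == '__main__':
    # Stand-in workload that leaks on purpose (hypothetical); on a host,
    # pass virt.sampling._get_interfaces_and_samples instead.
    leaked = []
    for i, rss in watch_growth(lambda: leaked.append('x' * 1024)):
        print('%d calls: peak RSS %d KiB' % (i, rss))
```

If the readings level off after the first few hundred calls, the function is probably not the leak by itself; if they keep climbing at a steady rate, it most likely is.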
Something that can be a bit harder would be to:
    # service vdsmd stop
    # su - vdsm -s /bin/bash
    # cd /usr/share/vdsm
    # valgrind --leak-check=full --log-file=/tmp/your.log vdsm
as suggested by Thomas on
https://bugzilla.redhat.com/show_bug.cgi?id=1158108#c6
Regards,
Dan.