[ovirt-users] 3.4: VDSM Memory consumption

Dan Kenigsberg danken at redhat.com
Wed Sep 24 17:20:06 UTC 2014


> On 01.09.2014 18:08, Dan Kenigsberg wrote:
> > On Mon, Sep 01, 2014 at 03:30:53PM +0000, Daniel Helgenberger wrote:
> >> Hello,
> >>
> >> in my LAB cluster I run into OOM conditions frequently because of a huge
> >> VDSM process. The memory stats from my nodes right now:
> >>
> >> Node A; running one VM:
> >>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
> >>  3465 vdsm       0 -20 18.6g 9.8g 8244 S 32.1 50.3  27265:21 vdsm                                                                                                                                                  
> >>  7439 qemu      20   0 5641m 4.1g 4280 S 22.9 20.9  12737:08 qemu-kvm                                                                                                                                              
> >>  2912 root      15  -5 2710m  35m 5968 S  0.0  0.2   0:04.76 supervdsmServer 
> >>
> >> Node B, running 3 VMs including HosedEngine:
> >>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
> >> 9079 vdsm       0 -20  9.9g 5.0g 7496 S 49.7 43.0  11858:06 vdsm                                                                                                                                                  
> >>  3347 qemu      20   0 7749m 1.8g 5264 S  4.3 15.8   3:25.71 qemu-kvm                                                                                                                                              
> >> 18463 qemu      20   0 3865m 415m 5516 R  1.6  3.5 359:15.24 qemu-kvm                                                                                                                                              
> >> 11755 qemu      20   0 3873m 276m 5336 S 80.5  2.3  21639:39 qemu-kvm
> >>
> >> Basically VDSM consumes more than all my VMs.
> >>
> >> I thought of VDSM as a 'supervisor' process for qemu-kvm?
> >>
> >> I attached recent vdsm logs as well as a screenshot.
> > Thanks for your report. It sounds like
> >
> >     Bug 1130045 - Very high memory consumption
> >
> > which we believe is due to python-ioprocess.

On Wed, Sep 24, 2014 at 03:39:25PM +0000, Daniel Helgenberger wrote:
> Hi Dan,
> 
> just to get this right, you yourself pointed me to the BZ. I was looking
> at the duplicate, since its metadata is tagged 3.4. Sorry for my lack of
> knowledge - I really have no idea whether there is an ioprocess python
> binding in 3.4 or not; I just see vdsmd resident size growing in 3.4.
> The top output below was from 3.4.3; I just upgraded to 3.4.4. But
> clearly vdsmd should not use 10GB RAM?

I'm sorry to have misled you. The bug I referred to was indeed due to
ioprocess, and caused a very dramatic memory leak in 3.5. We have yet
another memory leak in 3.5.0, when managing gluster blocks:
    Bug 1142647 - supervdsm leaks memory when using glusterfs

3.4.z does not use ioprocess and does not have that gluster bug, so you
are seeing something completely different and much older.

These leaks are not so easy to debug - but they are important. I'd love
it if you opened a BZ about it. Please specify the rate of the leak and
when it happens (when a host has VMs? when a host is polled by Engine?
On nfs? On iscsi?).
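
For the rate, something along these lines would do - just a sketch, and
the pgrep pattern is an assumption about how the main vdsm process shows
up on your hosts (it appears as plain "vdsm" in your top output):

    # log vdsm's resident size (VmRSS, in kB) once a minute;
    # adjust the pgrep pattern if your vdsm process is named differently
    while true; do
        pid=$(pgrep -xo vdsm)
        echo "$(date +%s) $(awk '/VmRSS/ {print $2}' /proc/$pid/status)"
        sleep 60
    done >> /var/tmp/vdsm-rss.log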

What does `lsof -p <pid-of-fat-vdsm>` show?
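
For instance (again a sketch, assuming the process is named "vdsm"):

    # dump the open files of the fat vdsm process and count them;
    # a steadily growing count would hint at a file-descriptor leak
    pid=$(pgrep -xo vdsm)
    lsof -p "$pid" > /var/tmp/vdsm-lsof.txt
    wc -l /var/tmp/vdsm-lsof.txt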

I think that Francesco has some debug patches to help nail it down - he
can provide smarter questions.

Dan.



