
Dan, I am replying just to the list since I do not want to clutter the BZ: while migrating VMs is easy (and the sampling is already running), can someone tell me the correct polling port to block with iptables? Thanks,
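For what it's worth, what I have in mind is roughly the following (untested, and it assumes the default VDSM management port, TCP 54321, is the one Engine polls, which is exactly what I would like to have confirmed):

    # Assumption: Engine polls VDSM on its default management port, TCP 54321.
    # Reject new inbound connections on that port so only the Engine polling stops;
    # the running VMs and their qemu-kvm processes are not touched.
    iptables -I INPUT -p tcp --dport 54321 -j REJECT --reject-with tcp-reset

    # Remove the rule again once the test is done:
    iptables -D INPUT -p tcp --dport 54321 -j REJECT --reject-with tcp-reset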
On 27.09.2014 13:18, Daniel Helgenberger wrote:
Hello Dan,
I just opened a BZ against this [1]. Please let me know if I can be of further assistance. I stopped the cron script for now, so vdsm can 'grow nicely' (about 91.5 MB per 6 hrs).
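For anyone wanting a similar stop-gap, a minimal sketch of such a cron job could look like the one below; the script path, the 2 GB threshold and the pgrep match are illustrative assumptions, not the actual script:

    #!/bin/bash
    # Hypothetical /etc/cron.hourly/vdsm-rss-guard: restart vdsmd once its
    # resident set size passes a chosen limit (here 2 GB = 2097152 kB).
    LIMIT_KB=2097152
    PID=$(pgrep -xo vdsm)                                      # main vdsm process
    RSS_KB=$(awk '/^VmRSS:/ {print $2}' /proc/"$PID"/status)
    if [ -n "$RSS_KB" ] && [ "$RSS_KB" -gt "$LIMIT_KB" ]; then
        logger "vdsm RSS ${RSS_KB} kB exceeds ${LIMIT_KB} kB, restarting vdsmd"
        service vdsmd restart
    fi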
Cheers, [1] https://bugzilla.redhat.com/show_bug.cgi?id=1147148
On 24.09.2014 19:20, Dan Kenigsberg wrote:
On 01.09.2014 18:08, Dan Kenigsberg wrote:
On Mon, Sep 01, 2014 at 03:30:53PM +0000, Daniel Helgenberger wrote:
Hello,
in my lab cluster I frequently run into OOM conditions because of a huge VDSM process. These are the memory stats from my nodes right now:
Node A, running one VM:
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 3465 vdsm  0 -20 18.6g 9.8g 8244 S 32.1 50.3 27265:21 vdsm
 7439 qemu 20   0 5641m 4.1g 4280 S 22.9 20.9 12737:08 qemu-kvm
 2912 root 15  -5 2710m  35m 5968 S  0.0  0.2  0:04.76 supervdsmServer
Node B, running 3 VMs including HostedEngine:
  PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 9079 vdsm  0 -20  9.9g 5.0g 7496 S 49.7 43.0  11858:06 vdsm
 3347 qemu 20   0 7749m 1.8g 5264 S  4.3 15.8   3:25.71 qemu-kvm
18463 qemu 20   0 3865m 415m 5516 R  1.6  3.5 359:15.24 qemu-kvm
11755 qemu 20   0 3873m 276m 5336 S 80.5  2.3  21639:39 qemu-kvm
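(Aside: to put a number on the growth, a simple sampling loop along these lines does the job; the log path and the 5-minute interval are arbitrary, and it assumes the process shows up as 'vdsm' exactly as in the top output above.)

    # Log a timestamped RSS sample (in kB) for the vdsm process every 5 minutes.
    while true; do
        echo "$(date +'%F %T') $(ps -o rss= -C vdsm | head -n1)" >> /tmp/vdsm-rss.log
        sleep 300
    done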
Basically, VDSM consumes more memory than all my VMs combined.
I thought of VDSM as a 'supervisor' process for qemu-kvm?
I attached recent vdsm logs as well as a screenshot.
Thanks for your report. It sounds like
Bug 1130045 - Very high memory consumption
which we believe is due to python-ioprocess.
On Wed, Sep 24, 2014 at 03:39:25PM +0000, Daniel Helgenberger wrote:
Hi Dan,
just to get this right, you yourself pointed me to the BZ. I was looking at the duplicate, since the metadata tags 3.4. Sorry for my lack of knowledge, I really have no idea whether there is an ioprocess python binding in 3.4 or not - I just see the vdsmd resident size growing in 3.4. The top output below was from 3.4.3; I just upgraded to 3.4.4. But clearly vdsmd should not use 10 GB of RAM?
I'm sorry to have misled you. The bug I referred to was indeed due to ioprocess, and caused a very dramatic memory leak in 3.5. We have yet another memory leak in 3.5.0, when managing gluster blocks:
Bug 1142647 - supervdsm leaks memory when using glusterfs
3.4.z does not use ioprocess and does not have that gluster bug, so you are seeing something completely different and much older.
These leaks are not so easy to debug - but they are important. I'd love it if you opened a BZ about it. Please specify the rate of the leak and when it happens (when the host has VMs? when the host is polled by Engine? on NFS? on iSCSI?).
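(If it helps, assuming samples were collected with the 5-minute loop sketched earlier, a rough average rate in MB/hour can be pulled out of the log like this:)

    # Average growth in MB/hour over the whole log, given one sample every 300 s
    # in the "date time rss_kb" format written by the sampling loop above.
    awk 'NR==1 {first=$3} {last=$3; n=NR}
         END {if (n > 1) printf "%.1f MB/hour\n",
                  (last-first)/1024 / ((n-1)*300/3600)}' /tmp/vdsm-rss.log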
What's `lsof -p pid-of-fat-vdsm`?
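(For the record, something along these lines should capture what Dan is after; the output file names are arbitrary, and 'vdsm' as the process name is taken from the top output above.)

    # Capture the open files of the main vdsm process and count its descriptors;
    # -n and -P skip DNS and port-name lookups so the listing stays quick.
    PID=$(pgrep -xo vdsm)
    lsof -n -P -p "$PID" > /tmp/vdsm-lsof.txt
    wc -l /tmp/vdsm-lsof.txt
    ls /proc/"$PID"/fd | wc -l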
I think that Francesco has some debug patches to help nail it down - he can provide smarter questions.
Dan.
--
Daniel Helgenberger
m box bewegtbild GmbH
P: +49/30/2408781-22
F: +49/30/2408781-10
ACKERSTR. 19
D-10115 BERLIN
www.m-box.de  www.monkeymen.tv
Managing directors: Martin Retschitzegger / Michaela Göllner
Commercial register: Amtsgericht Charlottenburg / HRB 112767