Dan,
I am just replying to the list since I do not want to clutter the BZ:
While migrating VMs is easy (and the sampling is already running), can
someone tell me the correct polling port to block with iptables?
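My guess would be the vdsm management port, 54321/tcp (the one Engine
connects to); if that is indeed the right one, something like this should
do - please correct me if there is a separate polling port:

  # drop Engine's polling traffic to vdsm (assuming the default
  # management port 54321/tcp); remove the rule again with -D
  iptables -I INPUT -p tcp --dport 54321 -j DROP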
Thanks,
On 27.09.2014 13:18, Daniel Helgenberger wrote:
Hello Dan,
I just opened a BZ against this [1]. Please let me know if I can be of
further assistance. I stopped the cron script for now, so vdsm can 'grow
nicely' (about 91.5 MB per 6 hrs).
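The growth figure is simply from sampling the daemon's resident size over
time; for anyone who wants to compare numbers, something as small as this,
run periodically from cron, is enough (command name and log path are just
a sketch):

  # append a timestamped resident-size sample (in kB) for the vdsm process
  echo "$(date) $(ps -o rss= -C vdsm)" >> /var/log/vdsm-rss.log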
Cheers,
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1147148
On 24.09.2014 19:20, Dan Kenigsberg wrote:
>> On 01.09.2014 18:08, Dan Kenigsberg wrote:
>>> On Mon, Sep 01, 2014 at 03:30:53PM +0000, Daniel Helgenberger wrote:
>>>> Hello,
>>>>
>>>> In my lab cluster I frequently run into OOM conditions because of a
>>>> huge VDSM process. The memory stats from my nodes right now:
>>>>
>>>> Node A, running one VM:
>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>> 3465 vdsm 0 -20 18.6g 9.8g 8244 S 32.1 50.3 27265:21 vdsm
>>>> 7439 qemu 20 0 5641m 4.1g 4280 S 22.9 20.9 12737:08 qemu-kvm
>>>> 2912 root 15 -5 2710m 35m 5968 S 0.0 0.2 0:04.76 supervdsmServer
>>>>
>>>> Node B, running 3 VMs including HostedEngine:
>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>> 9079 vdsm 0 -20 9.9g 5.0g 7496 S 49.7 43.0 11858:06 vdsm
>>>> 3347 qemu 20 0 7749m 1.8g 5264 S 4.3 15.8 3:25.71 qemu-kvm
>>>> 18463 qemu 20 0 3865m 415m 5516 R 1.6 3.5 359:15.24 qemu-kvm
>>>> 11755 qemu 20 0 3873m 276m 5336 S 80.5 2.3 21639:39 qemu-kvm
>>>>
>>>> Basically, VDSM consumes more than all my VMs combined.
>>>>
>>>> I thought of VDSM as a 'supervisor' process for qemu-kvm?
>>>>
>>>> I attached recent vdsm logs as well as a screenshot.
>>> Thanks for your report. It sounds like
>>>
>>> Bug 1130045 - Very high memory consumption
>>>
>>> which we believe is due to python-ioprocess.
> On Wed, Sep 24, 2014 at 03:39:25PM +0000, Daniel Helgenberger wrote:
>> Hi Dan,
>>
>> Just to get this right, you yourself pointed me to the BZ. I was looking
>> at the duplicate, since its metadata is tagged 3.4. Sorry for my lack of
>> knowledge, I really have no idea whether there is an ioprocess python
>> binding in 3.4 or not - I just see vdsmd's resident size growing in 3.4.
>> The top output below was from 3.4.3; I just upgraded to 3.4.4. But
>> clearly vdsmd should not use 10 GB of RAM?
> I'm sorry to have misled you. The bug I referred to was indeed due to
> ioprocess, and caused a very dramatic memory leak in 3.5. We have yet
> another memory leak in 3.5.0, when managing gluster blocks:
> Bug 1142647 - supervdsm leaks memory when using glusterfs
>
> 3.4.z does not use ioprocess, and does not have that gluster bug, so you
> are seeing something completely different and much older.
>
> These leaks are not so easy to debug - but they are important. I'd love
> it if you opened a BZ about it. Please specify the rate of the leak and
> when it happens (when a host has VMs? when a host is polled by Engine?
> On NFS? iSCSI?).
>
> What's `lsof -p pid-of-fat-vdsm`?
>
> I think that Francesco has some debug patches to help nail it down - he
> can provide smarter questions.
>
> Dan.
>
>
--
Daniel Helgenberger
m box bewegtbild GmbH
P: +49/30/2408781-22
F: +49/30/2408781-10
ACKERSTR. 19
D-10115 BERLIN
Managing Directors: Martin Retschitzegger / Michaela Göllner
Commercial Register: Amtsgericht Charlottenburg / HRB 112767