On 30.09.2014 11:57, Piotr Kliczewski wrote:
>
>
>
> ----- Original Message -----
>> From: "Daniel Helgenberger" <daniel.helgenberger(a)m-box.de>
>> To: "Piotr Kliczewski" <pkliczew(a)redhat.com>, "Dan Kenigsberg" <danken(a)redhat.com>
>> Cc: "Francesco Romani" <fromani(a)redhat.com>, users(a)ovirt.org
>> Sent: Tuesday, September 30, 2014 11:50:28 AM
>> Subject: Re: [ovirt-users] 3.4: VDSM Memory consumption
>>
>> Hello Piotr,
>>
>> On 30.09.2014 08:37, Piotr Kliczewski wrote:
>>>
>>>
>>> ----- Original Message -----
>>>> From: "Dan Kenigsberg" <danken(a)redhat.com>
>>>> To: "Daniel Helgenberger" <daniel.helgenberger(a)m-box.de>, pkliczew(a)redhat.com
>>>> Cc: "Francesco Romani" <fromani(a)redhat.com>, users(a)ovirt.org
>>>> Sent: Tuesday, September 30, 2014 1:11:42 AM
>>>> Subject: Re: [ovirt-users] 3.4: VDSM Memory consumption
>>>>
>>>> On Mon, Sep 29, 2014 at 09:02:19PM +0000, Daniel Helgenberger wrote:
>>>>> Hello Francesco,
>>>>>
>>>>> --
>>>>> Daniel Helgenberger
>>>>> m box bewegtbild GmbH
>>>>>
>>>>> P: +49/30/2408781-22
>>>>> F: +49/30/2408781-10
>>>>> ACKERSTR. 19
>>>>> D-10115 BERLIN
>>>>>
>>>>> www.m-box.de www.monkeymen.tv
>>>>>
>>>>>> On 29.09.2014, at 22:19, Francesco Romani <fromani(a)redhat.com> wrote:
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Daniel Helgenberger" <daniel.helgenberger(a)m-box.de>
>>>>>>> To: "Francesco Romani" <fromani(a)redhat.com>
>>>>>>> Cc: "Dan Kenigsberg" <danken(a)redhat.com>, users(a)ovirt.org
>>>>>>> Sent: Monday, September 29, 2014 2:54:13 PM
>>>>>>> Subject: Re: [ovirt-users] 3.4: VDSM Memory consumption
>>>>>>>
>>>>>>> Hello Francesco,
>>>>>>>
>>>>>>>> On 29.09.2014 13:55, Francesco Romani wrote:
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Daniel Helgenberger" <daniel.helgenberger(a)m-box.de>
>>>>>>>>> To: "Dan Kenigsberg" <danken(a)redhat.com>
>>>>>>>>> Cc: users(a)ovirt.org
>>>>>>>>> Sent: Monday, September 29, 2014 12:25:22 PM
>>>>>>>>> Subject: Re: [ovirt-users] 3.4: VDSM Memory consumption
>>>>>>>>>
>>>>>>>>> Dan,
>>>>>>>>>
>>>>>>>>> I am replying just to the list since I do not want to clutter the BZ:
>>>>>>>>>
>>>>>>>>> While migrating VMs is easy (and the sampling is already running),
>>>>>>>>> can someone tell me the correct polling port to block with iptables?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>> there is indeed a memory profiling patch under discussion:
>>>>>>>>
>>>>>>>> http://gerrit.ovirt.org/#/c/32019/
>>>>>>>>
>>>>>>>> but for your case we'll need a backport to 3.4.x and clearer install
>>>>>>>> instructions, which I'll prepare as soon as possible.
>>>>>>> I updated the BZ (and am now blocking 54321/tcp on one of my hosts)
>>>>>>> and verified it is not reachable. As general info: this system I am
>>>>>>> using is my LAB / test / eval setup for a final deployment of oVirt
>>>>>>> (then 3.5) in production, so it will go away some time in the future
>>>>>>> (a few weeks / months). If I am the only one experiencing this
>>>>>>> problem, then you might be better off allocating resources elsewhere ;)
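For anyone following along, the block Daniel describes can be sketched with iptables; 54321/tcp is VDSM's management API port, while the rule position and the use of REJECT (so Engine fails fast rather than waiting on timeouts) are assumptions about a stock oVirt firewall setup:

```shell
# Sketch: cut Engine's stats polling off from one host. VDSM listens on
# 54321/tcp, so rejecting that port stops the polling without touching VMs.
VDSM_PORT=54321
BLOCK_RULE="INPUT -p tcp --dport ${VDSM_PORT} -j REJECT"
# Apply as root, inserted first so it precedes oVirt's ACCEPT rules:
#   iptables -I ${BLOCK_RULE}
# Undo once the test is over:
#   iptables -D ${BLOCK_RULE}
echo "rule under test: iptables -I ${BLOCK_RULE}"
```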
>>>>>> Thanks for your understanding :)
>>>>>>
>>>>>> Unfortunately it is true that developer resources aren't so abundant,
>>>>>> but it is also true that memleaks should never be dismissed easily and
>>>>>> without due investigation, considering the nature and the role of VDSM.
>>>>>>
>>>>>> So, I'm all in for further investigation of this issue.
>>>>>>
>>>>>>>> As for your question: if I understood correctly what you are asking
>>>>>>>> (still catching up on the thread), and you are trying to rule out
>>>>>>>> Engine's stats polling as the cause of this bad leak, one simple way
>>>>>>>> to test is just to shut down Engine and let the VDSMs run unguarded
>>>>>>>> on the hypervisors. You'll still be able to command these VDSMs using
>>>>>>>> vdsClient, or resume control by restarting Engine.
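A quick sketch of the vdsClient usage Francesco mentions (vdsClient ships with the vdsm-cli package; '0' is the usual shorthand for localhost, and -s selects SSL, to be dropped once ssl=false):

```shell
# Drive the local VDSM directly while Engine is down (run as root on the host).
VDSM_HOST=0   # vdsClient shorthand for localhost:54321
# With ssl = true (the default):
#   vdsClient -s ${VDSM_HOST} getVdsStats
#   vdsClient -s ${VDSM_HOST} list table
# With ssl = false in vdsm.conf, drop -s:
#   vdsClient ${VDSM_HOST} getVdsStats
echo "example: vdsClient -s ${VDSM_HOST} getVdsStats"
```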
>>>>>>> As I said in my BZ comment this is not an option right now, but if I
>>>>>>> understand the matter correctly an iptables REJECT should ultimately
>>>>>>> do the same?
>>>>>> Definitely yes! Just do whatever it is more convenient for you.
>>>>>>
>>>>> As you might have already seen in the BZ comment, the leak stopped
>>>>> after blocking the port. Though this is clearly no permanent option -
>>>>> please let me know if I can be of any more assistance!
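To see whether the leak resumes once the port is reopened, it may help to log VDSM's resident set size over time; a minimal sketch (the pgrep pattern and the cron/log-path suggestion are assumptions, adjust for your host):

```shell
# Log one timestamped RSS sample (KiB) for the first vdsm process found.
sample_vdsm_rss() {
    local pid
    pid=$(pgrep -f vdsm | head -n 1)
    if [ -z "$pid" ]; then
        echo "$(date '+%F %T') vdsm not running"
        return 1
    fi
    echo "$(date '+%F %T') pid=$pid rss_kib=$(ps -o rss= -p "$pid")"
}
# e.g. save as a script and call it from cron every 5 minutes, appending
# to something like /var/log/vdsm-rss.log:
sample_vdsm_rss || true
```

A steadily growing rss_kib column with the port open, and a flat one with it blocked, would support the polling hypothesis.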
>>>> The immediate suspect in this situation is M2Crypto. Could you verify
>>>> that by re-opening the firewall and setting ssl=False in vdsm.conf?
>>>>
>>>> You should disable ssl on Engine side and restart both Engine and Vdsm
>>>> (too bad I do not recall how that's done on Engine: Piotr, can you help?).
>>>>
>>> In vdc_options table there is option EncryptHostCommunication.
>> Please confirm the following procedure is correct:
>>
>> 1. Change Postgres table value:
>> # sudo -u postgres psql -U postgres engine -c "update vdc_options set
>> option_value = 'false' where option_name = 'EncryptHostCommunication';"
>> engine=# SELECT * from vdc_options where
>> option_name='EncryptHostCommunication';
>> option_id | option_name | option_value | version
>> -----------+--------------------------+--------------+---------
>> 335 | EncryptHostCommunication | false | general
>> (1 row)
>>
>> 2. Restart engine
>> 3. On the hosts:
>> grep ssl /etc/vdsm/vdsm.conf
>> #ssl = true
>> ssl = false
>>
>> 4. Restart VDSM
>>
>> I assume I have to set 'ssl = false' on all hosts?
>>> Please set it to false and restart the engine.
>>>
> I believe that you need to update a bit more on vdsm side.
Indeed your beliefs were true; I was already running into this error while
starting vdsm:

  vdsm: Running validate_configuration
  FAILED: conflicting vdsm and libvirt-qemu tls configuration.
  vdsm.conf with ssl=False requires the following changed:
  libvirtd.conf: listen_tcp=1, auth_tcp="none",
  qemu.conf: spice_tls=0.

After changing the appropriate lines vdsm is running again.
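For the record, the settings the validator asks for look like this (file paths as on a stock EL host; note vdsm normally manages these files itself, so a host reinstall may rewrite them):

```ini
# /etc/libvirt/libvirtd.conf
listen_tcp = 1
auth_tcp = "none"

# /etc/libvirt/qemu.conf
spice_tls = 0
```

Restart libvirtd and vdsmd after editing.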
@Dan: As soon as I have more info I will update the BZ, and if this test
comes up negative I would reverse the changes?
I suppose 'Reinstalling' the host will revert this?
Yes - after you revert the config change on Engine's side.
(thanks for your help, /me is looking forward to blaming ssl)
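For completeness, a sketch of the revert once the test is over (option, table, and service names as used earlier in this thread; run the commented commands as root):

```shell
# Re-enable encrypted Engine<->VDSM communication after the test.
OPTION_NAME=EncryptHostCommunication
# On the Engine host:
#   sudo -u postgres psql engine -c \
#     "update vdc_options set option_value = 'true' where option_name = '${OPTION_NAME}';"
#   service ovirt-engine restart
# On each host: set ssl = true in /etc/vdsm/vdsm.conf, undo the
# libvirtd.conf / qemu.conf changes, then:
#   service vdsmd restart
echo "revert target: vdc_options.${OPTION_NAME} -> true"
```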