[Users] management server very slow lately

Juan Hernandez jhernand at redhat.com
Mon Mar 25 07:36:44 UTC 2013


On 03/22/2013 09:58 PM, Jonathan Horne wrote:
> is it ok to restart the engine at any time, or should i be prepared for a
> maintenance window?
>

The engine can be restarted at any time, provided that your users don't 
need to use it (via the user portal) during the few seconds it is down.

> this manager has 12 hosts, and about 75 VMs.  we are running 3.1, dreyou's
> EL6 packages.
>

Version 3.1 has the problem with the memory limit. To fix it, open the 
/usr/share/ovirt-engine/service/engine-service.py file, go to line 203 
and replace -Xms with -Xmx. The resulting lines 202 and 203 should be 
the following:

         "-Xms%s" % engineHeapMin,
         "-Xmx%s" % engineHeapMax,

Then restart the engine. After that it should never consume more than 
1 GiB of heap, which means a maximum of approximately 2 GiB of virtual 
memory and a much smaller resident set size.
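
To make the effect of that one-character typo concrete, here is a small
Python sketch. It is not the real engine-service.py code: the function
names and the 1g values are assumptions of mine, used only for
illustration. It shows that when -Xms is emitted twice the JVM never
receives an -Xmx option, so the maximum heap is not limited by the engine
and the JVM falls back to its own default, which as far as I know is
derived from the amount of physical memory.

```python
def heap_args_buggy(heap_min, heap_max):
    # Hypothetical reconstruction of the typo: -Xms is used for both
    # values, so no -Xmx option is ever passed to the JVM.
    return ["-Xms%s" % heap_min, "-Xms%s" % heap_max]


def heap_args_fixed(heap_min, heap_max):
    # After the fix the second option sets the maximum heap size.
    return ["-Xms%s" % heap_min, "-Xmx%s" % heap_max]


def has_max_heap_limit(args):
    # True if the argument list contains an explicit maximum heap size.
    return any(arg.startswith("-Xmx") for arg in args)


if __name__ == "__main__":
    # "1g" for both values is an assumed setting, not read from any file.
    for build in (heap_args_buggy, heap_args_fixed):
        args = build("1g", "1g")
        print(build.__name__, args, "limited:", has_max_heap_limit(args))
```

You can check which of the two cases applies to your running engine by
looking for -Xmx in the command line of the java process.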

Let us know if this makes it faster.
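
If the CPU usage is still high after the fix, the thread dump steps from
my earlier mail (quoted below) can be scripted. The following is only a
rough Python sketch of those steps, under some assumptions: the function
names are mine, the ps output is expected to have exactly the two columns
produced by "-o tid,pcpu", and each thread entry in console.log is assumed
to end with a blank line. It converts the thread ID with "%x" because the
goal is just to match the "nid=0x..." text in the dump.

```python
import re


def tid_to_nid(tid):
    # Build the text to search for in the dump, e.g. 13397 -> "nid=0x3455".
    return "nid=0x%x" % tid


def busiest_thread(ps_output):
    # Parse the output of `ps -L -u ovirt -o tid,pcpu` and return the
    # (tid, pcpu) pair with the highest CPU usage, or None if empty.
    rows = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header and anything that is not "<tid> <pcpu>".
        if len(fields) == 2 and fields[0].isdigit():
            rows.append((int(fields[0]), float(fields[1])))
    return max(rows, key=lambda row: row[1]) if rows else None


def find_stack(dump_text, tid):
    # Return the lines of the dump entry for the given thread, or [].
    marker = re.escape(tid_to_nid(tid)) + r"\b"
    stack = []
    for line in dump_text.splitlines():
        if stack:
            if not line.strip():  # assumed: a blank line ends the entry
                break
            stack.append(line)
        elif re.search(marker, line):
            stack.append(line)
    return stack
```

Feed busiest_thread() the text printed by ps, and find_stack() the
contents of /var/log/ovirt-engine/console.log taken right after sending
the QUIT signal.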

> [jhorne at d0lppc021 ~]$ rpm -qa|grep ovirt
> ovirt-engine-restapi-3.1.0-3.19.el6.noarch
> ovirt-engine-sdk-3.1.0.5-1.el6.noarch
> ovirt-engine-backend-3.1.0-3.19.el6.noarch
> ovirt-engine-tools-common-3.1.0-3.19.el6.noarch
> ovirt-log-collector-3.1.0-16.el6.noarch
> ovirt-image-uploader-3.1.0-16.el6.noarch
> ovirt-engine-setup-3.1.0-3.19.el6.noarch
> ovirt-engine-config-3.1.0-3.19.el6.noarch
> ovirt-iso-uploader-3.1.0-16.el6.noarch
> ovirt-engine-webadmin-portal-3.1.0-3.19.el6.noarch
> ovirt-engine-genericapi-3.1.0-3.19.el6.noarch
> ovirt-engine-3.1.0-3.19.el6.noarch
> ovirt-engine-cli-3.1.0.7-1.el6.noarch
> ovirt-engine-userportal-3.1.0-3.19.el6.noarch
> ovirt-engine-notification-service-3.1.0-3.19.el6.noarch
> ovirt-engine-jbossas711-1-0.x86_64
> ovirt-engine-dbscripts-3.1.0-3.19.el6.noarch
>
> thanks,
> jonathan
>
>
>
>
>
> On 3/22/13 10:05 AM, "Juan Hernandez" <jhernand at redhat.com> wrote:
>
>> On 03/22/2013 02:54 PM, Jonathan Horne wrote:
>>> top - 08:53:38 up 70 days, 16:31,  1 user,  load average: 0.40, 0.34,
>>> 0.32
>>> Tasks: 432 total,   1 running, 431 sleeping,   0 stopped,   0 zombie
>>> Cpu(s):  1.3%us,  0.1%sy,  0.0%ni, 98.6%id,  0.0%wa,  0.0%hi,  0.0%si,
>>> 0.0%st
>>> Mem:  32876240k total, 18653508k used, 14222732k free,   522432k buffers
>>> Swap:  2097144k total,     4528k used,  2092616k free,  6270908k cached
>>>
>>>     PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>    2121 ovirt     20   0 12.9g 7.7g  18m S  9.0 24.6  16539:08 java
>>>
>>
>> This is not normal at all. First thing that is strange is that your
>> engine is taking 7.7 GiB of RAM, which it should never take, as it is by
>> default limited to 1 GiB. Did you assign more memory to the engine on
>> purpose? How much? If you assign a lot of memory it can start to consume
>> a lot of CPU just for garbage collection. You may want to enable verbose
>> garbage collection adding this to /etc/sysconfig/ovirt-engine (or
>> /etc/ovirt-engine/engine.conf if you are using the latest source code):
>>
>>    ENGINE_VERBOSE_GC=true
>>
>> Then restart the engine and it will start to dump garbage collection
>> statistics to /var/log/ovirt-engine/console.log. The garbage collection
>> should be quite silent in a low activity system.
>>
>> We used to have a bug that caused the max amount of memory not to be
>> correctly limited, but it was fixed long ago:
>>
>>    http://gerrit.ovirt.org/7952
>>
>> The other thing that seems strange is the amount of CPU that it is
>> consuming. Do you have many hosts managed by that engine? In an
>> otherwise idle environment the CPU consumption is caused by the periodic
>> polls of the hosts, one every two seconds by default. If you see
>> the engine continually using a significant amount of CPU (in the output
>> of top above it is 9%) it could be useful to get a snapshot of the
>> stacks of threads, to see which threads in particular are consuming the
>> CPU. Send the QUIT signal to the engine process and it will dump the
>> stacks of the threads to /var/log/ovirt-engine/console.log:
>>
>>    # kill -3 $(cat /var/run/ovirt-engine.pid)
>>
>> Once you have that dump you can check which thread is consuming the CPU
>> as follows:
>>
>> 1. Get the IDs of the threads of the engine together with their use of
>> CPU:
>>
>>    # ps -L -u ovirt -o tid,pcpu
>>
>> 2. If you see one of them consuming a high amount of CPU time then try
>> to find it in the stack dump generated in
>> /var/log/ovirt-engine/console.log. Let's assume that the thread ID is
>> 13397, for example, and translate it to hex:
>>
>>    # printf "%04x\n" 13397
>>    3455
>>
>> 3. Then look in /var/log/ovirt-engine/console.log for a line containing
>> "nid=0x3455". There you will find the stack trace of that thread,
>> something like this:
>>
>>    "ajp-/127.0.0.1:8702-Acceptor-0" daemon prio=10
>> tid=0x00007f41e0220800 nid=0x3493 runnable [0x00007f41dbdf2000]
>>     java.lang.Thread.State: RUNNABLE
>>          ...
>>
>> Most threads will be waiting, but if you find one thread that is
>> consistently RUNNABLE then there is probably an issue. The dump of the
>> stack of that thread can help to find out what it is doing and why it is
>> consuming the CPU.
>>
>>>
>>> I don't have a lot of experience with jboss, so I'm not sure if that's
>>> good or bad.  I did the jboss restart, and that helped a little, but it's
>>> a little sluggish again, now a few days later.
>>>
>>> Thanks,
>>>
>>> -----Original Message-----
>>> From: Itamar Heim [mailto:iheim at redhat.com]
>>> Sent: Friday, March 15, 2013 6:32 AM
>>> To: Jonathan Horne
>>> Cc: users at ovirt.org
>>> Subject: Re: [Users] management server very slow lately
>>>
>>> On 03/13/2013 08:51 PM, Jonathan Horne wrote:
>>>> Hello, lately my manager server web interface is extremely sluggish.
>>>> Perhaps the server is ready for a reboot?
>>>>
>>>> My management server is also the host of my NFS export and ISO mounts.
>>>> Is there a prescribed method for rebooting when I am also providing
>>>> NFS services from the management server?  My assumption is that aside
>>>> from NFS, I should be able to reboot the management server and the
>>>> nodes and virtual machines will be fine in the meantime?
>>>
>>> what's the cpu consumption of your ovirt-engine service (java process).
>>> cpu load on the engine? memory/swap state of the engine, etc
>>>
>>>
>>> ________________________________
>>> This is a PRIVATE message. If you are not the intended recipient,
>>> please delete without copying and kindly advise us by e-mail of the
>>> mistake in delivery. NOTE: Regardless of content, this e-mail shall not
>>> operate to bind SKOPOS to any order or other contract unless pursuant to
>>> explicit written agreement or government initiative expressly permitting
>>> the use of e-mail for such purpose.
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>
>>
>> --
>> Dirección Comercial: C/Jose Bardasano Baos, 9, Edif. Gorbea 3, planta
>> 3ºD, 28016 Madrid, Spain
>> Inscrita en el Reg. Mercantil de Madrid – C.I.F. B82657941 - Red Hat S.L.
>
>
>


-- 
Dirección Comercial: C/Jose Bardasano Baos, 9, Edif. Gorbea 3, planta 
3ºD, 28016 Madrid, Spain
Inscrita en el Reg. Mercantil de Madrid – C.I.F. B82657941 - Red Hat S.L.


