[Users] management server very slow lately

Fri Mar 22 20:58:34 UTC 2013

is it ok to restart the engine at any time, or should i be prepared for a
maintenance window?

this manager has 12 hosts, and about 75 VMs.  we are running 3.1, dreyou's
EL6 packages.

[jhorne@d0lppc021 ~]$ rpm -qa|grep ovirt
ovirt-engine-restapi-3.1.0-3.19.el6.noarch
ovirt-engine-sdk-3.1.0.5-1.el6.noarch
ovirt-engine-backend-3.1.0-3.19.el6.noarch
ovirt-engine-tools-common-3.1.0-3.19.el6.noarch
ovirt-log-collector-3.1.0-16.el6.noarch
ovirt-image-uploader-3.1.0-16.el6.noarch
ovirt-engine-setup-3.1.0-3.19.el6.noarch
ovirt-engine-config-3.1.0-3.19.el6.noarch
ovirt-iso-uploader-3.1.0-16.el6.noarch
ovirt-engine-webadmin-portal-3.1.0-3.19.el6.noarch
ovirt-engine-genericapi-3.1.0-3.19.el6.noarch
ovirt-engine-3.1.0-3.19.el6.noarch
ovirt-engine-cli-3.1.0.7-1.el6.noarch
ovirt-engine-userportal-3.1.0-3.19.el6.noarch
ovirt-engine-notification-service-3.1.0-3.19.el6.noarch
ovirt-engine-jbossas711-1-0.x86_64
ovirt-engine-dbscripts-3.1.0-3.19.el6.noarch

thanks,
jonathan

On 3/22/13 10:05 AM, "Juan Hernandez" <jhernand at redhat.com> wrote:

>On 03/22/2013 02:54 PM, Jonathan Horne wrote:
>> top - 08:53:38 up 70 days, 16:31,  1 user,  load average: 0.40, 0.34,
>>0.32
>> Tasks: 432 total,   1 running, 431 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  1.3%us,  0.1%sy,  0.0%ni, 98.6%id,  0.0%wa,  0.0%hi,  0.0%si,
>>0.0%st
>> Mem:  32876240k total, 18653508k used, 14222732k free,   522432k buffers
>> Swap:  2097144k total,     4528k used,  2092616k free,  6270908k cached
>>
>>    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>   2121 ovirt     20   0 12.9g 7.7g  18m S  9.0 24.6  16539:08 java
>>
>
>This is not normal at all. First thing that is strange is that your
>engine is taking 7.7 GiB of RAM, which it should never take, as it is by
>default limited to 1 GiB. Did you assign more memory to the engine on
>purpose? How much? If you assign a lot of memory it can start to consume
>a lot of CPU just for garbage collection. You may want to enable verbose
>garbage collection adding this to /etc/sysconfig/ovirt-engine (or
>/etc/ovirt-engine/engine.conf if you are using the latest source code):
>
>   ENGINE_VERBOSE_GC=true
>
>Then restart the engine and it will start to dump garbage collection
>statistics to /var/log/ovirt-engine/console.log. The garbage collection
>should be quite silent in a low-activity system.
>
>We used to have a bug that caused the max amount of memory not to be
>correctly limited, but it was fixed long ago:
>
>   http://gerrit.ovirt.org/7952
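
One way to answer the question above about how much memory the engine was actually given is to look at the heap options on the command line of the running java process. This is only a sketch: heap_flags is a hypothetical helper name, and it assumes the limit is passed with the standard JVM options -Xms/-Xmx and that the pid file is the /var/run/ovirt-engine.pid mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the JVM heap options (-Xms/-Xmx) found in a
# command line string passed as the first argument.
heap_flags() {
    # Put each argument on its own line, then keep only the heap options.
    printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^-Xm[sx]'
}

# Example against the running engine (pid file path assumed from this thread):
#   heap_flags "$(ps -o args= -p "$(cat /var/run/ovirt-engine.pid)")"
```

If nothing is printed, the process was started without an explicit heap limit on its command line.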
>
>The other thing that seems strange is the amount of CPU that it is
>consuming. Do you have many hosts managed by that engine? In an
>otherwise idle environment the CPU consumption is caused by the periodic
>polls of the hosts, one every two seconds by default. If you see the
>engine continually using a significant amount of CPU (in the output
>of top above it is 9%) it could be useful to get a snapshot of the
>stacks of threads, to see which threads in particular are consuming the
>CPU. Send the QUIT signal to the engine process and it will dump the
>stacks of the threads to /var/log/ovirt-engine/console.log:
>
>   # kill -3 $(cat /var/run/ovirt-engine.pid)
>
>Once you have that dump you can check which thread is consuming the CPU
>as follows:
>
>1. Get the IDs (TIDs) of the threads of the engine together with their
>use of CPU:
>
>   # ps -L -u ovirt -o tid,pcpu
>
>2. If you see one of them consuming a high amount of CPU time then try
>to find it in the stack dump generated in
>/var/log/ovirt-engine/console.log. Let's assume that the thread ID is
>13397, for example, translate it to hex:
>
>   # printf "%04x\n" 13397
>   3455
>
>3. Then look in /var/log/ovirt-engine/console.log for a line containing
>"nid=0x3455". There you will find the stack trace of that thread,
>something like this:
>
>   "ajp-/127.0.0.1:8702-Acceptor-0" daemon prio=10
>tid=0x00007f41e0220800 nid=0x3493 runnable [0x00007f41dbdf2000]
>    java.lang.Thread.State: RUNNABLE
>         ...
>
>Most threads will be waiting, but if you find one thread that is
>consistently RUNNABLE then there is probably an issue. The dump of the
>stack of that thread can help to find out what it is doing and why it is
>consuming the CPU.
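
The lookup in steps 2 and 3 above can be sketched as a small shell function. This is a minimal sketch, not a tested tool: find_thread_stack is a hypothetical name, and it assumes that each thread in the dump starts with a header line containing nid=0x... and that the thread's stack ends at the next blank line.

```shell
#!/bin/sh
# Hypothetical helper: print the stack of one thread from a JVM thread dump.
# Usage: find_thread_stack TID [DUMP_FILE]
find_thread_stack() {
    tid="$1"
    dump="${2:-/var/log/ovirt-engine/console.log}"
    # The dump shows the native thread id in hex, e.g. nid=0x3455.
    nid=$(printf 'nid=0x%x' "$tid")
    # Print from the header line with that nid up to the next blank line.
    # The trailing space keeps nid=0x3455 from matching nid=0x34551.
    awk -v nid="$nid" '
        index($0 " ", nid " ") { show = 1 }
        show && NF == 0 { exit }
        show { print }
    ' "$dump"
}
```

After sending the QUIT signal, running "find_thread_stack 13397" with a thread ID taken from the ps output should print that thread's header line and stack, if it is present in the dump.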
>
>>
>> I don't have a lot of experience with jboss, so I'm not sure if that's
>>good or bad.  I did the jboss restart, and that helped a little, but it's
>>still a little sluggish again, now a few days later.
>>
>> Thanks,
>>
>> -----Original Message-----
>> From: Itamar Heim [mailto:iheim at redhat.com]
>> Sent: Friday, March 15, 2013 6:32 AM
>> To: Jonathan Horne
>> Cc: users at ovirt.org
>> Subject: Re: [Users] management server very slow lately
>>
>> On 03/13/2013 08:51 PM, Jonathan Horne wrote:
>>> Hello, lately my manager server web interface is extremely sluggish.
>>> Perhaps the server is ready for a reboot?
>>>
>>> My management server is also the host of my NFS export and ISO mounts.
>>> Is there a prescribed method for rebooting when I am also providing
>>> NFS services from the management server?  My assumption is that aside
>>> from NFS, I should be able to reboot the management server and the
>>> nodes and virtual machines will be fine in the meantime?
>>
>> what's the cpu consumption of your ovirt-engine service (java process).
>> cpu load on the engine? memory/swap state of the engine, etc
>>
>>
>> ________________________________
>> This is a PRIVATE message. If you are not the intended recipient,
>>please delete without copying and kindly advise us by e-mail of the
>>mistake in delivery. NOTE: Regardless of content, this e-mail shall not
>>operate to bind SKOPOS to any order or other contract unless pursuant to
>>explicit written agreement or government initiative expressly permitting
>>the use of e-mail for such purpose.
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>
>
>--
>Business address: C/Jose Bardasano Baos, 9, Edif. Gorbea 3, planta
>3ºD, 28016 Madrid, Spain
>Registered in the Madrid Mercantile Registry  C.I.F. B82657941 - Red Hat S.L.
