[Users] management server very slow lately
Jonathan Horne
jhorne at skopos.us
Fri Mar 22 20:58:34 UTC 2013
is it ok to restart the engine at any time, or should i be prepared for a
maintenance window?
this manager has 12 hosts, and about 75 VMs. we are running 3.1, dreyou's
EL6 packages.
[jhorne at d0lppc021 ~]$ rpm -qa|grep ovirt
ovirt-engine-restapi-3.1.0-3.19.el6.noarch
ovirt-engine-sdk-3.1.0.5-1.el6.noarch
ovirt-engine-backend-3.1.0-3.19.el6.noarch
ovirt-engine-tools-common-3.1.0-3.19.el6.noarch
ovirt-log-collector-3.1.0-16.el6.noarch
ovirt-image-uploader-3.1.0-16.el6.noarch
ovirt-engine-setup-3.1.0-3.19.el6.noarch
ovirt-engine-config-3.1.0-3.19.el6.noarch
ovirt-iso-uploader-3.1.0-16.el6.noarch
ovirt-engine-webadmin-portal-3.1.0-3.19.el6.noarch
ovirt-engine-genericapi-3.1.0-3.19.el6.noarch
ovirt-engine-3.1.0-3.19.el6.noarch
ovirt-engine-cli-3.1.0.7-1.el6.noarch
ovirt-engine-userportal-3.1.0-3.19.el6.noarch
ovirt-engine-notification-service-3.1.0-3.19.el6.noarch
ovirt-engine-jbossas711-1-0.x86_64
ovirt-engine-dbscripts-3.1.0-3.19.el6.noarch
thanks,
jonathan
On 3/22/13 10:05 AM, "Juan Hernandez" <jhernand at redhat.com> wrote:
>On 03/22/2013 02:54 PM, Jonathan Horne wrote:
>> top - 08:53:38 up 70 days, 16:31, 1 user, load average: 0.40, 0.34, 0.32
>> Tasks: 432 total, 1 running, 431 sleeping, 0 stopped, 0 zombie
>> Cpu(s): 1.3%us, 0.1%sy, 0.0%ni, 98.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Mem: 32876240k total, 18653508k used, 14222732k free, 522432k buffers
>> Swap: 2097144k total, 4528k used, 2092616k free, 6270908k cached
>>
>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> 2121 ovirt 20 0 12.9g 7.7g 18m S 9.0 24.6 16539:08 java
>>
>
>This is not normal at all. First thing that is strange is that your
>engine is taking 7.7 GiB of RAM, which it should never take, as it is by
>default limited to 1 GiB. Did you assign more memory to the engine on
>purpose? How much? If you assign a lot of memory it can start to consume
>a lot of CPU just for garbage collection. You may want to enable verbose
>garbage collection by adding this to /etc/sysconfig/ovirt-engine (or
>/etc/ovirt-engine/engine.conf if you are using the latest source code):
>
> ENGINE_VERBOSE_GC=true
>
>Then restart the engine and it will start to dump garbage collection
>statistics to /var/log/ovirt-engine/console.log. The garbage collection
>should be quite silent on a low-activity system.
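
The configuration change above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: enable_verbose_gc is a hypothetical helper name, and the commented "service ovirt-engine restart" line assumes the EL6 init script is called ovirt-engine. The file path, setting, and log path are the ones given above.

```shell
# Hypothetical helper: make sure ENGINE_VERBOSE_GC=true is set in the
# engine sysconfig file (path taken from the message above by default).
enable_verbose_gc() {
    conf="${1:-/etc/sysconfig/ovirt-engine}"
    if grep -q '^ENGINE_VERBOSE_GC=' "$conf" 2>/dev/null; then
        # The setting already exists: force its value to true (GNU sed).
        sed -i 's/^ENGINE_VERBOSE_GC=.*/ENGINE_VERBOSE_GC=true/' "$conf"
    else
        # The setting is missing: append it.
        echo 'ENGINE_VERBOSE_GC=true' >> "$conf"
    fi
}

# Usage, as root (the service name is an assumption):
#   enable_verbose_gc
#   service ovirt-engine restart
#   tail -f /var/log/ovirt-engine/console.log
```

The helper takes the file as an optional argument so it can be tried on a copy before touching the real configuration.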
>
>We used to have a bug that caused the maximum amount of memory not to be
>correctly limited, but it was fixed long ago:
>
> http://gerrit.ovirt.org/7952
>
>The other thing that seems strange is the amount of CPU that it is
>consuming. Do you have many hosts managed by that engine? In an
>otherwise idle environment the CPU consumption is caused by the periodic
>polls of the hosts, one every two seconds by default. If you see the
>engine continually using a significant amount of CPU (in the output
>of top above it is 9%) it could be useful to get a snapshot of the
>stacks of threads, to see which threads in particular are consuming the
>CPU. Send the QUIT signal to the engine process and it will dump the
>stacks of the threads to /var/log/ovirt-engine/console.log:
>
> # kill -3 $(cat /var/run/ovirt-engine.pid)
>
>Once you have that dump you can check which thread is consuming the CPU
>as follows:
>
>1. Get the PIDs of the threads of the engine together with their use of
>CPU:
>
> # ps -L -u ovirt -o tid,pcpu
>
>2. If you see one of them consuming a high amount of CPU time then try
>to find it in the stack dump generated in
>/var/log/ovirt-engine/console.log. Let's assume that the PID is 13397,
>for example; translate it to hex:
>
> # printf "%04x\n" 13397
> 3455
>
>3. Then look in /var/log/ovirt-engine/console.log for a line containing
>"nid=0x3455". There you will find the stack trace of that thread,
>something like this:
>
> "ajp-/127.0.0.1:8702-Acceptor-0" daemon prio=10 tid=0x00007f41e0220800 nid=0x3493 runnable [0x00007f41dbdf2000]
> java.lang.Thread.State: RUNNABLE
> ...
>
>Most threads will be waiting, but if you find one thread that is
>consistently RUNNABLE then there is probably an issue. The dump of the
>stack of that thread can help to find out what it is doing and why it is
>consuming the CPU.
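
Steps 1 to 3 above can be combined into a small script. This is only a sketch under stated assumptions: busiest_tid and tid_to_nid are hypothetical helper names, and the hex value is printed without zero padding on the assumption that the JVM writes the nid that way. The ps, kill, and log path details are the ones given above.

```shell
# Hypothetical helper: read "TID %CPU" lines (as printed by
# "ps -L -u ovirt -o tid,pcpu") on stdin and print the thread ID with
# the highest %CPU value. The header line is skipped.
busiest_tid() {
    awk 'BEGIN { max = -1 }
         $1 ~ /^[0-9]+$/ && $2 + 0 > max { max = $2 + 0; tid = $1 }
         END { if (tid != "") print tid }'
}

# Hypothetical helper: turn a decimal thread ID into the "nid=0x..."
# form to search for in the thread dump.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# Usage, as root:
#   tid=$(ps -L -u ovirt -o tid,pcpu | busiest_tid)
#   kill -3 "$(cat /var/run/ovirt-engine.pid)"
#   grep -A 20 "$(tid_to_nid "$tid") " /var/log/ovirt-engine/console.log
```

The trailing space in the grep pattern keeps a short nid from matching the prefix of a longer one. Running this a few times in a row helps tell a thread that is consistently busy from one that was caught at a busy moment.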
>
>>
>> I don't have a lot of experience with jboss, so I'm not sure if that's
>>good or bad. I did the jboss restart, and that helped a little, but it's
>>still a little sluggish again, now a few days later.
>>
>> Thanks,
>>
>> -----Original Message-----
>> From: Itamar Heim [mailto:iheim at redhat.com]
>> Sent: Friday, March 15, 2013 6:32 AM
>> To: Jonathan Horne
>> Cc: users at ovirt.org
>> Subject: Re: [Users] management server very slow lately
>>
>> On 03/13/2013 08:51 PM, Jonathan Horne wrote:
>>> Hello, lately my manager server web interface is extremely sluggish.
>>> Perhaps the server is ready for a reboot?
>>>
>>> My management server is also the host of my NFS export and ISO mounts.
>>> Is there a prescribed method for rebooting when I am also providing
>>> NFS services from the management server? My assumption is that aside
>>> from NFS, I should be able to reboot the management server and the
>>> nodes and virtual machines will be fine in the meantime?
>>
>> what's the cpu consumption of your ovirt-engine service (java process).
>> cpu load on the engine? memory/swap state of the engine, etc
>>
>>
>> ________________________________
>> This is a PRIVATE message. If you are not the intended recipient,
>>please delete without copying and kindly advise us by e-mail of the
>>mistake in delivery. NOTE: Regardless of content, this e-mail shall not
>>operate to bind SKOPOS to any order or other contract unless pursuant to
>>explicit written agreement or government initiative expressly permitting
>>the use of e-mail for such purpose.
>>
>
>
>--
>Business address: C/Jose Bardasano Baos, 9, Edif. Gorbea 3, planta
>3ºD, 28016 Madrid, Spain
>Registered in the Madrid Mercantile Registry, C.I.F. B82657941 - Red Hat S.L.