Hi Toni,
(re-adding the users(a)ovirt.org list)
On 16/11/20 at 23:37, tferic(a)swissonline.ch wrote:
Hi,
Chances are that this performance issue is unrelated to oVirt.
The first thing that comes to my mind is that the system may have a
bottleneck on the I/O.
Please have a look at:
iostat -x 5
The system may have problems with disk performance, if %util > 95 and
avgqu-sz > 3 over longer periods of time.
Actually, %util never goes beyond 85%, with an average of about 70%;
avgqu-sz averages about 25 (with peaks of 160, as far as I could see).
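
To check the two thresholds over a longer capture instead of by eye, a small script can count how many samples exceed each limit. This is only a sketch: the function name and the idea of saving `iostat -x 5` output to a text file are assumptions, not something from this thread. It looks the columns up by name, because as far as I know newer sysstat releases print the queue column as aqu-sz instead of avgqu-sz.

```python
def sustained_io_pressure(iostat_text, device, util_limit=95.0, queue_limit=3.0):
    """Count samples of one device that exceed the %util and queue-size limits.

    Returns (util_hits, queue_hits, samples). Hypothetical helper for
    saved `iostat -x` output; not part of oVirt or sysstat.
    """
    util_col = queue_col = None
    util_hits = queue_hits = samples = 0
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device"):
            # Locate the columns by name; the layout differs between versions.
            util_col = fields.index("%util")
            queue_name = "avgqu-sz" if "avgqu-sz" in fields else "aqu-sz"
            queue_col = fields.index(queue_name)
        elif fields[0] == device and util_col is not None:
            # Accept a decimal comma, as printed under some locales.
            util = float(fields[util_col].replace(",", "."))
            queue = float(fields[queue_col].replace(",", "."))
            samples += 1
            util_hits += util > util_limit
            queue_hits += queue > queue_limit
    return util_hits, queue_hits, samples
```

Comparing the two hit counts against the number of samples shows whether the queue limit is exceeded most of the time even when %util stays under its limit.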
If that's not the case, you may next want to have a look at the limits
(/etc/security/limits.conf).
There may also be default limits which may be too low. (nofile, nproc
come to my mind)
This file seems not to have been touched (all the content is commented out).
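
Since the file only contains comments, the defaults are what actually apply. A minimal sketch for reading the effective nofile and nproc limits, assuming Linux and Python 3 (the function name is made up for illustration):

```python
import resource


def show_limits():
    """Return the soft and hard nofile/nproc limits of the current process."""
    limits = {
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE),
        "nproc": resource.getrlimit(resource.RLIMIT_NPROC),
    }
    for name, (soft, hard) in limits.items():
        # A value equal to resource.RLIM_INFINITY means "unlimited".
        print(f"{name}: soft={soft} hard={hard}")
    return limits


if __name__ == "__main__":
    show_limits()
```

Note that this reports the limits of the process running the script. For an already running service, /proc/<pid>/limits shows the values that process really has, which may differ from what limits.conf suggests.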
Are there any errors or warnings in the messages or audit logs?
Nothing special, I've checked for the last couple days and everything
seems ok.
And also htop, iotop, vmstat may help to get a quick overview.
Yep, I've used most of these commands to check, but I haven't been able
to see anything conclusive.
In my opinion, the slowness could be caused by not using SSD disks and
by a low polling interval in DWHd, but that's just my assumption and
I haven't been able to get any data that would confirm it.
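
One thing that might help gather such data: in the process list quoted below, three postgres processes (PIDs 3294, 6105 and 29007) are in state "Ds". "D" is uninterruptible sleep, which usually means the process is waiting on I/O. A rough sketch for pulling those entries out of saved `ps aux` output follows; the function name is hypothetical, and it assumes each process is on a single unwrapped line.

```python
def uninterruptible(ps_text):
    """Return (pid, command) pairs for processes whose STAT starts with 'D'."""
    blocked = []
    for line in ps_text.splitlines():
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) == 11 and fields[7].startswith("D"):
            blocked.append((int(fields[1]), fields[10]))
    return blocked
```

Running this on snapshots taken every few seconds would show whether the same processes keep sitting in D state, which would point towards the disks rather than CPU or memory.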
Thanks for the help!
Nico
Hope this helps,
Toni
On 16.11.20 14:08, Nicolás wrote:
> Hi,
>
> We're running oVirt 4.3.8, and although this is a problem we've had
> for a long time (I would say since 4.0.x), I decided to look for
> help in case anything can be done.
>
> Our environment is heavily used by users in our University (about
> 3000 users), and currently our oVirt infrastructure has 1928 virtual
> machines, 882 of which are currently running. We have a separate
> physical machine for the manager node, and the problem is that this
> machine is very, very, very slow, even though it has (from my point
> of view) enough physical resources to run efficiently.
>
> By slow I mean that even SSH access to it takes about 10 seconds, not
> just the admin/user portals. Any operation takes long enough to make
> the experience uncomfortable for our users (entering the VM portal,
> starting a VM, opening a console...).
>
> The node machine's parameters are:
> 18 GB of RAM
> 1 processor with 12 CPUs
> 300 GB SCSI disk, local storage. No Storage Domain is stored on this
> node machine.
>
> Currently, the most resource-consuming processes are:
>
> 28802 ovirt 20 0 3302876 1,5g 25988 S 1,0 8,7 22:46.36
> ovirt-engine -server -XX:+TieredCompilation -Xms1024M -Xmx1024M
> -Xss1M -Djava.aw+
> 28701 ovirt 20 0 5465432 807704 13140 S 5,9 4,4 3:15.53
> ovirt-engine-dwhd
> -Dorg.ovirt.engine.dwh.settings=/tmp/tmp8wtnTA/settings.proper+
>
> # free -m
>               total        used        free      shared  buff/cache   available
> Mem:          17886        6669        2734         255        8482       10625
> Swap:          6143        1002        5141
>
> There are also a lot of postgresql processes:
>
> postgres 2186 0.0 0.0 261812 4136 ? Ss jun05 37:39
> /opt/rh/rh-postgresql10/root/usr/bin/postmaster -D
> /var/opt/rh/rh-postgresql10/lib/pgsql/data
> postgres 3176 0.0 0.0 216084 656 ? Ss jun05 0:00
> postgres: logger process
> postgres 3290 0.0 0.2 262204 37476 ? Ss jun05 77:31
> postgres: checkpointer process
> postgres 3291 0.3 0.2 262052 36960 ? Ss jun05 754:02
> postgres: writer process
> postgres 3292 0.0 0.0 261812 1988 ? Ss jun05 100:51
> postgres: wal writer process
> postgres 3293 0.0 0.2 262380 36748 ? Ss jun05 21:58
> postgres: autovacuum launcher process
> postgres 3294 0.1 0.0 219216 1412 ? Ds jun05 335:41
> postgres: stats collector process
> postgres 3295 0.0 0.0 262220 1460 ? Ss jun05 0:11
> postgres: bgworker: logical replication launcher
> postgres 3393 0.0 0.2 265792 40452 ? Ss jun05 15:36
> postgres: engine engine ::1(51664) idle
> postgres 6105 0.3 0.0 271532 15976 ? Ds 13:01 0:00
> postgres: autovacuum worker process ovirt_engine_history
> postgres 6216 0.2 0.0 263864 11440 ? Ss 13:02 0:00
> postgres: autovacuum worker process engine
> postgres 6245 0.0 0.0 262888 6212 ? Ss 13:02 0:00
> postgres: engine engine 127.0.0.1(42400) idle
> postgres 6246 0.0 0.0 262844 3256 ? Ss 13:02 0:00
> postgres: autovacuum worker process template1
> postgres 18815 0.0 0.0 262912 5852 ? Ss nov01 0:00
> postgres: django django 127.0.0.1(59564) idle
> postgres 23148 0.0 0.2 266052 43024 ? Ss oct28 9:01
> postgres: engine engine 127.0.0.1(59714) idle
> postgres 23149 0.0 0.0 262980 6820 ? Ss oct28 0:00
> postgres: engine engine 127.0.0.1(59716) idle
> postgres 28784 0.0 0.0 262816 3492 ? Ss 12:02 0:00
> postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(39470)
> idle
> postgres 28785 0.0 0.0 262816 3496 ? Ss 12:02 0:00
> postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(39472)
> idle
> postgres 28921 0.8 0.7 375152 146452 ? Ss 12:03 0:30
> postgres: engine engine 127.0.0.1(39484) idle
> postgres 29007 3.3 0.7 369348 142184 ? Ds 12:03 1:58
> postgres: engine engine 127.0.0.1(39498) SELECT
> postgres 29009 5.0 0.3 294736 72776 ? Ss 12:03 2:58
> postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(39500)
> idle
> postgres 29048 0.0 0.0 263664 12176 ? Ss 12:03 0:00
> postgres: engine engine 127.0.0.1(39530) idle
> postgres 29064 0.6 0.8 384936 157852 ? Ss 12:03 0:24
> postgres: engine engine 127.0.0.1(39532) idle
> postgres 29065 1.2 0.7 358944 137028 ? Ss 12:03 0:43
> postgres: engine engine 127.0.0.1(39534) idle
> postgres 29066 1.9 0.7 355800 132800 ? Ss 12:03 1:08
> postgres: engine engine 127.0.0.1(39536) idle
> postgres 29067 1.0 0.7 370688 140504 ? Ss 12:03 0:37
> postgres: engine engine 127.0.0.1(39538) idle
> postgres 29068 1.7 0.7 360984 139584 ? Ss 12:03 1:02
> postgres: engine engine 127.0.0.1(39540) idle
> postgres 29069 1.3 0.7 358140 136268 ? Ss 12:03 0:48
> postgres: engine engine 127.0.0.1(39542) idle
> postgres 29070 3.7 0.7 372064 141724 ? Ss 12:03 2:13
> postgres: engine engine 127.0.0.1(39544) idle
> postgres 29071 0.8 0.6 337224 114132 ? Ss 12:03 0:31
> postgres: engine engine 127.0.0.1(39546) idle
> postgres 29072 0.5 0.6 336616 114276 ? Ss 12:03 0:18
> postgres: engine engine 127.0.0.1(39548) idle
> postgres 29073 1.7 0.7 357872 134540 ? Ss 12:03 1:02
> postgres: engine engine 127.0.0.1(39550) idle
> postgres 29139 3.5 0.7 361244 134176 ? Ss 12:03 2:07
> postgres: engine engine 127.0.0.1(39572) idle
> postgres 29140 3.0 0.7 367884 138372 ? Ss 12:03 1:46
> postgres: engine engine 127.0.0.1(39570) idle
> postgres 29141 1.0 0.9 391984 169204 ? Ss 12:03 0:38
> postgres: engine engine 127.0.0.1(39574) idle
> postgres 29142 1.3 0.7 358820 136140 ? Ss 12:03 0:48
> postgres: engine engine 127.0.0.1(39576) idle
> postgres 29143 1.6 0.8 385000 156592 ? Ss 12:03 0:58
> postgres: engine engine 127.0.0.1(39578) idle
> postgres 29144 3.6 0.9 403740 174552 ? Ss 12:03 2:09
> postgres: engine engine 127.0.0.1(39580) idle
> postgres 29145 3.4 0.7 355800 128876 ? Ss 12:03 2:01
> postgres: engine engine 127.0.0.1(39582) idle
> postgres 29146 1.8 0.8 380884 157828 ? Ss 12:03 1:06
> postgres: engine engine 127.0.0.1(39586) idle
> postgres 29147 3.5 0.6 353228 123200 ? Ss 12:03 2:05
> postgres: engine engine 127.0.0.1(39584) idle
> postgres 29148 0.9 0.8 383452 156268 ? Ss 12:03 0:34
> postgres: engine engine 127.0.0.1(39588) idle
> postgres 29149 1.3 0.7 365552 139612 ? Ss 12:03 0:47
> postgres: engine engine 127.0.0.1(39590) idle
> postgres 29150 1.2 0.7 363448 136564 ? Ss 12:03 0:43
> postgres: engine engine 127.0.0.1(39592) SELECT
> postgres 29151 1.4 0.7 365940 139184 ? Ss 12:03 0:51
> postgres: engine engine 127.0.0.1(39594) idle
> postgres 29152 1.2 0.7 358328 134268 ? Ss 12:03 0:44
> postgres: engine engine 127.0.0.1(39596) idle
> postgres 29402 1.3 0.7 362044 139812 ? Ss 12:04 0:47
> postgres: engine engine 127.0.0.1(39698) idle
> postgres 29403 2.1 0.8 377760 154556 ? Ss 12:04 1:15
> postgres: engine engine 127.0.0.1(39700) idle
> postgres 29404 2.2 0.7 369160 144600 ? Ss 12:04 1:18
> postgres: engine engine 127.0.0.1(39702) idle
> postgres 32230 0.1 0.5 336252 101912 ? Ss 12:19 0:04
> postgres: engine engine 127.0.0.1(40564) idle
> postgres 32232 3.4 0.6 357644 127732 ? Ss 12:19 1:29
> postgres: engine engine 127.0.0.1(40566) idle
> postgres 32234 0.0 0.1 270924 19836 ? Ss 12:19 0:00
> postgres: engine engine 127.0.0.1(40568) idle
>
> Actually, I'm not really sure why this machine is so slow. In terms
> of RAM and CPU, it still has plenty of free resources.
>
> Could someone help track this down and offer some advice on how to
> optimize the user experience in this case?
>
> Thanks!