oVirt 4.5.2 /var growing rapidly due to ovirt_engine_history db

Hi,

We recently upgraded our oVirt environment to version 4.5.2. The environment is based on a hosted engine. Since the upgrade we have noticed the /var partition on the engine VM growing very rapidly. If we vacuum the ovirt_engine_history database, /var shrinks, but the next day usage grows again by 5-10%. We have vacuumed the database a couple of times, but we are not sure why it keeps growing so quickly.

Here is partial output of the vacuum run on 26-08-22. The table "host_interface_hourly_history" had the most entries to remove; the remaining tables had few. Previously the table "host_interface_samples_history" had entries to be removed. Any idea what could be the reason for this?

# dwh-vacuum -f -v
SELECT pg_catalog.set_config('search_path', '', false);
vacuumdb: vacuuming database "ovirt_engine_history"
RESET search_path;
SELECT c.relname, ns.nspname FROM pg_catalog.pg_class c
    JOIN pg_catalog.pg_namespace ns ON c.relnamespace OPERATOR(pg_catalog.=) ns.oid
    LEFT JOIN pg_catalog.pg_class t ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid
    WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array['r', 'm'])
    ORDER BY c.relpages DESC;
SELECT pg_catalog.set_config('search_path', '', false);
VACUUM (FULL, VERBOSE) public.host_interface_samples_history;
INFO: vacuuming "public.host_interface_samples_history"
INFO: "host_interface_samples_history": found 3135 removable, 84609901 nonremovable row versions in 1564960 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 41.88 s, system: 14.93 s, elapsed: 422.83 s.
VACUUM (FULL, VERBOSE) public.host_interface_hourly_history;
INFO: vacuuming "public.host_interface_hourly_history"
INFO: "host_interface_hourly_history": found 252422 removable, 39904650 nonremovable row versions in 473269 pages

Please let me know if any further information is required.

Regards,
Sohail
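For reference, the kind of per-table disk usage breakdown that comes up later in this thread can be produced with a standard PostgreSQL size query; a minimal sketch, run in psql while connected to ovirt_engine_history (for example as the postgres user):

-- List the largest tables in the DWH database, including their indexes and TOAST data.
SELECT n.nspname || '.' || c.relname AS table_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r' AND n.nspname = 'public'
 ORDER BY pg_total_relation_size(c.oid) DESC
 LIMIT 15;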

Hi Sohail,

A few questions: which version did you upgrade from? Did /var grow once right after the upgrade, or does it keep growing every day?

In general we recommend configuring DWH + Grafana on a separate (remote) machine; see the docs: Installing and Configuring Data Warehouse on a Separate Machine <https://ovirt.org/documentation/data_warehouse_guide/index.html#Installing_and_Configuring_Data_Warehouse_on_a_Separate_Machine_DWH_admin>.

Thanks,
Aviv
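Whether the partition keeps growing can be tracked with a couple of standard commands; a minimal sketch, run on the engine VM and assuming the default PostgreSQL setup there (not taken from the original thread):

# Filesystem usage of /var and the on-disk size of the DWH database.
df -h /var
sudo -u postgres psql -d ovirt_engine_history -c \
    "SELECT pg_size_pretty(pg_database_size('ovirt_engine_history'));"

Recording these two values once a day makes it easy to see whether the growth is steady or a one-off jump after the upgrade.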

Hi Aviv,

Thanks for your reply. We upgraded from 4.4.7 to 4.4.10 and then to 4.5.2. We observed that the growth started after we upgraded to 4.5.2. I have to vacuum the DWH database every 2-3 days. Here is an example:

----------------------------------------------
omitted output.........
[root@manager ~]# dwh-vacuum -f -v
SELECT pg_catalog.set_config('search_path', '', false);
vacuumdb: vacuuming database "ovirt_engine_history"
RESET search_path;
SELECT c.relname, ns.nspname FROM pg_catalog.pg_class c
    JOIN pg_catalog.pg_namespace ns ON c.relnamespace OPERATOR(pg_catalog.=) ns.oid
    LEFT JOIN pg_catalog.pg_class t ON c.reltoastrelid OPERATOR(pg_catalog.=) t.oid
    WHERE c.relkind OPERATOR(pg_catalog.=) ANY (array['r', 'm'])
    ORDER BY c.relpages DESC;
SELECT pg_catalog.set_config('search_path', '', false);
VACUUM (FULL, VERBOSE) public.host_interface_samples_history;
INFO: vacuuming "public.host_interface_samples_history"
INFO: "host_interface_samples_history": found 94115 removable, 70244664 nonremovable row versions in 1903718 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 36.72 s, system: 12.91 s, elapsed: 195.78 s.
VACUUM (FULL, VERBOSE) public.host_interface_hourly_history;
INFO: vacuuming "public.host_interface_hourly_history"
INFO: "host_interface_hourly_history": found 126645 removable, 40469226 nonremovable row versions in 482262 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 20.71 s, system: 5.58 s, elapsed: 115.83 s.
VACUUM (FULL, VERBOSE) public.vm_disks_usage_samples_history;
INFO: vacuuming "public.vm_disks_usage_samples_history"
INFO: "vm_disks_usage_samples_history": found 2028 removable, 1672491 nonremovable row versions in 307111 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 4.35 s, system: 3.77 s, elapsed: 51.81 s.
-------------------------------------------------------------------------------------------------

We have plans to move DWH and Grafana to a separate VM. Meanwhile, we are curious to know the reason for this rapid growth.
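Since the thread keeps coming back to full vacuums, the distinction may be worth spelling out; this is standard PostgreSQL behaviour, sketched here against one of the tables from the output above:

-- Plain VACUUM only marks dead row versions as reusable inside the existing
-- data files, so the table's on-disk size (and /var usage) does not shrink.
VACUUM (VERBOSE) public.host_interface_samples_history;

-- VACUUM FULL rewrites the table into new files and returns the freed space
-- to the operating system, but holds an exclusive lock while it runs.
VACUUM (FULL, VERBOSE) public.host_interface_samples_history;

This is what dwh-vacuum -f drives via vacuumdb in the output above, which is why /var only drops after the full runs.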

Hi Sohail,

I really can't find a reason for this in the changes between 4.4.10 and 4.5.2; it might be something in your local environment configuration. Sorry for not being more help. I'm still trying to work out whether it's a bug; in the meantime I would suggest moving DWH and Grafana to a separate VM, it's not a complicated process.

Thanks, and please keep me updated!
Aviv

Hi Aviv,

We are still observing this issue. The DWH database keeps growing very rapidly, and so far we have been unable to find what is causing it. These are the top tables by disk usage. I added an entry to root's crontab to vacuum the database, but it did not work.

public.host_interface_samples_history    | 56 GB
public.host_interface_hourly_history     | 11 GB
public.vm_disks_usage_samples_history    | 8819 MB
public.vm_interface_samples_history      | 3396 MB
public.vm_samples_history                | 2689 MB
public.host_interface_configuration      | 2310 MB
public.vm_disk_samples_history           | 1839 MB
public.vm_disks_usage_hourly_history     | 1210 MB
public.vm_device_history                 | 655 MB
public.vm_interface_hourly_history       | 536 MB
public.vm_hourly_history                 | 428 MB
public.statistics_vms_users_usage_hourly | 366 MB
public.vm_disk_hourly_history            | 330 MB
public.host_samples_history              | 140 MB
public.host_interface_daily_history      | 77 MB
public.calendar                          | 40 MB
public.host_hourly_history               | 24 MB
public.vm_disk_configuration             | 19 MB
public.vm_interface_configuration        | 16 MB
public.vm_configuration                  | 14 MB
public.vm_disks_usage_daily_history      | 11 MB
public.vm_interface_daily_history        | 4992 kB
public.storage_domain_samples_history    | 4312 kB
public.vm_daily_history                  | 4088 kB
public.vm_disk_daily_history             | 3048 kB
public.statistics_vms_users_usage_daily  | 2816 kB
public.cluster_configuration             | 1248 kB
public.host_configuration                | 1136 kB
public.storage_domain_hourly_history     | 744 kB
public.tag_relations_history             | 352 kB

This is the row count in the host_interface_samples_history table:

ovirt_engine_history=# select count(*) from host_interface_samples_history;
   count
-----------
 316633499

I had no choice but to truncate the table, yet within the next 3-4 hours the count had already grown considerably:

ovirt_engine_history=# select count(*) from host_interface_samples_history;
  count
---------
 8743168
(1 row)

So even if we move DWH to a separate VM, it will fill up its disk within a few days. I also applied the recommendations below, but they did not make any change.

# cat ovirt-engine-dwhd.conf
#
# These variables control the amount of memory used by the java
# virtual machine where the daemon runs:
#
DWH_HEAP_MIN=1g
DWH_HEAP_MAX=3g
# Recommendation as per oVirt Guide in case dwh and engine are on same machine
#https://www.ovirt.org/documentation/data_warehouse_guide/#Installing_and_Con...
DWH_TABLES_KEEP_HOURLY=780
DWH_TABLES_KEEP_DAILY=0

Is there anything else we can check to find out what is causing so much growth? Please let me know if you need any further information.

Regards,
Sohail
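Given that host_interface_samples_history regrows by millions of rows within hours, one thing worth checking is how many samples per hour are actually being inserted and whether a few interfaces dominate. A sketch of such queries (the column names history_datetime and host_interface_id are assumed from the oVirt DWH schema and may need adjusting):

-- Samples inserted per hour over the last day.
SELECT date_trunc('hour', history_datetime) AS hour, count(*) AS samples
  FROM host_interface_samples_history
 WHERE history_datetime > now() - interval '24 hours'
 GROUP BY 1
 ORDER BY 1 DESC;

-- Which interfaces produce the most rows in that window.
SELECT host_interface_id, count(*) AS samples
  FROM host_interface_samples_history
 WHERE history_datetime > now() - interval '24 hours'
 GROUP BY host_interface_id
 ORDER BY samples DESC
 LIMIT 20;

An unexpectedly high row rate, or a handful of interfaces producing most of the rows (for example, hosts with a very large number of network interfaces), would point at what is driving the growth.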
participants (2)
- Aviv Litman
- sohail_akhter3@hotmail.com