[ovirt-devel] [ovirt-users] [Feture discussion] Full vacuum tool

Roy Golan rgolan at redhat.com
Wed Dec 7 19:55:43 UTC 2016


On 7 December 2016 at 21:44, Roy Golan <rgolan at redhat.com> wrote:

>
>
> On 7 December 2016 at 21:00, Michal Skrivanek <mskrivan at redhat.com> wrote:
>
>>
>>
>> On 07 Dec 2016, at 11:28, Yaniv Kaul <ykaul at redhat.com> wrote:
>>
>>
>>
>> On Wed, Dec 7, 2016 at 10:57 AM, Roy Golan <rgolan at redhat.com> wrote:
>>
>>> Hi all,
>>>
>>> This is a discussion on the RFE[1] to provide a tool to perform full
>>> vacuum on our DBs.
>>>
>>> First if you are not familiar with vacuum please read this [2]
>>>
>>> # Backgroud
>>> ovirt 'engine' DB have several busy table with 2 differnt usage patten.
>>> One is audit_log and the others are the 'v*_statistics' tables and the
>>> difference between them is mostly inserts vs mostly hot updates.
>>> Tables with tons of updates creates garbage or 'dead' records that
>>> should be removed, and for this postgres have the aforementioned autovacuum
>>> cleaner. It will make the db reuse its already allocated space to perform
>>> future updates/inserts and so on.
>>> Autovacuum is essential for a db to function optimally and tweaking it
>>> is out of the scope of the feature.
>>>
>>> Full vacuum is designed to reclaim the disk space and reset the table
>>> statistics. It is a heavy maintenance task, it takes an exclusive lock on
>>> the table and may take seconds to minutes. In some situations it is
>>> effectively a downtime due to the long table lock and should not be running
>>> when the engine is running.
>>>
>>
>> So, effectively this should be interesting mostly/only for the audit log.
>> All other busy table are mostly in-place updates
>>
>
> Given that autovacuum is performing well the yes but if it starts to fall
> behind this may help a bit.
> audit_log is insert mostly and also delete, we remove a day, each day.
>
>>
>>
>>> # Critiria
>>> Provide a way to reclaim disk space claimed by the garbage created over
>>> time by the engine db and dwh.
>>>
>>> # Usage
>>> Either use it as part of the upgrade procedure (after all dbscipts
>>> execution)
>>>
>>
>> That does sound as a good start not requiring much user involvement
>>
>> or just provide the tool and admin will run in on demand
>>> - engine db credentials read from /etc/ovirt-engine/engine.conf.d/
>>> - invocation:
>>>  ```
>>>  tool: [dbname(default engine)] [table: (default all)]
>>>  ```
>>> - if we invoke it on upgrade than an installation plugin should be added
>>> to invoke with default, no interaction
>>>
>>
>> +1
>>
>> - since VACUUM ANALYZE is consider a recommended maintenance task we can
>>> to it by default and ask the user for FULL.
>>>
>>
>> When would you run it? ANALYZE nightly?
>>
>> No I'd still avoid doing this repeatedly, autovaccum should handle that
> as well, but this would cover situations where it isn't functioning
> optimally.
>
> I think  its worth adding a report of the db status and the rate of the
> autovacuum (a slight midifed version of the query mskrivanek ran on one of
> the production systems [3])  that will go to the logcollector. Perhaps the
> output of the ANALYZE will help as well.
>
> [3] https://gist.github.com/rgolangh/049cff30b89c5b29284ceee80a35db
> b4#file-table_status_by_dead_rows-sql
>


Very interesting collection of pg scrips to measure bloat and vacuum -
needs access to postgres objects though

- https://github.com/pgexperts/pgx_scripts
-
https://github.com/pgexperts/pgx_scripts/blob/master/bloat/table_bloat_check.sql
-
https://github.com/pgexperts/pgx_scripts/blob/master/vacuum/last_autovacuum.sql


>
>
>>>
>> Will the user know to answer intelligently if vacuum is needed or not?
>> Except for 'yes, you need it', we cannot even provide a time estimate (I
>> assume a disk space estimate is available!)
>>
>> perhaps we can estimate the bloat, there should be a github script to
> calculate that [4] not sure how good it is.
>
>> I would suggest to run ANALYZE for sure and provide an option at the end
>> of installation, to run the required command line - so make it as
>> accessible as possible, but not part of the flow.
>>
>>
>> If there are no significant gains why bother any other time but on
>> upgrade when it can be run unconditionally?
>>
>>
>> I'm wondering if the community can run ANALYZE on their database, and we
>> can estimate how many are in dire need for full vacuum already.
>> Y.
>>
>> I'll send a different mail for that.
>
>
>>
>> - remote db is supported as well, doesn't have to be local
>>>
>>
>> Well, not sure if we need to bother. It was introduced for large
>> deployments where the host can't fit both engine and db load. Do we still
>> have this issue? I wouldn't say so for 4.1. It may be very niche case
>>
>> Running full vacuum is anyway a psql command, so there is no hidden cost
> here (to the development side I mean)
>
>
>> Thanks,
>> michal
>>
>>
>>> # Questions
>>>  - Will remote dwh have the credentials under
>>> /etc/ovirt-engine/engine.conf.d?
>>>  - Should  AAA schema be taken into account as well?
>>>
>>> Please review, thanks
>>> Roy
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1388430
>>> [2] https://www.postgresql.org/docs/9.2/static/runtime-config-au
>>> tovacuum.html
>>> [3] https://www.postgresql.org/docs/devel/static/sql-vacuum.html
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/devel/attachments/20161207/d366000e/attachment-0001.html>


More information about the Devel mailing list