- publishing free software manuals
The PostgreSQL 9.0 Reference Manual - Volume 3 - Server Administration Guide
by The PostgreSQL Global Development Group
Paperback (6"x9"), 274 pages
ISBN 9781906966072
RRP £9.95 ($14.95)

Sales of this book support the PostgreSQL project! Get a printed copy>>>

9.1.5 The Autovacuum Daemon

PostgreSQL has an optional but highly recommended feature called autovacuum, whose purpose is to automate the execution of VACUUM and ANALYZE commands. When enabled, autovacuum checks for tables that have had a large number of inserted, updated or deleted tuples. These checks use the statistics collection facility; therefore, autovacuum cannot be used unless track_counts is set to true. In the default configuration, autovacuuming is enabled and the related configuration parameters are appropriately set.

The “autovacuum daemon” actually consists of multiple processes. There is a persistent daemon process, called the autovacuum launcher, which is in charge of starting autovacuum worker processes for all databases. The launcher will distribute the work across time, attempting to start one worker within each database every autovacuum_naptime seconds. (Therefore, if the installation has N databases, a new worker will be launched every autovacuum_naptime/N seconds.) A maximum of autovacuum_max_workers worker processes are allowed to run at the same time. If there are more than autovacuum_max_workers databases to be processed, the next database will be processed as soon as the first worker finishes. Each worker process will check each table within its database and execute VACUUM and/or ANALYZE as needed.

If several large tables all become eligible for vacuuming in a short amount of time, all autovacuum workers might become occupied with vacuuming those tables for a long period. This would result in other tables and databases not being vacuumed until a worker became available. There is no limit on how many workers might be in a single database, but workers do try to avoid repeating work that has already been done by other workers. Note that the number of running workers does not count towards max_connections or superuser_reserved_connections limits.

Tables whose relfrozenxid value is more than autovacuum_freeze_max_age transactions old are always vacuumed (this also applies to those tables whose freeze max age has been modified via storage parameters; see below). Otherwise, if the number of tuples obsoleted since the last VACUUM exceeds the “vacuum threshold”, the table is vacuumed. The vacuum threshold is defined as:

vacuum threshold = vacuum base threshold + vacuum scale 
  factor * number of tuples

where the vacuum base threshold is autovacuum_vacuum_threshold, the vacuum scale factor is autovacuum_vacuum_scale_factor, and the number of tuples is pg_class.reltuples. The number of obsolete tuples is obtained from the statistics collector; it is a semi-accurate count updated by each UPDATE and DELETE operation. (It is only semi-accurate because some information might be lost under heavy load.) If the relfrozenxid value of the table is more than vacuum_freeze_table_age transactions old, the whole table is scanned to freeze old tuples and advance relfrozenxid, otherwise only pages that have been modified since the last vacuum are scanned.

For analyze, a similar condition is used: the threshold, defined as:

analyze threshold = analyze base threshold + analyze scale 
  factor * number of tuples

is compared to the total number of tuples inserted, updated, or deleted since the last ANALYZE.

Temporary tables cannot be accessed by autovacuum. Therefore, appropriate vacuum and analyze operations should be performed via session SQL commands.

The default thresholds and scale factors are taken from ‘postgresql.conf’, but it is possible to override them on a table-by-table basis; see Volume 1B: Storage Parameters for more information. If a setting has been changed via storage parameters, that value is used; otherwise the global settings are used. See section 4.9 Automatic Vacuuming for more details on the global settings.

Besides the base threshold values and scale factors, there are six more autovacuum parameters that can be set for each table via storage parameters. The first parameter, autovacuum_enabled, can be set to false to instruct the autovacuum daemon to skip that particular table entirely. In this case autovacuum will only touch the table if it must do so to prevent transaction ID wraparound. Another two parameters, autovacuum_vacuum_cost_delay and autovacuum_vacuum_cost_limit, are used to set table-specific values for the cost-based vacuum delay feature (see section 4.4.3 Cost-Based Vacuum Delay). autovacuum_freeze_min_age, autovacuum_freeze_max_age and autovacuum_freeze_table_age are used to set values for vacuum_freeze_min_age, autovacuum_freeze_max_age and vacuum_freeze_table_age respectively.

When multiple workers are running, the cost limit is “balanced” among all the running workers, so that the total impact on the system is the same, regardless of the number of workers actually running.

ISBN 9781906966072The PostgreSQL 9.0 Reference Manual - Volume 3 - Server Administration GuideSee the print edition