|The PostgreSQL 9.0 Reference Manual - Volume 3 - Server Administration Guide
by The PostgreSQL Global Development Group
Paperback (6"x9"), 274 pages
RRP £9.95 ($14.95)
Sales of this book support the PostgreSQL project! Get a printed copy>>>
9.1.2 Recovering Disk Space
In PostgreSQL, an
DELETE of a row does not
immediately remove the old version of the row.
This approach is necessary to gain the benefits of multiversion
concurrency control (MVCC, see Volume 1A: 11 Concurrency Control): the row version
must not be deleted while it is still potentially visible to other
transactions. But eventually, an outdated or deleted row version is no
longer of interest to any transaction. The space it occupies must then be
reclaimed for reuse by new rows, to avoid unbounded growth of disk
space requirements. This is done by running
The standard form of
VACUUM removes dead row
versions in tables and indexes and marks the space available for
future reuse. However, it will not return the space to the operating
system, except in the special case where one or more pages at the
end of a table become entirely free and an exclusive table lock can be
easily obtained. In contrast,
VACUUM FULL actively compacts
tables by writing a complete new version of the table file with no dead
space. This minimizes the size of the table, but can take a long time.
It also requires extra disk space for the new copy of the table, until
the operation completes.
The usual goal of routine vacuuming is to do standard
often enough to avoid needing
VACUUM FULL. The
autovacuum daemon attempts to work this way, and in fact will
VACUUM FULL. In this approach, the idea
is not to keep tables at their minimum size, but to maintain steady-state
usage of disk space: each table occupies space equivalent to its
minimum size plus however much space gets used up between vacuumings.
VACUUM FULL can be used to shrink a table back
to its minimum size and return the disk space to the operating system,
there is not much point in this if the table will just grow again in the
future. Thus, moderately-frequent standard
VACUUM runs are a
better approach than infrequent
VACUUM FULL runs for
maintaining heavily-updated tables.
Some administrators prefer to schedule vacuuming themselves, for example
doing all the work at night when load is low.
The difficulty with doing vacuuming according to a fixed schedule
is that if a table has an unexpected spike in update activity, it may
get bloated to the point that
VACUUM FULL is really necessary
to reclaim space. Using the autovacuum daemon alleviates this problem,
since the daemon schedules vacuuming dynamically in response to update
activity. It is unwise to disable the daemon completely unless you
have an extremely predictable workload. One possible compromise is
to set the daemon's parameters so that it will only react to unusually
heavy update activity, thus keeping things from getting out of hand,
VACUUMs are expected to do the bulk of the
work when the load is typical.
For those not using autovacuum, a typical approach is to schedule a
VACUUM once a day during a low-usage period,
supplemented by more frequent vacuuming of heavily-updated tables as
necessary. (Some installations with extremely high update rates vacuum
their busiest tables as often as once every few minutes.) If you have
multiple databases in a cluster, don't forget to
VACUUM each one; the program
vacuumdb might be helpful.
VACUUMmay not be satisfactory when a table contains large numbers of dead row versions as a result of massive update or delete activity. If you have such a table and you need to reclaim the excess disk space it occupies, you will need to use
VACUUM FULL, or alternatively
CLUSTERor one of the table-rewriting variants of
ALTER TABLE. These commands rewrite an entire new copy of the table and build new indexes for it. All these options require exclusive lock. Note that they also temporarily use extra disk space approximately equal to the size of the table, since the old copies of the table and indexes can't be released until the new ones are complete.
Tip: If you have a table whose entire contents are deleted on a periodic basis, consider doing it with
TRUNCATErather than using
TRUNCATEremoves the entire content of the table immediately, without requiring a subsequent
VACUUM FULLto reclaim the now-unused disk space. The disadvantage is that strict MVCC semantics are violated.
|ISBN 9781906966072||The PostgreSQL 9.0 Reference Manual - Volume 3 - Server Administration Guide||See the print edition|