| PostgreSQL Reference Manual - Volume 3 - Server Administration Guide by The PostgreSQL Global Development Group Paperback (6"x9"), 204 pages ISBN 0954612043 RRP £13.95 ($24.95) Sales of this book support the PostgreSQL project! Get a printed copy>>> |
9.1.2 Updating planner statistics
The PostgreSQL query planner relies on
statistical information about the contents of tables in order to
generate good plans for queries. These statistics are gathered by
the ANALYZE command, which can be invoked by itself or
as an optional step in VACUUM. It is important to have
reasonably accurate statistics, otherwise poor choices of plans may
degrade database performance.
As with vacuuming for space recovery, frequent updates of statistics
are more useful for heavily-updated tables than for seldom-updated
ones. But even for a heavily-updated table, there may be no need for
statistics updates if the statistical distribution of the data is
not changing much. A simple rule of thumb is to think about how much
the minimum and maximum values of the columns in the table change.
For example, a timestamp column that contains the time
of row update will have a constantly-increasing maximum value as
rows are added and updated; such a column will probably need more
frequent statistics updates than, say, a column containing URLs for
pages accessed on a website. The URL column may receive changes just
as often, but the statistical distribution of its values probably
changes relatively slowly.
It is possible to run ANALYZE on specific tables and even
just specific columns of a table, so the flexibility exists to update some
statistics more frequently than others if your application requires it.
In practice, however, it is usually best to just analyze the entire database
because it is a fast operation. It uses a statistical random sampling of
the rows of a table rather than reading every single row.
Tip: Although per-column tweaking of
ANALYZEfrequency may not be very productive, you may well find it worthwhile to do per-column adjustment of the level of detail of the statistics collected byANALYZE. Columns that are heavily used inWHEREclauses and have highly irregular data distributions may require a finer-grain data histogram than other columns. SeeALTER TABLE SET STATISTICS.
Recommended practice for most sites is to schedule a database-wide
ANALYZE once a day at a low-usage time of day; this can
usefully be combined with a nightly VACUUM. However,
sites with relatively slowly changing table statistics may find that
this is overkill, and that less-frequent ANALYZE runs
are sufficient.
| ISBN 0954612043 | PostgreSQL Reference Manual - Volume 3 - Server Administration Guide | See the print edition |