- publishing free software manuals
PostgreSQL Reference Manual - Volume 1 - SQL Language Reference
by The PostgreSQL Global Development Group
Paperback (6"x9"), 716 pages
ISBN 0954612027
RRP £32.00 ($49.95)

Sales of this book support the PostgreSQL project! Get a printed copy>>>

3.9.2 Implementing Partitioning

To set up a partitioned table, do the following:

  1. Create the “master” table, from which all of the partitions will inherit. This table will contain no data. Do not define any check constraints on this table, unless you intend them to be applied equally to all partitions. There is no point in defining any indexes or unique constraints on it, either.
  2. Create several “child” tables that each inherit from the master table. Normally, these tables will not add any columns to the set inherited from the master. We will refer to the child tables as partitions, though they are in every way normal PostgreSQL tables.
  3. Add table constraints to the partition tables to define the allowed key values in each partition. Typical examples would be:
    CHECK ( x = 1 )
    CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire',
     'Warwickshire' ))
    CHECK ( outletID >= 100 AND outletID < 200 )
    
    Ensure that the constraints guarantee that there is no overlap between the key values permitted in different partitions. A common mistake is to set up range constraints like this:
    CHECK ( outletID BETWEEN 100 AND 200 )
    CHECK ( outletID BETWEEN 200 AND 300 )
    
    This is wrong since it is not clear which partition the key value 200 belongs in. Note that there is no difference in syntax between range and list partitioning; those terms are descriptive only.
  4. For each partition, create an index on the key column(s), as well as any other indexes you might want. (The key index is not strictly necessary, but in most scenarios it is helpful. If you intend the key values to be unique then you should always create a unique or primary-key constraint for each partition.)
  5. Optionally, define a rule or trigger to redirect modifications of the master table to the appropriate partition.
  6. Ensure that the constraint_exclusion configuration parameter is enabled in ‘postgresql.conf’. Without this, queries will not be optimized as desired.

For example, suppose we are constructing a database for a large ice cream company. The company measures peak temperatures every day as well as ice cream sales in each region. Conceptually, we want a table like this:

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
);

We know that most queries will access just the last week's, month's or quarter's data, since the main use of this table will be to prepare online reports for management. To reduce the amount of old data that needs to be stored, we decide to only keep the most recent 3 years worth of data. At the beginning of each month we will remove the oldest month's data.

In this situation we can use partitioning to help us meet all of our different requirements for the measurements table. Following the steps outlined above, partitioning can be set up as follows:

  1. The master table is the measurement table, declared exactly as above.
  2. Next we create one partition for each active month:
    CREATE TABLE measurement_y2004m02 ( ) INHERITS (measurement);
    CREATE TABLE measurement_y2004m03 ( ) INHERITS (measurement);
    ...
    CREATE TABLE measurement_y2005m11 ( ) INHERITS (measurement);
    CREATE TABLE measurement_y2005m12 ( ) INHERITS (measurement);
    CREATE TABLE measurement_y2006m01 ( ) INHERITS (measurement);
    
    Each of the partitions are complete tables in their own right, but they inherit their definition from the measurement table. This solves one of our problems: deleting old data. Each month, all we will need to do is perform a DROP TABLE on the oldest child table and create a new child table for the new month's data.
  3. We must add non-overlapping table constraints, so that our table creation script becomes:
    CREATE TABLE measurement_y2004m02 (
        CHECK ( logdate >= DATE '2004-02-01' AND logdate <
     DATE '2004-03-01' )
    ) INHERITS (measurement);
    CREATE TABLE measurement_y2004m03 (
        CHECK ( logdate >= DATE '2004-03-01' AND logdate <
     DATE '2004-04-01' )
    ) INHERITS (measurement);
    ...
    CREATE TABLE measurement_y2005m11 (
        CHECK ( logdate >= DATE '2005-11-01' AND logdate <
     DATE '2005-12-01' )
    ) INHERITS (measurement);
    CREATE TABLE measurement_y2005m12 (
        CHECK ( logdate >= DATE '2005-12-01' AND logdate <
     DATE '2006-01-01' )
    ) INHERITS (measurement);
    CREATE TABLE measurement_y2006m01 (
        CHECK ( logdate >= DATE '2006-01-01' AND logdate <
     DATE '2006-02-01' )
    ) INHERITS (measurement);
    
  4. We probably need indexes on the key columns too:
    CREATE INDEX measurement_y2004m02_logdate ON
     measurement_y2004m02 (logdate);
    CREATE INDEX measurement_y2004m03_logdate ON
     measurement_y2004m03 (logdate);
    ...
    CREATE INDEX measurement_y2005m11_logdate ON
     measurement_y2005m11 (logdate);
    CREATE INDEX measurement_y2005m12_logdate ON
     measurement_y2005m12 (logdate);
    CREATE INDEX measurement_y2006m01_logdate ON
     measurement_y2006m01 (logdate);
    
    We choose not to add further indexes at this time.
  5. If data will be added only to the latest partition, we can set up a very simple rule to insert data. We must redefine this each month so that it always points to the current partition.
    CREATE OR REPLACE RULE measurement_current_partition AS
    ON INSERT TO measurement
    DO INSTEAD
        INSERT INTO measurement_y2006m01 VALUES (
     NEW.city_id, NEW.logdate, NEW.peaktemp, NEW.unitsales );
    
    We might want to insert data and have the server automatically locate the partition into which the row should be added. We could do this with a more complex set of rules as shown below.
    CREATE RULE measurement_insert_y2004m02 AS
    ON INSERT TO measurement WHERE
        ( logdate >= DATE '2004-02-01' AND logdate < DATE
     '2004-03-01' )
    DO INSTEAD
        INSERT INTO measurement_y2004m02 VALUES (
     NEW.city_id, NEW.logdate, NEW.peaktemp, NEW.unitsales );
    ...
    CREATE RULE measurement_insert_y2005m12 AS
    ON INSERT TO measurement WHERE
        ( logdate >= DATE '2005-12-01' AND logdate < DATE
     '2006-01-01' )
    DO INSTEAD
        INSERT INTO measurement_y2005m12 VALUES (
     NEW.city_id, NEW.logdate, NEW.peaktemp, NEW.unitsales );
    CREATE RULE measurement_insert_y2006m01 AS
    ON INSERT TO measurement WHERE
        ( logdate >= DATE '2006-01-01' AND logdate < DATE
     '2006-02-01' )
    DO INSTEAD
        INSERT INTO measurement_y2006m01 VALUES (
     NEW.city_id, NEW.logdate, NEW.peaktemp, NEW.unitsales );
    
    Note that the WHERE clause in each rule exactly matches the CHECK constraint for its partition.

As we can see, a complex partitioning scheme could require a substantial amount of data definition language (DDL). In the above example we would be creating a new partition each month, so it may be wise to write a script that generates the required data definition language (DDL) automatically.

Partitioning can also be arranged using a UNION ALL view:

CREATE VIEW measurement AS
          SELECT * FROM measurement_y2004m02
UNION ALL SELECT * FROM measurement_y2004m03
...
UNION ALL SELECT * FROM measurement_y2005m11
UNION ALL SELECT * FROM measurement_y2005m12
UNION ALL SELECT * FROM measurement_y2006m01;

However, the need to recreate the view adds an extra step to adding and dropping individual partitions of the data set.

ISBN 0954612027PostgreSQL Reference Manual - Volume 1 - SQL Language ReferenceSee the print edition