- publishing free software manuals
The PostgreSQL 9.0 Reference Manual - Volume 1A - SQL Language Reference
by The PostgreSQL Global Development Group
Paperback (6"x9"), 454 pages
ISBN 9781906966041
RRP £14.95 ($19.95)

Sales of this book support the PostgreSQL project! Get a printed copy>>>

3.9.2 Implementing Partitioning

To set up a partitioned table, do the following:

  1. Create the “master” table, from which all of the partitions will inherit. This table will contain no data. Do not define any check constraints on this table, unless you intend them to be applied equally to all partitions. There is no point in defining any indexes or unique constraints on it, either.
  2. Create several “child” tables that each inherit from the master table. Normally, these tables will not add any columns to the set inherited from the master. We will refer to the child tables as partitions, though they are in every way normal PostgreSQL tables.
  3. Add table constraints to the partition tables to define the allowed key values in each partition. Typical examples would be:
    CHECK ( x = 1 )
    CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 
      'Warwickshire' ))
    CHECK ( outletID >= 100 AND outletID < 200 )
    
    Ensure that the constraints guarantee that there is no overlap between the key values permitted in different partitions. A common mistake is to set up range constraints like:
    CHECK ( outletID BETWEEN 100 AND 200 )
    CHECK ( outletID BETWEEN 200 AND 300 )
    
    This is wrong since it is not clear which partition the key value 200 belongs in. Note that there is no difference in syntax between range and list partitioning; those terms are descriptive only.
  4. For each partition, create an index on the key column(s), as well as any other indexes you might want. (The key index is not strictly necessary, but in most scenarios it is helpful. If you intend the key values to be unique then you should always create a unique or primary-key constraint for each partition.)
  5. Optionally, define a trigger or rule to redirect data inserted into the master table to the appropriate partition.
  6. Ensure that the constraint_exclusion configuration parameter is not disabled in ‘postgresql.conf’. If it is, queries will not be optimized as desired.

For example, suppose we are constructing a database for a large ice cream company. The company measures peak temperatures every day as well as ice cream sales in each region. Conceptually, we want a table like:

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
);

We know that most queries will access just the last week's, month's or quarter's data, since the main use of this table will be to prepare online reports for management. To reduce the amount of old data that needs to be stored, we decide to only keep the most recent 3 years worth of data. At the beginning of each month we will remove the oldest month's data.

In this situation we can use partitioning to help us meet all of our different requirements for the measurements table. Following the steps outlined above, partitioning can be set up as follows:

  1. The master table is the measurement table, declared exactly as above.
  2. Next we create one partition for each active month:
    CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement);
    CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement);
    ...
    CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement);
    CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement);
    CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);
    
    Each of the partitions are complete tables in their own right, but they inherit their definitions from the measurement table. This solves one of our problems: deleting old data. Each month, all we will need to do is perform a DROP TABLE on the oldest child table and create a new child table for the new month's data.
  3. We must provide non-overlapping table constraints. Rather than just creating the partition tables as above, the table creation script should really be:
    CREATE TABLE measurement_y2006m02 (
        CHECK ( logdate >= DATE '2006-02-01' AND logdate < 
      DATE '2006-03-01' )
    ) INHERITS (measurement);
    CREATE TABLE measurement_y2006m03 (
        CHECK ( logdate >= DATE '2006-03-01' AND logdate < 
      DATE '2006-04-01' )
    ) INHERITS (measurement);
    ...
    CREATE TABLE measurement_y2007m11 (
        CHECK ( logdate >= DATE '2007-11-01' AND logdate < 
      DATE '2007-12-01' )
    ) INHERITS (measurement);
    CREATE TABLE measurement_y2007m12 (
        CHECK ( logdate >= DATE '2007-12-01' AND logdate < 
      DATE '2008-01-01' )
    ) INHERITS (measurement);
    CREATE TABLE measurement_y2008m01 (
        CHECK ( logdate >= DATE '2008-01-01' AND logdate < 
      DATE '2008-02-01' )
    ) INHERITS (measurement);
    
  4. We probably need indexes on the key columns too:
    CREATE INDEX measurement_y2006m02_logdate ON 
      measurement_y2006m02 (logdate);
    CREATE INDEX measurement_y2006m03_logdate ON 
      measurement_y2006m03 (logdate);
    ...
    CREATE INDEX measurement_y2007m11_logdate ON 
      measurement_y2007m11 (logdate);
    CREATE INDEX measurement_y2007m12_logdate ON 
      measurement_y2007m12 (logdate);
    CREATE INDEX measurement_y2008m01_logdate ON 
      measurement_y2008m01 (logdate);
    
    We choose not to add further indexes at this time.
  5. We want our application to be able to say INSERT INTO measurement ... and have the data be redirected into the appropriate partition table. We can arrange that by attaching a suitable trigger function to the master table. If data will be added only to the latest partition, we can use a very simple trigger function:
    CREATE OR REPLACE FUNCTION measurement_insert_trigger()
    RETURNS TRIGGER AS $$
    BEGIN
        INSERT INTO measurement_y2008m01 VALUES (NEW.*);
        RETURN NULL;
    END;
    $$
    LANGUAGE plpgsql;
    
    After creating the function, we create a trigger which calls the trigger function:
    CREATE TRIGGER insert_measurement_trigger
        BEFORE INSERT ON measurement
        FOR EACH ROW EXECUTE PROCEDURE
       measurement_insert_trigger();
    
    We must redefine the trigger function each month so that it always points to the current partition. The trigger definition does not need to be updated, however. We might want to insert data and have the server automatically locate the partition into which the row should be added. We could do this with a more complex trigger function, for example:
    CREATE OR REPLACE FUNCTION measurement_insert_trigger()
    RETURNS TRIGGER AS $$
    BEGIN
        IF ( NEW.logdate >= DATE '2006-02-01' AND
             NEW.logdate < DATE '2006-03-01' ) THEN
            INSERT INTO measurement_y2006m02 VALUES (NEW.*);
        ELSIF ( NEW.logdate >= DATE '2006-03-01' AND
                NEW.logdate < DATE '2006-04-01' ) THEN
            INSERT INTO measurement_y2006m03 VALUES (NEW.*);
        ...
        ELSIF ( NEW.logdate >= DATE '2008-01-01' AND
                NEW.logdate < DATE '2008-02-01' ) THEN
            INSERT INTO measurement_y2008m01 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'Date out of range.  Fix the 
      measurement_insert_trigger() function!';
        END IF;
        RETURN NULL;
    END;
    $$
    LANGUAGE plpgsql;
    
    The trigger definition is the same as before. Note that each IF test must exactly match the CHECK constraint for its partition. While this function is more complex than the single-month case, it doesn't need to be updated as often, since branches can be added in advance of being needed.

    Note: In practice it might be best to check the newest partition first, if most inserts go into that partition. For simplicity we have shown the trigger's tests in the same order as in other parts of this example.

As we can see, a complex partitioning scheme could require a substantial amount of data definition language (DDL). In the above example we would be creating a new partition each month, so it might be wise to write a script that generates the required data definition language (DDL) automatically.

ISBN 9781906966041The PostgreSQL 9.0 Reference Manual - Volume 1A - SQL Language ReferenceSee the print edition