The PostgreSQL 9.0 Reference Manual - Volume 2 - Programming Guide
by The PostgreSQL Global Development Group
5.14.5 Operator Classes and Operator Families
So far we have implicitly assumed that an operator class deals with only one data type. While there certainly can be only one data type in a particular index column, it is often useful to index operations that compare an indexed column to a value of a different data type. Also, if there is use for a cross-data-type operator in connection with an operator class, it is often the case that the other data type has a related operator class of its own. It is helpful to make the connections between related classes explicit, because this can aid the planner in optimizing SQL queries (particularly for B-tree operator classes, since the planner contains a great deal of knowledge about how to work with them).
To handle these needs, PostgreSQL uses the concept of an operator family. An operator family contains one or more operator classes, and can also contain indexable operators and corresponding support functions that belong to the family as a whole but not to any single class within the family. We say that such operators and functions are “loose” within the family, as opposed to being bound into a specific class. Typically each operator class contains single-data-type operators while cross-data-type operators are loose in the family.
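The grouping of classes into a family can be inspected in the system catalogs. The following query is a minimal sketch, assuming a stock installation in which the built-in B-tree family integer_ops has not been altered; it lists the operator classes that belong to that family together with the data type each one indexes.

```sql
-- List the operator classes that belong to the built-in B-tree family
-- integer_ops, with the data type each class indexes.
SELECT opc.opcname,
       opc.opcintype::regtype AS indexed_type,
       opc.opcdefault         AS is_default
FROM pg_opclass  AS opc
JOIN pg_opfamily AS opf ON opf.oid = opc.opcfamily
JOIN pg_am       AS am  ON am.oid  = opf.opfmethod
WHERE am.amname   = 'btree'
  AND opf.opfname = 'integer_ops'
ORDER BY opc.opcname;
```

The filter on pg_am is needed because family names are only unique per index method; a hash family of the same name can exist alongside the B-tree one.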
All the operators and functions in an operator family must have compatible semantics, where the compatibility requirements are set by the index method. You might therefore wonder why we bother to single out particular subsets of the family as operator classes; and indeed for many purposes the class divisions are irrelevant and the family is the only interesting grouping. The reason for defining operator classes is that they specify how much of the family is needed to support any particular index. If there is an index using an operator class, then that operator class cannot be dropped without dropping the index, but other parts of the operator family, namely other operator classes and loose operators, could be dropped. Thus, an operator class should be specified to contain the minimum set of operators and functions that are reasonably needed to work with an index on a specific data type, and then related but non-essential operators can be added as loose members of the operator family.
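The dependency behavior described above can be tried out in a throwaway transaction. This is only a sketch: demo_int_ops, demo_int4_ops, demo_t, and demo_t_id_idx are hypothetical names chosen for illustration, the built-in comparison operators and the functions btint4cmp and btint48cmp are reused rather than redefined, and superuser privileges are assumed.

```sql
-- Sketch only: every demo_* name is hypothetical. Run as a superuser.
BEGIN;

CREATE OPERATOR FAMILY demo_int_ops USING btree;

-- The class holds only what an int4 index strictly needs.
-- It is not declared DEFAULT, because int4 already has a default class.
CREATE OPERATOR CLASS demo_int4_ops
FOR TYPE int4 USING btree FAMILY demo_int_ops AS
  OPERATOR 1 < ,
  OPERATOR 2 <= ,
  OPERATOR 3 = ,
  OPERATOR 4 >= ,
  OPERATOR 5 > ,
  FUNCTION 1 btint4cmp(int4, int4);

-- Cross-type members are added loose to the family.
ALTER OPERATOR FAMILY demo_int_ops USING btree ADD
  OPERATOR 1 <  (int4, int8),
  OPERATOR 2 <= (int4, int8),
  OPERATOR 3 =  (int4, int8),
  OPERATOR 4 >= (int4, int8),
  OPERATOR 5 >  (int4, int8),
  FUNCTION 1 btint48cmp(int4, int8);

CREATE TEMP TABLE demo_t (id int4);
CREATE INDEX demo_t_id_idx ON demo_t (id demo_int4_ops);

-- The loose members can be removed while the index still exists.
ALTER OPERATOR FAMILY demo_int_ops USING btree DROP
  OPERATOR 1 (int4, int8),
  OPERATOR 2 (int4, int8),
  OPERATOR 3 (int4, int8),
  OPERATOR 4 (int4, int8),
  OPERATOR 5 (int4, int8),
  FUNCTION 1 (int4, int8);

-- Dropping the class itself would be refused unless CASCADE is given,
-- because demo_t_id_idx depends on it:
-- DROP OPERATOR CLASS demo_int4_ops USING btree;

ROLLBACK;
```

The final ROLLBACK discards all of the demo objects, so the sketch leaves the database unchanged.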
As an example, PostgreSQL has a built-in B-tree operator family integer_ops, which includes operator classes int8_ops, int4_ops, and int2_ops for indexes on bigint (int8), integer (int4), and smallint (int2) columns respectively. The family also contains cross-data-type comparison operators allowing any two of these types to be compared, so that an index on one of these types can be searched using a comparison value of another type. The family could be duplicated by these definitions:
    CREATE OPERATOR FAMILY integer_ops USING btree;

    CREATE OPERATOR CLASS int8_ops
    DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS
      -- standard int8 comparisons
      OPERATOR 1 < ,
      OPERATOR 2 <= ,
      OPERATOR 3 = ,
      OPERATOR 4 >= ,
      OPERATOR 5 > ,
      FUNCTION 1 btint8cmp(int8, int8) ;

    CREATE OPERATOR CLASS int4_ops
    DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS
      -- standard int4 comparisons
      OPERATOR 1 < ,
      OPERATOR 2 <= ,
      OPERATOR 3 = ,
      OPERATOR 4 >= ,
      OPERATOR 5 > ,
      FUNCTION 1 btint4cmp(int4, int4) ;

    CREATE OPERATOR CLASS int2_ops
    DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS
      -- standard int2 comparisons
      OPERATOR 1 < ,
      OPERATOR 2 <= ,
      OPERATOR 3 = ,
      OPERATOR 4 >= ,
      OPERATOR 5 > ,
      FUNCTION 1 btint2cmp(int2, int2) ;

    ALTER OPERATOR FAMILY integer_ops USING btree ADD
      -- cross-type comparisons int8 vs int2
      OPERATOR 1 < (int8, int2) ,
      OPERATOR 2 <= (int8, int2) ,
      OPERATOR 3 = (int8, int2) ,
      OPERATOR 4 >= (int8, int2) ,
      OPERATOR 5 > (int8, int2) ,
      FUNCTION 1 btint82cmp(int8, int2) ,

      -- cross-type comparisons int8 vs int4
      OPERATOR 1 < (int8, int4) ,
      OPERATOR 2 <= (int8, int4) ,
      OPERATOR 3 = (int8, int4) ,
      OPERATOR 4 >= (int8, int4) ,
      OPERATOR 5 > (int8, int4) ,
      FUNCTION 1 btint84cmp(int8, int4) ,

      -- cross-type comparisons int4 vs int2
      OPERATOR 1 < (int4, int2) ,
      OPERATOR 2 <= (int4, int2) ,
      OPERATOR 3 = (int4, int2) ,
      OPERATOR 4 >= (int4, int2) ,
      OPERATOR 5 > (int4, int2) ,
      FUNCTION 1 btint42cmp(int4, int2) ,

      -- cross-type comparisons int4 vs int8
      OPERATOR 1 < (int4, int8) ,
      OPERATOR 2 <= (int4, int8) ,
      OPERATOR 3 = (int4, int8) ,
      OPERATOR 4 >= (int4, int8) ,
      OPERATOR 5 > (int4, int8) ,
      FUNCTION 1 btint48cmp(int4, int8) ,

      -- cross-type comparisons int2 vs int8
      OPERATOR 1 < (int2, int8) ,
      OPERATOR 2 <= (int2, int8) ,
      OPERATOR 3 = (int2, int8) ,
      OPERATOR 4 >= (int2, int8) ,
      OPERATOR 5 > (int2, int8) ,
      FUNCTION 1 btint28cmp(int2, int8) ,

      -- cross-type comparisons int2 vs int4
      OPERATOR 1 < (int2, int4) ,
      OPERATOR 2 <= (int2, int4) ,
      OPERATOR 3 = (int2, int4) ,
      OPERATOR 4 >= (int2, int4) ,
      OPERATOR 5 > (int2, int4) ,
      FUNCTION 1 btint24cmp(int2, int4) ;
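To see what the cross-data-type members buy in practice, one can index an int4 column and search it with an int8 value. The sketch below assumes the built-in integer_ops family is in place; demo_t and demo_t_id_idx are hypothetical names, and the exact plan shape depends on the server version and planner settings.

```sql
-- Sketch only: demo_t and demo_t_id_idx are hypothetical names.
BEGIN;

CREATE TEMP TABLE demo_t (id int4);
INSERT INTO demo_t SELECT generate_series(1, 100000);
CREATE INDEX demo_t_id_idx ON demo_t (id);   -- uses the default class int4_ops
ANALYZE demo_t;

-- The comparison value is a bigint, so the operator is int4 = int8,
-- a cross-type member of the same operator family as int4_ops.
EXPLAIN SELECT * FROM demo_t WHERE id = 42::int8;

ROLLBACK;
```

Because the int4 = int8 operator belongs to the family of the index's operator class, the planner can normally treat the comparison as an index condition on demo_t_id_idx instead of having to filter every row.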
Notice that this definition “overloads” the operator strategy and support function numbers: each number occurs multiple times within the family. This is allowed so long as each instance of a particular number has distinct input data types. The instances that have both input types equal to an operator class's input type are the primary operators and support functions for that operator class, and in most cases should be declared as part of the operator class rather than as loose members of the family.
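The overloading of strategy numbers is visible in the pg_amop catalog. As a rough illustration, assuming an unmodified built-in integer_ops family, the following query shows each strategy number next to the pair of input types it is registered for.

```sql
-- Show how each B-tree strategy number recurs within integer_ops,
-- once per distinct (left type, right type) pair.
SELECT amop.amopstrategy,
       amop.amoplefttype::regtype  AS left_type,
       amop.amoprighttype::regtype AS right_type,
       amop.amopopr::regoperator   AS op
FROM pg_amop     AS amop
JOIN pg_opfamily AS opf ON opf.oid = amop.amopfamily
JOIN pg_am       AS am  ON am.oid  = opf.opfmethod
WHERE am.amname   = 'btree'
  AND opf.opfname = 'integer_ops'
ORDER BY amop.amopstrategy, amop.amoplefttype, amop.amoprighttype;
```

Rows in which left_type and right_type are equal correspond to the primary operators of an operator class; the remaining rows are the cross-data-type members.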
In a B-tree operator family, all the operators in the family must sort compatibly, meaning that the transitive laws hold across all the data types supported by the family: “if A = B and B = C, then A = C”, and “if A < B and B < C, then A < C”. For each operator in the family there must be a support function having the same two input data types as the operator. It is recommended that a family be complete, i.e., for each combination of data types, all operators are included. Each operator class should include just the non-cross-type operators and support function for its data type.
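The rule that every operator needs a support function with the same two input types can be checked against the catalogs. The query below is a sketch of such a check for the built-in B-tree integer_ops family; to test a different family, change the name in the WHERE clause.

```sql
-- Report every (left type, right type) pair that has operators in the
-- family but no comparison support function (support number 1).
SELECT amop.amoplefttype::regtype  AS left_type,
       amop.amoprighttype::regtype AS right_type
FROM pg_amop     AS amop
JOIN pg_opfamily AS opf ON opf.oid = amop.amopfamily
JOIN pg_am       AS am  ON am.oid  = opf.opfmethod
LEFT JOIN pg_amproc AS proc
       ON proc.amprocfamily    = amop.amopfamily
      AND proc.amproclefttype  = amop.amoplefttype
      AND proc.amprocrighttype = amop.amoprighttype
      AND proc.amprocnum       = 1
WHERE am.amname   = 'btree'
  AND opf.opfname = 'integer_ops'
  AND proc.amprocfamily IS NULL
GROUP BY amop.amoplefttype, amop.amoprighttype;
```

For a family that follows the rule, the query should return no rows.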
To build a multiple-data-type hash operator family, compatible hash support functions must be created for each data type supported by the family. Here compatibility means that the functions are guaranteed to return the same hash code for any two values that are considered equal by the family's equality operators, even when the values are of different types. This is usually difficult to accomplish when the types have different physical representations, but it can be done in some cases. Notice that there is only one support function per data type, not one per equality operator. It is recommended that a family be complete, i.e., provide an equality operator for each combination of data types. Each operator class should include just the non-cross-type equality operator and the support function for its data type.
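The built-in integer types give a concrete case of compatible hash support functions. The sketch below assumes that hashint2, hashint4, and hashint8 are the hash support functions for smallint, integer, and bigint; it compares the hash codes they produce for the same numeric value.

```sql
-- Equal values of different integer types should hash identically,
-- otherwise a cross-type equality operator could not be used with a hash.
SELECT hashint2(42::int2) = hashint4(42::int4) AS int2_matches_int4,
       hashint4(42::int4) = hashint8(42::int8) AS int4_matches_int8;
```

If the functions are compatible in the sense described above, both columns should come back true, although the hash code itself is an implementation detail and may differ between versions.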
GIN and GiST indexes do not have any explicit notion of cross-data-type operations. The set of operators supported is just whatever the primary support functions for a given operator class can handle.
Note: Prior to PostgreSQL 8.3, there was no concept of operator families, and so any cross-data-type operators intended to be used with an index had to be bound directly into the index's operator class. While this approach still works, it is deprecated because it makes an index's dependencies too broad, and because the planner can handle cross-data-type comparisons more effectively when both data types have operators in the same operator family.