The PostgreSQL 9.0 Reference Manual - Volume 2 - Programming Guide
by The PostgreSQL Global Development Group
Paperback (6"x9"), 478 pages
ISBN 9781906966065
RRP £14.95 ($19.95)


5.9.2 Base Types in C-Language Functions

To write C-language functions, you need to know how PostgreSQL internally represents base data types and how they can be passed to and from functions. Internally, PostgreSQL regards a base type as a “blob of memory”. The user-defined functions that you define over a type in turn define the way that PostgreSQL can operate on it. That is, PostgreSQL only stores and retrieves the data from disk, and uses your user-defined functions to input, process, and output the data.

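Base types can have one of three internal formats: pass by value, fixed-length; pass by reference, fixed-length; and pass by reference, variable-length.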

By-value types can only be 1, 2, or 4 bytes in length (also 8 bytes, if sizeof(Datum) is 8 on your machine). You should be careful to define your types such that they will be the same size (in bytes) on all architectures. For example, the long type is dangerous because it is 4 bytes on some machines and 8 bytes on others, whereas the int type is 4 bytes on most Unix machines. A reasonable implementation of the int4 type on Unix machines might be:

/* 4-byte integer, passed by value */
typedef int int4;
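
One way to guard against the size differences described above is to build on the C99 fixed-width integer types and let the compiler check the width. The following is a minimal sketch rather than PostgreSQL source: the names my_int4 and my_int4_width are hypothetical, and it assumes a C11 compiler for static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-byte integer type built on a C99 fixed-width type. */
typedef int32_t my_int4;

/* Compilation fails if the type is not exactly 4 bytes wide. */
static_assert(sizeof(my_int4) == 4, "my_int4 must be exactly 4 bytes");

/* Report the width in bytes, for example in a run-time sanity check. */
size_t my_int4_width(void)
{
    return sizeof(my_int4);
}
```

Because the check runs at compile time, a platform with an unexpected width is rejected before any data is written in the wrong size.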

On the other hand, fixed-length types of any size can be passed by-reference. For example, here is a sample implementation of a PostgreSQL type:

/* 16-byte structure, passed by reference */
typedef struct {
  double x, y;
} Point;

Only pointers to such types can be used when passing them in and out of PostgreSQL functions. To return a value of such a type, allocate the right amount of memory with palloc, fill in the allocated memory, and return a pointer to it. (Also, if you just want to return the same value as one of your input arguments that's of the same data type, you can skip the extra palloc and just return the pointer to the input value.)
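
The allocate, fill in, and return sequence described above can be sketched in plain C as follows. This is an illustration under stated assumptions, not server code: point_sum is a hypothetical helper, and malloc stands in for palloc, which is available only inside the PostgreSQL server.

```c
#include <assert.h>
#include <stdlib.h>

/* 16-byte structure, passed by reference (same layout as in the text) */
typedef struct {
    double x, y;
} Point;

/*
 * Hypothetical helper: add two points and return a newly allocated result.
 * malloc() is a stand-in for palloc() so that the sketch compiles on its own.
 */
Point *point_sum(const Point *a, const Point *b)
{
    assert(a != NULL && b != NULL);

    Point *result = malloc(sizeof(Point));  /* allocate the right amount */
    if (result == NULL)
        return NULL;                        /* palloc would raise an error instead */

    result->x = a->x + b->x;                /* fill in the allocated memory */
    result->y = a->y + b->y;
    return result;                          /* return a pointer to it */
}
```

The inputs are declared const because the function only reads them; the result lives in fresh memory owned by the caller.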

Finally, all variable-length types must also be passed by reference. All variable-length types must begin with a length field of exactly 4 bytes, and all data to be stored within that type must be located in the memory immediately following that length field. The length field contains the total length of the structure, that is, it includes the size of the length field itself.

Warning: Never modify the contents of a pass-by-reference input value. If you do so you are likely to corrupt on-disk data, since the pointer you are given might point directly into a disk buffer. The sole exception to this rule is explained in section 5.10 User-Defined Aggregates.
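
A simple way to respect this rule is to copy the input into freshly allocated memory and change only the copy. The sketch below assumes a hypothetical helper named point_scale and again uses malloc as a stand-in for palloc; the Point layout is the one shown earlier in this section.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* 16-byte structure, passed by reference */
typedef struct {
    double x, y;
} Point;

/*
 * Hypothetical helper: return a scaled copy of the input point.
 * The input is const-qualified and is never written to.
 */
Point *point_scale(const Point *in, double factor)
{
    assert(in != NULL);

    Point *out = malloc(sizeof(Point));  /* stand-in for palloc */
    if (out == NULL)
        return NULL;

    memcpy(out, in, sizeof(Point));      /* copy first ... */
    out->x *= factor;                    /* ... then modify only the copy */
    out->y *= factor;
    return out;
}
```

Declaring the parameter as a pointer to const lets the compiler reject accidental writes through it.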

As an example, we can define the type text as follows:

typedef struct {
    int4 length;
    char data[1];
} text;

Obviously, the data field declared here is not long enough to hold all possible strings. C89 offers no way to declare a variable-size structure (C99 later added flexible array members for this purpose), so we rely on the knowledge that the C compiler won't range-check array subscripts. We just allocate the necessary amount of space and then access the array as if it were declared the right length. (This is a common trick, which you can read about in many textbooks about C.)

When manipulating variable-length types, we must be careful to allocate the correct amount of memory and set the length field correctly. For example, if we wanted to store 40 bytes in a text structure, we might use a code fragment like this:

#include "postgres.h"
...
char buffer[40]; /* our source data */
...
text *destination = (text *) palloc(VARHDRSZ + 40);
/* set the total length with the macro rather than writing the field directly */
SET_VARSIZE(destination, VARHDRSZ + 40);
memcpy(VARDATA(destination), buffer, 40);
...

VARHDRSZ is the same as sizeof(int4), but it's considered good style to use the macro VARHDRSZ to refer to the size of the overhead for a variable-length type.
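
The length bookkeeping described above can be modeled in standalone C. This is a sketch of the layout only, under several assumptions: my_text, MY_VARHDRSZ, my_text_from_bytes, and my_text_data_len are hypothetical names, malloc stands in for palloc, and a C99 flexible array member replaces the one-element array. Server code should go through the macros in ‘postgres.h’ instead of a hand-written structure like this one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the variable-length layout described in the text. */
typedef struct {
    int32_t length;   /* total size in bytes, including this field */
    char    data[];   /* C99 flexible array member; payload follows the header */
} my_text;

/* Size of the length header; plays the role of VARHDRSZ in this sketch. */
#define MY_VARHDRSZ ((int32_t) sizeof(int32_t))

/* Build a my_text from nbytes of raw data; malloc stands in for palloc. */
my_text *my_text_from_bytes(const char *src, int32_t nbytes)
{
    assert(src != NULL && nbytes >= 0);

    my_text *t = malloc((size_t) MY_VARHDRSZ + (size_t) nbytes);
    if (t == NULL)
        return NULL;

    t->length = MY_VARHDRSZ + nbytes;       /* the header is counted in the total */
    memcpy(t->data, src, (size_t) nbytes);  /* payload is not NUL-terminated */
    return t;
}

/* The payload size is the stored total minus the header. */
int32_t my_text_data_len(const my_text *t)
{
    assert(t != NULL);
    return t->length - MY_VARHDRSZ;
}
```

Storing the total length rather than the payload length means a reader always subtracts the header size before touching the data, which is why the header size is kept in one named constant.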

Table 5-1 specifies which C type corresponds to which SQL type when writing a C-language function that uses a built-in type of PostgreSQL. The “Defined In” column gives the header file that needs to be included to get the type definition. (The actual definition might be in a different file that is included by the listed file. It is recommended that users stick to the defined interface.) Note that you should always include ‘postgres.h’ first in any source file, because it declares a number of things that you will need anyway.

Table 5-1: Equivalent C Types for Built-In SQL Types
SQL Type                   C Type         Defined In
abstime                    AbsoluteTime   ‘utils/nabstime.h’
boolean                    bool           ‘postgres.h’ (maybe compiler built-in)
box                        BOX*           ‘utils/geo_decls.h’
bytea                      bytea*         ‘postgres.h’
"char"                     char           (compiler built-in)
character                  BpChar*        ‘postgres.h’
cid                        CommandId      ‘postgres.h’
date                       DateADT        ‘utils/date.h’
smallint (int2)            int2 or int16  ‘postgres.h’
int2vector                 int2vector*    ‘postgres.h’
integer (int4)             int4 or int32  ‘postgres.h’
real (float4)              float4*        ‘postgres.h’
double precision (float8)  float8*        ‘postgres.h’
interval                   Interval*      ‘utils/timestamp.h’
lseg                       LSEG*          ‘utils/geo_decls.h’
name                       Name           ‘postgres.h’
oid                        Oid            ‘postgres.h’
oidvector                  oidvector*     ‘postgres.h’
path                       PATH*          ‘utils/geo_decls.h’
point                      POINT*         ‘utils/geo_decls.h’
regproc                    regproc        ‘postgres.h’
reltime                    RelativeTime   ‘utils/nabstime.h’
text                       text*          ‘postgres.h’
tid                        ItemPointer    ‘storage/itemptr.h’
time                       TimeADT        ‘utils/date.h’
time with time zone        TimeTzADT      ‘utils/date.h’
timestamp                  Timestamp*     ‘utils/timestamp.h’
tinterval                  TimeInterval   ‘utils/nabstime.h’
varchar                    VarChar*       ‘postgres.h’
xid                        TransactionId  ‘postgres.h’

Now that we've gone over all of the possible structures for base types, we can show some examples of real functions.
