The PostgreSQL 9.0 Reference Manual - Volume 2 - Programming Guide
by The PostgreSQL Global Development Group, ISBN 9781906966065
5.9.2 Base Types in C-Language Functions
To know how to write C-language functions, you need to know how PostgreSQL internally represents base data types and how they can be passed to and from functions. Internally, PostgreSQL regards a base type as a “blob of memory”. The user-defined functions that you define over a type in turn define the way that PostgreSQL can operate on it. That is, PostgreSQL will only store and retrieve the data from disk and use your user-defined functions to input, process, and output the data.
Base types can have one of three internal formats:
- pass by value, fixed-length
- pass by reference, fixed-length
- pass by reference, variable-length
By-value types can only be 1, 2, or 4 bytes in length
(also 8 bytes, if sizeof(Datum) is 8 on your machine).
You should be careful to define your types such that they will be the
same size (in bytes) on all architectures. For example, the
long type is dangerous because it is 4 bytes on some
machines and 8 bytes on others, whereas the int type is 4 bytes
on most Unix machines. A reasonable implementation of the
int4 type on Unix machines might be:
/* 4-byte integer, passed by value */
typedef int int4;
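As a minimal sketch of the portability advice above, the following stand-alone C fragment pins a 4-byte integer to a fixed-width type and checks its size at compile time. The names my_int4 and fits_by_value are hypothetical, and uintptr_t is used only as an assumed stand-in for Datum; nothing here is taken from the PostgreSQL headers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int4: int32_t is 32 bits wide wherever it exists. */
typedef int32_t my_int4;

/* Refuse to compile if the type is not the 4 bytes we rely on
 * (assumes 8-bit bytes, as on all common platforms). */
static_assert(sizeof(my_int4) == 4, "my_int4 must be 4 bytes");

/* Apply the size rule from the text: 1, 2, or 4 bytes always qualify;
 * 8 bytes qualify only when the Datum-like word is itself 8 bytes.
 * uintptr_t is an assumption standing in for Datum. */
static int fits_by_value(size_t size)
{
    if (size == 1 || size == 2 || size == 4)
        return 1;
    return size == 8 && sizeof(uintptr_t) == 8;
}
```

Calling fits_by_value(sizeof(long)) gives different answers on different machines, which is exactly why the text warns against long.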
On the other hand, fixed-length types of any size can be passed by reference. For example, here is a sample implementation of a PostgreSQL type:
/* 16-byte structure, passed by reference */
typedef struct {
    double x, y;
} Point;
Only pointers to such types can be used when passing
them in and out of PostgreSQL functions.
To return a value of such a type, allocate the right amount of
memory with palloc, fill in the allocated memory,
and return a pointer to it. (Also, if you just want to return the
same value as one of your input arguments that's of the same data type,
you can skip the extra palloc and just return the
pointer to the input value.)
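The allocate, fill, and return pattern described above might look like the following sketch. It is written to compile without the PostgreSQL headers, so malloc stands in for palloc through a hypothetical helper named sketch_alloc, and point_add is an invented example function rather than anything the server provides.

```c
#include <assert.h>
#include <stdlib.h>

/* Same layout as the Point structure shown above. */
typedef struct {
    double x, y;
} Point;

/* Stand-in for palloc so the sketch builds on its own.
 * Real server code would call palloc instead. */
static void *sketch_alloc(size_t size)
{
    void *p = malloc(size);
    assert(p != NULL);
    return p;
}

/* Hypothetical function: adds two points and returns a newly
 * allocated result. The inputs are const and are never written to. */
static Point *point_add(const Point *a, const Point *b)
{
    Point *result = (Point *) sketch_alloc(sizeof(Point));

    result->x = a->x + b->x;
    result->y = a->y + b->y;
    return result;
}
```

Declaring the inputs as const pointers is a design choice of this sketch: it lets the compiler catch accidental writes to pass-by-reference arguments.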
Finally, all variable-length types must also be passed by reference. All variable-length types must begin with a length field of exactly 4 bytes, and all data to be stored within that type must be located in the memory immediately following that length field. The length field contains the total length of the structure, that is, it includes the size of the length field itself.
Warning: Never modify the contents of a pass-by-reference input value. If you do so, you are likely to corrupt on-disk data, since the pointer you are given might point directly into a disk buffer. The sole exception to this rule is explained in section 5.10 User-Defined Aggregates.
As an example, we can define the type text as
follows:
typedef struct {
    int4 length;
    char data[1];
} text;
Obviously, the data field declared here is not long enough to hold all possible strings. Since it's impossible to declare a variable-size structure in C, we rely on the knowledge that the C compiler won't range-check array subscripts. We just allocate the necessary amount of space and then access the array as if it were declared the right length. (This is a common trick, which you can read about in many textbooks about C.)
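The allocation trick just described can be sketched in stand-alone C as follows. The structure is an illustrative copy named sketch_text, the macro SKETCH_HDRSZ and the function sketch_text_make are hypothetical, and malloc is used in place of palloc so that the fragment builds without the server headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative copy of the structure above, renamed to make clear
 * that it is not the server's own definition. */
typedef struct {
    int32_t length;   /* total size in bytes, header included */
    char    data[1];  /* declared as 1 byte, used as a longer array */
} sketch_text;

/* Number of bytes that precede the payload. */
#define SKETCH_HDRSZ offsetof(sketch_text, data)

/* Allocate a single block large enough for the header plus n payload
 * bytes, then copy the payload in directly after the header. */
static sketch_text *sketch_text_make(const char *src, size_t n)
{
    sketch_text *t = malloc(SKETCH_HDRSZ + n);

    assert(t != NULL);
    t->length = (int32_t) (SKETCH_HDRSZ + n);
    memcpy(t->data, src, n);
    return t;
}
```

After sketch_text_make("hello", 5), the caller can read t->data[0] through t->data[4], even though the declaration only names one element.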
When manipulating
variable-length types, we must be careful to allocate
the correct amount of memory and set the length field correctly.
For example, if we wanted to store 40 bytes in a text
structure, we might use a code fragment like this:
#include "postgres.h"
...
char buffer[40]; /* our source data */
...
text *destination = (text *) palloc(VARHDRSZ + 40);
destination->length = VARHDRSZ + 40;
memcpy(destination->data, buffer, 40);
...
VARHDRSZ is the same as sizeof(int4), but
it's considered good style to use the macro VARHDRSZ
to refer to the size of the overhead for a variable-length type.
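Because the length field counts its own header, reading a value back means subtracting the header size first. The sketch below illustrates that bookkeeping in stand-alone C; demo_text, DEMO_VARHDRSZ, demo_payload_len, and demo_to_cstring are hypothetical names, and malloc replaces palloc. Server code would normally use the VARSIZE and SET_VARSIZE macros from postgres.h rather than touch the field directly, so treat this only as an illustration of the arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed header size, mirroring "VARHDRSZ is the same as sizeof(int4)". */
#define DEMO_VARHDRSZ ((int32_t) sizeof(int32_t))

/* Illustrative copy of the text structure. */
typedef struct {
    int32_t length;   /* total size, including this field */
    char    data[1];
} demo_text;

/* Payload size is the total length minus the header. */
static int32_t demo_payload_len(const demo_text *t)
{
    return t->length - DEMO_VARHDRSZ;
}

/* Copy the payload into a new NUL-terminated C string.
 * The input value is only read, never modified. */
static char *demo_to_cstring(const demo_text *t)
{
    int32_t n = demo_payload_len(t);
    char   *s = malloc((size_t) n + 1);

    assert(s != NULL);
    memcpy(s, t->data, (size_t) n);
    s[n] = '\0';
    return s;
}
```

For the 40-byte example above, the stored length would be DEMO_VARHDRSZ + 40 and demo_payload_len would report 40.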
Table 5-1 specifies which C type corresponds to which SQL type when writing a C-language function that uses a built-in type of PostgreSQL. The “Defined In” column gives the header file that needs to be included to get the type definition. (The actual definition might be in a different file that is included by the listed file. It is recommended that users stick to the defined interface.) Note that you should always include ‘postgres.h’ first in any source file, because it declares a number of things that you will need anyway.
| SQL Type | C Type | Defined In |
| --- | --- | --- |
| abstime | AbsoluteTime | ‘utils/nabstime.h’ |
| boolean | bool | ‘postgres.h’ (maybe compiler built-in) |
| box | BOX* | ‘utils/geo_decls.h’ |
| bytea | bytea* | ‘postgres.h’ |
| "char" | char | (compiler built-in) |
| character | BpChar* | ‘postgres.h’ |
| cid | CommandId | ‘postgres.h’ |
| date | DateADT | ‘utils/date.h’ |
| smallint (int2) | int2 or int16 | ‘postgres.h’ |
| int2vector | int2vector* | ‘postgres.h’ |
| integer (int4) | int4 or int32 | ‘postgres.h’ |
| real (float4) | float4* | ‘postgres.h’ |
| double precision (float8) | float8* | ‘postgres.h’ |
| interval | Interval* | ‘utils/timestamp.h’ |
| lseg | LSEG* | ‘utils/geo_decls.h’ |
| name | Name | ‘postgres.h’ |
| oid | Oid | ‘postgres.h’ |
| oidvector | oidvector* | ‘postgres.h’ |
| path | PATH* | ‘utils/geo_decls.h’ |
| point | POINT* | ‘utils/geo_decls.h’ |
| regproc | regproc | ‘postgres.h’ |
| reltime | RelativeTime | ‘utils/nabstime.h’ |
| text | text* | ‘postgres.h’ |
| tid | ItemPointer | ‘storage/itemptr.h’ |
| time | TimeADT | ‘utils/date.h’ |
| time with time zone | TimeTzADT | ‘utils/date.h’ |
| timestamp | Timestamp* | ‘utils/timestamp.h’ |
| tinterval | TimeInterval | ‘utils/nabstime.h’ |
| varchar | VarChar* | ‘postgres.h’ |
| xid | TransactionId | ‘postgres.h’ |
Now that we've gone over all of the possible structures for base types, we can show some examples of real functions.