The PostgreSQL 9.0 Reference Manual - Volume 3 - Server Administration Guide
by The PostgreSQL Global Development Group Paperback (6"x9"), 274 pages ISBN 9781906966072 RRP £9.95 ($14.95)
8.2.2 Setting the Character Set
initdb defines the default character set (encoding)
for a PostgreSQL cluster. For example,
initdb -E EUC_JP
sets the default character set to
EUC_JP (Extended Unix Code for Japanese). You
can use --encoding instead of
-E if you prefer longer option strings.
If no -E or --encoding option is
given, initdb attempts to determine the appropriate
encoding to use based on the specified or default locale.
You can specify a non-default encoding at database creation time, provided that the encoding is compatible with the selected locale:
createdb -E EUC_KR -T template0 --lc-collate=ko_KR.euckr --lc-ctype=ko_KR.euckr korean
This will create a database named korean that
uses the character set EUC_KR, and locale ko_KR.
Another way to accomplish this is to use this SQL command:
CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr' LC_CTYPE='ko_KR.euckr' TEMPLATE=template0;
Notice that the above commands specify copying the template0
database. When copying any other database, the encoding and locale
settings cannot be changed from those of the source database, because
that might result in corrupt data. For more information see
section 7.3 Template Databases.
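The createdb invocation above can be wrapped in a small shell function so that template0 is always used when the encoding or locale differs from the cluster default. This is only a sketch: the function name create_db_with_encoding and its argument order are hypothetical, and the locale name you pass must exist on your operating system. The createdb options are the ones shown in the example above.

```shell
# Hypothetical helper: create a database whose encoding and locale differ
# from the cluster default. template0 is used because copying any other
# database does not allow these settings to be changed.
# Usage: create_db_with_encoding DBNAME ENCODING LOCALE
create_db_with_encoding() {
    if [ "$#" -ne 3 ]; then
        echo "usage: create_db_with_encoding DBNAME ENCODING LOCALE" >&2
        return 2
    fi
    # The same locale is applied to both LC_COLLATE and LC_CTYPE.
    createdb -E "$2" -T template0 --lc-collate="$3" --lc-ctype="$3" "$1"
}
```

For example, create_db_with_encoding korean EUC_KR ko_KR.euckr runs the same command as the createdb example above.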
The encoding for a database is stored in the system catalog
pg_database. You can see it by using the
psql -l option or the
\l command.
$ psql -l
                                         List of databases
   Name    |  Owner   | Encoding  |  Collation  |    Ctype    |          Access Privileges
-----------+----------+-----------+-------------+-------------+-------------------------------------
 clocaledb | hlinnaka | SQL_ASCII | C           | C           |
 englishdb | hlinnaka | UTF8      | en_GB.UTF8  | en_GB.UTF8  |
 japanese  | hlinnaka | UTF8      | ja_JP.UTF8  | ja_JP.UTF8  |
 korean    | hlinnaka | EUC_KR    | ko_KR.euckr | ko_KR.euckr |
 postgres  | hlinnaka | UTF8      | fi_FI.UTF8  | fi_FI.UTF8  |
 template0 | hlinnaka | UTF8      | fi_FI.UTF8  | fi_FI.UTF8  | {=c/hlinnaka,hlinnaka=CTc/hlinnaka}
 template1 | hlinnaka | UTF8      | fi_FI.UTF8  | fi_FI.UTF8  | {=c/hlinnaka,hlinnaka=CTc/hlinnaka}
(7 rows)
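Because the encoding is stored in pg_database, the same information can also be read with a query, which is easier to use from scripts than the formatted listing. A minimal sketch, assuming a running server you can connect to: the function name list_db_encodings and the default connection database postgres are hypothetical choices. pg_encoding_to_char converts the numeric encoding column to its name.

```shell
# Hypothetical helper: print one line per database with its name, encoding,
# collation and ctype, read directly from the pg_database catalog.
# Usage: list_db_encodings [DBNAME_TO_CONNECT_TO]
list_db_encodings() {
    # -A: unaligned output, -t: rows only, -d: database to connect to.
    psql -At -d "${1:-postgres}" -c "SELECT datname, pg_encoding_to_char(encoding), datcollate, datctype FROM pg_database ORDER BY datname"
}
```

Calling list_db_encodings prints the columns separated by the psql default field separator for unaligned output, a vertical bar.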
Important: On most modern operating systems, PostgreSQL can determine which character set is implied by the LC_CTYPE setting, and it will enforce that only the matching database encoding is used. On older systems it is your responsibility to ensure that you use the encoding expected by the locale you have selected. A mistake in this area is likely to lead to strange behavior of locale-dependent operations such as sorting.
PostgreSQL will allow superusers to create databases with SQL_ASCII encoding even when LC_CTYPE is not C or POSIX. As noted above, SQL_ASCII does not enforce that the data stored in the database has any particular encoding, and so this choice poses risks of locale-dependent misbehavior. Using this combination of settings is deprecated and may someday be forbidden altogether.
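To find databases that already use the deprecated combination described above, the catalog can be filtered on encoding and LC_CTYPE. This is a sketch under the same assumptions as any catalog query (a running server and permission to connect); the function name find_risky_sql_ascii_dbs and the default connection database postgres are hypothetical.

```shell
# Hypothetical helper: list databases that use SQL_ASCII encoding while
# LC_CTYPE is something other than C or POSIX.
# Usage: find_risky_sql_ascii_dbs [DBNAME_TO_CONNECT_TO]
find_risky_sql_ascii_dbs() {
    # -A: unaligned output, -t: rows only, -d: database to connect to.
    psql -At -d "${1:-postgres}" -c "SELECT datname FROM pg_database WHERE pg_encoding_to_char(encoding) = 'SQL_ASCII' AND datctype NOT IN ('C', 'POSIX') ORDER BY datname"
}
```

An empty result means no database in the cluster uses this combination.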