| The PostgreSQL 9.0 Reference Manual - Volume 1A - SQL Language Reference
by The PostgreSQL Global Development Group Paperback (6"x9"), 454 pages ISBN 9781906966041 RRP £14.95 ($19.95) Sales of this book support the PostgreSQL project! Get a printed copy>>> |
10.6.6 Snowball Dictionary
The Snowball dictionary template is based on a project
by Martin Porter, inventor of the popular Porter's stemming algorithm
for the English language. Snowball now provides stemming algorithms for
many languages (see the Snowball site for more information). Each algorithm understands how to
reduce common variant forms of words to a base, or stem, spelling within
its language. A Snowball dictionary requires a language
parameter to identify which stemmer to use, and optionally can specify a
stopword file name that gives a list of words to eliminate.
(PostgreSQL's standard stopword lists are also
provided by the Snowball project.)
For example, there is a built-in definition equivalent to
CREATE TEXT SEARCH DICTIONARY english_stem (
TEMPLATE = snowball,
Language = english,
StopWords = english
);
The stopword file format is the same as already explained.
A Snowball dictionary recognizes everything, whether or not it is able to simplify the word, so it should be placed at the end of the dictionary list. It is useless to have it before any other dictionary because a token will never pass through it to the next dictionary.
| ISBN 9781906966041 | The PostgreSQL 9.0 Reference Manual - Volume 1A - SQL Language Reference | See the print edition |