| The PostgreSQL 9.0 Reference Manual - Volume 1A - SQL Language Reference
by The PostgreSQL Global Development Group Paperback (6"x9"), 454 pages ISBN 9781906966041 RRP £14.95 ($19.95) Sales of this book support the PostgreSQL project! Get a printed copy>>> |
10.6.4.2 Thesaurus Example
Consider a simple astronomical thesaurus thesaurus_astro,
which contains some astronomical word combinations:
supernovae stars : sn
crab nebulae : crab
Below we create a dictionary and bind some token types to an astronomical thesaurus and English stemmer:
CREATE TEXT SEARCH DICTIONARY thesaurus_astro (
    TEMPLATE = thesaurus,
    DictFile = thesaurus_astro,
    Dictionary = english_stem
);
ALTER TEXT SEARCH CONFIGURATION russian
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
    WITH thesaurus_astro, english_stem;
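The one-argument calls to plainto_tsquery and to_tsvector used in this section work against whatever default_text_search_config is set to in the session. A minimal sketch, assuming you want the session to use the russian configuration that was altered above (the value shown is an assumption about your setup, not part of the original example):

```sql
-- Assumption: the thesaurus mapping was added to the russian configuration.
-- Make that configuration the session default so that one-argument
-- text search functions pick it up.
SET default_text_search_config = 'pg_catalog.russian';

-- Confirm which configuration the session is now using.
SHOW default_text_search_config;
```

The setting lasts only for the current session; to make it permanent it would have to be set in postgresql.conf or per database or role.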
Now we can see how it works.
ts_lexize is not very useful for testing a thesaurus,
because it treats its input as a single token. Instead we can use
plainto_tsquery and to_tsvector
which will break their input strings into multiple tokens:
SELECT plainto_tsquery('supernova star');
plainto_tsquery
-----------------
'sn'
SELECT to_tsvector('supernova star');
to_tsvector
-------------
'sn':1
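Both functions also accept the configuration as an explicit first argument, which avoids depending on the session default. A sketch, assuming the thesaurus was mapped into the russian configuration as in the ALTER TEXT SEARCH CONFIGURATION command earlier in this section:

```sql
-- Assumption: thesaurus_astro is mapped in the russian configuration.
-- Naming the configuration explicitly makes the call independent of
-- default_text_search_config.
SELECT plainto_tsquery('russian', 'supernova star');
SELECT to_tsvector('russian', 'supernova star');
```

If the session default is already russian, these calls should behave the same as the one-argument forms.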
In principle, you can also use to_tsquery if you quote
the argument:
SELECT to_tsquery('''supernova star''');
to_tsquery
------------
'sn'
Notice that supernova star matches supernovae
stars in thesaurus_astro because we specified
the english_stem stemmer in the thesaurus definition.
The stemmer removed the trailing e from supernovae and the s from stars.
To index the original phrase as well as the substitute, just include it in the right-hand part of the definition:
supernovae stars : sn supernovae stars
SELECT plainto_tsquery('supernova star');
plainto_tsquery
-----------------------------
'sn' & 'supernova' & 'star'
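After the thesaurus file is edited, sessions that have already loaded the dictionary keep using the old contents until the dictionary is re-read. The sketch below forces a re-read with an ALTER that restates an unchanged option, then checks whether a document containing the phrase can be found through either the substitute or an original word. The sample sentence and column aliases are hypothetical, and the configuration name assumes the russian mapping set up earlier in this section:

```sql
-- Restating an option without changing its value makes existing sessions
-- re-read the dictionary's configuration file.
ALTER TEXT SEARCH DICTIONARY thesaurus_astro (DictFile = thesaurus_astro);

-- Assumption: the thesaurus file now contains
--   supernovae stars : sn supernovae stars
-- Check whether the phrase is reachable via the substitute and via an original word.
SELECT to_tsvector('russian', 'a supernova star exploded')
         @@ plainto_tsquery('russian', 'sn')        AS matches_substitute,
       to_tsvector('russian', 'a supernova star exploded')
         @@ plainto_tsquery('russian', 'supernova') AS matches_original;
```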