The PostgreSQL 9.0 Reference Manual - Volume 1A - SQL Language Reference
by The PostgreSQL Global Development Group
10.3.4 Highlighting Results
To present search results it is ideal to show a part of each document and
how it is related to the query. Usually, search engines show fragments of
the document with marked search terms. PostgreSQL
provides a function ts_headline that
implements this functionality.
ts_headline([ config regconfig, ] document text, query tsquery [, options text ]) returns text
ts_headline accepts a document along
with a query, and returns an excerpt from
the document in which terms from the query are highlighted. The
configuration to be used to parse the document can be specified by
config; if config
is omitted, the
default_text_search_config configuration is used.
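As a rough illustration of the config argument, the following sketch calls ts_headline with and without an explicit configuration. The sample sentence is made up for this illustration, and the two SELECT statements produce the same headline only if default_text_search_config is set to english in the session, which is an assumption here.

```sql
-- Show which configuration is used when config is omitted.
SHOW default_text_search_config;

-- Explicit configuration: parse document and query as English.
SELECT ts_headline('english',
                   'The quick brown fox jumps over the lazy dog.'::text,
                   to_tsquery('english', 'fox'));

-- Configuration omitted: default_text_search_config is used instead.
SELECT ts_headline('The quick brown fox jumps over the lazy dog.'::text,
                   to_tsquery('fox'));
```

Passing the configuration explicitly keeps the result independent of session settings, which may be preferable in application code.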
If an options string is specified it must
consist of a comma-separated list of one or more
option=value pairs. The available options are:
StartSel, StopSel: the strings with which to delimit query words appearing in the document, to distinguish them from other excerpted words. You must double-quote these strings if they contain spaces or commas.
MaxWords, MinWords: these numbers determine the longest and shortest headlines to output.
ShortWord: words of this length or less will be dropped at the start and end of a headline. The default value of three eliminates common English articles.
HighlightAll: Boolean flag; if true the whole document will be used as the headline, ignoring the preceding three parameters.
MaxFragments: maximum number of text excerpts or fragments to display. The default value of zero selects a non-fragment-oriented headline generation method. A value greater than zero selects fragment-based headline generation. This method finds text fragments with as many query words as possible and stretches those fragments around the query words. As a result query words are close to the middle of each fragment and have words on each side. Each fragment will be of at most MaxWords and words of length ShortWord or less are dropped at the start and end of each fragment. If not all query words are found in the document, then a single fragment of the first MinWords in the document will be displayed.
FragmentDelimiter: When more than one fragment is displayed, the fragments will be separated by this string.
Any unspecified options receive these defaults:
StartSel=<b>, StopSel=</b>, MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE, MaxFragments=0, FragmentDelimiter=" ... "
For example:

SELECT ts_headline('english',
  'The most common type of search is to find all documents containing given query terms and return them in order of their similarity to the query.',
  to_tsquery('query & similarity'));
                   ts_headline
--------------------------------------------------
 containing given <b>query</b> terms and return them in order of their <b>similarity</b> to the <b>query</b>.

SELECT ts_headline('english',
  'The most common type of search is to find all documents containing given query terms and return them in order of their similarity to the query.',
  to_tsquery('query & similarity'),
  'StartSel = <, StopSel = >');
                   ts_headline
--------------------------------------------------
 containing given <query> terms and return them in order of their <similarity> to the <query>.
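The remaining options can be combined in the same way. The sketch below reuses the sample sentence from the example above and requests fragment-based headlines; the particular delimiter strings and word limits are arbitrary values chosen for illustration. Values that contain spaces are double-quoted, as the option list requires.

```sql
-- Fragment-based headline: up to two fragments of at most 10 words each,
-- with custom delimiters around the matched query words.
SELECT ts_headline('english',
  'The most common type of search is to find all documents containing given query terms and return them in order of their similarity to the query.',
  to_tsquery('english', 'query & similarity'),
  'StartSel="<< ", StopSel=" >>", MaxFragments=2, MaxWords=10, MinWords=5, FragmentDelimiter=" [...] "');

-- HighlightAll: the whole document is returned with the query words marked.
SELECT ts_headline('english',
  'The most common type of search is to find all documents containing given query terms and return them in order of their similarity to the query.',
  to_tsquery('english', 'query & similarity'),
  'HighlightAll=TRUE');
```

Output is omitted here because the exact fragment boundaries depend on the headline generation method and the chosen limits.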
ts_headline uses the original document, not a
tsvector summary, so it can be slow and should be used with
care. A typical mistake is to call ts_headline for
every matching document when only ten documents are
to be shown. SQL subqueries can help; here is an
example:
SELECT id, ts_headline(body, q), rank
FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
      FROM apod, to_tsquery('stars') q
      WHERE ti @@ q
      ORDER BY rank DESC
      LIMIT 10) AS foo;
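For contrast, the pattern described above as a typical mistake might look like the following sketch. It reuses the apod table and its id, body, and ti columns from the example, so those names are taken from the text rather than from any real schema. Because ts_headline appears in the same select list that is sorted and limited, the server may compute a headline for every matching row before all but ten are discarded.

```sql
-- Pattern to avoid: ts_headline sits in the outer select list, so it may be
-- evaluated for every row matching the query, not just the ten rows returned.
SELECT id, ts_headline(body, q), ts_rank_cd(ti, q) AS rank
FROM apod, to_tsquery('stars') q
WHERE ti @@ q
ORDER BY rank DESC
LIMIT 10;
```

Running both forms under EXPLAIN ANALYZE on a table with many matches is one way to check whether the subquery form makes a difference for a given data set.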