The PostgreSQL 9.0 Reference Manual - Volume 1A - SQL Language Reference
by The PostgreSQL Global Development Group. Paperback (6"x9"), 454 pages, ISBN 9781906966041, RRP £14.95 ($19.95). Sales of this book support the PostgreSQL project!
10.4.4 Gathering Document Statistics
The function ts_stat is useful for checking your
configuration and for finding stop-word candidates.
ts_stat(sqlquery text, [ weights text, ] OUT word text, OUT ndoc integer, OUT nentry integer) returns setof record
sqlquery is a text value containing an SQL
query which must return a single tsvector column.
ts_stat executes the query and returns statistics about
each distinct lexeme (word) contained in the tsvector
data. The columns returned are:
- word text: the value of a lexeme
- ndoc integer: number of documents (tsvectors) the word occurred in
- nentry integer: total number of occurrences of the word
If weights is supplied, only occurrences having one of those weights are counted.
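To make the difference between ndoc and nentry concrete, here is a minimal sketch run against two throwaway documents. The inline VALUES list and the choice of the simple configuration are assumptions made for illustration; they are not part of the apod example used in this section.

```sql
-- Two hypothetical documents, built inline so no table is needed.
-- The 'simple' configuration is assumed so that no stemming is applied.
SELECT word, ndoc, nentry
FROM ts_stat($$
    SELECT to_tsvector('simple', t)
    FROM (VALUES ('cat dog cat'), ('dog bird')) AS d(t)
$$)
ORDER BY word;
```

Here cat appears twice but in only one document, so it should be reported with ndoc = 1 and nentry = 2, while dog appears once in each document and should be reported with ndoc = 2 and nentry = 2.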
For example, to find the ten most frequent words in a document collection:
SELECT * FROM ts_stat('SELECT vector FROM apod')
ORDER BY nentry DESC, ndoc DESC, word
LIMIT 10;
The same, but counting only word occurrences with weight A
or B:
SELECT * FROM ts_stat('SELECT vector FROM apod', 'ab')
ORDER BY nentry DESC, ndoc DESC, word
LIMIT 10;
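Since ts_stat is also meant for finding stop-word candidates, one possible approach is to list the words that occur in a large share of the documents. The sketch below reuses the apod table and vector column from the examples above; the 50% threshold is an arbitrary assumption and should be tuned for the collection at hand.

```sql
-- Words present in more than half of all documents (threshold is illustrative).
SELECT word, ndoc, nentry
FROM ts_stat('SELECT vector FROM apod')
WHERE ndoc > 0.5 * (SELECT count(*) FROM apod)
ORDER BY ndoc DESC, word;
```

Words already removed as stop words when the tsvector values were built will not show up in this result, so it only reveals candidates the current configuration still lets through.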