[Xapian-discuss] Extract common phrases from index

Josh leftdrive at gmail.com
Thu Nov 6 04:47:56 GMT 2008


Is it possible to extract common phrases from an index?

Basically, I'd like to index my document set and find words that
commonly appear next to each other.

For example, if I indexed a set of recent political news articles, I'd
expect to find "John McCain", "Sarah Palin" and "Barack Obama".

Ideally I'd like to specify the phrase length (all 2-word phrases,
all 3-word phrases, and so on).
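
To make the idea concrete, here is a rough sketch of the kind of result
I'm after, written in plain Python and run over the raw text rather than
a Xapian index. The function name common_phrases, its parameters and the
sample sentences are all made up for illustration; I don't know whether
Xapian exposes anything equivalent.

```python
from collections import Counter
import re


def common_phrases(documents, n=2, top=10):
    """Return the `top` most frequent runs of n adjacent words."""
    counts = Counter()
    for text in documents:
        # Crude tokenizer (an assumption): lowercase, letters and apostrophes only.
        words = re.findall(r"[a-z']+", text.lower())
        # Slide a window of n words across the document.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(top)


# Invented sample data, standing in for the news articles.
docs = [
    "John McCain spoke today. Sarah Palin joined John McCain.",
    "Sarah Palin and John McCain held a rally.",
]
print(common_phrases(docs, n=2, top=2))
```

This counts across sentence boundaries and keeps every phrase in memory,
so it is only meant to show the output I have in mind, not how it should
be done at scale.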

Possible? Crazy? Point me in the right direction.

Thanks!
Josh


