[Xapian-discuss] Xapian and Synonyms

Richard Boulton richard at tartarus.org
Fri Jul 29 12:50:31 BST 2011


On 29 July 2011 12:13, Justin Finkelstein <justin at redwiredesign.com> wrote:
>    o When I search for 'eggplant', I get 22 results
>    o When I search for 'aubergine', I get 66

> Xapian::Query(((Zeggplant:(pos=1) SYNONYM aubergine:(pos=1)) AND
> <alldocuments>))
> Xapian::Query(((Zaubergin:(pos=1) SYNONYM eggplant:(pos=1)) AND
> <alldocuments>))
>
> Any thoughts on why the number of results would differ when searching
> for one word rather than the other?

The difference is due to stemming: the first query searches for
documents with any stemmed form of "eggplant", or exactly the word
"aubergine".  The second query is for stemmed forms of "aubergine" or
exactly the word "eggplant".

The crucial point is that "add_synonym" wants to be given terms, not
words: it stores the strings exactly as passed, with no stemming applied.

To get the behaviour you seem to want, you'd need to pass add_synonym
the stemmed forms of the words.  The easiest way to do this is to use
a Xapian::Stem object for the language in question.
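
As a rough illustration of the asymmetry and the fix, here is a toy model
in plain Python.  It is only a sketch and not Xapian's implementation:
toy_stem, build_index, search and the sample documents are all
hypothetical, and the "Z" prefix on stemmed terms simply mirrors the
query output quoted above.  Each word is indexed both unstemmed and as a
"Z"-prefixed stem, and a synonym is looked up as-is.

```python
def toy_stem(word):
    # Hypothetical stand-in for a real stemmer (e.g. Xapian::Stem);
    # it only strips a trailing "s" and then a trailing "e".
    if word.endswith("s"):
        word = word[:-1]
    if word.endswith("e"):
        word = word[:-1]
    return word


def build_index(docs):
    # Map each term to the set of document ids containing it.  Every word
    # is indexed twice: unstemmed, and as "Z" + its stem.
    index = {}
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
            index.setdefault("Z" + toy_stem(word), set()).add(doc_id)
    return index


def search(index, word, synonyms):
    # The query word is stemmed; synonyms are used exactly as stored.
    terms = ["Z" + toy_stem(word)] + synonyms.get(word, [])
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


DOCS = [
    "grilled eggplant",
    "two eggplants",
    "aubergine curry",
    "roast aubergines",
    "stuffed aubergines",
    "plain rice",
]
index = build_index(DOCS)

# Synonyms stored as plain words: each one matches only that exact word.
word_synonyms = {"eggplant": ["aubergine"], "aubergine": ["eggplant"]}

# Synonyms stored as stemmed terms: each one matches every stemmed form.
term_synonyms = {
    "eggplant": ["Z" + toy_stem("aubergine")],
    "aubergine": ["Z" + toy_stem("eggplant")],
}

for label, table in (("words", word_synonyms), ("terms", term_synonyms)):
    for query in ("eggplant", "aubergine"):
        print(label, query, sorted(search(index, query, table)))
```

With the synonyms stored as words the two searches return different sets
of documents; with the stemmed terms stored they return the same set.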

-- 
Richard


