[Snowball-discuss] german stemmer / Kürbisse
Grant Ingersoll
gsingers at apache.org
Wed Oct 28 01:04:02 GMT 2009
Hi Wolfgang,
I can't speak for Sphinx, but I think in general with stemming you
will always run into these kinds of situations where words that you
think should stem a particular way don't. The approach we take in
Lucene/Solr is to either have a protected word list or another
TokenFilter (in our chain) that handles the exceptions that we deem
important. YMMV with other search engines.
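
As a rough illustration of the exception-handling idea (a sketch only, not
Lucene/Solr or Sphinx API): wrap the stemmer so that an override table and a
protected word list are consulted first. Every name below is hypothetical, and
`toy_stem` is just a stand-in that reproduces the "kurbiss" result from the
quoted message; in practice you would pass in the real stemmer call.

```python
from typing import Callable, Dict, FrozenSet, Optional

# Hypothetical stand-in for a real stemmer such as libstemmer_de.
# It lowercases, folds umlauts, and strips one trailing "e", which is
# enough to reproduce the mismatch reported in this thread.
_UMLAUTS = str.maketrans("äöü", "aou")


def toy_stem(token: str) -> str:
    folded = token.lower().translate(_UMLAUTS)
    return folded[:-1] if folded.endswith("e") else folded


def make_stemmer(
    stem: Callable[[str], str],
    overrides: Optional[Dict[str, str]] = None,
    protected: FrozenSet[str] = frozenset(),
) -> Callable[[str], str]:
    """Wrap `stem` so that exceptions are handled before stemming.

    overrides: lowercased word -> stem to emit instead of the stemmer's output.
    protected: lowercased words that must pass through unstemmed.
    """
    overrides = overrides or {}

    def stem_token(token: str) -> str:
        key = token.lower()
        if key in overrides:   # explicit exception wins
            return overrides[key]
        if key in protected:   # protected word: leave it alone
            return key
        return stem(token)     # everything else goes to the stemmer

    return stem_token


# Map the plural onto the same stem the singular produces.
stem_token = make_stemmer(toy_stem, overrides={"kürbisse": "kurbis"})
```

With the override in place, `stem_token("Kürbisse")` and
`stem_token("Kürbis")` both yield "kurbis", so the two forms match at search
time. The same table has to be applied at index time and at query time.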
-Grant
On Oct 22, 2009, at 8:50 AM, Wolfgang Klinger wrote:
>
> *hiya!*
>
> I use the Sphinx search engine and have problems with
> libstemmer_de.
>
> I have text that includes the German word "Kürbis".
> The plural of "Kürbis" is "Kürbisse". Now if I search for "Kürbisse"
> I would expect results for "Kürbis" too (that's why I use libstemmer).
>
> Apparently libstemmer_de produces "kurbiss" as the stemmed form instead
> of "kurbis", and therefore I get no results.
> Is that a known problem? How can I solve it?
>
>
> tia, kind regards
> Wolfgang