[Snowball-discuss] German Stemmer
Tobias N. Sasse
tobi at byte23.de
Tue Nov 3 15:28:38 GMT 2009
Hi Guys,
I am a German computer science student, currently doing research on
text analysis systems. I need stemmers for several languages
(a good start would be English, German, French, Spanish...).
I had a quick look at the German stemmer on your site and sadly
noticed that it produces a lot of errors. For
instance, the output
"katze" -> "katz"
"kätzchen" -> "katzch"
"kätzchens" -> "katzch"
is wrong: there is no German word "katzch"; it should be "katze" (the
actual stem). "katz" is also wrong, because the trailing "e" is missing.
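To make the comparison concrete, here is a minimal Python sketch that tabulates the stems listed above against the stem I would expect. It does not call the Snowball stemmer itself; the observed values are copied from the examples in this message, the expected stem "katze" is only my own expectation, and the names `observed`, `expected` and `mismatches` are made up for illustration.

```python
# Stems reported in this message for the German stemmer (copied by hand,
# not produced by calling the stemmer here).
observed = {
    "katze": "katz",
    "kätzchen": "katzch",
    "kätzchens": "katzch",
}

# The stem expected for each word. This is the poster's assumption,
# not taken from a linguistic reference.
expected = {word: "katze" for word in observed}


def mismatches(observed, expected):
    """Return (word, got, wanted) for each word whose stem differs from the expected one."""
    return [
        (word, stem, expected[word])
        for word, stem in observed.items()
        if stem != expected[word]
    ]


for word, got, wanted in mismatches(observed, expected):
    print(f"{word!r}: got {got!r}, expected {wanted!r}")
```

The same table could be extended with more word families to get a rough picture of how often the output differs from what a reader expects.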
So my question is: do you know of an improved version, or an alternative
algorithm? And how good are the stemmers for the other languages?
I am not a linguist, so I can't judge their quality myself.
Thanks for your info
Tobi
---
Tobias N. Sasse
tobi at byte23.de
http://tobi.byte23.de