[Snowball-discuss] New member questions
Martin Porter
martin.f.porter2 at gmail.com
Fri Jan 24 21:05:36 GMT 2025
Harri,
You will have to excuse me for not being completely on top of the subject
of snowball these days, but the main work was done a quarter of a century
ago, and I am now 80 years of age. Papers evaluating snowball were often a
bit negative: that is because they were testing some new system of the
authors' against a baseline system, and the baseline would often use
snowball. If they outperformed the baseline their paper would be published;
if not, they would hold it back and try again. There was a general paper with
comparisons of IR with/without snowball for several languages, which put
snowball in a much better light, but I can't quite find it at the moment.
If I do I'll let you know. But for Finnish itself this is encouraging,
http://clef.isti.cnr.it/2004/working_notes/WorkingNotes2004/16.pdf
with their conclusion,
"For Finnish, the performance is quite high. The snowball stemmer works
very well."
(Incidentally, the stemmers do not reduce words to real vocabulary words,
just to a character string that collects variant forms together.)
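
As a rough illustration of that point, here is a minimal sketch of how
a stemmer can group variant forms under a string that is not itself a word.
The suffix list and the names toy_stem and conflate are made up for this
example; this is not the Snowball algorithm, just a toy suffix stripper.

```python
from collections import defaultdict

# Hypothetical suffix list for illustration only; NOT the Snowball rules.
SUFFIXES = ("ing", "ed", "es", "e", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def conflate(words):
    """Group words by their stem, preserving input order within a group."""
    groups = defaultdict(list)
    for w in words:
        groups[toy_stem(w)].append(w)
    return dict(groups)


groups = conflate(["conflate", "conflates", "conflated", "conflating"])
print(groups)
```

All four forms end up under the key "conflat", which is not an English
word but serves perfectly well as an index term.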
I think the final statement on these stemming algorithms must be that they
are a simple and inexpensive way of conflating variant forms of a word
together, and that this can be useful in certain circumstances.
Martin