[Snowball-discuss] evaluation of Snowball stemmers
Fred Gey
gey at uclink.berkeley.edu
Fri Dec 10 22:44:36 GMT 2004
I meant that the performance with stemming was 29%/56% better than
performance without stemming, computed as
((prec-with minus prec-without)/prec-without) * 100. I attach the fragment
of text (as a Word document) which describes the experiments.
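For anyone who wants to check the arithmetic, here is a minimal sketch of
that relative-improvement calculation. The function name and the sample
precision values are assumptions made up for illustration; they are not
figures from the experiments.

```python
def relative_improvement(prec_with: float, prec_without: float) -> float:
    """Percentage change in precision from stemming, relative to the
    unstemmed baseline: ((prec_with - prec_without) / prec_without) * 100.
    """
    if prec_without <= 0:
        raise ValueError("baseline precision must be positive")
    return (prec_with - prec_without) / prec_without * 100


# Hypothetical values, chosen only to show the calculation.
prec_without = 0.25  # assumed average precision without stemming
prec_with = 0.30     # assumed average precision with stemming

change = relative_improvement(prec_with, prec_without)
print(f"{change:.1f}% change relative to the unstemmed run")
```

A positive result means the stemmed run scored higher than the unstemmed
baseline; a negative result would mean stemming hurt.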
Fred
-----Original Message-----
From: Martin Porter [mailto:martin.porter at grapeshot.co.uk]
Sent: Friday, December 10, 2004 2:26 PM
To: gey at berkeley.edu; 'Diana Maynard'
Cc: snowball-discuss at lists.tartarus.org
Subject: RE: [Snowball-discuss] evaluation of Snowball stemmers
Fred,
Do you mean you got a 29%/56% average precision improvement when you
switched stemming off? Anything is possible, but this does surprise me: I
would have expected Russian, with its highly (and regularly) inflected
vocabulary to do quite well under stemming.
If you look at the paper at
http://clef.isti.cnr.it/2004/working_notes/WorkingNotes2004/16.pdf
(Mono- and Crosslingual Retrieval Experiments at the University of
Hildesheim - René Hackl, Thomas Mandl and Christa Womser-Hacker) the
evidence, for Finnish, points the other way ("the snowball stemmer works
very well"). Their Russian experiments were unfortunately not taken to a
conclusion, but I myself have much more confidence in the snowball Russian
stemmer than in the snowball Finnish stemmer.
On the other hand I have had verbal notice (which I did not entirely trust!)
of the Finnish stemmer doing badly in some other tests.
I should point out that although the version of the stemmer you picked up
works for KOI-8, Snowball is designed to make switching to other character
codes as easy as possible. See the notes at
http://snowball.tartarus.org/codesets/guide.html
Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: russian-monolingual-additiona-experiments.doc
Type: application/msword
Size: 34816 bytes
Desc: not available
Url : http://lists.tartarus.org/mailman/private/snowball-discuss/attachments/20041210/a25b0adb/russian-monolingual-additiona-experiments-0001.doc