[Snowball-discuss] Differences between actual and expected stems in terms of the finnish java stemmer

pprett at sbox.tugraz.at pprett at sbox.tugraz.at
Tue Sep 14 15:53:32 BST 2004


Hi,
first of all I want to thank Martin and his team for their outstanding work
on stemming.

I'm currently writing a testing framework for different stemmers. While
testing the Finnish Snowball stemmer (Java implementation), I found some
differences between the actual and the expected stems.
The input test file and the expected-stems test file were those provided on the
Snowball project homepage.
The differences only affected words ending in the suffixes "seen", "siin", "den" and
"tten".
e.g.: inflaatiotavoitteeseen -> inflaatiotavoittees (expected: inflaatiotavoit)
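
The comparison itself can be sketched roughly as follows. This is a simplified
illustration, not the actual framework: the class name StemDiff, the method
findMismatches and the stemmer passed in as a UnaryOperator<String> are all
assumptions made for this example. In practice the operator would wrap a call
to the generated Finnish stemmer class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper (not part of Snowball): compares the stems produced
// by a stemmer with the expected stems, line by line.
final class StemDiff {

    // Returns one entry per mismatch, formatted as
    // "word -> actual (expected: expected)".
    static List<String> findMismatches(List<String> words,
                                       List<String> expected,
                                       UnaryOperator<String> stemmer) {
        if (words.size() != expected.size()) {
            throw new IllegalArgumentException(
                "word list and expected-stem list differ in length");
        }
        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            String actual = stemmer.apply(word);   // stem produced by the stemmer under test
            String wanted = expected.get(i);       // stem from the expected-output file
            if (!actual.equals(wanted)) {
                mismatches.add(word + " -> " + actual + " (expected: " + wanted + ")");
            }
        }
        return mismatches;
    }
}
```

The two lists could be filled with java.nio.file.Files.readAllLines from the
input file and the expected-stems file, so that line i of one file corresponds
to line i of the other.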

After some research I discovered that the methods r_LONG and r_VI return wrong
values. The error may be in the method find_among_b, but I'm not quite
sure.

regards,
        peter



