[Snowball-discuss] Testing a stemmer

A. Tordai atordai at science.uva.nl
Thu Feb 23 17:35:38 GMT 2006


Yes, I'm sorry I wasn't clear. What I mean is that I have finished the
Hungarian stemmers and need to create a word list to submit with them to
the Snowball site. I've downloaded the libstemmer_c package, where I found
a program called stemwords.c that can be used to stem an entire list of
words. Unfortunately, I don't understand how to use it with a stemmer of
my own making. The modules.h file in the libstemmer directory says that it
must not be edited manually and that the module names come from a
mkmodules.pl file, which isn't in the package.

In other words, is there some way I can insert the C version of my stemmer
somewhere so that I can stem a word list using this package?

Thank you

Anna Tordai

> This seems to be a very general question, as it raises the whole issue of
> stemmer evaluation! But the simplest test is to arrange for two column
> output (http://snowball.tartarus.org/algorithms/german/diffs.txt etc.) and
> inspect it by hand.
>
> It can be useful to work with a similar list, sorted from the end of the
> word (a reverse index). Lovins mentions using these two lists in her early
> paper.
>
> Martin
>
>>Dear Snowball people,
>>
>>What would be the simplest way of testing a self-made stemmer on a word
>>list? That is, is there some kind of testing program?
>>
>>Thank you,
>>
>>Anna Tordai
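
The two checks Martin describes (two-column output for inspection by hand,
and a list sorted from the end of the word) can be sketched without
libstemmer at all. The following is only a rough illustration in Python:
toy_stem is a hypothetical placeholder that strips a few English suffixes,
not the Hungarian algorithm, and the function names and the column width
are assumptions made for the example. A real stemmer would be substituted
for toy_stem.

```python
from typing import Callable, Iterable, List


def toy_stem(word: str) -> str:
    """Hypothetical placeholder stemmer (NOT the Hungarian algorithm)."""
    for suffix in ("ing", "es", "s"):
        # Strip the suffix only if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def two_column(words: Iterable[str], stem: Callable[[str], str],
               width: int = 24) -> List[str]:
    """One line per distinct word: the word padded to `width`, then its stem."""
    return [f"{w:<{width}}{stem(w)}" for w in sorted(set(words))]


def reverse_index(words: Iterable[str]) -> List[str]:
    """Sort distinct words by reversed spelling so shared endings are adjacent."""
    return sorted(set(words), key=lambda w: w[::-1])


if __name__ == "__main__":
    sample = ["walking", "walks", "boxes", "box", "talking"]
    # Alphabetical two-column listing, for inspection by hand.
    for line in two_column(sample, toy_stem):
        print(line)
    print()
    # The same pairs, ordered from the end of the word.
    for w in reverse_index(sample):
        print(f"{w:<24}{toy_stem(w)}")
```

In the second listing, words with the same ending sit next to each other,
which makes it easier to see whether a given suffix is handled
consistently.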