[Snowball-discuss] -nisse ending in German stemmer

Ralf Junker ralfjunker at gmx.de
Wed Dec 16 10:43:25 GMT 2009


On 16.12.2009 10:32, Martin Porter wrote:

> If I remember correctly, the test data is rebuilt automatically
> after modified algorithms have been put in place on the snowball
> site.

If the test data is automatically rebuilt from modified stemmer sources,
any stemmer errors will propagate into it.

I was under the impression that the test data is intended for stemmer
regression testing. Is it not?

IMO, the test data should be "authoritative" - and hence manually
checked - so that it can be used to verify stemmer implementations
(be it in C or other programming languages). Otherwise, what would
be its purpose?
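
To make the regression-testing idea concrete, here is a minimal sketch in
Python. It assumes the test data is a word list plus a hand-checked list of
expected stems, one per word. The names check_stemmer and toy_stem and the
sample words are hypothetical and only for illustration; they are not part
of Snowball.

```python
def check_stemmer(stem, vocabulary, expected):
    """Return (word, expected_stem, actual_stem) for every mismatch."""
    if len(vocabulary) != len(expected):
        raise ValueError("vocabulary and expected output differ in length")
    failures = []
    for word, want in zip(vocabulary, expected):
        got = stem(word)
        if got != want:
            failures.append((word, want, got))
    return failures


# Hypothetical toy stemmer standing in for a real implementation:
# it simply strips one trailing "s".
def toy_stem(word):
    return word[:-1] if word.endswith("s") else word


# Hand-checked ("authoritative") data, independent of the stemmer above.
vocabulary = ["cats", "dog", "glass"]
expected = ["cat", "dog", "glass"]

failures = check_stemmer(toy_stem, vocabulary, expected)
for word, want, got in failures:
    print(f"{word}: expected {want!r}, got {got!r}")
```

The check only catches the "glass" error because the expected stems were
written by hand. If they were regenerated from toy_stem itself, the
comparison would pass trivially and the error would go unnoticed.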

Ralf


