[Snowball-discuss] -nisse ending in German stemmer

Martin Porter martin.porter at grapeshot.co.uk
Fri Dec 11 17:26:48 GMT 2009




Wolfgang Klinger pointed out in October that the German stemmer reduces
Kürbisse (pumpkins) to Kürbiss rather than Kürbis, and I promised to
investigate.

In the sample German vocabulary there is a collection of words ending
-isse (or -issen or -isses). Among these, about 70% actually have the
ending -nisse, while 30% have -isse without the preceding n. For those
with the -nisse ending, stemming to -nis is always correct. For those
with the -isse ending and no preceding n, stemming to -is would be wrong
in all but a couple of cases, one being the word Kürbisse again.
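
As a rough illustration of how such a tally might be made against a word
list, here is a minimal Python sketch. The helper name tally_isse and the
handful of sample words are assumptions for illustration only; they are not
the sample vocabulary or the script behind the figures above.

```python
import re

# Hypothetical helper, not part of Snowball: matches a final -isse, -issen
# or -isses and captures whether an 'n' immediately precedes it.
ISSE = re.compile(r"(n?)iss(?:e|en|es)$")

def tally_isse(words):
    """Count words ending -isse/-issen/-isses, split by a preceding 'n'."""
    counts = {"nisse": 0, "isse": 0}
    for word in words:
        match = ISSE.search(word.lower())
        if match is None:
            continue  # word does not end in -isse, -issen or -isses
        counts["nisse" if match.group(1) else "isse"] += 1
    return counts

# Illustrative words only, not the real sample vocabulary.
sample = ["Ergebnisse", "Kenntnissen", "Hindernisses", "Kürbisse", "Risse", "Haus"]
print(tally_isse(sample))
```

Run over a full vocabulary file, the two counts give the kind of split
described above.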

So I've put in a new rule, to reduce -nisse (and -nissen and -nisses) to
-nis. 
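
The effect of the rule, taken in isolation, can be sketched in Python as
below. This is only an illustration: the function name reduce_nisse is made
up, and the real stemmer is written in Snowball and applies its other steps
and conditions, none of which are modelled here.

```python
import re

# Hypothetical illustration of the new rule only; the actual German stemmer
# does more than this (other suffix steps, region checks, and so on).
NISSE = re.compile(r"niss(?:e|en|es)$")

def reduce_nisse(word):
    """Rewrite a final -nisse, -nissen or -nisses to -nis; leave other words alone."""
    return NISSE.sub("nis", word)

for w in ["Ergebnisse", "Hindernissen", "Zeugnisses", "Kürbisse"]:
    print(w, "->", reduce_nisse(w))
```

A word in -isse with no preceding n, such as Kürbisse, is left untouched by
this rule on its own.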

Thanks to Wolfgang for this pointer.

Note, however, that Kürbisse still stems to Kürbiss after this change,
since it does not end in -nisse.


Martin




