[Snowball-discuss] Words 'rheden' and 'heden' in the Dutch stemmer

Fred Oranje fred.oranje at oranje-bv.nl
Wed Mar 28 16:04:33 BST 2012


Hi everyone,

While implementing the Dutch stemmer in Delphi I came across an issue with the words 'rheden' and 'heden'.

In my opinion the stemmed versions of these words should be 'rhed' and 'hed', but the Snowball implementation returns 'rheden' and 'heden'. Step 1 of the Snowball algorithm handles both the 'heden' and the 'en' suffix. Because 'heden' and 'rheden' both end in 'heden', that longer suffix is the one selected, so the removal of 'en' is never attempted, even though 'heden' itself isn't removed in those cases (it does not lie in R1). My implementation gives a different result for these two words.

The word 'rhenen' does stem to 'rhen', so I think 'rheden' and 'heden' should be stemmed the same way.
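
To make the difference concrete, here is a minimal Python sketch of step 1 only. It is an illustration under assumptions, not the Snowball source and not my Delphi code: the function names and the longest_match_only flag are made up for this example, the 's'/'se' rule and the accent and Y/I preprocessing are left out, and the fallback branch is just one possible reading of the behaviour I am proposing.

```python
VOWELS = set("aeiouyè")


def r1_start(word):
    """Index where R1 begins: just after the first non-vowel that follows
    a vowel, moved right if needed so at least 3 letters precede R1."""
    for i in range(1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return max(i + 1, 3)
    return len(word)  # R1 is empty


def valid_en_ending(stem):
    """'en' may be removed after a non-vowel, but not after 'gem'."""
    return bool(stem) and stem[-1] not in VOWELS and not stem.endswith("gem")


def undouble(stem):
    """Reduce a final kk, dd or tt to a single letter."""
    return stem[:-1] if stem.endswith(("kk", "dd", "tt")) else stem


def step1(word, longest_match_only=True):
    """Sketch of step 1 ('heden' and 'en'/'ene' rules only).

    longest_match_only=True  : the longest matching suffix decides, so a
                               failed 'heden' test ends the step.
    longest_match_only=False : hypothetical variant that falls back to the
                               'en' rule when 'heden' is not in R1.
    """
    r1 = r1_start(word)
    if word.endswith("heden"):
        if len(word) - 5 >= r1:          # 'heden' lies inside R1
            return word[:-5] + "heid"
        if longest_match_only:
            return word                  # 'en' is never tried
    for suffix in ("ene", "en"):
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if len(stem) >= r1 and valid_en_ending(stem):
                return undouble(stem)
            return word
    return word


if __name__ == "__main__":
    for w in ("heden", "rheden", "rhenen"):
        print(w, step1(w), step1(w, longest_match_only=False))
```

With the default setting the sketch leaves 'heden' and 'rheden' unchanged and turns 'rhenen' into 'rhen'; with longest_match_only=False it gives 'hed' and 'rhed'.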

What do you guys think?

Kind regards,

Fred Oranje


