[Snowball-discuss] Words 'rheden' and 'heden' in the Dutch stemmer
Fred Oranje
fred.oranje at oranje-bv.nl
Wed Mar 28 16:04:33 BST 2012
Hi everyone,
While implementing the Dutch stemmer in Delphi I came across an issue with the words 'rheden' and 'heden'.
In my opinion the stems of these words should be 'rhed' and 'hed', but the Snowball implementation returns 'rheden' and 'heden'. Step 1 of the Snowball algorithm handles both the 'heden' suffix and the 'en' suffix, but because 'rheden' and 'heden' themselves end in 'heden', the removal of 'en' is never attempted, even though 'heden' isn't removed in those cases. My implementation differed in those two cases.
The word 'rhenen' does stem to 'rhen', so I think 'rheden' and 'heden' should be stemmed the same way.
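
To make the difference concrete, here is a minimal Python sketch of step 1 only. It is a simplification under stated assumptions, not the Snowball source: the function names are made up for illustration, the 's'/'se' suffixes and the y/i prelude are left out, and R1 is taken to start after the first non-vowel that follows a vowel, with at least 3 letters before it. The first function only looks at the longest matching suffix; the second falls back to the 'en' rule when 'heden' matches but lies outside R1, which is roughly what my implementation does.

```python
VOWELS = set("aeiouyè")

def r1_start(word):
    """Index where R1 begins (simplified: no special handling of y/i)."""
    for i in range(1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            # R1 starts after this non-vowel, but never before position 3.
            return min(max(i + 1, 3), len(word))
    return len(word)

def valid_en_ending(stem):
    """'en'/'ene' may be removed after a non-vowel, but not after 'gem'."""
    return bool(stem) and stem[-1] not in VOWELS and not stem.endswith("gem")

def undouble(stem):
    """Reduce a final kk, dd or tt to a single letter."""
    return stem[:-1] if stem.endswith(("kk", "dd", "tt")) else stem

def remove_en(word, r1):
    """Apply the 'en'/'ene' rule; return the word unchanged if it does not apply."""
    for suffix in ("ene", "en"):
        if word.endswith(suffix):
            pos = len(word) - len(suffix)
            if pos >= r1 and valid_en_ending(word[:pos]):
                return undouble(word[:pos])
            return word
    return word

def step1_longest_match(word):
    """Only the longest matching suffix is considered (no fallback)."""
    r1 = r1_start(word)
    if word.endswith("heden"):
        pos = len(word) - len("heden")
        # 'heden' matched: replace it if it lies in R1, otherwise do nothing.
        return word[:pos] + "heid" if pos >= r1 else word
    return remove_en(word, r1)

def step1_with_fallback(word):
    """Hypothetical variant: try 'en' when 'heden' matches outside R1."""
    r1 = r1_start(word)
    pos = len(word) - len("heden")
    if word.endswith("heden") and pos >= r1:
        return word[:pos] + "heid"
    return remove_en(word, r1)

for w in ("heden", "rheden", "rhenen"):
    print(w, step1_longest_match(w), step1_with_fallback(w))
```

With this sketch the longest-match version leaves 'heden' and 'rheden' unchanged and turns 'rhenen' into 'rhen', while the fallback version gives 'hed', 'rhed' and 'rhen'.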
What do you guys think?
Kind regards,
Fred Oranje