[Snowball-discuss] (Case FC2450947) Stemming Special Cases

Martin Porter martin.f.porter at gmail.com
Sat May 24 09:54:18 BST 2014


(Sorry for the terrible delay in replying, which was caused by
technical difficulties with snowball-discuss.)

The paste / past error is interesting; I had never noticed it before.
The -e is of course removed so that paste conflates with pasting. The
alternative would be to restore the -e after removing the -ing from
pasting and to leave paste invariant. That would not work if "past"
was a verb. For example there is a problem with

rout route

which are distinct verbs in English. "routing" is the -ing form of
both verbs, and therefore ambiguous.
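
A minimal sketch of the conflation and of the ambiguity described above.
This is a toy suffix stripper written for illustration only, not the
Porter or Snowball English stemmer; the names toy_stem and
ing_candidates are hypothetical.

```python
# Toy illustration only: NOT the Porter or Snowball English stemmer.

def toy_stem(word):
    """Strip a final -ing, or else a final -e (the two removals discussed)."""
    if word.endswith("ing"):
        return word[:-3]
    if word.endswith("e"):
        return word[:-1]
    return word


def ing_candidates(word):
    """Base forms that restoring -e after removing -ing would have to choose between."""
    stem = word[:-3]
    return [stem, stem + "e"]


if __name__ == "__main__":
    # paste, pasting and past all end up as the same string.
    for w in ["paste", "pasting", "past", "rout", "route", "routing"]:
        print(w, "->", toy_stem(w))
    # Both candidates are real verbs, so there is no single right answer.
    print(ing_candidates("routing"))
```

Removing the -e is what lets paste and pasting meet, at the price of
also meeting past. Restoring the -e instead would need a way to pick
between the two candidates for routing, which the spelling alone does
not provide.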

One could extend the definition of shortv to include "st" preceded by
"a" and possibly preceded by consonants. Verbs affected are

taste
paste
waste
haste

and no others I think. Again we're in luck, in that tast, past, and
wast are not verbs in English. On the other hand tast, hast, and wast are
hardly words of contemporary English, so it comes down to just paste /
past again. Despite the appeal of a general rule, it should be fixed by
an addition to the exception lists.
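
The two options can be sketched side by side. This is a hedged
illustration, not Snowball code: the regular expression is only one
possible reading of "st preceded by a and possibly preceded by
consonants", and the exception entries are hypothetical examples, not
the contents of the real exception lists.

```python
import re

# Assumption: the extended pattern covers a whole stem made of optional
# consonants followed by "ast" (tast, past, wast, hast).
EXTENDED_PATTERN = re.compile(r"^[^aeiou]*ast$")


def stem_with_rule(word):
    """General-rule option: restore -e after -ing, keep -e on the base form."""
    if word.endswith("ing"):
        stem = word[:-3]
        return stem + "e" if EXTENDED_PATTERN.match(stem) else stem
    if word.endswith("e") and EXTENDED_PATTERN.match(word[:-1]):
        return word  # paste, taste, ... stay invariant
    return word


# Exception-list option: map the affected forms directly.
# Hypothetical entries chosen for this sketch.
EXCEPTIONS = {"paste": "paste", "pasting": "paste"}


def stem_with_exceptions(word):
    """Look the word up first; a real stemmer would fall back to its normal steps."""
    return EXCEPTIONS.get(word, word)  # placeholder fallback: word unchanged


if __name__ == "__main__":
    for w in ["paste", "pasting", "past"]:
        print(w, stem_with_rule(w), stem_with_exceptions(w))
```

In both versions paste and pasting share a stem and past is left alone.
The pattern as written here is deliberately literal, so it would also
fire on any other stem of the same shape, such as last or cast, whereas
the exception list touches only the words it names.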

Martin


