[Snowball-discuss] English stemmer and 'ian' suffix

Martin Porter martin.f.porter at gmail.com
Fri May 24 21:44:04 BST 2013


Dear J F,

Thanks for your enquiry.

There is no special reason, except that removing -ian often does not
help things much. Think of agrarian, patrician, prussian, utilitarian
etc. A real problem is that with -ian, you sometimes want to remove
the whole three letters, as in orwellian, keynesian, which you cite,
sometimes just -an, as in antiquarian, historian, italian, and
sometimes just -n, as in indian, persian, bolivian. In general, the
snowball stemmers avoid dealing with the rarer suffixes, and this is
discussed in the introductory document, so in that sense I guess it
has come up before.
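
To illustrate the point about the three removal lengths, here is a minimal
Python sketch. It is not part of the Snowball English stemmer. The names
GROUPS, strip_suffix, wanted_stems and score_rule are hypothetical, and the
"wanted" stems are only an assumption: each word with the suffix named above
stripped off. The sketch counts how many of the example words a single fixed
rule would get right.

```python
# Hypothetical illustration only; not code from the Snowball project.
# Words are grouped by the suffix the message says should be removed.
GROUPS = {
    "ian": ["orwellian", "keynesian"],
    "an": ["antiquarian", "historian", "italian"],
    "n": ["indian", "persian", "bolivian"],
}


def strip_suffix(word, suffix):
    """Remove suffix from word if present, otherwise return word unchanged."""
    return word[:-len(suffix)] if word.endswith(suffix) else word


def wanted_stems():
    """Map each example word to the stem assumed to be wanted for it."""
    return {w: strip_suffix(w, s) for s, ws in GROUPS.items() for w in ws}


def score_rule(suffix):
    """Count the example words for which always removing suffix is right."""
    wanted = wanted_stems()
    return sum(1 for w, stem in wanted.items()
               if strip_suffix(w, suffix) == stem)


if __name__ == "__main__":
    total = len(wanted_stems())
    for suffix in GROUPS:
        print(f"always remove -{suffix}: {score_rule(suffix)} of {total} correct")
```

No single rule reproduces all eight wanted stems: each one is right only for
its own group of words, so a correct treatment would need to know which group
a word belongs to.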

Martin

On Fri, May 24, 2013 at 8:07 AM,  <jf at dockes.org> wrote:
> Hello,
>
> Is there a reason why the English stemmer does not seem to
> handle a 'ian' suffix: politician, orwellian, keynesian... ?
>
> I guess that the question already came up ?
>
> Regards,
>
> J.F. Dockès


