[Snowball-discuss] Porter2 algorithm question
Martin Holmes
mholmes at uvic.ca
Fri May 31 17:25:43 BST 2019
Thanks Martin! That's really helpful. I was confused because I started
from the Snowball (Porter2) stemmer without realizing that this logic is
laid out in the description of the original stemmer and is, I guess, to
some extent assumed in the description of Porter2 that I was using:
<http://snowball.tartarus.org/algorithms/english/stemmer.html>
Cheers,
Martin
On 2019-05-31 12:08 a.m., Martin Porter wrote:
> See the first item in the list under the heading "common errors"
> in
>
> https://tartarus.org/martin/PorterStemmer/
>
> Only one rule is applied for these lists of endings, irrespective of
> whether it results in a suffix removal or not. The order of the
> suffixes in these lists is effectively random, although there is often
> some plan to them. For example the list
>
> ational tional enci anci izer ....
>
> is by alphabetic order of the last letter but one, -a-, -a-, -c-, -c-,
> -e-, ... to emphasize the suggestion, "the test for the string S1 can
> be made fast by doing a program switch on the penultimate letter of
> the word being tested". See the appearance of this phrase in
>
> http://snowball.tartarus.org/algorithms/porter/stemmer.html
>
> -- Martin
>
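
To make the two points in the quoted message concrete, here is a minimal
Python sketch. It is an illustration under stated assumptions, not the
reference implementation: it uses only the five suffixes quoted above, with
the replacements and the (m>0) condition from step 2 of the original Porter
algorithm (Porter2 states its condition differently). The names
is_consonant, measure, STEP2_BY_PENULTIMATE and apply_step2 are made up for
this example.

```python
VOWELS = "aeiou"


def is_consonant(word, i):
    """Porter's definition: a, e, i, o, u are vowels; y is a vowel only
    when it follows a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        # y is a consonant at the start of the word or after a vowel
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Porter's m: the number of vowel-to-consonant transitions,
    i.e. the m in [C](VC)^m[V]."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


# Illustrative subset of the step 2 list, grouped by the penultimate
# letter of the suffix, so one dictionary lookup narrows the candidates.
STEP2_BY_PENULTIMATE = {
    "a": [("ational", "ate"), ("tional", "tion")],
    "c": [("enci", "ence"), ("anci", "ance")],
    "e": [("izer", "ize")],
}


def apply_step2(word):
    if len(word) < 2:
        return word
    # "Program switch" on the penultimate letter of the word being tested.
    candidates = STEP2_BY_PENULTIMATE.get(word[-2], [])
    matches = [(s, r) for s, r in candidates if word.endswith(s)]
    if not matches:
        return word
    # Only the longest matching suffix is considered ...
    suffix, replacement = max(matches, key=lambda sr: len(sr[0]))
    stem = word[: -len(suffix)]
    # ... and if its condition fails, no other rule in the list is tried.
    if measure(stem) > 0:
        return stem + replacement
    return word


print(apply_step2("relational"))   # relate
print(apply_step2("conditional"))  # condition
print(apply_step2("rational"))     # rational
```

The last call shows the "common error" case: "rational" matches both
"ational" and "tional", the longer suffix is selected, its condition fails
because the stem "r" has m = 0, and the word is left unchanged rather than
falling through to the "tional" rule.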
--
------------------------------------------
Martin Holmes
UVic Humanities Computing and Media Centre