[Snowball-discuss] Question about Porter2 Step 4

Håvard Lindset lindset@webpixels.net
Sun Oct 19 22:48:01 2003


Hi folks,

This is what the Porter2 definition
(http://snowball.tartarus.org/english/stemmer.html) has to say about one
part of Step 4:

> Search for the longest among the following suffixes, and,
> if found and in R2, perform the action indicated.
>
> ... (removed the non-relevant part of step 4)
>
> ion
>    delete if preceded by s or t
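
Read literally, the quoted "ion" rule amounts to something like the Python
sketch below. This is only an illustration of that one rule, not the Snowball
implementation: the function name step4_ion and the r2_start parameter (the
index at which R2 begins) are hypothetical, and the rest of Step 4 is left out.

```python
def step4_ion(word: str, r2_start: int) -> str:
    """Apply only the 'ion' rule of Step 4, as literally quoted above.

    r2_start is a hypothetical parameter: the index where R2 begins.
    """
    suffix = "ion"
    if not word.endswith(suffix):
        return word
    suffix_start = len(word) - len(suffix)
    # The suffix must lie entirely inside R2.
    if suffix_start < r2_start:
        return word
    # Delete only if the letter just before the suffix is s or t.
    if suffix_start > 0 and word[suffix_start - 1] in ("s", "t"):
        return word[:suffix_start]
    return word


# R2 of "unquestion" starts at index 6 ("tion"), as in the diagram further down.
print(step4_ion("unquestion", 6))
```

Applied on its own to "unquestion" with R2 starting at "tion", this reading
removes "ion".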

When I feed the word "unquestionably" to my stemmer, it returns "unquest",
while the provided sample list of stemmed words shows it being stemmed to
"unquestion" (and so does the demo at
http://snowball.tartarus.org/demo.php?words=unquestionably).

When step 4 kicks in, this is what the word looks like:

  u n q u e s t i o n
     |       |
     |       R2------
     R1--------------

According to the Porter2 definition described on the site, "ion" should be
removed here because it is preceded by a "t" and is located in R2.
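
The regions drawn above can be double-checked with a short Python sketch. It
assumes the usual Porter2 definition of R1 (the region after the first
non-vowel that follows a vowel) and of R2 (the same rule applied inside R1),
with a, e, i, o, u, y as vowels. It is a simplification: the special-case
prefixes and the handling of consonantal "y" in the full definition are
ignored, and the helper names are made up for this example.

```python
VOWELS = set("aeiouy")


def region_start(word: str, start: int = 0) -> int:
    """Index just past the first non-vowel that follows a vowel,
    scanning from `start`; len(word) if there is no such position."""
    for i in range(start + 1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)


def regions(word: str) -> tuple[int, int]:
    """Return the start indices of R1 and R2 (simplified, see note above)."""
    r1 = region_start(word)
    r2 = region_start(word, r1)
    return r1, r2


word = "unquestion"
r1, r2 = regions(word)
print("R1:", word[r1:])
print("R2:", word[r2:])

# Does the suffix "ion" lie entirely inside R2?
ion_in_r2 = word.endswith("ion") and len(word) - 3 >= r2
print("ion in R2:", ion_in_r2)
```

For "unquestion" this gives R1 = "question" and R2 = "tion", which matches the
diagram, so "ion" does fall inside R2.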

Have the Step 4 rules been changed, or have the provided dictionary/stemmed
list (and the demo) not been updated for the Porter2 method? What should I do?

Thanks

Best regards,
Håvard Lindset