[Snowball-discuss] porter2 question
Reetz, Wendy
wreetz@greenapple.com
Fri Oct 4 14:19:02 2002
Martin,
Ah, I missed that one, thanks.
No, I hadn't looked for a PHP version on the Snowball site. I was
already done with the first algorithm before I went on the site; I
really only went there in search of a set of words and their appropriate
stems to use as test data.
I will look into it, though.
Thanks,
Wendy
-----Original Message-----
From: Martin Porter [mailto:martin_porter@softhome.net]
Sent: Friday, October 04, 2002 8:51 AM
To: Reetz, Wendy
Cc: snowball-discuss@lists.tartarus.org
Subject: RE: [Snowball-discuss] porter2 question
> ... no 'OU' stripping in the 4th
>step any longer. Was that intentional?
>
>Wendy
Yes. In the old stemmer -s is removed from -ous early on so -ou- is
removed later to compensate, but in the new stemmer -s is not removed
after -u- (cactus, ferrous, locus etc) so -ous survives as an ending
until step 4.
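The difference described above can be sketched roughly as follows. This is only an illustration, not the Snowball source: the function names are made up for the example, the new -s rule is reduced to just the "not after -u-" behaviour mentioned here, and the length conditions that the real step 4 applies before deleting a suffix are left out entirely.

```python
def old_step_1a(word):
    """Plural handling in the original Porter stemmer (step 1a)."""
    if word.endswith("sses"):
        return word[:-2]          # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]          # ponies -> poni
    if word.endswith("ss"):
        return word               # caress -> caress
    if word.endswith("s"):
        return word[:-1]          # cats -> cat, dangerous -> dangerou
    return word


def new_s_rule(word):
    """Simplified stand-in for the new -s rule: keep -s after -u- or -s-.

    Assumption: only the behaviour described in this thread is modelled;
    the other step 1a cases of the new stemmer are omitted.
    """
    if word.endswith(("us", "ss")):
        return word               # cactus, ferrous, locus keep their -s
    if word.endswith("s"):
        return word[:-1]
    return word


def strip_suffix(word, suffixes):
    """Remove the first matching suffix (longest listed first).

    Assumption: the real step 4 also checks how much stem is left;
    that check is skipped in this sketch.
    """
    for suffix in suffixes:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


def old_route(word):
    # -s goes early, so step 4 has to look for -ou as well as -ous.
    return strip_suffix(old_step_1a(word), ("ous", "ou"))


def new_route(word):
    # -ous is still intact at step 4, so -ou is no longer needed there.
    return strip_suffix(new_s_rule(word), ("ous",))


print(old_step_1a("dangerous"), old_route("dangerous"))  # dangerou danger
print(new_s_rule("dangerous"), new_route("dangerous"))   # dangerous danger
print(old_step_1a("cactus"), new_s_rule("cactus"))       # cactu cactus
```

Both routes reach the same stem for "dangerous", which is why the -ou rule can be dropped from step 4 once the early -s removal no longer applies after -u-.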
There has been some PHP work on the stemmers by "dark panda". You can
find the relevant correspondence if you put "php" in the front page
search box - but maybe you have found it already.
Martin