[Snowball-discuss] Personal pronoun "his" in Snowball EnglishStemmer
Martin Porter
martin.porter at grapeshot.co.uk
Sun May 22 14:13:00 BST 2005
Steve,
You should switch from the Porter to the Porter2 stemmer. If you look at the
page on this stemmer there are clear guidelines about extending the
exclusion list of words. You will also find that Porter2 does not remove the
final "s" from "his".
(Your "EnglishStemmer" should equate with Porter2, if the naming conventions
I set up are being followed.)
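
As a rough illustration of the two points above (a sketch only, not the Snowball source), the Python below covers just the plain "s" case of Porter2's step 1a: a final "s" is removed only if the preceding part of the word contains a vowel that is not immediately before the "s". The function names and the exclusion set are hypothetical, "y" is simply treated as a vowel, and the other suffixes handled by step 1a ("sses", "ies", "ied") are deliberately left out.

```python
VOWELS = set("aeiouy")  # simplification: "y" is always treated as a vowel here


def strip_final_s(word):
    """Plain 's' case of Porter2 step 1a only (hypothetical helper).

    Delete a final 's' if the part of the word before it contains a vowel
    that is not immediately before the 's'. Words ending in 'ss' or 'us'
    are left alone.
    """
    if not word.endswith("s") or word.endswith(("ss", "us")):
        return word
    # Everything before the 's', minus the letter immediately before it.
    head = word[:-2]
    if any(ch in VOWELS for ch in head):
        return word[:-1]
    return word


def stem_with_exclusions(word, stem, exclusions):
    """Leave any word in the exclusion set untouched, otherwise stem it."""
    if word in exclusions:
        return word
    return stem(word)


if __name__ == "__main__":
    for w in ("his", "this", "gas", "gaps", "kiwis", "palms"):
        print(w, "->", strip_final_s(w))
    # An exclusion set (contents chosen for illustration) overrides the rule.
    print(stem_with_exclusions("news", strip_final_s, {"news"}))
```

Under this rule "his", "this" and "gas" keep their "s", while "gaps", "kiwis" and "palms" lose it. Extending the exclusion set is the simplest way to protect further words without touching the rule itself.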
The way phrase retrieval combines with stemming obviously depends on the
underlying IR model that is being used, and I'm not quite sure what your
assumptions are here. In Xapian, for example, a phrase can generate a
structure of terms, each of which might be stemmed or unstemmed:
"his palms" --> PHRASE ---+--- 'hi'
                          |
                          +--- 'palm'
etc.
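
The same idea in a few lines of Python, purely as a sketch. This is not Xapian's API: the Term and Phrase classes, build_phrase, and the crude crude_stem stand-in (which just drops a final "s", and so reproduces the 'hi' above) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Term:
    text: str
    stemmed: bool  # records whether this term went through the stemmer


@dataclass
class Phrase:
    terms: List[Term]  # order matters: a phrase is an ordered run of terms


def crude_stem(word: str) -> str:
    """Stand-in stemmer: unconditionally drops a final 's'."""
    return word[:-1] if word.endswith("s") else word


def build_phrase(text: str, stem: Optional[Callable[[str], str]] = None) -> Phrase:
    """Turn a quoted phrase into a PHRASE node over its terms.

    If a stemmer is supplied, every term is stemmed; otherwise the terms
    are kept as written.
    """
    words = text.lower().split()
    if stem is None:
        return Phrase([Term(w, False) for w in words])
    return Phrase([Term(stem(w), True) for w in words])


if __name__ == "__main__":
    print(build_phrase("his palms", crude_stem))
    print(build_phrase("his palms"))
```

With the stand-in stemmer the phrase becomes the terms 'hi' and 'palm'; with no stemmer it stays 'his' and 'palms'. Which form should be matched depends on how the index itself was built.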
Martin