[Snowball-discuss] Questions about english stemmer & the apostrophe
Neal Richter
nealr@rightnow.com
Thu Feb 26 22:50:02 2004
Question:
I'm sure this has been discussed before, but a Google search of the
snowball-discuss archive turned up nothing.
Is there a rationale for the behavior below on words with an apostrophe?
bagpipe -> bagpip
bagpipe's -> bagpipe'
bagpipes -> bagpip
bakeries -> bakeri
bakeries' -> bakeries'
bakery -> bakeri
bakery's -> bakery'
bakerys -> bakeri //This isn't a word - but the form is OK sometimes.
I looked at several older versions of various (Porter-derived) English
stemmers; all of them have this behavior.
One could argue that when an apostrophe is used, an IR application would
want to preserve the original noun. Apostrophes denote possession by an
entity, so generalizing by stemming 'bakery's -> bakeri' would be
inappropriate.
On the other hand, since stemming is used to generalize word forms, you
could equally argue that the possessive form should be generalized as well.
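
To make the two options concrete, here is a minimal Python sketch. It is
only an illustration under stated assumptions: toy_stem is a hypothetical
stand-in that handles just the suffixes in the examples above (it is not
the Porter or Snowball algorithm), and the names strip_possessive,
stem_preserving and stem_generalizing are made up for this post.
"Preserving" is read here as: drop the possessive marker but leave the
noun itself unstemmed.

```python
from typing import Callable


def toy_stem(word: str) -> str:
    """Toy stand-in for a real stemmer (NOT Porter/Snowball).

    It only covers the suffixes seen in the examples in this post.
    """
    if word.endswith("ies"):
        word = word[:-3] + "i"
    elif word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    if word.endswith("y"):
        word = word[:-1] + "i"
    elif word.endswith("e"):
        word = word[:-1]
    return word


def strip_possessive(word: str) -> str:
    """Remove a trailing 's or ' if present (hypothetical helper)."""
    for suffix in ("'s", "'"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


def stem_preserving(word: str, stem: Callable[[str], str] = toy_stem) -> str:
    """Option 1: a possessive keeps its original noun, unstemmed."""
    if "'" in word:
        return strip_possessive(word)
    return stem(word)


def stem_generalizing(word: str, stem: Callable[[str], str] = toy_stem) -> str:
    """Option 2: drop the possessive marker, then stem as usual."""
    return stem(strip_possessive(word))


if __name__ == "__main__":
    for w in ["bagpipe's", "bakeries'", "bakery's", "bakery"]:
        print(w, toy_stem(w), stem_preserving(w), stem_generalizing(w))
```

Under option 1, "bakery's" and "bakery" index to different terms; under
option 2 they share one. Passing the stemmer in as a parameter means a
real stemmer could be swapped in for toy_stem without changing either
policy.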
Eh???
Thanks!
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485