[Snowball-discuss] Common strings ending in s that may not be ordinary words

Tolkin, Steve Steve.Tolkin@FMR.COM
Mon, 26 Nov 2001 08:34:54 -0500


Dear Martin,
I was not going to bring this up, but since you did:
there are many common short abbreviations, acronyms,
initialisms, etc. that end with "s".
For example (common in the U.S.) IRS, INS, EDS, etc.
These, and many others, will not be handled correctly
by the new approach either.  The first two above are
especially interesting because the first will conflate
with "ir" as in information retrieval, and the second
will become the probable stopword "in".
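
To make the point concrete, here is a rough sketch of the revised -s rule
as I read it from the message quoted below. It is not the Snowball source:
the function name, the lowercasing, the treatment of y as a vowel, and the
-ss check are my own assumptions, and every other Porter2 step is left out.

```python
# Hypothetical sketch of the revised -s removal rule only; this is not
# the actual Snowball/Porter2 implementation.
VOWELS = set("aeiouy")  # assumption: y is counted as a vowel


def strip_final_s(word: str) -> str:
    """Drop a final 's' if a vowel occurs before the letter preceding it."""
    w = word.lower()  # assumption: input is lowercased before stemming
    # -us is left alone (per the quoted message); -ss is assumed likewise.
    if not w.endswith("s") or w.endswith(("us", "ss")):
        return w
    # w[:-2] is everything before the letter that precedes the final s.
    if any(ch in VOWELS for ch in w[:-2]):
        return w[:-1]
    return w


if __name__ == "__main__":
    for term in ["gas", "this", "dogs", "kiwis", "cvs", "IRS", "INS", "EDS"]:
        print(term, "->", strip_final_s(term))
```

Under those assumptions 'gas', 'this' and 'cvs' come back unchanged and
'dogs' and 'kiwis' lose the s as intended, but IRS, INS and EDS come out
as "ir", "in" and "ed".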

 
Hopefully helpfully yours,
Steve
-- 
Steven Tolkin          steve.tolkin@fmr.com      617-563-0516 
Fidelity Investments   82 Devonshire St. V1D     Boston MA 02109
There is nothing so practical as a good theory.  Comments are by me, 
not Fidelity Investments, its subsidiaries or affiliates.

You said:

> Message: 8
> To: snowball-discuss@lists.sourceforge.net
> From: martin_porter@softhome.net (Martin Porter)
> Date: Thu, 22 Nov 2001 03:06:26 -0700
> Subject: [Snowball-discuss] Changes to Porter2
> 
> 
> I have made some changes to the porter2 algorithm.
> 
> The documentation errors noticed by Andrew Aksyonoff have been corrected.
> 
> -s removal has been changed. You now need a vowel somewhere before the
> letter before the s. So 'gas', 'this', 'has', 'was' keep the s, 'dogs',
> 'cats', 'woos', 'kiwis' lose the s. Usefully, the s is not removed from
> non-words like 'cvs', 'spss', 'lms' etc.
> 
> In general there is a problem identifying plurals of words ending Xs, where
> X is a vowel other than e. As you know, porter2 leaves -us alone but removes s
> after a,i,o. This works fairly well.
> 
> I have added a few more exceptions following suggestions from Steve Tolkin.
> 
...
> _______________________________________________
> Snowball-discuss mailing list
> Snowball-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/snowball-discuss
> 

