[Snowball-discuss] newbie does snowball remove stop words?

Andrew Davidson adavidson2 at apple.com
Tue Mar 24 17:04:04 GMT 2015


Hi Oerd

Thanks, that sounds easy enough for English. Do you know where I can find lists of stop words for other languages?

thanks

andy


> On Mar 24, 2015, at 7:22 AM, Oerd Cukalla <rosaccu at gmail.com> wrote:
> 
> Hi Andrew,
> 
> You may want to implement a lookup table populated with the stopwords, and stem an input word only if it is not in that table.
> 
> It should be quite easy to implement in Java, but let me know if you need assistance.
> 
> Have a nice day, 
>     Oerd
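
A minimal sketch of the lookup-table idea described above, assuming the goal is to drop stopwords entirely and stem everything else. The class name StopwordFilter and the method stemNonStopwords are hypothetical and not part of Snowball; the stemmer is passed in as a plain function so the sketch does not depend on any particular stemmer class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Snowball: drops stopwords, stems the rest.
class StopwordFilter {

    /**
     * Returns the stems of all tokens that are not stopwords.
     *
     * @param tokens    input words, already split from the text
     * @param stopwords lookup table of lower-case stopwords (e.g. a HashSet)
     * @param stemmer   any word-to-stem function (assumption: a thin wrapper
     *                  around whichever stemmer is in use)
     */
    static List<String> stemNonStopwords(List<String> tokens,
                                         Set<String> stopwords,
                                         UnaryOperator<String> stemmer) {
        List<String> stems = new ArrayList<>();
        for (String token : tokens) {
            // Lower-case first so the lookup matches the stopword table.
            String word = token.toLowerCase(Locale.ROOT);
            if (stopwords.contains(word)) {
                continue; // stopword: skipped, never reaches the stemmer
            }
            stems.add(stemmer.apply(word));
        }
        return stems;
    }
}
```

With a HashSet as the table, each lookup is a constant-time check on average, so the filter adds little cost on top of stemming. If stopwords should be kept but left unstemmed instead, the `continue` could be replaced by adding the word as-is.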
> 
> 
> On Tue, 17 Mar 2015 21:36 Andrew Davidson <adavidson2 at apple.com> wrote:
> Hi Olly
> 
> I imagine removing stop words is a fairly common requirement. Any idea how people implement stop word removal with Snowball?
> 
> The reason I originally thought Snowball provided stop word removal was the following link: http://snowball.tartarus.org/algorithms/english/stop.txt (from http://snowball.tartarus.org/algorithms/english/stemmer.html)
> 
> It seems to suggest there is some stop word support.
> 
> Thanks
> 
> Andy
> 
>> On Mar 16, 2015, at 10:20 PM, Olly Betts <olly at survex.com> wrote:
>> 
>> On Mon, Mar 16, 2015 at 06:31:09PM -0700, Andrew Davidson wrote:
>>> Today I downloaded the Java version of Snowball and compiled it. I ran
>>> a couple of little examples through it. It does not appear to remove
>>> stop words. Is this a bug?
>> 
>> That's not a bug - snowball is a stemmer, not a stopword remover.
>> 
>> Cheers,
>>    Olly
> 