[Snowball-discuss] newbie does snowball remove stop words?
Andrew Davidson
adavidson2 at apple.com
Tue Mar 24 17:04:04 GMT 2015
Hi Oerd,
Thanks. That sounds easy enough for English. Do you know where I can find lists of stop words for other languages?
Thanks,
Andy
> On Mar 24, 2015, at 7:22 AM, Oerd Cukalla <rosaccu at gmail.com> wrote:
>
> Hi Andrew,
>
> You may want to implement a lookup table populated with the stopwords, and only stem an input word if it is not in that table.
>
> It should be quite easy to implement in Java, but let me know if you need assistance.
>
> Have a nice day,
> Oerd
>
>
> On Tue, 17 Mar 2015 21:36 Andrew Davidson <adavidson2 at apple.com> wrote:
> Hi Olly
>
> I imagine removing stop words is a fairly common requirement. Any idea how people implement stop word removal with snowball?
>
> The reason I originally thought snowball provided stop word removal was because of the following link: http://snowball.tartarus.org/algorithms/english/stop.txt (from http://snowball.tartarus.org/algorithms/english/stemmer.html)
>
> It seems to suggest there is some stop word support.
>
> Thanks
>
> Andy
>
>> On Mar 16, 2015, at 10:20 PM, Olly Betts <olly at survex.com> wrote:
>>
>> On Mon, Mar 16, 2015 at 06:31:09PM -0700, Andrew Davidson wrote:
>>> Today I downloaded the Java version of snowball and compiled it. I ran
>>> a couple of little examples through it. It does not appear to remove
>>> stop words. Is this a bug?
>>
>> That's not a bug - snowball is a stemmer, not a stopword remover.
>>
>> Cheers,
>> Olly
>
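
A minimal sketch of the lookup-table approach Oerd describes above, in Java. The class name StopwordFilter and its methods are hypothetical and not part of Snowball. The stemmer is passed in as a plain function, so the sketch does not depend on any particular Snowball class. Stop words are dropped from the output rather than passed through unstemmed, which is one reading of the suggestion. parseStopList assumes the layout of the stop.txt file linked above (a "|" starts a comment and each stop word sits at the start of a line); lists for other languages may need different parsing.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical helper: drops stop words, then stems whatever is left. */
final class StopwordFilter {
    private final Set<String> stopwords;
    private final UnaryOperator<String> stemmer;

    /**
     * @param stopwords the lookup table of words to drop
     * @param stemmer   any String -> String stemming function (assumed to be
     *                  supplied by the caller, e.g. a wrapper around a stemmer)
     */
    StopwordFilter(Set<String> stopwords, UnaryOperator<String> stemmer) {
        // Store a lower-cased copy so lookups are case-insensitive.
        this.stopwords = new HashSet<>();
        for (String word : stopwords) {
            this.stopwords.add(word.toLowerCase(Locale.ROOT));
        }
        this.stemmer = stemmer;
    }

    /** Returns the stems of all tokens that are not in the stop word table. */
    List<String> process(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (stopwords.contains(lower)) {
                continue; // stop word: skip it, do not stem it
            }
            result.add(stemmer.apply(lower));
        }
        return result;
    }

    /**
     * Parses stop list lines, assuming '|' starts a comment and the stop
     * word is the first whitespace-delimited token on the line.
     */
    static Set<String> parseStopList(List<String> lines) {
        Set<String> words = new HashSet<>();
        for (String line : lines) {
            int bar = line.indexOf('|');
            String content = (bar >= 0 ? line.substring(0, bar) : line).trim();
            if (content.isEmpty()) {
                continue; // blank line or comment-only line
            }
            words.add(content.split("\\s+")[0].toLowerCase(Locale.ROOT));
        }
        return words;
    }
}
```

With the Java build of Snowball, the stemmer argument would presumably wrap the setCurrent(), stem() and getCurrent() calls on the generated stemmer object. A HashSet keeps each lookup at roughly constant time, so the filter adds little cost on top of stemming.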