[Snowball-discuss] Problems eliminating stop words

richard at lemurconsulting.com
Fri Sep 22 18:00:53 BST 2006


On Fri, Sep 22, 2006 at 06:21:00PM +0200, Alfredo Favenza wrote:
> I have a problem eliminating stop words when using the Italian stemmer
> (Java version).
> In the output text file I notice that the algorithm doesn't eliminate
> stop words like il, lo, la, gli and others.
> Can someone help me with this problem?

The stemmers do not perform stop word removal; they only stem the words
they are given.

We provide files of suggested stop words (see
http://snowball.tartarus.org/algorithms/italian/stop.txt for the Italian
one), but these are not integrated into the stemming algorithms.  It is up
to you to write code that removes stop words as a separate step.
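
As a rough illustration of that separate step, here is a minimal Java
sketch.  It is not part of the Snowball distribution: the class name
StopWordFilter and both method names are made up for this example.  It
assumes the stop word file lists words at the start of a line, with
anything after a '|' character being a comment, and that your text has
already been split into tokens.  The sample stop lines are inlined so the
example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Snowball.
final class StopWordFilter {

    // Builds a stop word set from lines laid out like stop.txt.
    // Assumption: '|' starts a comment that runs to the end of the line.
    static Set<String> parseStopWords(List<String> lines) {
        Set<String> stopWords = new HashSet<>();
        for (String line : lines) {
            int bar = line.indexOf('|');
            String content = (bar >= 0 ? line.substring(0, bar) : line).trim();
            if (content.isEmpty()) {
                continue; // blank line or comment-only line
            }
            for (String word : content.split("\\s+")) {
                stopWords.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return stopWords;
    }

    // Returns the tokens that are not stop words, in their original order.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Inline sample standing in for the contents of a stop word file.
        List<String> stopLines = Arrays.asList(
                " | sample stop list",
                "il             |  the",
                "lo",
                "la",
                "gli");
        Set<String> stopWords = parseStopWords(stopLines);

        List<String> tokens = Arrays.asList("il", "gatto", "mangia", "la", "mela");
        List<String> kept = removeStopWords(tokens, stopWords);

        // Each remaining token would then be passed to the stemmer.
        System.out.println(kept); // prints [gatto, mangia, mela]
    }
}
```

In a real program you would read the lines from a local copy of the stop
word file (for example with java.nio.file.Files.readAllLines, using
whatever encoding the file is in) instead of the inline list.  The filter
is applied before stemming here because the list contains unstemmed word
forms.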

-- 
Richard


