[Snowball-discuss] Spanish stop words

Julio Fraire julio.fraire at gmail.com
Fri Jun 6 17:48:52 BST 2008


Actually:

          | the following words are from the verb sentir (to feel) and not from the verb ser (to be)
line 294: sintiendo | correct word would be 'siendo'
line 295: sentido   | correct word would be 'sido' (same word for male/female/singular/plural)
line 296: sentida   | correct word would be 'sido' (same word for male/female/singular/plural)
line 297: sentidos  | correct word would be 'sido' (same word for male/female/singular/plural)
line 298: sentidas  | correct word would be 'sido' (same word for male/female/singular/plural)
line 299: siente    | correct word would be 'es'
line 300: sentid    | correct word would be 'sed' (the same word as in thirst)

Those entries are correct. They are forms of the verb "sentir", not of
"ser" as your correction assumes. The verb "ser" is handled in other parts
of the stop list.

The first two words you mention are indeed a mistake (vosostros and
vosostras).
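
In case it helps anyone patching a local copy, here is a minimal sketch of
fixing just those two entries. It assumes the "word | comment" layout seen
in the lines quoted in this thread; the function name fix_stop_list and the
CONFIRMED_FIXES table are hypothetical and not part of Snowball.

```python
# Hypothetical helper, not part of Snowball: corrects only the two typos
# confirmed in this thread and leaves every other entry untouched.
CONFIRMED_FIXES = {"vosostros": "vosotros", "vosostras": "vosotras"}


def fix_stop_list(text: str) -> str:
    """Return the stop list text with the two confirmed typos corrected.

    Assumption: everything after the first '|' on a line is a comment.
    """
    fixed_lines = []
    for line in text.splitlines():
        # Split off the comment so words inside comments are never changed.
        body, sep, comment = line.partition("|")
        words = body.split()
        if words and words[0] in CONFIRMED_FIXES:
            # Replace only the first occurrence, keeping the spacing after it.
            body = body.replace(words[0], CONFIRMED_FIXES[words[0]], 1)
        fixed_lines.append(body + sep + comment)
    return "\n".join(fixed_lines)
```

The "sentir" forms are deliberately not in the table, since those entries
are meant to stay as they are.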

Julio Fraire

On Fri, Jun 6, 2008 at 2:14 AM, Gonzalez, Francisco (C&I Spain) <
francisco.gonzalez-pascual at hp.com> wrote:

> Hello,
>
> I am fairly new to Snowball and to this list, so please excuse my
> ignorance in advance.
>
> I am planning to use the Spanish stemming algorithm and I would like to
> point out some words in the Spanish stop word list which I think may be
> wrong.
>
> line 123: vosostros | correct word would be 'vosotros' (you male plural)
> line 124: vosostras | correct word would be 'vosotras' (you female plural)
>           | the following words are from the verb sentir (to feel) and not from the verb ser (to be)
> line 294: sintiendo | correct word would be 'siendo'
> line 295: sentido   | correct word would be 'sido' (same word for male/female/singular/plural)
> line 296: sentida   | correct word would be 'sido' (same word for male/female/singular/plural)
> line 297: sentidos  | correct word would be 'sido' (same word for male/female/singular/plural)
> line 298: sentidas  | correct word would be 'sido' (same word for male/female/singular/plural)
> line 299: siente    | correct word would be 'es'
> line 300: sentid    | correct word would be 'sed' (the same word as in thirst)
>
> I fixed the words in my file, but I think it would be useful to everyone to
> fix the original file too.
>
> What would be the right procedure for doing that?
>
> Thank you.
>
>
> _______________________________________________
> Snowball-discuss mailing list
> Snowball-discuss at lists.tartarus.org
> http://lists.tartarus.org/mailman/listinfo/snowball-discuss
>