[Snowball-discuss] Removing été from French stop words

Olly Betts olly at survex.com
Thu Apr 16 22:20:43 BST 2020


On Thu, Apr 16, 2020 at 08:11:08AM -0400, Philippe Ouellet wrote:
> How does asciifolding fit into this? 

I'd think if someone wants to fold to ASCII before checking for
stopwords then they should expect to have to adjust the list to reflect
that.

> I would like “mais” to be a stop word, but “maïs” should not (it means
> corn). “mais” has no other meaning than “but”, and should be a stop
> word.
> 
> The current list includes “mais”; should we comment it out?

So whether "mais" should be enabled or commented out by default depends
on whether dropping the diaeresis from "maïs" is at all common in French
text.
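
To make the interaction concrete, here is a minimal sketch in Python. It
is not Snowball code: fold_to_ascii(), is_stopword() and the three-word
STOPWORDS set are made up for illustration, and the folding simply strips
combining marks after NFKD normalisation, which only approximates what a
real ASCII folding filter does (it leaves ligatures such as "œ" alone,
for example).

```python
import unicodedata


def fold_to_ascii(word: str) -> str:
    """Strip combining marks after NFKD decomposition.

    A rough stand-in for an ASCII folding filter, for illustration only.
    """
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical excerpt of a French stop word list (not the real list).
STOPWORDS = {"mais", "été", "et"}

# The same list adjusted for a pipeline that folds before the lookup.
FOLDED_STOPWORDS = {fold_to_ascii(w) for w in STOPWORDS}


def is_stopword(token: str, stopwords: set, fold: bool = False) -> bool:
    """Check a token against the list, optionally folding it first."""
    if fold:
        token = fold_to_ascii(token)
    return token in stopwords


if __name__ == "__main__":
    for token in ("mais", "maïs", "été"):
        print(
            token,
            is_stopword(token, STOPWORDS),
            is_stopword(token, STOPWORDS, fold=True),
            is_stopword(token, FOLDED_STOPWORDS, fold=True),
        )
```

Without folding, "maïs" survives and "mais" is dropped. Once tokens are
folded first, "maïs" becomes "mais" and is dropped too, while "été"
becomes "ete" and no longer matches the unfolded list, which is why the
list needs adjusting (and possibly trimming) in that setup.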

Cheers,
    Olly


