[Snowball-discuss] French stop words list
Jean-Christophe Deschamps
jcd at q-e-d.org
Sun Nov 1 23:37:35 GMT 2009
Hi,
[Sorry for typo in title and forgetting some entries]
While evaluating whether a stemmer is relevant for
storing place names, I found the list of French stop words available on
the Snowball website. [It turns out this kind of stemming is
unsuitable for place names, at least in my situation.]
It seems to me the proposed list has both omissions and errors.
I list my suggestions below. A plus (+) indicates an omission, while a
minus (-) denotes an entry that should be removed because it is incorrect.
"ayant" is invariable as a word, being a participle of a non-"status"
verb (so it cannot be used as a "verbal adjective"), except, as noted, in
the form ayants-droit. There might be other rare exceptions that
escape me right now.
"étant" is invariable as well.
Of course, when -ayant and -étant are the endings of nouns or of verbs
used as verbal adjectives, they take a final -e, -s or -es.
E.g.:
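Ce stade a une capacité de 43 000 places payantes.
[This stadium has a capacity of 43,000 paying seats.]
Il accueille donc 43 000 spectateurs payant cher leur place.
[It therefore hosts 43,000 spectators paying dearly for their seats.]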
+ceci | this
+celà | that
+cet | this
+cette | this
+es | in; en + le/la, excl. specialization of degree (es Lettres)
+ès | in; en + le/la, excl. specialization of degree (ès Lettres)
+ici | here
+ils | they
+là | there
+les | the (pl)
+lès | near (in name of village near a town)
+lez | near (in name of village near a town)
+leurs | their (pl)
+quel | which
+quels | which
+quelle | which
+quelles | which
+sans | without
+soi | oneself
+sous | under
-étante
-étants
-étantes
-ayante
-ayantes
ayants | excl. in ayants-droit: beneficiaries of rights
+eus
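
For anyone wanting to try these changes, here is a minimal Python sketch of how +/- entries like the ones above could be applied to an existing stop word set. The function name apply_suggestions and the sample "current" set are hypothetical and only for illustration; they are not the actual Snowball list or tooling. Lines without a leading + or - (such as the "ayants" note) are left alone.

```python
def apply_suggestions(stop_words, suggestions):
    """Return a new set with '+word' entries added and '-word' entries removed."""
    updated = set(stop_words)
    for line in suggestions.splitlines():
        # Keep only the word itself; drop any "| gloss" comment after it.
        entry = line.split("|", 1)[0].strip()
        if entry.startswith("+"):
            updated.add(entry[1:])
        elif entry.startswith("-"):
            updated.discard(entry[1:])
        # Entries without a marker are informational and are ignored.
    return updated


# Hypothetical excerpt of an existing list (assumption, for illustration only).
current = {"ayant", "ayante", "ayantes", "ayants",
           "étant", "étante", "étants", "étantes"}

# A few of the suggestions from this message, in the same notation.
suggestions = """\
+ceci | this
+sans | without
-étante
-étants
-étantes
-ayante
-ayantes
ayants | excl. in ayants-droit: beneficiaries of rights
"""

updated = apply_suggestions(current, suggestions)
print(sorted(updated))
```

The original set is copied rather than modified, so the old and new lists can be compared afterwards.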
I'm not being pedantic here, I simply thought it would be worth
mentioning.
Please copy any reply to this address, as I'm not subscribed.