[Snowball-discuss] French stemmer question
Martin Holmes
mholmes at uvic.ca
Tue Nov 10 03:32:30 GMT 2020
On 2020-11-09 6:28 p.m., Olly Betts wrote:
> On Mon, Nov 09, 2020 at 01:33:36PM -0800, Martin Holmes wrote:
>> Looking at the stemming algorithm description for French here:
>>
>> <https://snowballstem.org/algorithms/french/stemmer.html>
>>
>> I see this:
>>
>> "...if preceded by eus, delete if in R2, else replace by eux if in R1..."
>>
>> However, R2 is contained in R1, so the logic seems reversed to me.
>
> As you say R2 is within R1, so if the suffix isn't in R2 it can still be
> in R1.
>
> The R1/R2 example "fameusement" further up actually illustrates this
> case.
>
> The "-ement" suffix is in RV so removed. This is preceded by "-eus"
> which isn't in R2 but is in R1, so it's replaced by "eux" and the
> stem is "fameux".
>
> So this seems OK to me.
>
>> Shouldn't it be:
>>
>> ...if preceded by eus, replace by eux if in R1, else delete if in R2...
>
> If the suffix in question is not in R1 then it won't be in R2 either,
> so the "else ..." part in your version seems to be redundant.
>
> If there's some ambiguity in how this can be interpreted we can try to
> improve the phrasing.
Thanks Olly. I get it now, though I do find myself wishing the prose had
parentheses grouping all the if-then-elses. :-) What puzzled me was that
I read the rule as two sequential steps: first delete the "eus" if it's
in R2, then replace it instead if it also turned out to be in R1. Read as
"(delete if in R2), else (replace by eux if in R1)", it makes sense.
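
With the nesting spelled out, my reading of the rule is roughly the
following Python sketch. The names (region_start, apply_eus_rule) are
made up for illustration and are not from Snowball. It assumes "-ement"
is already known to be in RV, and it skips the stemmer's prelude that
marks some u/i/y as consonants, so it is not the real algorithm.

```python
# Illustrative sketch only: not the Snowball implementation.
VOWELS = set("aeiouyâàëéêèïîôûù")  # vowel list from the French stemmer description


def region_start(word, start=0):
    """Return the index just after the first non-vowel that follows a vowel,
    scanning from `start`; return len(word) if there is no such position."""
    for i in range(start + 1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)


def apply_eus_rule(word):
    """Handle only the '-ement preceded by eus' case, with explicit nesting."""
    if not word.endswith("eusement"):
        return word  # other cases are out of scope for this sketch
    r1 = region_start(word)        # R1 starts here
    r2 = region_start(word, r1)    # R2 starts here, always >= r1
    stem = word[: -len("ement")]   # assumption: '-ement' is in RV, so delete it
    eus_at = len(stem) - len("eus")
    if eus_at >= r2:               # (delete if in R2)
        return stem[:eus_at]
    elif eus_at >= r1:             # else (replace by eux if in R1)
        return stem[:eus_at] + "eux"
    return stem                    # 'eus' in neither region: leave it alone


print(apply_eus_rule("fameusement"))     # fameux
print(apply_eus_rule("courageusement"))  # courag
```

For "fameusement", R1 is "eusement" and R2 is "ement", so "eus" is in R1
but not R2 and the second branch fires. Testing R2 first matters: since
R2 lies inside R1, testing R1 first would mean the R2 branch never runs.
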
Cheers,
Martin
> Cheers,
> Olly
>
--
-------------------------------------
Humanities Computing and Media Centre
University of Victoria
mholmes at uvic.ca