[Snowball-discuss] do R1, R2 and RV need to be updated after deleting something?
alfonso.moscato at merqurio.it
Thu Aug 26 11:27:23 BST 2021
Hello to all.
I am implementing the stemming algorithm for Italian (https://snowballstem.org/algorithms/italian/stemmer.html), and I have a question.
I have a word, say “praticabilità”
R1 is “icabilità”
R2 is “abilità”
RV is “ticabilità”
(or at least I hope so 😊)
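The three regions above can be checked with a short sketch. It is only an illustration of how I read the region definitions on the algorithm page, not the reference implementation: the function names (`regions`, `_past_vowel_then_non_vowel`, `_rv_offset`) are made up for this example, the prelude (accent and `qu` handling) is skipped, and each region is returned as a start offset into the word rather than as a copied string.

```python
# Hypothetical helper names; regions are returned as start offsets, not substrings.
# Assumption: the Italian prelude step is skipped because it does not alter this word.
VOWELS = set("aeiouàèìòù")


def _past_vowel_then_non_vowel(word, start):
    """Offset just after the first non-vowel that follows a vowel, scanning from start."""
    n = len(word)
    i = start
    while i < n and word[i] not in VOWELS:   # advance to the first vowel
        i += 1
    while i < n and word[i] in VOWELS:       # advance to the non-vowel after it
        i += 1
    return min(i + 1, n)                     # end of word if nothing was found


def _rv_offset(word):
    """Start of RV, following the three cases described for Italian."""
    n = len(word)
    if n < 2:
        return n
    if word[1] not in VOWELS:
        # Second letter is a consonant: RV starts after the next vowel.
        i = 2
        while i < n and word[i] not in VOWELS:
            i += 1
        return min(i + 1, n)
    if word[0] in VOWELS:
        # First two letters are vowels: RV starts after the next consonant.
        i = 2
        while i < n and word[i] in VOWELS:
            i += 1
        return min(i + 1, n)
    # Consonant followed by vowel: RV starts after the third letter.
    return min(3, n)


def regions(word):
    """Return (pV, p1, p2): the start offsets of RV, R1 and R2."""
    p1 = _past_vowel_then_non_vowel(word, 0)
    p2 = _past_vowel_then_non_vowel(word, p1)
    return _rv_offset(word), p1, p2


word = "praticabilità"
pv, p1, p2 = regions(word)
print(word[pv:], word[p1:], word[p2:])  # ticabilità icabilità abilità
```

For this word the sketch agrees with the values listed above, so the regions themselves look right.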
In step 1 there is the rule:
ità
delete if in R2
if preceded by abil, ic or iv, delete if in R2
And in step 3 there is the rule:
Delete a final a, e, i, o, à, è, ì or ò if it is in RV, and a preceding i if it is in RV
In step 1 I delete “abilità” and the word becomes “pratic”
I leave RV untouched, so it is still “ticabilità”.
In step 3 I search for “à” in RV and find it as the last character.
So I conclude that I have to delete one character, and I wrongly delete “c” from “pratic”.
I wonder what the correct procedure is. Do I need to delete the matched text from R1, R2 and RV as well?
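One possible way around this, sketched below under assumptions, is to store each region as a start offset into the word instead of as a separate string. Suffix removal only shortens the word from the right, so the offsets stay valid and nothing has to be updated; every test is made against the current word. The offsets 3, 4 and 6 are the starts of RV, R1 and R2 given above for “praticabilità”; the function names are hypothetical, and only the two rules quoted in this message are covered.

```python
# Hypothetical sketch: regions are kept as offsets and the current word is always consulted.
FINAL_VOWELS = ("a", "e", "i", "o", "à", "è", "ì", "ò")


def in_region(word, suffix, region_start):
    """True if word ends with suffix and the suffix lies entirely inside the region."""
    return word.endswith(suffix) and len(word) - len(suffix) >= region_start


def step1_ita(word, p2):
    """The 'ità' rule from step 1: delete if in R2, then a preceding abil/ic/iv if in R2."""
    if in_region(word, "ità", p2):
        word = word[:-3]
        for extra in ("abil", "ic", "iv"):
            if in_region(word, extra, p2):
                word = word[:-len(extra)]
                break
    return word


def step3_final_vowel(word, pv):
    """The step 3 rule: delete a final vowel if in RV, then a preceding i if in RV."""
    if word and word[-1] in FINAL_VOWELS and len(word) - 1 >= pv:
        word = word[:-1]
        if in_region(word, "i", pv):
            word = word[:-1]
    return word


word = "praticabilità"
pv, p1, p2 = 3, 4, 6                  # start offsets of RV, R1, R2 for this word

word = step1_ita(word, p2)            # "ità" and then "abil" are both in R2
word = step3_final_vowel(word, pv)    # "pratic" ends in "c", so nothing is removed
print(word)  # pratic
```

With this representation RV for “pratic” is simply `word[3:]`, that is “tic”, so step 3 no longer sees the “à” that step 1 already removed.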
Thanks in advance for your help.
Alfonso
Alfonso Moscato