[Snowball-discuss] Norwegian stemmer question

Blake Madden madindayton at outlook.com
Sat Jul 26 14:50:27 BST 2025


Hello,

I was trying to understand the recent changes to the Norwegian stemmer, in particular step 1 for "ers".  The rule states:

(b) ers
find the longest suffix preceding ers, and perform the action indicated.

(i) amm   ast   ind   kap   kk   lt   nk   omm   pp   v   øst
do nothing
(ii) giv   hav   skap
delete ers suffix
Something I'm confused by is that "balders" gets stemmed to "bald", according to the output files. Why is the "ers" removed in this case? It isn't preceded by "giv", "hav", or "skap", so it shouldn't be deleted. And nothing in step 2 or 3 looks at "ers", so it shouldn't be getting removed there either.

Also, I'm confused by the instruction to "do nothing" for "amm   ast   ind   kap   kk   lt   nk   omm   pp   v   øst". Why are these explicitly mentioned? If "ers" isn't preceded by "giv", "hav", or "skap", then nothing should happen anyway, right? When "ers" is preceded by "bald", I would expect the rule not to delete anything.
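
To make my reading concrete, here is a minimal Python sketch of step 1(b) as I understand it from the description above. The function name step1b_as_documented and the two lists are my own transcription for illustration, not the actual Snowball implementation, and the sketch ignores any region (R1) checks.

```python
# Literal reading of the documented "ers" rule.
# Hypothetical helper for illustration only; not the real Snowball code.
NO_OP = ["amm", "ast", "ind", "kap", "kk", "lt", "nk", "omm", "pp", "v", "øst"]
DELETE = ["giv", "hav", "skap"]


def step1b_as_documented(word):
    """Apply the 'ers' rule exactly as the prose describes it."""
    if not word.endswith("ers"):
        return word
    stem = word[: -len("ers")]
    # Collect every listed string that immediately precedes "ers".
    candidates = [s for s in NO_OP + DELETE if stem.endswith(s)]
    if not candidates:
        # Nothing listed precedes "ers": under this reading, no change.
        return word
    longest = max(candidates, key=len)
    return stem if longest in DELETE else word


print(step1b_as_documented("balders"))  # unchanged under this reading
print(step1b_as_documented("givers"))   # "ers" removed
```

Under this reading "balders" comes back unchanged, which is why the "bald" in the output files surprises me.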

I tried looking at the Snowball code and I noticed this:

  'giv' 'hav' 'skap' ''
    (delete)

There is an empty string '' included in the list of values preceding the suffix that would trigger a delete. What does that imply? It's not explained in the docs.
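
One possible reading, sketched below in Python, is that '' acts as a catch-all that matches whenever no longer entry does. That is an assumption on my part, not something I found stated for this rule. The name step1b_with_empty_default is hypothetical, the test words are only illustrative inputs, and region (R1) checks are ignored.

```python
# Sketch of the "ers" rule assuming '' is a fallback that always matches.
# Hypothetical helper for illustration only; not the real Snowball code.
NO_OP = ["amm", "ast", "ind", "kap", "kk", "lt", "nk", "omm", "pp", "v", "øst"]
DELETE = ["giv", "hav", "skap", ""]  # "" is a suffix of every string


def step1b_with_empty_default(word):
    """Pick the longest matching entry; '' wins only if nothing else matches."""
    if not word.endswith("ers"):
        return word
    stem = word[: -len("ers")]
    # "" always matches, so this list is never empty.
    matches = [s for s in NO_OP + DELETE if stem.endswith(s)]
    longest = max(matches, key=len)
    return stem if longest in DELETE else word


print(step1b_with_empty_default("balders"))  # bald
print(step1b_with_empty_default("kokkers"))  # kokkers ("kk" blocks the delete)
```

If that assumption holds, it would explain both the "bald" output and why the "do nothing" entries have to be listed explicitly: they would be exceptions to a default delete rather than redundant cases.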

Thank you for any clarification,
Blake