[Snowball-discuss] The Norwegian stemmer algorithm

Ask Solem Hoel ask@gan.no
Tue, 27 Nov 2001 14:19:30 +0100


Hello there (o:

I'm porting the Scandinavian stemmer algorithm
to Perl. You can fetch it from:

http://www.unixmonks.net/~ask/Stemmer-Norwegian-0.3.tar.gz

There is one thing I can't understand, though.
In the description of the algorithm you say:

> R2 is not used: R1 is defined in the same way as in the German
> stemmer.

And on the German page, it says:

> R1 and R2 are first set up in the standard way (see 3.1), but then R1
> is adjusted so that the region before it contains at least 3 letters.

Where is "3.1" ? :-)

If you unpack that tarball and run it against diffs.txt:
% perl stemmer.pl diffs.txt | wc -l
you'll see that 120 out of 20628 differ.

Why???

I'd guess this has something
to do with the Snowball thingie:

> $p1 = limit
> goto v gopast non-v setmark p1
> try ($p1 < 3 $p1 = 3)

What does this do?
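
My tentative reading of those three lines, as a minimal Python
sketch. The function name mark_r1 is my own, and the vowel set is
taken from the published Norwegian stemmer description rather than
from the snippet, so treat both as assumptions:

```python
# Norwegian vowels (assumption: as listed in the stemmer description)
VOWELS = set("aeiouyæåø")


def mark_r1(word):
    """Return the index where R1 starts (hypothetical helper)."""
    limit = len(word)
    p1 = limit                      # $p1 = limit

    i = 0
    # goto v: move forward until the cursor sits just before a vowel
    while i < limit and word[i] not in VOWELS:
        i += 1
    if i == limit:
        return p1                   # no vowel found: p1 stays at limit

    # gopast non-v: move forward until just past the next non-vowel
    while i < limit and word[i] in VOWELS:
        i += 1
    if i == limit:
        return p1                   # no non-vowel after the vowel

    p1 = i + 1                      # setmark p1

    # try ($p1 < 3 $p1 = 3): make sure at least 3 letters precede R1
    if p1 < 3:
        p1 = 3
    return p1
```

If this reading is right, mark_r1("opp") gives 3 rather than 2,
because the last line pushes p1 forward, and R1 itself would be
word[p1:].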

Thanks:)

-- 
/ Ask Solem Hoel        | GAN Media             \
: +47 48054613          | +47 22707439          :
\ www.unixmonks.net     | www.gan.no/media      /

_______________________________________________
Snowball-discuss mailing list
Snowball-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/snowball-discuss
