[Snowball-discuss] The Norwegian stemmer algorithm
Ask Solem Hoel
ask@gan.no
Tue, 27 Nov 2001 14:19:30 +0100
Hello there (o:
I'm porting the Scandinavian stemmer algorithm
to Perl. You can fetch it from:
http://www.unixmonks.net/~ask/Stemmer-Norwegian-0.3.tar.gz
There is one thing I can't understand, though.
In the description of the algorithm you say:
> R2 is not used: R1 is defined in the same way as in the German
> stemmer.
And on the German page, it says:
> R1 and R2 are first set up in the standard way (see 3.1), but then R1
> is adjusted so that the region before it contains at least 3 letters.
Where is "3.1"? :-)
If you unpack that tarball and run it against diffs.txt:
% perl stemmer.pl diffs.txt | wc -l
you'll see that 120 out of 20628 differ.
Why???
I'd guess this has something
to do with this bit of Snowball:
> $p1 = limit
> goto v gopast non-v setmark p1
> try ($p1 < 3 $p1 = 3)
What does this do?
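For reference, here is one possible reading of those three lines, sketched
in Python. It is only a sketch under stated assumptions: the vowel set
(a e i o u y æ å ø), the function name mark_r1, and the use of character
offsets are illustrative choices, not taken from the Snowball sources. It
follows the usual definition of R1 as the region after the first non-vowel
that follows a vowel, then applies the "at least 3 letters before R1"
adjustment from the German page.

```python
# Assumed vowel set for the Norwegian stemmer: a e i o u y æ å ø.
VOWELS = set("aeiouyæåø")


def mark_r1(word: str) -> int:
    """Return p1, the offset where R1 starts, following the quoted snippet.

    mark_r1 is a hypothetical helper name used only for this sketch.
    """
    limit = len(word)
    p1 = limit  # $p1 = limit: R1 is empty unless the scan below succeeds

    # goto v: move the cursor forward until it sits just before a vowel.
    cursor = 0
    while cursor < limit and word[cursor] not in VOWELS:
        cursor += 1
    if cursor == limit:
        return p1  # no vowel found, so p1 stays at limit

    # gopast non-v: move forward to the next non-vowel, then step past it.
    while cursor < limit and word[cursor] in VOWELS:
        cursor += 1
    if cursor == limit:
        return p1  # no non-vowel after the vowel, so p1 stays at limit
    cursor += 1

    p1 = cursor  # setmark p1

    # try ($p1 < 3 $p1 = 3): keep at least 3 letters before R1.
    if p1 < 3:
        p1 = 3
    return p1
```

Under this reading, mark_r1("skrive") gives 5, so R1 is "e", while for
"opp" the scan stops at offset 2 and the last step raises it to 3, leaving
R1 empty.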
Thanks:)
--
/ Ask Solem Hoel | GAN Media \
: +47 48054613 | +47 22707439 :
\ www.unixmonks.net | www.gan.no/media /
_______________________________________________
Snowball-discuss mailing list
Snowball-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/snowball-discuss