[Snowball-discuss] The Norwegian stemmer algorithm

Oleg Bartunov oleg@sai.msu.su
Tue, 27 Nov 2001 22:10:28 +0300 (GMT)


I think we need to agree on a namespace for the Perl
interfaces, since there will be a lot of them.
Lingua::Stemmer::Snowball::Norwegian would be OK.

Also, it would probably be better to have a unified Perl interface
to Snowball's stemmers.

	Regards,

		Oleg
On Tue, 27 Nov 2001, Ask Solem Hoel wrote:

> Hello there (o:
>
> I'm making a port of the Scandinavian stemmer algorithm
> to Perl. You can fetch it from:
>
> http://www.unixmonks.net/~ask/Stemmer-Norwegian-0.3.tar.gz
>
> There is one thing I can't understand, though.
> In the description of the algorithm you say:
>
> > R2 is not used: R1 is defined in the same way as in the German
> > stemmer.
>
> And on the German page, it says:
>
> > R1 and R2 are first set up in the standard way (see 3.1), but then R1
> > is adjusted so that the region before it contains at least 3 letters.
>
> Where is "3.1" ? :-)
>
> If you unpack that tarball and run it against diffs.txt:
> % perl stemmer.pl diffs.txt | wc -l
> you'll see that 120 out of 20628 differ.
>
> Why???
>
> I'd guess this has something
> to do with the Snowball thingie:
>
> > $p1 = limit
> > goto v gopast non-v setmark p1
> > try ($p1 < 3 $p1 = 3)
>
> What does this do?
>
> Thanks:)
>
>
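
On the Snowball fragment quoted above: as far as I can tell, it
marks the start of R1. p1 starts at the end of the word; the cursor
then moves to the first vowel and past the next non-vowel, and p1 is
set there; finally p1 is raised to 3 if it is smaller, which matches
the "at least 3 letters" adjustment quoted from the German page. A
minimal sketch in Python (not the Perl port itself) follows. The
vowel set, the name mark_r1 and the clamping to the word length are
assumptions made for illustration only.

```python
# Assumed Norwegian vowel set; check it against the stemmer's own definition.
VOWELS = set("aeiouyæåø")

def mark_r1(word):
    """Return the index where R1 starts, following the quoted fragment."""
    limit = len(word)
    p1 = limit                       # $p1 = limit
    i = 0
    # goto v: move the cursor to the first vowel
    while i < limit and word[i] not in VOWELS:
        i += 1
    # gopast non-v: move the cursor past the next non-vowel
    while i < limit and word[i] in VOWELS:
        i += 1
    if i >= limit:
        return p1                    # no vowel followed by a non-vowel: R1 is empty
    p1 = i + 1                       # setmark p1
    # try ($p1 < 3 $p1 = 3), clamped to the word length (assumption of this sketch)
    return min(max(p1, 3), limit)

if __name__ == "__main__":
    for w in ("bilene", "eple", "strand"):
        print(w, mark_r1(w), repr(w[mark_r1(w):]))
```

With this sketch, "eple" gets p1 = 3 rather than 2, so R1 is "e"
instead of "le". A port that skips the adjustment would treat such
words differently, which may be one source of the mismatches.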

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


