[Snowball-discuss] Slovenian stemmer

Martin Porter martin at porterloo.wanadoo.co.uk
Mon Jan 26 09:22:14 GMT 2009


Bostjan,

I'm sorry not to have replied to your email sooner. I have been very busy
this past week.

You can see where the stemmer you attached comes from if you go to

http://news.gmane.org/gmane.comp.search.snowball

and type "slovene" in the search box at the bottom. This gives the email
history. The stemmer was done by Bostjan Jerko, and I wanted various points
clarified before adding it in as one of the snowball set, but I eventually
lost touch with Bostjan and the issues were not finally resolved. (I see you
are also "Bostjan", which adds to the confusion.)

Something I wanted resolved was how the stemmer differed from the one
described by Willett and Popovic in:

Popovic M and Willett P (1990) Processing of documents and queries in a
Slovene language free text retrieval system. Literary and Linguistic
Computing, 5: 182-190. 

(I urge you to look at the Willett/Popovic work.)

My changes to Jerko's stemmer did not, I recall, change its functionality,
so the version you have could be used as a test.

It seems to me you have three choices:

1) Get to know snowball and try it with the Jerko stemmer (I don't have time
to help you more with this at present),

2) Code up the Jerko stemmer in a language more familiar to you -- not too
easy since there is no functional description of his method,

3) Code up the Willett/Popovic stemmer in a language familiar to you.

I hope this helps, and I'm sorry I can't provide you with a ready-made solution,

Martin

At 21:14 22/01/2009 +0100, Bostjan Povh wrote:
>I've been doing some more research in the archive of snowball-discuss and,
>not surprisingly, I found out that the file I have actually comes from your
>work in March 2005.
>
>So to change my question:
>Can you please just tell me what the current status of the stemmer is and
>whether I can do something to finish it?
>I can provide a complete vocabulary if necessary.
>
>I am aware that you probably don't have any time, so if what I'm asking
>takes too much time, is it possible for me to get a Java implementation of
>the stemmer although it is not perfect at the moment?
>
>
>
>Thanks and regards,
> Bostjan
>




