[Snowball-discuss] Croatian stemmer help

Martin Porter martin at porterloo.wanadoo.co.uk
Fri Aug 1 12:10:14 BST 2008


Tomislav

>Basically I wanna know easiest way for replacing strings in Snowball based
>on regular expression syntax like this one:
>$w = preg_replace("/[^aeiou]eta$/", "$1e", $w);

It would be along the lines of:

... ['eta' non-vowel] <- 'something'

which replaces 'eta' and the preceding non-vowel with 'something'. (I'm sorry
to be vague: I've mislaid my Perl book and don't know PHP.)
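
For comparison, here is a minimal Python sketch of what such a rule is meant to do. It is not Snowball and not taken from any existing stemmer. The function names and the vowel set 'aeiou' are assumptions for illustration. The first function mirrors the slice replacement described above, checking the word from its end. The second assumes the quoted PHP pattern was meant to capture the consonant and keep it (the pattern as quoted has no capturing group for "$1" to refer to).

```python
import re

VOWELS = "aeiou"  # assumption: same vowel set as the quoted regex


def replace_eta(word, replacement="something"):
    """Replace a final 'eta' plus the non-vowel before it (hypothetical helper)."""
    # Check the suffix first, then the character just before it,
    # in the same right-to-left order as the rule above.
    if len(word) > 3 and word.endswith("eta") and word[-4] not in VOWELS:
        return word[:-4] + replacement
    return word


def replace_eta_keep_consonant(word):
    """Variant that keeps the consonant, assuming a capture group was intended."""
    return re.sub(r"([^aeiou])eta$", r"\1e", word)


if __name__ == "__main__":
    for w in ("teleta", "aeta", "eta"):
        print(w, replace_eta(w), replace_eta_keep_consonant(w))
```

With the input "teleta", the first function gives "tesomething" and the second gives "tele". Words where a vowel precedes 'eta', or that consist only of 'eta', are left unchanged by both.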

Snowball is unlike Perl, and you can't really translate expression by
expression. What I suggest is that you look at one of the existing stemmer
definitions and see how it's coded up in Snowball. You'll soon understand.

Incidentally, if your PHP works well, there may be no advantage in
translating into Snowball. If you wanted to submit the PHP stemmer to the
snowball site, we'd be happy to put it up at

http://snowball.tartarus.org/otherlangs/index.html

(assuming BSD licensing).

Martin 