[Snowball-discuss] Custom stemmer missing from modules_utf8.h

Craig Rairdin craigr at laridian.com
Thu Dec 1 21:34:34 GMT 2011


I am trying to create a modified version of the English stemmer that accepts
UTF-8-encoded strings and returns UTF-8-encoded strings.

I created a custom version of the English stemmer. I added it to
libstemmer/modules.txt and libstemmer/modules_utf8.txt and specified
UTF_8,ISO_8859_1 for the character sets. I also added it to
'libstemmer_algorithms' and 'ISO_8859_1_algorithms' in GNUmakefile.
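
For illustration only, an entry of this kind in libstemmer/modules.txt has
roughly the shape below: algorithm name, comma-separated encodings, then
comma-separated aliases. The name english_custom and its alias are
hypothetical placeholders, not the actual name used here.

```
# algorithm name   encodings            aliases
english_custom     UTF_8,ISO_8859_1     english_custom
```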

When I run make, my new stemmer shows up in libstemmer/modules.h but not in
libstemmer/modules_utf8.h. I also do not see mkinc_utf8.mak, which I thought
was built by modules.pl along with mkinc.mak (which I do see).

Perhaps I'm doing something wrong. I assume that I need to include
modules_utf8.h and libstemmer_utf8.c in my project instead of modules.h and
libstemmer.c, respectively. Because my new stemmer is missing from
modules_utf8.h, sb_stemmer_new() returns NULL when I try to create the
stemmer.

Thanks for your help.

Craig

