[Snowball-discuss] More patches

Richard Boulton richard at lemurconsulting.com
Mon Feb 12 13:31:25 GMT 2007


Olly Betts wrote:
> I'm currently updating Xapian to use UTF-8 stemmers generated by the
> latest version of snowball.  I've patched the snowball compiler to
> generate the stemmers as C++ classes, and I'm embedding the patched
> compiler in the Xapian build system, so Xapian users can easily drop
> in new stemmers.

I'd be interested in adding a "C++" output mode to snowball, so patches 
to do this would probably be accepted.

Ideally, I'd like to make a C++ version of the libstemmer library, and 
maintain it in Snowball rather than Xapian.  In particular, it would 
seem useful to me for developers to be able to link against a 
system-wide snowball dynamic library, rather than the specific version 
compiled into Xapian.  However, that discussion possibly belongs on the 
Xapian mailing lists rather than here, and for now whatever works is 
fine by me. :)

> This improves the shortcutting of backwards among - if there are fewer
> characters available than the shortest string in the among, there's
> no way it can match.  It also includes a cosmetic tweak (avoiding
> generating "z->c - 0" in the output) which makes the generated source
> a little more readable (of course the C compiler will optimise the "- 0"
> away anyway):
> 
> http://oligarchy.co.uk/xapian/patches/snowball-min-length-shortcut-backwards-among.patch

I've applied this patch too - I believe I've now applied all the 
patches you've sent so far!
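
For anyone reading this in the archive, here is a rough sketch of the idea behind the shortcut. It is not the code from the patch or the output of the snowball compiler: the struct "env", the function "among_backward" and the parameter "min_len" are hypothetical names made up for illustration, and the real runtime is organised differently. It only shows why a backwards among cannot match when fewer characters are available than its shortest string.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for the stemmer state.
 * This is NOT the real snowball runtime structure. */
struct env {
    const char *p;  /* text being stemmed */
    int c;          /* cursor; a backwards match ends here */
    int lb;         /* backward limit; nothing before this may be read */
};

/* Return the index of the longest string in `strings` that ends at the
 * cursor, or -1 if none matches. `min_len` is the length of the shortest
 * string in the among, assumed to be known when the code is generated. */
int among_backward(const struct env *z, const char *const *strings,
                   int n, int min_len)
{
    int avail = z->c - z->lb;  /* characters available behind the cursor */
    int best = -1;
    int best_len = 0;
    int i;

    /* The shortcut: with fewer characters available than the shortest
     * string, no entry can match, so skip the search entirely. */
    if (avail < min_len) return -1;

    for (i = 0; i < n; i++) {
        int len = (int)strlen(strings[i]);
        if (len <= avail && len > best_len &&
            memcmp(z->p + z->c - len, strings[i], (size_t)len) == 0) {
            best = i;
            best_len = len;
        }
    }
    return best;
}
```

With the strings "ing", "ed" and "ly", min_len is 2, so a cursor with only one character behind it returns straight away without comparing anything.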

-- 
Richard
