[Snowball-discuss] Unicode and python bindings

Patrick Mézard pmezard at gmail.com
Fri May 19 12:06:23 BST 2006


Andreas Jung wrote:
> TextIndexNG3 for Zope  (sf.net/projects/textindexng) comes with its 
> own Python bindings against the latest Snowball code base...and the 
> complete implementation is based on unicode and in use since ages...
Thank you for pointing me back to TextIndexNG3. I gave it a try some 
time ago, but not hard enough it seems. It compiles perfectly under 
Windows and looking at the code there is no doubt you handle the whole 
thing much better than I do. I just wonder why you decided to convert 
all the input UTF-16 python strings to UTF-8 instead of using them 
directly with Snowball. Anyway, both versions work the same (for my 
tests at least) and yours is definitely better.

As Martin said, it could be useful to add something to the project 
description on the Snowball page. But I do not know how to present this: the 
bindings I was looking for and finally rewrote were to be used as a 
replacement for the ones provided with Xapian. The fact is yours can 
almost be used as a standalone module (provided people know how to 
build them from the separate setup.py, which is really easy) and are 
based on the latest Snowball release. Maybe this can be added to the 
Snowball project page, or even to the wrapper one.

--
Patrick Mézard



