[Snowball-discuss] Unicode and python bindings

Andreas Jung lists at andreas-jung.com
Tue May 16 20:50:49 BST 2006


TextIndexNG3 for Zope (sf.net/projects/textindexng) comes with its own
Python bindings against the latest Snowball code base. The complete
implementation is based on Unicode and has been in use for ages.

-aj

--On 16 May 2006 14:39:05 +0200 Patrick Mézard <pmezard at gmail.com> wrote:

> Hello,
>
> Trying to solve the issues I raised in a previous post
> (<http://thread.gmane.org/gmane.comp.search.snowball/772/focus=772>), I
> finally rewrote parts of Weongyo Jeong's original Python bindings to fit
> my needs. The main change is that the module interface now consumes
> Python Unicode strings (UTF-16) instead of native strings. The idea is
> that code dealing with multiple languages usually first decodes the
> documents into Unicode before passing them to other modules, including
> stemming. With the original bindings, since I failed to use the UTF-8
> interface, I had to convert back from Unicode to specific encodings,
> which was at best a pain and at worst impossible.
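
A minimal sketch of the "decode to Unicode first, then stem" workflow
described above, written for current Python. Everything named here is an
assumption made for illustration: `toy_stem` is a hypothetical stand-in
that only strips a trailing "s", and it is not the pysnowball or
TextIndexNG3 API. A real binding would call Snowball at that point.

```python
# Sketch only: toy_stem is a hypothetical placeholder, not a Snowball binding.

def to_unicode(raw, encoding):
    """Decode one document from its source encoding into a Unicode string."""
    return raw.decode(encoding)


def toy_stem(word):
    """Placeholder stemmer: drop a trailing 's' from words longer than 3 chars."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def stem_document(raw, encoding):
    """Decode first, then stem. The stemmer only ever sees Unicode strings."""
    text = to_unicode(raw, encoding)
    return [toy_stem(word) for word in text.lower().split()]


# The same text stored in two different encodings.
docs = [
    ("développements rapides".encode("latin-1"), "latin-1"),
    ("développements rapides".encode("utf-8"), "utf-8"),
]

# After decoding, both documents go through the identical code path.
results = [stem_document(raw, enc) for raw, enc in docs]
```

Because decoding happens once at the boundary, the stemming step does not
need to know which encoding each document originally used.
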
>
> The new version is temporarily available here:
> <http://perso.wanadoo.fr/patrick.mezard/dev/pysnowball-0.0.2.zip> and I
> can provide a copy of the darcs (<http://abridgegame.org/darcs/>)
> repository I used to rewrite my branch.
>
> I think it still needs to be reviewed before any release (I am far from
> being a Python C extension expert), even though it passes the few tests
> I could think of.
>
> What's your opinion about this?
>
> --
> Patrick Mézard