[Snowball-discuss] Python 3 bindings

Richard Boulton richard at tartarus.org
Tue Aug 9 16:58:17 BST 2011


On 8 August 2011 16:40, Peter Bouda <pbouda at cidles.eu> wrote:
> http://www.dasskript.com/patches/pystemmer_python3.patch

Great - thanks for the clean patch.

> 1) I use Cython instead of Pyrex in setup.py.

That seems reasonable from what I know, though I've not been following
the two projects closely recently.

> 2) The "algorithms()" now returns Unicode strings under Python 2. I don't
> know how to solve this really; I think either it's ASCII under both Python 2
> and 3 (i.e. "english" resp. b"english" and so on) or Unicode (u"english"
> resp. "english"). The first solution is bad for Python 3, the latter breaks
> compatibility under Python 2. Maybe you have an idea about this.

I don't think the compatibility break with Python 2 is a big problem
(in particular, u"english" compares equal to "english" under Python 2,
which will avoid many problems).  I'm bumping the version number to
'1.2.0' instead of to '1.1.1' as you did, though, to reflect this
incompatibility.
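
To illustrate the comparison point, here is a minimal sketch that runs
under both Python 2 and Python 3. It does not call PyStemmer: the list
`names` is only an assumed stand-in for what algorithms() returns after
the patch.

```python
import sys

# Stand-in for the list returned by algorithms() after the patch;
# the real call is not made here, so these values are assumptions.
names = [u"english", u"french"]

PY2 = sys.version_info[0] == 2

# A native "english" literal is a byte string on Python 2 and a text
# string on Python 3; it compares equal to the Unicode name on both.
unicode_matches = (u"english" == "english")
found_in_list = ("english" in names)

# A bytes literal equals the native literal only on Python 2, which is
# why returning byte strings would be awkward for Python 3 callers.
bytes_matches = (b"english" == "english")

print(unicode_matches, found_in_list, bytes_matches)
```

Existing Python 2 code that tests membership or equality against plain
string literals should therefore keep working with Unicode names.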

> The
> algorithm's string in "__init__" is less critical, as it may only encode to
> "ascii" a second time under Python 2 (if the user passes an ASCII-encoded
> string), but that's not a problem, I think.

I think that's fine, too.
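
For comparison, a hedged sketch of how a name could be normalised to
ASCII bytes without the second encode. This is not the code from the
patch; `to_ascii_bytes` is a hypothetical helper used only for
illustration.

```python
def to_ascii_bytes(name):
    """Hypothetical helper: return an algorithm name as ASCII bytes."""
    if isinstance(name, bytes):
        # Already a byte string (a plain str on Python 2): check that it
        # is pure ASCII, then return it unchanged instead of re-encoding.
        name.decode("ascii")
        return name
    # Text string (unicode on Python 2, str on Python 3): encode once.
    return name.encode("ascii")


print(to_ascii_bytes(u"english"))
print(to_ascii_bytes(b"english"))
```

A name containing non-ASCII characters raises a UnicodeError subclass in
either branch, so bad input fails early.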

> I am happy to license this under the PyStemmer license.

Great - I've applied the patch, and will release the new version to
the Python Package Index shortly.

-- 
Richard


