[Snowball-discuss] Pystemmer
Uwe Schmitt
uschmitt at mineway.de
Thu Sep 18 15:28:07 BST 2008
Hi,
I upgraded the Python wrapper today and want to report on
it:
1) I changed setup.py to use Cython instead of Pyrex.
Cython replaces Pyrex and gives a 40% speed improvement
according to the included benchmark.py script.
2) I took the libstemmer_c from
http://snowball.tartarus.org/dist/libstemmer_c.tgz
(I tried libstemmer_c from the svn trunk, but ran into
some problems)
3) To get setup.py to work, I had to remove
one line from PyStemmer's MANIFEST file.
Otherwise I got duplicate symbols during linking.
As a consequence, the Python module only supports
UTF-8 encoding.
I prefer this to using different ISO-8859-x encodings
depending on the language being stemmed.
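To illustrate the encoding point in 3): the sketch below uses only the
Python standard library and makes no PyStemmer calls. The helper name
to_utf8 is made up for this example. It shows how a caller holding text
in a language-specific ISO-8859 encoding could convert it to UTF-8
before passing it to a UTF-8-only stemmer.

```python
# Illustration only: standard library, no PyStemmer API is used here.
# `to_utf8` is a hypothetical helper name, not part of any library.

def to_utf8(word, source_encoding=None):
    """Return `word` as UTF-8 bytes.

    `word` may be a str, or bytes encoded in `source_encoding`.
    """
    if isinstance(word, bytes):
        if source_encoding is None:
            raise ValueError("source_encoding is required for bytes input")
        # Decode from the legacy encoding into a Unicode string first.
        word = word.decode(source_encoding)
    return word.encode("utf-8")


# German text is typically ISO-8859-1, Russian text ISO-8859-5.
german_latin1 = "Grüße".encode("iso-8859-1")
russian_cyrillic = "книга".encode("iso-8859-5")

# With a single UTF-8 interface, both end up in the same encoding.
german_utf8 = to_utf8(german_latin1, "iso-8859-1")
russian_utf8 = to_utf8(russian_cyrillic, "iso-8859-5")
```

With one encoding for all languages, the caller does this conversion
once at the boundary and does not need to track which ISO-8859 variant
belongs to which stemmer.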
I could provide the result as a tarball, if anybody
is interested in it.
Greetings, Uwe
--
Dr. rer. nat. Uwe Schmitt
F&E Mathematik
mineway GmbH
Science Park 2
D-66123 Saarbrücken
Telefon: +49 (0)681 8390 5334
Telefax: +49 (0)681 830 4376
uschmitt at mineway.de
www.mineway.de
Geschäftsführung: Dr.-Ing. Mathias Bauer
Amtsgericht Saarbrücken HRB 12339