[Snowball-discuss] Hungarian characters in hungarian/stop.txt
Martin Porter
martin.f.porter at gmail.com
Wed Jun 11 07:12:21 BST 2014
The general "snowball" policy has been to host contributed stemmers,
but not to distribute them as part of a general snowball release. This
is because of our inability to evaluate and support stemmers in
languages with which we have no familiarity.
Changing the character encodings for Hungarian on the snowball site
ought to involve Anna Tordai, the author of the algorithm.
Of course, although Unicode is pretty standard nowadays, snowball is
"code independent", and Unicode support didn't come in until 2005.
Failure of the older contributions to conform to Unicode is a
nuisance, but not an error.
I'll see if we can get a message through to Ms Tordai.
Martin