[Snowball-discuss] a simple algorithm problem
Martin Porter
martin.porter at grapeshot.co.uk
Thu Jan 6 10:20:43 GMT 2005
>Presumably this still restricts Snowball to code points in the BMP? Or
>does it just restrict it to recognising and doing things with
>characters at code points in the BMP, passing through any others?
It would be the latter. Since stemming is applicable to a system of
languages, all of whose characters are, I would assert, in the BMP, I do
not think that is a problem.
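
To illustrate what "passing through" could mean in practice, here is a
minimal Python sketch. It is not Snowball code: the names is_bmp,
split_runs and process are hypothetical, and transform stands in for
whatever a stemmer would do to text it recognises. Runs of characters in
the BMP are handed to the transform, and anything above U+FFFF is copied
to the output untouched.

```python
BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def is_bmp(ch):
    """Return True if the character's code point lies in the BMP."""
    return ord(ch) <= BMP_MAX


def split_runs(text):
    """Split text into (in_bmp, run) pairs of consecutive characters."""
    runs = []
    for ch in text:
        flag = is_bmp(ch)
        if runs and runs[-1][0] == flag:
            # Same class as the previous character: extend the current run.
            runs[-1] = (flag, runs[-1][1] + ch)
        else:
            runs.append((flag, ch))
    return runs


def process(text, transform):
    """Apply transform to BMP runs only; copy non-BMP runs unchanged."""
    return "".join(
        transform(run) if flag else run
        for flag, run in split_runs(text)
    )
```

For example, process(s, str.lower) would lowercase the BMP parts of s and
leave any supplementary-plane characters exactly as they were.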
>What's the character encoding of snowball scripts at the moment?
The scripts themselves are in ASCII, and ASCII assumptions are made in the
Snowball compiler.