[Snowball-discuss] a simple algorithm problem

Martin Porter martin.porter at grapeshot.co.uk
Thu Jan 6 10:20:43 GMT 2005


>Presumably this still restricts Snowball to code points in the BMP? Or
>does it just restrict it to recognising and doing things with
>characters at code points in the BMP, passing through any others?

It would be the latter. Since stemming is applicable to a system of
languages all of whose characters are, I would assert, in the BMP, I do
not think that is a problem.
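
To make the pass-through behaviour concrete, here is a minimal Python
sketch. It is not Snowball code and says nothing about how the Snowball
runtime does it. The names BMP_MAX, in_bmp and transform_bmp_only are
made up for illustration, and "transform" stands in for whatever a
stemmer would do to a character it recognises.

```python
# Illustrative only: hypothetical helper names, not part of Snowball.
BMP_MAX = 0xFFFF  # highest code point in the Basic Multilingual Plane


def in_bmp(ch: str) -> bool:
    """Return True if the single character ch lies in the BMP."""
    return ord(ch) <= BMP_MAX


def transform_bmp_only(text: str, transform) -> str:
    """Apply transform to each BMP character; copy all others unchanged."""
    return "".join(transform(ch) if in_bmp(ch) else ch for ch in text)
```

For example, transform_bmp_only(word, str.lower) would lowercase the BMP
characters of word and copy any supplementary-plane characters through
untouched.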

>What's the character encoding of snowball scripts at the moment?

The scripts themselves are in ASCII, and ASCII assumptions are made in the
Snowball compiler. 





