[Snowball-discuss] a simple algorithm problem
Martin Porter
martin.porter at grapeshot.co.uk
Wed Dec 15 23:16:55 GMT 2004
See the section
http://snowball.tartarus.org/q/use.html
and look at the widechars option etc.
Quite a bit of work has, as you probably know, been done on Turkish
stemming, but it is not work I am at all familiar with.
Martin
At 22:22 15/12/2004 +0000, ayhan peker wrote:
>Hi Martin,
>Thank you very much for your reply.
>I don't think I will want to run Snowball as 8-bit ASCII, because my
>system and my database have been modified to accept Unicode chars (UTF-8). I
>tried to run it that way previously, but the database was unable to return
>text to the client as Unicode, which is what I intend to do.
>By the way, this is for a pure Turkish search engine (when I tried to run
>it with ASCII only, my robot-database-web interface all got muddled). So
>for me it is too late to try to go back to non-Unicode mode.
>Could you tell me how I can run Snowball in 16-bit char mode, or do you
>have a piece of documentation I can read about it?
>
>I would quite like to develop a Turkish stemming algorithm, but it is one of
>the most difficult languages in the world to do. What I am trying to do is
>make a start on this project -a simple start- :) . I intend to continue
>to develop it and get some others to contribute to this development.
>
>Best regards.
>Ayhan