Thank you, everyone, for the really good explanations. I have nothing more to ask about Unicode and the probabilistic model. One more question: how does your library compare to Lemur (http://www.lemurproject.org)? Thank you in advance. Regards, Alexandre.