[Snowball-discuss] Article about using snowball stemmer to do language identification

Martin Porter martin.f.porter at gmail.com
Thu Jun 30 11:00:32 BST 2011


Iolalla,

Very interesting. In the past I've used stopword lists for language
identification, and the results were adequate. But I did notice that
in Finnish there aren't so very many stopwords!
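
For illustration, stopword-based identification can be sketched roughly as
below. This is only a minimal sketch: the stopword lists are tiny, hand-picked
assumptions (not real Snowball lists), and the name guess_language is
hypothetical.

```python
import re

# Tiny illustrative stopword lists (assumptions, far from complete).
STOPWORDS = {
    "english": {"the", "and", "of", "to", "in", "is", "that", "it"},
    "spanish": {"el", "la", "de", "que", "y", "en", "los", "es"},
    "german": {"der", "die", "und", "das", "ist", "nicht", "zu", "den"},
}


def guess_language(text):
    """Return (language, scores) for the language with the most stopword hits.

    The language is None when no stopword from any list occurs in the text.
    """
    tokens = re.findall(r"\w+", text.lower())
    # Count, per language, how many tokens are in that language's stopword list.
    scores = {
        lang: sum(1 for tok in tokens if tok in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


if __name__ == "__main__":
    print(guess_language("The cat is in the garden and it is happy"))
```

A language with few stopwords gives a weak signal under this scheme, which
is the difficulty noted for Finnish.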

But I don't quite see how applying a stemmer leads to a language
identification. Do you count the number of valid endings removed, and
use that as a measure?

Like Cominvent, I wondered how the mixing was done in the hybrid approach.

Martin