[Snowball-discuss] Article about using snowball stemmer to do language identification
Martin Porter
martin.f.porter at gmail.com
Thu Jun 30 11:00:32 BST 2011
Iolalla,
Very interesting. In the past I've used stopword lists for language
identification, and the results were adequate. But I did notice that
in Finnish there aren't so very many stopwords!
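
As a rough illustration of the stopword approach mentioned above, the sketch below counts how many tokens of a text appear in each language's stopword list and picks the language with the most hits. The tiny word lists and the names stopword_scores and guess_language are assumptions made up for this example; real stopword lists are much longer, and this is not necessarily how it was done in practice.

```python
import re

# Tiny illustrative stopword sets (assumptions for this sketch only;
# real stopword lists contain far more entries).
STOPWORDS = {
    "english": {"the", "and", "of", "to", "in", "is", "that"},
    "spanish": {"el", "la", "de", "que", "y", "en", "los"},
    "german": {"der", "die", "und", "das", "ist", "nicht", "ein"},
}

def stopword_scores(text):
    """Return, per language, how many tokens of the text are stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }

def guess_language(text):
    """Pick the language with the most stopword hits, or None if there are none."""
    scores = stopword_scores(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A language with few stopwords, as noted for Finnish, would score low under this scheme even on text in that language, which is one weakness of relying on raw counts.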
But I don't quite see how applying a stemmer leads to a language
identification. Do you count the number of valid endings removed, and
use that as a measure?
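
One possible reading of the question above, sketched under loud assumptions: for each candidate language, count the tokens from which a known ending can be removed, and treat the highest count as the guess. The toy suffix lists below stand in for the real Snowball stemmers, and the names endings_removed and guess_by_endings are hypothetical; this is not claimed to be the method used in the article.

```python
import re

# Toy suffix lists standing in for real stemmers (assumptions for this sketch).
SUFFIXES = {
    "english": ("ing", "ed", "ly", "ness"),
    "spanish": ("ando", "iendo", "mente", "ción"),
    "german": ("ung", "lich", "keit", "chen"),
}

def endings_removed(text, lang, min_stem=3):
    """Count tokens ending in one of lang's suffixes, leaving a stem of at least min_stem letters."""
    tokens = re.findall(r"\w+", text.lower())
    return sum(
        1
        for token in tokens
        if any(
            token.endswith(suffix) and len(token) - len(suffix) >= min_stem
            for suffix in SUFFIXES[lang]
        )
    )

def guess_by_endings(text):
    """Pick the language whose suffix list removes endings from the most tokens."""
    counts = {lang: endings_removed(text, lang) for lang in SUFFIXES}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```

Whether a removed ending is actually "valid" for the language is exactly what a crude count like this cannot tell, so the measure would presumably need normalising by text length or combining with other evidence.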
Like Cominvent, I wondered how the mixing was done in the hybrid approach.
Martin