[Snowball-discuss] Multi language full text: Which stemming language should be used?

Craig Rairdin craigr at laridian.com
Mon Apr 30 15:19:50 BST 2012


I would think you would index your original documents according to the
language they are written in. You would end up with multiple indexes, one
for each language.

Then, assuming the user does not tell you what language their search terms
are in, stem the search terms in each of your supported languages and look
up each stemmed term in the index for that language.
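
A rough sketch of that scheme in Python, in case it helps. The suffix
strippers here are toy stand-ins I made up for illustration, not the real
Snowball algorithms, and the names (make_toy_stemmer, build_indexes, search)
are hypothetical. In a real system you would plug the Snowball stemmer for
each language into the STEMMERS table instead.

```python
from collections import defaultdict


def make_toy_stemmer(suffixes):
    """Return a toy suffix stripper (a stand-in for a real Snowball stemmer)."""
    def stem(word):
        for suffix in suffixes:
            # Only strip when a reasonably long stem is left over.
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word
    return stem


# Assumption: one stemmer per supported language, keyed by language name.
STEMMERS = {
    "english": make_toy_stemmer(("ing", "es", "s")),
    "german": make_toy_stemmer(("en", "er", "e")),
}


def build_indexes(documents):
    """Build one inverted index per language from (doc_id, language, text)."""
    indexes = {lang: defaultdict(set) for lang in STEMMERS}
    for doc_id, lang, text in documents:
        stem = STEMMERS[lang]
        for word in text.lower().split():
            indexes[lang][stem(word)].add(doc_id)
    return indexes


def search(indexes, query):
    """Stem the query once per language and look it up in that language's index."""
    hits = set()
    for lang, index in indexes.items():
        stem = STEMMERS[lang]
        for word in query.lower().split():
            hits |= index.get(stem(word), set())
    return hits


docs = [
    (1, "english", "indexing documents"),
    (2, "german", "die Katzen schlafen"),
]
indexes = build_indexes(docs)
print(search(indexes, "document"))
print(search(indexes, "Katze"))
```

The query is never assigned a language. It is simply tried against every
index, each time with the stemmer that was used to build that index.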

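You cannot just do the stemming operation once in, say, English, because
each language has its own stemming rules. A stem produced by the English
stemmer will generally not match the stems stored in an index built with
another language's stemmer.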
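
To make the mismatch concrete, here is a small sketch. The suffix lists are
hypothetical toy rules, not the real Snowball stemmers, but the effect is
the same in kind: the German singular and plural only reduce to a common
stem under the German rules.

```python
# Hypothetical toy suffix rules; the real Snowball stemmers are far richer.
SUFFIXES = {
    "english": ("ing", "es", "s"),
    "german": ("en", "er", "e"),
}


def toy_stem(word, language):
    """Strip the first matching suffix listed for the given language."""
    for suffix in SUFFIXES[language]:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Compare the stems of a German singular/plural pair under both rule sets.
for word in ("katze", "katzen"):
    print(word, toy_stem(word, "english"), toy_stem(word, "german"))
```

Under the English rules the two forms stay different, so a search for one
would never find a document containing the other.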

Craig

On 4/30/12 7:17 AM, "Manoj M" <manojmarathayil at gmail.com> wrote:

Which stemming language should I use if I want to support full-text search
in all languages? As far as I know, the index needs to be created using a
specific stemming language to support search in that language, but this is
not possible for me, as my search program may contain different languages.

Thanks in advance.

--
Regards,
Manoj Marathayil

_______________________________________________
Snowball-discuss mailing list
Snowball-discuss at lists.tartarus.org
http://lists.tartarus.org/mailman/listinfo/snowball-discuss




