[Snowball-discuss] Asian Languages

Charlie Hull charlie at lemurconsulting.com
Wed Jul 15 09:26:18 BST 2009


Martin Porter wrote:
> Olga,
> 
> As yet, we do not have any solutions for Asian languages on the snowball site.
> 
> Martin Porter
>  
>> Functions needed: segmentation, tokenization, stemming, part of speech
>> tagging, compound decomposition, noun phrase extraction.
>>

You might find some useful links in the xapian-discuss mailing list
archives; I know CJK tokenisers have been discussed there in the past:
http://search.gmane.org/search.php?group=gmane.comp.search.xapian.general&query=CJK

Charlie


-- 
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


