[Snowball-discuss] Fwd: How to add Tamil Support to stemmer?

Damodharan Rajalingam r.damodharan at ymail.com
Wed Mar 27 16:05:06 GMT 2013


Hi Richard,
I am working on getting the sample dataset. I did this work for my Master's thesis four years ago, so I am having difficulty locating some of the files :). I will clean up the dataset I used for evaluation and add it to the git repo, along with documentation of the evaluation results.

Thanks
Damu




________________________________
From: Richard Boulton <richard at tartarus.org>
To: Shrinivasan T <tshrinivasan at gmail.com> 
Cc: snowball-discuss at lists.tartarus.org 
Sent: Wednesday, March 27, 2013 6:31 PM
Subject: Re: [Snowball-discuss] Fwd: How to add Tamil Support to stemmer?
 
On 27 March 2013 12:03, Shrinivasan T <tshrinivasan at gmail.com> wrote:
> The patch for the Tamil stemmer is here:
> https://github.com/rdamodharan/tamil-stemmer/blob/master/snowball-tamil.patch
>
> We apply the patch and compile the stemmer to make it work with the Tamil language.
>
> How can we add the patch to the upstream stemmer?

"rdamodharan" has actually done exactly what's needed for this, by
submitting a pull request on GitHub to our repository:
https://github.com/snowballstem/snowball/pull/2
Unfortunately, I haven't had a chance to look at it so far; I will
make time to do so over the next few days.

I have no way of evaluating the results of this stemmer, but am
willing to take the word of Tamil speakers as to whether the algorithm
is of use.  There may be some changes to the code that should be made
to improve performance, as Martin mentioned.  One thing that would be
of great use is a sample dataset, similar to that in
https://github.com/snowballstem/snowball-data/blob/master/english/voc.txt,
together with a sample file containing the corresponding expected
output.
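
For illustration only, here is a rough Python sketch of how such a pair
of files might be used to check a stemmer. It assumes one word per line
in the vocabulary file and the expected stem on the same line of the
output file. The names find_mismatches, toy_stem and read_lines are
hypothetical, and toy_stem is a stand-in, not the Tamil algorithm.

```python
from typing import Callable, Iterable, List, Tuple


def find_mismatches(
    words: Iterable[str],
    expected_stems: Iterable[str],
    stem: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (word, expected, actual) for each word whose stem differs."""
    mismatches = []
    for word, expected in zip(words, expected_stems):
        actual = stem(word)
        if actual != expected:
            mismatches.append((word, expected, actual))
    return mismatches


def toy_stem(word: str) -> str:
    # Hypothetical stand-in stemmer: strips one trailing "s".
    return word[:-1] if word.endswith("s") else word


def read_lines(path: str) -> List[str]:
    # Assumed file layout: UTF-8 text, one entry per line.
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


if __name__ == "__main__":
    # Inline sample data instead of real files, so the sketch runs as-is.
    vocabulary = ["cats", "dog", "glass"]
    expected = ["cat", "dog", "glass"]
    for word, want, got in find_mismatches(vocabulary, expected, toy_stem):
        print(f"{word}: expected {want!r}, got {got!r}")
```

With real data, the two lists would come from read_lines() on the
vocabulary file and the expected-output file, and stem would be
replaced by a call into the stemmer under test.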

-- 
Richard

_______________________________________________
Snowball-discuss mailing list
Snowball-discuss at lists.tartarus.org
http://lists.tartarus.org/mailman/listinfo/snowball-discuss