[Snowball-discuss] Mobile phone implementation of the English Stemmer

Alexandra Elizabeth Duncan aed02@doc.ic.ac.uk
Thu Aug 28 17:55:01 2003


Hi
I wrote a couple of emails to this mailing list back in July - I am an 
MSc student studying Computer Science at Imperial College, London. I 
have just about completed my thesis/project which has been concerned 
with writing a mobile phone translator (English to French and French to 
English).  Please excuse the length of this email but I thought you 
might be interested in the work I have done using the Porter algorithm.

Very briefly, there is a small dictionary of words stored as part of the 
application on the mobile phone.  A user inputs a word to be translated 
and the application returns the translation if the word is found in the 
phone dictionary.  If the word is not in the dictionary, the application 
queries a remote dictionary and returns the translation.
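
For readers who want to picture the lookup order described above, here 
is a minimal sketch in standard Java.  It is not the code from the 
project: the class name TwoTierTranslator, the RemoteDictionary 
interface and the sample words are all hypothetical, and the remote 
query is stubbed out rather than sent over a network.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: try the on-phone dictionary first and only
// fall back to the remote dictionary when the word is not found.
class TwoTierTranslator {

    // Assumed abstraction of the remote dictionary; a real one would
    // make a network request. Returns null when the word is unknown.
    interface RemoteDictionary {
        String translate(String word);
    }

    private final Map<String, String> local;
    private final RemoteDictionary remote;

    TwoTierTranslator(Map<String, String> local, RemoteDictionary remote) {
        this.local = local;
        this.remote = remote;
    }

    String translate(String word) {
        String hit = local.get(word);
        if (hit != null) {
            return hit;                  // found on the phone, no remote query
        }
        return remote.translate(word);   // not stored locally, ask the remote side
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("cat", "chat");        // illustrative entry only

        // Stub standing in for the remote dictionary.
        RemoteDictionary stub = w -> "dog".equals(w) ? "chien" : null;

        TwoTierTranslator translator = new TwoTierTranslator(local, stub);
        System.out.println(translator.translate("cat"));
        System.out.println(translator.translate("dog"));
    }
}
```

Keeping the remote side behind a small interface means the fallback can 
be replaced by a stub when testing without a network connection.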

Given the constrained system requirements of mobile phones, I have had 
to work at compressing the words to be stored on the phone.  For this I 
used the Porter algorithm and the Java implementation from the website.
The words that make up the dictionary are stemmed and stored on the 
phone.  When the user inputs a word, that word is then stemmed (using 
the Java implementation modified slightly for the mobile phone) and then
matched against the stemmed words in the dictionary.
By doing this, I was able to get about 25% compression on the English 
words I had.
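
The stem-then-match idea can be sketched as follows.  This is only an 
illustration under stated assumptions: toyStem is a crude suffix 
stripper standing in for the real stemmer and is NOT the Porter 
algorithm, the word list is made up, and mapping each stem to a single 
translation is an assumption about how the dictionary might be laid 
out.  The saving it prints applies to this toy list only and says 
nothing about the 25% figure above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: store dictionary words by stem, stem the user's
// input the same way, and match on the stem.
class StemmedDictionarySketch {

    // Toy stand-in for the stemmer (NOT the Porter algorithm): strips
    // one of a few suffixes if at least three letters remain.
    static String toyStem(String word) {
        String w = word.toLowerCase();
        String[] suffixes = {"ing", "ed", "s"};
        for (String suffix : suffixes) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Build the stored dictionary: words sharing a stem collapse into
    // one entry, which is where the space saving comes from.
    static Map<String, String> build(String[][] entries) {
        Map<String, String> stemmed = new LinkedHashMap<>();
        for (String[] entry : entries) {
            stemmed.putIfAbsent(toyStem(entry[0]), entry[1]);
        }
        return stemmed;
    }

    // Stem the input with the same function, then match on the stem.
    static String lookup(Map<String, String> stemmed, String input) {
        return stemmed.get(toyStem(input));
    }

    public static void main(String[] args) {
        String[][] entries = {               // illustrative data only
            {"walk", "marcher"}, {"walking", "marcher"},
            {"walked", "marcher"}, {"house", "maison"},
            {"houses", "maison"}
        };
        Map<String, String> stemmed = build(entries);

        int originalChars = 0;
        for (String[] entry : entries) {
            originalChars += entry[0].length();
        }
        int storedChars = 0;
        for (String key : stemmed.keySet()) {
            storedChars += key.length();
        }
        System.out.println("original key characters: " + originalChars);
        System.out.println("stored key characters:   " + storedChars);
        System.out.println("walks -> " + lookup(stemmed, "walks"));
    }
}
```

The important point is that the same stemming function is applied both 
when the dictionary is built and when the user's word is looked up; 
otherwise the stems would not match.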

I implemented stemming only for the English words, so only the English 
words are compressed.  I did try to implement the French stemmer, but I 
found it was more complicated and too large for the mobile phone.

I would like to say thank you for the excellent and informative website 
- it has been of great use to me in the past 3 months.

I was also wondering if you know of anyone who has implemented the 
stemmer on a mobile phone.  If not, this would lend my project a bit of 
extra kudos, I have to say!

I will be finalising the code and writing the actual thesis in the next 
2 weeks.  If anyone is interested in the work that I have done on it, 
please let me know as I would be more than happy to supply the code 
and/or the report.

Thank you once again
Alex Duncan