[Snowball-discuss] Name Convention and abstract Method "stem()"

Jens Krefeldt J.Krefeldt at gmx.de
Sat Aug 20 19:15:46 BST 2005


Hi Snowball developers,

1. Many thanks for this nice piece of code.

2. I have a suggestion for improving the Java Snowball stemmer. To use a 
Snowball stemmer today, you have to write something like this:

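            String name = ...;
            // Look up the generated stemmer class by its language name.
            Class stemClass = Class.forName("org.tartarus.snowball.ext." + name + "Stemmer");
            stemmer = (SnowballProgram) stemClass.newInstance();
            // stem() is not declared in SnowballProgram, so it has to be found via reflection.
            stemMethod = stemClass.getMethod("stem", new Class[0]);
            stemMethod.invoke(stemmer, new Object[0]);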

My question: why doesn't the SnowballProgram class have an abstract stem() 
method, like this?

public abstract class SnowballProgram {
    ...
    /**
     * Every derived <CODE>SnowballProgram</CODE> has to implement this method
     * to run the appropriate stemming algorithm.
     */
    public abstract boolean stem();

    ...
}

This would shorten the code shown above (and also make it faster, since the 
reflective method call is slow):

            String name = ...;
            Class stemClass = Class.forName("org.tartarus.snowball.ext." + name + "Stemmer");
            stemmer = (SnowballProgram) stemClass.newInstance();
            stemmer.stem();

This is also the standard design pattern "Abstract Factory" [Gamma et al., 1996], 
which Java itself uses implicitly.
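
To illustrate point 2, here is a minimal, self-contained sketch. The SnowballProgram class below is only a stand-in for the real one, and ToyStemmer and StemmerLoader are hypothetical names invented for this example. The toy stemmer just strips a trailing "s" and is not a real Snowball algorithm.

```java
// Stand-in for the real SnowballProgram, so that the sketch compiles on its own.
abstract class SnowballProgram {
    private String current = "";

    public void setCurrent(String word) { current = word; }
    public String getCurrent() { return current; }

    // The proposed abstract method: every stemmer has to implement it.
    public abstract boolean stem();
}

// Hypothetical toy stemmer (NOT a real Snowball algorithm): strips one trailing "s".
class ToyStemmer extends SnowballProgram {
    @Override
    public boolean stem() {
        String word = getCurrent();
        if (word.length() > 1 && word.endsWith("s")) {
            setCurrent(word.substring(0, word.length() - 1));
        }
        return true;
    }
}

// Hypothetical helper: reflection is used only to create the instance.
class StemmerLoader {
    static SnowballProgram load(String className) {
        try {
            Class<? extends SnowballProgram> stemClass =
                    Class.forName(className).asSubclass(SnowballProgram.class);
            return stemClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load stemmer " + className, e);
        }
    }

    static String stem(SnowballProgram stemmer, String word) {
        stemmer.setCurrent(word);
        stemmer.stem(); // ordinary virtual call, no Method.invoke() needed
        return stemmer.getCurrent();
    }
}
```

With these classes in the default package, StemmerLoader.stem(StemmerLoader.load("ToyStemmer"), "cats") returns "cat". The class lookup still goes through Class.forName, but the per-word call to stem() is an ordinary method call.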

3. I would suggest renaming the stemmer classes according to the ISO country 
(http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html) 
and language (http://www.loc.gov/standards/iso639-2/englangn.html) codes, so 
that I can identify them via a java.util.Locale object, e.g.:

germanStemmer.java     ==>     de_DE.java      or      de_DE_Stemmer.java
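
As a rough sketch of how the lookup could then work, the helper below builds a class name from a java.util.Locale. StemmerNames and classNameFor are hypothetical names, the "de_DE_Stemmer" form is the second naming variant from above, and the fallback for locales without a country code is purely an assumption of mine.

```java
import java.util.Locale;

// Hypothetical helper, assuming the stemmers were renamed to <language>_<COUNTRY>_Stemmer.
class StemmerNames {
    static final String PACKAGE = "org.tartarus.snowball.ext.";

    static String classNameFor(Locale locale) {
        String language = locale.getLanguage(); // language code, e.g. "de"
        String country = locale.getCountry();   // ISO 3166 code, e.g. "DE", or "" if unset
        if (country.isEmpty()) {
            // Assumption: language-only locales map to a name without a country part.
            return PACKAGE + language + "_Stemmer";
        }
        return PACKAGE + language + "_" + country + "_Stemmer";
    }
}
```

For Locale.GERMANY this yields "org.tartarus.snowball.ext.de_DE_Stemmer", which could then be passed to Class.forName as in point 2.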


Please let me know what you think.

Best regards

Jens




