[Snowball-discuss] Snowball 2.0.0 released

Olly Betts olly at survex.com
Mon Oct 7 00:53:48 BST 2019


Please can we keep this discussion on the list?  I think it's of wider
interest.

On Sat, Oct 05, 2019 at 07:28:12PM +0200, Yann Barsamian wrote:
> I now understand that testing the return value is irrelevant, and that I can
> always return snowballstemmer.getCurrent() --- but the question remains, am
> I guaranteed to get a non-null String and/or non-empty String when the word
> I give as input is non-null and non-empty ?

You should never get null from getCurrent().

Whether a non-empty input can produce an empty output is a property of
the stemming algorithm in use.  It's clearly possible to implement an
algorithm which returns empty outputs for some inputs - for example, the
very simple algorithm "if word ends with `s`, remove it" would produce an
empty output for input `s`.
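
As a rough illustration, here is that toy rule written out in Java.  This
is only a sketch: the class name TrailingSToy and its static stem() helper
are made up for this example and are not part of Snowball or of any real
stemming algorithm.

```java
// Hypothetical toy, not part of Snowball: implements only the rule
// "if the word ends with 's', remove it".
final class TrailingSToy {
    private TrailingSToy() {}

    // Returns the word with one trailing 's' removed, or the word
    // unchanged if it does not end with 's'.
    static String stem(String word) {
        if (word.endsWith("s")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }
}
```

With this rule, "cats" becomes "cat", but the one-letter input "s" becomes
the empty string, so a caller which needs a non-empty result has to check
for that itself.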

I think most of the current algorithms probably only produce an empty
output for an empty input, but I haven't examined them all with that
in mind.  I recall the Arabic algorithm could return empty outputs at
one point, but I'm not sure whether subsequent revisions changed that.

> If you have time, it would be great if you could comment the file
> /java/org/tartarus/snowball/SnowballStemmer.java in order to indicate
> what you told me about the abstract stem() function

I'll add a note.

> and give insights of what are the cases, in the 6 languages you
> extracted, of the reason why there is the "false" return value.

The return value is just the last signal from the Snowball program, so
if it's false then execution didn't reach the very end of the `stem`
function.  But for the current stemmers, that doesn't indicate an error
(or indeed anything at all).
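
To make that concrete, here is a small Java sketch.  StemmerLike is a
hypothetical stand-in which only mirrors the setCurrent() / stem() /
getCurrent() shape discussed in this thread (it is not the real
SnowballStemmer class), and TwoStepToyStemmer and StemUtil are invented
for the example.  The toy's stem() returns the outcome of its last step,
so it can return false even though an earlier step changed the word.

```java
// Hypothetical stand-in with the same method shape as discussed above;
// NOT the real org.tartarus.snowball.SnowballStemmer.
abstract class StemmerLike {
    private String current = "";

    public void setCurrent(String word) { current = word; }

    public String getCurrent() { return current; }

    // The result is just the outcome of the last step that ran.
    public abstract boolean stem();
}

// Invented two-step toy: drop a trailing "s", then try to drop a trailing "e".
final class TwoStepToyStemmer extends StemmerLike {
    @Override
    public boolean stem() {
        String w = getCurrent();
        // Step 1: remove a trailing "s" if present (its outcome is discarded).
        if (w.endsWith("s")) {
            w = w.substring(0, w.length() - 1);
            setCurrent(w);
        }
        // Step 2 (last step): its outcome becomes the return value.
        if (w.endsWith("e")) {
            setCurrent(w.substring(0, w.length() - 1));
            return true;
        }
        return false; // the last step did not apply; this is not an error
    }
}

final class StemUtil {
    private StemUtil() {}

    // Ignores the boolean from stem() and always reads the result
    // back with getCurrent().
    static String stemWord(StemmerLike stemmer, String word) {
        stemmer.setCurrent(word);
        stemmer.stem(); // return value deliberately ignored
        return stemmer.getCurrent();
    }
}
```

Here stem() returns false for "cats" even though the word was shortened
to "cat", which is why the sketch reads the result from getCurrent()
rather than branching on the return value.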

If you know Perl, it's like how a Perl function without an explicit
return statement will return the value of the last expression evaluated
in the function.  So a Perl function which conceptually "returns void"
will often actually yield a return value, but it's not a useful value to
look at.

Cheers,
    Olly


