[Snowball-discuss] Turkish stemmer code

Olly Betts olly at survex.com
Fri Feb 16 22:02:32 GMT 2007


I noticed some uses of "test" in the new Turkish stemmer which seem
redundant to me (or else I don't understand Snowball well enough, which
is quite possible).

This snippet is used 4 times (once with "non-vowel" instead of "vowel"):

    test(next (test vowel))

But the inner "test" seems redundant as the cursor will be reset after
"vowel" anyway by the outer "test", so I think this is just the same as:

    test(next vowel)

Also, this snippet is used 4 times (once with a grouping instead of
a single character literal):

    ((test 'n') next (test vowel))

But "test 'n'" leaves the cursor where it was and "next" then advances
it by one character, which is what matching 'n' directly would do
anyway, so isn't that the same as this:

    ('n' (test vowel))

I found a Turkish wordlist, but it's aimed at checking for insecure
passwords so only contains ASCII characters (and the licence means
it's not suitable for distributing with Snowball anyway).  But I
tried the changes above and for this restricted test set I get the
same results with and without these changes.

Cheers,
    Olly


