[Snowball-discuss] Multiple errors in generated Java sources for Latin algorithm

Alexander Myltsev alexander.myltsev at phystech.edu
Thu Jun 1 08:10:21 BST 2017


Hi,

I’m trying to generate Java sources for the Latin stemming algorithm at http://snowballstem.org/otherapps/schinke/. I copied stem.sbl from schinke.tgz to “snowball/algorithms/latin/stem.sbl” and then updated GNUmakefile:

diff --git a/GNUmakefile b/GNUmakefile
index d6c7606..08237fa 100644
--- a/GNUmakefile
+++ b/GNUmakefile
@@ -29,7 +29,7 @@ libstemmer_algorithms = arabic \
                        danish dutch english finnish french german hungarian \
                        italian \
                        norwegian porter portuguese romanian \
-                       russian spanish swedish tamil turkish
+                       russian spanish swedish tamil turkish latin
 
 KOI8_R_algorithms = russian
 ISO_8859_1_algorithms = danish dutch english finnish french german italian \

However, the generated sources fail to compile under

java version "1.8.0_92"
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)

with the error:

[error] ./src/main/java/org/tartarus/snowball/ext/latinStemmer.java:260: missing return statement
[error]     } 
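
For context, javac reports "missing return statement" when a non-void method can reach the end of its body without returning a value. The sketch below is only a hypothetical illustration of that rule: `MissingReturnDemo` and `hasSuffix` are made-up names, not the actual code generated in latinStemmer.java.

```java
public class MissingReturnDemo {
    // Hypothetical stand-in for a generated boolean routine (assumption,
    // not taken from latinStemmer.java). The loop can finish without
    // returning, so the end of the method body is reachable.
    public static boolean hasSuffix(String word, String[] suffixes) {
        for (String s : suffixes) {
            if (word.endsWith(s)) {
                return true;
            }
        }
        // Deleting this line makes javac fail with
        // "missing return statement" at the closing brace.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasSuffix("datum", new String[] {"um", "us"}));
    }
}
```

A stubbed `return` silences the compiler in the same way, but it does not tell us which value the routine was meant to signal, so it may well change the stemmer's behaviour.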

Even if I work around the error by adding `return true` or `return false`, the stemmer produces odd results. When I run `TestApp latin in.txt -o out.txt` on the input `datum`, it outputs the string `datum                    datum`, but it should output just `dat`.

Does anybody know how to make the Latin algorithm work?
--
Alexander A. Myltsev

Computer scientist, software engineer, entrepreneur, explorer
site: myltsev.com 
tel.: +7-926-915-49-77