[Snowball-discuss] More patches for the C implementation of the English stemmer

Olly Betts olly at survex.com
Wed Jul 28 17:17:43 BST 2010


Here are five more patches:

0001-Fix-bug-with-y-to-Y-marking.patch:

Fixes a bug: the algorithm description says that an initial 'y' should be
converted to 'Y', but the C implementation only does this if the 'y' is
followed by a vowel.  For example, the C implementation stems "ygoe" (an
obsolete spelling of "ago") to "ygo", while the Snowball implementation
stems it to "ygoe".
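
As a rough illustration of the marking rule in the algorithm description
(an initial 'y', or a 'y' after a vowel, becomes 'Y'), here is a minimal
sketch.  It is not the code from the patch; mark_consonant_y() and
is_vowel() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Vowels as listed in the English stemmer description: a e i o u y.
   An already-marked 'Y' is deliberately not a vowel. */
static int is_vowel(char c)
{
    return c != '\0' && strchr("aeiouy", c) != NULL;
}

/* Hypothetical helper: mark each 'y' that acts as a consonant by
   rewriting it to 'Y'.  Returns the number of letters marked. */
static int mark_consonant_y(char *word, size_t len)
{
    int marked = 0;
    size_t i;

    assert(word != NULL);
    for (i = 0; i < len; i++) {
        if (word[i] != 'y')
            continue;
        /* An initial 'y' is always marked, whatever follows it. */
        if (i == 0 || is_vowel(word[i - 1])) {
            word[i] = 'Y';
            marked++;
        }
    }
    return marked;
}
```

With this rule "ygoe" becomes "Ygoe" before the later steps run, so the
leading letter is treated as a consonant even though 'g' follows it.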

0002-Use-const-qualifier-on-char-pointers-we-don-t-modify.patch:

Adding const here allows this to be compiled as C++, and fixes warnings with
some C compilers.
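
To show why the qualifier matters, here is a small sketch of a helper that
only reads its string arguments.  ends_with() is a hypothetical function,
and the length-prefixed suffix format (first byte holds the length) is an
assumption about how the C implementation passes suffixes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: does buf[0..len-1] end with the suffix s?
   s is assumed to be length-prefixed, e.g. "\03" "ing".
   Both pointers are const because the function never writes through them. */
static int ends_with(const char *buf, int len, const char *s)
{
    int n;

    assert(buf != NULL && s != NULL);
    n = s[0];                 /* first byte is the suffix length */
    if (n > len)
        return 0;
    return memcmp(buf + len - n, s + 1, (size_t)n) == 0;
}
```

In C++ a string literal has type array of const char, so passing
"\03" "ing" to a parameter declared as plain char * is rejected by current
C++ compilers, and some C compilers warn about it when asked to (for
example GCC with -Wwrite-strings).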

0003-Use-memcpy-in-setto-rather-than-memmove-since-we-kno.patch:

memcpy() is a little faster than memmove(), and safe here since the second
argument is always a string constant, so can't overlap with the first.
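
A sketch of what setto() might look like with memcpy() follows.  The
struct layout (b for the buffer, k for the index of the last character,
j for the index of the last character of the stem) and the length-prefixed
argument are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Assumed stemmer state: b[0..k] is the word, b[0..j] is the stem
   left after a suffix has been matched. */
struct stemmer {
    char *b;
    int k;
    int j;
};

/* Replace everything after b[j] with the length-prefixed string s.
   memcpy() is safe when s is a string constant, because a constant
   cannot overlap the working buffer z->b. */
static void setto(struct stemmer *z, const char *s)
{
    int length;

    assert(z != NULL && s != NULL);
    length = s[0];
    memcpy(z->b + z->j + 1, s + 1, (size_t)length);
    z->k = z->j + length;
}
```

For instance, with the buffer holding "relational", k = 9 and j = 2,
calling setto(&z, "\03" "ate") leaves "relate" in b[0..5] and sets k to 5.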

0004-Fix-indenting-and-add-missing-newline.patch:

Fixes a few code formatting inconsistencies.

0005-Specialise-setto-z-00-as-setto0-z.patch:

setto(z, "\00") is used several times, and the special case code is just
an assignment, so the patch replaces those calls with setto0(z).  This
results in smaller compiled code which runs faster.
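
The specialised function could be sketched as below.  The struct fields
are assumed for illustration (k is the index of the last character of the
word, j the index of the last character of the stem); the real definition
is in the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stemmer state, as in the sketch's lead-in. */
struct stemmer {
    char *b;
    int k;
    int j;
};

/* Setting the ending to a zero-length string copies no bytes, so the
   whole operation reduces to truncating the word at the stem. */
static void setto0(struct stemmer *z)
{
    assert(z != NULL);
    z->k = z->j;
}
```

With the length known to be zero at compile time, there is no length byte
to read and no copy to perform.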

Cheers,
    Olly
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-bug-with-y-to-Y-marking.patch
Type: text/x-diff
Size: 823 bytes
Desc: not available
URL: <http://lists.tartarus.org/mailman/private/snowball-discuss/attachments/20100729/5c9cfe15/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Use-const-qualifier-on-char-pointers-we-don-t-modify.patch
Type: text/x-diff
Size: 1952 bytes
Desc: not available
URL: <http://lists.tartarus.org/mailman/private/snowball-discuss/attachments/20100729/5c9cfe15/attachment-0001.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Use-memcpy-in-setto-rather-than-memmove-since-we-kno.patch
Type: text/x-diff
Size: 405 bytes
Desc: not available
URL: <http://lists.tartarus.org/mailman/private/snowball-discuss/attachments/20100729/5c9cfe15/attachment-0002.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Fix-indenting-and-add-missing-newline.patch
Type: text/x-diff
Size: 1367 bytes
Desc: not available
URL: <http://lists.tartarus.org/mailman/private/snowball-discuss/attachments/20100729/5c9cfe15/attachment-0003.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-Specialise-setto-z-00-as-setto0-z.patch
Type: text/x-diff
Size: 1573 bytes
Desc: not available
URL: <http://lists.tartarus.org/mailman/private/snowball-discuss/attachments/20100729/5c9cfe15/attachment-0004.patch>