[Snowball-discuss] {'p','r','o','b','l','e','m'}
Martin Porter
martin_porter@softhome.net
Fri, 01 Mar 2002 09:32:23 -0700
[to Richard Boulton]
Richard,
I've hit a slight snag in converting strings "abc" to structures
{'a','b','c'}, in the ANSI C generated from Snowball, which is exemplified
by this little program:
#include <stdio.h>

struct among
{   int s_size;         /* number of chars in string */
    char * s;           /* search string */
    int substring_i;    /* index to longest matching substring */
    int result;         /* result of the lookup */
    int (* function)(void);
};

static struct among A[4] =
{   { 4, "Abcd", 12, 12, 0},
    { 4, "aBcd", 12, 12, 0},
    { 4, "abCd", 12, 12, 0},
    { 4, "abcD", 12, 12, 0}
};

static struct among B[4] =
{   { 4, {'A','b','c','d',0}, 12, 12, 0},
    { 4, {'a','B','c','d',0}, 12, 12, 0},
    { 4, {'a','b','C','d',0}, 12, 12, 0},
    { 4, {'a','b','c','D',0}, 12, 12, 0}
};

int main(int argc, char * argv[])
{   printf("%s\n", (A[1]).s);
    printf("%s\n", (B[1]).s);
    return 0;
}

A[4] has a valid initializer; B[4] does not. It is an oddity of C that
strings are treated differently from the character arrays they represent. B
only gets initialized correctly if the among structure is given as
....
char s[100]; /* wasting space */
....
- very clumsy! And of course no finite upper bound is acceptable here.
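
To make the asymmetry concrete, here is a minimal sketch of the two cases side by side. The structure names among_ptr and among_arr, the bound 8, and the helper same_first_string are invented for illustration and are not part of the generated code.

```c
#include <string.h>

/* Hypothetical variants of the among structure, for illustration only. */
struct among_ptr            /* s is a pointer, as in the program above */
{   int s_size;
    const char * s;
};

struct among_arr            /* s is a fixed-size array: the clumsy workaround */
{   int s_size;
    char s[8];              /* the bound 8 is arbitrary */
};

/* A string literal is an array object with static storage, so it decays to
   an address that can initialise a pointer member. */
static struct among_ptr P[1] = { { 4, "aBcd" } };

/* A brace-enclosed list does not create an array object by itself; it can
   only initialise an array declared in place, hence the array member. */
static struct among_arr Q[1] = { { 4, {'a','B','c','d',0} } };

/* Returns 1 if both tables hold the same first string, 0 otherwise. */
int same_first_string(void)
{   return P[0].s_size == Q[0].s_size &&
           memcmp(P[0].s, Q[0].s, (size_t) P[0].s_size + 1) == 0;
}
```

Both tables end up holding the same characters, but every entry of Q occupies the full 8 bytes whatever the length of its string, which is the wasted space complained of above.
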
The only solution seems to be to declare all strings in the form
static const symbol string_N[4] = {'a','b','c','d'};
and have a separate initialisation for all the among structures:
A[0].s = string_1;
A[1].s = string_2;
which I can do, but it's disappointing that a small extension to Snowball
requires all this reorganisation. The generated code is also much less
attractive, of course.
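
One possible variant, offered as a sketch under assumptions rather than as what the generator will actually emit: in ANSI C the name of a static array is an address constant, so it should be usable directly inside the initializer of the among table, which would avoid the separate assignment statements. The typedef chosen for symbol, the array names string_1 and string_2, and the helper among_string_matches are assumptions made for illustration.

```c
#include <string.h>

typedef unsigned char symbol;   /* assumption: the real typedef may differ */

struct among
{   int s_size;                 /* number of chars in string */
    const symbol * s;           /* search string, not null-terminated */
    int substring_i;            /* index to longest matching substring */
    int result;                 /* result of the lookup */
    int (* function)(void);
};

/* Each string becomes a named static array of exactly the right size. */
static const symbol string_1[4] = {'A','b','c','d'};
static const symbol string_2[4] = {'a','B','c','d'};

/* An array name is an address constant, so it can appear in a static
   initializer and no run-time assignment is needed. */
static const struct among A[2] =
{   { 4, string_1, 12, 12, 0 },
    { 4, string_2, 12, 12, 0 }
};

/* Returns 1 if entry i of A holds exactly the n symbols at p, else 0. */
int among_string_matches(int i, const symbol * p, int n)
{   return A[i].s_size == n && memcmp(A[i].s, p, (size_t) n) == 0;
}
```

Since the arrays carry no terminating zero, any code reading them has to rely on s_size rather than on a null terminator.
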
I thought I'd run this by you to make sure I wasn't missing anything
obvious. Assuming not, I'll rework the generated ANSI C so that strings "..."
disappear entirely. (*sigh*)
Martin
_______________________________________________
Snowball-discuss mailing list
Snowball-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/snowball-discuss