[Snowball-discuss] Minor(?) errors in snowball's generated C code

Tom Lane tgl at sss.pgh.pa.us
Sun Sep 30 17:55:40 BST 2018


The Postgres project uses the C versions of snowball's stemmers.
Recently I got around to updating that code to match current
master from github.com/snowballstem (after an embarrassingly
long interval in which we hadn't updated our copy).  That resulted
in some new warnings from the Coverity static analyzer that we run
over all our code:

/srv/coverity/git/pgsql-git/postgresql/src/backend/snowball/libstemmer/stem_UTF_8_indonesian.c: 370 in indonesian_UTF_8_stem()
364                         z->lb = z->c; z->c = z->l; /* backwards, line 185 */
365     
366                         {   int ret = r_remove_suffix(z); /* call remove_suffix, line 185 */
367                             if (ret == 0) goto lab6;
368                             if (ret < 0) return ret;
369                         }
>>>     CID 1439829:  Incorrect expression  (UNUSED_VALUE)
>>>     Assigning value from "z->lb" to "z->c" here, but that stored value is overwritten before it can be used.
370                         z->c = z->lb;
371                         z->c = c_test8;
372                     }
373                     if (!(z->I[0] > 2)) goto lab6; /* $(<integer expression> > <integer expression>), line 186 */
374                     {   int ret = r_remove_second_order_prefix(z); /* call remove_second_order_prefix, line 186 */
375                         if (ret == 0) goto lab6;

(There's an identical complaint against stem_ISO_8859_1_indonesian.c.)

This is not much of a problem --- any decent compiler will just discard
the dead store on line 370.  But maybe it's an easily fixable mistake
in the Snowball compiler?  Or a bug in the indonesian stemmer algorithm?

/srv/coverity/git/pgsql-git/postgresql/src/backend/snowball/libstemmer/stem_UTF_8_arabic.c: 972 in r_Normalize_post()
966     
967             z->ket = z->c; /* [, line 321 */
968             if (z->c - 1 <= z->lb || z->p[z->c - 1] >> 5 != 5 || !((124 >> (z->p[z->c - 1] & 0x1f)) & 1)) goto lab0; /* substring, line 321 */
969             if (!(find_among_b(z, a_1, 5))) goto lab0;
970             z->bra = z->c; /* ], line 321 */
971             {   int ret = slice_from_s(z, 2, s_50); /* <-, line 322 */
>>>     CID 1439828:  Control flow issues  (MISSING_RESTORE)
>>>     Value of non-local "z->c" that was saved in "c1" is not restored as it was along other paths.
972                 if (ret < 0) return ret;
973             }
974             z->c = z->lb;
975         lab0:
976             z->c = c1;
977         }

It's not clear to me if this is actually a bug, or it's more of a case of
"we don't care whether this happens in this code path".  Coverity does have
a lot of heuristic checks that sometimes give false positives.

			regards, tom lane



More information about the Snowball-discuss mailing list