simon-git: charset (master): Simon Tatham

Commits to Tartarus hosted VCS tartarus-commits at lists.tartarus.org
Fri Jan 4 09:52:43 GMT 2019


TL;DR:
  653de25 convcs: read with getc, not fgets or fread.

Repository:     https://git.tartarus.org/simon/charset.git
On the web:     https://git.tartarus.org/?p=simon/charset.git
Branch updated: master
Committer:      Simon Tatham <anakin at pobox.com>
Date:           2019-01-04 09:52:43

commit 653de2598e7b1607e13ece74d3b596038901c0d2
web diff https://git.tartarus.org/?p=simon/charset.git;a=commitdiff;h=653de2598e7b1607e13ece74d3b596038901c0d2;hp=2bad719f0a96d99567a5abe9fb36e0e8e632213d
Author: Simon Tatham <anakin at pobox.com>
Date:   Fri Jan 4 09:46:51 2019 +0000

    convcs: read with getc, not fgets or fread.
    
    In commit d97d7fdcb I switched over from fread to fgets for the sake
    of better interactive behaviour. But I'd forgotten that fgets has its
    own downside: it doesn't return the length of the data it read, so you
    have to find that out by calling strlen() on the output buffer, which
    stops short if any NUL bytes appeared in the input. UTF-16 encodes
    each ASCII-range character with a NUL byte, so convcs has been
    unable to handle UTF-16 input for a while.
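
The fgets problem described above can be reproduced with a small sketch. The helper name fgets_len is hypothetical and is not taken from convcs.c; it only mimics the pattern of calling fgets and then strlen to recover the length.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from convcs.c): read one line with fgets and
 * recover its length the only way fgets allows, by calling strlen on
 * the buffer. Returns 0 if fgets reports EOF or an error.
 */
static size_t fgets_len(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    /* strlen stops at the first NUL, whether fgets wrote it as the
     * terminator or it came from the input itself. */
    return strlen(buf);
}
```

Fed the UTF-16LE bytes for "hi\n" (68 00 69 00 0A 00), fgets stores five bytes up to and including the 0A, but the helper reports a length of 1, because the NUL after 'h' looks like the end of the string.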
    
    Ideally the replacement read function would have semantics like Unix
    read(2): block until at least one byte is available, then read as much
    data (up to the output buffer size) as is available without further
    blocking, and return it.
    
    But I don't want to introduce a whole layer of portability annoyance
    just for this, so instead I've written a manual loop on getc,
    terminating on any of a full buffer, EOF or \n. That should combine
    the NUL-tolerance of fread with the newline handling of fgets.
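
A minimal sketch of such a getc loop, assuming a plain stdio FILE as the input. The function name read_chunk and its signature are hypothetical; the actual change to convcs.c may be structured differently.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the code from convcs.c): read bytes from fp
 * into buf until the buffer is full, getc returns EOF, or a newline has
 * been stored. Returns the number of bytes stored, so NUL bytes in the
 * input are counted like any other byte.
 */
static size_t read_chunk(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int c;

    while (len < size && (c = getc(fp)) != EOF) {
        buf[len++] = (char)c;
        if (c == '\n')
            break;      /* hand a complete line back promptly */
    }
    return len;
}
```

With a non-zero buffer size, a return value of 0 means getc returned EOF before any byte was read; the caller can use feof or ferror to tell end of input from a read error. Because the length is counted explicitly, the caller never needs strlen.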

 convcs.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)


