simon-git: charset (master): Simon Tatham
Commits to Tartarus hosted VCS
tartarus-commits at lists.tartarus.org
Fri Jan 4 09:52:43 GMT 2019
TL;DR:
653de25 convcs: read with getc, not fgets or fread.
Repository: https://git.tartarus.org/simon/charset.git
On the web: https://git.tartarus.org/?p=simon/charset.git
Branch updated: master
Committer: Simon Tatham <anakin at pobox.com>
Date: 2019-01-04 09:52:43
commit 653de2598e7b1607e13ece74d3b596038901c0d2
web diff https://git.tartarus.org/?p=simon/charset.git;a=commitdiff;h=653de2598e7b1607e13ece74d3b596038901c0d2;hp=2bad719f0a96d99567a5abe9fb36e0e8e632213d
Author: Simon Tatham <anakin at pobox.com>
Date: Fri Jan 4 09:46:51 2019 +0000
convcs: read with getc, not fgets or fread.
In commit d97d7fdcb I switched over from fread to fgets for the sake
of better interactive behaviour. But I'd forgotten that fgets has its
own downside: it doesn't return the length of the data it read, so you
have to find that out by calling strlen() on the output buffer, which
fails if any NUL bytes appeared in the input. So convcs has been
unable to handle UTF-16 input for a while.
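To illustrate the failure described above, here is a minimal sketch. It is not code from convcs.c; the helper name fgets_apparent_length is hypothetical. It writes raw bytes to a temporary file, reads them back with fgets, and returns the length that strlen() reports.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not part of convcs.c).
 * Writes `len` raw bytes to a temporary file, reads one line back with
 * fgets, and returns the length strlen() reports for the buffer.
 * Returns (size_t)-1 if the temporary file cannot be created. */
static size_t fgets_apparent_length(const char *data, size_t len)
{
    char buf[64];
    size_t apparent = 0;
    FILE *fp = tmpfile();          /* opened in binary update mode */

    if (!fp)
        return (size_t)-1;

    if (fwrite(data, 1, len, fp) == len) {
        rewind(fp);
        /* fgets stores the bytes but does not say how many it read,
         * so strlen() is the only way to find out, and strlen() stops
         * at the first NUL byte. */
        if (fgets(buf, sizeof buf, fp))
            apparent = strlen(buf);
    }

    fclose(fp);
    return apparent;
}
```

Given the UTF-16LE encoding of "hi\n" (the six bytes 68 00 69 00 0A 00), fgets stores five bytes up to and including the newline, but strlen() sees only the first one, so everything after the first NUL byte is lost.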
Ideally the replacement read function would have semantics like Unix
read(2): block until at least one byte is available, then read as much
data (up to the output buffer size) as is available without further
blocking, and return it.
But I don't want to introduce a whole layer of portability annoyance
just for this, so instead I've written a manual loop on getc,
terminating on any of a full buffer, EOF or \n. That should combine
the NUL-tolerance of fread with the newline handling of fgets.
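The loop described above might look roughly like the following sketch. This is an assumption about its general shape, not the code actually committed to convcs.c; the function name read_chunk and its signature are hypothetical.

```c
#include <stdio.h>

/* Hypothetical sketch, not the function from convcs.c.
 * Reads bytes from `fp` into `buf` until the buffer is full, EOF (or a
 * read error) occurs, or a newline has been stored. Returns the number
 * of bytes stored, so embedded NUL bytes are counted correctly.
 * A return value of 0 (with size > 0) means EOF or error. */
static size_t read_chunk(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;

    while (n < size) {
        int c = getc(fp);
        if (c == EOF)
            break;                 /* end of input or read error */
        buf[n++] = (char)c;
        if (c == '\n')
            break;                 /* return promptly at end of line */
    }

    return n;
}
```

Because the length is returned explicitly, the caller never needs strlen(), and stopping at each newline keeps line-at-a-time interactive input responsive. Note that for UTF-16 input a chunk can end partway through a two-byte code unit, so this sketch assumes the consumer can accept a character split across calls.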
convcs.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)