simon-git: halibut (main): Simon Tatham
Commits to Tartarus hosted VCS
tartarus-commits at lists.tartarus.org
Fri Oct 15 18:30:51 BST 2021
TL;DR:
ef87a02 Fix null dereference on an empty index term.
67653b3 bk_info: ensure node names are not NULL.
e209471 Disallow the zero Unicode character in all input.
ce45faf Disallow \u with no following hex digits at all.
Repository: https://git.tartarus.org/simon/halibut.git
On the web: https://git.tartarus.org/?p=simon/halibut.git
Branch updated: main
Committer: Simon Tatham <anakin at pobox.com>
Date: 2021-10-15 18:30:51
commit ef87a02bbe8f5865b0d43a4fafde82431b3102c9
web diff https://git.tartarus.org/?p=simon/halibut.git;a=commitdiff;h=ef87a02bbe8f5865b0d43a4fafde82431b3102c9;hp=884fc67a2feda365ae0c4dfa5b462806a8a4c913
Author: Simon Tatham <anakin at pobox.com>
Date: Fri Oct 15 18:01:27 2021 +0100
Fix null dereference on an empty index term.
When read_file processes the case-insensitive index directive \ii, it
calls ustrlow() on indexstr.text (the rdstring where the text inside
that directive has been accumulated), and then appends a zero byte to
pass it to index_merge.
But if you write a completely empty \ii{}, then indexstr.text was still
NULL at the point where ustrlow() tried to dereference it. An empty
index term is not a very useful thing to write, but we can at least not
segfault. Ensure we've allocated a string by that point, even if it was
empty.
input.c | 1 +
1 file changed, 1 insertion(+)
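The one-line change itself is not included in this digest. As a rough
sketch of the general pattern only, assuming a growable wide-string
buffer whose text pointer stays NULL until the first append: struct
wbuf, wbuf_add, wbuf_ensure and lower_in_place below are hypothetical
stand-ins, not Halibut's own rdstring or ustrlow().

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical growable wide-string buffer. text is NULL until
 * something has been appended or wbuf_ensure() has been called. */
struct wbuf {
    size_t pos, size;
    wchar_t *text;
};

/* Append one character, growing the buffer as needed and keeping it
 * zero-terminated. */
void wbuf_add(struct wbuf *b, wchar_t c)
{
    if (b->pos + 1 >= b->size) {
        size_t newsize = b->pos + 64;
        wchar_t *p = realloc(b->text, newsize * sizeof(wchar_t));
        if (p == NULL)
            abort();
        b->text = p;
        b->size = newsize;
    }
    b->text[b->pos++] = c;
    b->text[b->pos] = L'\0';
}

/* Guarantee that text points at a valid (possibly empty) string. */
void wbuf_ensure(struct wbuf *b)
{
    if (b->text == NULL) {
        b->text = malloc(sizeof(wchar_t));
        if (b->text == NULL)
            abort();
        b->text[0] = L'\0';
        b->size = 1;
    }
}

/* Lower-case a zero-terminated wide string in place. Dereferences s
 * immediately, so a NULL argument would crash without the assert. */
void lower_in_place(wchar_t *s)
{
    assert(s != NULL);
    for (; *s != L'\0'; s++)
        *s = (wchar_t)towlower((wint_t)*s);
}
```

Calling wbuf_ensure() before lower_in_place() makes the empty case
behave like any other: the loop simply sees a terminator straight away.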
commit 67653b3c10fd3986b4212a6f41a5d180b5073631
web diff https://git.tartarus.org/?p=simon/halibut.git;a=commitdiff;h=67653b3c10fd3986b4212a6f41a5d180b5073631;hp=ef87a02bbe8f5865b0d43a4fafde82431b3102c9
Author: Simon Tatham <anakin at pobox.com>
Date: Fri Oct 15 18:01:57 2021 +0100
bk_info: ensure node names are not NULL.
If a node name accidentally comes out empty, we never accumulated
anything into the info_data, so the pointer never got allocated. Now
we accumulate a zero-length string at least, so we end up with ""
instead of NULL.
The only case I know of where this can happen is if an unexpected
string terminator appears due to a \0 in the input. That shouldn't
really have got that far in the first place, so I'll outlaw it in the
next commit.
bk_info.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
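A minimal sketch of "accumulate a zero-length string at least", assuming
a byte-string accumulator that allocates lazily. struct acc and
acc_append are hypothetical names for illustration and are not taken
from bk_info.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator; text is NULL until the first append. */
struct acc {
    size_t len;
    char *text;
};

/* Append n bytes of s. Even when n == 0 this allocates room for the
 * terminator, so text is never NULL once any append has happened. */
void acc_append(struct acc *a, const char *s, size_t n)
{
    char *p;

    assert(s != NULL);
    p = realloc(a->text, a->len + n + 1);
    if (p == NULL)
        abort();
    memcpy(p + a->len, s, n);
    a->len += n;
    p[a->len] = '\0';
    a->text = p;
}
```

With this shape, appending "" to a fresh accumulator leaves text
pointing at an empty string, which later code can pass to strcmp() or
strlen() without a NULL check.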
commit e20947122d1224181e62416030bcd7a2f11749c6
web diff https://git.tartarus.org/?p=simon/halibut.git;a=commitdiff;h=e20947122d1224181e62416030bcd7a2f11749c6;hp=67653b3c10fd3986b4212a6f41a5d180b5073631
Author: Simon Tatham <anakin at pobox.com>
Date: Fri Oct 15 18:20:57 2021 +0100
Disallow the zero Unicode character in all input.
Halibut works internally with standard C-style null-terminated strings
(or rather wide strings), so L'\0' appearing unexpectedly in the input
can cause all kinds of havoc.
It would be nice to redo all the string processing using (pointer,
length) pairs and become robust against that, but I don't think it's
realistic without a major rewrite. Zero characters have no actual use
that I can see, so a simpler fix is to just outlaw them completely.
This applies to a direct \0 appearing in the input file, and also to
any sneaky attempts to enter one via \u0000.
error.c | 6 ++++++
halibut.h | 2 ++
input.c | 16 +++++++++++++++-
3 files changed, 23 insertions(+), 1 deletion(-)
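To illustrate why the check has to happen at input time, here is a
small sketch that scans a buffer of decoded characters for L'\0'. It
takes an explicit length, because a scan that relied on the terminator
would stop at the very character it is looking for. find_zero_char is a
hypothetical helper, not a function from input.c.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Return the index of the first zero character among the first len
 * elements of buf, or -1 if there is none. */
long find_zero_char(const wchar_t *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (buf[i] == L'\0')
            return (long)i;
    return -1;
}
```

A caller could report an error at the returned position instead of
passing the data on to code that treats it as a terminated string.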
commit ce45faf832b4cf4113198727117c8dac1be95128
web diff https://git.tartarus.org/?p=simon/halibut.git;a=commitdiff;h=ce45faf832b4cf4113198727117c8dac1be95128;hp=e20947122d1224181e62416030bcd7a2f11749c6
Author: Simon Tatham <anakin at pobox.com>
Date: Fri Oct 15 18:11:22 2021 +0100
Disallow \u with no following hex digits at all.
That gets a different error message from \u0000, because it might
reasonably have been intended as something other than \u (e.g. \U).
input.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)