simon-git: charset (master): Simon Tatham
Commits to Tartarus hosted VCS
tartarus-commits at lists.tartarus.org
Tue Dec 26 07:47:48 GMT 2017
TL;DR:
d33e458 New character sets: ISO/IEC 6937 and a variant.
0a45f74 Alternative CMake-based build script.
0a81212 Add extern "C" in charset.h.
Repository: https://git.tartarus.org/simon/charset.git
On the web: https://git.tartarus.org/?p=simon/charset.git
Branch updated: master
Committer: Simon Tatham <anakin at pobox.com>
Date: 2017-12-26 07:47:48
commit d33e45816f8b3e6bc1ede926514eb780de9382ed
web diff https://git.tartarus.org/?p=simon/charset.git;a=commitdiff;h=d33e45816f8b3e6bc1ede926514eb780de9382ed;hp=8718813d32346b14917df1348b61ba3ad329ddd5
Author: Simon Tatham <anakin at pobox.com>
Date: Tue Dec 26 07:46:24 2017 +0000
New character sets: ISO/IEC 6937 and a variant.
These are _mostly_ single-byte character sets, except that bytes in
the 0xC0-0xCF range are introducer characters for two-byte
encodings of accented letters - but you'd be forgiven for mistaking
them for something more like combining characters, since each
introducer character consistently adds the same diacritic to a
(defined) selection of permissible follow-up letters.
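As a rough illustration of the introducer scheme described above, here is a minimal C sketch of decoding one character. It is not the code in iso6937.c: the names sketch_pair, sketch_pairs, sketch_decode and SKETCH_BAD are hypothetical, the table is a tiny subset whose values are assumed from published ISO 6937 tables, and passing bytes below 0x80 straight through as ASCII is a simplification. Returning U+FFFD and consuming only the introducer when the follow-up letter is not permitted is likewise just one possible choice.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative subset only: a few (introducer, base letter) pairs.
 * The values are assumptions based on published ISO 6937 tables,
 * not taken from iso6937.c. */
struct sketch_pair {
    unsigned char introducer;   /* byte in 0xC0-0xCF naming a diacritic */
    unsigned char base;         /* permitted follow-up letter */
    unsigned long unicode;      /* resulting precomposed code point */
};

static const struct sketch_pair sketch_pairs[] = {
    { 0xC1, 'a', 0x00E0 },      /* grave + a */
    { 0xC1, 'e', 0x00E8 },      /* grave + e */
    { 0xC2, 'a', 0x00E1 },      /* acute + a */
    { 0xC2, 'e', 0x00E9 },      /* acute + e */
    { 0xC8, 'u', 0x00FC },      /* diaeresis + u */
};

#define SKETCH_BAD 0xFFFDUL     /* U+FFFD REPLACEMENT CHARACTER */

/* Decode one character from buf[0..len). Returns the number of bytes
 * consumed (0 if more input is needed) and stores the result in *out. */
static size_t sketch_decode(const unsigned char *buf, size_t len,
                            unsigned long *out)
{
    size_t i;

    assert(out != NULL);
    if (len == 0)
        return 0;

    if (buf[0] >= 0xC0 && buf[0] <= 0xCF) {
        if (len < 2)
            return 0;           /* introducer seen, follow-up not yet here */
        for (i = 0; i < sizeof(sketch_pairs) / sizeof(sketch_pairs[0]); i++) {
            if (sketch_pairs[i].introducer == buf[0] &&
                sketch_pairs[i].base == buf[1]) {
                *out = sketch_pairs[i].unicode;
                return 2;       /* two-byte encoding of an accented letter */
            }
        }
        *out = SKETCH_BAD;      /* follow-up not in the defined selection */
        return 1;
    }

    /* Simplification: only ASCII passes through in this sketch. */
    *out = buf[0] < 0x80 ? (unsigned long)buf[0] : SKETCH_BAD;
    return 1;
}
```

The lookup is keyed on the pair rather than composing a diacritic with an arbitrary letter, which is the difference from true combining characters: only the listed follow-ups are valid.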
Here I support ISO 6937 itself (assuming Wikipedia's transcription of
it to be accurate), and also a variant form I found in EN 300 468 (one
of the standards for DVB digital broadcast television) which is used
in broadcast episode-guide metadata and extends the standard version
of the character set by adding the euro sign.
To make it easier to handle things that are mostly single-byte but
with special cases, I've extended sbcsgen.pl to be able to output a
full sbcs_data structure containing two-way translation tables, but
_not_ also generate a charset_spec and an ENUM_CHARSET to match them.
This partial output is triggered by replacing the keyword 'charset'
with 'tables' at the start of an SBCS definition section.
Makefile.am | 8 +-
charset.h | 2 +
enum.h | 1 +
iso6937.c | 336 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
localenc.c | 5 +
sbcs.dat | 55 ++++++++++
sbcsgen.pl | 19 ++--
7 files changed, 414 insertions(+), 12 deletions(-)
commit 0a45f74685aa1465b279d63c9b5f053bbeffbd84
web diff https://git.tartarus.org/?p=simon/charset.git;a=commitdiff;h=0a45f74685aa1465b279d63c9b5f053bbeffbd84;hp=d33e45816f8b3e6bc1ede926514eb780de9382ed
Author: Simon Tatham <anakin at pobox.com>
Date: Tue Dec 26 07:46:24 2017 +0000
Alternative CMake-based build script.
This is subsidiary to the autotools one, in the sense that it works by
_reading_ Makefile.am to get the lists of source files, so that I
don't have to maintain those in more than one place. But it means that
now CMake-based superprojects as well as autotools-based ones can
include libcharset as a subdirectory or git submodule, and incorporate
libcharset's build-time needs into their own just by saying something
like this:
    add_subdirectory(charset EXCLUDE_FROM_ALL)
    target_include_directories(some_target PRIVATE charset)
    target_link_libraries(some_target charset)
.gitignore | 2 ++
CMakeLists.txt | 39 +++++++++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
commit 0a81212ae48131db761890fb058111ae2f2ce59f
web diff https://git.tartarus.org/?p=simon/charset.git;a=commitdiff;h=0a81212ae48131db761890fb058111ae2f2ce59f;hp=0a45f74685aa1465b279d63c9b5f053bbeffbd84
Author: Simon Tatham <anakin at pobox.com>
Date: Tue Dec 26 07:47:31 2017 +0000
Add extern "C" in charset.h.
Now I can include it in a C++ program and still successfully link and
run against a libcharset static library compiled in the normal way.
charset.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
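For readers unfamiliar with the idiom behind commit 0a81212, the conventional shape of an extern "C" guard is sketched below. The actual 14 lines added to charset.h are not shown in this message, so this is only the usual pattern; SKETCH_API_H and sketch_add are hypothetical names, and the header and its implementation are shown in one file purely to keep the example self-contained.

```c
/* --- what would normally live in a header, e.g. sketch_api.h --- */
#ifndef SKETCH_API_H
#define SKETCH_API_H

#ifdef __cplusplus
extern "C" {            /* seen only by C++ compilers: use C linkage */
#endif

/* Hypothetical declaration standing in for a library's public API. */
int sketch_add(int a, int b);

#ifdef __cplusplus
}                       /* close the extern "C" block */
#endif

#endif /* SKETCH_API_H */

/* --- what would normally live in a .c file compiled as plain C --- */
int sketch_add(int a, int b)
{
    return a + b;
}
```

A C compiler never defines __cplusplus, so it sees only the plain declarations. A C++ compiler sees them wrapped in extern "C", which stops it from name-mangling the symbols and lets the link against a library built as C succeed.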