simon-git: charset (main): Simon Tatham

Commits to Tartarus hosted VCS tartarus-commits at lists.tartarus.org
Sat Apr 25 10:32:18 BST 2026


TL;DR:
  cf7c952 Add a full version of CP437, including 01-1F and 7F.
  d05f1df convcs: add a --passthrough option for control characters.

Repository:     https://git.tartarus.org/simon/charset.git
On the web:     https://git.tartarus.org/?p=simon/charset.git
Branch updated: main
Committer:      Simon Tatham <anakin at pobox.com>
Date:           2026-04-25 10:32:18

commit cf7c952c99e9a6b1def4e6fb3507fbdbac0f9327
web diff https://git.tartarus.org/?p=simon/charset.git;a=commitdiff;h=cf7c952c99e9a6b1def4e6fb3507fbdbac0f9327;hp=63b35f54a8b914e590a7cab617f729c46c4be6c4
Author: Simon Tatham <anakin at pobox.com>
Date:   Sat Apr 25 10:27:16 2026 +0100

    Add a full version of CP437, including 01-1F and 7F.
    
    The unicode.org translation table for CP437 doesn't include the
    printable glyphs that the full character set puts in these positions.
    But sometimes you want them anyway, so here's a version of CP437 that
    uses them.

 charset.h  |  1 +
 localenc.c |  1 +
 sbcs.dat   | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+)
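
Illustration only, not part of the commit: a short Python check of the behaviour the message describes. It assumes that Python's built-in "cp437" codec follows the unicode.org-style table, so C0 bytes and 7F decode to control characters, not to the glyphs a full CP437 font would show. The codec name and the chosen sample bytes are the only things introduced here; nothing below uses the charset library itself.

```python
# Sketch: how a unicode.org-style CP437 table treats 01-1F and 7F.
# Assumption: Python's "cp437" codec is built from that style of table.

sample = bytes([0x01, 0x0D, 0x7F, 0x80])
decoded = sample.decode("cp437")

# Bytes 01, 0D and 7F come back as the same-numbered control characters,
# while 80 (outside the control range) becomes a printable letter.
for byte, ch in zip(sample, decoded):
    print(f"{byte:02X} -> U+{ord(ch):04X}")
```

In a full CP437 font, 01 is drawn as a smiley face and 7F as a small house, which is the gap the new charset is meant to fill.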

commit d05f1df85931e68e402569e3261eb0911963a14d
web diff https://git.tartarus.org/?p=simon/charset.git;a=commitdiff;h=d05f1df85931e68e402569e3261eb0911963a14d;hp=cf7c952c99e9a6b1def4e6fb3507fbdbac0f9327
Author: Simon Tatham <anakin at pobox.com>
Date:   Sat Apr 25 10:29:12 2026 +0100

    convcs: add a --passthrough option for control characters.
    
    If you're converting a text file from a character set such as the new
    CP437full, or indeed the older ISO-8859-1-X11, then you might need to
    strike some kind of a balance between passing through some C0 byte
    values as control characters (newline in particular might be hard to
    do without), and translating others as specified by the charset.
    
    Rather than try to make a huge collection of intermediate versions of
    CP437 guessing at which subsets of control characters anyone might
    need, I've implemented this as a command-line option in the convcs
    tool. So now you can say, for example,
    
      convcs --passthrough=10,13 cp437full utf-8
    
    and get _most_ control characters converted into the Unicode
    representations of their CP437 glyphs, but pass CR and LF through
    unchanged.

 convcs.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)
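
Illustration only, not the code in convcs.c: a minimal Python sketch of the passthrough idea described in the commit message above. The function name decode_cp437_full, the C0_GLYPHS table and the passthrough parameter are all hypothetical. The glyph assignments are the conventional CP437 ones and may not match the new sbcs.dat entry exactly. Bytes outside 01-1F and 7F fall back to Python's built-in "cp437" codec.

```python
# Hypothetical sketch of "translate control bytes to glyphs, except a
# chosen few". Not the convcs implementation.

# Conventional CP437 glyphs for bytes 01-1F, in byte order (assumed table).
C0_GLYPHS = (
    "\u263a\u263b\u2665\u2666\u2663\u2660\u2022\u25d8"  # 01-08
    "\u25cb\u25d9\u2642\u2640\u266a\u266b\u263c\u25ba"  # 09-10
    "\u25c4\u2195\u203c\u00b6\u00a7\u25ac\u21a8\u2191"  # 11-18
    "\u2193\u2192\u2190\u221f\u2194\u25b2\u25bc"        # 19-1F
)
DEL_GLYPH = "\u2302"  # conventional glyph for 7F (a small house)


def decode_cp437_full(data, passthrough=frozenset()):
    """Decode bytes as 'full' CP437, leaving bytes in `passthrough` alone."""
    out = []
    for b in data:
        if b in passthrough:
            out.append(chr(b))            # keep as a control character
        elif 0x01 <= b <= 0x1F:
            out.append(C0_GLYPHS[b - 1])  # printable glyph for a C0 byte
        elif b == 0x7F:
            out.append(DEL_GLYPH)
        else:
            out.append(bytes([b]).decode("cp437"))
    return "".join(out)


# Pass LF (10) and CR (13) through, mirroring --passthrough=10,13.
text = decode_cp437_full(b"\x01 hi\r\n", passthrough={10, 13})
```

With the passthrough set shown, the 01 byte becomes U+263A and the line ending survives as CR LF. With an empty set, the same CR and LF bytes would instead decode to U+266A and U+25D9.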


