simon-git: cvt-utf8 (master): cvt-utf8.git
Commits to Tartarus hosted VCS
tartarus-commits at lists.tartarus.org
Sat Jan 21 19:06:50 GMT 2017
TL;DR:
3551777 Rewrite the Unihan zip-file untangling.
Repository: https://git.tartarus.org/simon/cvt-utf8.git
On the web: https://git.tartarus.org/?p=simon/cvt-utf8.git
Branch updated: master
Committer: cvt-utf8.git
Date: 2017-01-21 19:06:50
commit 35517774facc527c523730220d92905003a1059d
web diff https://git.tartarus.org/?p=simon/cvt-utf8.git;a=commitdiff;h=35517774facc527c523730220d92905003a1059d;hp=0013973562ef5099eab40802aec4f8f5df93ce99
Author: Simon Tatham <anakin at pobox.com>
Date: Sat Jan 21 19:02:04 2017 +0000
Rewrite the Unihan zip-file untangling.
Jacob Nevins pointed out that unicode.org has changed their zip file
organisation so as to divide up the giant Unihan.txt into multiple
files. So we now need to iterate over all members of the zip file, not
just the first one.
The simplest way to achieve that, in turn, is to throw out my old code
entirely. That code could unpack a zip file presented as a stream,
even after its first character had already been read. Its replacement
is the really simple approach of slurping the whole file into memory
and passing it to the standard Python zipfile module. I think these
days that's not an unreasonable demand on the computer running this
build step.
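For illustration only, the slurp-and-iterate approach described above
might look roughly like the following sketch. This is not the code from
the commit: the function name iter_zip_member_lines and the choice to
yield (member name, line) pairs are assumptions made for this example.
Only io.BytesIO and the standard zipfile module are relied on.

```python
import io
import zipfile


def iter_zip_member_lines(stream):
    """Yield (member_name, line) for each line of each zip member.

    Hypothetical helper, not taken from cvt-utf8. `stream` is any
    binary file-like object; it does not need to be seekable, because
    its whole content is read into memory before zipfile sees it.
    """
    data = stream.read()  # slurp the entire archive into memory
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():  # every member, not just the first
            with zf.open(name) as member:
                for raw in member:
                    # Assumes the members are UTF-8 text files.
                    yield name, raw.decode("utf-8").rstrip("\r\n")
```

Wrapping the bytes in io.BytesIO gives zipfile the seekable object it
needs, so the input could come from a pipe or a download as easily as
from a file on disk, at the cost of holding the whole archive in memory.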
cvt-utf8 | 88 ++++++++++++++++------------------------------------------------
1 file changed, 21 insertions(+), 67 deletions(-)