simon-git: putty (main): Simon Tatham

Commits to Tartarus hosted VCS tartarus-commits at lists.tartarus.org
Thu Dec 24 15:51:11 GMT 2020


TL;DR:
  d13adebe uxutils.c: move some definitions into a header file.
  092c51af uxutils.c: add special case for M1 macOS.
  43cdc3d9 Tidy up arithmetic in the SHA-512 implementation.
  c6d921ad Reorganise SHA-512 to match SHA-256.
  a9763ce4 Hardware-accelerated SHA-512 on the Arm architecture.

Repository:     https://git.tartarus.org/simon/putty.git
On the web:     https://git.tartarus.org/?p=simon/putty.git
Branch updated: main
Committer:      Simon Tatham <anakin at pobox.com>
Date:           2020-12-24 15:51:11

commit d13adebe1ab7c6c7f6c7d2633851a2f47a596556
web diff https://git.tartarus.org/?p=simon/putty.git;a=commitdiff;h=d13adebe1ab7c6c7f6c7d2633851a2f47a596556;hp=e9e6c03c6eba045f69ec402a528254d304b857d7
Author: Simon Tatham <anakin at pobox.com>
Date:   Thu Dec 24 09:34:13 2020 +0000

    uxutils.c: move some definitions into a header file.
    
    If the autoconf/ifdef system ends up taking the trivial branch through
    all the Arm-architecture ifdefs, then we define the always-fail
    version of getauxval as a 'static inline' function, and then (because
    none of our desired HWCAP_FOO values is defined at all) never call it.
    This leads to a compiler warning because we defined a static function
    and never called it - i.e. at the default -Werror, a build failure.
    
    Of course it's perfectly sensible to define a static inline function
    that never gets called! Header files do it all the time, and nobody is
    expected to ensure that if they include a header file then they take
    care to refer to every static inline function it defines.
    
    But if the definition is in the _source_ file rather than a header
    file, then clang (in particular on macOS) will give a warning. So the
    easy solution is to move the inline definitions of getauxval into a
    header file, which suppresses the warning without requiring me to faff
    about with further ifdefs to make the definitions conditional on at
    least one use.
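
As a rough illustration of the header approach described above, here is a minimal sketch. The header name, the EXAMPLE_HAVE_GETAUXVAL macro and the example_getauxval wrapper are all hypothetical and are not taken from unix/uxutils.h; the only real API assumed is getauxval() from <sys/auxv.h>.

```c
/* Sketch of a header such as a hypothetical "example_auxv.h". */
#ifndef EXAMPLE_AUXV_H
#define EXAMPLE_AUXV_H

#if defined EXAMPLE_HAVE_GETAUXVAL   /* hypothetical configure-time macro */
#include <sys/auxv.h>
/* Real aux vector available: forward to the system function. */
static inline unsigned long example_getauxval(unsigned long type)
{
    return getauxval(type);
}
#else
/*
 * Trivial branch: an always-fail version. Because this lives in a
 * header, a source file that includes it without ever calling it
 * should not trigger an unused-function warning.
 */
static inline unsigned long example_getauxval(unsigned long type)
{
    (void)type;
    return 0;
}
#endif

#endif /* EXAMPLE_AUXV_H */
```

With the hypothetical macro left undefined, every query reports 0, i.e. no capabilities found.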

 unix/uxutils.c | 30 ++----------------------------
 unix/uxutils.h | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 28 deletions(-)

commit 092c51afed5d166aa3fb7da84a695d50a3db9cdb
web diff https://git.tartarus.org/?p=simon/putty.git;a=commitdiff;h=092c51afed5d166aa3fb7da84a695d50a3db9cdb;hp=d13adebe1ab7c6c7f6c7d2633851a2f47a596556
Author: Simon Tatham <anakin at pobox.com>
Date:   Thu Dec 24 10:04:08 2020 +0000

    uxutils.c: add special case for M1 macOS.
    
    The M1 chip in the new range of Macs includes the crypto extension
    that permits AES, SHA-1 and SHA-256 acceleration. But you can't find
    that out by querying the ELF aux vector, because macOS isn't even
    ELF-based at all, so there isn't an ELF aux vector, and no web search
    I've tried has turned up any Mach-O thing obviously analogous to it.
    
    Running 'sysctl -a' does show some flags indicating CPU architecture
    extensions, but they're more advanced ones than this. So I think we
    have to assume that if we're on the new M1 macOS at all, then we have
    the basic crypto extension available.
    
    Accordingly, I've added a special case to all the query functions that
    simply returns true if __APPLE__ is defined.
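
The special case might look something like the sketch below. The function name and its parameter are hypothetical: 'auxv_says_yes' stands in for whatever an ELF aux vector lookup would have reported on a platform that has one, and the sketch assumes the surrounding code is only compiled for Arm targets.

```c
#include <stdbool.h>

/* Hypothetical query function, not the actual code in unix/uxutils.c. */
static bool example_has_arm_crypto(bool auxv_says_yes)
{
#if defined __APPLE__
    /* No aux vector on macOS: assume the basic crypto extension. */
    (void)auxv_says_yes;
    return true;
#else
    /* Elsewhere, trust the result of the aux vector query. */
    return auxv_says_yes;
#endif
}
```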

 unix/uxutils.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

commit 43cdc3d9104d6da4f56bb4f136d68736a1873b71
web diff https://git.tartarus.org/?p=simon/putty.git;a=commitdiff;h=43cdc3d9104d6da4f56bb4f136d68736a1873b71;hp=092c51afed5d166aa3fb7da84a695d50a3db9cdb
Author: Simon Tatham <anakin at pobox.com>
Date:   Thu Dec 24 10:52:48 2020 +0000

    Tidy up arithmetic in the SHA-512 implementation.
    
    It was written in an awkward roundabout way involving all the
    arithmetic being done in horrible macros looking like assembler
    instructions, and lots of explicit temp variables. That's because,
    when I originally wrote it, I needed it to compile on platforms
    without a 64-bit integer type.
    
    In commit a647f2ba11 I switched it over to using uint64_t, but I did
    it in a way that made minimal change to the code structure, by
    rewriting the insides of those macros to contain ordinary uint64_t
    arithmetic instead of faffing about with 32-bit halves. So it worked,
    but it still looked disgusting.
    
    Now I've reworked it so that individual arithmetic operations are
    written directly in the sensible way, and the more complicated
    SHA-specific operations are written as inline functions instead of
    macros.
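
For readers unfamiliar with the style being described, the SHA-specific operations written as inline functions over uint64_t could look roughly like this. The rotation and shift counts are the standard SHA-512 ones from FIPS 180-4; the ex_ names are illustrative and not necessarily those used in sshsh512.c.

```c
#include <stdint.h>

/* Rotate right. Valid for 0 < n < 64, which covers every rotation used here. */
static inline uint64_t ex_ror(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* Choice: each bit of x selects the corresponding bit of y or z. */
static inline uint64_t ex_Ch(uint64_t x, uint64_t y, uint64_t z)
{
    return (x & y) ^ (~x & z);
}

/* Majority: each output bit is the majority vote of the three inputs. */
static inline uint64_t ex_Maj(uint64_t x, uint64_t y, uint64_t z)
{
    return (x & y) ^ (x & z) ^ (y & z);
}

/* The two "big sigma" functions used in the round update. */
static inline uint64_t ex_Sigma0(uint64_t x)
{
    return ex_ror(x, 28) ^ ex_ror(x, 34) ^ ex_ror(x, 39);
}
static inline uint64_t ex_Sigma1(uint64_t x)
{
    return ex_ror(x, 14) ^ ex_ror(x, 18) ^ ex_ror(x, 41);
}

/* The two "small sigma" functions used in the message schedule. */
static inline uint64_t ex_sigma0(uint64_t x)
{
    return ex_ror(x, 1) ^ ex_ror(x, 8) ^ (x >> 7);
}
static inline uint64_t ex_sigma1(uint64_t x)
{
    return ex_ror(x, 19) ^ ex_ror(x, 61) ^ (x >> 6);
}
```

Unlike macros, these evaluate each argument exactly once and are type-checked, so no explicit temporaries are needed at the call site.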

 sshsh512.c | 241 ++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 117 insertions(+), 124 deletions(-)

commit c6d921add5a8cdad4ac47571063f2a93a880cb2a
web diff https://git.tartarus.org/?p=simon/putty.git;a=commitdiff;h=c6d921add5a8cdad4ac47571063f2a93a880cb2a;hp=43cdc3d9104d6da4f56bb4f136d68736a1873b71
Author: Simon Tatham <anakin at pobox.com>
Date:   Thu Dec 24 15:20:03 2020 +0000

    Reorganise SHA-512 to match SHA-256.
    
    This builds on the previous refactoring by reworking the SHA-512
    vtables and block layer to look more like the SHA-256 version, in
    which the block and padding structure is a subroutine of the top-level
    vtable methods instead of an owning layer around them.
    
    This also organises the code in a way that makes it easy to drop in
    hardware-accelerated versions alongside it: the block layer and the
    big arrays of constants are now nicely separate from the inner
    block-transform part.
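
One way to picture the separation described above is a block layer that only buffers input and hands complete 128-byte blocks to whichever transform it is given. This is a sketch under assumed names (ex_block_buffer, ex_block_write, ex_count_blocks are all hypothetical), not the structure actually used in sshsh512.c; the transform here is a placeholder that merely counts blocks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_BLOCK_SIZE 128   /* SHA-512 block size in bytes */

/* Inner block transform: software or hardware versions can share this type. */
typedef void (*ex_transform_fn)(uint64_t state[8], const uint8_t *block);

typedef struct {
    uint8_t buf[EX_BLOCK_SIZE];
    size_t used;            /* bytes currently held in buf */
} ex_block_buffer;

/*
 * Block layer as a subroutine: a top-level write method would call this.
 * Returns the number of complete blocks passed to 'transform'.
 */
static size_t ex_block_write(ex_block_buffer *bb, uint64_t state[8],
                             ex_transform_fn transform,
                             const void *data, size_t len)
{
    const uint8_t *p = data;
    size_t blocks = 0;

    while (len > 0) {
        size_t chunk = EX_BLOCK_SIZE - bb->used;
        if (chunk > len)
            chunk = len;
        memcpy(bb->buf + bb->used, p, chunk);
        bb->used += chunk;
        p += chunk;
        len -= chunk;

        if (bb->used == EX_BLOCK_SIZE) {
            transform(state, bb->buf);
            bb->used = 0;
            blocks++;
        }
    }
    return blocks;
}

/* Placeholder transform for demonstration only: counts blocks in state[0]. */
static void ex_count_blocks(uint64_t state[8], const uint8_t *block)
{
    (void)block;
    state[0]++;
}
```

Because the buffering code never looks inside the transform, swapping in an accelerated version means passing a different function pointer.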

 ssh.h      |   1 +
 sshsh512.c | 444 +++++++++++++++++++++++++++----------------------------------
 2 files changed, 197 insertions(+), 248 deletions(-)

commit a9763ce4ed8e01d4351c0873529b3bf88d2653d0
web diff https://git.tartarus.org/?p=simon/putty.git;a=commitdiff;h=a9763ce4ed8e01d4351c0873529b3bf88d2653d0;hp=c6d921add5a8cdad4ac47571063f2a93a880cb2a
Author: Simon Tatham <anakin at pobox.com>
Date:   Thu Dec 24 11:40:15 2020 +0000

    Hardware-accelerated SHA-512 on the Arm architecture.
    
    The NEON support for SHA-512 acceleration looks very like SHA-256,
    with a pair of chained instructions to generate a 128-bit vector
    register full of message schedule, and another pair to update the hash
    state based on those. But since SHA-512 is twice as big in all
    dimensions, those four instructions between them only account for two
    rounds of it, in place of four rounds of SHA-256.
    
    Also, it's a tighter squeeze to fit all the data needed by those
    instructions into their limited number of register operands. The NEON
    SHA-256 implementation was able to keep its hash state and message
    schedule stored as 128-bit vectors and then pass combinations of those
    vectors directly to the instructions that did the work; for SHA-512,
    in several places you have to make one of the input operands to the
    main instruction by combining two halves of different vectors from
    your existing state. But that operation is a quick single EXT
    instruction, so no trouble.
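
The "combining two halves of different vectors" step can be modelled in portable C as below. This is only a model of the data movement, with a hypothetical ex_u64x2 type standing in for a 128-bit NEON register holding two 64-bit lanes; as far as I know it matches what the ACLE intrinsic vextq_u64(a, b, 1) computes, but treat that correspondence as an assumption.

```c
#include <stdint.h>

/* Stand-in for a 128-bit vector register of two 64-bit lanes. */
typedef struct {
    uint64_t lane[2];
} ex_u64x2;

/*
 * Model of EXT with an offset of one 64-bit lane: the result takes the
 * high half of 'a' as its low lane and the low half of 'b' as its
 * high lane.
 */
static ex_u64x2 ex_ext1(ex_u64x2 a, ex_u64x2 b)
{
    ex_u64x2 r;
    r.lane[0] = a.lane[1];
    r.lane[1] = b.lane[0];
    return r;
}
```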
    
    The only other problem I've found is that clang - in particular the
    version on M1 macOS, but as far as I can tell, even on current trunk -
    doesn't seem to implement the NEON intrinsics for the SHA-512
    extension. So I had to bodge my own versions with inline assembler in
    order to get my implementation to compile under clang. Hopefully at
    some point in the future the gap might be filled and I can relegate
    that to a backwards-compatibility hack!
    
    This commit adds the same kind of switching mechanism for SHA-512 that
    we already had for SHA-256, SHA-1 and AES, and as with all of those,
    plumbs it through to testcrypt so that you can explicitly ask for the
    hardware or software version of SHA-512. So the test suite can run the
    standard test vectors against both implementations in turn.
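
A switching mechanism of this general kind can be sketched as a preference-ordered list of implementations, each with an availability check. All names here are hypothetical and the availability functions are stubs; the real code has its own vtables and its own CPU feature queries.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;
    bool (*available)(void);    /* can this implementation run here? */
} ex_impl;

/* Stub: stands in for a real CPU feature query. */
static bool ex_hw_available(void) { return false; }
/* The software version can always run. */
static bool ex_sw_available(void) { return true; }

static const ex_impl ex_sha512_hw = { "hw", ex_hw_available };
static const ex_impl ex_sha512_sw = { "sw", ex_sw_available };

/* Preference order: try hardware first, fall back to software. */
static const ex_impl *const ex_candidates[] = {
    &ex_sha512_hw,
    &ex_sha512_sw,
};

/* Generic selector: returns the first implementation that is available. */
static const ex_impl *ex_select(void)
{
    size_t n = sizeof(ex_candidates) / sizeof(ex_candidates[0]);
    for (size_t i = 0; i < n; i++) {
        if (ex_candidates[i]->available())
            return ex_candidates[i];
    }
    return NULL;
}
```

A test harness can bypass ex_select() and name ex_sha512_hw or ex_sha512_sw directly, which is the property that lets the same test vectors run against both versions.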
    
    On M1 macOS, I'm testing at run time for the presence of SHA-512 by
    checking a sysctl setting. You can perform the same test on the
    command line by running "sysctl hw.optional.armv8_2_sha512".
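
A run-time check along these lines could be written with sysctlbyname(), as in the sketch below. The function name is hypothetical, and the sketch assumes the sysctl value is reported as an int; on anything other than macOS it simply reports false.

```c
#include <stdbool.h>
#include <stddef.h>

#if defined __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper: does this macOS machine report SHA-512 support? */
static bool ex_macos_has_sha512(void)
{
#if defined __APPLE__
    int value = 0;              /* assumption: the setting is an int */
    size_t len = sizeof(value);

    /* A failed lookup (e.g. unknown name) is treated as "not present". */
    if (sysctlbyname("hw.optional.armv8_2_sha512",
                     &value, &len, NULL, 0) != 0)
        return false;
    return value != 0;
#else
    return false;
#endif
}
```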
    
    As far as I can tell, on Windows there is not yet any flag to test for
    this CPU feature, so for the moment, the new accelerated SHA-512 is
    turned off unconditionally on Windows.

 configure.ac       |   2 +-
 ssh.h              |   5 +
 sshsh512.c         | 529 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 test/cryptsuite.py | 159 ++++++++--------
 testcrypt.c        |   4 +
 unix/uxutils.c     |  13 ++
 unix/uxutils.h     |  14 ++
 windows/winmiscs.c |   8 +
 8 files changed, 659 insertions(+), 75 deletions(-)


