simon-git: putty (master): Simon Tatham

Sun Dec 13 14:55:28 GMT 2015

TL;DR:
  90c7b15 Fix copy-and-paste error in testbn main program.
  984792e Add direct tests of division/modulus to testbn.
  482b4ab Rewrite the core divide function to not use DIVMOD_WORD.

Repository:     git://git.tartarus.org/simon/putty.git
On the web:     http://tartarus.org/~simon-git/gitweb/?p=putty.git
Branch updated: master
Committer:      Simon Tatham <anakin at pobox.com>
Date:           2015-12-13 14:55:28

commit 90c7b1562ce540d38f688492543467cc4dfa983c
web diff http://tartarus.org/~simon-git/gitweb/?p=putty.git;a=commitdiff;h=90c7b1562ce540d38f688492543467cc4dfa983c;hp=d0e9630e1c2f880bb7cb7ae107685bd1a6d189c4
Author: Simon Tatham <anakin at pobox.com>
Date:   Sun Dec 13 14:46:42 2015 +0000

    Fix copy-and-paste error in testbn main program.

    I called a 'pow' test line 'mul' in an error message.

 sshbn.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit 984792e9f4523eec1505e83ab17b8f377f7db43d
web diff http://tartarus.org/~simon-git/gitweb/?p=putty.git;a=commitdiff;h=984792e9f4523eec1505e83ab17b8f377f7db43d;hp=90c7b1562ce540d38f688492543467cc4dfa983c
Author: Simon Tatham <anakin at pobox.com>
Date:   Sun Dec 13 14:46:43 2015 +0000

    Add direct tests of division/modulus to testbn.

    I'm about to rewrite the division code, so it'll be useful to have a
    way to test it directly, particularly one which exercises difficult
    cases such as extreme values of the leading word and remainders just
    above and below zero.
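
    As an illustration of the kind of edge cases meant here, the
    following is a minimal sketch and is not the test code added by this
    commit. It uses plain 64-bit integers instead of bignums, and the
    names divmod_fn, reference_divmod and run_edge_cases are hypothetical.
    It feeds a division routine numerators of the form k*d-1, k*d and
    k*d+1, so that the remainder lands just below, exactly on and just
    above zero, with divisors whose top bits are at extreme values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature for a division routine under test. */
typedef void (*divmod_fn)(uint64_t n, uint64_t d, uint64_t *q, uint64_t *r);

/* Stand-in implementation using the built-in operators. */
static void reference_divmod(uint64_t n, uint64_t d, uint64_t *q, uint64_t *r)
{
    *q = n / d;
    *r = n % d;
}

/* Returns the number of cases in which f violates n == q*d + r, r < d. */
static int run_edge_cases(divmod_fn f)
{
    /* Divisors chosen so that d * 0xFFFF + 1 still fits in 64 bits. */
    static const uint64_t divisors[] = {
        UINT64_C(1),
        UINT64_C(0x0000800000000000),  /* only the top bit set (48-bit) */
        UINT64_C(0x0000800000000001),
        UINT64_C(0x0000FFFFFFFFFFFF),  /* all bits set (48-bit) */
    };
    static const uint64_t multiples[] = { 1, 2, 0xFFFF };
    int failures = 0;
    size_t i, j;

    for (i = 0; i < sizeof(divisors) / sizeof(divisors[0]); i++) {
        for (j = 0; j < sizeof(multiples) / sizeof(multiples[0]); j++) {
            uint64_t d = divisors[i];
            uint64_t base = d * multiples[j];
            uint64_t n;
            /* Numerators just below, on and just above a multiple of d. */
            for (n = base - 1; n <= base + 1; n++) {
                uint64_t q, r;
                f(n, d, &q, &r);
                if (r >= d || q * d + r != n)
                    failures++;
            }
        }
    }
    return failures;
}
```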

 sshbn.c            |   59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 testdata/bignum.py |   15 +++++++++++++
 2 files changed, 74 insertions(+)

commit 482b4ab872cc4987bce862c8af0de1e9bfc4c696
web diff http://tartarus.org/~simon-git/gitweb/?p=putty.git;a=commitdiff;h=482b4ab872cc4987bce862c8af0de1e9bfc4c696;hp=984792e9f4523eec1505e83ab17b8f377f7db43d
Author: Simon Tatham <anakin at pobox.com>
Date:   Sun Dec 13 14:46:43 2015 +0000

    Rewrite the core divide function to not use DIVMOD_WORD.

    DIVMOD_WORD is a portability hazard, because implementing it requires
    either a way to get direct access to the x86 DIV instruction or
    equivalent (be it inline assembler or a compiler intrinsic), or else
    an integer type we can use as BignumDblInt. But I'm starting to think
    about porting to 64-bit Visual Studio with a 64-bit BignumInt, and in
    that situation neither of those options will be available.
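
    To make the dependency concrete, here is a sketch of what a
    double-width implementation of such an operation can look like. It is
    an illustration rather than a copy of the PuTTY macro, and it assumes
    a 32-bit word with uint64_t playing the role of BignumDblInt. The
    function name divmod_word and the result struct are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: one word of quotient, one of remainder. */
struct divmod_result {
    uint32_t q;
    uint32_t r;
};

/*
 * Divide the two-word value (hi:lo) by the single word w.
 * Requires hi < w, so that the quotient fits in one word.
 * The whole thing relies on having an integer type twice as wide
 * as the word, which is the portability problem described above.
 */
static struct divmod_result divmod_word(uint32_t hi, uint32_t lo, uint32_t w)
{
    struct divmod_result res;
    uint64_t n;

    assert(w != 0 && hi < w);
    n = ((uint64_t)hi << 32) | lo;
    res.q = (uint32_t)(n / w);
    res.r = (uint32_t)(n % w);
    return res;
}
```

    With a 64-bit word the same code would need a 128-bit integer type,
    which is not something every compiler provides.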

    I could write a piece of _out_-of-line x86-64 assembler in a separate
    source file and put a function call in DIVMOD_WORD, but instead I've
    decided to solve the problem in a more futureproof way: remove
    DIVMOD_WORD totally and write a division function that doesn't need it
    at all, solving not only today's porting headache but all future ones
    in this area.

    The new implementation works by precomputing (a good enough
    approximation to) the leading word of the reciprocal of the modulus,
    and then getting each word of quotient by multiplying by that
    reciprocal, where we previously used DIVMOD_WORD to divide by the
    leading word of the actual modulus. The reciprocal itself is computed
    outside internal_mod() and passed in as a parameter, allowing me to
    save time by only computing it once when I'm about to do a modpow.
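
    The general idea of dividing by multiplying with a precomputed
    reciprocal can be sketched at single-word scale as follows. This is
    a simplified illustration, not the algorithm in sshbn.c. The names
    approx_recip and div_via_recip are hypothetical, and the reciprocal
    is obtained here with an ordinary 64-bit division purely for brevity
    (the commit message does not say how the real code computes it).

```c
#include <assert.h>
#include <stdint.h>

struct qr {
    uint32_t q;
    uint32_t r;
};

/* Hypothetical helper: floor(2^32 / d), computed once per divisor. */
static uint32_t approx_recip(uint32_t d)
{
    assert(d >= 2);  /* for d == 1 the value 2^32 would not fit in a word */
    return (uint32_t)((UINT64_C(1) << 32) / d);
}

/*
 * Divide n by d using the precomputed reciprocal. The high half of
 * n * recip never exceeds the true quotient, but may fall slightly
 * short of it, so a short fix-up loop finishes the job.
 */
static struct qr div_via_recip(uint32_t n, uint32_t d, uint32_t recip)
{
    struct qr res;

    res.q = (uint32_t)(((uint64_t)n * recip) >> 32);
    res.r = n - res.q * d;      /* cannot underflow: q is an underestimate */
    while (res.r >= d) {
        res.q++;
        res.r -= d;
    }
    return res;
}
```

    When many values are reduced by the same divisor, as in a modpow,
    approx_recip() is called once and only the multiply and fix-up are
    repeated.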

    To some extent this complicates the implementation: the advantage of
    DIVMOD_WORD was that it yielded a full word q of quotient every time
    it was used, so the subtraction of q*m from the input could be done in
    a nicely word-aligned way. But the reciprocal-multiply approach yields
    only _almost_ a full word of quotient, because you have to make the
    reciprocal a bit short to avoid overflow at multiplication time. For a
    start, this means we have to do fractionally more iterations of the
    main loop; but more painfully, we can no longer depend on the
    subtraction of q*m at every step being word-aligned, and instead we
    have to be prepared to do it at any bit shift.
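
    Subtracting a multi-word value at an arbitrary bit offset can be
    sketched like this. It is an illustration under assumptions and is
    not code from sshbn.c: words are 32 bits, arrays are stored least
    significant word first, and the names sub_shifted and
    demo_sub_shifted are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Subtract (v << shift) from a, in place. Bits of the shifted value
 * that would land beyond a[alen-1] are ignored. Returns the final
 * borrow (1 if the result went negative and wrapped, 0 otherwise).
 */
static uint32_t sub_shifted(uint32_t *a, size_t alen,
                            const uint32_t *v, size_t vlen, unsigned shift)
{
    size_t wshift = shift / 32;    /* whole words of shift */
    unsigned bshift = shift % 32;  /* leftover bits of shift */
    uint32_t spill = 0;            /* bits pushed out of the previous word */
    uint32_t borrow = 0;
    size_t i;

    for (i = 0; wshift + i < alen; i++) {
        uint32_t vw = i < vlen ? v[i] : 0;
        uint32_t sub = (vw << bshift) | spill;
        uint32_t ai = a[wshift + i];

        /* Avoid an undefined shift by 32 when bshift is zero. */
        spill = bshift ? vw >> (32 - bshift) : 0;
        a[wshift + i] = ai - sub - borrow;
        borrow = (ai < sub) || (ai == sub && borrow);
    }
    return borrow;
}

/* Small wrapper for trying it out on a two-word value. */
static uint64_t demo_sub_shifted(uint64_t a, uint32_t v, unsigned shift)
{
    uint32_t words[2];

    words[0] = (uint32_t)a;
    words[1] = (uint32_t)(a >> 32);
    sub_shifted(words, 2, &v, 1, shift);
    return ((uint64_t)words[1] << 32) | words[0];
}
```

    In a word-aligned scheme bshift would always be zero and the spill
    handling would disappear, which is the simplification the old code
    enjoyed.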

    But the flip side is that once we've implemented that, the rest of the
    algorithm becomes a lot less full of horrible special cases: in
    particular, we can now completely throw away the horribleness at all
    the call sites where we shift the modulus up by a fractional word to
    set its top bit, and then have to do a little dance to get the last
    few bits of quotient involving a second call to internal_mod.

    So there are points both for and against the new implementation in
    simplicity terms; but I think on balance it's more comprehensible than
    the old one, and a quick timing test suggests it also ends up a touch
    faster overall (roughly 8%) - the new testbn gets through the output
    of testdata/bignum.py in 4.034s where the old one took 4.392s.

 sshbn.c |  591 +++++++++++++++++++++++++++++++++++++++++++++++----------------
 sshbn.h |   55 +-----
 2 files changed, 444 insertions(+), 202 deletions(-)