simon-git: putty (main): Simon Tatham

Thu Jul 21 18:48:26 BST 2022

TL;DR:
  810e21de Unix Plink: handle stdout/stderr backlog consistently.

Repository:     https://git.tartarus.org/simon/putty.git
On the web:     https://git.tartarus.org/?p=simon/putty.git
Branch updated: main
Committer:      Simon Tatham <anakin at pobox.com>
Date:           2022-07-21 18:48:26

commit 810e21de8234474ca606804c317a7dd5b6c686ab
web diff https://git.tartarus.org/?p=simon/putty.git;a=commitdiff;h=810e21de8234474ca606804c317a7dd5b6c686ab;hp=42740a54550476e47b8f68981f24ac455c1daa51
Author: Simon Tatham <anakin at pobox.com>
Date:   Thu Jul 21 18:37:58 2022 +0100

    Unix Plink: handle stdout/stderr backlog consistently.

    Whenever we successfully send some data to standard output/error,
    we're supposed to notify the backend that this has happened, and tell
    it how much backlog still remains, by calling backend_unthrottle().

    In Unix Plink, the call to backend_unthrottle() was happening on some
    but not all calls to try_output(). In particular, it was happening
    when we called try_output() as a result of stdout or stderr having
    just been reported writable by poll(), but not when we called it from
    plink_output() after the backend had just sent us some more data. Of
    course that _normally_ works - if you were polling stdout for
    writability at all then it's because a previous call had returned
    EAGAIN, so that's when you _have_ backlog to dispose of. But it's also
    possible, by an accident of timing, that before you get round to doing
    that poll, the seat passes you further data and you call try_output()
    anyway, and by chance, the blockage has cleared. In that situation,
    you end up having cleared your backlog but forgotten to tell the
    backend about it - which might mean the backend never unfreezes the
    channel or (in 'simple' mode) the entire SSH socket.
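
    The unlucky ordering described above can be replayed in a toy
    model. This is only a sketch: ToyPlink and every toy_* name are
    hypothetical stand-ins rather than anything in unix/plink.c, the
    file descriptor is simulated by a byte counter, and the model
    assumes the backend only unfreezes when it is explicitly told the
    backlog has gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not PuTTY's real API. */
typedef struct {
    size_t queued;       /* bytes buffered for stdout, not yet written */
    size_t fd_room;      /* bytes the simulated fd will accept right now */
    bool backend_frozen; /* assumed: stays true until explicitly cleared */
} ToyPlink;

/* Write whatever the simulated fd accepts; return the backlog left. */
static size_t toy_try_output(ToyPlink *p)
{
    size_t n = p->queued < p->fd_room ? p->queued : p->fd_room;
    p->queued -= n;
    p->fd_room -= n;
    return p->queued;
}

/* Stand-in for backend_unthrottle(): unfreeze once the backlog is gone. */
static void toy_unthrottle(ToyPlink *p, size_t backlog)
{
    if (backlog == 0)
        p->backend_frozen = false;
}

/* Old-style path taken when poll() reports the fd writable: notifies. */
static void toy_old_on_writable(ToyPlink *p)
{
    toy_unthrottle(p, toy_try_output(p));
}

/* Old-style path taken when more data arrives: no notification. */
static void toy_old_on_data(ToyPlink *p, size_t len)
{
    p->queued += len;
    (void)toy_try_output(p);
}

/* Usual ordering: the fd clears, poll() fires, the backend is told. */
bool toy_normal_case_recovers(void)
{
    ToyPlink p = { 0, 0, false };
    toy_old_on_data(&p, 100);   /* fd blocked: backlog builds up */
    p.backend_frozen = true;    /* assumed reaction of the backend */
    p.fd_room = 1000;           /* blockage clears */
    toy_old_on_writable(&p);    /* poll() reports writable */
    return p.queued == 0 && !p.backend_frozen;
}

/* Unlucky ordering: more data arrives before the poll happens. */
bool toy_race_gets_stuck(void)
{
    ToyPlink p = { 0, 0, false };
    toy_old_on_data(&p, 100);   /* fd blocked: backlog builds up */
    p.backend_frozen = true;    /* assumed reaction of the backend */
    assert(p.queued == 100);
    p.fd_room = 1000;           /* blockage clears */
    toy_old_on_data(&p, 50);    /* this call drains everything... */
    /* ...so there is nothing left to poll for, and the notifying
     * path never runs. */
    return p.queued == 0 && p.backend_frozen;
}
```

    In this model both functions return true: the usual ordering
    recovers, while the unlucky one ends with an empty queue and a
    backend that still believes it is frozen.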

    A user reported (and I reproduced) that when Plink is compiled on
    macOS, running an interactive session through it and doing
    output-intensive activity like scrolling around in htop(1) can quite
    easily get it into what turned out to be that stuck state. (I don't
    know why macOS and not any other platform, but since it's a race
    condition, a platform-specific difference in timing seems like a
    plausible enough explanation.)

    Also, we were inconsistently computing the backlog size: sometimes it
    was the total size of the stdout and stderr bufchains, and sometimes
    it was just the size of the one we'd made an effort to empty.

    Now the backlog size is consistently stdout+stderr (the same as it is
    in Windows Plink), and the call to backend_unthrottle() happens
    _inside_ try_output(), so that I don't have to remember it at every
    call site.
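
    The shape of that change can be sketched as follows. Again this is
    a toy model under stated assumptions, not the actual diff: ToyPlink
    and the toy_* functions are hypothetical, the two streams are
    simulated by byte counters, and toy_unthrottle() merely records
    what a real backend_unthrottle() call would have been given.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not the code in unix/plink.c. */
typedef struct {
    size_t out_queued, err_queued; /* pending bytes per stream */
    size_t out_room, err_room;     /* bytes each simulated fd accepts */
    size_t notified_backlog;       /* last backlog passed to the backend */
    int notifications;             /* how many times it was notified */
} ToyPlink;

/* Stand-in for backend_unthrottle(backend, backlog). */
static void toy_unthrottle(ToyPlink *p, size_t backlog)
{
    p->notified_backlog = backlog;
    p->notifications++;
}

/* Drain one stream, then report the combined stdout+stderr backlog. */
static size_t toy_try_output(ToyPlink *p, int is_stderr)
{
    size_t *queued = is_stderr ? &p->err_queued : &p->out_queued;
    size_t *room = is_stderr ? &p->err_room : &p->out_room;
    size_t n = *queued < *room ? *queued : *room;
    size_t backlog;

    *queued -= n;
    *room -= n;

    /* Always the total of both streams, whichever one we drained. */
    backlog = p->out_queued + p->err_queued;

    /* Notify from inside, whenever some data was actually sent, so
     * that no call site can forget to do it. */
    if (n > 0)
        toy_unthrottle(p, backlog);
    return backlog;
}

/* Call site 1: more data has arrived for one of the streams. */
size_t toy_on_data(ToyPlink *p, int is_stderr, size_t len)
{
    *(is_stderr ? &p->err_queued : &p->out_queued) += len;
    return toy_try_output(p, is_stderr);
}

/* Call site 2: poll() has reported one of the streams writable. */
size_t toy_on_writable(ToyPlink *p, int is_stderr)
{
    return toy_try_output(p, is_stderr);
}
```

    With 30 bytes stuck on stderr and stdout writable, toy_on_data()
    for 40 bytes of stdout reports a backlog of 30 rather than 0, and
    the notification is issued without the caller doing anything
    extra.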

 unix/plink.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)