darkness

Wednesday, 27 September 2006

Of SSH agents and screen

darkness @ 14:42:48

Updated 2007-05-21: This revision has, I think, some changes to the option-parsing code, some indentation fixes, and a note that the function actually requires gawk (not just the plain awk I found on a default *BSD install).

I run into a problem frequently:

  1. On host A (perhaps my desktop computer) I add my SSH keys into the ssh-agent instance I’m always running under.
  2. I SSH into host B. I can do ssh-add -l on host B and see the keys that I added on host A because of agent forwarding.
  3. On host B I start a Screen session.
  4. I do some work, then detach the Screen session and log out of host B.
  5. Later, on host C (perhaps my laptop computer) I add my SSH keys again.
  6. I SSH from host C into host B.
  7. I reattach my Screen session on host B.

At this point, in the Screen session on host B, I don’t have any keys in my agent. In fact, I can’t add any: the environment variables point to the agent forwarding from my SSH connection from A to B, but that connection has been closed.
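The failure is easy to confirm from inside the reattached session. This is a minimal sketch, assuming the forwarded socket file disappears when the SSH connection that created it closes (which appears to be how OpenSSH behaves); agent_sock_usable is a hypothetical helper name, not part of the solution that follows:

```shell
# Hypothetical helper: succeeds only if the given path is non-empty
# and names an existing socket file.
agent_sock_usable() {
    [ -n "$1" ] && [ -S "$1" ]
}

if agent_sock_usable "$SSH_AUTH_SOCK"; then
    echo "agent socket exists: $SSH_AUTH_SOCK"
else
    echo "SSH_AUTH_SOCK is unset or stale: ${SSH_AUTH_SOCK:-(unset)}"
fi
```

An existing socket file does not prove that an agent is still listening on it, so treat a positive result as a hint only.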

Here is my solution, all from my ~/.bashrc:

### SSH agent forwarding under a long running screen

# We need to know where our Screen FIFOs are kept so we can check for
# a duplicate session name.
[ -d ~/.screens ] || { mkdir ~/.screens; chmod 700 ~/.screens; }
SCREENDIR=~/.screens
export SCREENDIR

# Think of it as a parameterized constant.
get_screen_auth_sock() { echo ~/.ssh/agent-screen-"$1"; }

# Clean up dead sockets.
find ~/.ssh -maxdepth 1 -path "$(get_screen_auth_sock '*')" -type l \
| while IFS= read -r link; do
    [ -e "$link" ] || rm -f "$link"
done

sshscreen() {
    if [ "x$STY" != "x" ]; then
        echo "don't use sshscreen from inside screen" >&2
        return 1
    fi

    if ! type -P gawk >/dev/null; then
        echo "sshscreen requires gawk (not found)" >&2
        return 1
    fi

    local OPTIND=1 opt pattern session num_sessions sock
    local create=0 reattach=0 set_name=""
    while getopts ":r:x:d:R:D:S:" opt; do
        if [ "x${opt##[rxdRD]}" = "x" ]; then
            reattach=1
            if [ "x${OPTARG##-[a-zA-Z]}" = "x" ]; then
                OPTIND=$(($OPTIND - 1))
            else
                pattern="-S $OPTARG"
            fi
        elif [ "x$opt" = "x:" -a "x${OPTARG##[rxdRD]}" = "x" ]; then
            # Reattach option with no option argument.
            reattach=1
        elif [ "x$opt" = "xS" ]; then
            create=1
            session="$OPTARG"
        elif [ "x$opt" = "x:" -a "x$OPTARG" = "xS" ]; then
            echo "-S requires an argument" >&2
            return 1
        fi
    done

    if [ $create -eq 1 -a $reattach -eq 1 ]; then
        echo "sshscreen can't handle -S and a reattach option as well" >&2
        return 1
    elif [ $create -eq 0 -a $reattach -eq 0 ]; then
        # I assume we're creating a new session.  I attempt to mimic
        # the default Screen session name here.  I fear the
        # portability of "hostname -s".  (I mean, really, I fear the
        # portability of a whole lot of this.)
        create=1
        session="$(tty | sed 's!^/dev/!!; s/[^a-zA-Z0-9]/-/g').$(hostname -s)"
        set_name="-S $session"
    elif [ $reattach -eq 1 ]; then
        # Three argument form of match() is a GNU extension.
        session=$(screen $pattern -ls \
              | gawk '/[ \t]+[0-9]+/{match($0, /[0-9]+\.([^ \t]+)/, m);
                     print m[1]; c = c + 1} END{exit(c)}')
        num_sessions=$?
        if [ $num_sessions -le 0 ]; then
            echo "no matching sessions found" >&2
            return 1
        elif [ $num_sessions -gt 1 ]; then
            echo "more than one matching session, please be more specific" >&2
            screen $pattern -ls
            return 1
        fi
    fi

    if [ $create -eq 1 ]; then
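        # Strip the directory and the leading "<pid>." from each socket
        # name, then look for an exact match.
        find "$SCREENDIR" -mindepth 1 -print | sed 's!^.*/[0-9]*\.!!' \
            | grep -Fxq -- "$session"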
        if [ $? -eq 0 ]; then
            # We can't have a duplicate session name, because they
            # would share the same SSH agent socket.  Note that by
            # "session name" I mean the portion of the name listed by
            # "screen -ls" with the leading "<pid>." removed.
            echo "session name '$session' not unique," \
                "please specify a different one with -S" >&2
            return 1
        fi
    fi

    sock=$(get_screen_auth_sock "$session")
    if [ -e "$SSH_AUTH_SOCK" ] && [ ! -e "$sock" -o -L "$sock" ]; then
        ln -sf "$SSH_AUTH_SOCK" "$sock"
    fi

    # It isn't necessary to specify SSH_AUTH_SOCK when doing a
    # reattach, only when creating a new session.
    SSH_AUTH_SOCK=$sock screen $set_name "$@"
}

Installation: put it in your ~/.bashrc. Make ~/.screens the place where your screen sockets will live, or adjust $SCREENDIR. screen may require certain permissions on this directory that you don’t have set; if so, it will complain and refuse to start the next time you run it, which is fine. If you have existing screen sessions, don’t try moving the socket files. I did that and was never able to reattach that session; I had to kill it. Instead, maybe set SCREENDIR to wherever your system keeps screen sockets.

The quick version: instead of using screen to create new sessions or to reattach detached sessions, use sshscreen. So instead of running screen -S foo irssi to create a new session with irssi in window 0, use sshscreen -S foo irssi. (Just sshscreen works to start a new session too, of course.) Instead of running screen -x foo to reattach, use sshscreen -x foo. sshscreen passes its arguments through to screen, so it should accept the other parameters you’re used to giving to screen, but some don’t do what you expect: sshscreen -ls, for example, won’t list anything, because it is treated as a request for a new session and gets a -S with the default session name added. To be safe, give the switch with the session name first. Note that sshscreen won’t have any effect on a screen session unless that session was started with sshscreen.

This all works by making SSH_AUTH_SOCK point to something like ~/.ssh/agent-screen-foo when running in the Screen session foo. ~/.ssh/agent-screen-foo is really a symbolic link to the appropriate SSH agent socket, and this link is updated to your current agent socket every time you use sshscreen to reattach.

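As an example, you log in to host B via SSH and reattach using sshscreen -x foo. This causes ~/.ssh/agent-screen-foo to be symbolically linked to the value of SSH_AUTH_SOCK (the agent socket for your current SSH connection). Every shell already running inside the session has SSH_AUTH_SOCK set to ~/.ssh/agent-screen-foo, so they all reach the new agent without any further changes.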
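The indirection can be tried out without Screen or an agent. This is only a sketch of the idea: regular files in a temporary directory stand in for the per-connection agent sockets, and the names agent-from-A and agent-from-C are made up for the demonstration.

```shell
demo=$(mktemp -d)

# Regular files stand in for the agent sockets of two SSH connections.
touch "$demo/agent-from-A" "$demo/agent-from-C"

# The stable path that SSH_AUTH_SOCK would hold inside the session.
stable="$demo/agent-screen-foo"

# Session created while logged in from host A.
ln -sf "$demo/agent-from-A" "$stable"
readlink "$stable"

# Reattaching from host C repoints the same link; the stable path
# itself never changes.
ln -sf "$demo/agent-from-C" "$stable"
readlink "$stable"
```

The second readlink prints the path ending in agent-from-C, while anything that only knows the stable path keeps working.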


Some implementation notes:

Note that, on my system at least, [ -e ... ] returns false when it is used on a dangling symbolic link. The loop that cleans up dead sockets relies on this.
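To check the behavior on a particular system, something like the following should do. It is a small sketch using a throwaway directory; -e follows the link and so fails when the target is missing, while -L tests the link itself.

```shell
tmp=$(mktemp -d)

# A symbolic link whose target does not exist.
ln -s "$tmp/no-such-target" "$tmp/dangling"

if [ -e "$tmp/dangling" ]; then echo "-e: true"; else echo "-e: false"; fi
if [ -L "$tmp/dangling" ]; then echo "-L: true"; else echo "-L: false"; fi
```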

while getopts ":r:x:d:R:D:S:" opt; do

This tries to grab every option that you might use to create or reattach a session, so that it can figure out the name of the session you’re trying to operate on. It’s not very well thought out, I can guarantee you; but so far it works for me. In particular, I think it might break if you included some options prior to the option that specifies which Screen session to operate on. (This could possibly be fixed by including more Screen options, just so getopts doesn’t think it has encountered a non-option argument.)
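The leading colon in the option string puts getopts into its silent mode, which is what the ":" and OPTARG tests in sshscreen depend on. Here is a cut-down sketch of that mode; parse_demo is a hypothetical function that handles only -x and -S and is not the parser used by sshscreen.

```shell
parse_demo() {
    OPTIND=1
    while getopts ":x:S:" opt; do
        case $opt in
            x|S) echo "option -$opt with argument '$OPTARG'" ;;
            # Silent mode: a missing argument sets opt to ":" and
            # OPTARG to the option letter.
            :)   echo "option -$OPTARG is missing its argument" ;;
            # Silent mode: an unknown option sets opt to "?" and
            # OPTARG to the offending letter.
            \?)  echo "unknown option -$OPTARG" ;;
        esac
    done
}

parse_demo -S foo
parse_demo -x
parse_demo -q
parse_demo -x -S foo
```

The last call shows why sshscreen steps OPTIND back by one: because -x is declared as taking an argument, getopts swallows the following -S as that argument.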

You don’t always have to tell Screen a session name, though. If you try to create a session without naming it, sshscreen tries to make one up for you. If you only have one session and do something simple like sshscreen -x, your lone session should get reattached correctly.

One area where screen works and sshscreen doesn’t is creating two sessions from the same host and TTY. The reason for this: the default session name is something like <pid>.<tty>.<host>. However, sshscreen doesn’t know the PID of the “window manager” Screen process, which is what <pid> refers to. So while screen can create two sessions with the same <tty>.<host> (but different <pid>s), sshscreen only knows <tty>.<host>. sshscreen will refuse to create two sessions with the same name.
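The made-up name is a pure function of the TTY and the host, which is why a second session from the same terminal collides. A sketch using the same sed expression as sshscreen; default_session_name is a hypothetical wrapper, and /dev/pts/3 and hostb are sample values.

```shell
default_session_name() {
    # $1: terminal path as printed by tty; $2: short host name.
    printf '%s.%s\n' \
        "$(printf '%s\n' "$1" | sed 's!^/dev/!!; s/[^a-zA-Z0-9]/-/g')" \
        "$2"
}

# Two calls from the same terminal and host give the same name.
default_session_name /dev/pts/3 hostb
default_session_name /dev/pts/3 hostb
```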

(If you didn’t understand any of that, let me summarize the impact of it. Some day you might get a session name '...' not unique error; just use -S somethingrandom and you’ll be fine.)
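For the curious, the uniqueness test amounts to comparing the requested name against every socket name with its "<pid>." prefix removed. The sketch below does this with plain shell parameter expansion rather than the find pipeline in sshscreen; session_name_taken is a hypothetical name, and ordinary files stand in for Screen's sockets.

```shell
session_name_taken() {
    # $1: directory holding Screen sockets
    # $2: session name without the leading "<pid>."
    for _sock in "$1"/*; do
        [ -e "$_sock" ] || continue
        _base=${_sock##*/}              # drop the directory part
        if [ "${_base#*.}" = "$2" ]; then
            return 0                    # exact match after "<pid>."
        fi
    done
    return 1
}

# Plain files stand in for Screen's socket files in this demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/1234.foo" "$demo_dir/5678.pts-0.hostb"

session_name_taken "$demo_dir" foo && echo "foo is taken"
session_name_taken "$demo_dir" fo  || echo "fo is free"
```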

session=$(screen $pattern -ls \
          | gawk '/[ \t]+[0-9]+/{match($0, /[0-9]+\.([^ \t]+)/, m);
                                 print m[1]; c = c + 1} END{exit(c)}')

Screen lets you specify a partial session name. I thought about emulating this behavior and searching $SCREENDIR myself, but that seemed like a lot of trouble. So instead I came up with this nasty gawk script (the third argument to match() is a gawk extension) to search for the given session name. We need the session name so that we know which name to use for the agent socket link, and we need it before calling Screen so that the link is in place and SSH_AUTH_SOCK can be set when Screen starts.
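If gawk is not available, the same extraction can probably be done with the two-argument match() that POSIX awk provides, using RSTART and RLENGTH. This is a sketch of that alternative, not the code sshscreen uses: list_session_names is a hypothetical name, it only prints the names and does not return the count via the exit status, and the sample input merely approximates what screen -ls prints.

```shell
list_session_names() {
    awk '/^[ \t]+[0-9]+\./ {
        # Locate "<pid>.<name>" on the line, then drop the "<pid>." part.
        if (match($0, /[0-9]+\.[^ \t]+/)) {
            name = substr($0, RSTART, RLENGTH)
            sub(/^[0-9]+\./, "", name)
            print name
        }
    }'
}

# Approximate "screen -ls" output, made up for the demonstration.
printf 'There are screens on:\n\t1234.foo\t(Detached)\n\t5678.pts-0.hostb\t(Attached)\n2 Sockets in /home/user/.screens.\n' \
    | list_session_names
```

Counting the printed lines (with wc -l, say) would replace the exit-status trick used in the gawk version.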

1 Comment »

  1. Thank you for posting this - worked like a charm with no changes necessary!

    Comment by merlin — Tuesday, 22 May 2007 @ 21:31:19
