Tracking down a troublesome bug

This story starts with SSH agents. Briefly, they’re programs which load your SSH keys and use them on behalf of SSH in order to authenticate to remote hosts. Very handy. Now you don’t need to type your password all the time.

Your keys are usually stored encrypted, so you have to type a passphrase to tell the agent how to decrypt them. The default behaviour of the standard SSH agent is to bind a shinyfresh Unix-domain socket somewhere in /tmp and listen on it; it tells the SSH process how to find it using an environment variable (SSH_AUTH_SOCK). If you log in twice, you get two agents. Both of them need to be told your passphrase. This is annoying. Why do I need both? The answer appears to be that the agents’ lifetimes are tied to their respective sessions. All of this seems rather daft to me. Why not just have one agent, which listens on a well-known socket and lives on until it dies of natural causes? Then I only need to type my passphrase once each time the machine boots.
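For the curious, the one-agent idea might look something like this in a login profile. It's a minimal sketch under assumptions, not the setup I actually ended up with: the socket path `$HOME/.ssh/agent.sock` and the function name `ensure_agent` are made up for illustration. It leans on `ssh-add -l` exiting with status 2 when no agent can be contacted, and on `ssh-agent -a` to bind a socket at a chosen path.

```shell
# Hypothetical well-known socket path; any private location would do.
AGENT_SOCK="${AGENT_SOCK:-$HOME/.ssh/agent.sock}"

ensure_agent() {
  SSH_AUTH_SOCK="$AGENT_SOCK"
  export SSH_AUTH_SOCK
  # ssh-add -l exits 0 (keys listed), 1 (agent reachable, but the
  # command failed, e.g. no keys loaded) or 2 (no agent contacted).
  ssh-add -l >/dev/null 2>&1
  if [ $? -eq 2 ]; then
    # Nobody is listening: clear any stale socket, then start a
    # single agent bound to the shared path.
    rm -f "$AGENT_SOCK"
    ssh-agent -a "$AGENT_SOCK" >/dev/null
  fi
}
```

Calling `ensure_agent` from the profile leaves every login pointing at the same socket, so `ssh-add` should only be needed once for as long as that agent survives.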

(Critics might argue that I’m extending my window of vulnerability to include times when I’m not logged in. True; but (a) I usually spend most of the time logged in, and (b) there’s only a sensible attack if the bad guy can run processes with my uid — and if he can do that, I’m already toast.)

Enter the Gnome keyring manager. It’s a handy-looking gadget which remembers secrets. It protects the secrets with a passphrase. By default, it uses your login password, which is somewhat sane, since if a bad guy knows your password, you’re toast anyway. And it gets your login password through PAM when you log into GDM. It drops its keyring on the floor when you lock the screen, and picks it up again when you type your password again (again, through PAM). Quite clever, really.

So I twiddled my login profile to try to pick up gnome-keyring-daemon, which can also act as an SSH agent, and set the agent’s environment variable from it. I managed to botch this edit in such a way that I ended up starting a new gnome-keyring-daemon process whenever I couldn’t contact the one that was already meant to be running. Oh, well.

Now I notice that — every now and then — lots of new gnome-keyring-daemon processes appear out of thin air. SSH logins provoke this, as you’d expect, but I don’t SSH into my laptop very often so this isn’t too bad.

‘I know,’ I thought, ‘I’ll set a trap.’

# Poll every two seconds until more than one daemon is running.
while sleep 2; do
  if [ "$(pidof gnome-keyring-daemon | wc -w)" -gt 1 ]; then
    xmsg 'gotcha (gnome-keyring-daemon)'
    break
  fi
done

So I went about my business for a while, and then suspended the laptop while I did something else. When I came back, I had a ‘gotcha’ window.

Aha! Progress! It’s suspend/resume cycles which are doing this. I’d managed to acquire not just one but eight new keyring daemons.

Hmm. Where did they come from? ps(1) was unhelpful: it told me that their PPID was 1. They’re daemons. Of course their PPIDs are 1. No clue where they came from.

Hmm.

tr '\0' '\n' </proc/24244/environ | sort | less

Interesting. Most of this looks like the environment my bash_profile sets. But there’s other stuff. An XDG_SESSION_COOKIE which doesn’t match my desktop session. There’s a hack that isn’t on my PATH — so .bashrc wasn’t run.
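Eyeballing two environments in less gets old quickly. Here's a rough sketch of how the comparison could be automated; the function names `env_names` and `env_only_in` are my own inventions, and it assumes a Linux-style /proc and that no variable's value contains a newline.

```shell
# Print the variable names found in an environ-style file
# (NUL-separated VAR=value entries), sorted and de-duplicated.
env_names() {
  tr '\0' '\n' <"$1" | cut -d= -f1 | sort -u
}

# Print the names that are set in the first file but not the second.
env_only_in() {
  a=$(mktemp) b=$(mktemp)
  env_names "$1" >"$a"
  env_names "$2" >"$b"
  comm -23 "$a" "$b"   # -23 keeps only lines unique to the first file
  rm -f "$a" "$b"
}
```

Something like `env_only_in /proc/24244/environ /proc/$$/environ` would then list what the stray daemon has that my shell lacks, and swapping the arguments gives the reverse.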

An idea hits me and I check the process’s PGID. It matches the system D-Bus daemon. (Oh, joy. PAM and D-Bus in the same story. How can this get better? The ways cannot be counted in an unsigned long, but I digress.) I poke about in pm-utils’s scripts for a bit without getting any bright ideas.
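If you want to check a PGID without squinting at ps(1) output, the numbers can be pulled straight out of /proc. This is a sketch assuming the Linux /proc/PID/stat layout; the helper names `stat_ids` and `proc_ids` are hypothetical.

```shell
# Given the text of a /proc/PID/stat line, print "ppid pgid sid".
# The command name (field 2) is wrapped in parentheses and may itself
# contain spaces or ')', so drop everything up to the last ") " first.
stat_ids() {
  rest=${1##*') '}
  set -- $rest        # now: state ppid pgrp session ...
  echo "$2 $3 $4"
}

# Same, but read the stat line for a live process.
proc_ids() {
  stat_ids "$(cat "/proc/$1/stat")"
}
```

Running `proc_ids` on the stray daemon and on the system D-Bus daemon and comparing the middle number is the same check as above.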

‘I know,’ I thought again: ‘I’ll set another trap.’

In .bash_profile:

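# A session cookie but no session bus: not one of my own desktop shells.
if [ -n "$XDG_SESSION_COOKIE" ] && [ -z "$DBUS_SESSION_BUS_ADDRESS" ]; then
  mkdir -p /tmp/mdw/caught
  touch "/tmp/mdw/caught/$$"
  echo "A rat!  A rat!  Dead for a ducat!"
  # Block here so that the caller's process tree stays put for inspection.
  pause
  exit 1
fi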

This doesn’t stop me making new shells in my session, but it does trap the spawning-new-keyring-daemons case.

Suspend. Screen fades; backlight stays on. Wait. Move pointer; screensaver login box appears. Type password.

ls /tmp/mdw/caught

Got one!

pstree

A long dangly tendril from upowerd(8) through pm-utils to my pause(1) program. And in the middle of this? su(1) and a hook script called 01PulseAudio.

Hat-trick.

(I kill pause(1) at this point and my laptop finally suspends.)


This page contains a single entry by Mark Wooding published on May 11, 2010 1:06 AM.