(October 2014)


For the TL;DR crowd: I desperately needed to extract the complete (and very lengthy) command line I had written 6 months ago in a bash shell - which was still running under screen. Read on to see how I eventually made it...

Let's assume you are using bash as your daily shell. You've heard lots of worrying things lately about its safety. But that's not what's bothering you.

What's bothering you is that some months ago you abused the holy UNIX principles with it. It couldn't be helped. You were asked to implement a daemon that monitors a folder for incoming "stuff". The clients upload files there, and you were expected to do various things with them.

And the daemon was supposed to be delivered yesterday - as always...

So you hacked at a frantic pace, and made a lengthy bash pipeline that did it all. Loops over the input files, pipes, redirects - you name it. And having no time to fool around with supervisord, you pasted "The Magic CmdLine (TM)" inside a screen, where your bash patiently started chugging along... And you moved to the next emergency on your list - which naturally, was scheduled for yesterday too...

Alzheimer sets in - seeking help

The Magic CmdLine (TM) obediently worked and worked. You checked its behaviour during the first couple of weeks - and just as expected, any errors on incoming files sent by the clients were reported on its stderr, which your invocation conveniently redirected to an error log file. All was well.

6 months passed. No news is good news.

And suddenly, you remembered it. Only now (you moron!) you realize - horror of horrors - that you've forgotten all about The Magic CmdLine (TM). Arguments, redirects, everything down the mental drain. Gulp.

You are a single reboot away from, erm, a well-deserved flogging from your boss (translation: screen or no screen, The Magic CmdLine (TM) that you painstakingly investigated 6 months ago will be lost forever when the next reboot happens).

Oh God.

You reattach to the magic screen, fingers crossed. You hit ESC and scroll back up, hoping to see the magic line invocation... but to no avail. The error reports it generated were written all over screen's scrollback buffer - you can't scroll all the way up to the invocation line.

Heck, there has to be a way to get to that line. ps aux | grep ... doesn't help - it only shows the piece of your command that is running right now. You want the whole shebang.

Desperate, you seek help at StackOverflow's gathering of UNIX gods.

I have a long running bash instance (inside a screen session) that is executing a complex set of commands inside a loop (with each loop doing pipes, redirects, etc). The long command line was written inside the terminal - it's not inside any script. Now, I know the bash process ID, and I have root access - how can I see the exact command line being executed inside that bash?

You also add a nice and simple example of the problem:

# In shell A, this sequence of commands was run - under screen:

$ echo $$
8909

$ while true ; do echo 1 ; echo 2>/dev/null ; sleep 30 ; done

# In shell B, I want to do some magic based on the 8909 PID, and get the
# string... "while true ; do echo 1 ; echo 2>/dev/null ; sleep 30 ; done"

And the answers start pouring in... but they are useless.

One tells you to search the process list - which would only give sleep in this example (and in your real case, the currently running part of your complex Magic CmdLine (TM)).

Another one suggests cat /proc/8909/cmdline - which of course just gives bash.

Another offers ps -p 8909 --no-headers -o cmd - which also gives bash.

(Sigh)

Use the Source, Luke

Desperate times call for desperate measures.

You download the bash source code, untar it, and - hoping you can somehow recover the history kept inside the running instance of bash - you grep for dear life:

$ find . -iname \*.c | grep hist
./lib/readline/history.c
./lib/readline/examples/histexamp.c
./lib/readline/histexpand.c
./lib/readline/histsearch.c
./lib/readline/histfile.c
./bashhist.c

Hmm... histfile.c - isn't our dear beloved bash saving history into a file? Opening inside VIM, searching...

Well, would you look at that!...

...
int
write_history (filename)
     const char *filename;
{
  return (history_do_write (filename, history_length, HISTORY_OVERWRITE));
}
...

Saints be praised, we are saved!

Spawn GDB - and activate God mode! That is, attach to the running bash, and call write_history on our own - it conveniently takes the filename to save in as an argument!

$ gdb --pid 8909
...
Loaded symbols for /lib/i386-linux-gnu/i686/cmov/libnss_files.so.2
0xb76e7424 in __kernel_vsyscall ()

(gdb) call write_history("/tmp/foo")
$1 = 0

(gdb) detach

(gdb) q 

$ tail -1 /tmp/foo
while true ; do echo 1 ; echo 2>/dev/null ; sleep 30 ; done

Gotcha! The Magic CmdLine (TM) is yours again!

You answer your own question in StackOverflow. The other answers are somehow deleted, apparently ashamed of your monstrous luck.

The moderator there stars your answer.

Life is good again. It is time...

The Empire Strikes Back

You SSH into the remote machine running the thing. The bash process running the thing has PID 53165.

You spawn GDB... hands trembling... and attach...

Say what?!?

$ ssh IamInHell
[ttsiod@IamInHell] su -
....
[root@IamInHell ~] gdb --pid 53165
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
...
Attaching to process 53165
/bin/bash (deleted): No such file or directory

No!...

The stupid ShellShock thingie forced everyone to update their bash... The automatic updates on that machine removed the binary of the old bash and installed a new one.

So even though the process is still up, 6 months later... the binary file from which it was spawned (and which GDB searches for, to get the address of write_history) can't be found.

I don't believe this - I am cursed.

(Yes, I switched from 'you' to 'I' - who am I kidding)

Return of the Jedi

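But I also know my OS - I know it well. The old binary was unlinked from the filesystem, but the kernel keeps its contents around for as long as a process is still running it - and I can get to them via the proc interface: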

[root@IamInHell] lsof +L1 | grep bash | grep 53165
bash      53165   root  txt    REG  253,0   938832     0 791023 /bin/bash (deleted)

[root@IamInHell] cat /proc/53165/exe > /tmp/oldBash
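If you'd rather check for this situation up front than discover it from a GDB error, here is a minimal sketch. It assumes Linux, where the /proc/PID/exe link of an unlinked binary reads as the original path with " (deleted)" appended. The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers - the names are illustrative, not from any tool.

# Succeeds if a /proc/PID/exe link target carries the " (deleted)" suffix.
is_deleted_target() {
    case $1 in
        *" (deleted)") return 0 ;;
        *)             return 1 ;;
    esac
}

# Succeeds if the executable behind the given PID was unlinked from disk.
# Returns 2 if the link cannot be read (no such PID, or no permission).
exe_was_deleted() {
    local target
    target=$(readlink "/proc/$1/exe") || return 2
    is_deleted_target "$target"
}
```

With these in place, something like exe_was_deleted 53165 && cat /proc/53165/exe > /tmp/oldBash would only make the copy when it is actually needed.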

And finally, I can...

[root@IamInHell] gdb --pid 53165 /tmp/oldBash
...
Loaded symbols for /lib64/libnss_files.so.2
0x0000003d37aac8be in waitpid () from /lib64/libc.so.6
...
(gdb) call write_history("/tmp/foo")
$1 = 0
(gdb) detach
Detaching from program: /tmp/oldBash, process 53165
(gdb) q

[root@IamInHell] tail -1 /tmp/foo 
while true ; do ...

Dancing around my desk. The people around me are looking at me, puzzled and wondering about my mental state.

But I don't care - I am a friggin Jedi Master :-)

Sharing my good fortune

To ease the pain of my future self - and potentially the pain of other fellow coders / admins, I write a script automating all this.
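What follows is not that script - just a minimal sketch of how the steps above could be strung together, under a few assumptions: Linux, gdb installed, and enough privileges to attach to the target. The function names and the default output path are hypothetical. The (int) cast is there because newer gdb releases refuse to call a function with no debug info unless you tell them the return type.

```shell
#!/usr/bin/env bash
# Sketch only: dump the in-memory history of a running bash via gdb.
# Function names and the default output path are hypothetical.

# Build the gdb command that calls bash's write_history() on a file.
gdb_call_expr() {
    local outfile=$1
    printf 'call (int) write_history("%s")' "$outfile"
}

# Attach to PID, make it write its history to OUTFILE, then detach.
dump_bash_history() {
    local pid=$1
    local outfile=${2:-/tmp/bash_history.$pid}
    local exe_copy rc

    exe_copy=$(mktemp) || return 1
    # Copy the binary through /proc, so this still works
    # when the on-disk bash has been replaced by an update.
    cat "/proc/$pid/exe" > "$exe_copy" || { rm -f "$exe_copy"; return 1; }

    # -batch: run the -ex commands in order, then exit.
    gdb -batch \
        -ex "$(gdb_call_expr "$outfile")" \
        -ex detach \
        --pid "$pid" "$exe_copy" > /dev/null
    rc=$?

    rm -f "$exe_copy"
    return "$rc"
}
```

Usage would be along the lines of dump_bash_history 53165 /tmp/foo, followed by tail -1 /tmp/foo. Keep in mind that write_history runs inside the target bash, so a relative output path would be resolved against that process's working directory - stick to absolute paths.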

And decide it's been quite a while since I posted something on my blog :-)

 

Update, many years later: A question that keeps re-appearing every time this post shows up in places like Hacker News or Reddit: "Why didn't you just hit Ctrl-c, then up arrow?" Or: "Why didn't you just hit Ctrl-z, up arrow, down arrow, then fg?"

Remember: The script was running in production, doing actual work. Stopping it with Ctrl-c, when you don't remember anything about it and what it spawns, was not a safe option. If you think about it, you'll realize there are plenty of things that don't recover from Ctrl-C... Batch processes in network operations that affect state, API calls... I am sure you know what I mean.

As for Ctrl-z: try running this in your bash, and applying the "Ctrl-z/fg" approach to it: while true ; do sleep 1 ; done. You'll see that the loop won't recover - it won't resume its proper operation...

So, to put it simply: sending signals to processes I had no recollection of was not as safe as reading my shell's memory with the god of debuggers.




Updated: Tue Jun 13 21:38:08 2023
 
