Is your Ubuntu MOTD warning you of a zombie process?

[cc theme=”blackboard” width=”100%”]
Welcome to Ubuntu 11.10 (GNU/Linux 3.0.0-20-server x86_64)

* Documentation: https://help.ubuntu.com/11.10/serverguide/C

System information as of Thu Jun 28 18:36:57 EDT 2012

System load: 0.0 Processes: 94
Usage of /: 68.2% of 1.79TB Users logged in: 1
Memory usage: 29% IP address for eth0: 10.0.0.10
Swap usage: 8%

=> There is 1 zombie process.
[/cc]

What’s the scoop with that last line, “There is 1 zombie process”? Is my operating system getting caught up in the current climate of zombie infatuation? Well no, sadly it’s more boring than that. A zombie process occurs when a child process ends but the parent doesn’t “reap” it, that is, collect its exit status with wait(). Until that happens, the kernel keeps the dead child’s entry in the process table. For a much better rundown on what a zombie process is, check out the Wikipedia article: Zombie process.
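If you would like a zombie to look at, here is a minimal sketch that makes one on purpose. The sleep durations and variable names are arbitrary choices of mine, and it assumes a procps-style ps. A subshell starts a background child and then replaces itself with a program that never waits for children, so the finished child lingers for a few seconds.

```shell
#!/bin/bash
# Sketch: create a short-lived zombie on purpose (durations are arbitrary).
# The subshell starts "sleep 1" in the background, then replaces itself with
# "sleep 4". sleep never calls wait(), so when the 1-second child exits it
# stays in the process table as a zombie until its parent goes away.
(sleep 1 & exec sleep 4) &
parent=$!

sleep 2   # give the child time to exit

# Show the children of $parent; the state column should begin with Z.
zombie_line=$(ps -eo pid=,ppid=,stat=,comm= | awk -v p="$parent" '$2 == p')
echo "$zombie_line"

wait "$parent"   # when the parent exits, the zombie is re-parented (normally to init) and reaped
```

Nothing needs cleaning up afterwards: the zombie disappears on its own once the 4-second parent exits.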

Here is a quick rundown of some terminology. A process is just a fancy name for a running instance of a program. A child process (or just “child”) is a process started by another process. The process that did the starting is the child’s “parent process”.

The ‘ps’ command shows processes we are running.

[cc theme=”blackboard”]
user@host:~$ ps
PID TTY TIME CMD
5828 pts/4 00:00:00 bash
6122 pts/4 00:00:00 ps
[/cc]

The ‘pstree’ command can show a family tree (of sorts) for processes: parents, their children, the children of their children, and so on. Our shell is bash, and as we can see in the output from ‘ps’ above, the process ID number (pid) of our bash prompt is 5828.

[cc theme=”blackboard”]
user@host:~$ pstree -Gpl 5828
bash(5828)───pstree(6123)
[/cc]

Here we can see that bash is the parent process of the pstree command itself when we run it from the bash prompt. The pstree command exits shortly after it displays this information, and bash will go back to being childless. If we run another instance of bash from inside of the current bash prompt, the new bash instance will be a child of the first.

[cc theme=”blackboard”]
user@host:~$ bash
user@host:~$ pstree -Gpl 5828
bash(5828)───bash(6124)───pstree(6389)
[/cc]

So you can see that our original bash process, pid 5828, has begotten a new child bash process with pid 6124. The new bash process is where we are running the ‘pstree’ command from, so pstree is a child of 6124.
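The same relationship can be followed in the other direction, from a child up through its ancestors. The sketch below is only an illustration: ancestors is a helper name I made up, and it assumes a ps that accepts -o comm= and -o ppid= with -p (procps does).

```shell
#!/bin/bash
# Sketch: print the chain of ancestors for a pid, one "name(pid)" per line.
# "ancestors" is a hypothetical helper name, not a standard command.
ancestors() {
    local pid=$1
    # Stop when the parent pid reaches 0 (or ps returns nothing).
    while [ "${pid:-0}" -gt 0 ]; do
        printf '%s(%s)\n' "$(ps -o comm= -p "$pid")" "$pid"
        # ps pads the ppid column with spaces, so strip them
        pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
    done
}

ancestors "$$"
```

Run from the nested bash prompt above, the first two lines would be the child bash followed by its parent bash, with the rest of the chain depending on how you logged in.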

For an interesting look at your system’s family tree, try running ‘pstree -Gpl 1’.

Hopefully you have a good handle on the whole parent/child thing. Now we’ll go zombie hunting. The system has told us that there is a zombie, but we know nothing about it. The ps command has options that will print the status of a process in a column of its output.

[cc theme=”blackboard” width=”100%”]
root@host:~# ps aux |grep Z
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2925 0.0 0.0 9256 880 pts/2 S+ 18:40 0:00 grep Z
root 28766 0.0 0.0 0 0 ? Z Jun06 0:00 [apt] <defunct>
[/cc]

If your system has the proc file system (you probably do), you can see lots of information about a given process, including the parent pid, by looking at the ‘stat’ file for that pid.

[cc theme=”blackboard” width=”100%”]
root@host:~# cat /proc/6124/stat
6124 (bash) S 5828 6124 5828 34820 6398 4202496 1166 3867 0 0 4 0 0 1 20 0 1 0 7883109 25640960 617 18446744073709551615 4194304 5111244 140736553088608 140736553087152 140358224629054 0 65536 3686404 1266761467 18446744071579277349 0 0 17 0 0 0 0 0 0
[/cc]

The 4th value in the ‘stat’ file is ppid, or the “parent pid” of the process.

The ‘stat’ file for any pid in a procfs enabled system can be found in /proc/[pid]/stat, where [pid] is replaced with the pid number you are interested in. For a description of the ‘stat’ file format search for ‘/proc/[pid]/stat’ at the URL below:
http://www.kernel.org/doc/man-pages/online/pages/man5/proc.5.html

To see just the parent pid and ignore the other information we’re not currently interested in, we can use the ‘awk’ command to select only the 4th field.
[cc theme=”blackboard” width=”100%”]
root@host:~# awk '{print $4}' /proc/6124/stat
5828
[/cc]
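One caveat with counting fields in ‘stat’: the command name in the 2nd field can itself contain spaces, which shifts everything after it. As a hedged sketch, here are two alternative lookups. Both function names are my own inventions. The first reads the PPid line from /proc/[pid]/status, and the second strips everything up to the closing parenthesis before counting fields.

```shell
#!/bin/bash
# Sketch: two hypothetical helpers that look up a parent pid under /proc.

# Read the "PPid:" line from /proc/<pid>/status.
ppid_from_status() {
    awk '/^PPid:/ {print $2}' "/proc/$1/status"
}

# Read /proc/<pid>/stat, but first drop everything up to the last ") " so a
# command name containing spaces cannot shift the field positions.
# What remains starts with the state, followed by the ppid.
ppid_from_stat() {
    sed 's/^.*) //' "/proc/$1/stat" | awk '{print $2}'
}

# Both should print the parent pid of the current shell.
ppid_from_status "$$"
ppid_from_stat "$$"
```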

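Armed with the information above, I’ve created a quick little zombie hunting script for use from cron or the command line. The script first tries to nudge the parent process into reaping its child by sending it the SIGCHLD signal. If that doesn’t work, it sends SIGKILL to the parent. Once the parent is gone, the zombie is adopted by init, which reaps it. Keep in mind that this kills the parent process, so make sure the parent is something you can live without.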

Zombie Hunter

[cc lang=”bash” width=”100%”]
#!/bin/bash
# Collect the pid of every process whose state starts with Z (zombie).
zombies=($(ps -eo pid=,stat= | awk '$2 ~ /^Z/ {print $1}'))
for zombie in "${zombies[@]}"
do
    # The zombie may already have been reaped since ps ran.
    [ -f "/proc/$zombie/stat" ] || continue
    echo "Found a zombie process $(awk '{print $2}' "/proc/$zombie/stat") [pid:$zombie]"
    parent=$(awk '{print $4}' "/proc/$zombie/stat")
    echo "Asking parent process $(awk '{print $2}' "/proc/$parent/stat") [pid:$parent] to come quietly…"
    kill -SIGCHLD "$parent"
    sleep 10 # This seems awfully patient
    # Escalate only if the zombie is still around; killing the parent
    # hands the zombie to init, which reaps it.
    if [ -f "/proc/$zombie/stat" ]; then
        echo "Asking not so nicely"
        kill -9 "$parent"
    fi
    sleep 1
    if ! [ -f "/proc/$zombie/stat" ]; then
        echo "Zombie vanquished"
    fi
done
[/cc]

[cc theme=”blackboard” width=”100%”]
root@host:~# ./zombie-hunter
Found a zombie process (apt) [pid:28766]
Asking parent process (run-parts) [pid:28763] to come quietly…
Asking not so nicely
Zombie vanquished
[/cc]
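Since the hunter ends up killing the parent, it may be worth looking before you shoot. The following is a rough sketch of a report-only variant that sends no signals at all. The name list_zombies and the output wording are my own, and it assumes a procps-style ps.

```shell
#!/bin/bash
# Sketch: report zombies and their parents without sending any signals.
# "list_zombies" is a hypothetical helper name.
list_zombies() {
    # The trailing "=" on each column name suppresses the header line.
    ps -eo pid=,ppid=,stat=,comm= | awk '$3 ~ /^Z/ {print $1, $2, $4}' |
    while read -r zpid ppid zname; do
        pname=$(ps -o comm= -p "$ppid")
        echo "zombie $zname [pid:$zpid] parent ${pname:-unknown} [pid:$ppid]"
    done
}

list_zombies
```

If the parent turns out to be something important, restarting that service cleanly is likely a kinder fix than SIGKILL.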
