Interprocess Communication
Contents:
Signals
Files
Pipes
System V IPC
Sockets
Computer processes have almost as many ways of communicating as people do. The difficulties of interprocess communication should not be underestimated. It doesn't do you any good to listen for verbal cues when your friend is using only body language. Likewise, two processes can communicate only when they agree on the means of communication, and on the conventions built on top of that. As with any kind of communication, the conventions to be agreed upon range from lexical to pragmatic: everything from which lingo you'll use, up to whose turn it is to talk. These conventions are necessary because it's very difficult to communicate bare semantics in the absence of context.
In our lingo, interprocess communication is usually pronounced IPC. The IPC facilities of Perl range from the very simple to the very complex. Which facility you should use depends on the complexity of the information to be communicated. The simplest kind of information is almost no information at all: just the awareness that a particular event has happened at a particular point in time. In Perl, these events are communicated via a signal mechanism modeled on the Unix signal system.
At the other extreme, the socket facilities of Perl allow you to communicate with any other process on the Internet using any mutually supported protocol you like. Naturally, this freedom comes at a price: you have to go through a number of steps to set up the connections and make sure you're talking the same language as the process on the other end. This may in turn require you to adhere to any number of other strange customs, depending on local conventions. To be protocoligorically correct, you might even be required to speak a language like XML, or Java, or Perl. Horrors.
Sandwiched in between are some facilities intended primarily for communicating with processes on the same machine. These include good old-fashioned files, pipes, FIFOs, and the various System V IPC syscalls. Support for these facilities varies across platforms; modern Unix systems (including Apple's Mac OS X) should support all of them, and, except for signals and SysV IPC, most of the rest are supported on any recent Microsoft operating systems, including pipes, forking, file locking, and sockets.[1]
[1] Well, except for AF_UNIX sockets.
More information about porting in general can be found in the standard Perl documentation set (in whatever format your system displays it) under perlport. Microsoft-specific information can be found under perlwin32 and perlfork, which are installed even on non-Microsoft systems. For textbooks and tutorials, we suggest the following:
- The Perl Cookbook, by Tom Christiansen and Nathan Torkington (O'Reilly and Associates, 1998), chapters 16 through 18.
- Advanced Programming in the UNIX Environment, by W. Richard Stevens (Addison-Wesley, 1992).
- TCP/IP Illustrated, by W. Richard Stevens, Volumes I-III (Addison-Wesley, 1992-1996).
Signals
Perl uses a simple signal-handling model: the %SIG hash contains references (either symbolic or hard) to user-defined signal handlers. Certain events cause the operating system to deliver a signal to the affected process. The handler corresponding to that event is called with one argument containing the name of the signal that triggered it. To send a signal to another process, you use the kill function. Think of it as sending a one-bit piece of information to the other process.[2] If that process has installed a signal handler for that signal, it can execute code when it receives the signal. But there's no way for the sending process to get any sort of return value, other than knowing that the signal was legally sent. The sender receives no feedback saying what, if anything, the receiving process did with the signal.
[2] Actually, it's more like five or six bits, depending on how many signals your OS defines and on whether the other process makes use of the fact that you didn't send a different signal.
We've classified this facility as a form of IPC, but in fact, signals can come from various sources, not just other processes. A signal might also come from your own process, or it might be generated when the user at the keyboard types a particular sequence like Control-C or Control-Z, or it might be manufactured by the kernel when a special event transpires, such as when a child process exits, or when your process runs out of stack space or hits a file size or memory limit. But your own process can't easily distinguish among these cases. A signal is like a package that arrives mysteriously on your doorstep with no return address. You'd best open it carefully.
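As a minimal sketch of the handler-plus-kill model described above, the following script installs a handler and then signals itself. The choice of USR1 and the @received array are assumptions made for illustration, and the script presumes a Unix-like system where that signal exists.

```perl
use strict;
use warnings;

my @received;                       # names of the signals seen so far

# The handler is passed the signal name (without the "SIG" prefix).
$SIG{USR1} = sub {
    my $signame = shift;
    push @received, $signame;       # record it and do nothing else
};

kill 'USR1', $$                     # $$ is this process's own PID
    or die "can't signal myself: $!";

print "received: @received\n";
```

Note that the sender learns only that kill succeeded; whatever the handler did stays inside the receiving process (here, conveniently, the same one).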
Since entries in the %SIG hash can be hard references, it's common practice to use anonymous functions for simple signal handlers:

    $SIG{INT}  = sub { die "\nOutta here!\n" };
    $SIG{ALRM} = sub { die "Your alarm clock went off" };
Or you could create a named function and assign its name or reference to the appropriate slot in the hash. For example, to intercept interrupt and quit signals (often bound to Control-C and Control-\ on your keyboard), set up a handler like this:
    sub catch_zap {
        my $signame = shift;
        our $shucks++;
        die "Somebody sent me a SIG$signame!";
    }
    $shucks = 0;
    $SIG{INT}  = 'catch_zap';   # always means &main::catch_zap
    $SIG{INT}  = \&catch_zap;   # best strategy
    $SIG{QUIT} = \&catch_zap;   # catch another, too
Notice how all we do in the signal handler is set a global variable and then raise an exception with die. Whenever possible, try to avoid anything more complicated than that, because on most systems the C library is not re-entrant. Signals are delivered asynchronously,[3] so calling any print functions (or even anything that needs to malloc(3) more memory) could in theory trigger a memory fault and subsequent core dump if you were already in a related C library routine when the signal was delivered. (Even the die routine is a bit unsafe unless the process is executing within an eval, which suppresses the I/O from die, which keeps it from calling the C library. Probably.)
[3] Synchronizing signal delivery with Perl-level opcodes is scheduled for a future release of Perl, which should solve the matter of signals and core dumps.
An even easier way to trap signals is to use the sigtrap pragma to install simple, default signal handlers:

    use sigtrap qw(die INT QUIT);
    use sigtrap qw(die untrapped normal-signals
                   stack-trace any error-signals);
The pragma is useful when you don't want to bother writing your own handler, but you still want to catch dangerous signals and perform an orderly shutdown. By default, some of these signals are so fatal to your process that your program will just stop in its tracks when it receives one. Unfortunately, that means that any END functions for at-exit handling and DESTROY methods for object finalization are not called. But they are called on ordinary Perl exceptions (such as when you call die), so you can use this pragma to painlessly convert the signals into exceptions. Even though you aren't dealing with the signals yourself, your program still behaves correctly. See the description of use sigtrap in "Pragmatic Modules" for many more features of this pragma.
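To illustrate the signal-to-exception conversion, here is a small sketch that traps USR1 with sigtrap and catches the resulting exception with eval. USR1 is chosen only because it is harmless to send to yourself; the variable names are hypothetical.

```perl
use strict;
use warnings;
use sigtrap qw(die USR1);   # turn SIGUSR1 into an ordinary die exception

my $cleaned_up = 0;

eval {
    kill 'USR1', $$ or die "can't signal myself: $!";
    select(undef, undef, undef, 1);   # give the signal a moment to arrive
    1;
};
my $error = $@;             # the exception raised by sigtrap's handler
$cleaned_up = 1;            # still running, so cleanup code gets its chance

print "caught: $error";
```

Because the signal arrives as an exception, the program keeps control: code after the eval runs, and END blocks and DESTROY methods would run at exit as usual.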
You may also set the %SIG handler to either of the strings "IGNORE" or "DEFAULT", in which case Perl will try to discard the signal or allow the default action for that signal to occur (though some signals can be neither trapped nor ignored, such as the KILL and STOP signals; see signal(3), if you have it, for a list of signals available on your system and their default behaviors).
The operating system thinks of signals as numbers rather than names, but Perl, like most people, prefers symbolic names to magic numbers. To find the names of the signals, list out the keys of the %SIG hash, or use the kill -l command if you have one on your system. You can also use Perl's standard Config module to determine your operating system's mapping between signal names and signal numbers. See Config(3) for an example of this.
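One way to build that mapping, sketched here along the lines of the Config documentation, is to pair up the sig_name and sig_num entries. The %signo and @signame variables are names chosen for this example, and the //= operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;
use Config;

defined $Config{sig_name} or die "this perl has no signal list";

# sig_name and sig_num are parallel, space-separated lists.
my @names = split ' ', $Config{sig_name};
my @nums  = split ' ', $Config{sig_num};

my (%signo, @signame);
for my $i (0 .. $#names) {
    $signo{ $names[$i] } = $nums[$i];
    # Several names may share one number (aliases); keep the first listed.
    $signame[ $nums[$i] ] //= $names[$i];
}

printf "SIGINT is signal number %d\n", $signo{INT};
printf "signal %d is SIG%s\n", $signo{KILL}, $signame[ $signo{KILL} ];
```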
Because %SIG is a global hash, assignments to it affect your entire program. It's often more considerate to the rest of your program to confine your signal catching to a restricted scope. Do this with a local signal handler assignment, which goes out of effect once the enclosing block is exited. (But remember that local values are visible in functions called from within that block.)

    {
        local $SIG{INT} = 'IGNORE';
        ...     # Do whatever you want here, ignoring all SIGINTs.
        fn();   # SIGINTs ignored inside fn() too!
        ...     # And here.
    }           # Block exit restores previous $SIG{INT} value.

    fn();       # SIGINTs not ignored inside fn() (presumably).
Signaling Process Groups
Processes (under Unix, at least) are organized into process groups, generally corresponding to an entire job. For example, when you fire off a single shell command that consists of a series of filter commands that pipe data from one to the other, those processes (and their child processes) all belong to the same process group. That process group has a number corresponding to the process number of the process group leader. If you send a signal to a positive process number, it just sends the signal to the process, but if you send a signal to a negative number, it sends that signal to every process whose process group number is the corresponding positive number, that is, the process number of the process group leader. (Conveniently for the process group leader, the process group ID is just $$.)
Suppose your program wants to send a hang-up signal to all child processes it started directly, plus any grandchildren started by those children, plus any great-grandchildren started by those grandchildren, and so on. To do this, your program first calls setpgrp(0,0) to become the leader of a new process group, and any processes it creates will be part of the new group. It doesn't matter whether these processes were started manually via fork, automatically via piped opens, or as backgrounded jobs with system("cmd &"). Even if those processes had children of their own, sending a hang-up signal to your entire process group will find them all (except for processes that have set their own process group or changed their UID to give themselves diplomatic immunity to your signals).

    {
        local $SIG{HUP} = 'IGNORE';   # exempt myself
        kill('HUP', -$$);             # signal my own process group
    }
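The following sketch puts those pieces together on a Unix-like system. To keep the signal away from unrelated processes, the whole experiment runs in a forked child that becomes its own group leader; the two sleeping group members and the $group_ok flag are hypothetical names introduced for the demonstration.

```perl
use strict;
use warnings;
use POSIX qw(SIGHUP);

# Run the experiment in a child so the signal can't reach anyone else.
my $leader = fork();
die "can't fork: $!" unless defined $leader;

if ($leader == 0) {
    setpgrp(0, 0);                      # become leader of a new process group

    my @kids;
    for (1 .. 2) {
        my $pid = fork();
        POSIX::_exit(2) unless defined $pid;
        if ($pid == 0) {                # group member that just waits
            $SIG{HUP} = 'DEFAULT';      # make sure HUP is fatal here
            sleep 60;
            POSIX::_exit(0);
        }
        push @kids, $pid;
    }

    {
        local $SIG{HUP} = 'IGNORE';     # exempt the leader itself
        kill 'HUP', -$$;                # negative PID: signal the whole group
    }

    # Each member should have been terminated by SIGHUP.
    my $hung_up = 0;
    for my $pid (@kids) {
        waitpid($pid, 0);
        $hung_up++ if ($? & 127) == SIGHUP;
    }
    POSIX::_exit($hung_up == @kids ? 0 : 1);
}

waitpid($leader, 0);
my $group_ok = ($? == 0);
print $group_ok ? "whole group received HUP\n" : "something went wrong\n";
```

The low seven bits of $? hold the number of the signal that killed a child, which is how the leader confirms that HUP reached every member.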
Another interesting signal is signal number 0. This doesn't actually affect the target process, but instead checks that it's alive and hasn't changed its UID. That is, it checks whether it's legal to send a signal, without actually sending one.

    unless (kill 0 => $kid_pid) {
        warn "something wicked happened to $kid_pid";
    }

Signal number 0 is the only signal that works the same under Microsoft ports of Perl as it does in Unix. On Microsoft systems, kill does not actually deliver a signal. Instead, it forces the target process to exit with the status indicated by the signal number. This may be fixed someday. The magic 0 signal, however, still behaves in the standard, nondestructive fashion.
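As a quick illustration of the signal 0 probe on a Unix-like system, this sketch checks a child before and after it is killed and reaped. The variable names are made up for the example.

```perl
use strict;
use warnings;
use POSIX ();

my $kid_pid = fork();
die "can't fork: $!" unless defined $kid_pid;
if ($kid_pid == 0) { sleep 60; POSIX::_exit(0) }   # child just waits

my $alive_before = kill 0 => $kid_pid;   # true: it exists and is ours

kill 'KILL', $kid_pid;
waitpid($kid_pid, 0);                    # reap it; a zombie still "exists"

my $alive_after = kill 0 => $kid_pid;    # false: no such process now

printf "before: %s, after: %s\n",
    $alive_before ? "alive" : "gone",
    $alive_after  ? "alive" : "gone";
```

The waitpid matters: until the child is reaped it lingers as a zombie, and the probe would still report success.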
Reaping Zombies
When a process exits, its parent is sent a CHLD signal by the kernel and the process becomes a zombie[4] until the parent calls wait or waitpid. If you start another process in Perl using anything except fork, Perl takes care of reaping your zombied children, but if you use a raw fork, you're expected to clean up after yourself. On many but not all kernels, a simple hack for autoreaping zombies is to set $SIG{CHLD} to 'IGNORE'. A more flexible (but tedious) approach is to reap them yourself. Because more than one child may have died before you get around to dealing with them, you must gather your zombies in a loop until there aren't any more:

    use POSIX ":sys_wait_h";
    # With WNOHANG, waitpid returns 0 when children exist but none has
    # exited yet, and -1 when there are no children at all; stop on either.
    sub REAPER { 1 while waitpid(-1, WNOHANG) > 0 }

To run this code as needed, you can either set a CHLD signal handler for it:

    $SIG{CHLD} = \&REAPER;
or, if you're running in a loop, just arrange to call the reaper every so often. This is the best approach because it isn't subject to the occasional core dump that signals can sometimes trigger in the C library. However, it's expensive if called in a tight loop, so a reasonable compromise is to use a hybrid strategy where you minimize the risk within the handler by doing as little as possible and waiting until outside to reap zombies:
    use POSIX ":sys_wait_h";            # for WNOHANG

    our $zombies = 0;
    $SIG{CHLD} = sub { $zombies++ };

    sub reaper {
        my $zombie;
        our %Kid_Status;                # store each exit status
        $zombies = 0;
        # Stop when no exited child remains (0) or no child exists (-1).
        while (($zombie = waitpid(-1, WNOHANG)) > 0) {
            $Kid_Status{$zombie} = $?;
        }
    }

    while (1) {
        reaper() if $zombies;
        ...
    }
This code assumes your kernel supports reliable signals. Old SysV traditionally didn't, which made it impossible to write correct signal handlers there. Ever since way back in the 5.003 release, Perl has used the sigaction(2) syscall where available, which is a lot more dependable. This means that unless you're running on an ancient operating system or with an ancient Perl, you won't have to reinstall your handlers and risk missing signals. Fortunately, all BSD-flavored systems (including Linux, Solaris, and Mac OS X) plus all POSIX-compliant systems provide reliable signals, so the old broken SysV behavior is more a matter of historical note than of current concern.
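For a self-contained picture of the reaping loop, the sketch below forks three children that exit at once with distinct codes, then polls with a nonblocking waitpid until all of them have been collected. The reap_exited helper and the %expected bookkeeping are hypothetical; a real program would do useful work between polls.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %Kid_Status;                         # pid => raw wait status ($?)

# Collect every child that has already exited, without blocking.
sub reap_exited {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $Kid_Status{$pid} = $?;
    }
}

my %expected;                           # pid => exit code we asked for
for my $code (1 .. 3) {
    my $pid = fork();
    die "can't fork: $!" unless defined $pid;
    if ($pid == 0) { POSIX::_exit($code) }   # child: exit immediately
    $expected{$pid} = $code;
}

# Poll until all three children have been collected.
until (keys %Kid_Status == keys %expected) {
    reap_exited();
    select(undef, undef, undef, 0.05);  # brief pause between polls
}

for my $pid (sort { $a <=> $b } keys %Kid_Status) {
    printf "child %d exited with status %d\n", $pid, $Kid_Status{$pid} >> 8;
}
```

The exit code lives in the high byte of the wait status, hence the shift by eight bits.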
[4] Yes, that really is the technical term.
With these newer kernels, many other things will work better, too. For example, "slow" syscalls (those that can block, like read, wait, and accept) will restart automatically if interrupted by a signal. In the bad old days, user code had to remember to check explicitly whether each slow syscall failed with $! ($ERRNO) set to EINTR and, if so, restart. This wouldn't happen just from INT signals; even innocuous signals like TSTP (from a Control-Z) or CONT (from foregrounding the job) would abort the syscall. Perl now restarts the syscall for you automatically if the operating system allows it to. This is generally construed to be a feature.
You can check whether you have the more rigorous POSIX-style signal behavior by loading the Config module and checking whether $Config{d_sigaction} has a true value. To find out whether slow syscalls are restartable, check your system documentation on sigaction(2) or sigvec(3), or scrounge around your C sys/signal.h file for SV_INTERRUPT or SA_RESTART. If one or both symbols are found, you probably have restartable syscalls.
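A short sketch of both checks follows. Probing for SA_RESTART through the POSIX module instead of grepping header files is an assumption of this example, as are the variable names; the eval is there because asking POSIX for a constant the platform lacks raises an exception.

```perl
use strict;
use warnings;
use Config;
use POSIX ();

# True if this perl was built to use sigaction(2).
my $has_sigaction = $Config{d_sigaction} ? 1 : 0;

# Probe for the SA_RESTART flag inside an eval, since the call dies
# when the platform's headers don't define the symbol.
my $sa_restart     = eval { POSIX::SA_RESTART() };
my $has_sa_restart = defined $sa_restart ? 1 : 0;

print "sigaction(2) in use:  ", ($has_sigaction  ? "yes" : "no"), "\n";
print "SA_RESTART available: ", ($has_sa_restart ? "yes" : "no"), "\n";
```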
Timing Out Slow Operations
A common use for signals is to impose time limits on long-running operations. If you're on a Unix system (or any other POSIX-conforming system that supports the ALRM signal), you can ask the kernel to send your process an ALRM at some point in the future:

    use Fcntl ':flock';
    eval {
        local $SIG{ALRM} = sub { die "alarm clock restart" };
        alarm 10;               # schedule alarm in 10 seconds
        eval {
            flock(FH, LOCK_EX)  # a blocking, exclusive lock
                or die "can't flock: $!";
        };
        alarm 0;                # cancel the alarm
    };
    alarm 0;                    # race condition protection
    die if $@ && $@ !~ /alarm clock restart/;   # reraise
If the alarm hits while you're waiting for the lock, and you simply catch the signal and return, you'll go right back into the flock because Perl automatically restarts syscalls where it can. The only way out is to raise an exception through die and then let eval catch it. (This works because the exception winds up calling the C library's longjmp(3) function, which is what really gets you out of the restarting syscall.)
The nested exception trap is included because calling flock would raise an exception if flock is not implemented on your platform, and you need to make sure to clear the alarm anyway. The second alarm 0 is provided in case the signal comes in after running the flock but before getting to the first alarm 0. Without the second alarm, you would risk a tiny race condition--but size doesn't matter in race conditions; they either exist or they don't. And we prefer that they don't.
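The same pattern can be wrapped in a reusable function. The with_timeout helper below is a hypothetical name, not part of Perl, and this sketch simplifies matters by returning undef on timeout, so it can't distinguish a timeout from code that legitimately returns undef.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns the code's scalar result, or undef if the alarm fired first.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm clock restart\n" };
        alarm $seconds;                 # schedule the alarm
        $result = $code->();
        alarm 0;                        # finished in time: cancel it
        1;
    };
    my $err = $@;
    alarm 0;                            # race condition protection
    return $result if $ok;
    return undef if $err eq "alarm clock restart\n";   # timed out
    die $err;                           # some other failure: reraise
}

my $fast = with_timeout(5, sub { 6 * 7 });
my $slow = with_timeout(1, sub { select(undef, undef, undef, 10); "done" });

print "fast: $fast\n";
print "slow: ", (defined $slow ? $slow : "timed out"), "\n";
```

The slow case uses a four-argument select as its long-running operation because mixing sleep with alarm is unreliable on some systems.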
Blocking Signals
Now and then, you'd like to delay receipt of a signal during some critical section of code. You don't want to blindly ignore the signal, but what you're doing is too important to interrupt. Perl's %SIG hash doesn't implement signal blocking, but the POSIX module does, through its interface to the sigprocmask(2) syscall:

    use POSIX qw(:signal_h);

    $sigset   = POSIX::SigSet->new;
    $blockset = POSIX::SigSet->new(SIGINT, SIGQUIT, SIGCHLD);

    sigprocmask(SIG_BLOCK, $blockset, $sigset)
        or die "Could not block INT,QUIT,CHLD signals: $!\n";
Once the three signals are all blocked, you can do whatever you want without fear of being bothered. When you're done with your critical section, unblock the signals by restoring the old signal mask:
    sigprocmask(SIG_SETMASK, $sigset)
        or die "Could not restore INT,QUIT,CHLD signals: $!\n";
If any of the three signals came in while blocked, they are delivered immediately. If two or more different signals are pending, the order of delivery is not defined. Additionally, no distinction is made between having received a particular signal once while blocked and having received it many times.[5] For example, if nine child processes exited while you were blocking CHLD signals, your handler (if you had one) would still be called only once after you unblocked. That's why, when you reap zombies, you should always loop until they're all gone.
[5] Traditionally, that is. Countable signals may be implemented on some real-time systems according to the latest specs, but we haven't seen these yet.
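The collapsing of repeated signals can be observed directly. This sketch blocks USR1 (chosen here only because it is safe to send to yourself), sends it three times, and counts handler calls before and after unblocking; it assumes a system where USR1 is an ordinary, non-queued signal, and the variable names are invented for the example.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);

my $calls = 0;
$SIG{USR1} = sub { $calls++ };

my $oldset   = POSIX::SigSet->new;
my $blockset = POSIX::SigSet->new(SIGUSR1);

sigprocmask(SIG_BLOCK, $blockset, $oldset)
    or die "Could not block USR1: $!\n";

kill 'USR1', $$ for 1 .. 3;             # three signals arrive while blocked
my $calls_while_blocked = $calls;       # handler has not run yet

sigprocmask(SIG_SETMASK, $oldset)
    or die "Could not restore signal mask: $!\n";

my $calls_after_unblock = $calls;       # the pending signals became one call

print "while blocked: $calls_while_blocked, ",
      "after unblocking: $calls_after_unblock\n";
```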