Some Advanced Perl Techniques

Contents:

Trapping Errors with eval
Picking Items from a List with grep
Transforming Items from a List with map
Unquoted Hash Keys
More Powerful Regular Expressions
Slices
Exercise

What we've put in the rest of this tutorial is the core of Perl, the part that every Perl user should understand. But there are a few other techniques that, while not obligatory, are still valuable tools to have in your toolbox. We've gathered the most important of those for this chapter.

Don't be misled by the title of the chapter, though; the techniques here aren't especially more difficult to understand than what we have elsewhere. They are "advanced" merely in the sense that they aren't necessary for beginners. The first time you read this tutorial, you may want to skip (or skim) this chapter so you can get right to using Perl. Come back to it a month or two later, when you're ready to get even more out of Perl. Consider this entire chapter a huge footnote[361].

[361]We contemplated doing that in one of the drafts, but got firmly rejected by Anonymous's editors.

Trapping Errors with eval

Sometimes, your ordinary, everyday code can cause a fatal error in your program. Each of these typical statements could crash a program:

$barney = $fred / $dino;         # divide-by-zero error?
print "match\n" if /^($wilma)/;  # illegal regular expression error?
open CAVEMAN, $fred              # user-generated error from die?
  or die "Can't open file '$fred' for input: $!";

You could go to some trouble to catch some of these, but it's hard to get them all. (How could you check the string $wilma from that example to ensure that it makes a valid regular expression?) Fortunately, Perl provides a simple way to catch fatal errors: wrap the code in an eval block:

eval {
 $barney = $fred / $dino
};

Now, even if $dino is zero, that line won't crash the program. The eval is actually an expression (not a control structure, like while or foreach) so that semicolon is required at the end of the block.

When a normally fatal error happens during the execution of an eval block, the block stops running, but the program doesn't crash. So right after an eval finishes, you'll want to know whether it exited normally or whether it caught a fatal error for you. The answer is in the special $@ variable. If the eval caught a fatal error, $@ will hold what would have been the program's dying words, perhaps something like "Illegal division by zero at my_program line 12." If there was no error, $@ will be empty. That makes $@ a useful Boolean (true/false) value, true if there was an error, so you'll sometimes see code like this after an eval block:

print "An error occurred: $@" if $@;
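As a small sketch of the pattern just described, the following wraps two risky operations in eval blocks and inspects $@ afterward. The subroutine names try_divide and pattern_error are our own, invented for illustration; the second one suggests one way you might answer the earlier question about checking whether a string makes a valid regular expression.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the quotient (or undef) together with
# whatever error text the eval trapped (empty string if none).
sub try_divide {
    my ($numerator, $denominator) = @_;
    my $result;
    eval {
        $result = $numerator / $denominator;
    };
    return ($result, $@);
}

# Hypothetical helper: returns the trapped error text if the string
# is not a valid regular expression, or an empty string if it is.
sub pattern_error {
    my ($pattern) = @_;
    eval {
        my $compiled = qr/^($pattern)/;   # dies here on a bad pattern
    };
    return $@;
}

my ($quotient, $error) = try_divide(10, 0);
print "An error occurred: $error" if $error;

my $complaint = pattern_error("(");
print "Bad pattern: $complaint" if $complaint;
```

Both helpers copy $@ out immediately after the eval, since a later eval elsewhere in the program would overwrite it.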

The eval block is a true block, so it makes a new scope for lexical (my) variables. This piece of a program shows an eval block hard at work:

foreach my $person (qw/ fred wilma betty barney dino pebbles /) {
  eval {
    open FILE, "<$person"
      or die "Can't open file '$person': $!";

    my($total, $count);
    while (<FILE>) {
      $total += $_;
      $count++;
    }

    my $average = $total/$count;
    print "Average for file $person was $average\n";

    &do_something($person, $average);
  };

  if ($@) {
    print "An error occurred ($@), continuing\n";
  }
}

How many possible fatal errors are being trapped here? If there is an error in opening the file, that error is trapped. Calculating the average may divide by zero, so that error is trapped. Even the call to the mysteriously named &do_something subroutine will be protected against fatal errors, because an eval block traps any otherwise-fatal errors that occur during the time that it's active. (This feature is handy if you have to call a subroutine written by someone else, and you don't know whether they've coded defensively enough to avoid crashing your program.)

If an error occurs during the processing of one of the files, we'll get an error message, but the program will go on to the next file without further complaint.

You can nest eval blocks inside other eval blocks. The inner one traps errors while it runs, keeping them from reaching the outer blocks. (Of course, after the inner eval finishes, if it caught an error, you may wish to re-post the error by using die, thereby letting the outer eval catch it.) An eval block traps any errors that occur during its execution, including errors that happen during subroutine calls (as we saw in the example earlier).
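Here is a minimal sketch of nesting, with the inner error re-posted so the outer eval can catch it. The @log array and the message texts are ours, used only to show which level saw what.

```perl
use strict;
use warnings;

my @log;   # records what each level saw (illustrative only)

eval {
    eval {
        die "inner problem\n";
    };
    if ($@) {
        push @log, "inner caught: $@";
        die "rethrown: $@";        # re-post so the outer eval sees it
    }
    push @log, "this line is never reached\n";
};
push @log, "outer caught: $@" if $@;

print @log;
```

Without the second die, the outer eval would finish normally and $@ would be empty afterward, because the inner eval already handled the error.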

We mentioned earlier that the eval is an expression, which is why the trailing semicolon is needed after the closing curly brace. But since it's an expression, it has a return value. If there's no error, it's like a subroutine: the return value is the last expression evaluated, or it's returned early with an optional return keyword. Here's another way to do the math without having to worry about divide-by-zero:

my $barney = eval { $fred / $dino };

If the eval traps a fatal error, the return value is either undef or an empty list, depending upon the context. So in the previous example, $barney is either the correct result from dividing, or it's undef; we don't really need to check $@ (although it's probably a good idea to check defined($barney) before we use it further).
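To make the return-value behavior concrete, here is a short sketch. The safe_divide subroutine is a name we made up for the example; it simply hands back whatever the eval returns, and the caller checks it with defined.

```perl
use strict;
use warnings;

# Hypothetical helper: the quotient, or undef if the division died.
sub safe_divide {
    my ($fred, $dino) = @_;
    my $barney = eval { $fred / $dino };
    return $barney;
}

foreach my $dino (4, 0) {
    my $barney = safe_divide(12, $dino);
    if (defined $barney) {
        print "12 / $dino = $barney\n";
    } else {
        print "12 / $dino could not be computed\n";
    }
}

# In list context, a trapped error gives an empty list instead of undef.
my @results = eval { die "no list for you\n" };
print "Got ", scalar(@results), " items back\n";
```

Checking defined rather than truth matters here, since a perfectly good quotient of zero would be false.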

There are four kinds of problems that eval can't trap. The first group are the very serious errors that crash Perl itself, such as running out of memory or getting an untrapped signal. Since Perl itself isn't running, there's no way it can trap these errors.[362]

[362]Some of these errors are listed with an (X) code on the perldiag manpage, if you're curious.

The second kind is syntax errors inside the eval block: they're caught at compile time, so they're never returned in $@.

The third is the exit operator, which terminates the program at once, even if it's called from a subroutine inside an eval block. (This correctly implies that when writing a subroutine, you should use die rather than exit to signal when something goes wrong.)

The fourth and final kind of problem that an eval block can't trap is warnings, either user-generated ones (from warn) or Perl's internally generated warnings (requested with the -w command-line option or the use warnings pragma). There's a separate mechanism from eval for trapping warnings; see the discussion of the __WARN__ pseudosignal in the Perl documentation for the details.
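As a rough sketch of that separate mechanism, the following installs a temporary __WARN__ handler with local and shows that the eval itself traps nothing. The @warnings array is a name of our choosing; the handler here merely collects messages, and a real program might log or reformat them instead.

```perl
use strict;
use warnings;

my @warnings;   # collected warning messages (hypothetical name)

{
    # Install the handler for this block only; local restores the
    # previous handler when the block exits.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    eval {
        warn "user-generated warning\n";
        my $undefined;
        my $sum = $undefined + 1;    # triggers an "uninitialized" warning
    };
    print "eval trapped nothing; \$\@ is empty\n" if $@ eq '';
}

print "Captured: $_" foreach @warnings;
```

Neither warning stops the eval block or sets $@; both are delivered to the handler instead.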

We should also mention that there's another form of eval that can be dangerous if it's mishandled. In fact, you'll sometimes run across someone who will say that you shouldn't use eval in your code for security reasons. They're (mostly) right that eval should be used only with great care, but they're talking about the other form of eval, sometimes called "eval of a string". If the keyword eval is followed directly by a block of code in curly braces, as we're doing here, there's no need to worry -- that's the safe kind of eval.