Loop Statements
All loop statements have an optional LABEL in their formal syntax. (You can put a label on any statement, but it has a special meaning to a loop.) If present, the label consists of an identifier followed by a colon. It's customary to make the label uppercase to avoid potential confusion with reserved words, and so it stands out better. And although Perl won't get confused if you use a label that already has a meaning like if or open, your readers might.
while and until Statements
The while statement repeatedly executes the block as long as EXPR is true. If the word while is replaced by the word until, the sense of the test is reversed; that is, it executes the block only as long as EXPR remains false. The conditional is still tested before the first iteration, though.
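As a minimal sketch of the two forms, the following pair of subroutines counts down from a number. The names countdown_while and countdown_until are made up for this illustration. Because the condition is tested before the first iteration, calling either one with 0 runs the block zero times.

```perl
use strict;
use warnings;

# Hypothetical helpers: collect the values seen while counting down from $n.
sub countdown_while {
    my ($n) = @_;
    my @seen;
    while ($n > 0) {        # block runs as long as the test is true
        push @seen, $n--;
    }
    return @seen;
}

sub countdown_until {
    my ($n) = @_;
    my @seen;
    until ($n <= 0) {       # block runs as long as the test is false
        push @seen, $n--;
    }
    return @seen;
}
```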
The while or until statement can have an optional extra block: the continue block. This block is executed every time the block is continued, either by falling off the end of the first block or by an explicit next (a loop-control operator that goes to the next iteration). The continue block is not heavily used in practice, but it's in here so we can define the for loop rigorously in the next section.
Unlike the foreach loop we'll see in a moment, a while loop never implicitly localizes any variables in its test condition. This can have "interesting" consequences when while loops use globals for loop variables. In particular, see the section "Line Input (Angle) Operator" in "Bits and Pieces" for how implicit assignment to the global $_ can occur in certain while loops, along with an example of how to deal with the problem by explicitly localizing $_. For other loop variables, however, it's best to declare them with my, as in the next example.
A variable declared in the test condition of a while or until statement is visible only in the block or blocks governed by that test. It is not part of the surrounding scope. For example:
    while (my $line = <STDIN>) {
        $line = lc $line;
    }
    continue {
        print $line;   # still visible
    }
    # $line now out of scope here
Here the scope of $line extends from its declaration in the control expression throughout the rest of the loop construct, including the continue block, but not beyond. If you want the scope to extend further, declare the variable before the loop.
for Loops
The three-part for loop has three semicolon-separated expressions within its parentheses. These expressions function respectively as the initialization, the condition, and the re-initialization expressions of the loop. All three expressions are optional (but not the semicolons); if omitted, the condition is always true. Thus, the three-part for loop can be defined in terms of the corresponding while loop. This:
LABEL: for (my $i = 1; $i <= 10; $i++) { ... }
is like this:
    {
        my $i = 1;
        LABEL: while ($i <= 10) {
            ...
        }
        continue {
            $i++;
        }
    }
except that there's not really an outer block. (We just put one there to show how the scope of the my is limited.)
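One consequence of this equivalence is that a next inside a three-part for loop still runs the re-initialization expression, just as a next inside the while form still runs the continue block. The sketch below shows both spellings side by side; the subroutine names are invented for this illustration.

```perl
use strict;
use warnings;

# Collect the odd numbers from 1 to 10 with a three-part for loop.
sub odds_with_for {
    my @odds;
    for (my $i = 1; $i <= 10; $i++) {
        next if $i % 2 == 0;    # $i++ still runs after next
        push @odds, $i;
    }
    return @odds;
}

# The same loop written as while plus continue.
sub odds_with_while {
    my @odds;
    my $i = 1;
    while ($i <= 10) {
        next if $i % 2 == 0;    # jumps to the continue block
        push @odds, $i;
    }
    continue {
        $i++;
    }
    return @odds;
}
```

If the increment were moved to the bottom of the while body instead of the continue block, the next would skip it and the loop would never terminate.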
If you want to iterate through two variables simultaneously, just separate the parallel expressions with commas:
    for ($i = 0, $bit = 1; $i < 32; $i++, $bit <<= 1) {
        print "Bit $i is set\n" if $mask & $bit;
    }
    # the values in $i and $bit persist past the loop
Or declare those variables to be visible only inside the for loop:
    for (my ($i, $bit) = (0, 1); $i < 32; $i++, $bit <<= 1) {
        print "Bit $i is set\n" if $mask & $bit;
    }
    # loop's versions of $i and $bit now out of scope
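To try the two-variable loop on real data, here is a small self-contained sketch that returns the positions of the set bits instead of printing them. The subroutine name set_bits and the sample mask are assumptions made for this example. Note that $bit has to start at 1: a bit that starts at 0 stays 0 no matter how often it is shifted.

```perl
use strict;
use warnings;

# Hypothetical helper: list the positions (0..31) of the bits set in $mask.
sub set_bits {
    my ($mask) = @_;
    my @set;
    for (my ($i, $bit) = (0, 1); $i < 32; $i++, $bit <<= 1) {
        push @set, $i if $mask & $bit;
    }
    return @set;
}

print "Bit $_ is set\n" for set_bits(0b1010);   # sample mask with bits 1 and 3
```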
Besides the normal looping through array indices, for can lend itself to many other interesting applications. It doesn't even need an explicit loop variable. Here's one example that avoids the problem you get when you explicitly test for end-of-file on an interactive file descriptor, causing your program to appear to hang.
    $on_a_tty = -t STDIN && -t STDOUT;
    sub prompt { print "yes? " if $on_a_tty }
    for ( prompt(); <STDIN>; prompt() ) {
        # do something
    }
Another traditional application for the three-part for loop results from the fact that all three expressions are optional, and the default condition is true. If you leave out all three expressions, you have written an infinite loop:
for (;;) { ... }
This is the same as writing:
while (1) { ... }
If the notion of infinite loops bothers you, we should point out that you can always fall out of the loop at any point with an explicit loop-control operator such as last. Of course, if you're writing the code to control a cruise missile, you may not actually need an explicit loop exit. The loop will be terminated automatically at the appropriate moment.[3]
[3] That is, the fallout from the loop tends to occur automatically.
foreach Loops
The foreach loop iterates over a list of values by setting the control variable (VAR) to each successive element of the list:
foreach VAR (LIST) { ... }
The foreach keyword is just a synonym for the for keyword, so you can use foreach and for interchangeably, whichever you think is more readable in a given situation. If VAR is omitted, the global $_ is used. (Don't worry--Perl can easily distinguish for (@ARGV) from for ($i=0; $i<=$#ARGV; $i++) because the latter contains semicolons.) Here are some examples:
    $sum = 0; foreach $value (@array) { $sum += $value }

    for $count (10,9,8,7,6,5,4,3,2,1,'BOOM') {  # do a countdown
        print "$count\n"; sleep(1);
    }

    for (reverse 'BOOM', 1 .. 10) {             # same thing
        print "$_\n"; sleep(1);
    }

    for $field (split /:/, $data) {             # any LIST expression
        print "Field contains: `$field'\n";
    }

    foreach $key (sort keys %hash) {
        print "$key => $hash{$key}\n";
    }
That last one is the canonical way to print out the keys and values of a hash, sorted by key. See the keys and sort entries in "Functions" for more elaborate examples.
There is no way with foreach to tell where you are in a list. You may compare adjacent elements by remembering the previous one in a variable, but sometimes you just have to break down and write a three-part for loop with subscripts. That's what the other kind of for loop is there for, after all.
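Both approaches can be sketched briefly. The first subroutine below remembers the previous element in a variable; the second falls back to subscripts because it needs to report a position. The names is_sorted and first_descent are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Compare neighbours in a foreach loop by remembering the previous element.
sub is_sorted {
    my @list = @_;
    my $prev;
    foreach my $item (@list) {
        return 0 if defined $prev && $item < $prev;
        $prev = $item;
    }
    return 1;
}

# When the position itself matters, use a three-part for loop with subscripts.
sub first_descent {
    my @list = @_;
    for (my $i = 1; $i < @list; $i++) {
        return $i if $list[$i] < $list[$i - 1];
    }
    return -1;    # no element is smaller than its predecessor
}
```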
If LIST consists entirely of assignable values (meaning variables, generally, not enumerated constants), you can modify each of those variables by modifying VAR inside the loop. That's because the foreach loop index variable is an implicit alias for each item in the list that you're looping over. Not only can you modify a single array in place, you can also modify multiple arrays and hashes in a single list:
    foreach $pay (@salaries) {              # grant 8% raises
        $pay *= 1.08;
    }

    for (@christmas, @easter) {             # change menu
        s/ham/turkey/;
    }
    s/ham/turkey/ for @christmas, @easter;  # same thing

    for ($scalar, @array, values %hash) {
        s/^\s+//;                           # strip leading whitespace
        s/\s+$//;                           # strip trailing whitespace
    }
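The flip side of aliasing is that a list of literal constants is not assignable, so trying to modify the loop variable raises a run-time error. The following sketch traps that error with eval so you can inspect it; the subroutine names are invented for the illustration.

```perl
use strict;
use warnings;

# Attempt to modify literal constants through the loop alias.
# Returns the error message, or an empty string if nothing went wrong.
sub try_increment_literals {
    my $ok = eval {
        for my $n (1, 2, 3) {   # literals are read-only
            $n++;
        }
        1;
    };
    return $ok ? '' : $@;
}

# Modifying a copy held in an array works, because array elements are assignable.
sub increment_all {
    my @nums = @_;
    $_++ for @nums;
    return @nums;
}
```

Copying the values into an array first, as increment_all does, is the usual way around the restriction.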
The loop variable is valid only from within the dynamic or lexical scope of the loop and will be implicitly lexical if the variable was previously declared with my. This renders it invisible to any function defined outside the lexical scope of the variable, even if called from within that loop. However, if no lexical declaration is in scope, the loop variable will be a localized (dynamically scoped) global variable; this allows functions called from within the loop to access that variable. In either case, any previous value the localized variable had before the loop will be restored automatically upon loop exit.
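The global case can be seen in a short sketch. Here $Item is a package variable (declared with our to satisfy strict), and current_item is a hypothetical function defined outside the loop. The function sees each element in turn, and the old value comes back once the loop is done.

```perl
use strict;
use warnings;

our $Item = 'before';                   # a package (global) variable

sub current_item { return $Item }       # defined outside the loop

# Returns what the function saw on each pass, followed by the value
# of $Item after the loop has finished.
sub items_seen_by_function {
    my @seen;
    for $Item (qw(a b)) {               # $Item is localized for the loop
        push @seen, current_item();
    }
    return (@seen, $Item);
}
```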
If you prefer, you may explicitly declare which kind of variable (lexical or global) to use. This makes it easier for maintainers of your code to know what's really going on; otherwise, they'll need to search back up through enclosing scopes for a previous declaration to figure out which kind of variable it is:
    for my $i    (1 .. 10) { ... }      # $i always lexical
    for our $Tick (1 .. 10) { ... }     # $Tick always global
When a declaration accompanies the loop variable, the shorter for spelling is preferred over foreach, since it reads better in English.
Here's how a C or Java developer might first think to code up a particular algorithm in Perl:
    for ($i = 0; $i < @ary1; $i++) {
        for ($j = 0; $j < @ary2; $j++) {
            if ($ary1[$i] > $ary2[$j]) {
                last;       # Can't go to outer loop. :-(
            }
            $ary1[$i] += $ary2[$j];
        }
        # this is where that last takes me
    }
But here's how a veteran Perl developer might do it:
    WID: foreach $this (@ary1) {
        JET: foreach $that (@ary2) {
            next WID if $this > $that;
            $this += $that;
        }
    }
See how much easier that was in idiomatic Perl? It's cleaner, safer, and faster. It's cleaner because it's less noisy. It's safer because if code gets added between the inner and outer loops later on, the new code won't be accidentally executed, since next (explained below) explicitly iterates the outer loop rather than merely breaking out of the inner one. And it's faster because Perl executes a foreach statement more rapidly than it would the equivalent for loop, since the elements are accessed directly instead of through subscripting.
But write it however you like. TMTOWTDI.
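For readers who want to run the labeled version, here is a self-contained sketch of the same algorithm wrapped in a subroutine. The name accumulate, the label OUTER, and the sample data are assumptions for this example; the subroutine works on a copy so the caller's array is left alone.

```perl
use strict;
use warnings;

# Add each element of @$ary2 to each element of a copy of @$ary1,
# moving on to the next outer element as soon as it exceeds the inner one.
sub accumulate {
    my ($ary1, $ary2) = @_;         # array references
    my @result = @$ary1;
    OUTER: foreach my $this (@result) {
        foreach my $that (@$ary2) {
            next OUTER if $this > $that;
            $this += $that;         # modifies @result through the alias
        }
    }
    return @result;
}

print join(' ', accumulate([1, 10], [2, 3])), "\n";
```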
Like the while statement, the foreach statement can also take a continue block. This lets you execute a bit of code at the bottom of each loop iteration no matter whether you got there in the normal course of events or through a next.
Speaking of which, now we can finally say it: next is next.
Loop Control
We mentioned that you can put a LABEL on a loop to give it a name. The loop's LABEL identifies the loop for the loop-control operators next, last, and redo. The LABEL names the loop as a whole, not just the top of the loop. Hence, a loop-control operator referring to the loop doesn't actually "go to" the loop label itself. As far as the computer is concerned, the label could just as easily have been placed at the end of the loop. But people like things labeled at the top, for some reason.
Loops are typically named for the item the loop is processing on each iteration. This interacts nicely with the loop-control operators, which are designed to read like English when used with an appropriate label and a statement modifier. The archetypal loop works on lines, so the archetypal loop label is LINE:, and the archetypal loop-control operator is something like this:
next LINE if /^#/; # discard comments
The syntax for the loop-control operators is:
    last LABEL
    next LABEL
    redo LABEL
The LABEL is optional; if omitted, the operator refers to the innermost enclosing loop. But if you want to jump past more than one level, you must use a LABEL to name the loop you want to affect. That LABEL does not have to be in your lexical scope, though it probably ought to be. But in fact, the LABEL can be anywhere in your dynamic scope. If this forces you to jump out of an eval or subroutine, Perl issues a warning (upon request).
Just as you may have as many return operators in a function as you like, you may have as many loop-control operators in a loop as you like. This is not to be considered wicked or even uncool. During the early days of structured programming, some people insisted that loops and subroutines have only one entry and one exit. The one-entry notion is still a good idea, but the one-exit notion has led people to write a lot of unnatural code. Much of programming consists of traversing decision trees. A decision tree naturally starts with a single trunk but ends with many leaves. Write your code with the number of loop exits (and function returns) that is natural to the problem you're trying to solve. If you've declared your variables with reasonable scopes, everything gets automatically cleaned up at the appropriate moment, no matter how you leave the block.
The last operator immediately exits the loop in question. The continue block, if any, is not executed. The following example bombs out of the loop on the first blank line:
    LINE: while (<STDIN>) {
        last LINE if /^$/;      # exit when done with mail header
        ...
    }
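That last skips the continue block can be checked with a small sketch that works on a list of strings rather than on STDIN. The subroutine name header_lines and its input are hypothetical; the counter in the continue block is bumped only for iterations that finish normally.

```perl
use strict;
use warnings;

# Count the lines that come before the first empty string.
sub header_lines {
    my @lines = @_;
    my $counted = 0;
    LINE: foreach my $line (@lines) {
        last LINE if $line eq '';   # blank line ends the header
    }
    continue {
        $counted++;                 # not run on the iteration that calls last
    }
    return $counted;
}
```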
The next operator skips the rest of the current iteration of the loop and starts the next one. If there is a continue clause on the loop, it is executed just before the condition is re-evaluated, just like the third component of a three-part for loop. Thus it can be used to increment a loop variable, even when a particular iteration of the loop has been interrupted by a next:
    LINE: while (<STDIN>) {
        next LINE if /^#/;      # skip comments
        next LINE if /^$/;      # skip blank lines
        ...
    }
    continue {
        $count++;
    }
The redo operator restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. This operator is often used by programs that want to fib to themselves about what was just input. Suppose you were processing a file that sometimes had a backslash at the end of a line to continue the record on the next line. Here's how you could use redo for that:
    while (<>) {
        chomp;
        if (s/\\$//) {
            $_ .= <>;
            redo unless eof;    # don't read past each file's eof
        }
        # now process $_
    }
which is the customary Perl shorthand for the more explicitly (and tediously) written version:
    LINE: while (defined($line = <ARGV>)) {
        chomp($line);
        if ($line =~ s/\\$//) {
            $line .= <ARGV>;
            redo LINE unless eof(ARGV);
        }
        # now process $line
    }
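To experiment with the technique without preparing input files, the sketch below applies the same redo trick to a string, read through an in-memory filehandle. The subroutine name join_continued is made up for this example, and it checks that a following line was actually read rather than calling eof, which is a small departure from the code above.

```perl
use strict;
use warnings;

# Join backslash-continued lines of $text into complete records.
sub join_continued {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!";
    my @records;
    my $line;
    LINE: while (defined($line = <$fh>)) {
        chomp $line;
        if ($line =~ s/\\$//) {         # trailing backslash: record continues
            my $more = <$fh>;
            if (defined $more) {
                $line .= $more;
                redo LINE;              # reprocess the longer $line
            }
        }
        push @records, $line;
    }
    close $fh;
    return @records;
}
```

Because redo does not re-evaluate the condition, $line keeps the joined text and the block simply runs again on it, chomping the newly appended newline.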
Here's an example from a real program that uses all three loop-control operators. Although this particular strategy of parsing command-line arguments is less common now that we have the Getopt::* modules bundled with Perl, it's still a nice illustration of the use of loop-control operators on named, nested loops:
    ARG: while (@ARGV && $ARGV[0] =~ s/^-(?=.)//) {
        OPT: for (shift @ARGV) {
            m/^$/       && do {                             next ARG; };
            m/^-$/      && do {                             last ARG; };
            s/^d//      && do { $Debug_Level++;             redo OPT; };
            s/^l//      && do { $Generate_Listing++;        redo OPT; };
            s/^i(.*)//  && do { $In_Place = $1 || ".bak";   next ARG; };
            say_usage("Unknown option: $_");
        }
    }
One more point about loop-control operators. You may have noticed that we are not calling them "statements". That's because they aren't statements--although like any expression, they can be used as statements. You can almost think of them as unary operators that just happen to cause a change in control flow. So you can use them anywhere it makes sense to use them in an expression. In fact, you can even use them where it doesn't make sense. One sometimes sees this coding error:
open FILE, $file or warn "Can't open $file: $!\n", next FILE; # WRONG
The intent is fine, but the next FILE is being parsed as one of the arguments to warn, which is a list operator. So the next executes before the warn gets a chance to emit the warning. In this case, it's easily fixed by turning the warn list operator into the warn function call with some suitably situated parentheses:
open FILE, $file or warn("Can't open $file: $!\n"), next FILE; # okay
However, you might find it easier to read this:
    unless (open FILE, $file) {
        warn "Can't open $file: $!\n";
        next FILE;
    }
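Put into a complete loop, the unless form might look like the sketch below. It swaps in a lexical filehandle and the three-argument form of open in place of the bareword FILE handle used above, and the subroutine name count_lines is an assumption for this example. The FILE label here names the loop, not a filehandle.

```perl
use strict;
use warnings;

# Return a hash mapping each readable file name to its line count.
# Files that cannot be opened are reported with warn and skipped.
sub count_lines {
    my @files = @_;
    my %lines;
    FILE: foreach my $file (@files) {
        my $fh;
        unless (open $fh, '<', $file) {
            warn "Can't open $file: $!\n";
            next FILE;
        }
        my $n = 0;
        while (defined(my $line = <$fh>)) {
            $n++;
        }
        close $fh;
        $lines{$file} = $n;
    }
    return %lines;
}
```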