Directory Handles
Another way to get a list of names from a given directory is with a directory handle. A directory handle looks and acts like a filehandle. You open it (with opendir instead of open), you read from it (with readdir instead of readline), and you close it (with closedir instead of close). But instead of reading the contents of a file, you're reading the names of files (and other things) in a directory. For example:
    my $dir_to_process = "/etc";
    opendir DH, $dir_to_process or die "Cannot open $dir_to_process: $!";
    foreach my $file (readdir DH) {
        print "one file in $dir_to_process is $file\n";
    }
    closedir DH;
Like filehandles, directory handles are automatically closed at the end of the program or if the directory handle is reopened onto another directory.
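The same open, read, and close sequence can be sketched with a lexical directory handle, that is, a my variable in place of the bareword DH. That choice is an assumption of this sketch rather than the form used above, and the helper name names_in is hypothetical.

```perl
use strict;
use warnings;

# names_in is a hypothetical helper, not a Perl built-in.
# It returns every name in a directory, in whatever order readdir gives.
sub names_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = readdir $dh;    # list context: all remaining entries at once
    closedir $dh;
    return @names;
}

# Usage: list the current directory.
print "one file in . is $_\n" for names_in(".");
```

Because the handle lives in a lexical variable, it cannot collide with a directory handle of the same name elsewhere in the program.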
Unlike globbing, which in older versions of Perl fired off a separate process, a directory handle never fires off another process. That makes directory handles more efficient for applications that demand every ounce of power from the machine. However, reading a directory handle is also a lower-level operation, meaning that we have to do more of the work ourselves.
For example, the names are returned in no particular order.[280] And the list includes all files, not just those matching a particular pattern (like *.pm from our globbing examples). That means it also includes the dot files, and in particular the dot and dot-dot entries.[281]
[280]It's actually the unsorted order of the directory entries, similar to the order you get from ls -f or find.
[281]Do not make the mistake of many old Unix programs and presume that dot and dot-dot are always returned as the first two entries (sorted or not). If that hadn't even occurred to you, pretend we never said it, because it's a false presumption. In fact, we're now sorry for even bringing it up.
So, if we wanted only the pm-ending files, we could use a skip-over function inside the loop:
    while (my $name = readdir DIR) {
        next unless $name =~ /\.pm$/;
        # ... more processing ...
    }
Note here that the syntax is that of a regular expression, not a glob. And if we wanted all the non-dot files, we could say that:
next if $name =~ /^\./;
Or if we wanted everything but the common dot (current directory) and dot-dot (parent directory) entries, we could explicitly say that:
next if $name eq "." or $name eq "..";
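The three tests above can also be written as list filters with grep, and since readdir promises no order, the sketch below sorts each result. The helper names pm_files, non_dot, and all_but_dots are hypothetical, and the sample list stands in for what readdir might return.

```perl
use strict;
use warnings;

# Hypothetical helpers: each takes a list of names (as readdir would
# return them) and keeps only the wanted ones, sorted as strings.
sub pm_files {
    my @keep = sort grep { /\.pm$/ } @_;          # names ending in .pm
    return @keep;
}

sub non_dot {
    my @keep = sort grep { !/^\./ } @_;           # names not starting with a dot
    return @keep;
}

sub all_but_dots {
    my @keep = sort grep { $_ ne "." and $_ ne ".." } @_;   # drop only . and ..
    return @keep;
}

# Made-up sample data standing in for a readdir result.
my @names = (".", "..", ".profile", "Fred.pm", "barney.txt", "Wilma.pm");
print join(" ", pm_files(@names)), "\n";       # Fred.pm Wilma.pm
print join(" ", non_dot(@names)), "\n";        # Fred.pm Wilma.pm barney.txt
print join(" ", all_but_dots(@names)), "\n";   # .profile Fred.pm Wilma.pm barney.txt
```

The default sort compares strings by character code, which is why the capitalized names come before barney.txt.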
Now we'll look at the part that gets most people mixed up, so pay close attention. The filenames returned by the readdir operator have no pathname component. Each is just the name within the directory. So, we're not looking at /etc/passwd, we're just looking at passwd. (And because this is another difference from the globbing operation, it's easy to see how people get confused.)
So you'll need to patch up the name to get the full name:
    opendir SOMEDIR, $dirname or die "Cannot open $dirname: $!";
    while (my $name = readdir SOMEDIR) {
        next if $name =~ /^\./;              # skip over dot files
        $name = "$dirname/$name";            # patch up the path
        next unless -f $name and -r $name;   # only readable files
        ...
    }
Without the patch, the file tests would have been checking files in the current directory, rather than in the directory named in $dirname. This is the single most common mistake when using directory handles.
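As a minimal sketch of the patched-up loop in reusable form, the hypothetical helper readable_files below returns full paths rather than bare names. It swaps the "$dirname/$name" interpolation for File::Spec->catfile from the standard library, and it uses a lexical handle; both are choices made for this sketch, not requirements.

```perl
use strict;
use warnings;
use File::Spec;

# readable_files is a hypothetical helper: it returns the full paths of
# the readable plain files in $dirname, skipping dot files.
sub readable_files {
    my ($dirname) = @_;
    opendir(my $dh, $dirname) or die "Cannot open $dirname: $!";
    my @paths;
    # explicit defined() so a file named "0" cannot end the loop early
    while (defined(my $name = readdir $dh)) {
        next if $name =~ /^\./;                           # skip over dot files
        my $path = File::Spec->catfile($dirname, $name);  # patch up the path
        next unless -f $path and -r $path;                # only readable plain files
        push @paths, $path;
    }
    closedir $dh;
    my @sorted = sort @paths;
    return @sorted;
}

# Usage: print the readable plain files in the current directory.
print "$_\n" for readable_files(".");
```

Keeping the bare name in $name and the full path in a separate $path makes it harder to run a file test on the wrong one by accident.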