File::Glob
use File::Glob ':glob';   # Override glob built-in.
@list = <*.[Cchy]>;       # Now uses POSIX glob, not csh glob.

use File::Glob qw(:glob csh_glob);
@sources = bsd_glob("*.{C,c,h,y,pm,xs}", GLOB_CSH);
@sources = csh_glob("*.{C,c,h,y,pm,xs}");   # (same thing)

use File::Glob ':glob';
# call glob with extra arguments
$homedir = bsd_glob('~jrhacker', GLOB_TILDE | GLOB_ERR);
if (GLOB_ERROR) {
    # An error occurred expanding the home directory.
}
The File::Glob module's bsd_glob function implements the glob(3) routine from the C library. An optional second argument contains flags governing additional matching properties. The :glob import tag imports both the function and the necessary flags.
The module also implements a csh_glob function. This is what the built-in Perl glob and <GLOBPAT> fileglobbing operators really call. Calling csh_glob is (mostly) like calling bsd_glob this way:
bsd_glob(@_ ? $_[0] : $_, GLOB_BRACE | GLOB_NOMAGIC | GLOB_QUOTE | GLOB_TILDE);
If you import the :glob tag, then all calls to the built-in fileglobbing operators in the current package will really call the module's bsd_glob function instead of its csh_glob function. One reason you might want to do this is that, although bsd_glob handles patterns with whitespace in them correctly, csh_glob handles them, um, in the historical fashion: it splits the pattern on whitespace, which is why old scripts would write <*.c *.h> to glob both of those. Neither function is bothered by whitespace in the actual filenames, however.
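The whitespace difference is easy to see side by side. The following is only a sketch, assuming a Unix-like system; the scratch directory and the three filenames are invented for illustration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(:glob csh_glob);

# Work in a scratch directory holding three hypothetical files.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";
for my $name ('main.c', 'main.h', 'my notes.txt') {
    open my $fh, '>', $name or die "Cannot create $name: $!";
    close $fh;
}

# csh_glob splits on whitespace, so this is two patterns.
my @split = csh_glob('*.c *.h');

# bsd_glob sees one pattern containing a space; nothing here matches it.
my @whole = bsd_glob('*.c *.h');

# A space in a bsd_glob pattern simply matches a space in the filename.
my @notes = bsd_glob('my n*');

print "csh_glob: @split\n";
print "bsd_glob: ", scalar(@whole), " match(es)\n";
print "notes:    @notes\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir $start or die "Cannot chdir back to $start: $!";
```

Both calls are made in list context here; csh_glob behaves as an iterator in scalar context, which this sketch does not exercise.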
The bsd_glob function takes an argument containing the fileglobbing pattern (not a regular expression pattern) plus an optional flags argument. Filenames with a leading dot are not matched unless specifically requested. The return value is influenced by the flags in the second argument, which should be bitwise ORed together:[3]
[3] Due to restrictions in the syntax of the built-in glob operator, you may need to call the function as bsd_glob if you want to pass it the second argument.
GLOB_BRACE
- Preprocess the string to expand {pat,pat,...} strings as csh(1) would. The pattern {} is left unexpanded for historical reasons, mostly to ease typing of find(1) patterns.
GLOB_CSH
- Synonym for GLOB_BRACE | GLOB_NOMAGIC | GLOB_QUOTE | GLOB_TILDE.
GLOB_ERR
- Return an error when bsd_glob encounters a directory it cannot open or read. Ordinarily, bsd_glob skips over the error, looking for more matches.
GLOB_MARK
- Append a slash to each returned value that is a directory.
GLOB_NOCASE
- Make bsd_glob treat case differences as insignificant. (But see below for exceptions on MS-DOSish systems.)
GLOB_NOCHECK
- If the pattern does not match any pathname, make bsd_glob return a list consisting of only the pattern, as /bin/sh does. If GLOB_QUOTE is set, its effect is present in the pattern returned.
GLOB_NOMAGIC
- Same as GLOB_NOCHECK, but it only returns the pattern if it does not contain any of the special characters *, ?, or [. NOMAGIC is provided to simplify implementing the historic csh(1) globbing behavior and should probably not be used anywhere else.
GLOB_NOSORT
- By default, the pathnames are sorted in ascending order (using normal character comparisons irrespective of locale setting). This flag prevents that sorting for a small increase in speed.
GLOB_QUOTE
- Use the backslash character for quoting: every occurrence of a backslash followed by a character in the pattern is replaced by that character, avoiding any special interpretation of the character. (But see below for exceptions on MS-DOSish systems.)
GLOB_TILDE
- Allow patterns whose first path component is ~USER. If USER is omitted, the tilde by itself (or followed by a slash) represents the current user's home directory.
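A few of these flags can be tried out in a scratch directory. This is a minimal sketch, assuming a Unix-like system; the directory layout and filenames (foo.c, foo.h, .hidden, lib) are made up for the example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob ':glob';

# Scratch directory: two source files, one dotfile, one subdirectory.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";
mkdir 'lib' or die "Cannot mkdir lib: $!";
for my $name ('foo.c', 'foo.h', '.hidden') {
    open my $fh, '>', $name or die "Cannot create $name: $!";
    close $fh;
}

# GLOB_BRACE: {c,h} is expanded before matching.
my @braced = bsd_glob('foo.{c,h}', GLOB_BRACE);

# GLOB_MARK: directories are returned with a trailing slash.
my @marked = bsd_glob('*', GLOB_MARK);

# With no flags set, a pattern that matches nothing yields an empty list;
# GLOB_NOCHECK returns the pattern itself instead.
my @none    = bsd_glob('*.xyz', 0);
my @literal = bsd_glob('*.xyz', GLOB_NOCHECK);

# A leading dot is matched only when the pattern asks for it.
my @dotted = bsd_glob('.h*', 0);

print "brace:   @braced\n";
print "mark:    @marked\n";
print "nocheck: @literal (", scalar(@none), " without it)\n";
print "dotted:  @dotted\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir $start or die "Cannot chdir back to $start: $!";
```

Passing an explicit 0 as the second argument turns every flag off, which is different from omitting the argument and getting the defaults.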
The bsd_glob function returns a (possibly empty) list of matching paths, which will be tainted if that matters to your program. On error, GLOB_ERROR will be true and $! ($OS_ERROR) will be set to the standard system error. GLOB_ERROR is guaranteed to be false if no error occurred, and to be either GLOB_ABEND or GLOB_NOSPACE otherwise. (GLOB_ABEND means that bsd_glob was stopped due to some error, GLOB_NOSPACE because it ran out of memory.) If bsd_glob had already found some matching paths when the error occurred, it returns the list of filenames found so far, and also sets GLOB_ERROR. Note that this implementation of bsd_glob varies from most others by not considering ENOENT and ENOTDIR as terminating error conditions. Instead, it continues processing despite those errors, unless the GLOB_ERR flag is set.
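One way to act on GLOB_ERROR is to wrap the call in a small helper. What follows is a sketch only: glob_strict is a hypothetical name, not part of File::Glob, and the scratch directory and its single file exist just to give the call something to find.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Glob ':glob';

# glob_strict is a hypothetical helper, not part of File::Glob.
# It returns the error status followed by whatever paths were found.
sub glob_strict {
    my ($pattern) = @_;
    my @paths = bsd_glob($pattern, GLOB_ERR);

    # Capture both values immediately, before anything can disturb $!.
    my ($err, $os_error) = (GLOB_ERROR(), "$!");
    if ($err) {
        my $why = $err == GLOB_NOSPACE() ? 'ran out of memory' : 'was stopped';
        warn "bsd_glob('$pattern') $why: $os_error\n";
    }
    return ($err, @paths);
}

# A readable scratch directory with one file: no error is expected here.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/example.txt" or die "Cannot create file: $!";
close $fh;

my ($err, @paths) = glob_strict("$dir/*");
print scalar(@paths), " path(s), error status ", ($err ? 'set' : 'clear'), "\n";
```

Because a partial list may be returned along with an error, the helper hands back both and leaves it to the caller to decide whether the partial result is usable.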
If no flag argument is supplied, your system's defaults are followed, meaning that filenames differing only in case are indistinguishable from one another on VMS, OS/2, old Mac OS (but not Mac OS X), and Microsoft systems (but not when Perl was built with Cygwin). If you supply any flags at all and still want this behavior, then you must include GLOB_NOCASE in the flags. Whatever system you're on, you can change your defaults up front by importing the :case or :nocase flags.
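The interaction between the import-time default and explicit flags can be sketched as follows, assuming a Unix-like system; the file README.TXT is invented for the example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(:glob :nocase);   # :nocase adds GLOB_NOCASE to the defaults

# Scratch directory with one uppercase filename.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";
open my $fh, '>', 'README.TXT' or die "Cannot create README.TXT: $!";
close $fh;

# No flags argument: the defaults apply, now including GLOB_NOCASE.
my @default = bsd_glob('readme.*');

# Explicit flags replace the defaults, so GLOB_NOCASE has to be restated.
my @explicit = bsd_glob('readme.*', GLOB_BRACE | GLOB_NOCASE);

print "default flags:  @default\n";
print "explicit flags: @explicit\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir $start or die "Cannot chdir back to $start: $!";
```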
On MS-DOSish systems, the backslash is a valid directory separator character.[4] In this case, use of backslash as a quoting character (via GLOB_QUOTE) interferes with the use of backslash as a directory separator. The best (simplest, most portable) solution is to use slashes for directory separators, backslashes for quoting. However, this does not match some users' expectations, so backslashes (under GLOB_QUOTE) quote only the glob metacharacters [, ], {, }, -, ~, and backslash itself. All other backslashes are passed through unchanged, if you can manage to get them by Perl's own backslash quoting in strings. It may take as many as four backslashes to finally match one in the filesystem. This is so completely insane that even MS-DOSish users should strongly consider using slashes. If you really want to use backslashes, look into the standard File::DosGlob module, as it might be more to your liking than Unix-flavored fileglobbing.
[4] Although technically, so is a slash--at least as far as those kernels and syscalls are concerned; command shells are remarkably less enlightened.