Filename Generation/Pathname Expansion

Wildcards, globbing

When you give the shell abbreviated filenames that contain special characters, also called metacharacters, the shell can generate filenames that match the names of existing files. These special characters are also referred to as wildcards because they act as the jokers do in a deck of cards. When one of these characters appears in an argument on the command line, the shell expands that argument into a sorted list of filenames and passes the list to the program that the command line calls. Filenames that contain these special characters are called ambiguous file references because they do not refer to any one specific file. The process that the shell performs on these filenames is called pathname expansion or globbing.

Ambiguous file references let you refer to a group of files with similar names quickly, saving you the effort of typing the names individually. They can also help you find a file whose name you do not remember in its entirety. If no filename matches the ambiguous file reference, the shell generally passes the unexpanded reference (special characters and all) to the command.

The ? Special Character

The question mark (?) is a special character that causes the shell to generate filenames. It matches any single character in the name of an existing file. The following command uses this special character in an argument to the lpr utility:

$ lpr memo?

The shell expands the memo? argument and generates a list of files in the working directory that have names composed of memo followed by any single character. The shell then passes this list to lpr. The lpr utility never "knows" that the shell generated the filenames it was called with. If no filename matches the ambiguous file reference, the shell passes the string itself (memo?) to lpr or, if it is set up to do so (see the bash nullglob option), removes the argument so that lpr receives nothing in its place.
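The no-match behavior described above can be observed in an empty directory. The following is a minimal sketch, not a definitive test: the scratch directory and the variable names are made up for illustration, and the nullglob part assumes bash (it is skipped in other shells).

```shell
# Work in an empty scratch directory so that nothing can match memo?.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Default behavior: the unmatched pattern reaches the command unchanged.
unmatched=$(echo memo?)
echo "$unmatched"

# bash only: with nullglob set, an unmatched pattern is removed,
# so echo is called with no arguments and prints an empty line.
if [ -n "${BASH_VERSION:-}" ]; then
    shopt -s nullglob
    removed=$(echo memo?)
    shopt -u nullglob
    echo "[$removed]"
fi
```

Under these assumptions the first echo displays memo? and, in bash, the second displays a pair of empty brackets.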

The following example uses ls first to display the names of all files in the working directory and then to display the filenames that memo? matches:

$ ls
mem   memo12  memo9  memoalex  newmemo5
memo  memo5   memoa  memos
$ ls memo?
memo5  memo9  memoa  memos

The memo? ambiguous file reference does not match mem, memo, memo12, memoalex, or newmemo5. You can also use a question mark in the middle of an ambiguous file reference:

$ ls
7may4report  may4report      mayqreport  may_report
may14report  may4report.79   mayreport
$ ls may?report
may4report  may_report  mayqreport

To practice generating filenames, you can use echo and ls. The echo utility displays the arguments that the shell passes to it:

$ echo may?report
may4report may_report mayqreport

The shell first expands the ambiguous file reference into a list of all files in the working directory that match the string may?report and then passes this list to echo, as though you had entered the list of filenames as arguments to echo. Next echo displays the list of filenames.
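One low-risk way to practice is to recreate the example files in a throwaway directory and experiment there. The sketch below assumes the C locale (set explicitly so the sort order is predictable) and uses a scratch directory created with mktemp; both choices are illustrative rather than part of the original example.

```shell
# Assumption: C locale, so the shell sorts the matches in plain ASCII order.
export LC_ALL=C

scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Recreate the files from the listing above as empty files.
touch 7may4report may14report may4report may4report.79 \
      mayqreport mayreport may_report

# The shell expands may?report before echo runs.
matches=$(echo may?report)
echo "$matches"
```

Only names with exactly one character between may and report match, so mayreport, may14report, 7may4report, and may4report.79 are left out.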

A question mark does not match a leading period (one that indicates a hidden filename). When you want to match filenames that begin with a period, you must explicitly include the period in the ambiguous file reference.
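The leading-period rule can be checked with two files whose names differ only in the first character. This is a small sketch with made-up filenames (.memo and amemo), assuming the shell's default settings (for example, the bash dotglob option is off).

```shell
export LC_ALL=C
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch .memo amemo

# The ? does not match the leading period, so only amemo is generated.
implicit=$(echo ?memo)

# With the period written explicitly, the hidden file is matched.
explicit=$(echo .mem?)

echo "$implicit"
echo "$explicit"
```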

The * Special Character

The asterisk (*) performs a function similar to that of the question mark but matches any number of characters, including zero characters, in a filename. The following example shows all of the files in the working directory and then shows three commands that display all the filenames that begin with the string memo, end with the string mo, and contain the string alx:

$ ls
amemo   memo       memoalx.0620  memosally  user.memo
mem     memo.0612  memoalx.keep  sallymemo
memalx  memoa      memorandum    typescript
$ echo memo*
memo memo.0612 memoa memoalx.0620 memoalx.keep memorandum memosally
$ echo *mo
amemo memo sallymemo user.memo
$ echo *alx*
memalx memoalx.0620 memoalx.keep

The ambiguous file reference memo* does not match amemo, mem, sallymemo, or user.memo. Like the question mark, an asterisk does not match a leading period in a filename.

The -a option causes ls to display hidden filenames. The command echo * does not display . (the working directory), .. (the parent of the working directory), .aaa, or .profile. In contrast, the command echo .* displays only those four names:

$ ls
aaa        memo.sally  sally.0612  thurs
memo.0612  report      saturday
$ ls -a
.   .aaa      aaa        memo.sally  sally.0612  thurs
..  .profile  memo.0612  report      saturday
$ echo *
aaa memo.0612 memo.sally report sally.0612 saturday thurs
$ echo .*
. .. .aaa .profile

In the following example .p* does not match memo.0612, private, reminder, or report. Next the ls .* command causes ls to list .private and .profile in addition to the contents of the . directory (the working directory) and the .. directory (the parent of the working directory). When called with the same argument, echo displays the names of files (including directories) in the working directory that begin with a dot (.), but not the contents of directories.

$ ls -a
.       .private   memo.0612  reminder
..      .profile   private    report
$ echo .p*
.private .profile
$ ls .*
.private .profile
memo.0612  private    reminder   report
$ echo .*
. .. .private .profile

You can take advantage of ambiguous file references when you establish conventions for naming files. For example, when you end all text filenames with .txt, you can reference that group of files with *.txt. The next command uses this convention to send all the text files in the working directory to the printer. The ampersand causes lpr to run in the background.

$ lpr *.txt &

The [ ] Special Characters

A pair of brackets surrounding a list of characters causes the shell to match filenames containing the individual characters. Whereas memo? matches memo followed by any character, memo[17a] is more restrictive and matches only memo1, memo7, and memoa. The brackets define a character class that includes all the characters within the brackets. (GNU calls this a character list; a GNU character class is something different.) The shell expands an argument that includes a character-class definition by substituting each member of the character class, one at a time, in place of the brackets and their contents, keeping only the names of files that exist. The shell then passes the list of matching filenames to the program it is calling.

Each character-class definition can replace only a single character within a filename. The brackets and their contents are like a question mark that substitutes only the members of the character class.
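The difference between ? and a character class can be seen side by side. The following sketch uses illustrative filenames (memo5, memo17, and memob are additions for contrast) and assumes the C locale for a predictable order.

```shell
export LC_ALL=C
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch memo1 memo5 memo7 memo17 memoa memob

# ? accepts any single character after memo.
any_char=$(echo memo?)

# [17a] accepts only 1, 7, or a in that one position.
class=$(echo memo[17a])

echo "$any_char"
echo "$class"
```

Neither pattern generates memo17, because both stand for exactly one character.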

The first of the following commands lists the names of all the files in the working directory that begin with a, e, i, o, or u. The second command displays the contents of the files named page2.txt, page4.txt, page6.txt, and page8.txt.

$ echo [aeiou]*
$ less page[2468].txt

A hyphen within brackets defines a range of characters within a character-class definition. For example, [6-9] represents [6789], [a-z] represents all lowercase letters in English, and [a-zA-Z] represents all letters, both uppercase and lowercase, in English.

The following command lines show three ways to print the files named part0, part1, part2, part3, and part5. Each of these command lines causes the shell to call lpr with five filenames:

$ lpr part0 part1 part2 part3 part5
$ lpr part[01235]
$ lpr part[0-35]

The first command line explicitly specifies the five filenames. The second and third command lines use ambiguous file references, incorporating character-class definitions. The shell expands the argument on the second command line to include all files that have names beginning with part and ending with any of the characters in the character class. The character class is explicitly defined as 0, 1, 2, 3, and 5. The third command line also uses a character-class definition but defines the character class to be all characters in the range 0-3 plus 5.

The following command line prints 39 files, part0 through part38:

$ lpr part[0-9] part[12][0-9] part3[0-8]
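The count of 39 can be verified without a printer by letting the shell expand the same three patterns and counting the resulting arguments. This sketch assumes a scratch directory holding files part0 through part45 (the upper bound of 45 is arbitrary, chosen so that some files fall outside the patterns).

```shell
export LC_ALL=C
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create empty files part0 through part45.
i=0
while [ "$i" -le 45 ]; do
    : > "part$i"
    i=$((i + 1))
done

# Let the shell expand the patterns into the positional parameters,
# then count them: 10 (part0-9) + 20 (part10-29) + 9 (part30-38).
set -- part[0-9] part[12][0-9] part3[0-8]
count=$#
echo "$count"
```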

The next two examples list the names of some of the files in the working directory. The first lists the files whose names start with a through m. The second lists files whose names end with x, y, or z.

$ echo [a-m]*
$ echo *[x-z]


When an exclamation point (!) or a caret (^) immediately follows the opening bracket ([) that defines a character class, the string enclosed by the brackets matches any character not between the brackets. Thus [^ab]* matches any filename that does not begin with a or b.

The following examples show that *[^ab] matches filenames that do not end with the letters a or b and that [b-d]* matches filenames that begin with b, c, or d.

$ ls
aa  ab  ac  ad  ba  bb  bc  bd  cc  dd
$ ls *[^ab]
ac  ad  bc  bd  cc  dd
$ ls [b-d]*
ba  bb  bc  bd  cc  dd
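The same listing can be reproduced with echo. The sketch below uses the exclamation point form only, since that is the negation character the POSIX pattern rules specify; support for the caret varies between shells. The C locale is assumed so that the order and the range are predictable.

```shell
export LC_ALL=C
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch aa ab ac ad ba bb bc bd cc dd

# Names whose last character is neither a nor b.
not_ab=$(echo *[!ab])

# Names whose first character is b, c, or d.
b_to_d=$(echo [b-d]*)

echo "$not_ab"
echo "$b_to_d"
```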

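You can match a hyphen (-) by placing it immediately after the opening bracket or immediately before the closing bracket. You can match a closing bracket (]) by placing it immediately after the opening bracket (or after the ! or ^ in a negated class).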
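Matching these two characters literally can be tried as follows. This is a sketch assuming the POSIX bracket rules, under which a closing bracket listed first and a hyphen listed last are both taken literally; the filenames are made up for the demonstration.

```shell
export LC_ALL=C
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Quote the names so the shell does not treat ] specially when creating them.
touch 'x-' 'x]' xa

# In []-] the ] comes first and the - comes last, so both are literal.
both=$(echo x[]-])
echo "$both"
```

The file xa is not generated because a is not a member of the class.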

The next example demonstrates that the ls utility cannot interpret ambiguous file references. First ls is called with an argument of ?old. The shell expands ?old into a matching filename, hold, and passes that name to ls. The second command is the same as the first, except the ? is quoted with a backslash. Because of the quoting, the shell does not recognize this question mark as a special character and passes it on to ls. The ls utility generates an error message saying that it cannot find a file named ?old (because there is no file named ?old).

$ ls ?old
hold
$ ls \?old
ls: ?old: No such file or directory

Like most utilities and programs, ls cannot interpret ambiguous file references; that work is left to the shell.

Tip: The shell expands ambiguous file references

The shell does the expansion when it processes an ambiguous file reference, not the program that the shell runs. In the examples in this section, the utilities (ls, less, echo, lpr) never see the ambiguous file references. The shell expands the ambiguous file references and passes a list of ordinary filenames to the utility. In the previous examples, echo shows this to be true because it simply displays its arguments; it never displays the ambiguous file reference.
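A small shell function that reports its own arguments makes the point concrete. The function name show_args and the filenames are hypothetical, introduced only for this sketch; it stands in for any utility the shell might call.

```shell
export LC_ALL=C
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch hold told

# Print the number of arguments received, then each argument in <>.
show_args() {
    printf '%d:' "$#"
    printf ' <%s>' "$@"
    printf '\n'
}

# Unquoted: the shell expands ?old before show_args is called.
unquoted=$(show_args ?old)

# Quoted: the shell passes the literal string ?old.
quoted=$(show_args \?old)

echo "$unquoted"
echo "$quoted"
```

In the unquoted call the function receives two ordinary filenames and has no way to tell that a pattern was typed; in the quoted call it receives the single string ?old.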