Special Characters
You can use special characters within a regular expression to cause the regular expression to match more than one string. A regular expression that includes a special character always matches the longest possible string, starting as far toward the beginning (left) of the line as possible.
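The leftmost-longest rule can be checked with a short sketch. It uses Python's re module as a stand-in, which is an assumption: Python's syntax differs in places from the regular expressions described here, but for a simple pattern such as ab.*c (covered under Asterisks below) its behavior agrees with the rule. The sample line is made up for illustration.

```python
import re

# Hypothetical sample line containing two candidate endings for the match.
line = "xabc abcd"
match = re.search(r"ab.*c", line)

# The match begins at the first "ab" on the line (leftmost) and runs to the
# last "c" (longest), not to the first "c" that would satisfy the pattern.
print(match.start(), repr(match.group()))
```

The shorter string abc would also satisfy the pattern, but the longer match starting at the same position is the one reported.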
Periods
A period (.) matches any character (Table A-2).
Table A-2. Period
Regular expression | Matches | Examples
/ .alk/ | All strings consisting of a SPACE followed by any character followed by alk | will talk, may balk
/.ing/ | All strings consisting of any character preceding ing | sing song, ping, before inglenook
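The rows of Table A-2 can be tried out with a minimal sketch. It assumes Python's re module as a stand-in for the regular expressions described here; the helper first_match is a hypothetical name introduced only for this illustration.

```python
import re

def first_match(pattern, text):
    """Return the leftmost match of pattern in text, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None

# A SPACE, then any character, then alk.
print(repr(first_match(r" .alk", "will talk")))        # ' talk'
print(repr(first_match(r" .alk", "may balk")))         # ' balk'

# Any single character followed by ing.
print(repr(first_match(r".ing", "sing song")))         # 'sing'
print(repr(first_match(r".ing", "before inglenook")))  # ' ing'

# The period must match some character, so a bare "ing" does not match.
print(repr(first_match(r".ing", "ing")))               # None
```

In before inglenook the character that the period matches is the SPACE in front of inglenook.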
Brackets
Brackets ([]) define a character class[1] that matches any single character within the brackets (Table A-3). If the first character following the left bracket is a caret (^), the brackets define a character class that matches any single character not within the brackets. You can use a hyphen to indicate a range of characters. Within a character-class definition, backslashes and asterisks (described in the following sections) lose their special meanings. A right bracket (appearing as a member of the character class) can appear only as the first character following the left bracket. A caret is special only if it is the first character following the left bracket. A dollar sign is special only if it is followed immediately by the right bracket.
[1] GNU documentation calls these List Operators and defines Character Class operators as expressions that match a predefined group of characters, such as all numbers (page 1024).
Table A-3. Brackets
Regular expression | Matches | Examples
/[bB]ill/ | A member of the character class b and B, followed by ill | bill, Bill, billed
/t[aeiou].k/ | t followed by a lowercase vowel, any character, and a k | talkative, stink, teak, tanker
/# [6-9]/ | # followed by a SPACE and a member of the character class 6 through 9 | # 60, # 8:, get # 9
/[^a-zA-Z]/ | Any character that is not a letter (ASCII character set only) | 1, 7, @, ., }, Stop!
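The character classes in Table A-3 can be exercised with a short sketch, again assuming Python's re module as a stand-in and a hypothetical helper named first_match. One difference is worth flagging: in Python a backslash keeps its special meaning inside brackets, so only the asterisk case from the paragraph above is shown.

```python
import re

def first_match(pattern, text):
    """Return the leftmost match of pattern in text, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None

print(first_match(r"[bB]ill", "billed"))     # bill
print(first_match(r"t[aeiou].k", "stink"))   # tink
print(first_match(r"t[aeiou].k", "tanker"))  # tank
print(first_match(r"# [6-9]", "# 60"))       # "# 6": the range matches one character
print(first_match(r"[^a-zA-Z]", "Stop!"))    # !
print(first_match(r"[^a-zA-Z]", "Stop"))     # None: every character is a letter

# Inside brackets an asterisk is an ordinary character.
print(first_match(r"[*]", "2*3"))            # *
```

Each class matches exactly one character, which is why # 60 yields only "# 6".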
Asterisks
An asterisk can follow a regular expression that represents a single character (Table A-4). The asterisk represents zero or more occurrences of a match of the regular expression. An asterisk following a period matches any string of characters. (A period matches any character, and an asterisk matches zero or more occurrences of the preceding regular expression.) A character-class definition followed by an asterisk matches any string of characters that are members of the character class.
Table A-4. Asterisks
Regular expression | Matches | Examples
/ab*c/ | a followed by zero or more b's followed by a c | ac, abc, abbc, debbcaabbbc
/ab.*c/ | ab followed by zero or more characters followed by c | abc, abxc, ab45c, xab 756.345 x cat
/t.*ing/ | t followed by zero or more characters followed by ing | thing, ting, I thought of going
/[a-zA-Z ]*/ | A string composed only of letters and SPACEs | 1. any string without numbers or punctuation!
/(.*)/ | As long a string as possible between ( and ) | Get (this) and (that);
/([^)]*)/ | The shortest string possible that starts with ( and ends with ) | (this), Get (this and that)
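The sketch below runs the patterns of Table A-4 through Python's re module, used here as an assumed stand-in. Python treats unquoted parentheses as grouping operators, so the last two patterns are written with \( and \) to match literal parentheses; first_match is a hypothetical helper.

```python
import re

def first_match(pattern, text):
    """Return the leftmost match of pattern in text, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None

print(first_match(r"ab*c", "ac"))                     # ac (zero b's)
print(first_match(r"ab*c", "debbcaabbbc"))            # abbbc
print(first_match(r"t.*ing", "I thought of going"))   # thought of going
print(first_match(r"[a-zA-Z ]*", "any string without numbers or punctuation!"))

# Zero occurrences also count: at the start of this line the class matches
# nothing, so the leftmost match is the empty string.
print(repr(first_match(r"[a-zA-Z ]*", "1. any string")))          # ''

# .* runs to the last ), while [^)]* stops at the first one.
print(first_match(r"\(.*\)", "Get (this) and (that);"))       # (this) and (that)
print(first_match(r"\([^)]*\)", "Get (this) and (that);"))    # (this)
```

The last two lines show the practical difference between the final two rows of the table: excluding ) from the repeated class keeps the match from running past the first closing parenthesis.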
Carets and Dollar Signs
A regular expression that begins with a caret (^) can match a string only at the beginning of a line. In a similar manner, a dollar sign ($) at the end of a regular expression matches the end of a line. The caret and dollar sign are called anchors because they force (anchor) a match to the beginning or end of a line (Table A-5).
Table A-5. Carets and dollar signs
Regular expression | Matches | Examples
/^T/ | A T at the beginning of a line | This line..., That Time..., In Time (no match: the T is not at the beginning of the line)
/^+[0-9]/ | A plus sign followed by a digit at the beginning of a line | +5 +45.72, +759 Keep this...
/:$/ | A colon that ends a line | ...below:
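A minimal sketch of the anchors in Table A-5 follows, assuming Python's re module as a stand-in. In Python a plus sign is a special character, so the second pattern quotes it as \+; the helper name matches is hypothetical.

```python
import re

def matches(pattern, line):
    """Report whether pattern matches anywhere in line."""
    return re.search(pattern, line) is not None

print(matches(r"^T", "This line..."))               # True
print(matches(r"^T", "In Time"))                    # False: T is not at the start

print(matches(r"^\+[0-9]", "+759 Keep this..."))    # True
print(matches(r"^\+[0-9]", "total +5"))             # False: +5 is not at the start

print(matches(r":$", "...below:"))                  # True
print(matches(r":$", "below: more"))                # False: the colon is not last
```

Without the anchors, each of the False cases would match, because the unanchored pattern is free to match anywhere on the line.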
Quoting Special Characters
You can quote any special character (but not a digit or a parenthesis) by preceding it with a backslash (Table A-6). Quoting a special character makes it represent itself.
Table A-6. Quoted special characters
Regular expression | Matches | Examples
/end\./ | All strings that contain end followed by a period | The end., send., pretend.mail
/\\/ | A single backslash | \
/\*/ | An asterisk | *.c, an asterisk (*)
/\[5\]/ | [5] | it was five [5]
/and\/or/ | and/or | and/or
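The effect of quoting can be seen by comparing a quoted and an unquoted period, then trying the remaining rows of Table A-6. This is a sketch that assumes Python's re module as a stand-in; first_match is a hypothetical helper. Python patterns have no / delimiters, so the backslash in and\/or is accepted but not required there.

```python
import re

def first_match(pattern, text):
    """Return the leftmost match of pattern in text, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None

# Unquoted, the period matches any character; quoted, only a literal period.
print(first_match(r"end.", "ending"))             # endi
print(first_match(r"end\.", "ending"))            # None
print(first_match(r"end\.", "pretend.mail"))      # end.

# "C:\\temp" is the Python spelling of the text C:\temp.
print(first_match(r"\\", "C:\\temp"))             # \
print(first_match(r"\*", "*.c"))                  # *
print(first_match(r"\[5\]", "it was five [5]"))   # [5]
print(first_match(r"and\/or", "and/or"))          # and/or
```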