Pattern Matching Quick Reference with Examples
A separate article gives a tutorial introduction to regular expressions. This one is for readers who just need a quick listing of regular expression syntax as a refresher from time to time, and it includes some simple examples. The characters in Table 26.6 have special meaning only in search patterns.
Pattern | What Does it Match? |
---|---|
. | Match any single character except newline. |
* | Match any number (or none) of the single characters that immediately precede it. The preceding character can also be a regular expression. For example, since . (dot) means any character, .* means "match any number of any character." |
^ | Match the following regular expression at the beginning of the line. |
$ | Match the preceding regular expression at the end of the line. |
[ ] | Match any one of the enclosed characters. A hyphen (-) indicates a range of consecutive characters. A caret (^) as the first character in the brackets reverses the sense: it matches any one character not in the list. A hyphen or a right square bracket (]) as the first character is treated as a member of the list. All other metacharacters are treated as members of the list. |
\{n,m\} | Match a range of occurrences of the single character that immediately precedes it. The preceding character can also be a regular expression. \{n\} will match exactly n occurrences; \{n,\} will match at least n occurrences; and \{n,m\} will match any number of occurrences between n and m. |
\ | Turn off the special meaning of the character that follows. |
\( \) | Save the pattern enclosed between \( and \) into a special holding space. Up to nine patterns can be saved on a single line. They can be "replayed" in substitutions by the escape sequences \1 to \9. |
\< \> | Match characters at beginning (\<) or end (\>) of a word. |
+ | Match one or more instances of preceding regular expression. |
? | Match zero or one instances of preceding regular expression. |
| | Match the regular expression specified before or after. |
( ) | Apply a match to the enclosed group of regular expressions. |
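The bracket and interval rules in the table are easy to misread, so here is a minimal sketch that tries them with grep. The sample strings are made up for illustration, and the hyphen is placed last in the list (POSIX treats a hyphen as literal there as well as in first position).

```shell
#!/bin/sh
# Hypothetical sample input, one candidate string per line.
sample='bag
bog
b]g
b-g
b^g
baaag'

# []-] : a right bracket placed first and a hyphen placed last
# are both ordinary members of the list.
brackets=$(printf '%s\n' "$sample" | grep 'b[]-]g')

# [^a-z] : a leading caret reverses the sense of the list.
negated=$(printf '%s\n' "$sample" | grep 'b[^a-z]g')

# a\{2,3\} : between two and three occurrences of the preceding character.
interval=$(printf '%s\n' "$sample" | grep 'ba\{2,3\}g')

printf '%s\n' "$brackets" "$negated" "$interval"
```

Single quotes keep the shell from touching the backslashes and brackets before grep sees them.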
The characters in Table 26.7 have special meaning only in replacement patterns.
Pattern | What Does it Match? |
---|---|
\ | Turn off the special meaning of the character that follows. |
\n | Restore the nth pattern previously saved by \( and \). n is a number from 1 to 9, with 1 starting on the left. |
& | Re-use the search pattern as part of the replacement pattern. |
~ | Re-use the previous replacement pattern in the current replacement pattern. |
\u | Convert first character of replacement pattern to uppercase. |
\U | Convert replacement pattern to uppercase. |
\l | Convert first character of replacement pattern to lowercase. |
\L | Convert replacement pattern to lowercase. |
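As a rough illustration of the replacement-side characters, the sketch below runs three sed substitutions on invented input. It sticks to &, \1 through \9, and the backslash escape; ~ and the case-conversion sequences are left out because POSIX sed does not define them.

```shell
#!/bin/sh
# & stands for the whole text matched by the search pattern.
tagged=$(printf 'item 42\n' | sed 's/[0-9][0-9]*/<&>/')

# \1 and \2 replay the first and second \( \) groups, counted from the left.
swapped=$(printf 'key=value\n' | sed 's/\(.*\)=\(.*\)/\2=\1/')

# A backslash turns off the special meaning of & in the replacement.
literal=$(printf 'R and D\n' | sed 's/ and / \& /')

printf '%s\n' "$tagged" "$swapped" "$literal"
```

The variable names and sample strings are hypothetical; only the sed expressions matter.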
Examples of Searching
When used with grep or egrep, regular expressions are surrounded by quotes. (If the pattern contains a $, you must use single quotes; e.g., 'pattern'.) When used with ed, ex, sed, and awk, regular expressions are usually surrounded by / (although any delimiter works). Table 26.8 has some example patterns.
Pattern | What Does it Match? |
---|---|
bag | The string bag. |
^bag | bag at beginning of line. |
bag$ | bag at end of line. |
^bag$ | bag as the only word on line. |
[Bb]ag | Bag or bag. |
b[aeiou]g | Second letter is a vowel. |
b[^aeiou]g | Second letter is a consonant (or uppercase or symbol). |
b.g | Second letter is any character. |
^...$ | Any line containing exactly three characters. |
^\. | Any line that begins with a . (dot). |
^\.[a-z][a-z] | Same, followed by two lowercase letters (e.g., troff requests). |
^\.[a-z]\{2\} | Same as previous, grep or sed only. |
^[^.] | Any line that doesn't begin with a . (dot). |
bugs* | bug, bugs, bugss, etc. |
"word" | word in quotes. |
"*word"* | word, with or without quotes. |
[A-Z][A-Z]* | One or more uppercase letters. |
[A-Z]+ | Same, egrep or awk only. |
[A-Z].* | An uppercase letter, followed by zero or more characters. |
[A-Z]* | Zero or more uppercase letters. |
[a-zA-Z] | Any letter. |
[^0-9A-Za-z] | Any symbol (not a letter or a number). |
[567] | One of the numbers 5, 6, or 7. |
egrep or awk pattern: | |
five|six|seven | One of the words five, six, or seven. |
[23]?86 | One of the numbers 86, 286, or 386. |
compan(y|ies) | One of the words company or companies. |
ex or vi pattern: | |
\<the | Words like theater or the. |
the\> | Words like breathe or the. |
\<the\> | The word the. |
sed or grep pattern: | |
0\{5,\} | Five or more zeros in a row. |
[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\} | US social security number (nnn-nn-nnnn). |
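A few of the search patterns above can be checked quickly from the shell. The following is only a sketch: the sample lines are invented, grep -c is used to count matching lines, and grep -E stands in for egrep.

```shell
#!/bin/sh
# Hypothetical sample text; each line exercises one or more table rows.
sample='bag
bagpipe
handbag
big
.so macros
000000
123-45-6789
company
companies'

# Basic regular expressions, single-quoted so the shell leaves $ alone.
whole_line=$(printf '%s\n' "$sample" | grep -c '^bag$')
at_start=$(printf '%s\n' "$sample" | grep -c '^bag')
at_end=$(printf '%s\n' "$sample" | grep -c 'bag$')
any_second=$(printf '%s\n' "$sample" | grep -c 'b.g')
troff=$(printf '%s\n' "$sample" | grep '^\.[a-z]\{2\}')
ssn=$(printf '%s\n' "$sample" | grep '[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}')

# Alternation and grouping need the extended syntax.
plural=$(printf '%s\n' "$sample" | grep -E -c 'compan(y|ies)')

printf '%s\n' "$whole_line" "$at_start" "$at_end" "$any_second" \
    "$troff" "$ssn" "$plural"
```

Note that b.g matches big as well as bag, since the dot accepts any character in second position.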
Examples of Searching and Replacing
The following examples show the metacharacters available to sed or ex. (ex commands begin with a colon.) A space is marked by ␣; a TAB is marked by tab.
Command | Result |
---|---|
s/.*/( & )/ | Redo the entire line, but add parentheses. |
s/.*/mv & &.old/ | Change a wordlist into mv commands. |
/^$/d | Delete blank lines. |
:g/^$/d | ex version of previous. |
/^[␣tab]*$/d | Delete blank lines, plus lines containing only spaces or TABs. |
:g/^[␣tab]*$/d | ex version of previous. |
s/␣␣*/␣/g | Turn one or more spaces into one space. |
:%s/␣␣*/␣/g | ex version of previous. |
:s/[0-9]/Item &:/ | Turn a number into an item label (on the current line). |
:s | Repeat the substitution on the first occurrence. |
:& | Same. |
:sg | Same, but for all occurrences on the line. |
:&g | Same. |
:%&g | Repeat the substitution globally. |
:.,$s/Fortran/\U&/g | Change word to uppercase, on current line to last line. |
:%s/.*/\L&/ | Lowercase entire file. |
:s/\<./\u&/g | Uppercase first letter of each word on current line (useful for titles). |
:%s/yes/No/g | Globally change a word to No. |
:%s/Yes/~/g | Globally change a different word to No (previous replacement). |
s/die or do/do or die/ | Transpose words. |
s/\([Dd]ie\) or \([Dd]o\)/\2 or \1/ | Transpose, using hold buffers to preserve case. |
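Several of the sed commands in the table can be tried directly on piped input. This is a minimal sketch with made-up sample text; the TAB is built with printf because typing a literal TAB inside a script is easy to get wrong, and the variable names are purely illustrative.

```shell
#!/bin/sh
# Delete empty lines and lines holding only spaces or TABs.
tab=$(printf '\t')
kept=$(printf 'one\n\n \t \ntwo\n' | sed "/^[ $tab]*\$/d")

# Turn every run of one or more spaces into a single space.
squeezed=$(printf 'a   b  c\n' | sed 's/  */ /g')

# Transpose two words; the saved groups keep each word's capitalization.
swapped=$(printf 'Die or do\n' | sed 's/\([Dd]ie\) or \([Dd]o\)/\2 or \1/')

printf '%s\n' "$kept" "$squeezed" "$swapped"
```

The first command uses double quotes so that $tab expands, which is why the end-of-line $ is escaped there.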
- DG from Anonymous' UNIX tutorial (SVR4/Solaris)