Using Regular Expressions

Contents:

Matches with m//
Option Modifiers
The Binding Operator, =~
Interpolating into Patterns
The Match Variables
Substitutions with s///
The split Operator
The join Function
Exercises

Now that we've seen what goes inside a regular expression, let's take what we've learned back into Perl.

Matches with m//

We've been writing patterns in pairs of forward slashes, like /fred/. But this is actually a shortcut for the m// (pattern match) operator. As we saw with the qw// operator, you may choose any pair of delimiters to quote the contents. So, we could write that same expression as m(fred), m<fred>, m{fred}, or m[fred] using those paired delimiters, or as m,fred,, m!fred!, m^fred^, or many other ways using nonpaired delimiters.[191]

[191]Nonpaired delimiters are the ones that don't have a different "left" and "right" variety; the same punctuation mark is used for both ends.

The shortcut is that if you choose the forward slash as the delimiter, you may omit the initial m. Since Perl folks love to avoid typing extra characters, you'll see most pattern matches written using slashes, as in /fred/.
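To see that these spellings really are the same operator, here's a minimal sketch that tries each one against the same string. The sample text in $_ and the $count variable are our own inventions for illustration; only the delimiters come from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical sample text; any string containing "fred" would do.
$_ = "yabba dabba doo, said fred";

my $count = 0;

# The familiar shortcut, and the same match with the m spelled out.
$count++ if /fred/;
$count++ if m/fred/;

# Paired delimiters: a different "left" and "right" character.
$count++ if m(fred);
$count++ if m<fred>;
$count++ if m{fred};
$count++ if m[fred];

# Nonpaired delimiters: the same character at both ends.
$count++ if m,fred,;
$count++ if m!fred!;
$count++ if m^fred^;

print "$count of 9 forms matched\n";
```

Every form is the same match against $_, so each test succeeds and the program reports that all nine matched.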

Of course, you should choose wisely: pick a delimiter that doesn't appear in your pattern.[192] If you wanted to make a pattern to match the beginning of an ordinary web URL, you might start to write /^http:\/\// to match the initial "http://". But the pattern is easier to read, write, maintain, and debug if you use a better choice of delimiter: m%^http://%.[193]

[192]If you're using paired delimiters, you generally don't have to worry about the delimiter appearing inside the pattern, since it will normally appear there in matched pairs. That is, m(fred(.*)barney) and m{\w{2,}} and m[wilma[\n \t]+betty] are all fine, even though the pattern contains the quoting character, since each "left" has a corresponding "right". But the angle brackets ("<" and ">") aren't regular expression metacharacters, so nothing guarantees that they'll be paired inside a pattern; if the pattern were m{(\d+)\s*>=?\s*(\d+)}, quoting it with angle brackets would mean having to backslash the greater-than sign so that it wouldn't prematurely end the pattern.

[193]Remember, the forward slash is not a metacharacter, so it doesn't need to be backslashed when it's not the delimiter.
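As a rough sketch of the difference a delimiter makes, the following compares the backslashed URL pattern with two easier-to-read spellings, and then tries the comparison pattern from the footnote with both curly braces and angle brackets. The sample URL, the sample comparison string, and the variable names are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical sample URL for illustration.
$_ = "http://www.example.com/index.html";

# With slashes as the delimiter, each slash in the pattern needs a backslash.
my $with_slashes = /^http:\/\// ? 1 : 0;

# With another delimiter, the slashes can stand as they are.
my $with_percent = m%^http://% ? 1 : 0;
my $with_braces  = m{^http://} ? 1 : 0;

print "slashes: $with_slashes, percent: $with_percent, braces: $with_braces\n";

# Hypothetical sample comparison for the pattern from the footnote.
$_ = "10 >= 7";

# Curly braces: the > in the pattern needs no special treatment.
my $braces = m{(\d+)\s*>=?\s*(\d+)} ? "$1 $2" : "no match";

# Angle brackets: the > must be backslashed, or it would end the pattern early.
my $angles = m<(\d+)\s*\>=?\s*(\d+)> ? "$1 $2" : "no match";

print "braces: $braces, angles: $angles\n";
```

All three URL patterns match the same text, and both comparison patterns capture the same two numbers; the only thing that changes is how much backslashing the reader has to wade through.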

It's common to use curly braces as the delimiter. If you use a programmer's text editor, it probably has the ability to jump from an opening curly brace to the corresponding closing one, which can be handy in maintaining code.