Lesson 9 | The pattern-matching operator |
Objective | The m/ pattern matching operator |
m/ Pattern Matching Operator in Perl
The most common contexts for regular expressions in Perl are the pattern-matching operator (m/) and the substitution operator (s/).
For example, let us take a small HTML file as our input data and experiment with regular expressions a bit.
Put the following in a file called test.html:
<html>
<head>
<title> Example File </title>
</head>
<body>
<h1> Example File </h1>
<div align=center> This is a div tag. </div>
<p> This is another paragraph. </p>
</body>
</html>
Using the m/ operator, we can easily find all the lines that contain the word Example (the match is case-sensitive, so the capital E matters):
#!/usr/bin/perl
while(<>) {
/Example/ and print;
}
(Note that the /expression/ syntax is a shortcut for m/expression/.) When the script is saved as test.pl and run from the command line, like this:
$ perl test.pl test.html
it produces this output:
<title> Example File </title>
<h1> Example File </h1>
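As a self-contained variant of the script above, the following sketch keeps a few of the sample lines in an array instead of reading a file, and uses grep to collect the matching lines. The array and variable names are assumptions made for illustration. It also tries a lowercase pattern, to show that matching is case-sensitive unless the i modifier is given.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A few lines from test.html, hard-coded so the sketch runs on its own.
my @lines = (
    "<title> Example File </title>",
    "<h1> Example File </h1>",
    "<p> This is another paragraph. </p>",
);

# grep keeps every element for which the match succeeds.
my @exact  = grep { /Example/ }  @lines;   # capital E, as in the data
my @lower  = grep { /example/ }  @lines;   # lowercase, case-sensitive
my @nocase = grep { /example/i } @lines;   # lowercase, case-insensitive

printf "%d %d %d\n", scalar @exact, scalar @lower, scalar @nocase;
# prints "2 0 2"
```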
Using Perl Default Separator
The form /expression/ can be used because / is the default separator for the match operator. Quite often, however, it is desirable to use a different character to mark the boundaries of the pattern. UNIX systems, for example, use the forward slash (/) as a separator in the directory structure.
URLs also typically make use of the forward slash.
To search for the following URL:
https://www.dispersednet
the following expression would be used:
/https:\/\/www\.dispersednet/
which can be difficult to decipher, because every forward slash in the pattern has to be escaped with a backslash.
Perl allows different separators to be used when the expression begins with an explicit m.
The following pattern performs the same match as the previous one, only it uses # as the separator:
m#https://www\.dispersednet#
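To check that the choice of separator does not change what is matched, here is a minimal sketch that tests one URL string against the same pattern written with three different separators. The variable names are hypothetical, and the m{...} form is included only as a further illustration of the same rule.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The URL exactly as it appears in the lesson text.
my $url = 'https://www.dispersednet';

# Default separator: every / inside the pattern must be escaped.
my $with_slashes = $url =~ /https:\/\/www\.dispersednet/ ? 1 : 0;

# Explicit m with # as the separator: no escaping of / is needed.
my $with_hash = $url =~ m#https://www\.dispersednet# ? 1 : 0;

# Explicit m with a bracketing pair as the separator.
my $with_braces = $url =~ m{https://www\.dispersednet} ? 1 : 0;

print "slashes=$with_slashes hash=$with_hash braces=$with_braces\n";
# prints "slashes=1 hash=1 braces=1"
```

All three forms compile to the same regular expression; the separator only affects which characters need a backslash in the source.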
Question: Is there a way to get just the title from between the <title> tags?
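#!/usr/bin/perl
# Pattern for the title tag, reused for the opening and closing tags.
$exp = '</?title>';
while(<>) {
    # (.*) captures the text between the two tags into $1.
    /$exp(.*)$exp/i and print "[$1]\n";
}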
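Notice that this time we put the regular expression in a variable. That allows us to use it twice within the same search pattern, even though it is matching two different strings (the opening and closing tags): the ? makes the preceding slash optional, so the same pattern matches both <title> and </title>. When run against test.html, it gives the output: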
[ Example File ]
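Because $1 keeps its value from the last successful match, it is safer to read it only after checking that the current match succeeded. The following is a minimal sketch of that check, using hard-coded sample lines and hypothetical variable names rather than the lesson's file input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two lines from test.html; only the first contains title tags.
my @lines = (
    "<title> Example File </title>",
    "<h1> Example File </h1>",
);

# Matches <title> or </title>; the ? makes the slash optional.
my $tag = '</?title>';

my @titles;
for my $line (@lines) {
    # Use $1 only inside the branch where the match succeeded.
    if ($line =~ /$tag(.*)$tag/i) {
        push @titles, $1;
    }
}

print "[$_]\n" for @titles;
# prints "[ Example File ]"
```

The captured text includes the spaces around the title because (.*) matches everything between the two tags.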
You can also assign the results of a regular expression to a variable. We will explore that in the next lesson.