Perl as a Scripting Language

Perl stands for Practical Extraction and Report Language. Larry Wall created Perl to extract information from text files and to use that information to prepare reports. Programs written in Perl, the language, are interpreted and executed by perl, the program. This book's companion CD-ROMs include Perl, and you can install it at the same time as you install CentOS Linux (simply select the Development Tools package group).

Perl is available on a wide variety of computer systems because, like Linux, Perl can be distributed freely. Perl is also popular as a scripting language among users and system administrators, which is why I introduce Perl and describe its strengths. Elsewhere in this book, you learn about another scripting language (Tcl/Tk) that lets you create GUIs for your scripts.

Determining Whether You Have Perl

Before you proceed with the Perl tutorial, check whether you have Perl installed on your system. Type the following command:

which perl

The which command tells you whether it finds a specified program in the directories listed in the PATH environment variable. If perl is installed, you should see the following output:

/usr/bin/perl

If the which command complains that no such program exists in the current PATH, this does not necessarily mean you do not have perl installed; it may mean that you do not have the /usr/bin directory in PATH. Ensure that /usr/bin is in PATH; either type echo $PATH or look at the message displayed by the which command (that message includes the directories in PATH). If /usr/bin is not in PATH, use the following command to redefine PATH:

export PATH=$PATH:/usr/bin

Now, try the which perl command again. If you still get an error, you may not have installed Perl. You can install Perl from the companion CD-ROMs by performing the following steps:

  1. Log in as root.

  2. Mount each CD and look for the perl RPM package. Mount the CD with the mount /dev/cdrom command or wait until GNOME's magicdev device mounts the CD. Then search for the perl RPM with the following commands:

    cd /mnt/cdrom/RedHat/RPMS
    ls perl*.rpm
  3. After you find the perl RPM file, type the following rpm (Red Hat Package Manager) command to install Perl:

    rpm -ivh perl*

After you have perl installed on your system, type the following command to see its version number:

perl -v

Following is typical output from that command:

This is perl, v5.8.0 built for i386-linux-thread-multi
(with 1 registered patch, see perl -V for more detail)
Copyright 1987-2002, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using `man perl' or `perldoc perl'.  If you have access to the
Internet, point your browser at http://www.perl.com/, the Perl Home Page.

This output tells you that you have Perl Version 5.8, patch level 0, and that Larry Wall, the originator of Perl, holds the copyright. Perl is nevertheless distributed freely: as the output states, you may copy it under the terms of either the Artistic License or the GNU General Public License.

You can get the latest version of Perl by pointing your World Wide Web browser to the Comprehensive Perl Archive Network (CPAN). The following address connects you to the CPAN site nearest to you:

http://www.perl.com/CPAN/

Writing Your First Perl Script

Perl has many features of C, and, as you may know, most books on C start with an example program that displays Hello, World! on your terminal. Because Perl is an interpreted language, you can accomplish this task directly from the command line. If you enter:

perl -e 'print "Hello, World!\n";'

the system responds

Hello, World!

This command uses the -e option of the perl program to pass the Perl program as a command-line argument to the Perl interpreter. In this case, the following line constitutes the Perl program:

print "Hello, World!\n";

To convert this line to a script, simply place the line in a file, and start the file with a directive to run the perl program (as you do in shell scripts, when you place a line such as #!/bin/sh to run the Bourne shell to process the script).

To try a Perl script, follow these steps:

  1. Use a text editor, such as vi or emacs, to save the following lines in the file named hello:

    #!/usr/bin/perl
    # This is a comment.
    print "Hello, World!\n";
  2. Make the hello file executable by using the following command:

    chmod +x hello
  3. Run the Perl script by typing the following at the shell prompt:

    ./hello
    Hello, World!

That's it! You have written and tried your first Perl script.

Learning More about Perl

I devote a few sections of this chapter to giving you an overview of Perl and to showing a few simple examples. However, this discussion does not do justice to Perl. If you want to use Perl as a tool, consult one of the following books:

Programming Perl, 3rd Edition, is the authoritative guide to Perl (although it may not be the best resource for learning Perl). The book by Randal Schwartz focuses more on teaching Perl programming. Paul Hoffman's book is a good introduction for nonprogrammers wanting to learn Perl.

Getting an Overview of Perl

Most programming languages, including Perl, have some common features:

The next few sections provide an overview of these major features of Perl and illustrate the features through simple examples.

Learning Basic Perl Syntax

Perl is free-form, like C; no constraints exist on the exact placement of any keyword. Often, Perl programs are stored in files with names that end in .pl, but there is no restriction on the filenames you use.

As in C, each Perl statement ends with a semicolon (;). A number sign or hash mark (#) marks the start of a comment; the perl program disregards the rest of the line beginning with the number sign.

Groups of Perl statements are enclosed in braces ({...}). This feature is also similar to C.

Using Variables in Perl

You don't have to declare Perl variables before using them, as you do in C. You can recognize a variable in a Perl script easily, because each variable name begins with a special character: an at symbol (@), a dollar sign ($), or a percent sign (%). These special characters denote the variable's type.

Using Scalars

A scalar variable can store a single value, such as a number or a text string. Scalar variables are the basic data type in Perl. Each scalar's name begins with a dollar sign ($). Typically, you start using a scalar with an assignment statement that initializes it. You can even use a variable without initializing it; an uninitialized scalar is undefined, and it evaluates to zero when used as a number and to an empty string when used as a string. If you want to see whether a scalar is defined, use the defined function as follows:

print "Name undefined!\n" if !(defined $name);

The expression (defined $name) is 1 if $name is defined. You can 'undefine' a variable by using the undef function. You can undefine $name, for example, as follows:

undef $name;
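
The following sketch ties defined and undef together by checking the same scalar at three points. The variable names $before, $during, and $after are made up for this illustration, and the use strict, use warnings, and my lines are additions that the chapter's own listings do not use.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

my $name;                                        # declared but never assigned
my $before = defined($name) ? "defined" : "undefined";

$name = "Jane";                                  # hypothetical sample value
my $during = defined($name) ? "defined" : "undefined";

undef $name;                                     # return $name to the undefined state
my $after = defined($name) ? "defined" : "undefined";

print "before: $before, assigned: $during, after undef: $after\n";
```

Run as written, the script should report undefined, defined, and undefined, in that order.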

Variables are evaluated according to context. Following is a script that initializes and prints a few variables:

#!/usr/bin/perl
$title = "CentOS Linux Professional Secrets";
$count1 = 650;
$count2 = 425;
$total = $count1 + $count2;
print "Title: $title -- $total pages\n";

When you run the preceding Perl program, it produces the following output:

Title: CentOS Linux Professional Secrets -- 1075 pages

As the Perl statements show, when the two numeric variables are added, their numeric values are used; but when the $total variable is printed, its string representation is displayed.

Another interesting aspect of Perl is that it evaluates all variables in a string within double quotation marks ("..."). However, if you write a string inside single quotation marks ('...'), Perl leaves that string untouched. If you write

 print 'Title: $title -- $total pages\n';

with single quotes instead of double quotes, Perl displays

Title: $title -- $total pages\n

and does not generate a new line.

Insider Insight 

A useful Perl variable is $_ (the dollar sign followed by the underscore character). This special variable is known as the default argument. The Perl interpreter determines the value of $_ depending on the context. When a while loop reads lines from the standard input with <STDIN>, $_ holds the current input line; when you search for a pattern of text without naming a string, Perl searches the contents of $_ by default.
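
As a rough illustration of $_ acting as the default argument, the sketch below loops over a small list instead of reading standard input, so that it runs without any input file. The sample strings and the @hits array are invented for the example; foreach, chomp, and the bare pattern match all fall back on $_ when no variable is named.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

# Hypothetical sample lines standing in for lines read from a file
my @lines = ("SoundBlaster 16\n", "no match here\n", "blaster pro\n");
my @hits;

foreach (@lines) {            # each element is placed in $_ in turn
    chomp;                    # with no argument, chomp works on $_
    if (/[bB]laster/) {       # with no =~, the pattern is tested against $_
        push @hits, $_;
    }
}

print "$_\n" foreach @hits;   # prints the two matching lines
```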

Using Arrays

An array is a collection of scalars. The array name begins with an at symbol (@). As in C, array subscripts start at zero. You can access the elements of an array with an index. Perl allocates space for arrays dynamically.

Consider the following simple script:

#!/usr/bin/perl
@commands = ("start", "stop", "draw" , "exit");
$numcmd = @commands;
print "There are $numcmd commands.\n";
print "The first command is: $commands[0]\n";

When you run the script, it produces the following output:

There are 4 commands.
The first command is: start
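
Because Perl allocates array space dynamically, an array can grow after it is created. The sketch below is one way to see this; the extra commands "help" and "quit" and the variable names $count and $last_index are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

my @commands = ("start", "stop", "draw", "exit");

push @commands, "help";        # append a fifth element (index 4)
$commands[6] = "quit";         # assigning past the end extends the array

my $count      = @commands;    # an array in scalar context yields its length
my $last_index = $#commands;   # $#commands is the index of the last element
my $gap = defined($commands[5]) ? "defined" : "undefined";

print "count = $count, last index = $last_index\n";
print "element 5 is $gap\n";   # the skipped slot holds no value
```

The skipped slot at index 5 exists but is undefined, which is why the script checks it with defined rather than printing it directly.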

You can print an entire array with a simple print statement like this:

print "@commands\n";

When Perl executes this statement for the @commands array used in this section's examples, it displays the following output:

start stop draw exit

Using Associative Arrays

Associative arrays (also called hashes), whose names begin with a percent sign (%), are a distinctive feature of Perl. Using associative arrays, you can index an array with a string, such as a name. A good example of an associative array is the %ENV array, which Perl automatically defines for you. In Perl, %ENV is the array of environment variables, which you can access by using the environment-variable name as an index. The following Perl statement prints the current PATH environment variable:

print "PATH = $ENV{PATH}\n";

When Perl executes this statement, it prints the current setting of PATH. In contrast to indexing regular arrays, you have to use braces to index an associative array.

Perl has many built-in functions (such as delete, each, keys, and values) that enable you to access and manipulate associative arrays.
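
A minimal sketch of several of these functions follows. The %port array and its service names and numbers are sample data chosen for the illustration, not something the chapter defines.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

# Hypothetical associative array that maps service names to port numbers
my %port = (ftp => 21, ssh => 22, http => 80);

$port{smtp} = 25;                 # add a new key-value pair
delete $port{ftp};                # remove the entry whose key is "ftp"

my @names = sort keys %port;      # keys come back unordered, so sort them
foreach my $name (@names) {
    print "$name -> $port{$name}\n";
}

my $total = 0;
$total += $_ foreach values %port;   # values returns only the stored values
print "Sum of port numbers: $total\n";
print "ftp is gone\n" unless exists $port{ftp};
```

Sorting the result of keys is a deliberate choice here, because Perl does not promise any particular order for the keys of an associative array.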

Listing the Predefined Variables in Perl

Perl has several predefined variables that contain useful information you may need in a Perl script. Following are a few important predefined variables:

Using Operators and Expressions

Operators are used to combine and compare Perl variables. Typical mathematical operators are addition (+), subtraction (-), multiplication (*), and division (/). Perl and C provide nearly the same set of operators. When you use operators to combine variables, you end up with expressions. Each expression has a value.

Following are some typical Perl expressions:

$error < 0
$count == 10
$count + $i
$users[$i]

These expressions are examples of the comparison operator (the first two lines), the arithmetic operator, and the array-index operator.

You can empty an array by assigning it the null list, (), as follows:

@commands = ();

The dot operator (.) enables you to concatenate two strings, as follows:

$part1 = "Hello, ";
$part2 = "World!";
$message = $part1.$part2;  # Now $message = "Hello, World!"

The repetition operator, denoted by x, is curious but useful; it repeats a string a specified number of times. Its assignment form, x=, repeats the string stored in a variable and stores the result back in that variable. Suppose that you want to initialize a string to 65 asterisks (*). The following example shows how you can initialize the string with the x= operator:

$marker = "*";
$marker x= 65;  # Now $marker is a string of 65 asterisks.

Another powerful operator in Perl is range, which is represented by two periods (..). You can initialize an array easily by using the range operator. Following are some examples:

@numerals = (0..9); # @numerals = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
@alphabet = ('A'..'Z'); # @alphabet = capital letters A through Z
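
The operators in this section combine naturally. As a small sketch (the variable names $rule and $summary are made up for the example), the following script builds a ruled line with x, fills two arrays with the range operator, and joins the pieces with the dot operator:

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

my $rule    = "-" x 20;           # a string of 20 hyphens
my @digits  = (0..9);             # ten numbers
my @letters = ('A'..'Z');         # twenty-six capital letters

# scalar() forces the array into scalar context, which yields its length
my $summary = "digits: " . scalar(@digits) . ", letters: " . scalar(@letters);

print "$rule\n$summary\n$rule\n";
```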

Learning Regular Expressions

If you have used Linux (or any variant of UNIX) for a while, you probably know about the grep command, which enables you to search files for a pattern of strings. Following is a typical use of grep to locate every occurrence of the string blaster or Blaster on any line of all files with names that end in .c:

cd /usr/src/linux*/drivers/cdrom
grep "[bB]laster"  *.c

The preceding commands produce the following output on my system:

sbpcd.c: *          Works with SoundBlaster compatible cards and with "no-sound"
sbpcd.c:        0x230, 1, /* Soundblaster Pro and 16 (default) */
sbpcd.c:        0x250, 1, /* OmniCD default, Soundblaster Pro and 16 */
sbpcd.c:        0x270, 1, /* Soundblaster 16 */
sbpcd.c:        0x290, 1, /* Soundblaster 16 */
sbpcd.c:static const char *str_sb_l = "soundblaster";
sbpcd.c:static const char *str_sb = "SoundBlaster";
sbpcd.c: *                 sbpcd=0x230,SoundBlaster
sbpcd.c:        msg(DBG_INF,"   LILO boot: ... sbpcd=0x230,SoundBlaster\n");
sjcd.c: *  the SoundBlaster/Panasonic style CDROM interface. But today, the

As you can see, grep has found all occurrences of blaster and Blaster in the files with names ending in .c.

The grep command's "[bB]laster" argument is known as a regular expression, a pattern that matches a set of strings. You construct a regular expression with a small set of operators and rules that resemble the ones for writing arithmetic expressions. A list of characters inside brackets ([...]), for example, matches any single character in the list. Thus, the regular expression "[bB]laster" matches a set of two strings, as follows:

blaster   Blaster

So far, this section has summarized the syntax of regular expressions, but you have not seen how to use regular expressions in Perl. Typically, you place a regular expression within a pair of slashes and use the match (=~) or not-match (!~) operator to test a string. You can write a Perl script that performs the same search as the one done with grep earlier in this section. The following steps help you complete this exercise:

  1. Use a text editor to type and save the following script in a file named lookup:

    #!/usr/bin/perl
    while (<STDIN>)
    {
        if ( $_ =~ /[bB]laster/ ) { print $_; }
    }
  2. Make the lookup file executable by using the following command:

    chmod +x lookup
  3. Try the script by using the following command:

    cat /usr/src/linux*/drivers/cdrom/sbpcd.c | ./lookup

    My system responds with this:

     *    Works with SoundBlaster compatible cards and with "no-sound"
            0x230, 1, /* Soundblaster Pro and 16 (default) */
            0x250, 1, /* OmniCD default, Soundblaster Pro and 16 */
            0x270, 1, /* Soundblaster 16 */
            0x290, 1, /* Soundblaster 16 */
    static const char *str_sb_l = "soundblaster";
    static const char *str_sb = "SoundBlaster";
     *                 sbpcd=0x230,SoundBlaster
                    msg(DBG_INF,"   LILO boot: ... sbpcd=0x230,SoundBlaster\n");
     *    Works with SoundBlaster compatible cards and with "no-sound"
            0x230, 1, /* Soundblaster Pro and 16 (default) */
            0x250, 1, /* OmniCD default, Soundblaster Pro and 16 */
            0x270, 1, /* Soundblaster 16 */
            0x290, 1, /* Soundblaster 16 */
    static const char *str_sb_l = "soundblaster";
    static const char *str_sb = "SoundBlaster";
     *                 sbpcd=0x230,SoundBlaster
                    msg(DBG_INF,"   LILO boot: ... sbpcd=0x230,SoundBlaster\n");

    The cat command feeds the contents of a specific file (which, as you know from the grep example, contains some lines that match the regular expression) to the lookup script. The script simply applies Perl's regular expression-match operator (=~) and prints any matching line.

The $_ variable in the script needs some explanation. The <STDIN> expression gets a line from the standard input and, by default, stores that line in the $_ variable. Inside the while loop, the regular expression is matched against the $_ string. The following single Perl statement completes the lookup script's work:

if ( $_ =~ /[bB]laster/ ) { print $_; }

This example illustrates how you might use a regular expression to search for occurrences of strings in a file.

After you use regular expressions for a while, you can better appreciate their power. The trick is to find the regular expression that performs the task you want. Following is a search that looks for all lines that begin with exactly seven spaces and end with a right parenthesis:

while (<STDIN>)
{
    if ( $_ =~ /\)\n/ && $_ =~ /^ {7}\S/ )  { print $_; }
}
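
One way to convince yourself that the preceding test does what it claims is to run it against a few hand-built strings. In the sketch below, the sample lines, the $seven helper variable, and the is_wanted subroutine are all invented for the illustration; the condition itself is copied from the loop above.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

# Returns 1 if the line begins with exactly seven spaces and ends with ")"
sub is_wanted {
    my ($line) = @_;
    return ($line =~ /\)\n/ && $line =~ /^ {7}\S/) ? 1 : 0;
}

my $seven = " " x 7;                  # seven spaces, built with the x operator
my @samples = (
    $seven . "draw(x, y)\n",          # seven spaces, ends with ")"
    $seven . " draw(x, y)\n",         # eight spaces
    $seven . "draw(x, y);\n",         # ends with ";" instead of ")"
    "draw(x, y)\n",                   # no leading spaces
);

my @matched = grep { is_wanted($_) } @samples;
print "Matched ", scalar(@matched), " of ", scalar(@samples), " lines\n";
print @matched;
```

Only the first sample satisfies both halves of the condition, so the script should report one match out of four.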

Using Flow-Control Statements

So far, you have seen Perl statements intended to execute in a serial fashion, one after another. Perl also includes statements that enable you to control the flow of execution of the statements. You already have seen the if statement and a while loop. Perl includes a complete set of flow-control statements just like those in C, but with a few extra features.

In Perl, all conditional statements take the following form:

conditional-statement
{ Perl code to execute if conditional is true }

Notice that you must enclose within braces ({...}) the code that follows the conditional statement. The conditional statement checks the value of an expression to determine whether to execute the code within the braces. In Perl, the number zero, the string "0", the empty string, and an undefined value are considered false; any other value is true.

The following sections briefly describe the syntax of the major conditional statements in Perl.

Using if and unless Statements

The Perl if statement resembles the C if statement. For example, an if statement might check a count to see whether the count exceeds a threshold, as follows:

if ( $count > 25 ) { print "Too many errors!\n"; }

You can add an else clause to the if statement, as follows:

if ($user eq "root")
{
    print "Starting simulation...\n";
}
else
{
    print "Sorry $user, you must be \"root\" to run this program.\n";
    exit;
}
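
The unless statement named in this section's heading is the mirror image of if: it runs its block only when the condition is false. The sketch below is a minimal illustration; the $count, $threshold, and $status variables and their values are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

my $count     = 30;    # hypothetical error count
my $threshold = 25;    # hypothetical limit
my $status;

# The first block runs only if the condition is false
unless ($count > $threshold)
{
    $status = "Within limit";
}
else
{
    $status = "Too many errors!";
}
print "$status\n";

# unless also works as a statement modifier, like the if used with defined
print "Count is acceptable\n" unless $count > $threshold;
```

With these values the condition is true, so the else branch runs and the final print statement is skipped.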

If you know C, you can see that Perl's syntax looks quite a bit like that in C. Conditionals with the if statement can have zero or more elsif clauses to account for more alternatives, such as the following:

print "Enter version number:"; # prompt user for version number
$os_version = <STDIN>;         # read from standard input
chomp $os_version; # get rid of the newline at the end of the line
# Check version number
if ($os_version >= 10 ) { print "No upgrade necessary\n";}
elsif ($os_version >= 6 && $os_version < 10)
                                    { print "Standard upgrade\n";}
elsif ($os_version > 3 && $os_version < 6) { print "Reinstall\n";}
else { print "Sorry, cannot upgrade\n";}

Using the while Statement

Use Perl's while statement for looping (the repetition of some processing until a condition becomes false). To read a line at a time from standard input and to process that line, you might use the following:

while ($in = <STDIN>)
{
# Code to process the line
    print $in;
}

You can skip to the end of a loop with the next keyword; the last keyword exits the loop. The following while loop adds the numbers from 1 to 10, skipping 5:

while (1)
{
    $i++;
    if($i == 5) { next;}  # Jump to the next iteration if $i is 5
    if($i > 10) { last;}  # When $i exceeds 10, end the loop
    $sum += $i;           # Add the numbers
}
# At this point $sum should be 50.

Using for and foreach Statements

Perl and C's for statements have similar syntax. Use the for statement to execute a statement any number of times, based on the value of an expression. The syntax of the for statement is as follows:

for (expr_1; expr_2; expr_3) { statement block }

expr_1 is evaluated one time, at the beginning of the loop; the statement block is executed until expression expr_2 evaluates to false. The third expression, expr_3, is evaluated after each execution of the statement block. You can omit any of the expressions, but you must include the semicolons. In addition, the braces around the statement block are required. Following is an example that uses a for loop to add the numbers from 1 to 10:

for($i=0, $sum=0; $i <= 10; $sum += $i, $i++) {}

In this example, the actual work of adding the numbers is done in the third expression, and the statement the for loop controls is an empty block ({}).
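
The foreach statement named in this section's heading steps through the elements of a list, setting a loop variable to each element in turn. As a hedged sketch (the loop variables $n and $command and the counter $number are made up for the example), the same sum can be written with foreach and the range operator:

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

# Add the numbers from 1 to 10 by looping over a range
my $sum = 0;
foreach my $n (1..10) {
    $sum += $n;
}
print "Sum = $sum\n";

# Loop over an array, numbering each element
my @commands = ("start", "stop", "draw", "exit");
my $number = 0;
foreach my $command (@commands) {
    $number++;
    print "$number. $command\n";
}
```

If you omit the loop variable, foreach places each element in $_ instead.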

Using the goto Statement

The goto statement transfers control to a statement label. Following is an example that prompts the user for a value and repeats the request, if the value is not acceptable:

ReEnter:
print "Enter offset: ";
$offset = <STDIN>;
chomp $offset;
unless ($offset > 0 && $offset < 512)
{
    print "Bad offset: $offset\n";
    goto ReEnter;
}

Accessing Linux Commands

You can execute any Linux command from Perl in several ways:

The simplest way to execute a Linux command in your script is to use the system function with the command in a string. After the system function returns, the status of the command is in the $? variable; $? is zero if the command succeeded, and the command's exit code is in the upper byte ($? >> 8). You can easily write a simple Perl script that reads a string from the standard input and processes that string with the system function. Follow these steps:

  1. Use a text editor to enter and save the following script in a file named rcmd.pl:

    #!/usr/bin/perl
    # Read user input and process command
    $prompt = "Command (\"exit\" to quit): ";
    print $prompt;
    while (<STDIN>)
    {
        chomp;
        if ($_ eq "exit") { exit 0;}
    # Execute command by calling system
        system $_;
        unless ($? == 0) {print "Error executing: $_\n";}
        print $prompt;
    }
  2. Make the rcmd.pl file executable by using the following command:

    chmod +x rcmd.pl
  3. Run the script by typing ./rcmd.pl at the shell prompt in a terminal window. The following listing shows some sample output from the rcmd.pl script (the output depends on what commands you enter):

    Command ("exit" to quit): ps
      PID TTY          TIME CMD
      767 pts/0    00:00:00 bash
      940 pts/0    00:00:00 rcmd.pl
      945 pts/0    00:00:00 ps
    Command ("exit" to quit): exit      
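
To look more closely at what system leaves in $?, the following sketch runs a child process that exits with a known code and then decodes the status. It makes two assumptions that are not part of the chapter: it uses the special variable $^X (the path of the running perl) so that the child command is sure to exist, and it passes the command as a list rather than as a single string.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

# Run a child perl that does nothing except exit with code 3
system($^X, "-e", "exit 3");

my $raw_status = $?;              # the status word set by system
my $exit_code  = $? >> 8;         # the upper byte holds the exit code
my $signal     = $? & 127;        # the low bits hold a signal number, if any

print "raw status = $raw_status\n";
print "exit code  = $exit_code\n";
print "signal     = $signal\n";
```

This is why the rcmd.pl script can simply compare $? with zero: any nonzero exit code or terminating signal makes the whole status nonzero.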

Also, you can run Linux commands by using fork and exec in your Perl script. Following is an example script, psh.pl, that uses fork and exec to execute commands the user enters:

#!/usr/bin/perl
# This is a simple script that uses "fork" and "exec" to
# run a command entered by the user
$prompt = "Command (\"exit\" to quit): ";
print $prompt;
while (<STDIN>)
{
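    chomp;    # remove trailing newline
    if($_ eq "exit") { exit 0;}

    $status = fork;
    unless (defined $status)
    {
# fork failed... report the error and prompt again
        print "Cannot fork: $!\n";
        print $prompt;
        next;
    }
    if($status)
    {
# In parent... wait for child process to finish...
        wait;
        print $prompt;
        next;
    }
    else
    {
# In child... replace this process with the command
        exec $_;
        die "Cannot run $_: $!\n";    # reached only if exec fails
    }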
}

The following example shows how the psh.pl script executes the ps command (remember to type chmod +x psh.pl before typing ./psh.pl):

Command ("exit" to quit): ps
  PID TTY          TIME CMD
  767 pts/0    00:00:00 bash
  949 pts/0    00:00:00 psh.pl
  950 pts/0    00:00:00 ps
Command ("exit" to quit): exit   

Linux shells, such as Bash, use the fork and exec combination to run commands.

Working with Files

You may have noticed the <STDIN> expression in various examples in this chapter. That's Perl's way of reading from a file. In Perl, a file handle, also known as an identifier, identifies a file. Usually, file handles are in uppercase characters. STDIN is a predefined file handle that denotes the standard input (by default, the keyboard). STDOUT and STDERR are the other two predefined file handles. STDOUT is used for printing to the terminal, and STDERR is used for printing error messages.

To read from a file, write the file handle inside angle brackets (<>). Thus, <STDIN> reads a line from the standard input.

You can open other files by using the open function. The following example shows you how to open the /etc/passwd file for reading and how to display the lines in that file:

# PWDFILE is the file handle; stop if the file cannot be opened
open (PWDFILE, "/etc/passwd") || die "Cannot open /etc/passwd: $!\n";
while (<PWDFILE>) { print $_;}  # By default, input line is in $_
close PWDFILE;                  # Close the file

By default, the open function opens a file for reading. You can add special characters at the beginning of the filename to indicate other types of access. A > prefix opens the file for writing, whereas a >> prefix opens a file for appending. Following is a short script that reads the /etc/passwd file and creates a new file, named output, with a list of all users who lack shells (the password entries for these users have : at the end of each line):

#!/usr/bin/perl
# Read /etc/passwd and create list of users without any shell
open (PWDFILE, "/etc/passwd") || die "Cannot open /etc/passwd: $!\n";
# open file for writing
open (RESULT, ">output") || die "Cannot create output: $!\n";

while (<PWDFILE>)
{
    if ($_ =~ /:\n/) {print RESULT $_;}
}
close PWDFILE;
close RESULT;

After you execute this script, you should find a file named output in the current directory. Following is what the output file contains when I run this script on a CentOS Linux system:

news:x:9:13:news:/etc/news:
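
The difference between the > and >> prefixes is easiest to see by writing to the same file twice. The sketch below assumes a scratch file created with the standard File::Temp module (so that it does not touch any real file) and a made-up file handle named LOG; neither appears in the chapter.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the script exits
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

open (LOG, ">$path") || die "Cannot write $path: $!\n";    # > starts the file afresh
print LOG "first line\n";
close LOG;

open (LOG, ">>$path") || die "Cannot append $path: $!\n";  # >> adds to the end
print LOG "second line\n";
close LOG;

open (LOG, $path) || die "Cannot read $path: $!\n";        # no prefix: read
my @lines = <LOG>;            # in list context, <LOG> returns every line
close LOG;

print "Read ", scalar(@lines), " lines:\n", @lines;
```

Had the second open used > instead of >>, the file would have been emptied first and only the second line would remain.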

Writing Perl Subroutines

Although Perl includes a large assortment of built-in functions, you can add your own code modules in the form of subroutines. In fact, the Perl distribution comes with a large set of subroutines. Following is a simple script that illustrates the syntax of subroutines in Perl:

#!/usr/bin/perl
sub hello
{
# Make private copies of the arguments from the @_ array.
    my ($first,$last) = @_;
    print "Hello, $first $last\n";
}
$a = "Jane";
$b = "Doe";
&hello($a, $b);     # Call the subroutine.

When you run the preceding script, it displays the following output:

Hello, Jane Doe
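
Subroutines can also hand a value back to the caller with return. The following sketch is one possible illustration; the subroutine names full_name and total, and the values passed to them, are invented for the example.

```perl
#!/usr/bin/perl
use strict;      # additions to the chapter's style: strict, warnings, my
use warnings;

# Returns the two arguments joined into a single string
sub full_name
{
    my ($first, $last) = @_;      # private copies of the arguments
    return "$first $last";
}

# Returns the sum of however many numbers are passed in @_
sub total
{
    my $sum = 0;
    $sum += $_ foreach @_;
    return $sum;
}

my $name  = full_name("Jane", "Doe");
my $pages = total(650, 425);
print "Hello, $name ($pages pages)\n";
```

Because the arguments arrive as a single list in @_, a subroutine such as total can accept any number of values without declaring them in advance.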

Taking Stock of the Built-in Functions in Perl

Perl has nearly 200 built-in functions (also referred to as Perl functions), including functions that resemble the ones in the C Run-Time Library, as well as functions that access the operating system. You really need to go through the list of functions to appreciate the breadth of capabilities available in Perl. Table 24-3 briefly describes each of the Perl functions.

Insider Insight 

This chapter does not have enough space to cover these functions, but you can learn about the Perl functions by pointing your World Wide Web browser to the following address:

http://www.perl.com/CPAN//doc/manual/html/pod/perlfunc.html

This address connects you to the Comprehensive Perl Archive Network (CPAN); actually, it connects to the CPAN site nearest to you, so you can download the page with an overview of the Perl built-in functions. Click a function's name to view more detailed information about that function.

Table 24-3: A Quick Reference Guide to Perl Functions

Function Call

Description

abs(VALUE)

Returns the absolute value of the argument

accept(NEWSOCKET, GENERICSOCKET)

Waits for a connection on a socket

alarm(SECONDS)

Sends an alarm signal after a specified number of seconds

atan2(Y,X)

Returns the arctangent of Y/X

bind(SOCKET,NAME)

Associates a name to an already opened socket

binmode(FILEHANDLE)

Arranges for a file to be treated as binary

bless(REF,PACKAGE)

Makes a referenced item an object in a package

caller(EXPR)

Returns information about current subroutine calls

chdir(EXPR)

Changes the directory to the directory specified by EXPR

chmod(LIST)

Changes the permissions of a list of files

chomp(VARIABLE)

Removes trailing characters that match the current value of the special variable $/

chop(VARIABLE)

Chops off the last character (useful for removing the trailing newline character in a string)

chown(LIST)

Changes the owner of a list of files

chr(NUMBER)

Returns the character whose ASCII code is NUMBER

chroot(FILENAME)

Changes the root directory to the specified FILENAME

close(FILEHANDLE)

Closes the specified file

closedir(DIRHANDLE)

Closes the directory that had been opened by opendir

connect(SOCKET,NAME)

Initiates a connection to another system using a socket

cos(EXPR)

Returns the cosine of the angle EXPR (radians)

crypt(PLAINTEXT, SALT)

Encrypts a string

dbmclose(ASSOC_ARRAY)

Disassociates an associative array from a DBM file. (DBM, or data base manager, is a library of routines that manages DBM files, which are data files that contain key/data pairs.)

dbmopen(ASSOC, DBNAME, MODE)

Associates an associative array with a DBM file

defined(EXPR)

Returns true if EXPR is defined

delete $ASSOC{KEY}

Deletes a value from an associative array

die(LIST)

Prints LIST to standard error and exits the Perl program

do SUBROUTINE (LIST)

Calls a subroutine

dump(LABEL)

Causes a core dump

each(ASSOC_ARRAY)

Returns next key-value pair of an associative array

endgrent

Closes the /etc/group file in UNIX

endhostent

Closes the /etc/hosts file in UNIX

endnetent

Closes the /etc/networks file in UNIX

endprotoent

Closes the /etc/protocols file in UNIX

endpwent

Closes the /etc/passwd file in UNIX

endservent

Closes the /etc/services file in UNIX

eof(FILEHANDLE)

Returns true if end of file is reached

eval(EXPR)

Executes the EXPR as if it were a Perl program

exec(LIST)

Terminates the current Perl program by running another program (specified by LIST) in its place

exists($ASSOC{$KEY})

Returns true if the specified key exists in the associative array

exit(EXPR)

Exits the Perl program and returns EXPR

exp(EXPR)

Returns e raised to the power EXPR

fcntl(FILEHANDLE, FUNCTION, SCALAR)

Performs various control operations on a file

fileno(FILEHANDLE)

Returns the file descriptor for a file handle

flock(FILEHANDLE, OPERATION)

Locks a file so other processes cannot change the file (useful when multiple processes need to access a single file)

fork

Creates a child process and returns the child process ID

format NAME = picture line value list

Defines an output format to be used by the write function

formline(PICTURE, LIST)

Formats a list of values according to the contents of PICTURE

getc(FILEHANDLE)

Reads the next character from the file

getgrent

Returns group information from /etc/group

getgrgid(GID)

Looks up a group file entry by group number

getgrnam(NAME)

Looks up a group file entry by group name

gethostbyaddr(ADDR, ADDRTYPE)

Translates a network address to a name

gethostbyname(NAME)

Translates a network hostname to corresponding addresses

gethostent

Gets entries from the /etc/hosts file on UNIX

getlogin

Returns current login information in UNIX

getnetbyaddr(ADDR, ADDRTYPE)

Translates a network address to its corresponding network name

getnetbyname(NAME)

Translates a network name to its corresponding network address

getnetent

Gets entries from the /etc/networks file (or equivalent on non-UNIX systems)

getpeername(SOCKET)

Returns the socket address of the other end of a socket connection

getpgrp(PID)

Returns the current process group for the specified process ID

getppid

Returns the process ID of the parent process

getpriority(WHICH, WHO)

Returns the current priority of a process

getprotobyname(NAME)

Translates a protocol name into a number

getprotobynumber(NUMBER)

Translates a protocol number into a name

getprotoent

Gets networking protocol information from the /etc/protocols file in UNIX

getpwent

Gets entry from the password file (/etc/passwd in UNIX)

getpwnam(NAME)

Translates a user name into the corresponding entry in the password file

getpwuid(UID)

Translates a numeric user ID into the corresponding entry in the password file

getservbyname(NAME, PROTO)

Translates a service (port) name into the corresponding port number

getservbyport(PORT, PROTO)

Translates the service (port) number into a name

getservent

Gets entries from the /etc/services file in UNIX

getsockname(SOCKET)

Returns the address of this end of a socket connection

getsockopt(SOCKET, LEVEL, OPTNAME)

Returns the requested socket options

glob(EXPR)

Returns filenames corresponding to a wildcard expression

gmtime(EXPR)

Converts binary time into a nine-element list corresponding to Greenwich Mean Time (GMT)

goto(LABEL)

Jumps to the statement identified by the LABEL

grep(EXPR, LIST)

Searches LIST for occurrences of the expression

hex(EXPR)

Returns the decimal value corresponding to hexadecimal EXPR

index(STR, SUBSTR, POSITION)

Returns the position of the first occurrence of a string (the search begins at the character location specified by POSITION)

int(EXPR)

Returns the integer portion of EXPR

ioctl(FILEHANDLE, FUNCTION, SCALAR)

Controls various aspects of FILEHANDLE

join(EXPR, LIST)

Returns a single string by joining list elements

keys(ASSOC_ARRAY)

Returns an array of keys for an associative array

kill(LIST)

Sends a signal to a list of processes

last(LABEL)

Exits the loop identified by LABEL

lc(EXPR)

Returns the lowercase version of EXPR

lcfirst(EXPR)

Returns EXPR, after changing the first character to lowercase

length(EXPR)

Returns length in number of characters

link(OLDFILE, NEWFILE)

Creates NEWFILE as a link to OLDFILE

listen(SOCKET, QUEUESIZE)

Waits for incoming connections on a socket

local(LIST)

Makes a list of variables local to a subroutine

localtime(EXPR)

Converts binary time into a nine-element list corresponding to local time

lock SHAREDVAR

Locks a shared variable

log(EXPR)

Returns the logarithm (to base e) of EXPR

lstat(FILEHANDLE)

Returns file statistics for a file (if the file refers to a symbolic link, returns information about the symbolic link)

m/PATTERN/gimosx

Performs pattern matching

map(EXPR, LIST)

Evaluates the expression EXPR for each item of LIST

mkdir(FILENAME, MODE)

Creates the directory specified by FILENAME

msgctl(ID, CMD, ARG)

Performs message control operations on message queues

msgget(KEY, FLAGS)

Gets a message queue identifier corresponding to KEY

msgrcv(ID, VAR, SIZE, TYPE, FLAGS)

Receives a message from the message queue identifier ID

msgsnd(ID, MSG, FLAGS)

Sends a message to the message queue identified by ID

my(EXPR)

Declares one or more private variables that exist in a subroutine or a block enclosed in curly braces ({...})

next(LABEL)

Starts the next iteration of the loop identified by LABEL

no(Module LIST)

Stops using a Perl module

oct(EXPR)

Returns the decimal equivalent of an octal number in EXPR

open(FILEHANDLE, EXPR)

Opens a file whose name is in EXPR, and associates that file with FILEHANDLE

opendir(DIRHANDLE, EXPR)

Opens a directory whose name is in EXPR, and associates that directory with DIRHANDLE

ord(EXPR)

Returns the numeric ASCII code of the first character in EXPR

our EXPR

Declares the listed variables in EXPR as valid global variables within the enclosing block (similar to my, but does not create any local variables)

pack(TEMPLATE, LIST)

Takes a list of values and returns a string containing a packed binary structure (TEMPLATE specifies the packing)

package PACKAGENAME

Declares current file to be part of the specified package

pipe(READHANDLE, WRITEHANDLE)

Opens a pipe for reading and writing

pop(ARRAY)

Removes and returns the last element of an array

pos(SCALAR)

Returns the position where the last pattern match occurred (applies when a global search is performed with /PATTERN/g)

print(FILEHANDLE LIST)

Prints a list of items to a file identified by FILEHANDLE

printf(FILEHANDLE LIST)

Prints formatted output to a file

prototype FUNCTION

Returns the prototype of a function as a string (the prototype shows the declaration of the function, including its arguments)

push(ARRAY, LIST)

Appends values in LIST to the end of ARRAY

q/STRING/

Quotes a STRING, without replacing variable names with values (similar to a single quoted string)

qq/STRING/

Quotes a STRING, but replaces variable names with values (similar to a double-quoted string)

quotemeta(EXPR)

Returns the value of EXPR, after adding a backslash prefix for all characters that take on special meaning in regular expressions

qw/STRING/

Quotes a word list (splits STRING on whitespace and returns the words as a list)

qx/STRING/

Quotes a command (similar to backquotes)

rand(EXPR)

Returns a random fractional value greater than or equal to 0 and less than EXPR

read(FILEHANDLE, SCALAR, LENGTH)

Reads a specified number of bytes from the file

readdir(DIRHANDLE)

Reads directory entries from a directory handle

readlink(EXPR)

Returns the filename pointed to by a symbolic link

readpipe(EXPR)

Returns the output after executing EXPR as a system command

recv(SOCKET, SCALAR, LEN, FLAGS)

Receives a message from a socket

redo(LABEL)

Restarts the loop identified by LABEL

ref(EXPR)

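Returns a string naming the type of thing referenced (such as SCALAR, ARRAY, or HASH) if EXPR is a reference, and an empty string otherwise (a reference points to an object)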

rename(OLDNAME, NEWNAME)

Changes the name of a file from OLDNAME to NEWNAME

require(FNAME)

Includes the file specified by FNAME, and executes the Perl code in that file

reset(EXPR)

Clears global variables

return(LIST)

Returns from subroutine with the specified values

reverse(LIST)

Reverses the order of elements in LIST

rewinddir(DIRHANDLE)

Sets the current position to the beginning of the directory identified by DIRHANDLE

rindex(STR, SUBSTR)

Returns the last position of a substring in a string

rindex(STR, SUBSTR, POSITION)

Returns the position of the last occurrence of a substring in a string

rmdir(FILENAME)

Deletes the directory specified by FILENAME

s/PATTERN/REPLACEMENT/egimosx

Replaces PATTERN (a regular expression) with REPLACEMENT

scalar(EXPR)

Evaluates the expression EXPR in a scalar context

seek(FILEHANDLE, POSITION, WHENCE)

Moves to a new location in a file

seekdir(DIRHANDLE, POS)

Moves to a new position in a directory

select(FILEHANDLE)

Returns the currently selected file handle, and sets FILEHANDLE as the default file handle for output

select(RBITS, WBITS, EBITS, TIMEOUT)

Checks if one or more files are ready for input or output

semctl(ID, SEMNUM, CMD, ARG)

Controls the semaphores used for interprocess communication

semget(KEY, NSEMS, FLAGS)

Returns the semaphore ID corresponding to a key

semop(KEY, OPSTRING)

Performs a semaphore operation (semaphores are used for interprocess communications in UNIX System V)

send(SOCKET, MSG, FLAGS, TO)

Sends a message to a socket

setgrent

Resets reading of the group file (/etc/group in UNIX) to the beginning

sethostent(STAYOPEN)

Opens the host database (the /etc/hosts file in UNIX)

setnetent(STAYOPEN)

Opens the network database (the /etc/networks file in UNIX)

setpgrp(PID,PGRP)

Sets the current process group of a process

setpriority(WHICH, WHO, PRIORITY)

Sets the priority for a process

setprotoent(STAYOPEN)

Opens the protocol database (the /etc/protocols file in UNIX)

setpwent

Opens the /etc/passwd file in UNIX

setservent(STAYOPEN)

Opens the /etc/services file in UNIX

setsockopt(SOCKET, LEVEL, OPTNAME, OPTVAL)

Sets the specified socket options

shift(ARRAY)

Removes the first value of the array and returns it

shmctl(ID, CMD, ARG)

Controls shared memory settings, such as permission

shmget(KEY, SIZE, FLAGS)

Allocates a shared memory segment

shmread(ID, VAR, POS, SIZE)

Reads from the shared memory segment identified by ID

shmwrite(ID, STRING, POS, SIZE)

Writes to the shared memory segment identified by ID

shutdown(SOCKET, HOW)

Shuts down a socket connection

sin(EXPR)

Returns the sine of the angle specified by EXPR (in radians)

sleep(EXPR)

Sleeps for EXPR seconds

socket(SOCKET, DOMAIN, TYPE, PROTOCOL)

Opens a socket for a specified type and attaches it to the file handle SOCKET

socketpair(SOCKET1, SOCKET2, DOMAIN, TYPE, PROTOCOL)

Creates an unnamed pair of sockets

sort(LIST)

Sorts a list and returns the sorted list in an array

splice(ARRAY, OFFSET, LENGTH, LIST)

Replaces some ARRAY elements with LIST

split(/PATTERN/, EXPR, LIMIT)

Splits EXPR into an array of strings

sprintf(FORMAT, LIST)

Returns a string containing formatted output consisting of LIST elements formatted according to the FORMAT string

sqrt(EXPR)

Returns the square root of EXPR

srand(EXPR)

Sets the seed for random number generation

stat(FILEHANDLE)

Returns a 13-element list with statistics for a file

study(STRING)

Examines STRING in anticipation of doing many pattern matches on the string

substr(EXPR, OFFSET, LEN)

Returns a substring from the string EXPR

symlink(OLDFILE, NEWFILE)

Creates NEWFILE as a symbolic link to OLDFILE

syscall(LIST)

Calls the system function specified in the first element of LIST (and passes to that call the remaining list elements as arguments)

sysopen(FILEHANDLE, FILENAME, MODE, PERMS)

Opens a file named FILENAME and associates it with FILEHANDLE

sysread(FILEHANDLE, SCALAR, LENGTH, OFFSET)

Reads a specified number of bytes from a file

sysseek(FILEHANDLE, POSITION, WHENCE)

Sets FILEHANDLE's position to the specified POSITION in bytes (WHENCE refers to the reference point for setting the position and it can be one of SEEK_SET, SEEK_CUR, and SEEK_END)

system(LIST)

Executes the shell commands in LIST

syswrite(FILEHANDLE, SCALAR, LENGTH, OFFSET)

Writes a specified number of bytes to a file

tell(FILEHANDLE)

Returns the current file position in bytes from the beginning of a file

telldir(DIRHANDLE)

Returns the current position where the readdir function can read from a directory handle

tie(VARIABLE, PACKAGENAME, LIST)

Associates a variable to a package that implements the variable

time

Returns the number of seconds since 00:00:00 GMT 1/1/1970

times

Returns a four-element list with the user and system CPU times, in seconds, for this process and its child processes

tr/SEARCHLIST/REPLACE_LIST/cds

Translates a search list into a replacement list

truncate(FILEHANDLE, LENGTH)

Truncates the file FILEHANDLE to a specified LENGTH

uc(EXPR)

Returns the uppercase version of EXPR

ucfirst(EXPR)

Returns EXPR after changing the first character to uppercase

umask(EXPR)

Sets the permission mask to be used when creating a file (this specifies what operations are not allowed on the file)

undef(EXPR)

Undefines EXPR

unlink(LIST)

Deletes a list of files

unpack(TEMPLATE, EXPR)

Unpacks a string into an array and returns the array

unshift(ARRAY, LIST)

Prepends LIST to the beginning of ARRAY

untie(VARIABLE)

Breaks the binding between a variable and a package

use(MODULE)

Starts using a Perl module

utime(LIST)

Changes the access and modification time of a list of files

values(ASSOC_ARRAY)

Returns an array containing all values from an associative array

vec(EXPR, OFFSET, BITS)

Treats the string EXPR as a vector of integers, and returns a specified element of the vector

wait

Waits for a child process to terminate

waitpid(PID, FLAGS)

Waits for a specific child process (identified by PID) to terminate

wantarray

Returns true if the current subroutine has been called in an array context

warn(LIST)

Produces a warning message (specified by LIST) on the standard error

write(FILEHANDLE)

Writes a formatted record to a file

y/SEARCHLIST/REPLACE_LIST/cds

Translates a search list into a replacement list
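To see how a few of the functions in the preceding table work together, consider the following minimal sketch. The summarize subroutine and the name:score record layout are made up for this illustration; only split, map, grep, sort, uc, join, sprintf, and scalar are Perl built-ins.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Summarize a comma-separated record of name:score fields.
# The record layout is hypothetical and exists only for this example.
sub summarize {
    my ($record) = @_;

    # split breaks the record into "name:score" fields.
    my @fields = split(/,/, $record);

    # map turns each field into a two-element [name, score] array.
    my @pairs = map { [ split(/:/, $_) ] } @fields;

    # grep keeps only the pairs with a score of at least 50.
    my @passed = grep { $_->[1] >= 50 } @pairs;

    # uc uppercases each name and sort orders the names alphabetically.
    my @names = sort { $a cmp $b } map { uc($_->[0]) } @passed;

    # scalar counts the names; join and sprintf build the report line.
    return sprintf("%d passed: %s", scalar(@names), join(", ", @names));
}

print summarize("naba:72,leha:48,ivy:91"), "\n";   # prints "2 passed: IVY, NABA"
```

Each step feeds its result to the next, which is a common way to chain Perl's list functions.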

Understanding Perl Packages and Modules

A Perl package is a way to group together data and subroutines. Essentially, it's a way to use variable and subroutine names without conflicting with any names used in other parts of a program. The concept of a package existed in Perl 4.

A Perl package provides a way to control the namespace, a term that refers to the collection of variable and subroutine names. Although you may not be aware of this, when you write a Perl program, it automatically belongs to a package named main. Besides main, there are other Perl packages in the Perl library (these packages are in the /usr/lib/perl5 directory of your CentOS Linux system, under a subdirectory whose name is the same as the Perl version you are running), and you can define your own package, as well.

Perl modules, as you'll learn soon, are packages that follow specific guidelines.

You can think of a Perl package as a convenient way to organize a set of related Perl subroutines. Another benefit is that variable and subroutine names defined in a package do not conflict with names used elsewhere in the program. Thus, a variable named $count in one package remains unique to that package and does not conflict with a $count used elsewhere in a Perl program.

A Perl package is in a single file. The package statement is used at the beginning of the file to declare the file as a package and to give the package a name. For example, the file ctime.pl defines a number of subroutines and variables in a package named ctime. The ctime.pl file has the following package statement in various places:

package ctime;

The effect of this package declaration is that all subsequent variable names and subroutine names are considered to be in the ctime package. You can put such a package statement at the beginning of the file that implements the package.

What if you are implementing a package and you need to refer to a subroutine or variable in another package? As you might guess, all you need to do is specify both the package name and the variable (or subroutine) name. Perl 5 provides the following syntax for referring to a variable in another package:

$Package::Variable

Here Package is the name of the package, and Variable is the name of the variable in that package. If you leave the package name empty but keep the double colon (as in $::Variable), Perl assumes you are referring to a variable in the main package; a name with no qualifier at all refers to the current package. Note that C++ happens to use a similar syntax when referring to variables in another C++ class (a class is basically a collection of data and functions that serves as a template for an object).
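The following short sketch shows the package statement and the qualified-name syntax in one file. The package names Inventory and Visitors, the $count variables, and the report subroutines are all hypothetical; they serve only to show that identical names in different packages do not collide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two hypothetical packages, each with its own $count and report.
{
    package Inventory;
    our $count = 10;
    sub report { return "Inventory has $count items"; }
}

{
    package Visitors;
    our $count = 3;
    sub report { return "Visitors so far: $count"; }
}

# Code outside the blocks above belongs to the default package, main.
# A qualified name reaches a variable or subroutine in another package.
print "$Inventory::count\n";         # prints 10
print "$Visitors::count\n";          # prints 3
print Inventory::report(), "\n";
print Visitors::report(), "\n";

# Changing one $count leaves the other untouched.
$Visitors::count++;
print "$Inventory::count $Visitors::count\n";   # prints "10 4"
```

Enclosing each package statement in curly braces limits its effect to that block, so the rest of the file stays in the main package.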

To use a package in your program, you can simply call the require function with the package filename as an argument. For instance, there is a package named ctime defined in the file ctime.pl. That package includes the ctime subroutine that converts a binary time into a string. The following simple program uses the ctime package from the ctime.pl file:

#!/usr/bin/perl -w
# Use the ctime package defined in ctime.pl file.
require 'ctime.pl';
# Call the ctime subroutine.
$time = ctime(time());
# Print the time string.
print $time;

As you can see, this program uses the require function to bring the ctime.pl file into the program. When you run this program, it should print the current date and time, formatted as shown in the following sample output:

Sun Feb  9 18:25:46 2003

Perl 5 takes the concept of a package one step further and introduces the module, a package that follows certain guidelines and is designed to be reusable. Each module is a package that is defined in a file with the same name as the package but with a .pm extension. Each Perl object is implemented as a module. For example, the Shell object is implemented as the Shell module, stored in the file named Shell.pm.

Perl 5 comes with a large number of modules. You'll find these modules in the /usr/lib/perl5 directory under a subdirectory corresponding to your Perl version. For Perl Version 5.8.0, the Perl modules are in the /usr/lib/perl5/5.8.0 directory (the last part of the pathname is the Perl version number). Look for files with names that end in .pm (for Perl module).
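As a rough sketch of what a module contains, the following example defines a tiny module inline so that it runs as a single file. The names Greeting and hello are hypothetical, and the use of the standard Exporter module to share subroutine names is one common convention, not the only way to write a module. In practice, the code inside the first BEGIN block would live in a file named Greeting.pm that ends with the line 1; so that loading the file returns a true value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# What would normally be the contents of Greeting.pm (a made-up module).
# The BEGIN block makes the definitions available at compile time,
# just as they would be if the module were loaded from its own file.
BEGIN {
    package Greeting;
    require Exporter;
    our @ISA       = ('Exporter');   # inherit Exporter's import method
    our @EXPORT_OK = ('hello');      # names a caller may ask to import

    sub hello {
        my ($name) = @_;
        return "Hello, $name!";
    }
}

# With a real Greeting.pm file you would write: use Greeting qw(hello);
# Because the package is inline here, call its import method directly.
BEGIN { Greeting->import('hello'); }

print hello("world"), "\n";            # imported name, no prefix needed
print Greeting::hello("Perl"), "\n";   # fully qualified name also works
```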

Using a Perl Module

You can call the require function, or the use function, to include a Perl module in your program. For example, a Perl module named Cwd (defined, as expected, in the Cwd.pm file) provides a getcwd subroutine that returns the current directory. You can call the require function to include the Cwd module and call getcwd as follows:

require Cwd;  # You do not need the full filename.
$curdir = Cwd::getcwd();
print "Current directory = $curdir\n";

The first line brings the Cwd.pm file into this program. You do not have to specify the full filename; the require function automatically appends .pm to the module's name to figure out which file to include. The second line shows how you call a subroutine from the Cwd module. When you use require to include a module, you must invoke each subroutine with the Module::subroutine format.

If you were to rewrite this example program with the use function in place of require, it would take the following form:

use Cwd;  
$curdir = getcwd(); # no need for Cwd:: prefix
print "Current directory = $curdir\n";

The most significant difference is that you no longer need to qualify a subroutine name with the module name prefix (such as Cwd::).
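The reason the prefix becomes optional is that use does two things: it loads the module when the program is compiled, and it then asks the module to import its default subroutine names into your program. The following sketch spells out those two steps by hand for the Cwd module; it is meant only to illustrate the idea, and in real programs you would simply write use Cwd;.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Roughly what "use Cwd;" does behind the scenes: load the module at
# compile time, then import the names it exports by default.
BEGIN {
    require Cwd;
    Cwd->import();
}

# Both forms now call the same subroutine.
my $short = getcwd();
my $full  = Cwd::getcwd();

print "Current directory = $short\n";
print "The prefixed call returns the same directory\n" if $short eq $full;
```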

Using Perl Objects

An object is a data structure that includes both the data and the functions that operate on that data. Each object is an instance of a class that defines the object's type. For example, a rectangle class may have the four corners of the rectangle as data, and functions such as one that computes the rectangle's area and another that draws the rectangle. Then, each rectangle object can be an instance of the rectangle class, with different coordinates for the four corners. In this sense, an object is an instance of a class.

The functions (or subroutines) that implement the operations on an object's data are known as methods. That's terminology borrowed from Smalltalk, one of the earliest object-oriented programming languages.

Classes also suggest the notion of inheritance. You can define a new class of objects by extending the data or methods (or both) of an existing class. A common use of inheritance is to express the "is a" relationship among various classes of objects. Consider, for example, the geometric shapes. Because a circle is a shape and a rectangle is a shape, you can say that the circle and rectangle classes inherit from the shape class. In this case, the shape class is called a parent class or base class.
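The shape example can be sketched in Perl as follows. This is only an illustration under a few assumptions: the class names Shape, Rectangle, and Circle, the data fields, and the area and describe methods are made up for this example. In Perl, a class is a package, the bless function marks a data structure as an object of a class, and the @ISA array names the parent class.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base class: holds the data and provides methods shared by all shapes.
{
    package Shape;
    sub new {
        my ($class, %args) = @_;
        my $self = { %args };          # the object's data lives in a hash
        return bless($self, $class);   # bless ties the hash to the class
    }
    sub area { return 0; }             # subclasses override this method
    sub describe {
        my ($self) = @_;
        return sprintf("%s with area %.2f", ref($self), $self->area());
    }
}

# A rectangle "is a" shape, so Rectangle inherits from Shape.
{
    package Rectangle;
    our @ISA = ('Shape');
    sub area {
        my ($self) = @_;
        return $self->{width} * $self->{height};
    }
}

# A circle "is a" shape as well.
{
    package Circle;
    our @ISA = ('Shape');
    sub area {
        my ($self) = @_;
        return 3.14159265 * $self->{radius} ** 2;
    }
}

my @shapes = (
    Rectangle->new(width => 3, height => 4),
    Circle->new(radius => 1),
);

# describe is defined only in Shape, yet it calls the right area method.
foreach my $shape (@shapes) {
    print $shape->describe(), "\n";
}
```

Neither Rectangle nor Circle defines new or describe; both methods are inherited from the Shape base class.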

Creating and Using Perl Objects

A useful Perl object is the Shell object, which is implemented by the Perl module Shell.pm. That module comes with the Perl distribution and is in the /usr/lib/perl5/5.8.0 directory (for Perl Version 5.8.0).

As the name implies, the Shell object is meant for running shell commands from within Perl scripts. You can create a Shell object and have it execute commands.

To use the Shell object, follow these general steps:

  1. Place the following line to include the Shell module in your program:

    use Shell;

    You must include this line before you create a Shell object.

  2. To create a Shell object, use the following syntax:

    my $sh = Shell->new;

    where $sh is the reference to the Shell object.

  3. Run Linux commands by using the Shell object and capture any outputs by saving to an appropriate variable. For example, to save the directory listing of the /usr/lib/perl5/5.8.0 directory in an array named @modules, write the following:

    @modules = $sh->ls("/usr/lib/perl5/5.8.0/*.pm");

    Then you can work with this array of Perl module file names (that's what *.pm files are) any way you want. For example, to simply go through the array and print each string out, use the following while loop:

    while(@modules)
    {
      $mod = shift @modules;
      print $mod;
    } 

How do you know which methods of an object to call and in what order to call them? You have to read the object's documentation before you can use the object. The method names and the sequences of method invocation depend on what the object does.

Using the English Module

Perl includes several special variables with strange names, such as $_ for the default argument and $! for error messages corresponding to the last error. When you read a program, it can be difficult to guess what a special variable means. The result is that you may end up avoiding a special variable that could be useful in your program.

As a helpful gesture, Perl 5 provides the English module (English.pm), which enables you to use understandable names for various special variables in Perl. To use the English module, include the following line in your Perl program:

use English;

After that, you can refer to $_ as $ARG and $! as $ERRNO (these 'English' names can still be a bit cryptic, but they're definitely better than the punctuation marks).

The following program uses the English module and prints a few interesting variables:

#!/usr/bin/perl -w
# File: english.pl
use English;
if($PERL_VERSION ge v5.8.0)
{
    print "Perl version 5.8.0 or later\n";
}
else
{
    print "Perl version prior to 5.8.0\n";
}
print "Perl executable = $EXECUTABLE_NAME\n";
print "Script name = $PROGRAM_NAME\n"; 

When I run this script, the output appears as follows:

Perl version 5.8.0 or later
Perl executable = /usr/bin/perl
Script name = ./english.pl

The English module is handy because it lets you write Perl scripts in which you can refer to special variables by meaningful names. To learn more about the Perl special variables and their English names, type man perlvar at the shell prompt.
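As a further illustration, the following sketch uses $ARG and $ERRNO, the two English names mentioned earlier, along with $PROCESS_ID (the English name for $$). The word list and the file path are made-up values; the path is assumed not to exist on your system, so the open call should fail and set the error message.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use English;

# $ARG is the English name for $_, the default loop variable.
my @words = ('alpha', 'beta');
foreach (@words) {
    print "Word: $ARG\n";
}

# $ERRNO is the English name for $!, the last system error.
# The path below is a hypothetical name that should not exist.
my $missing = '/no/such/dir/no-such-file.txt';
if (!open(my $fh, '<', $missing)) {
    print "Cannot open $missing: $ERRNO\n";
}

# $PROCESS_ID is the English name for $$, the ID of this process.
print "Process ID = $PROCESS_ID\n";
```

The exact text of the error message depends on your system, which is why no sample output is shown here.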