Command-Line Interface

Contents:

Command Processing
Environment Variables

This chapter is about aiming Perl in the right direction before you fire it off. There are various ways to aim Perl, but the two primary ways are through switches on the command line and through environment variables. Switches are the more immediate and precise way to aim a particular command. Environment variables are more often used to set general policy.

Command Processing

It is fortunate that Perl grew up in the Unix world, because that means its invocation syntax works pretty well under the command interpreters of other operating systems, too. Most command interpreters know how to deal with a list of words as arguments and don't care if an argument starts with a minus sign. There are, of course, some sticky spots where you'll get fouled up if you move from one system to another. You can't use single quotes under MS-DOS as you do under Unix, for instance. And on systems like VMS, some wrapper code has to jump through hoops to emulate Unix I/O redirection. Wildcard interpretation is a wildcard. Once you get past those issues, however, Perl treats its switches and arguments much the same on any operating system.

Even when you don't have a command interpreter per se, it's easy to execute a Perl program from another program written in any language. Not only can the calling program pass arguments in the ordinary way, it can also pass information via environment variables and, if your operating system supports them, inherited file descriptors (see "Passing Filehandles" in "Interprocess Communication"). Even exotic argument-passing mechanisms can easily be encapsulated in a module, then brought into your Perl program via a simple use directive.

Perl parses command-line switches in the standard fashion.[1] That is, it expects any switches (words beginning with a minus) to come first on the command line. After that usually comes the name of the script, followed by any additional arguments to be passed into the script. Some of these additional arguments may themselves look like switches, but if so, they must be processed by the script, because Perl quits parsing switches as soon as it sees a nonswitch, or the special "--" switch that says, "I am the last switch."

[1] Presuming you agree that Unix is both standard and fashionable.
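
To see where Perl stops parsing switches, it can help to print what actually reaches the script. The following is a minimal sketch, not anything Perl ships with: the filename show_args.pl and the helper describe_args are made up for illustration.

```perl
#!/usr/bin/perl
# show_args.pl (hypothetical name): report what Perl left in @ARGV.
use strict;
use warnings;

# Perl consumes only the switches that precede the script name.
# Everything after the script name arrives here untouched, even
# words that look like switches.
sub describe_args {
    my @args = @_;
    return map { /^-/ ? "switch-like: $_" : "plain: $_" } @args;
}

print "$_\n" for describe_args(@ARGV);
```

Invoked as perl -w show_args.pl -v notes.txt, the -w goes to Perl, while -v and notes.txt are handed to the script, so the -v does not make Perl print its version banner.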

Perl gives you some flexibility in where you place the source code for your program. For small, quick-and-dirty jobs, you can program Perl entirely from the command line. For larger, more permanent jobs, you can supply a Perl script as a separate file. Perl looks for a script to compile and run in any one of these three ways:

  1. Specified line by line via -e switches on the command line. For example:

    % perl -e "print 'Hello, World.'"
    Hello, World.
    


  2. Contained in the file specified by the first filename on the command line. Systems supporting the #! notation on the first line of an executable script invoke interpreters this way on your behalf.
  3. Passed in implicitly via standard input. This method works only when there are no filename arguments; to pass arguments to a standard-input script you must use method 2, explicitly specifying a "-" for the script name. For example:

    % echo "print qq(Hello, @ARGV.)" | perl - World
    Hello, World.
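
The three ways of supplying a program can be compared side by side from within Perl itself. This is only a sketch under some assumptions: the helper run_perl is invented for the example, $^X is taken to be a usable path to the running interpreter, and the system is assumed to support IPC::Open2 (true of Unix-like systems).

```perl
use strict;
use warnings;
use IPC::Open2 qw(open2);
use File::Temp qw(tempfile);

# Run the current interpreter ($^X) with the given arguments, optionally
# feeding it text on standard input, and return whatever it printed.
sub run_perl {
    my ($args, $stdin) = @_;
    my $pid = open2(my $from_child, my $to_child, $^X, @$args);
    print {$to_child} $stdin if defined $stdin;
    close $to_child;
    my $output = do { local $/; <$from_child> };   # slurp all output
    waitpid $pid, 0;
    return $output;
}

my $code = 'print qq(Hello, @ARGV.)';

# Method 1: the source is given with -e.
my $via_e = run_perl(['-e', $code, 'World']);

# Method 2: the source is in the first file named on the command line.
my ($fh, $filename) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} $code;
close $fh;
my $via_file = run_perl([$filename, 'World']);

# Method 3: the source arrives on standard input, with "-" standing in
# for the script name so that "World" can still be passed as an argument.
my $via_stdin = run_perl(['-', 'World'], $code);

print "$_\n" for $via_e, $via_file, $via_stdin;   # "Hello, World." three times
```

Passing the arguments as a list avoids the command interpreter entirely, which sidesteps the quoting differences between systems.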
    


With methods 2 and 3, Perl starts parsing the input file from the beginning--unless you've specified a -x switch, in which case it scans for the first line starting with #! and containing the word "perl", and starts there instead. This is useful for running a script embedded in a larger message. If so, you might indicate the end of the script using the __END__ token.
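
As a rough illustration of -x, the following sketch writes a message with a script buried in it to a temporary file and then runs that file with -x. The message text is invented for the example, and the list form of a piped open is assumed to be available (it is on Unix-like systems).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A message with a script embedded in it. Without -x, the first line
# would be a syntax error.
my $message = <<'END_MESSAGE';
Dear reader, the script you asked for is below.

#!/usr/bin/perl
print "Hello from the embedded script\n";
__END__
Everything after the __END__ token is ignored by the compiler.
END_MESSAGE

my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} $message;
close $fh;

# With -x, perl skips ahead to the #! line that mentions "perl".
open(my $pipe, '-|', $^X, '-x', $filename) or die "Cannot run $^X: $!";
my $output = do { local $/; <$pipe> };
close $pipe;

print $output;   # Hello from the embedded script
```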

Whether or not you use -x, the #! line is always examined for switches when the line is parsed. That way, if you're on a platform that allows only one argument with the #! line, or worse, doesn't even recognize the #! line as special, you can still get consistent switch behavior regardless of how Perl was invoked, even if -x was used to find the beginning of the script.

Warning: because older versions of Unix silently chop off kernel interpretation of the #! line after 32 characters, some switches may end up getting to your program intact, and others not; you could even get a "-" without its letter, if you're not careful. You probably want to make sure that all your switches fall either before or after that 32-character boundary. Most switches don't care whether they're processed redundantly, but getting a "-" instead of a complete switch would cause Perl to try to read its source code from the standard input instead of from your script. And a partial -I switch could also cause odd results. However, some switches do care if they are processed twice, like combinations of -l and -0. Either put all the switches after the 32-character boundary (if applicable), or replace the use of -0DIGITS with BEGIN{ $/ = "\0DIGITS"; }. Of course, if you're not on a Unix system, you're guaranteed not to have this particular problem.
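
Here is a small sketch of the BEGIN replacement just described. It assumes the switch being replaced is -072 (072 is the octal code for a colon), and it reads from an in-memory string purely so that the example is self-contained.

```perl
use strict;
use warnings;

# Stands in for a -072 switch on the #! line: set the input record
# separator to ":" before any of the rest of the program runs.
BEGIN { $/ = "\072" }

my $data = "alpha:beta:gamma";
open(my $in, '<', \$data) or die "Cannot open in-memory file: $!";
my @records = <$in>;      # each read now stops at a colon
close $in;

chomp @records;           # chomp removes the trailing $/, here the colon
print join("|", @records), "\n";   # alpha|beta|gamma
```

Because the assignment lives in the program text rather than on the #! line, it is processed exactly once no matter how the script is invoked.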

Parsing of #! switches starts from where "perl" is first mentioned in the line. The sequences "-*" and "- " are specifically ignored for the benefit of emacs users, so that, if you're so inclined, you can say:

#!/bin/sh -- # -*- perl -*- -p
eval 'exec perl -S $0 ${1+"$@"}'
    if 0;


and Perl will see only the -p switch. The fancy "-*- perl -*-" gizmo tells emacs to start up in Perl mode; you don't need it if you don't use emacs. The -S mess is explained later under the description of that switch.

A similar trick involves the env(1) program, if you have it:

#!/usr/bin/env perl


The previous examples use a relative path to the Perl interpreter, getting whatever version is first in the user's path. If you want a specific version of Perl, say, perl5.6.1, place it directly in the #! line's path, whether with the env program, with the -S mess, or with regular #! processing.

If the #! line does not contain the word "perl", the program named after the #! is executed instead of the Perl interpreter. For example, suppose you have an ordinary Bourne shell script out there that says:

#!/bin/sh
echo "I am a shell script"


If you feed that file to Perl, then Perl will run /bin/sh for you. This is slightly bizarre, but it helps people on machines that don't recognize #!, because--by setting their SHELL environment variable--they can tell a program (such as a mailer) that their shell is /usr/bin/perl, and Perl will then dispatch the program to the correct interpreter for them, even though their kernel is too stupid to do so.
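
This dispatching can be tried out with a short experiment. The sketch below assumes a Unix-like system with a Bourne shell at /bin/sh and support for the list form of a piped open; the temporary file stands in for the shell script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small Bourne shell script to a temporary file.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} qq{#!/bin/sh\necho "I am a shell script"\n};
close $fh;

# Hand the file to perl rather than to sh. Because the #! line does not
# mention "perl", the interpreter should pass the file along to /bin/sh.
open(my $pipe, '-|', $^X, $filename) or die "Cannot run $^X: $!";
my $output = do { local $/; <$pipe> };
close $pipe;

print $output;
```

If all goes as described, the shell's greeting comes out even though the file was given to Perl.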

But back to Perl scripts that are really Perl scripts. After locating your script, Perl compiles the entire program into an internal form (see "Compiling"). If any compilation errors arise, execution does not even begin. (This is unlike the typical shell script or command file, which might run part-way through before finding a syntax error.) If the script is syntactically correct, it is executed. If the script runs off the end without hitting an exit or die operator, an implicit exit(0) is supplied by Perl to indicate successful completion to your caller. (This is unlike the typical C program, where you're likely to get a random exit status if your program just terminates in the normal way.)
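
Both behaviors, compiling everything before running anything and the implicit exit(0), can be observed with a sketch like the following. The helper run_snippet is invented for the example, and the list form of a piped open is assumed to be supported. The compiler's complaint about the second snippet goes to standard error and is not captured.

```perl
use strict;
use warnings;

# Run a one-line program under the current interpreter ($^X) and return
# what it wrote to standard output together with its exit status.
sub run_snippet {
    my ($code) = @_;
    open(my $pipe, '-|', $^X, '-e', $code) or die "Cannot run $^X: $!";
    my $output = do { local $/; <$pipe> };
    close $pipe;                         # sets $? to the child's wait status
    return (defined $output ? $output : '', $? >> 8);
}

# Runs off the end without calling exit: Perl supplies exit(0).
my ($out_ok, $exit_ok) = run_snippet('print "done\n"');

# The syntax error comes after the print, yet the print never happens,
# because nothing runs until the whole program has compiled.
my ($out_bad, $exit_bad) = run_snippet('print "started\n"; if (');

printf "clean program:  output=%-8s exit=%d\n", "'" . join('\n', split /\n/, $out_ok) . "'", $exit_ok;
printf "broken program: output=%-8s exit=%d\n", "'" . join('\n', split /\n/, $out_bad) . "'", $exit_bad;
```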

#! and Quoting on Non-Unix Systems

Unix's #! technique can be simulated on other systems.

Command interpreters on non-Unix systems often have extraordinarily different ideas about quoting than Unix shells have. You'll need to learn the special characters in your command interpreter (*, \, and " are common) and how to protect whitespace and these special characters to run one-liners via the -e switch. You might also have to change a single % to a %%, or otherwise escape it, if that's a special character for your shell.

On some systems, you may have to change single quotes to double quotes. But don't do that on Unix or Plan9 systems, or anything running a Unix-style shell, such as systems from the MKS Toolkit or from the Cygwin package produced by the Cygnus folks, now at Red Hat. Microsoft's new Unix emulator called Interix is also starting to look, ahem, interixing.

For example, on Unix and Mac OS X, use:

% perl -e 'print "Hello world\n"' 

On Macintosh (pre Mac OS X), use:

print "Hello world\n"

then run "Myscript" or Shift-Command-R.

On VMS, use:

$ perl -e "print ""Hello world\n""" 

or again with qq//:

$ perl -e "print qq(Hello world\n)" 

And on MS-DOS et al., use:

A:> perl -e "print \"Hello world\n\"" 

or use qq// to pick your own quotes:

A:> perl -e "print qq(Hello world\n)" 

The problem is that neither of those is reliable: it depends on the command interpreter you're using there. If 4DOS were the command shell, this would probably work better:

perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""


The CMD.EXE program seen on Windows seems to have slipped a lot of standard Unix shell functionality in when nobody was looking, but just try to find documentation for its quoting rules.

On the Macintosh,[3] all this depends on which environment you are using. The MacPerl shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Macintosh's non-ASCII characters as control characters.

[3] At least, prior to the release of Mac OS X, which, happily enough, is a BSD-derived system.

There is no general solution to all of this. It's just a mess. If you aren't on a Unix system but want to do command-line things, your best bet is to acquire a better command interpreter than the one your vendor supplied you, which shouldn't be too hard.

Or just write it all in Perl, and forget the one-liners.

Location of Perl

Although this may seem obvious, Perl is useful only when users can easily find it. When possible, it's good for both /usr/bin/perl and /usr/local/bin/perl to be symlinks to the actual binary. If that can't be done, system administrators are strongly encouraged to put Perl and its accompanying utilities into a directory typically found along a user's standard PATH, or in some other obvious and convenient place.

In this tutorial, we use the standard #!/usr/bin/perl notation on the first line of the program to mean whatever particular mechanism works on your system. If you care about running a specific version of Perl, use a specific path:

#!/usr/local/bin/perl5.6.0


If you just want to be running at least some version number, but don't mind higher ones, place a statement like this near the top of your program:

use v5.6.0;


(Note: earlier versions of Perl use numbers like 5.005 or 5.004_05. Nowadays we would think of those as 5.5.0 and 5.4.5, but versions of Perl older than 5.6.0 won't understand that notation.)
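
For a program that might meet an older interpreter, one option is to state the requirement in the decimal style instead. The following is a small sketch, assuming a minimum of 5.6.0; it also prints the running version in both styles, using the built-in variables $] and $^V.

```perl
use 5.006;    # decimal spelling of 5.6.0
use strict;
use warnings;

# $] holds the running interpreter's version in the decimal style,
# and $^V holds it in the dotted style.
my $decimal = $];
my $dotted  = sprintf("%vd", $^V);

print "decimal: $decimal\n";
print "dotted:  $dotted\n";
```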

Switches

A single-character command-line switch without its own argument may always be combined (bundled) with a switch following it.

#!/usr/bin/perl -spi.bak # same as -s -p -i.bak
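
Bundling is easy to check by experiment. The sketch below compares -le against -l -e, two switches chosen only because they need no input file; the helper output_of is invented for the example, and the list form of a piped open is assumed to be supported.

```perl
use strict;
use warnings;

# Run the current interpreter ($^X) with the given arguments and
# return what it printed.
sub output_of {
    my @args = @_;
    open(my $pipe, '-|', $^X, @args) or die "Cannot run $^X: $!";
    my $output = do { local $/; <$pipe> };
    close $pipe;
    return $output;
}

my $code = 'print "bundled"';

my $bundled  = output_of('-le', $code);        # -l and -e in one word
my $separate = output_of('-l', '-e', $code);   # the same switches spelled out

print $bundled eq $separate ? "same output\n" : "different output\n";
```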


Switches are also known as options or flags. Whatever you call them, here are the ones Perl recognizes: