References

Contents:

What Is a Reference?
Creating References
Using Hard References
Symbolic References
Braces, Brackets, and Quoting

For both practical and philosophical reasons, Perl has always been biased in favor of flat, linear data structures. And for many problems, this is just what you want.

Suppose you wanted to build a simple table (two-dimensional array) showing vital statistics--age, eye color, and weight--for a group of people. You could do this by first creating an array for each individual:

@john = (47, "brown", 186);
@mary = (23, "hazel", 128);
@bill = (35, "blue", 157);

You could then construct a single, additional array consisting of the names of the other arrays:

@vitals = ('john', 'mary', 'bill');

To change John's eyes to "red" after a night on the town, we want a way to change the contents of the @john array given only the simple string "john". This is the basic problem of indirection, which various languages solve in various ways. In C, the most common form of indirection is the pointer, which lets one variable hold the memory address of another variable. In Perl, the most common form of indirection is the reference.

What Is a Reference?

In our example, $vitals[0] has the value "john". That is, it contains a string that happens to be the name of another (global) variable. We say that the first variable refers to the second, and this sort of reference is called a symbolic reference, since Perl has to look up @john in a symbol table to find it. (You might think of symbolic references as analogous to symbolic links in the filesystem.) We'll talk about symbolic references later in this chapter.
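As a rough sketch of what that lookup involves, the snippet below uses the string in $vitals[0] to reach @john. It assumes the arrays are package variables (declared here with our), since a symbol-table lookup cannot see lexicals, and it switches off the strict 'refs' check for one small block only.

```perl
use strict;
use warnings;

# Package (global) variables: symbolic references can only find names
# that live in a symbol table, never "my" variables.
our @john   = (47, "brown", 186);
our @mary   = (23, "hazel", 128);
our @bill   = (35, "blue", 157);
our @vitals = ('john', 'mary', 'bill');

{
    # "use strict" forbids symbolic references, so relax it in this block only.
    no strict 'refs';

    my $name = $vitals[0];    # the plain string "john"
    ${$name}[1] = "red";      # looks up @john (in package main) by name
}

print "John's eyes: $john[1]\n";    # prints "John's eyes: red"
```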

The other kind of reference is a hard reference, and this is what most Perl developers use to accomplish their indirections (if not their indiscretions). We call them hard references not because they're difficult, but because they're real and solid. If you like, think of hard references as real references and symbolic references as fake references. It's like the difference between true friendship and mere name-dropping. When we don't specify which type of reference we mean, it's a hard reference. Figure 8-1 depicts a variable named $bar referring to the contents of a scalar named $foo which has the value "bot".

Figure 8-1. A hard reference and a symbolic reference
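The hard-reference half of the figure might be written like this. It is only a minimal sketch: the variable names and the value "bot" come from the text, and the backslash operator used to take the reference is the standard Perl way to create one.

```perl
use strict;
use warnings;

my $foo = "bot";
my $bar = \$foo;      # $bar holds a hard reference to the value of $foo

print "$$bar\n";      # dereference: follows the reference, prints "bot"

$$bar = "robot";      # assigning through the reference changes $foo itself
print "$foo\n";       # prints "robot"
```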

Unlike a symbolic reference, a real reference refers not to the name of another variable (which is just a container for a value) but to an actual value itself, some internal glob of data. There's no good word for that thing, but when we have to, we'll call it a referent. Suppose, for example, that you create a hard reference to a lexically scoped array named @array. This hard reference, and the referent it refers to, will continue to exist even after @array goes out of scope. A referent is only destroyed when all the references to it are eliminated.
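The scoping claim above can be checked with a few lines. This is a small illustration, not anything special: @array is declared inside a bare block, and $ref (a name chosen for this sketch) is declared outside it.

```perl
use strict;
use warnings;

my $ref;
{
    my @array = (1, 2, 3);   # lexical: the name @array exists only in this block
    $ref = \@array;          # a second reference to the same referent
}

# The name @array is gone, but the referent is still reachable through $ref.
print "@$ref\n";             # prints "1 2 3"
```

Once $ref itself goes away or is assigned something else, nothing refers to the array any longer and Perl can reclaim it.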

A referent doesn't really have a name of its own, apart from the references to it. To put it another way, every Perl variable name lives in some kind of symbol table, holding one hard reference to its underlying (otherwise nameless) referent. That referent might be simple, like a number or string, or complex, like an array or hash. Either way, there's still exactly one reference from the variable to its value. You might create additional hard references to the same referent, but if so, the variable doesn't know (or care) about them.[1]

[1] If you're curious, you can determine the underlying refcount with the Devel::Peek module, bundled with Perl.
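Devel::Peek's Dump function prints the count (as a REFCNT field) to standard error, which is handy for reading but awkward to capture. The sketch below swaps in the core B module instead, so the count can be stored in a variable. The refcount helper is hypothetical, and because the temporary reference passed to it is itself counted, only the difference between the two readings is meaningful.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the reference count of whatever $_[0] refers to.
sub refcount { return B::svref_2object($_[0])->REFCNT }

my @array  = (1, 2, 3);
my $before = refcount(\@array);

my $extra  = \@array;               # one additional hard reference
my $after  = refcount(\@array);

print "additional references: ", $after - $before, "\n";
```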

A symbolic reference is just a string that happens to name something in a package symbol table. It's not so much a distinct type as it is something you do with a string. But a hard reference is a different beast entirely. It is the third of the three kinds of fundamental scalar data types, the other two being strings and numbers. A hard reference doesn't need to know something's name just to refer to it, and it's actually completely normal for there to be no name to use in the first place. Such totally nameless referents are called anonymous; we discuss them in "Anonymous Data" below.

To reference a value, in the terminology of this chapter, is to create a hard reference to it. (There's a special operator for this creative act.) The reference so created is simply a scalar, which behaves in all familiar contexts just like any other scalar. To dereference this scalar means to use the reference to get at the referent. Both referencing and dereferencing occur only when you invoke certain explicit mechanisms; implicit referencing or dereferencing never occurs in Perl. Well, almost never.
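To make the two verbs concrete, here is a minimal sketch with an array. The variable names are invented for the illustration. Notice that copying the scalar copies the reference, not the array, and that nothing is dereferenced until the code asks for it.

```perl
use strict;
use warnings;

my @list = (10, 20, 30);
my $ref  = \@list;             # referencing: explicit, via the backslash

print ref($ref), "\n";         # "ARRAY": the kind of thing it refers to
my $copy = $ref;               # an ordinary scalar assignment; the array is not copied

push @$copy, 40;               # dereferencing: explicit, via the @ prefix
print scalar(@list), "\n";     # 4: both references reach the same array
print $ref->[0], "\n";         # 10: the arrow is another explicit dereference
```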

A function call can use implicit pass-by-reference semantics--if it has a prototype declaring it that way. If so, the caller of the function doesn't explicitly pass a reference, although you still have to dereference it explicitly within the function. See the section "Prototypes" in "Subroutines". And to be perfectly honest, there's also some behind-the-scenes dereferencing happening when you use certain kinds of filehandles, but that's for backward compatibility and is transparent to the casual user. Finally, two built-in functions, bless and lock, each take a reference for their argument but implicitly dereference it to work their magic on what lies behind. But those confessions aside, the basic principle still holds that Perl isn't interested in muddling your levels of indirection.
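The prototype case might look something like the following. The function append_all is made up for this sketch. Its (\@@) prototype asks Perl to pass the first argument, which must be a real array, as a reference, so the caller never writes a backslash. The definition has to be seen before the call for the prototype to take effect.

```perl
use strict;
use warnings;

# Hypothetical function. The (\@@) prototype tells Perl to pass the first
# argument, which must be a real array, as a reference to that array.
sub append_all (\@@) {
    my ($aref, @values) = @_;
    push @$aref, @values;      # explicit dereference inside the function
    return scalar @$aref;
}

my @stats = (47, "brown");
append_all @stats, 186;        # caller writes @stats, not \@stats
print "@stats\n";              # prints "47 brown 186"
```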

A reference can point to any data structure. Since references are scalars, you can store them in arrays and hashes, and thus build arrays of arrays, arrays of hashes, hashes of arrays, arrays of hashes and functions, and so on. There are examples of these in "Data Structures".
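Two of those combinations can be sketched briefly. The names %op and %vitals are assumptions made for this illustration, and the square brackets and sub { } blocks build anonymous referents, which are covered under "Anonymous Data".

```perl
use strict;
use warnings;

# A hash whose values are references to anonymous subroutines.
my %op = (
    add => sub { $_[0] + $_[1] },
    mul => sub { $_[0] * $_[1] },
);

# A hash whose values are references to anonymous arrays.
my %vitals = (
    john => [47, "brown", 186],
    mary => [23, "hazel", 128],
);

print $op{add}->(2, 3), "\n";    # 5
print $op{mul}->(2, 3), "\n";    # 6
print $vitals{mary}[1], "\n";    # hazel
```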

Keep in mind, though, that Perl arrays and hashes are internally one-dimensional. That is, their elements can hold only scalar values (strings, numbers, and references). When we use a phrase like "array of arrays", we really mean "array of references to arrays", just as when we say "hash of functions" we really mean "hash of references to subroutines". But since references are the only way to implement such structures in Perl, it follows that the shorter, less accurate phrase is not so inaccurate as to be false, and therefore should not be totally despised, unless you're into that sort of thing.
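Returning to the table of vital statistics, a short sketch shows what "array of references to arrays" means in practice. Here @vitals is redefined to hold hard references rather than names, an assumption made for this illustration.

```perl
use strict;
use warnings;

my @john = (47, "brown", 186);
my @mary = (23, "hazel", 128);
my @bill = (35, "blue", 157);

# Each element of @vitals is a scalar: a reference to one of the arrays above.
my @vitals = (\@john, \@mary, \@bill);

print ref($vitals[0]), "\n";   # ARRAY
$vitals[0][1] = "red";         # short for $vitals[0]->[1]
print "$john[1]\n";            # red: the row and @john are the same array
print scalar(@vitals), "\n";   # 3: the outer array holds three scalars, not nine values
```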