Using Tied Variables

In older versions of Perl, a user could call dbmopen to tie a hash to a UNIX DBM file. Whenever the hash was accessed, the database file on disk (really just a hash, not a full relational database) would be magically[16] read from or written to. In modern versions of Perl, you can bind any ordinary variable (scalar, array, or hash) to an implementation class by using tie. (The class may or may not implement a DBM file.) You can break this association with untie.

[16] In this case, magically means "transparently doing something very complicated". You know the old saying - any technology sufficiently advanced is indistinguishable from a Perl script.

The tie function creates the association by creating an object internally to represent the variable to the class. If you have a tied variable, but want to get at the underlying object, there are two ways to do it. First, the tie function returns a reference to the object. But if you didn't bother to store that object reference anywhere, you could still retrieve it using the tied function.

$object = tie VARIABLE, CLASSNAME, LIST
untie VARIABLE
$object = tied VARIABLE
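To make the three calls concrete, here is a minimal sketch. The class Upper is hypothetical, invented only for this illustration; it keeps a string and hands it back in uppercase.

```perl
use strict;
use warnings;

# Hypothetical implementation class: stores a scalar, returns it uppercased.
package Upper;
sub TIESCALAR { my ($class, $initial) = @_; return bless { value => $initial }, $class }
sub FETCH     { my ($self) = @_; return uc $self->{value} }
sub STORE     { my ($self, $new) = @_; $self->{value} = $new }

package main;

my $name;
my $object = tie $name, 'Upper', 'camel';   # tie returns the new object
my $same   = tied $name;                    # tied retrieves it later

print "value: $name\n";                     # FETCH is called implicitly
print "same object\n" if $object == $same;  # references compare by address

undef $object;                              # drop our own references
undef $same;                                # before breaking the tie
untie $name;                                # $name is an ordinary scalar again
print "still tied\n" if tied $name;         # prints nothing
```

Both tie and tied hand back the same object reference, so either one can be used to reach the object behind the variable.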

The tie function binds the variable to the class package that provides the methods for that variable. Once this magic has been performed, accessing a tied variable automatically triggers method calls in the proper class. All the complexity of the class is hidden behind magic method calls. The method names are predetermined, since they're called implicitly from within the innards of Perl. These names are in ALL CAPS, which is a convention in Perl culture that indicates that the routines are called implicitly rather than explicitly - just like BEGIN, END, and DESTROY. And AUTOLOAD too, for that matter.

You can almost think of tie as a funny kind of bless, except that it blesses a bare variable instead of a thingy reference, and takes extra parameters, like a constructor. That's because it actually does call a constructor internally. (That's one of the magic methods we mentioned.) This constructor is passed the CLASSNAME you specified, as well as any additional arguments you supply in the LIST. It is not passed the VARIABLE, however. The only way the constructor can tell which kind of VARIABLE is being tied is by knowing its own method name. This is not the customary constructor name, new, but rather one of TIESCALAR, TIEARRAY, or TIEHASH. (You can likely figure out which name goes with which variable type.) The constructor just returns an object reference in the normal fashion, and doesn't worry about whether it was called from tie - which it may not have been, since you can call these methods directly if you like. (Indeed, if you've tied your variable to a class that provides other methods not accessible through the variable, you must call the other methods directly yourself, via the object reference. These extra methods might provide services like file locking or other forms of transaction protection.)

As in any constructor, these constructors must bless a reference to a thingy and return it as the implementation object. The thingy inside the implementation object doesn't have to be of the same type as the variable you're tying to. It does have to be a properly blessed object, though. See the example below on tied arrays, which uses a hash object to hold information about an array.
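As a rough sketch of these points, the hypothetical class Capped below ties a scalar to a hash-based object, takes one extra argument from the LIST, and offers an extra method (ceiling) that is reachable only through the object. All of the names here are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical class: a scalar that never exceeds a ceiling given at tie time.
package Capped;
sub TIESCALAR {
    my ($class, $ceiling) = @_;          # CLASSNAME plus the LIST; no VARIABLE
    # The variable is a scalar, but the object behind it is a hash.
    return bless { ceiling => $ceiling, value => 0 }, $class;
}
sub FETCH { my ($self) = @_; return $self->{value} }
sub STORE {
    my ($self, $new) = @_;
    $self->{value} = $new > $self->{ceiling} ? $self->{ceiling} : $new;
}
# An extra method, reachable only through the object, not the variable.
sub ceiling { my ($self) = @_; return $self->{ceiling} }

package main;

my $level;
my $obj = tie $level, 'Capped', 10;      # Perl calls Capped->TIESCALAR(10)
$level = 25;                             # STORE clamps to the ceiling
print "level is $level, ceiling is ", $obj->ceiling, "\n";

# The constructor is an ordinary method and can be called directly.
my $loose = Capped->TIESCALAR(3);
$loose->STORE(7);
print "direct object holds ", $loose->FETCH, "\n";
```

The object built by the direct call behaves the same way; it simply has no variable bound to it.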

The tie function will not use or require a module for you - you must do that explicitly yourself. (On the other hand, the dbmopen emulator function will, for backward compatibility, attempt to use one or another DBM implementation. But you can preempt its selection with an explicit use, provided the module you use is one of the modules in dbmopen's list of modules to try. See the AnyDBM_File module documentation for a fuller explanation.)

Tying Scalars

A class implementing a tied scalar must define the following methods: TIESCALAR, FETCH, STORE, and possibly DESTROY. These routines will be invoked implicitly when you tie a variable (TIESCALAR), read a tied variable (FETCH), or assign a value to a tied variable (STORE). The DESTROY method is called (as always) when the last reference to the object disappears. (This may or may not happen when you call untie, which destroys the reference used by the tie, but doesn't destroy any outstanding references you may have squirreled away elsewhere.) The FETCH and STORE methods are triggered when you access the variable that's been tied, not the object it's been tied to. If you have a handle on the object (either returned by the initial tie or retrieved later via tied), you can access the underlying object yourself without automatically triggering its FETCH or STORE methods.
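The timing of DESTROY is easier to see in a small experiment. The sketch below uses a hypothetical class, Noisy, whose only job is to record when its object is destroyed. It also shows that peeking at the object through the saved reference bypasses FETCH.

```perl
use strict;
use warnings;
no warnings 'untie';             # we keep $obj across the untie on purpose

package Noisy;                   # hypothetical class that logs its destruction
our @LOG;
sub TIESCALAR { my ($class) = @_; my $v; return bless \$v, $class }
sub FETCH     { my ($self) = @_; return $$self }
sub STORE     { my ($self, $new) = @_; $$self = $new }
sub DESTROY   { push @LOG, 'destroyed' }

package main;

my $var;
my $obj = tie $var, 'Noisy';
$var = 42;                       # goes through STORE
print "direct peek: $$obj\n";    # reads the object itself, no FETCH call

untie $var;                      # drops only the reference held by the tie
print "after untie: ", scalar(@Noisy::LOG), " destroyed\n";   # still 0

undef $obj;                      # the last reference goes away here
print "after undef: ", scalar(@Noisy::LOG), " destroyed\n";   # now 1
```

Had we not saved the return value of tie, the untie alone would have released the last reference and triggered DESTROY.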

Let's look at each of these methods in turn, using as our example an imaginary class called Nice.[17] Variables tied to this class are scalars containing process priorities, and each such variable is implicitly associated with an object that contains a particular process ID, such as the ID of the currently running process or of the parent process. (Presumably you'd name your variables to remind you which process you're referring to.) Variables are tied to the class this way:

[17] UNIX priorities are associated with the word "nice" because they're inverted from what you'd expect. Higher priorities run slower, hence are "nicer" to other processes. A more portable module might prefer a less UNIX-centric name like Priority. But if we were writing this class for the Perl library, we'd probably call it Tie::Priority or some such, to fit the library's hierarchical naming scheme. Not everything can be a top-level class, or things will get rather confused. Not to mention people.

use Nice;                           # load the Nice.pm module
tie $his_speed, 'Nice', getppid();
tie $my_speed,  'Nice', $$;

Once the variables have been tied, their previous contents are no longer accessible. The internally forged connection between the variable and the object takes precedence over ordinary variable semantics.

For example, let's say you copy a variable that's been tied:

$speed = $his_speed;

Instead of reading the value in the ordinary fashion from the $his_speed scalar variable, Perl implicitly calls the FETCH method on the associated underlying object. It's as though you'd written this:

$speed = (tied $his_speed)->FETCH();

Or if you'd captured the object returned by the tie, you could simply use that reference instead of the tied function, as in the following sample code.

$myobj = tie $my_speed, 'Nice', $$;
$speed = $my_speed;             # through the implicit interface
$speed = $myobj->FETCH();       # same thing, explicitly

You can use $myobj to call methods other than the implicit ones, such as those provided by the DB_File class. However, one normally minds one's own business and leaves the underlying object alone, which is why you often see the return value from tie ignored. You can still get at it if you need it later.

That's the external view of it. For our implementation, we'll use the BSD::Resource class (found on CPAN, but not included with Perl) to access the PRIO_PROCESS, PRIO_MIN, and PRIO_MAX constants from your system. Here's the preamble of our class, which we will put into a file named Nice.pm:

package Nice;
use Carp;                   # Propagates error messages nicely.
use BSD::Resource;          # Use these hooks into the OS.
use strict;                 # Enforce some discipline on ourselves,
use vars '$DEBUG';          # but exempt $DEBUG from discipline.

The Carp module provides the functions carp(), croak(), and confess(), which we'll use in various spots below. As usual, see the Carp module documentation for more.

The use strict would ordinarily disallow the use of unqualified package variables like $DEBUG, but we then declared the global with use vars, so it's exempt. Otherwise we'd have to say $Nice::DEBUG everywhere. But it is a global, and other modules can turn on debugging in our module by setting $Nice::DEBUG to some other value before using our module.
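Here is a small sketch of that arrangement. The package Chatty and its note function are hypothetical stand-ins, not part of Nice; the point is only how use vars and the fully qualified name fit together.

```perl
use strict;
use warnings;

package Chatty;              # hypothetical stand-in for a module like Nice
use vars '$DEBUG';           # declare the package global; strict now allows $DEBUG
sub note {
    my ($msg) = @_;
    return $DEBUG ? "debug: $msg" : '';
}

package main;

print Chatty::note('quiet by default'), "\n";   # $DEBUG is unset: an empty line
$Chatty::DEBUG = 1;                              # outside code uses the full name
print Chatty::note('now visible'), "\n";
```

Inside the package the short name $DEBUG is enough, while code in any other package has to spell out $Chatty::DEBUG.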

Tying Arrays

A class implementing a tied ordinary array must define the following methods: TIEARRAY, FETCH, STORE, and perhaps DESTROY.

Tied arrays are incomplete. There are, as yet, no defined methods to deal with $#ARRAY access (which is hard, since it's an lvalue), nor with the other obvious array functions, like push, pop, shift, unshift, and splice. This means that a tied array doesn't behave like an untied one. You can't even determine the length of the array. But if you use the tied arrays only for simple read and write access you'll be OK. These restrictions will be removed in a future release.

For the purpose of this discussion, we will implement an array whose indices are fixed at its creation. If you try to access anything beyond those bounds, you will cause an exception.

require Bounded_Array;
tie @ary, 'Bounded_Array', 2;   # maximum allowable subscript is 2
$| = 1;
for $i (0 .. 10) {
    print "setting index $i: ";
    $ary[$i] = 10 * $i;         # should raise exception on 3
    print "value of element $i now $ary[$i]\n";
}

The preamble code for the class is as follows:

package Bounded_Array;
use Carp;
use strict;
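One way such a class might be fleshed out is sketched below. This is an illustration under our own assumptions, not necessarily the implementation intended here: the package name Bounded_Sketch, the field names BOUND and ARRAY, and the wording of the error messages are all invented. The object is a hash that records the upper bound alongside the real storage.

```perl
use strict;
use warnings;

package Bounded_Sketch;      # hypothetical name, not the chapter's own class
use Carp;

sub TIEARRAY {
    my ($class, $bound) = @_;
    croak "usage: tie ARRAY, CLASS, MAX_SUBSCRIPT"
        unless defined $bound && $bound =~ /^\d+$/;
    # A hash object describing the array: its bound and its storage.
    return bless { BOUND => $bound, ARRAY => [] }, $class;
}
sub FETCH {
    my ($self, $idx) = @_;
    croak "Array OOB: $idx > $self->{BOUND}" if $idx > $self->{BOUND};
    return $self->{ARRAY}[$idx];
}
sub STORE {
    my ($self, $idx, $value) = @_;
    croak "Array OOB: $idx > $self->{BOUND}" if $idx > $self->{BOUND};
    return $self->{ARRAY}[$idx] = $value;
}

package main;

tie my @ary, 'Bounded_Sketch', 2;
$ary[$_] = 10 * $_ for 0 .. 2;           # subscripts 0..2 are allowed
print "element 2 is $ary[2]\n";
eval { $ary[3] = 30 };                   # STORE croaks, eval traps it
print "caught: $@" if $@;
```

Because the checks use croak, the error is reported at the line that used the bad subscript rather than inside the class.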

Tying Hashes

For historical reasons, hashes have the most complete and useful tie implementation. A class implementing a tied associative array must define various methods. TIEHASH is the constructor. FETCH and STORE access the key/value pairs. EXISTS reports whether a key is present in the hash, and DELETE deletes one. CLEAR empties the hash by deleting all the key/value pairs. FIRSTKEY and NEXTKEY implement the keys and each built-in functions to iterate over all the keys. And DESTROY (if defined) is called when the tied object is deallocated.
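To show how the methods fit together, here is a compact sketch of a complete tied hash. The class FoldHash is hypothetical: it folds every key to lowercase, so that lookups are case-insensitive.

```perl
use strict;
use warnings;

package FoldHash;    # hypothetical class: hash keys are case-insensitive
sub TIEHASH  { my ($class) = @_; return bless { data => {} }, $class }
sub FETCH    { my ($self, $k) = @_; return $self->{data}{lc $k} }
sub STORE    { my ($self, $k, $v) = @_; $self->{data}{lc $k} = $v }
sub EXISTS   { my ($self, $k) = @_; return exists $self->{data}{lc $k} }
sub DELETE   { my ($self, $k) = @_; return delete $self->{data}{lc $k} }
sub CLEAR    { my ($self) = @_; %{ $self->{data} } = () }
sub FIRSTKEY {
    my ($self) = @_;
    keys %{ $self->{data} };                   # reset the inner hash's iterator
    return scalar each %{ $self->{data} };     # first key, or undef if empty
}
sub NEXTKEY  { my ($self) = @_; return scalar each %{ $self->{data} } }

package main;

tie my %h, 'FoldHash';
$h{Perl} = 'camel';                       # STORE
print "fetch: $h{PERL}\n";                # FETCH with a differently cased key
print "exists\n" if exists $h{perl};      # EXISTS
print "keys: @{[ sort keys %h ]}\n";      # FIRSTKEY, then NEXTKEY until undef
delete $h{pErL};                          # DELETE
%h = ();                                  # CLEAR
```

The iteration methods simply lean on the inner hash's own iterator, which keeps FIRSTKEY and NEXTKEY to a line or two each.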

If this seems like a lot, then feel free to inherit most of these methods from the standard Tie::Hash module, redefining only the interesting ones. See the Tie::Hash module documentation for details.
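As a brief sketch of that approach, the hypothetical class UpperValues below inherits from Tie::StdHash, the ready-made class that the Tie::Hash module supplies, and overrides STORE alone.

```perl
use strict;
use warnings;

package UpperValues;     # hypothetical class: stored values are uppercased
require Tie::Hash;       # Tie::StdHash is defined inside the Tie::Hash module
our @ISA = ('Tie::StdHash');

# Redefine only STORE. For Tie::StdHash the object is itself the hash of data.
sub STORE {
    my ($self, $key, $value) = @_;
    $self->{$key} = uc $value;
}

package main;

tie my %shout, 'UpperValues';
$shout{greeting} = 'hello';
print "$shout{greeting}\n";              # inherited FETCH
print join(',', keys %shout), "\n";      # inherited FIRSTKEY and NEXTKEY
```

Everything except STORE, including the constructor, comes from the parent class.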

Remember that Perl distinguishes a key not existing in the hash from a key that exists with an undefined value. The two possibilities can be tested with the exists and defined functions, respectively.
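The distinction is easy to demonstrate with an ordinary hash; the hash and its keys below are made up for the illustration. On a tied hash, exists would go through the class's EXISTS method, and defined would test the value returned by FETCH.

```perl
use strict;
use warnings;

my %color = (sky => 'blue', glass => undef);

for my $key (qw(sky glass fog)) {
    my $state = !exists  $color{$key} ? 'no such key'
              : !defined $color{$key} ? 'key exists, value undefined'
              :                         'key exists, value defined';
    print "$key: $state\n";
}
```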

Because functions like keys and values may return huge array values when used on large hashes (like tied DBM files), you may prefer to use the each function to iterate over such. For example:

# print out B-news history file offsets
use NDBM_File;
tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0);
while (($key,$val) = each %HIST) {
    print $key, ' = ', unpack('L',$val), "\n";
}
untie(%HIST);

(But does anyone run B-news any more?)

Here's an example of a somewhat peculiar tied hash class: it gives you a hash representing a particular user's dotfiles (that is, files whose names begin with a period). You index into the hash with the name of the file (minus the period) and you get back that dotfile's contents. For example:

use DotFiles;
tie %dot, "DotFiles";
if ( $dot{profile} =~ /MANPATH/ or
     $dot{login}   =~ /MANPATH/ or
     $dot{cshrc}   =~ /MANPATH/ ) {
    print "you've set your manpath\n";
}

Here's another way to use our tied class:

# third argument is name of user whose dot files we will tie to
tie %him, 'DotFiles', 'daemon';
foreach $f ( keys %him ) {
    printf "daemon dot file %s is size %d\n", $f, length $him{$f};
}

In our DotFiles example we implement the object as a regular hash containing several important fields, of which only the {CONTENTS} field will be what the user thinks of as the real hash. Here are the fields:

Here's the start of DotFiles.pm:

package DotFiles;
use Carp;
sub whowasi { (caller(1))[3] . '()' }
my $DEBUG = 0;
sub debug { $DEBUG = @_ ? shift : 1 }

For our example, we want to be able to emit debugging information to help in tracing during development. We also keep one convenience function around internally to help print out warnings; whowasi() returns the name of the function that called it, since (caller(1))[3] is the subroutine name recorded in the stack frame one level above whowasi() itself.
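A quick way to check what whowasi() reports is to call it from a throwaway function. In this sketch, fetch_like is a hypothetical stand-in for one of the tied-hash methods.

```perl
use strict;
use warnings;

sub whowasi { (caller(1))[3] . '()' }

# Hypothetical caller, standing in for one of the tied-hash methods.
sub fetch_like {
    return "called from " . whowasi();
}

print fetch_like(), "\n";    # prints "called from main::fetch_like()"
```

The name comes back fully qualified with its package, which is convenient in warnings that may surface far from the class itself.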

Here are the methods for the DotFiles tied hash.