Tying Filehandles
A class implementing a tied filehandle should define the following methods: TIEHANDLE
and at least one of PRINT
, PRINTF
, WRITE
, READLINE
, GETC
, and READ
. The class can also provide a DESTROY
method, and BINMODE
, OPEN
, CLOSE
, EOF
, FILENO
, SEEK
, TELL
, READ
, and WRITE
methods to enable the corresponding Perl built-ins for the tied filehandle. (Well, that isn't quite true: WRITE
corresponds to syswrite
and has nothing to do with Perl's built-in write
function for printing with format
declarations.)
Tied filehandles are especially useful when Perl is embedded in another program (such as Apache or vi) and output to STDOUT
or STDERR
needs to be redirected in some special way.
But filehandles don't actually have to be tied to a file at all. You can use output statements to build up an in-memory data structure and input statements to read them back in. Here's an easy way to reverse a sequence of print
and printf
statements without reversing the individual lines:
package ReversePrint; use strict; sub TIEHANDLE { my $class = shift; bless [], $class; } sub PRINT { my $self = shift; push @$self, join '', @_; } sub PRINTF { my $self = shift; my $fmt = shift; push @$self, sprintf $fmt, @_; } sub READLINE { my $self = shift; pop @$self; } package main; my $m = "--MORE--\n"; tie *REV, "ReversePrint"; # Do some prints and printfs. print REV "The fox is now dead.$m"; printf REV <<"END", int rand 10000000; The quick brown fox jumps over over the lazy dog %d times! END print REV <<"END"; The quick brown fox jumps over the lazy dog. END # Now read back from the same handle. print while <REV>;
This prints:
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over over the lazy dog 3179357 times! The fox is now dead.--MORE--
Filehandle-Tying Methods
For our extended example, we'll create a filehandle that uppercases strings printed to it. Just for kicks, we'll begin the file with <SHOUT>
when it's opened and end with </SHOUT>
when it's closed. That way we can rant in well-formed XML.
Here's the top of our Shout.pm file that will implement the class:
package Shout; use Carp; # So we can croak our errors
We'll now list the method definitions in Shout.pm.
- CLASSNAME
->TIEHANDLE(
LIST)
- This is the constructor for the class, which as usual should return a blessed reference.
sub TIEHANDLE { my $class = shift; my $form = shift; open my $self, $form, @_ or croak "can't open $form@_: $!"; if ($form =~ />/) { print $self "<SHOUT>\n"; $$self->{WRITING} = 1; # Remember to do end tag } return bless $self, $class; # $self is a glob ref }
Here, we open a new filehandle according to the mode and filename passed to thetie
operator, write<SHOUT>
to the file, and return a blessed reference to it. There's a lot of stuff going on in thatopen
statement, but we'll just point out that, in addition to the usual "open or die" idiom, themy $self
furnishes an undefined scalar toopen
, which knows to autovivify it into a typeglob. The fact that it's a typeglob is also significant, because not only does the typeglob contain the real I/O object of the file, but it also contains various other handy data structures that come along for free, like a scalar ($$$self
), an array (@$$self
), and a hash (%$$self
). (We won't mention the subroutine,&$$self
.)The
$form
is the filename-or-mode argument. If it's a filename,@_
is empty, so it behaves as a two-argument open. Otherwise,$form
is the mode for the rest of the arguments.After the open, we test to see whether we should write the beginning tag. If so, we do. And right away, we use one of those glob data structures we mentioned. That
$$self->{WRITING}
is an example of using the glob to store interesting information. In this case, we remember whether we did the beginning tag so we know whether to do the corresponding end tag. We're using the%$$self
hash, so we can give the field a decent name. We could have used the scalar as$$$self
, but that wouldn't be self-documenting. (Or it would only be self-documenting, depending on how you look at it.) - SELF
->PRINT(
LIST)
- This method implements a
print
to the tied handle. The LIST is whatever was passed toprint
. Our method below uppercases each element of LIST:
sub PRINT { my $self = shift; print $self map {uc} @_; }
- SELF
->READLINE
- This method supplies the data when the filehandle is read from via the angle operator (
<FH>
) orreadline
. The method should returnundef
when there is no more data.
sub READLINE { my $self = shift; return <$self>; }
Here, we simplyreturn <$self>
so that the method will behave appropriately depending on whether it was called in scalar or list context. - SELF
->GETC
- This method runs whenever
getc
is used on the tied filehandle.
sub GETC { my $self = shift; return getc($self); }
Like several of the methods in ourShout
class, theGETC
method simply calls its corresponding Perl built-in and returns the result. - SELF
->OPEN(
LIST)
- Our
TIEHANDLE
method itself opens a file, but a program using theShout
class that callsopen
afterward triggers this method.
sub OPEN { my $self = shift; my $form = shift; my $name = "$form@_"; $self->CLOSE; open($self, $form, @_) or croak "can't reopen $name: $!"; if ($form =~ />/) { print $self "<SHOUT>\n" or croak "can't start print: $!"; $$self->{WRITING} = 1; # Remember to do end tag } else { $$self->{WRITING} = 0; # Remember not to do end tag } return 1; }
We invoke our ownCLOSE
method to explicitly close the file in case the user didn't bother to. Then we open a new file with whatever filename was specified in theopen
and shout at it. - SELF
->CLOSE
- This method deals with the request to close the handle. Here, we seek to the end of the file and, if that was successful, print
</SHOUT>
before using Perl's built-inclose
.
sub CLOSE { my $self = shift; if ($$self->{WRITING}) { $self->SEEK(0, 2) or return; $self->PRINT("</SHOUT>\n") or return; } return close $self; }
- SELF
->SEEK(
LIST)
- When you
seek
on a tied filehandle, theSEEK
method gets called.
sub SEEK { my $self = shift; my ($offset, $whence) = @_; return seek($self, $offset, $whence); }
- SELF
->TELL
- This method is invoked when
tell
is used on the tied handle.
sub TELL { my $self = shift; return tell $self; }
- SELF
->PRINTF(
LIST)
- This method is run whenever
printf
is used on the tied handle. The LIST will contain the format and the items to be printed.
sub PRINTF { my $self = shift; my $template = shift; return $self->PRINT(sprintf $template, @_); }
Here, we usesprintf
to generate the formatted string and pass it toPRINT
for uppercasing. There's nothing that requires you to use the built-insprintf
function though. You could interpret the percent escapes to suit your own purpose. - SELF
->READ(
LIST)
- This method responds when the handle is read using
read
orsysread
. Note that we modify the first argument of LIST "in-place", mimickingread
's ability to fill in the scalar passed in as its second argument.
sub READ { my ($self, undef, $length, $offset) = @_; my $bufref = \$_[1]; return read($self, $$bufref, $length, $offset); }
- SELF
->WRITE(
LIST)
- This method gets invoked when the handle is written to with
syswrite
. Here, we uppercase the string to be written.
sub WRITE { my $self = shift; my $string = uc(shift); my $length = shift || length $string; my $offset = shift || 0; return syswrite $self, $string, $length, $offset; }
- SELF
->EOF
- This method returns a Boolean value when a filehandle tied to the
Shout
class is tested for its end-of-file status usingeof
.
sub EOF { my $self = shift; return eof $self; }
- SELF
->BINMODE(
DISC)
- This method specifies the I/O discipline to be used on the filehandle. If none is specified, it puts the tied filehandle into binary mode (the
:raw
discipline), for filesystems that distinguish between text and binary files.
sub BINMODE { my $self = shift; my $disc = shift || ":raw"; return binmode $self, $disc; }
That's how you'd write it, but it's actually useless in our case because theopen
already wrote on the handle. So in our case we should probably make it say:
sub BINMODE { croak("Too late to use binmode") }
- SELF
->FILENO
- This method should return the file descriptor (
fileno
) associated with the tied filehandle by the operating system.
sub FILENO { my $self = shift; return fileno $self; }
- SELF
->DESTROY
- As with the other types of ties, this method is triggered when the tied object is about to be destroyed. This is useful for letting the object clean up after itself. Here, we make sure that the file is closed, in case the program forgot to call
close
. We could just sayclose $self
, but it's better to invoke theCLOSE
method of the class. That way if the designer of the class decides to change how files are closed, thisDESTROY
method won't have to be modified.
sub DESTROY { my $self = shift; $self->CLOSE; # Close the file using Shout's CLOSE method. }
Here's a demonstration of our Shout
class:
#!/usr/bin/perl use Shout; tie(*FOO, Shout::, ">filename"); print FOO "hello\n"; # Prints HELLO. seek FOO, 0, 0; # Rewind to beginning. @lines = <FOO>; # Calls the READLINE method. close FOO; # Close file explicitly. open(FOO, "+<", "filename"); # Reopen FOO, calling OPEN. seek(FOO, 8, 0); # Skip the "<SHOUT>\n". sysread(FOO, $inbuf, 5); # Read 5 bytes from FOO into $inbuf. print "found $inbuf\n"; # Should print "hello". seek(FOO, -5, 1); # Back up over the "hello". syswrite(FOO, "ciao!\n", 6); # Write 6 bytes into FOO. untie(*FOO); # Calls the CLOSE method implicitly.
After running this, the file contains:
<SHOUT> CIAO! </SHOUT>
Here are some more strange and wonderful things to do with that internal glob. We use the same hash as before, but with new keys
PATHNAME
and DEBUG
. First we install a stringify overloading so that printing one of our objects reveals the pathname (see "Overloading"):
# This is just so totally cool! use overload q("") => sub { $_[0]->pathname }; # This is the stub to put in each function you want to trace. sub trace { my $self = shift; local $Carp::CarpLevel = 1; Carp::cluck("\ntrace magical method") if $self->debug; } # Overload handler to print out our path. sub pathname { my $self = shift; confess "i am not a class method" unless ref $self; $$self->{PATHNAME} = shift if @_; return $$self->{PATHNAME}; } # Dual moded. sub debug { my $self = shift; my $var = ref $self ? \$$self->{DEBUG} : \our $Debug; $$var = shift if @_; return ref $self ? $$self->{DEBUG} || $Debug : $Debug; }
And then call
trace
on entry to all your ordinary methods like this:
sub GETC { $_[0]->trace; # NEW my($self) = @_; getc($self); }
And also set the pathname in
TIEHANDLE
and OPEN
:
sub TIEHANDLE { my $class = shift; my $form = shift; my $name = "$form@_"; # NEW open my $self, $form, @_ or croak "can't open $name: $!"; if ($form =~ />/) { print $self "<SHOUT>\n"; $$self->{WRITING} = 1; # Remember to do end tag } bless $self, $class; # $fh is a glob ref $self->pathname($name); # NEW return $self; } sub OPEN { $_[0]->trace; # NEW my $self = shift; my $form = shift; my $name = "$form@_"; $self->CLOSE; open($self, $form, @_) or croak "can't reopen $name: $!"; $self->pathname($name); # NEW if ($form =~ />/) { print $self "<SHOUT>\n" or croak "can't start print: $!"; $$self->{WRITING} = 1; # Remember to do end tag } else { $$self->{WRITING} = 0; # Remember not to do end tag } return 1; }
Somewhere you also have to call
$self->debug(1)
to turn debugging on. When you do that, all your Carp::cluck
calls will produce meaningful messages. Here's one that we get while doing the reopen above. It shows us three deep in method calls, as we're closing down the old file in preparation for opening the new one:
trace magical method at foo line 87 Shout::SEEK('>filename', '>filename', 0, 2) called at foo line 81 Shout::CLOSE('>filename') called at foo line 65 Shout::OPEN('>filename', '+<', 'filename') called at foo line 141
Creative Filehandles
You can tie
the same filehandle to both the input and the output of a two-ended pipe. Suppose you wanted to run the bc(1) (arbitrary precision calculator) program this way:
use Tie::Open2; tie *CALC, 'Tie::Open2', "bc -l"; $sum = 2; for (1 .. 7) { print CALC "$sum * $sum\n"; $sum = <CALC>; print "$_: $sum"; chomp $sum; } close CALC;
One would expect it to print this:
: 4 2: 16 3: 256 4: 65536 5: 4294967296 6: 18446744073709551616 7: 340282366920938463463374607431768211456
One's expectations would be correct if one had the bc(1) program on one's computer, and one also had
Tie::Open2
defined as follows. This time we'll use a blessed array for our internal object. It contains our two actual filehandles for reading and writing. (The dirty work of opening a double-ended pipe is done by IPC::Open2
; we're just doing the fun part.)
package Tie::Open2; use strict; use Carp; use Tie::Handle; # do not inherit from this! use IPC::Open2; sub TIEHANDLE { my ($class, @cmd) = @_; no warnings 'once'; my @fhpair = \do { local(*RDR, *WTR) }; bless $_, 'Tie::StdHandle' for @fhpair; bless(\@fhpair => $class)->OPEN(@cmd) || die; return \@fhpair; } sub OPEN { my ($self, @cmd) = @_; $self->CLOSE if grep {defined} @{ $self->FILENO }; open2(@$self, @cmd); } sub FILENO { my $self = shift; [ map { fileno $self->[$_] } 0,1 ]; } for my $outmeth ( qw(PRINT PRINTF WRITE) ) { no strict 'refs'; *$outmeth = sub { my $self = shift; $self->[1]->$outmeth(@_); }; } for my $inmeth ( qw(READ READLINE GETC) ) { no strict 'refs'; *$inmeth = sub { my $self = shift; $self->[0]->$inmeth(@_); }; } for my $doppelmeth ( qw(BINMODE CLOSE EOF)) { no strict 'refs'; *$doppelmeth = sub { my $self = shift; $self->[0]->$doppelmeth(@_) && $self->[1]->$doppelmeth(@_); }; } for my $deadmeth ( qw(SEEK TELL)) { no strict 'refs'; *$deadmeth = sub { croak("can't $deadmeth a pipe"); }; } 1;
The final four loops are just incredibly snazzy, in our opinion. For an explanation of what's going on, look back at the section "Closures as function templates" in "References".
Here's an even wackier set of classes. The package names should give you a clue as to what they do.
use strict; package Tie::DevNull; sub TIEHANDLE { my $class = shift; my $fh = local *FH; bless \$fh, $class; } for (qw(READ READLINE GETC PRINT PRINTF WRITE)) { no strict 'refs'; *$_ = sub { return }; } package Tie::DevRandom; sub READLINE { rand() . "\n"; } sub TIEHANDLE { my $class = shift; my $fh = local *FH; bless \$fh, $class; } sub FETCH { rand() } sub TIESCALAR { my $class = shift; bless \my $self, $class; } package Tie::Tee; sub TIEHANDLE { my $class = shift; my @handles; for my $path (@_) { open(my $fh, ">$path") || die "can't write $path"; push @handles, $fh; } bless \@handles, $class; } sub PRINT { my $self = shift; my $ok = 0; for my $fh (@$self) { $ok += print $fh @_; } return $ok == @$self; }
The
Tie::Tee
class emulates the standard Unix tee(1) program, which sends one stream of output to multiple different destinations. The Tie::DevNull
class emulates the null device, /dev/null on Unix systems. And the Tie::DevRandom
class produces random numbers either as a handle or as a scalar, depending on whether you call TIEHANDLE
or TIESCALAR
! Here's how you call them:
package main; tie *SCATTER, "Tie::Tee", qw(tmp1 - tmp2 >tmp3 tmp4); tie *RANDOM, "Tie::DevRandom"; tie *NULL, "Tie::DevNull"; tie my $randy, "Tie::DevRandom"; for my $i (1..10) { my $line = <RANDOM>; chomp $line; for my $fh (*NULL, *SCATTER) { print $fh "$i: $line $randy\n"; } }
This produces something like the following on your screen:
: 0.124115571686165 0.20872819474074 2: 0.156618299751194 0.678171662366353 3: 0.799749050426126 0.300184963960792 4: 0.599474551447884 0.213935286029916 5: 0.700232143543861 0.800773751296671 6: 0.201203608274334 0.0654303290639575 7: 0.605381294683365 0.718162304090487 8: 0.452976481105495 0.574026269121667 9: 0.736819876983848 0.391737610662044 10: 0.518606540417331 0.381805078272308
But that's not all! It wrote to your screen because of the
-
in the *SCATTER
tie
above. But that line also told it to create files tmp1, tmp2, and tmp4, as well as to append to file tmp3. (We also wrote to the *NULL
filehandle in the loop, though of course that didn't show up anywhere interesting, unless you're interested in black holes.)