XML::TreeBuilder

XML::TreeBuilder is a factory class that builds a tree of XML::Element objects. The XML::Element class inherits from the older HTML::Element class that comes with the HTML::Tree package. Thus, you can build the tree from a file with XML::TreeBuilder and use the XML::Element accessor methods to move around, grab data from the tree, and change the structure of the tree as needed. We're going to focus on that last thing: using accessor methods to assemble a tree of our own.

For example, we're going to write a program that manages a simple, prioritized "to-do" list that uses an XML datafile to store entries. Each item in the list has an "immediate" or "long-term" priority. The program will initialize the list if it's empty or the file is missing. The user can add items by using -i or -l (for "immediate" or "long-term," respectively), followed by a description. Finally, the program updates the datafile and prints it out on the screen.

The first part of the program, listed in Example 6-7, sets up the tree structure. If the datafile can be found, it is read and used to build the tree. Otherwise, the tree is built from scratch.

Example 6-7. To-do list manager, first part

use XML::TreeBuilder;
use XML::Element;
use Getopt::Std;
# command line options
#   -i  immediate
#   -l  long-term
my %opts;
getopts( 'il', \%opts );
# initialize tree
my $data = 'data.xml';
my $tree;
# if file exists and is not empty, parse it and build the tree
if( -r $data and -s $data ) {
    $tree = XML::TreeBuilder->new( );
    $tree->parse_file( $data );
}
else {
    # otherwise, create a new tree from scratch
    print "Creating new data file.\n";
    my @now = localtime;
    # localtime numbers months 0-11, so add 1 to get month/day
    my $date = ( $now[4] + 1 ) . '/' . $now[3];
    $tree = XML::Element->new( 'todo-list', 'date' => $date );
    $tree->push_content( XML::Element->new( 'immediate' ));
    $tree->push_content( XML::Element->new( 'long-term' ));
}

A few notes on initializing the structure are necessary. The minimal structure of the datafile is this:

<todo-list date="DATE">
<immediate></immediate>
<long-term></long-term>
</todo-list>

As long as the <immediate> and <long-term> elements are present, we have somewhere to put schedule items. Thus, we need to create three elements using the XML::Element constructor method new( ), which uses its first argument to set the name of the element. The first call of this method also passes the pair 'date' => $date to create an attribute named "date." After creating element nodes, we have to connect them. The push_content( ) method appends a node to the end of an element's content list.
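The construction steps just described can be tried on their own, without the rest of the program. The following is a minimal sketch; the helper name build_skeleton and the fixed date string are our own inventions for illustration and are not part of XML::Element.

```perl
use strict;
use warnings;
use XML::Element;

# build_skeleton is a hypothetical helper, not an XML::Element method.
# It returns the root of the minimal to-do structure.
sub build_skeleton {
    my ($date) = @_;
    # first argument names the element; the remaining pairs become attributes
    my $root = XML::Element->new( 'todo-list', 'date' => $date );
    # append the two empty containers to the root's content list
    for my $name ( 'immediate', 'long-term' ) {
        $root->push_content( XML::Element->new( $name ) );
    }
    return $root;
}

my $skeleton = build_skeleton( '7/3' );
print $skeleton->as_XML, "\n";
```

Because push_content( ) always appends, the order of the calls determines the order of the children in the serialized document.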

The next part of the program updates the datafile, adding a new item if the user supplies one. Where to put the item depends on the option used (-i or -l). We use the as_XML method to output XML, as shown in Example 6-8.

Example 6-8. To-do list manager, second part

# add new entry and update file
if( %opts ) {
    my $item = XML::Element->new( 'item' );
    $item->push_content( shift @ARGV );
    my $place;
    if( $opts{ 'i' }) {
        $place = $tree->find_by_tag_name( 'immediate' );
    }
    elsif( $opts{ 'l' }) {
        $place = $tree->find_by_tag_name( 'long-term' );
    }
    $place->push_content( $item );
}
# rewrite the datafile; report the system error if it can't be opened
open( my $fh, '>', $data ) or die( "Couldn't update schedule: $!" );
print $fh $tree->as_XML;
close $fh;
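The option handling above assumes that a description always follows the flag. As a hedged sketch of how the choice of section could be isolated and checked, the snippet below uses only Getopt::Std; the helper choose_section and the simulated command line are assumptions made for illustration and do not appear in the original program.

```perl
use strict;
use warnings;
use Getopt::Std;

# choose_section is a hypothetical helper: it maps the parsed options
# to the name of the element that should receive the new item.
sub choose_section {
    my ($opts) = @_;
    return 'immediate' if $opts->{i};
    return 'long-term' if $opts->{l};
    return;    # no priority flag was given
}

# Simulate a command line such as: -l "climb K-2"
local @ARGV = ( '-l', 'climb K-2' );

my %opts;
getopts( 'il', \%opts ) or die "Usage: $0 [-i|-l] description\n";

my $section     = choose_section( \%opts );
my $description = shift @ARGV;    # getopts leaves non-option arguments in @ARGV

# only report an addition when both a flag and a description are present
if( defined $section and defined $description ) {
    print "Add '$description' under <$section>\n";
}
```

As in the program itself, -i takes precedence when both flags are supplied.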

Finally, the program outputs the current schedule to the terminal. We use the find_by_tag_name( ) method to locate elements with a given tag name at or below an element. In scalar context, it returns the first match; in list context, it returns every match. Two methods retrieve the contents of an element: attr_get_i( ) for attributes and as_text( ) for character data. Example 6-9 has the rest of the code.

Example 6-9. To-do list manager, third part

# output schedule
print "To-do list for " . $tree->attr_get_i( 'date' ) . ":\n";
print "\nDo right away:\n";
my $immediate = $tree->find_by_tag_name( 'immediate' );
my $count = 1;
foreach my $item ( $immediate->find_by_tag_name( 'item' )) {
    print $count++ . '. ' . $item->as_text . "\n";
}
print "\nDo whenever:\n";
my $longterm = $tree->find_by_tag_name( 'long-term' );
$count = 1;
foreach my $item ( $longterm->find_by_tag_name( 'item' )) {
    print $count++ . '. ' . $item->as_text . "\n";
}
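The difference between calling find_by_tag_name( ) in scalar and list context is easy to overlook, so here is a small sketch that exercises both on a hand-built tree. The tree contents are borrowed from our test data; the variable names are our own.

```perl
use strict;
use warnings;
use XML::Element;

# Build a small tree with the same shape as the to-do datafile.
my $root = XML::Element->new( 'todo-list', 'date' => '7/3' );
my $section = XML::Element->new( 'immediate' );
$root->push_content( $section );
for my $text ( 'take goldfish to the vet', 'get appendix removed' ) {
    my $item = XML::Element->new( 'item' );
    $item->push_content( $text );    # a plain string becomes a text node
    $section->push_content( $item );
}

# Scalar context: the first matching element, or undef if there is none.
my $first = $root->find_by_tag_name( 'item' );

# List context: every matching element at or below $root.
my @items = $root->find_by_tag_name( 'item' );

print "First item: " . $first->as_text . "\n";
print "Item count: " . scalar( @items ) . "\n";
print "Date: " . $root->attr_get_i( 'date' ) . "\n";
```

This is why Example 6-9 can assign the result to a scalar when it wants a single section, and loop over the result when it wants every item.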

To test the code, we created this datafile with several calls to the program (whitespace was added to make it more readable):

<todo-list date="7/3">
<immediate>
<item>take goldfish to the vet</item>
<item>get appendix removed</item>
</immediate>
<long-term>
<item>climb K-2</item>
<item>decipher alien messages</item>
</long-term>
</todo-list>

The output to the screen was this:

To-do list for 7/3:

Do right away:
1. take goldfish to the vet
2. get appendix removed

Do whenever:
1. climb K-2
2. decipher alien messages