Constructing Records

Problem

You want to create a record data type.

Solution

Use a reference to an anonymous hash.

Discussion

Suppose you wanted to create a data type that contained various data fields, akin to a C struct or a Pascal RECORD. The easiest way is to use an anonymous hash. For example, here's how to initialize and use that record:

$record = {
    NAME   => "Jason",
    EMPNO  => 132,
    TITLE  => "deputy peon",
    AGE    => 23,
    SALARY => 37_000,
    PALS   => [ "Norbert", "Rhys", "Phineas" ],
};

printf "I am %s, and my pals are %s.\n",
    $record->{NAME},
    join(", ", @{$record->{PALS}});
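As a minimal sketch of everyday operations on such a record, the following updates a field, adds one, tests for a field with exists, and reaches into the nested PALS array. The OFFICE field and its value are hypothetical, added only for illustration.

```perl
use strict;
use warnings;

# Same shape as the record above, trimmed to a few fields.
my $record = {
    NAME   => "Jason",
    EMPNO  => 132,
    SALARY => 37_000,
    PALS   => [ "Norbert", "Rhys", "Phineas" ],
};

# Update an existing field and add a new one.
$record->{SALARY} += 1_000;
$record->{OFFICE}  = "B-12";    # hypothetical field, not in the original record

# exists tests for a field without creating it.
my $has_title = exists $record->{TITLE} ? "yes" : "no";

# Nested data is reached by chaining subscripts.
my $first_pal = $record->{PALS}[0];
my $pal_count = scalar @{ $record->{PALS} };

print "Salary: $record->{SALARY}\n";
print "Has title? $has_title\n";
print "First of $pal_count pals: $first_pal\n";
print "Fields: ", join(", ", sort keys %$record), "\n";
```

Because a record is just a hash, fields can be added at any time; nothing enforces a fixed set of field names.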

Just having one of these records isn't much fun; you'd like to build larger structures. For example, you might want to create a %byname hash that you could initialize and use this way:

# store record
$byname{ $record->{NAME} } = $record;

# later on, look up by name
if ($rp = $byname{"Aron"}) {        # false if missing
    printf "Aron is employee %d.\n", $rp->{EMPNO};
}

# give jason a new pal
push @{$byname{"Jason"}->{PALS}}, "Theodore";
printf "Jason now has %d pals\n", scalar @{$byname{"Jason"}->{PALS}};

That makes %byname a hash of hashes, because its values are hash references. Looking up employees by name would be easy using such a structure. If we find a value in the hash, we store a reference to the record in a temporary variable, $rp, which we then use to get any field we want.
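One reason to test the top-level value before using it is autovivification: reaching through a missing key creates an empty record as a side effect. The sketch below contrasts the two styles; the variable names $before, $after, and $empno are illustrative only.

```perl
use strict;
use warnings;

my %byname;
$byname{Jason} = { NAME => "Jason", EMPNO => 132 };

# Safe: test the top-level value first, then use the reference.
if (my $rp = $byname{"Aron"}) {
    printf "Aron is employee %d.\n", $rp->{EMPNO};
}
my $before = exists $byname{"Aron"} ? 1 : 0;    # 0: nothing was created

# Risky: reaching through a missing key autovivifies an empty record.
my $empno = $byname{"Aron"}->{EMPNO};           # undef, but with a side effect
my $after  = exists $byname{"Aron"} ? 1 : 0;    # 1: $byname{"Aron"} is now {}

print "before=$before after=$after\n";          # prints "before=0 after=1"
```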

We can use our existing hash tools to manipulate %byname. For instance, we could use the each iterator to loop through it in an arbitrary order:

# Go through all records
while (($name, $record) = each %byname) {
    printf "%s is employee number %d\n", $name, $record->{EMPNO};
}
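When a predictable order matters, sorting the keys is a common alternative to each. This is a small sketch; the second record (Aron, employee 77) is made up for the example.

```perl
use strict;
use warnings;

my %byname = (
    Jason => { NAME => "Jason", EMPNO => 132 },
    Aron  => { NAME => "Aron",  EMPNO => 77 },    # hypothetical record
);

# Visit records in alphabetical order of name instead of hash order.
my @lines;
foreach my $name (sort keys %byname) {
    push @lines, sprintf "%s is employee number %d",
        $name, $byname{$name}{EMPNO};
}
print "$_\n" for @lines;
```

This builds a temporary list of all keys, which is fine for modest hashes; each avoids that cost when order is irrelevant.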

What about looking employees up by employee number? Just build and use another data structure, an array of hashes called @employees. If your employee numbers aren't consecutive (for instance, they jump from 1 to 159997), an array would be a bad choice, because nearly all of its slots would be empty. Instead, you should use a hash mapping employee number to record. For consecutive employee numbers, use an array:

# store record
$employees[ $record->{EMPNO} ] = $record;

# lookup by id
if ($rp = $employees[132]) {
    printf "employee number 132 is %s\n", $rp->{NAME};
}
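For the non-consecutive case, a hash keyed by employee number might look like the sketch below. The name %byempno and the second record are assumptions made for illustration; the array version is included only to show how many slots a sparse index forces Perl to allocate.

```perl
use strict;
use warnings;

my @records = (
    { NAME => "Jason", EMPNO => 132 },
    { NAME => "Rhys",  EMPNO => 159997 },    # hypothetical, far-apart number
);

# Hash view: one entry per employee, however sparse the numbers are.
my %byempno;
$byempno{ $_->{EMPNO} } = $_ for @records;

# Array view of the same data: the array grows to the highest index.
my @employees;
$employees[ $_->{EMPNO} ] = $_ for @records;

my $hash_entries = scalar keys %byempno;    # 2
my $array_slots  = scalar @employees;       # 159998

if (my $rp = $byempno{159997}) {
    printf "employee number 159997 is %s\n", $rp->{NAME};
}
print "hash entries: $hash_entries, array slots: $array_slots\n";
```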

With a data structure like this, updating a record in one place effectively updates it everywhere. For example, this gives Jason a 3.5% raise:

$byname{"Jason"}->{SALARY} *= 1.035;

This change is reflected in all views of these records. Remember that both $byname{"Jason"} and $employees[132] refer to the same record because the references they contain refer to the same anonymous hash.
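A short sketch makes the sharing visible: both views hold the same reference, so an update through one shows up through the other, while a shallow copy of the hash does not follow along. The %copy variable is introduced here only for contrast.

```perl
use strict;
use warnings;

my $record = { NAME => "Jason", EMPNO => 132, SALARY => 37_000 };

my (%byname, @employees);
$byname{ $record->{NAME} }     = $record;
$employees[ $record->{EMPNO} ] = $record;

# A shallow copy is a separate hash and will not see later updates.
my %copy = %$record;

# Give Jason a 3.5% raise through the %byname view.
$byname{"Jason"}->{SALARY} *= 1.035;

# Two references are numerically equal when they point at the same data.
my $same = $byname{"Jason"} == $employees[132] ? "yes" : "no";

printf "Same record? %s\n", $same;
printf "Salary via \@employees: %.2f\n", $employees[132]->{SALARY};
printf "Salary in the copy:    %.2f\n", $copy{SALARY};
```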

How would you select all records matching a particular criterion? This is what grep is for. Here's how to get everyone with "peon" in their titles or all the 27-year-olds:

@peons   = grep { $_->{TITLE} =~ /peon/i } @employees;
@tsevens = grep { $_->{AGE} == 27 } @employees;

Each element of @peons and @tsevens is itself a reference to a record, making them arrays of hashes, like @employees.
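Because @employees is indexed by employee number, it can contain undefined gaps between records. One way to guard against them, sketched here with a made-up second employee, is to test defined($_) before dereferencing.

```perl
use strict;
use warnings;

my @employees;
$employees[132] = { NAME => "Jason", TITLE => "deputy peon", AGE => 23 };
$employees[140] = { NAME => "Rhys",  TITLE => "manager",     AGE => 27 };  # hypothetical

# Skip the empty slots, then apply the real criterion.
my @peons   = grep { defined($_) && $_->{TITLE} =~ /peon/i } @employees;
my @tsevens = grep { defined($_) && $_->{AGE} == 27 }        @employees;

# The results are references to the same records, not copies.
my @peon_names   = map { $_->{NAME} } @peons;
my @tseven_names = map { $_->{NAME} } @tsevens;

print "Peons: @peon_names\n";
print "27-year-olds: @tseven_names\n";
```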

Here's how to print all records sorted in a particular order, say by age:

# Go through all records
foreach $rp (sort { $a->{AGE} <=> $b->{AGE} } values %byname) {
    printf "%s is age %d.\n", $rp->{NAME}, $rp->{AGE};
    # or with a hash slice on the reference
    printf "%s is employee number %d.\n", @$rp{'NAME','EMPNO'};
}
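The same comparison block extends naturally to a secondary key. As a sketch with invented sample data, this sorts by age and breaks ties by name.

```perl
use strict;
use warnings;

my @records = (
    { NAME => "Jason", AGE => 23 },
    { NAME => "Aron",  AGE => 23 },    # hypothetical records
    { NAME => "Rhys",  AGE => 19 },
);

# Numeric comparison on AGE first; fall back to string comparison on NAME.
my @sorted = sort {
    $a->{AGE} <=> $b->{AGE}
        or
    $a->{NAME} cmp $b->{NAME}
} @records;

my @order = map { "$_->{NAME} ($_->{AGE})" } @sorted;
print join(", ", @order), "\n";
```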

Rather than take time to sort them by age, you could just create another view of these records, @byage. Each element in this array, $byage[27] for instance, would be an array of all the records with that age. In effect, this is an array of arrays of hashes. You would build it this way:

# use @byage, an array of arrays of records
push @{ $byage[ $record->{AGE} ] }, $record;

Then you could find them all this way:

for ($age = 0; $age <= $#byage; $age++) {
    next unless $byage[$age];
    print "Age $age: ";
    foreach $rp (@{$byage[$age]}) {
        print $rp->{NAME}, " ";
    }
    print "\n";
}

A similar approach is to use map to avoid the foreach loop:

for ($age = 0; $age <= $#byage; $age++) {
    next unless $byage[$age];
    printf "Age %d: %s\n", $age,
        join(", ", map {$_->{NAME}} @{$byage[$age]});
}
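Putting the two steps together, a self-contained sketch might build @byage from a list of records and then report each age group. The sample records and the @report array are assumptions made for the example.

```perl
use strict;
use warnings;

my @records = (
    { NAME => "Jason", AGE => 23 },
    { NAME => "Aron",  AGE => 27 },    # hypothetical records
    { NAME => "Rhys",  AGE => 23 },
);

# Build the view: each slot is created on first use (autovivification).
my @byage;
push @{ $byage[ $_->{AGE} ] }, $_ for @records;

# Walk the ages in order, skipping slots that hold no records.
my @report;
for my $age (0 .. $#byage) {
    next unless $byage[$age];
    push @report, sprintf "Age %d: %s", $age,
        join(", ", map { $_->{NAME} } @{ $byage[$age] });
}
print "$_\n" for @report;
```

Within each age group the records keep the order in which they were pushed.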
