
Programming Perl

2.9. Hashes

As we said earlier, a hash is just a funny kind of array in which you look values up using key strings instead of numbers. A hash defines associations between keys and values, so hashes are often called associative arrays by people who are not lazy typists.

There really isn't any such thing as a hash literal in Perl, but if you assign an ordinary list to a hash, each pair of values in the list will be taken to indicate one key/value association:

%map = ('red',0xff0000,'green',0x00ff00,'blue',0x0000ff);

This has the same effect as:

%map = ();            # clear the hash first
$map{red}   = 0xff0000;
$map{green} = 0x00ff00;
$map{blue}  = 0x0000ff;
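
To check that claim for yourself, here is a minimal sketch that builds the hash both ways and compares the results. The names @pairs, %by_hand, and $same are ours, made up for the illustration, and the comparison is only one of several ways you might do it.

```perl
use strict;
use warnings;

# A flat list is read two elements at a time: key, value, key, value, ...
my @pairs = ('red', 0xff0000, 'green', 0x00ff00, 'blue', 0x0000ff);
my %map   = @pairs;

# The same hash built one element at a time.
my %by_hand;
$by_hand{red}   = 0xff0000;
$by_hand{green} = 0x00ff00;
$by_hand{blue}  = 0x0000ff;

# Same number of keys, and every key in %map has an equal value in %by_hand.
my $mismatches = grep { !exists $by_hand{$_} || $by_hand{$_} != $map{$_} } keys %map;
my $same = (keys(%map) == keys(%by_hand)) && ($mismatches == 0);

printf "same contents: %s\n", $same ? "yes" : "no";
printf "green = 0x%06x\n", $map{green};
```

If a key shows up twice in the list, the later value simply overwrites the earlier one, just as a second element assignment would.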

It is often more readable to use the => operator between key/value pairs. The => operator is just a synonym for a comma, but it's more visually distinctive and also quotes any bare identifiers to the left of it (just like the identifiers in braces above), which makes it convenient for several sorts of operation, including initializing hash variables:

%map = (
    red   => 0xff0000,
    green => 0x00ff00,
    blue  => 0x0000ff,
);

or initializing anonymous hash references to be used as records:

$rec = {
    NAME  => 'John Smith',
    RANK  => 'Captain',
    SERNO => '951413',
};

or using named parameters to invoke complicated functions:

$field = radio_group(
             NAME      => 'animals',
             VALUES    => ['camel', 'llama', 'ram', 'wolf'],
             DEFAULT   => 'camel',
             LINEBREAK => 'true',
             LABELS    => \%animal_names,
         );
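
Here is a rough sketch of what the receiving end of such a named-parameter call might look like. The describe_field function is hypothetical, a stand-in we invented rather than the real radio_group, and the LINEBREAK default of 'false' is likewise an assumption made for the example. The first two lines also show that => and a comma with explicit quotes produce the same list.

```perl
use strict;
use warnings;

# => quotes the bare identifier on its left, so these two lists are identical.
my @with_arrow = (NAME => 'animals', DEFAULT => 'camel');
my @with_comma = ('NAME', 'animals', 'DEFAULT', 'camel');

# describe_field() is a hypothetical stand-in for a function such as
# radio_group(); it only shows how named arguments are commonly unpacked.
sub describe_field {
    my %args = (
        LINEBREAK => 'false',   # assumed default, used if the caller omits it
        @_,                     # caller's pairs come later, so they win
    );
    my $values = join ', ', @{ $args{VALUES} || [] };
    return sprintf "%s: [%s] default=%s linebreak=%s",
        $args{NAME}, $values, $args{DEFAULT}, $args{LINEBREAK};
}

my $summary = describe_field(
    NAME    => 'animals',
    VALUES  => ['camel', 'llama', 'ram', 'wolf'],
    DEFAULT => 'camel',
);

print "@with_arrow\n";
print "$summary\n";
```

Putting the defaults first and @_ second works because a later pair in a list assignment replaces an earlier pair with the same key.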
But we're getting ahead of ourselves again. Back to hashes.

You can use a hash variable (%hash) in a list context, in which case it interpolates all its key/value pairs into the list. But just because the hash was initialized in a particular order doesn't mean that the values come back out in that order. Hashes are implemented internally using hash tables for speedy lookup, which means that the order in which entries are stored is dependent on the internal hash function used to calculate positions in the hash table, and not on anything interesting. So the entries come back in a seemingly random order. (The two elements of each key/value pair come out in the right order, of course.) For examples of how to arrange for an output ordering, see the keys function in Chapter 29, "Functions".
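
As a small sketch of one way to impose an order, you can sort the keys and look each value up in turn. The variable names @flat and @lines are ours, and sorting alphabetically is just one choice among many.

```perl
use strict;
use warnings;

my %map = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff);

# In list context the hash flattens to key/value pairs in no useful order.
my @flat = %map;          # six elements; the order of the pairs may vary

# Sort the keys to get a repeatable order, then look each value up.
my @lines = map { sprintf "%-5s => 0x%06x", $_, $map{$_} } sort keys %map;
print "$_\n" for @lines;
```

Whatever order @flat happens to come out in, each key is still immediately followed by its own value.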

When you evaluate a hash variable in a scalar context, it returns a true value only if the hash contains any key/value pairs whatsoever. If there are any key/value pairs at all, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much only useful to find out whether Perl's (compiled in) hashing algorithm is performing poorly on your data set. For example, you stick 10,000 things in a hash, but evaluating %HASH in scalar context reveals "1/8", which means only one out of eight buckets has been touched. Presumably that one bucket contains all 10,000 of your items. This isn't supposed to happen.

To find the number of keys in a hash, use the keys function in a scalar context: scalar(keys(%HASH)).
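
The two scalar-context idioms above can be sketched as follows. We only test the hash for truth here and never print the string it returns, because that string is not the same in every version of Perl: the bucket ratio described above is what older perls give you, whereas from Perl 5.26 on the value is simply the count of keys. The variable names are made up for the illustration.

```perl
use strict;
use warnings;

my %empty;
my %map = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff);

# A hash in Boolean (scalar) context is true only when it has entries.
my $empty_is_true = %empty ? 1 : 0;
my $map_is_true   = %map   ? 1 : 0;

# keys in scalar context yields the number of entries.
my $count = scalar(keys(%map));

print "empty: $empty_is_true, map: $map_is_true, count: $count\n";
```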

You can emulate a multidimensional hash by specifying more than one key within the braces, separated by commas. The listed keys are concatenated together, separated by the contents of $; ($SUBSCRIPT_SEPARATOR), which has a default value of chr(28). The resulting string is used as the actual key to the hash. These two lines do the same thing:

$people{ $state, $county } = $census_results;
$people{ join $; => $state, $county } = $census_results;

This feature was originally implemented to support a2p, the awk-to-Perl translator. These days, you'd usually just use a real (well, realer) multidimensional array as described in Chapter 9, "Data Structures". One place the old style is still useful is for hashes tied to DBM files (see DB_File in Chapter 32, "Standard Modules"), which don't support multidimensional keys.
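
To make the difference concrete, here is a sketch that stores one value under an emulated two-part key, recovers the parts again, and then stores the same value in a nested hash. The state and county names and the number 12345 are placeholders we made up, and %nested is just one plausible shape for the "realer" structure.

```perl
use strict;
use warnings;

my %people;
my ($state, $county) = ('Oregon', 'Multnomah');   # sample keys, made up here
my ($sep) = ($;);                                 # copy of $SUBSCRIPT_SEPARATOR

# Emulated two-dimensional key: the subscripts are joined with $;
$people{ $state, $county } = 12345;               # placeholder value

# The stored key is a single string; split on the separator to recover parts.
my ($key)  = keys %people;
my @parts  = split /\Q$sep\E/, $key;

# The nested alternative: a hash whose values are references to hashes.
my %nested;
$nested{$state}{$county} = 12345;

print join(' / ', @parts), "\n";
print "counties in $state: ", join(', ', sort keys %{ $nested{$state} }), "\n";
```

The emulated form breaks down if a key might itself contain the separator character, and it gives you no cheap way to list the counties of one state. The nested form does both, at the cost of not fitting into a flat DBM file.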

Don't confuse multidimensional hash emulations with slices. The one represents a scalar value, and the other represents a list value:

$hash{ $x, $y, $z }      # a single value
@hash{ $x, $y, $z }      # a slice of three values
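
A short sketch shows how differently the two behave on the same hash. The keys a through f and their values are arbitrary, chosen only for the illustration.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);
my ($x, $y, $z) = qw(a b c);

# Slice: three separate lookups, returning a list of three values.
my @values = @hash{ $x, $y, $z };

# Emulated multidimensional key: one lookup of the three subscripts
# joined with $;, which was never stored, so the result is undef.
my $single = $hash{ $x, $y, $z };

# Slices are also assignable, which stores three independent entries.
@hash{ qw(d e f) } = (4, 5, 6);

print "slice: @values\n";
print "single: ", (defined($single) ? $single : 'undef'), "\n";
print "keys: ", join(' ', sort keys %hash), "\n";
```

The only visible difference between the two forms is the leading $ or @, so it pays to look twice.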




Copyright © 2002 O'Reilly & Associates. All rights reserved.