Perl Cookbook

Chapter 11: References and Records

11.13. Storing Data Structures to Disk

Problem

You want to save your large, complex data structure to disk so you don't have to build it up each time your program runs.

Solution

Use the store and retrieve functions from the Storable module (originally distributed on CPAN, and bundled with Perl since version 5.8):

use Storable; 
store(\%hash, "filename");

# later on...  
$href = retrieve("filename");        # by ref
%hash = %{ retrieve("filename") };   # direct to hash

Discussion

The Storable module uses C functions and a binary format to walk Perl's internal data structures and lay out its data. It's more efficient than a pure Perl and string-based approach, but it's also more fragile.
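As a minimal sketch of a full round trip, the following stores a nested structure and reads it back. The %config data and the config.sto file name are hypothetical, and the file lives in a scratch directory created with File::Temp so the example cleans up after itself.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical sample data: a hash holding an array and another hash.
my %config = (
    hosts => [ "alpha", "beta" ],
    ports => { http => 80, ssh => 22 },
);

# Scratch directory that is removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/config.sto";

store(\%config, $file) or die "can't store to $file: $!";

# retrieve returns a reference to a freshly built copy of the structure.
my $copy = retrieve($file);

print "second host: $copy->{hosts}[1]\n";
print "ssh port:    $copy->{ports}{ssh}\n";
```

The copy is independent of the original: changing $copy->{ports}{ssh} afterward does not affect %config.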

The store and retrieve functions expect binary data using the machine's own byte-ordering. This means files created with these functions cannot be shared across different architectures. nstore does the same job store does, but keeps data in canonical (network) byte order, at a slight speed cost:

use Storable qw(nstore); 
nstore(\%hash, "filename"); 
# later ...  
$href = retrieve("filename");

No matter whether store or nstore was used, you call the same retrieve routine to restore the objects in memory. The producer must commit to portability, but the consumer doesn't have to. If the producer later switches between store and nstore, only the producer's code changes; the consumer sees the same interface and does not need to know or care which was used.
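To illustrate that the consumer side is the same either way, this sketch writes one structure twice, once with store and once with nstore, and reads both files back through the same retrieve call. The file names and the %data contents are made up for the example.

```perl
use strict;
use warnings;
use Storable qw(store nstore retrieve);
use File::Temp qw(tempdir);

my %data = (name => "example", sizes => [ 1, 2, 3 ]);

my $dir      = tempdir(CLEANUP => 1);
my $native   = "$dir/native.sto";     # machine byte order
my $portable = "$dir/portable.sto";   # network byte order

store(\%data,  $native)   or die "can't store to $native: $!";
nstore(\%data, $portable) or die "can't store to $portable: $!";

# The reading code is identical for both files.
my @seen;
for my $file ($native, $portable) {
    my $href = retrieve($file);
    push @seen, $href->{name};
    print "$file: name=$href->{name}, sizes=@{ $href->{sizes} }\n";
}
```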

The store and nstore functions don't lock any of the files they work on. If you're worried about concurrent access, open the file yourself, lock it using Recipe 7.11, and then use store_fd or its slower but machine-independent version nstore_fd.

Here's code to save a hash to a file, with locking. We don't open with the O_TRUNC flag because we have to wait to get the lock before we can clobber the file.

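use Storable qw(nstore_fd);
use Fcntl qw(:DEFAULT :flock);
# open without O_TRUNC: don't clobber the file until we hold the lock
sysopen(DF, "/tmp/datafile", O_RDWR|O_CREAT, 0666)
    or die "can't open /tmp/datafile: $!";
flock(DF, LOCK_EX)           or die "can't lock /tmp/datafile: $!";
nstore_fd(\%hash, *DF)
    or die "can't store hash\n";
# discard any leftover bytes from a longer, older copy
truncate(DF, tell(DF))       or die "can't truncate /tmp/datafile: $!";
close(DF)                    or die "can't close /tmp/datafile: $!";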

Here's code to restore that hash from a file, with locking:

use Storable qw(fd_retrieve);
use Fcntl qw(:DEFAULT :flock);
open(DF, "< /tmp/datafile")      or die "can't open /tmp/datafile: $!";
flock(DF, LOCK_SH)               or die "can't lock /tmp/datafile: $!";
$href = fd_retrieve(*DF);        # fd_retrieve reads from an open filehandle
close(DF);
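Storable as currently distributed also exports, on request, lock_store, lock_nstore, and lock_retrieve, which take an advisory flock on the file themselves (exclusive for writing, shared for reading). The sketch below assumes a Storable recent enough to provide them and a platform where flock works; the file name and data are hypothetical.

```perl
use strict;
use warnings;
use Storable qw(lock_nstore lock_retrieve);
use File::Temp qw(tempdir);

my %hash = (alpha => 1, beta => [ 2, 3 ]);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/datafile";

# Takes an exclusive lock, then writes in network byte order.
lock_nstore(\%hash, $file) or die "can't store to $file: $!";

# Takes a shared lock, then reads the structure back.
my $href = lock_retrieve($file);

print join(",", sort keys %$href), "\n";
```

Because the locks are advisory, they only protect against other programs that also lock the file; the manual approach shown earlier remains useful when you need to hold the lock across more than one operation.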

With care, you can pass large data objects efficiently between processes with this strategy, since a filehandle connected to a pipe or socket is still a byte stream, just like a plain file.
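Here is one way that might look with a pipe between a parent and a forked child. It assumes a system where fork is available; the %report structure is invented for the example. The child writes with nstore_fd and the parent reads with fd_retrieve.

```perl
use strict;
use warnings;
use Storable qw(nstore_fd fd_retrieve);

# Hypothetical data the child wants to hand to the parent.
my %report = (status => "ok", items => [ 1 .. 5 ]);

pipe(my $reader, my $writer) or die "can't create pipe: $!";

my $pid = fork();
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: serialize the structure into the pipe, then exit.
    close($reader);
    nstore_fd(\%report, $writer) or die "can't store to pipe\n";
    close($writer)               or die "can't close pipe: $!";
    exit(0);
}

# Parent: rebuild the structure from the read end of the pipe.
close($writer);
my $href = fd_retrieve($reader);
close($reader);
waitpid($pid, 0);

print "status: $href->{status}, items: ", scalar @{ $href->{items} }, "\n";
```

Each process closes the end of the pipe it does not use, so the reader sees end-of-file once the writer is done.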

Unlike the various DBM bindings, Storable does not restrict you to hashes (or arrays, with DB_File). Arbitrary data structures, including objects, can be stored to disk. The trade-off is that the whole structure must be read in or written out in its entirety.
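For instance, the top-level structure can be an array that mixes an object with plain references. In this sketch the Counter class is a hypothetical one defined inline; in a real program, the class must also be loaded by whatever code retrieves the data, since Storable saves the object's data and class name but not its methods.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

{
    # Hypothetical class, defined here only for illustration.
    package Counter;
    sub new  { my ($class, %args) = @_; return bless { count => $args{start} || 0 }, $class }
    sub bump { my $self = shift; return ++$self->{count} }
}

# An array at the top level, holding an object, a hash, and an array.
my @queue = (Counter->new(start => 41), { note => "plain hash" }, [ 1, 2, 3 ]);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/queue.sto";

store(\@queue, $file) or die "can't store to $file: $!";
my $restored = retrieve($file);

# The object comes back blessed into its original class.
my $class = ref $restored->[0];
my $count = $restored->[0]->bump;

print "class: $class, count after bump: $count\n";
```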

See Also

The section on "Remote Procedure Calls (RPC)" in Chapter 13 of Advanced Perl Programming; Recipe 11.14

