Programming Perl, Second Edition

Chapter 7
The Standard Perl Library
 

7.2 Library Modules

As mentioned earlier, the following library modules are arranged in alphabetical order, for easy reference.

AnyDBM_File--Provide Framework for Multiple DBMs

use AnyDBM_File;

This module is a "pure virtual base class"--it has nothing of its own. It's just there to inherit from the various DBM packages. By default it inherits from NDBM_File for compatibility with earlier versions of Perl. If it doesn't find NDBM_File, it looks for DB_File, GDBM_File, SDBM_File (which is always there--it comes with Perl), and finally ODBM_File.

Perl's dbmopen function (which now exists only for backward compatibility) actually just calls tie to bind a hash to AnyDBM_File. The effect is to bind the hash to one of the specific DBM classes that AnyDBM_File inherits from.

You can override the defaults and determine which class dbmopen will tie to. Do this by redefining @ISA:

@AnyDBM_File::ISA = qw(DB_File GDBM_File NDBM_File);

Note, however, that an explicit use takes priority over the ordering of @ISA, so that:

use GDBM_File;

will cause the next dbmopen to tie your hash to GDBM_File.
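
As a minimal sketch of the @ISA override, the following pins AnyDBM_File to SDBM_File (the one DBM that always comes with Perl) by setting @ISA in a BEGIN block before the module is loaded. The temporary directory and the database name demo are illustrative assumptions, not anything the module requires.

```perl
use strict;
use warnings;
use Fcntl;                          # for O_RDWR and O_CREAT
use File::Temp qw(tempdir);

# Restrict the candidate list before AnyDBM_File is compiled.
BEGIN { @AnyDBM_File::ISA = qw(SDBM_File) }
use AnyDBM_File;

my $dir  = tempdir(CLEANUP => 1);   # hypothetical scratch location
my $file = "$dir/demo";             # SDBM adds its own .dir/.pag suffixes

my %h;
tie(%h, 'AnyDBM_File', $file, O_RDWR|O_CREAT, 0644)
    or die "can't tie $file: $!";
$h{color} = 'blue';
untie %h;

# Reopen the database to confirm that the value reached the disk.
tie(%h, 'AnyDBM_File', $file, O_RDWR, 0644)
    or die "can't reopen $file: $!";
my $stored = $h{color};
untie %h;

my $backend = $AnyDBM_File::ISA[0]; # the class AnyDBM_File settled on
print "$backend stored color=$stored\n";
```

Setting @ISA at run-time, after use AnyDBM_File has already executed, would come too late for this approach, which is why the assignment sits in a BEGIN block.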

You can tie hash variables directly to the desired class yourself, without using dbmopen or AnyDBM_File. For example, by using multiple DBM implementations, you can copy a database from one format to another:

use Fcntl;         # for O_* values
use NDBM_File;
use DB_File;
tie %oldhash, "NDBM_File", $old_filename, O_RDWR;
tie %newhash, "DB_File",   $new_filename, O_RDWR|O_CREAT|O_EXCL, 0644;
while (($key,$val) = each %oldhash) {
    $newhash{$key} = $val;
}

DBM comparisons

Here's a table of the features that the different DBMish packages offer:

Feature                    ODBM       NDBM       SDBM   GDBM  BSD-DB
Linkage comes with Perl    Yes        Yes        Yes    Yes   Yes
Source bundled with Perl   No         No         Yes    No    No
Source redistributable     No         No         Yes    GPL   Yes
Often comes with UNIX      Yes        Yes[1]     No     No    No
Builds OK on UNIX          N/A        N/A        Yes    Yes   Yes[2]
Code size                  Varies[3]  Varies[3]  Small  Big   Big
Disk usage                 Varies[3]  Varies[3]  Small  Big   OK[4]
Speed                      Varies[3]  Varies[3]  Slow   OK    Fast
FTPable                    No         No         Yes    Yes   Yes
Easy to build              N/A        N/A        Yes    Yes   OK[5]
Block size limits          1k         4k         1k[6]  None  None
Byte-order independent     No         No         No     No    Yes
User-defined sort order    No         No         No     No    Yes
Wildcard lookups           No         No         No     No    Yes

Footnotes:

[1] On mixed-universe machines, may be in the BSD compatibility library, which is often shunned.

[2] Providing you have an ANSI C compiler.

[3] Depends on how much your vendor has "tweaked" it.

[4] Can be trimmed if you compile for one access method.

[5] See the DB_File library module. Requires symbolic links.

[6] By default, but can be redefined (at the expense of compatibility with older files).

See also

Relevant library modules include: DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File. Related manpages: dbm (3), ndbm (3). Tied variables are discussed extensively in Chapter 5, Packages, Modules, and Object Classes, and the dbmopen entry in Chapter 3, Functions, may also be helpful. You can pick up the unbundled modules from the src/misc/ directory on your nearest CPAN site. Here are the most popular ones, but note that their version numbers may have changed by the time you read this:

AutoLoader--Load Functions Only on Demand

package GoodStuff;
use Exporter;
use AutoLoader;
@ISA = qw(Exporter AutoLoader);

The AutoLoader module provides a standard mechanism for delayed loading of functions stored in separate files on disk. Each file has the same name as the function (plus a .al suffix), and lives in a directory named after the package (under the auto/ directory). For example, the function named GoodStuff::whatever() would be loaded from the file auto/GoodStuff/whatever.al.

A module using the AutoLoader should have the special marker __END__ prior to the actual subroutine declarations. All code before this marker is loaded and compiled when the module is used. At the marker, Perl stops parsing the file.

When a subroutine not yet in memory is called, the AUTOLOAD function attempts to locate it in a directory relative to the location of the module file itself. As an example, assume POSIX.pm is located in /usr/local/lib/perl5/POSIX.pm. The AutoLoader will look for the corresponding subroutines for this package in /usr/local/lib/perl5/auto/POSIX/*.al.

Lexicals declared with my in the main block of a package using the AutoLoader will not be visible to autoloaded functions, because the given lexical scope ends at the __END__ marker. A module using such variables as file-scoped globals will not work properly under the AutoLoader. Package globals must be used instead. When running under use strict, the use vars pragma may be employed in such situations as an alternative to explicitly qualifying all globals with the package name. Package variables predeclared with this pragma will be accessible to any autoloaded routines, but of course they will also be visible outside the module file.

The AutoLoader is a counterpart to the SelfLoader module. Both delay the loading of subroutines, but the SelfLoader accomplishes this by storing the subroutines right there in the module file rather than in separate files elsewhere. While this avoids the use of a hierarchy of disk files and the associated I/O for each routine loaded, the SelfLoader suffers a disadvantage in the one-time parsing of the lines after __DATA__, after which routines are cached. The SelfLoader can also handle multiple packages in a file.

AutoLoader, on the other hand, only reads code as it is requested, and in many cases should be faster. But it requires a mechanism like AutoSplit to be used to create the individual files.

On systems with restrictions on file name length, the file corresponding to a subroutine may have a shorter name than the routine itself. This can lead to conflicting filenames. The AutoSplit module will warn of these potential conflicts when used to split a module.

See the discussion of autoloading in Chapter 5, Packages, Modules, and Object Classes. Also see the AutoSplit module, a utility that automatically splits a module into a collection of files for autoloading.

AutoSplit--Split a Module for Autoloading

# from a program
use AutoSplit;
autosplit_lib_modules(@ARGV);
# or from the command line
perl -MAutoSplit -e 'autosplit(FILE, DIR, KEEP, CHECK, MODTIME)' ... 
# another interface
perl -MAutoSplit -e 'autosplit_lib_modules(@ARGV)' ...

This function splits up your program or module into files that the AutoLoader module can handle. It is mainly used to build autoloading Perl library modules, especially complex ones like POSIX. It is used by both the standard Perl libraries and by the MakeMaker module to automatically configure libraries for autoloading.

The autosplit() interface splits the specified FILE into a hierarchy rooted at the directory DIR. It creates directories as needed to reflect class hierarchy. It then creates the file autosplit.ix, which acts as both a forward declaration for all package routines and also as a timestamp for when the hierarchy was last updated.

The remaining three arguments to autosplit() govern other options to the autosplitter. If the third argument, KEEP, is false, then any pre-existing .al files in the autoload directory are removed if they are no longer part of the module (obsoleted functions). The fourth argument, CHECK, instructs autosplit() to check the module currently being split to ensure that it really does include a use specification for the AutoLoader module, and skips the module if AutoLoader is not detected. Lastly, the MODTIME argument specifies that autosplit() is to check the modification time of the module against that of the autosplit.ix file, and only split the module if it is newer.

Here's a typical use of AutoSplit by the MakeMaker utility via the command line:

perl -MAutoSplit -e 'autosplit($ARGV[0], $ARGV[1], 0, 1, 1)'

MakeMaker defines this as a make macro, and it is invoked with file and directory arguments. The autosplit() function splits the named file into the given directory and deletes obsolete .al files, after checking first that the module does use the AutoLoader and ensuring that the module isn't already split in its current form.

The autosplit_lib_modules() form is used in the building of Perl. It takes as input a list of files (modules) that are assumed to reside in a directory lib/ relative to the current directory. Each file is sent to the autosplitter one at a time, to be split into the directory lib/auto/.

In both usages of the autosplitter, only subroutines defined following the Perl special marker __END__ are split out into separate files. Routines placed prior to this marker are not autosplit, but are forced to load when the module is first required.
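
The round trip from splitting to autoloading can be sketched as follows. This is only an illustration under stated assumptions: the GoodStuff module body, its loaded() and greet() routines, and the temporary directory standing in for lib/ are all made up for the example. The autosplit() arguments follow the FILE, DIR, KEEP, CHECK, MODTIME order described above.

```perl
use strict;
use warnings;
use AutoSplit;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

$AutoSplit::Verbose = 0;            # suppress progress messages

my $lib = tempdir(CLEANUP => 1);    # stands in for a real lib/ directory
make_path("$lib/auto");

# A hypothetical module: loaded() is compiled when the module is required;
# greet() follows the end marker and is split into its own .al file.
open(my $fh, '>', "$lib/GoodStuff.pm") or die "can't write module: $!";
print $fh <<'EOM';
package GoodStuff;
use AutoLoader 'AUTOLOAD';
sub loaded { return "compiled up front" }
1;
__END__
sub greet { my $who = shift; return "hello, $who" }
EOM
close($fh) or die "can't close module: $!";

# FILE, DIR, KEEP, CHECK, MODTIME
autosplit("$lib/GoodStuff.pm", "$lib/auto", 0, 1, 1);

unshift @INC, $lib;
require GoodStuff;

my $was_defined = defined &GoodStuff::greet ? 1 : 0;  # body not loaded yet
my $greeting    = GoodStuff::greet("world");          # triggers AUTOLOAD
my $now_defined = defined &GoodStuff::greet ? 1 : 0;  # loaded from greet.al

print "before=$was_defined after=$now_defined: $greeting\n";
```

After the run, the split routine lives in auto/GoodStuff/greet.al beneath the temporary directory, next to the autosplit.ix index file.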

Currently, AutoSplit cannot handle multiple package specifications within one file.

AutoSplit will inform the user if it is necessary to create the top-level directory specified in the invocation. It's better if the script or installation process that invokes AutoSplit has created the full directory path ahead of time. This warning may indicate that the module is being split into an incorrect path.

AutoSplit will also warn the user of subroutines whose names cause potential naming conflicts on machines with severely limited (eight characters or less) filename length. Since the subroutine name is used as the filename, these warnings can aid in portability to such systems.

Warnings are issued and the file skipped if AutoSplit cannot locate either the __END__ marker or a specification of the form package Name;. AutoSplit will also complain if it can't create directories or files.

Benchmark--Check and Compare Running Times of Code

use Benchmark;
# timeit():  run $count iterations of the given Perl code, and time it
$t = timeit($count, 'CODE');  # $t is now a Benchmark object
# timestr():  convert Benchmark times to printable strings
print "$count loops of 'CODE' took:", timestr($t), "\n";
# timediff():  calculate the difference between two times
$t = timediff($t1, $t2);
# timethis():  run "code" $count times with timeit(); also, print out a
#     header saying "timethis $count: "
$t = timethis($count, "CODE");
# timethese():  run timethis() on multiple chunks of code
@t = timethese($count, {
    'Name1' => '...CODE1...',
    'Name2' => '...CODE2...',
});
# new method:  return the current time
$t0 = new Benchmark;
# ... your CODE here ...
$t1 = new Benchmark;
$td = timediff($t1, $t0);
print "the code took: ", timestr($td), "\n";
# debug method:  enable or disable debugging
Benchmark->debug (1);
$t = timeit(10, ' 5 ** $Global ');
Benchmark->debug(0);

The Benchmark module encapsulates a number of routines to help you figure out how long it takes to execute some code a given number of times within a loop.

For the timeit() routine, $count is the number of times to run the loop. CODE is a string containing the code to run. timeit() runs a null loop with $count iterations, and then runs the same loop with your code inserted. It reports the difference between the times of execution.

For timethese(), a loop of $count iterations is run on each code chunk separately, and the results are reported separately. The code to run is given as a hash with keys that are names and values that are code. timethese() is handy for quick tests to determine which way of doing something is faster. For example:

$ perl -MBenchmark -Minteger
timethese(1000000, { add => '$i += 2', inc => '$i++; $i++' });
__END__
Benchmark: timing 1000000 iterations of add, inc...
       add:  4 secs ( 4.52 usr  0.00 sys =  4.52 cpu)
       inc:  6 secs ( 5.32 usr  0.00 sys =  5.32 cpu)

The following routines are exported into your namespace if you use the Benchmark module:

timeit()
timethis()
timethese()
timediff()
timestr()

The following routines will be exported into your namespace if you specifically ask that they be imported:

clearcache()     # clear just the cache element indexed by $key
clearallcache()  # clear the entire cache
disablecache()   # do not use the cache
enablecache()    # resume caching

Notes

Code is executed in the caller's package.

The null loop times are cached, the key being the number of iterations. You can control caching with calls like these:

clearcache($key);
clearallcache();
disablecache();
enablecache();

Benchmark inherits only from the Exporter class.

The elapsed time is measured using time (2) and the granularity is therefore only one second. Times are given in seconds for the whole loop (not divided by the number of iterations). Short tests may produce negative figures because Perl can appear to take longer to execute the empty loop than a short test.

The user and system CPU time is measured to millisecond accuracy using times (3). In general, you should pay more attention to the CPU time than to elapsed time, especially if other processes are running on the system. Also, elapsed times of five seconds or more are needed for reasonable accuracy.

Because you pass in a string to be evaled instead of a closure to be executed, lexical variables declared with my outside of the eval are not visible.
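
A small sketch of that last point, assuming nothing beyond the routines listed above: the timed code string refers to package variables by their full names, since a my variable in the calling file would be invisible to it. The names $Global and $Result and the iteration counts are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use Benchmark;

# Package variables, so that the evaled code string can see them.
our ($Global, $Result);
$Global = 3;

my $count = 50_000;
my $t = timeit($count, '$main::Result = 5 ** $main::Global');
print "$count loops took: ", timestr($t), "\n";

# Timing a stretch of ordinary code with two Benchmark objects.
my $t0  = Benchmark->new;
my $sum = 0;
$sum += $_ for 1 .. 100_000;
my $t1  = Benchmark->new;
my $td  = timediff($t1, $t0);
print "the summing loop took: ", timestr($td), "\n";
```

The figures printed will vary from run to run and from machine to machine, so none are shown here.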

Carp--Generate Error Messages

use Carp;
carp "Be careful!";         # warn of errors (from perspective of caller)
croak "We're outta here!";  # die of errors (from perspective of caller)
confess "Bye!";             # die of errors with stack backtrace

carp() and croak() behave like warn and die, respectively, except that they report the error as occurring not at the line of code where they are invoked, but at a line in one of the calling routines. Suppose, for example, that you have a routine goo() containing an invocation of carp(). In that case--and assuming that the current stack shows no callers from a package other than the current one--carp() will report the error as occurring where goo() was called. If, on the other hand, callers from different packages are found on the stack, then the error is reported as occurring in the package immediately preceding the package in which the carp() invocation occurs. The intent is to let library modules act a little more like built-in functions, which always report errors where you call them from.

confess() is like die except that it prints out a stack backtrace. The error is reported at the line where confess() is invoked, not at a line in one of the calling routines.
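
The difference in reported location can be seen with a short sketch. The Widget package and its three routines are hypothetical; the warning is captured with a __WARN__ handler only so that it can be printed alongside the other two messages.

```perl
use strict;
use warnings;

package Widget;                     # hypothetical library package
use Carp;

sub goo  { carp "Be careful!"; return 1 }
sub boom { croak "We're outta here!" }
sub bye  { confess "Bye!" }

package main;

my @warned;
local $SIG{__WARN__} = sub { push @warned, $_[0] };

Widget::goo();            my $carp_line  = __LINE__;
eval { Widget::boom() };  my $croak_line = __LINE__;
my $croak_msg = $@;
eval { Widget::bye() };
my $confess_msg = $@;

print "carp:    $warned[0]";        # names the line in main that called goo()
print "croak:   $croak_msg";        # names the line in main that called boom()
print "confess: $confess_msg";      # names the confess line, plus a backtrace
```

Because the caller is in a different package (main), carp() and croak() both point at the calling line rather than at the line inside Widget.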

Config--Access Perl Configuration Information

use Config;
if ($Config{cc} =~ /gcc/) {
    print "built by gcc\n";
}
use Config qw(myconfig config_sh config_vars);
print myconfig();
print config_sh();
config_vars(qw(osname archname));

The Config module contains all the information that the Configure script had to figure out at Perl build time (over 450 values).[1]

[1] Perl was written in C, not because it's a portable language, but because it's a ubiquitous language. A bare C program is about as portable as Chuck Yeager on foot.

Shell variables from the config.sh file (written by Configure) are stored in a readonly hash, %Config, indexed by their names. Values set to the string "undef" in config.sh are returned as undefined values. The Perl exists function should be used to check whether a named variable exists.
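
As a brief sketch of those rules, the loop below separates variables that are unknown from those that are known but undefined, and the eval shows that an assignment to %Config is refused. The name no_such_config_variable is deliberately made up and is assumed not to exist in any real configuration.

```perl
use strict;
use warnings;
use Config;

# exists distinguishes "no such variable" from "known but undefined".
for my $name (qw(osname archname no_such_config_variable)) {
    if    (!exists  $Config{$name}) { print "$name: not a configuration variable\n" }
    elsif (!defined $Config{$name}) { print "$name: undefined\n" }
    else                            { print "$name: $Config{$name}\n" }
}

# %Config is read-only, so an assignment raises an exception.
my $stored = eval { $Config{osname} = 'hypothetical'; 1 };
my $error  = $@;
print "assignment refused: $error" unless $stored;
```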

myconfig

Returns a textual summary of the major Perl configuration values. See also the explanation of Perl's -V command-line switch in Chapter 6, Social Engineering.

config_sh

Returns the entire Perl configuration information in the form of the original config.sh shell variable assignment script.

config_vars(@names)

Prints to STDOUT the values of the named configuration variables. Each is printed on a separate line in the form:

name='value';

Names that are unknown are output as name='UNKNOWN';.

Here's a more sophisticated example using %Config:

use Config;
defined $Config{sig_name} or die "No sigs?";
$i = 0;    # signal numbers start at zero
foreach $name (split(' ', $Config{sig_name})) {
    $signo{$name} = $i;
    $signame[$i] = $name;
    $i++;
}
print "signal #17 = $signame[17]\n";
if ($signo{ALRM}) {
    print "SIGALRM is $signo{ALRM}\n";
}

Because configuration information is not stored within the Perl executable itself, it is possible (but unlikely) that the information might not relate to the actual Perl binary that is being used to access it. The Config module checks the Perl version number when loaded to try to prevent gross mismatches, but can't detect subsequent rebuilds of the same version.

Cwd--Get Pathname of Current Working Directory

use Cwd;
$dir = cwd();           # get current working directory safest way
$dir = getcwd();        # like getcwd(3) or getwd(3)
$dir = fastcwd();       # faster and more dangerous
use Cwd 'chdir';        # override chdir; keep PWD up to date
chdir "/tmp";
print $ENV{PWD};        # prints "/tmp"

cwd() gets the current working directory using the most natural and safest form for the current architecture. For most systems it is identical to `pwd` (but without the trailing line terminator).

getcwd() does the same thing by re-implementing getcwd (3) or getwd (3) in Perl.

fastcwd() looks the same as getcwd(), but runs faster. It's also more dangerous because you might chdir out of a directory that you can't chdir back into.

It is recommended that one of these functions be used in all code to ensure portability because the pwd program probably only exists on UNIX systems.

If you consistently override your chdir built-in function in all packages of your program, then your PWD environment variable will automatically be kept up to date. Otherwise, you shouldn't rely on it. (Which means you probably shouldn't rely on it.)
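
Here is a short sketch of the overridden chdir at work, using a temporary directory as a stand-in destination (an assumption of the example). Importing chdir by name replaces the built-in only in the importing package, which is why the override has to be repeated in every package that changes directory.

```perl
use strict;
use warnings;
use Cwd qw(cwd getcwd chdir);       # importing chdir overrides the built-in here
use File::Temp qw(tempdir);

my $start = cwd();
my $dir   = tempdir(CLEANUP => 1);  # hypothetical scratch directory

chdir $dir or die "can't chdir to $dir: $!";
my $pwd_env = $ENV{PWD};            # kept up to date by Cwd's chdir
my $actual  = getcwd();             # what the system reports

chdir $start or die "can't chdir back to $start: $!";

print "PWD said:    $pwd_env\n";
print "getcwd said: $actual\n";
```

The two paths may be spelled differently if the temporary directory is reached through a symbolic link, since PWD records the path as it was given.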

DB_File--Access to Berkeley DB

use DB_File;
# brackets in following code indicate optional arguments
[$X =] tie %hash,  "DB_File", $filename [, $flags, $mode, $DB_HASH];
[$X =] tie %hash,  "DB_File", $filename, $flags, $mode, $DB_BTREE;
[$X =] tie @array, "DB_File", $filename, $flags, $mode, $DB_RECNO;
$status = $X->del($key [, $flags]);
$status = $X->put($key, $value [, $flags]);
$status = $X->get($key, $value [, $flags]);
$status = $X->seq($key, $value [, $flags]);
$status = $X->sync([$flags]);
$status = $X->fd;
untie %hash;
untie @array;

DB_File is the most flexible of the DBM-style tie modules. It allows Perl programs to make use of the facilities provided by Berkeley DB (not included). If you intend to use this module you should really have a copy of the Berkeley DB manual page at hand. The interface defined here mirrors the Berkeley DB interface closely.

Berkeley DB is a C library that provides a consistent interface to a number of database formats. DB_File provides an interface to all three of the database (file) types currently supported by Berkeley DB.

The file types are:

DB_HASH

Allows arbitrary key/data pairs to be stored in data files. This is equivalent to the functionality provided by other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM. Remember, though, the files created using DB_HASH are not binary compatible with any of the other packages mentioned. A default hashing algorithm that will be adequate for most applications is built into Berkeley DB. If you do need to use your own hashing algorithm, it's possible to write your own and have DB_File use it instead.

DB_BTREE

The btree format allows arbitrary key/data pairs to be stored in a sorted, balanced binary tree. It is possible to provide a user-defined Perl routine to perform the comparison of keys. By default, though, the keys are stored in lexical order. This is useful for providing an ordering for your hash keys, and may be used on hashes that are only in memory and never go to disk.

DB_RECNO

DB_RECNO allows both fixed-length and variable-length flat text files to be manipulated using the same key/value pair interface as in DB_HASH and DB_BTREE. In this case the key will consist of a record (line) number.

How does DB_File interface to Berkeley DB?

DB_File gives access to Berkeley DB files using Perl's tie function. This allows DB_File to access Berkeley DB files using either a hash (for DB_HASH and DB_BTREE file types) or an ordinary array (for the DB_RECNO file type).

In addition to the tie interface, it is also possible to use most of the functions provided in the Berkeley DB API.

Differences from Berkeley DB

Berkeley DB uses the function dbopen (3) to open or create a database. Below is the C prototype for dbopen (3).

DB *
dbopen (const char *file, int flags, int mode,
        DBTYPE type, const void *openinfo)

The type parameter is an enumeration selecting one of the three interface methods, DB_HASH, DB_BTREE or DB_RECNO. Depending on which of these is actually chosen, the final parameter, openinfo, points to a data structure that allows tailoring of the specific interface method.

This interface is handled slightly differently in DB_File. Here is an equivalent call using DB_File.

tie %array, "DB_File", $filename, $flags, $mode, $DB_HASH;

The filename, flags, and mode parameters are the direct equivalent of their dbopen (3) counterparts. The final parameter $DB_HASH performs the function of both the type and openinfo parameters in dbopen (3).

In the example above $DB_HASH is actually a reference to a hash object. DB_File has three of these predefined references. Apart from $DB_HASH, there are also $DB_BTREE and $DB_RECNO.

The keys allowed in each of these predefined references are limited to the names used in the equivalent C structure. So, for example, the $DB_HASH reference will only allow keys called bsize, cachesize, ffactor, hash, lorder, and nelem.

To change one of these elements, just assign to it like this:

$DB_HASH->{cachesize} = 10_000;

Array offsets

In order to make RECNO more compatible with Perl, the array offset for all RECNO arrays begins at 0 rather than 1 as in Berkeley DB.

In-memory databases

Berkeley DB allows the creation of in-memory databases by using NULL (that is, a (char *)0 in C) in place of the filename. DB_File uses undef instead of NULL to provide this functionality.

use strict;
use Fcntl;
use DB_File;
my ($k, $v, %hash);
tie(%hash, 'DB_File', undef, O_RDWR|O_CREAT, 0, $DB_BTREE)
    or die "can't tie DB_File: $!";
foreach $k (keys %ENV) {
    $hash{$k} = $ENV{$k};
}
# this will now come out in sorted lexical order 
# without the overhead of sorting the keys
while  (($k,$v) = each %hash) {
    print "$k=$v\n";
}

Using the Berkeley DB interface directly

In addition to accessing Berkeley DB using a tied hash or array, you can also make direct use of most functions defined in the Berkeley DB documentation.

To do this you need to remember the return value from tie, or use the tied function to get at it yourself later on.

$db = tie %hash, "DB_File", "filename";

Once you have done that, you can access the Berkeley DB API functions directly.

$db->put($key, $value, R_NOOVERWRITE);  # invoke the DB "put" function

All the functions defined in the dbopen (3) manpage are available except for close() and dbopen() itself. The DB_File interface to these functions mirrors the way Berkeley DB works. In particular, note that all these functions return only a status value. Whenever a Berkeley DB function returns data via one of its parameters, the DB_File equivalent does exactly the same thing.

All the constants defined in the dbopen manpage are also available.

Below is a list of the functions available. (The comments only tell you the differences from the C version.)

get

The $flags parameter is optional. The value associated with the key you request is returned in the $value parameter.

put

As usual the flags parameter is optional. If you use either the R_IAFTER or R_IBEFORE flags, the $key parameter will be set to the record number of the inserted key/value pair.

del

The $flags parameter is optional.

fd

No differences encountered.

seq

The $flags parameter is optional. Both the $key and $value parameters will be set.

sync

The $flags parameter is optional.
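
The status-and-output-parameter convention can be sketched with an in-memory btree. This assumes DB_File and the underlying Berkeley DB library are installed; the keys and values are arbitrary. The return codes noted in the comments (0 for success, 1 for "key not found" or "key already exists") follow the dbopen (3) conventions.

```perl
use strict;
use warnings;
use Fcntl;
use DB_File;

my %h;
my $db = tie(%h, 'DB_File', undef, O_RDWR|O_CREAT, 0644, $DB_BTREE)
    or die "can't tie in-memory btree: $!";

# put() returns 0 on success, or 1 if R_NOOVERWRITE is set and the key exists.
my $first  = $db->put('apple', 'red',   R_NOOVERWRITE);
my $second = $db->put('apple', 'green', R_NOOVERWRITE);
$db->put('banana', 'yellow');

# get() hands the value back through its second argument.
my ($value, $ignored);
my $found   = $db->get('apple',  $value);    # 0: key present
my $missing = $db->get('cherry', $ignored);  # 1: key absent

# seq() sets both the key and the value on each step.
my ($k, $v) = ('', '');
my @pairs;
for (my $st = $db->seq($k, $v, R_FIRST);
     $st == 0;
     $st = $db->seq($k, $v, R_NEXT)) {
    push @pairs, "$k=$v";
}

undef $db;                          # drop the object before untying
untie %h;

print "put: $first then $second; get: $found ($value), missing: $missing\n";
print "in order: @pairs\n";
```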

Examples

Here are a few examples. First, using $DB_HASH:

use DB_File;
use Fcntl;
tie %h,  "DB_File", "hashed", O_RDWR|O_CREAT, 0644, $DB_HASH;
# Add a key/value pair to the file
$h{apple} = "orange";
# Check for value of a key
print "No, we have some bananas.\n" if $h{banana};
# Delete
delete $h{"apple"};
untie %h;

Here is an example using $DB_BTREE. Just to make life more interesting, the default comparison function is not used. Instead, a Perl subroutine, Compare(), does a case-insensitive comparison.

use DB_File;
use Fcntl;
sub Compare {
    my ($key1, $key2) = @_;
    "\L$key1" cmp "\L$key2";
}
$DB_BTREE->{compare} = \&Compare;   # pass the routine as a code reference
tie %h,  'DB_File', "tree", O_RDWR|O_CREAT, 0644, $DB_BTREE;
# Add a key/value pair to the file
$h{Wall}  = 'Larry';
$h{Smith} = 'John';
$h{mouse} = 'mickey';
$h{duck}  = 'donald';
# Delete
delete $h{duck};
# Cycle through the keys printing them in order.
# Note it is not necessary to sort the keys as
# the btree will have kept them in order automatically.
while ($key = each %h) { print "$key\n" }
untie %h;

The preceding code yields this output:

mouse
Smith
Wall

Next, an example using $DB_RECNO. You may access a regular textfile as an array of lines. But the first line of the text file is the zeroth element of the array, and so on. This provides a clean way to seek to a particular line in a text file.

my(@line, $number);
$number = 10;
use Fcntl;
use DB_File;
tie(@line, "DB_File", "/tmp/text", O_RDWR|O_CREAT, 0644, $DB_RECNO)
    or die "can't tie file: $!";
$line[$number - 1] = "this is a new line $number";

Here's an example of updating a file in place:

use Fcntl;
use DB_File;
tie(@file, 'DB_File', "/tmp/sample", O_RDWR, 0644, $DB_RECNO)
    or die "can't update /tmp/sample: $!";
print "line #3 was ", $file[2], "\n";
$file[2] = `date`;
untie @file;

Note that the tied array interface is incomplete, causing some operations on the resulting array to fail in strange ways. See the discussion of tied arrays in Chapter 5, Packages, Modules, and Object Classes. Some object methods are provided to avoid this. Here's an example of reading a file backward:

use DB_File;
use Fcntl;
$H = tie(@h, "DB_File", $file, O_RDWR, 0640, $DB_RECNO)
        or die "Cannot open file $file: $!\n";
# print the records in reverse order
for ($i = $H->length - 1; $i >= 0; --$i) { 
    print "$i: $h[$i]\n";
}
untie @h;

Locking databases

Concurrent access to a read-write database by several parties requires that each use some kind of locking. Here's an example that uses the fd() method to get the file descriptor, and then a careful open to give something Perl will flock for you. Run this repeatedly in the background to watch the locks granted in proper order. You have to call the sync() method to ensure that the writes make it to disk between accesses; otherwise the library would normally hold some of them in its own cache.

use Fcntl;
use DB_File;

use strict;

sub LOCK_SH { 1 }
sub LOCK_EX { 2 }
sub LOCK_NB { 4 }
sub LOCK_UN { 8 }

my($oldval, $fd, $db_obj, %db_hash, $value, $key);

$key   = shift || 'default';
$value = shift || 'magic';

$value .= " $$";

$db_obj = tie(%db_hash, 'DB_File', '/tmp/foo.db', O_CREAT|O_RDWR, 0644)
                    or die "dbcreat /tmp/foo.db $!";
$fd = $db_obj->fd;
print "$$: db fd is $fd\n";
open(DB_FH, "+<&=$fd") or die "fdopen $!";

unless (flock (DB_FH, LOCK_SH | LOCK_NB)) {
    print "$$: CONTENTION; can't read during write update!
                Waiting for read lock ($!) ....";
    unless (flock (DB_FH, LOCK_SH)) { die "flock: $!" }
}
print "$$: Read lock granted\n";

$oldval = $db_hash{$key};
print "$$: Old value was $oldval\n";
flock(DB_FH, LOCK_UN);

unless (flock (DB_FH, LOCK_EX | LOCK_NB)) {
    print "$$: CONTENTION; must have exclusive lock!
                Waiting for write lock ($!) ....";
    unless (flock (DB_FH, LOCK_EX)) { die "flock: $!" }
}

print "$$: Write lock granted\n";
$db_hash{$key} = $value;
sleep 10;

$db_obj->sync();                   # to flush
flock(DB_FH, LOCK_UN);
untie %db_hash;
undef $db_obj;                     # removing the last reference to the DB
                                   # closes it. Closing DB_FH is implicit.
print "$$: Updated db to $key=$value\n";

See also

Related manpages: dbopen (3), hash (3), recno (3), btree (3).

Berkeley DB is available from these locations:

Devel::SelfStubber--Generate Stubs for a SelfLoading Module

use Devel::SelfStubber;
$modulename = "Mystuff::Grok";  # no .pm suffix or slashes
$lib_dir = "";                  # defaults to current directory
Devel::SelfStubber->stub($modulename, $lib_dir);   # stubs only
# to generate the whole module with stubs inserted correctly
use Devel::SelfStubber;
$Devel::SelfStubber::JUST_STUBS = 0;
Devel::SelfStubber->stub($modulename, $lib_dir);

Devel::SelfStubber supports inherited, autoloaded methods by printing the stubs you need to put in your module before the __DATA__ token. A subroutine stub looks like this:

sub moo;

The stub ensures that if a method is called, it will get loaded. This is best explained using the following example:

Assume four classes, A, B, C, and D. A is the root class, B is a subclass of A, C is a subclass of B, and D is another subclass of A.

                    A
                   / \
                  B   D
                 /
                C

If D calls an autoloaded method moo() which is defined in class A, then the method is loaded into class A, and executed. If C then calls method moo(), and that method was reimplemented in class B, but set to be autoloaded, then the lookup mechanism never gets to the AUTOLOAD mechanism in B because it first finds the moo() method already loaded in A, and so erroneously uses that. If the method moo() had been stubbed in B, then the lookup mechanism would have found the stub, and correctly loaded and used the subroutine from B.

So, to get autoloading to work right with classes and subclasses, you need to make sure the stubs are loaded.

The SelfLoader can load stubs automatically at module initialization with:

SelfLoader->load_stubs();

But you may wish to avoid having the stub-loading overhead associated with your initialization.[2] In this case, you can put the subroutine stubs before the __DATA__ token. This can be done manually, by inserting the output of the first call to the stub() method above. But the module also allows automatic insertion of the stubs. By default the stub() method just prints the stubs, but you can set the global $Devel::SelfStubber::JUST_STUBS to 0 and it will print out the entire module with the stubs positioned correctly, as in the second call to stub().

[2] Although note that the load_stubs() method will be called sooner or later, at latest when the first subroutine is being autoloaded--which may be too late, if you're trying to moo().

At the very least, this module is useful for seeing what the SelfLoader thinks are stubs; in order to ensure that future versions of the SelfStubber remain in step with the SelfLoader, the SelfStubber actually uses the SelfLoader to determine which stubs are needed.
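
To see what the stubber produces, here is a minimal sketch that writes a throwaway SelfLoader module and captures the stubs printed for it. The module body, its moo() routine, and the temporary library directory are assumptions made for the example; only the stub() call itself comes from the synopsis above.

```perl
use strict;
use warnings;
use Devel::SelfStubber;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

my $lib = tempdir(CLEANUP => 1);    # hypothetical library directory
make_path("$lib/Mystuff");

# A throwaway module whose only routine lives after the data marker.
open(my $fh, '>', "$lib/Mystuff/Grok.pm") or die "can't write module: $!";
print $fh <<'EOM';
package Mystuff::Grok;
use SelfLoader;
1;
__DATA__
sub moo { return "moo from Grok" }
EOM
close($fh) or die "can't close module: $!";

# stub() prints its result, so point STDOUT at a string for the duration.
my $stubs = '';
{
    local *STDOUT;
    open(STDOUT, '>', \$stubs) or die "can't capture output: $!";
    Devel::SelfStubber->stub('Mystuff::Grok', $lib);
    close(STDOUT);
}

print "stubs to paste before the data marker:\n$stubs";
```

The captured text is what you would paste into the module by hand; setting $Devel::SelfStubber::JUST_STUBS to 0 before the call would capture the whole module instead.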

diagnostics--Force Verbose Warning Diagnostics

# As a pragma:
use diagnostics;
use diagnostics -verbose;
enable  diagnostics;
disable diagnostics;
# As a program:
$ perl program 2>diag.out
$ splain [-v] [-p] diag.out

The diagnostics module extends the terse diagnostics normally emitted by both the Perl compiler and the Perl interpreter, augmenting them with the more explicative and endearing descriptions found in Chapter 9, Diagnostic Messages. It affects the compilation phase of your program rather than merely the execution phase.

To use in your program as a pragma, merely say:

use diagnostics;

at the start (or near the start) of your program. (Note that this enables Perl's -w flag.) Your whole compilation will then be subject to the enhanced diagnostics. These are still issued to STDERR.

Due to the interaction between run-time and compile-time issues, and because it's probably not a very good idea anyway, you may not use:

no diagnostics

to turn diagnostics off at compile time. However, you can turn diagnostics on or off at run-time by invoking diagnostics::enable() and diagnostics::disable(), respectively.

The -verbose argument first prints out the perldiag (1) manpage introduction before any other diagnostics. The $diagnostics::PRETTY variable, if set in a BEGIN block, results in nicer escape sequences for pagers:

BEGIN { $diagnostics::PRETTY = 1 }

The standalone program

While apparently a whole other program, splain is actually nothing more than a link to the (executable) diagnostics.pm module. It acts upon the standard error output of a Perl program, which you may have treasured up in a file, or piped directly to splain.

The -v flag has the same effect as:

use diagnostics -verbose

The -p flag sets $diagnostics::PRETTY to true. Since you're post-processing with splain, there's no sense in being able to enable() or disable() diagnostics.

Output from splain (unlike the pragma) is directed to STDOUT.

Examples

The following file is certain to trigger a few errors at both run-time and compile-time:

use diagnostics;
print NOWHERE "nothing\n";
print STDERR "\n\tThis message should be unadorned.\n";
warn "\tThis is a user warning";
print "\nDIAGNOSTIC TESTER: Please enter a <CR> here: ";
my $a, $b = scalar <STDIN>;
print "\n";
print $x/$y;

If you prefer to run your program first and look at its problems afterward, do this while talking to a Bourne-like shell:

perl -w test.pl 2>test.out
./splain < test.out

If you don't want to modify your source code, but still want on-the-fly warnings, do this:

perl -w -Mdiagnostics test.pl

If you want to control warnings on the fly, do something like this. (Make sure the use comes first, or you won't be able to get at the enable() or disable() methods.)

use diagnostics; # checks entire compilation phase
print "\ntime for 1st bogus diags: SQUAWKINGS\n";
print BOGUS1 'nada';
print "done with 1st bogus\n";
disable diagnostics; # only turns off run-time warnings
print "\ntime for 2nd bogus: (squelched)\n";
print BOGUS2 'nada';
print "done with 2nd bogus\n";
enable diagnostics; # turns back on run-time warnings
print "\ntime for 3rd bogus: SQUAWKINGS\n";
print BOGUS3 'nada';
print "done with 3rd bogus\n";
disable diagnostics;
print "\ntime for 4th bogus: (squelched)\n";
print BOGUS4 'nada';
print "done with 4th bogus\n";

DirHandle--Supply Object Methods for Directory Handles

use DirHandle;
my $d = new DirHandle ".";   # open the current directory
if (defined $d) {
    while (defined($_ = $d->read)) { something($_); }
    $d->rewind;
    while (defined($_ = $d->read)) { something_else($_); }
}

DirHandle provides an alternative interface to Perl's opendir, closedir, readdir, and rewinddir functions.

The only objective benefit to using DirHandle is that it avoids name-space pollution by creating anonymous globs to hold directory handles. Well, and it also closes the DirHandle automatically when the last reference goes out of scope. But since most people only keep a directory handle open long enough to slurp in all the filenames, this is of dubious value. But hey, it's object-oriented.
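
As a rough illustration of the automatic close, here is a minimal sketch that slurps a directory's filenames inside a subroutine and lets the handle fall out of scope. The plain_entries() name is made up for this example; it relies on read() returning all remaining entries when called in list context, just as readdir does.

```perl
use strict;
use warnings;
use DirHandle;

# Return the sorted names found in $dir, skipping "." and "..".
# (plain_entries is a hypothetical helper, not part of DirHandle.)
sub plain_entries {
    my ($dir) = @_;
    my $d = DirHandle->new($dir) or return;   # new() yields undef if the open fails
    my @names = sort grep { $_ ne '.' && $_ ne '..' } $d->read;
    return @names;    # $d goes out of scope here, which closes the directory
}

print "$_\n" for plain_entries(".");
```

No explicit close call is needed, which is about all the object interface buys you here.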

DynaLoader--Automatic Dynamic Loading of Perl Modules

package YourModule;
require DynaLoader;
@ISA = qw(... DynaLoader ...);
bootstrap YourModule;

This module defines the standard Perl interface to the dynamic linking mechanisms available on many platforms. A common theme throughout the module system is that using a module should be easy, even if the module itself (or the installation of the module) is more complicated as a result. This applies particularly to the DynaLoader. To use it in your own module, all you need are the incantations listed above in the synopsis. This will work whether YourModule is statically or dynamically linked into Perl. (This is a Configure option for each module.)

If YourModule is statically linked into Perl, the bootstrap() call invokes YourModule's own bootstrap routine directly. If not, YourModule inherits the bootstrap() method from DynaLoader, which does everything necessary to load your module and then calls YourModule's bootstrap() method for you, as if it had been there all the time and you had called it yourself. Piece of cake, of the have-it-and-eat-it-too variety.

The rest of this description talks about the DynaLoader from the viewpoint of someone who wants to extend the DynaLoader module to a new architecture. The Configure process selects which kind of dynamic loading to use by choosing to link in one of several C implementations, which must be linked into perl statically. (This is unlike other C extensions, which provide a single implementation, which may be linked in either statically or dynamically.)

The DynaLoader is designed to be a very simple, high-level interface that is sufficiently general to cover the requirements of SunOS, HP-UX, NeXT, Linux, VMS, Win-32, and other platforms. By itself, though, DynaLoader is practically useless for accessing non-Perl libraries because it provides almost no Perl-to-C "glue". There is, for example, no mechanism for calling a C library function or supplying its arguments in any sort of portable form. This job is delegated to the other extension modules that you may load in by using DynaLoader.

Internal interface summary

Variables:
    @dl_library_path
    @dl_resolve_using
    @dl_require_symbols
    $dl_debug

Subroutines:
    bootstrap($modulename);
    @filepaths = dl_findfile(@names);
    $filepath = dl_expandspec($spec);
    $libref  = dl_load_file($filename);
    $symref  = dl_find_symbol($libref, $symbol);
    @symbols = dl_undef_symbols();
    dl_install_xsub($name, $symref [, $filename]);
    $message = dl_error;

The bootstrap() and dl_findfile() routines are standard across all platforms, and so are defined in DynaLoader.pm. The rest of the functions are supplied by the particular .xs file that supplies the implementation for the platform. (You can examine the existing implementations in the ext/DynaLoader/*.xs files in the Perl source directory. You should also read DynaLoader.pm, of course.) These implementations may also tweak the default values of the variables listed below.

@dl_library_path

The default list of directories in which dl_findfile() will search for libraries. Directories are searched in the order they are given in this array variable, beginning with subscript 0. @dl_library_path is initialized to hold the list of "normal" directories (/usr/lib and so on) determined by the Perl installation script, Configure, and given by $Config{'libpth'}. This is to ensure portability across a wide range of platforms. @dl_library_path should also be initialized with any other directories that can be determined from the environment at run-time (such as LD_LIBRARY_PATH for SunOS). After initialization, @dl_library_path can be manipulated by an application using push and unshift before calling dl_findfile(). unshift can be used to add directories to the front of the search order either to save search time or to override standard libraries with the same name. The load function that dl_load_file() calls might require an absolute pathname. The dl_findfile() function and @dl_library_path can be used to search for and return the absolute pathname for the library/object that you wish to load.

@dl_resolve_using

A list of additional libraries or other shared objects that can be used to resolve any undefined symbols that might be generated by a later call to dl_load_file(). This is only required on some platforms that do not handle dependent libraries automatically. For example, the Socket extension shared library (auto/Socket/Socket.so) contains references to many socket functions that need to be resolved when it's loaded. Most platforms will automatically know where to find the "dependent" library (for example, /usr/lib/libsocket.so). A few platforms need to be told the location of the dependent library explicitly. Use @dl_resolve_using for this. Example:

@dl_resolve_using = dl_findfile('-lsocket');

@dl_require_symbols

A list of one or more symbol names that are in the library/object file to be dynamically loaded. This is only required on some platforms.

dl_error

$message = dl_error();

Error message text from the last failed DynaLoader function. Note that, similar to errno in UNIX, a successful function call does not reset this message. Implementations should detect the error as soon as it occurs in any of the other functions and save the corresponding message for later retrieval. This will avoid problems on some platforms (such as SunOS) where the error message is very temporary (see, for example, dlerror (3)).

$dl_debug

Internal debugging messages are enabled when $dl_debug is set true. Currently, setting $dl_debug only affects the Perl side of the DynaLoader. These messages should help an application developer to resolve any DynaLoader usage problems. $dl_debug is set to $ENV{'PERL_DL_DEBUG'} if defined. For the DynaLoader developer and porter there is a similar debugging variable added to the C code (see dlutils.c) and enabled if Perl was built with the -DDEBUGGING flag. This can also be set via the PERL_DL_DEBUG environment variable. Set to 1 for minimal information or higher for more.

dl_findfile

@filepaths = dl_findfile(@names)

Determines the full paths (including file suffix) of one or more loadable files, given their generic names and optionally one or more directories. Searches directories in @dl_library_path by default and returns an empty list if no files were found. Names can be specified in a variety of platform-independent forms. Any names in the form -lname are converted into libname.*, where .* is an appropriate suffix for the platform. If a name does not already have a suitable prefix or suffix, then the corresponding file will be sought by trying prefix and suffix combinations appropriate to the platform: $name.o, lib$name.* and $name. If any directories are included in @names, they are searched before @dl_library_path. Directories may be specified as -Ldir. Any other names are treated as filenames to be searched for. Using arguments of the form -Ldir and -lname is recommended. Example:

@dl_resolve_using = dl_findfile(qw(-L/usr/5lib -lposix));
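
To tie @dl_library_path and dl_findfile() together, here is a small sketch that pushes a directory onto the front of the search order and then asks for the math library. The /opt/local/lib directory is purely an assumption for illustration, and whether anything is found depends entirely on your platform and on how Perl was configured.

```perl
use strict;
use warnings;
require DynaLoader;

# Put a (hypothetical) private directory at the front of the search order,
# so that libraries there override standard ones with the same name.
unshift @DynaLoader::dl_library_path, '/opt/local/lib';

# Ask for libm using the recommended -lname form.
my @found = DynaLoader::dl_findfile('-lm');
print @found ? "found: @found\n" : "libm not found in the search path\n";
```

An empty result is not an error in itself; it only means that none of the searched directories held a file with a suitable name.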

dl_expandspec

$filepath = dl_expandspec($spec)

Some unusual systems such as VMS require special filename handling in order to deal with symbolic names for files (that is, VMS's Logical Names). To support these systems, a dl_expandspec() function can either be implemented in the dl_*.xs file, or code can be added to the autoloadable dl_expandspec() function in DynaLoader.pm.

dl_load_file

$libref = dl_load_file($filename)

Dynamically load $filename, which must be the path to a shared object or library. An opaque "library reference" is returned as a handle for the loaded object. dl_load_file() returns the undefined value on error. (On systems that provide a handle for the loaded object such as SunOS and HP-UX, the returned handle will be $libref. On other systems $libref will typically be $filename or a pointer to a buffer containing $filename. The application should not examine or alter $libref in any way.) Below are some of the functions that do the real work. Such functions should use the current values of @dl_require_symbols and @dl_resolve_using if required.

SunOS:  dlopen($filename)
HP-UX:  shl_load($filename)
Linux:  dld_create_reference(@dl_require_symbols); dld_link($filename)
NeXT:   rld_load($filename, @dl_resolve_using)
VMS:    lib$find_image_symbol($filename, $dl_require_symbols[0])
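
The error convention can be seen with a load that is meant to fail. This is only a sketch: the pathname is invented so that the load will not succeed, and it assumes a Perl built with dynamic loading support.

```perl
use strict;
use warnings;
require DynaLoader;

# A made-up pathname that should not exist on any sane system.
my $file = '/no/such/dir/libbogus.so';

my $libref = DynaLoader::dl_load_file($file);
if (defined $libref) {
    print "loaded $file (surprisingly)\n";
}
else {
    # dl_error() returns the text saved by the last failed DynaLoader call.
    print "load failed: ", DynaLoader::dl_error(), "\n";
}
```

The exact wording of the message comes from the underlying system call, so it varies from platform to platform.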

dl_find_symbol

$symref = dl_find_symbol($libref, $symbol)

Returns the address of the symbol $symbol, or the undefined value if not found. If the target system has separate functions to search for symbols of different types, then dl_find_symbol() should search for function symbols first and then search for other types. The exact manner in which the address is returned in $symref is not currently defined. The only initial requirement is that $symref can be passed to, and understood by, dl_install_xsub(). Here are some current implementations:

SunOS:  dlsym($libref, $symbol)
HP-UX:  shl_findsym($libref, $symbol)
Linux:  dld_get_func($symbol) and/or dld_get_symbol($symbol)
NeXT:   rld_lookup("_$symbol")
VMS:    lib$find_image_symbol($libref, $symbol)

dl_undef_symbols

@symbols = dl_undef_symbols()

Returns a list of symbol names which remain undefined after dl_load_file(). It returns () if these names are not known. Don't worry if your platform does not provide a mechanism for this. Most platforms do not need it and hence do not provide it; they just return an empty list.

dl_install_xsub

dl_install_xsub($perl_name, $symref [, $filename])

Creates a new Perl external subroutine named $perl_name using $symref as a pointer to the function that implements the routine. This is simply a direct call to newXSUB(). It returns a reference to the installed function. The $filename parameter is used by Perl to identify the source file for the function if required by die, caller, or the debugger. If $filename is not defined, then DynaLoader will be used.

bootstrap()

bootstrap($module);

This is the normal entry point for automatic dynamic loading in Perl.

It performs the following actions:

  • Locates an auto/$module directory by searching @INC

  • Uses dl_findfile() to determine the filename to load

  • Sets @dl_require_symbols to ("boot_$module")

  • Executes an auto/$module/$module.bs file if it exists (typically used to add to @dl_resolve_using any files that are required to load the module on the current platform)

  • Calls dl_load_file() to load the file

  • Calls dl_undef_symbols() and warns if any symbols are undefined

  • Calls dl_find_symbol() for "boot_$module"

  • Calls dl_install_xsub() to install it as ${module}::bootstrap

  • Calls &{"${module}::bootstrap"} to bootstrap the module (actually it uses the function reference returned by dl_install_xsub() for speed)
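
The first two steps of that list can be approximated in a few lines. This is a simplified sketch rather than what DynaLoader.pm actually does: locate_loadable() is a hypothetical helper, it only checks for a file named after the last component of the module name with the $Config{dlext} suffix, and it loads nothing.

```perl
use strict;
use warnings;
use Config;

# Search @INC for auto/<module path>/<name>.<dlext>.
# (locate_loadable is a hypothetical helper used only for illustration.)
sub locate_loadable {
    my ($module) = @_;
    my @parts = split /::/, $module;
    my $fname = $parts[-1];
    my $path  = join '/', @parts;
    for my $inc (grep { !ref } @INC) {      # skip any code refs in @INC
        my $dir = "$inc/auto/$path";
        next unless -d $dir;
        my $try = "$dir/$fname.$Config{dlext}";
        return $try if -f $try;
    }
    return;
}

my $file = locate_loadable('POSIX');
print defined $file
    ? "bootstrap would load $file\n"
    : "no loadable file found for POSIX (perhaps it is statically linked)\n";
```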

English--Use English or awk Names for Punctuation Variables

use English;
...
if ($ERRNO =~ /denied/) { ... }

This module provides aliases for the built-in "punctuation" variables. Variables with side effects that get triggered merely by accessing them (like $0) will still have the same effects under the aliases.

For those variables that have an awk (1) version, both long and short English alternatives are provided. For example, the $/ variable can be referred to either as $RS or as $INPUT_RECORD_SEPARATOR if you are using the English module.

Here is the list of variables along with their English alternatives:

Perl  English                        Perl  English
@_    @ARG                           $?    $CHILD_ERROR
$_    $ARG                           $!    $OS_ERROR
$&    $MATCH                         $!    $ERRNO
$`    $PREMATCH                      $@    $EVAL_ERROR
$'    $POSTMATCH                     $$    $PROCESS_ID
$+    $LAST_PAREN_MATCH              $$    $PID
$.    $INPUT_LINE_NUMBER             $<    $REAL_USER_ID
$.    $NR                            $<    $UID
$/    $INPUT_RECORD_SEPARATOR        $>    $EFFECTIVE_USER_ID
$/    $RS                            $>    $EUID
$|    $OUTPUT_AUTOFLUSH              $(    $REAL_GROUP_ID
$,    $OUTPUT_FIELD_SEPARATOR        $(    $GID
$,    $OFS                           $)    $EFFECTIVE_GROUP_ID
$\    $OUTPUT_RECORD_SEPARATOR       $)    $EGID
$\    $ORS                           $0    $PROGRAM_NAME
$"    $LIST_SEPARATOR                $]    $PERL_VERSION
$;    $SUBSCRIPT_SEPARATOR           $^A   $ACCUMULATOR
$;    $SUBSEP                        $^D   $DEBUGGING
$%    $FORMAT_PAGE_NUMBER            $^F   $SYSTEM_FD_MAX
$=    $FORMAT_LINES_PER_PAGE         $^I   $INPLACE_EDIT
$-    $FORMAT_LINES_LEFT             $^P   $PERLDB
$~    $FORMAT_NAME                   $^T   $BASETIME
$^    $FORMAT_TOP_NAME               $^W   $WARNING
$:    $FORMAT_LINE_BREAK_CHARACTERS  $^X   $EXECUTABLE_NAME
$^L   $FORMAT_FORMFEED               $^O   $OSNAME
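
Here is a short sketch showing that the long names, the awk-style names, and the punctuation variables are all the same variables underneath. The sample string is made up; everything else comes from the English module.

```perl
use strict;
use warnings;
use English;

my $text = "alpha,beta,gamma";
{
    local $OUTPUT_FIELD_SEPARATOR  = "-";    # the same variable as $, and $OFS
    local $OUTPUT_RECORD_SEPARATOR = "\n";   # the same variable as $\ and $ORS
    print split /,/, $text;                  # fields joined by "-", newline appended
}

# Long and short aliases both refer to $$.
print "long name: $PROCESS_ID, short name: $PID, punctuation: $$\n";
```

Because these are aliases rather than copies, local on the English name restores the punctuation variable as well when the block exits.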

Env--Import Environment Variables

use Env;                     # import all possible variables
use Env qw(PATH HOME TERM);  # import only specified variables

Perl maintains environment variables in a pseudo-associative array named %ENV. Since this access method is sometimes inconvenient, the Env module allows environment variables to be treated as simple variables.

The Env::import() routine ties environment variables to global Perl variables with the same names. By default it ties suitable, existing environment variables (that is, variables yielded by keys %ENV). An environment variable is considered suitable if its name begins with an alphabetic character, and if it consists of nothing but alphanumeric characters plus underscore.

If you supply arguments when invoking use Env, they are taken to be a list of environment variables to tie. It's OK if the variables don't yet exist.

After an environment variable is tied, you can use it like a normal variable. You may access its value:

@path = split(/:/, $PATH);

or modify it any way you like:

$PATH .= ":.";

To remove a tied environment variable from the environment, make it the undefined value:

undef $PATH;

Note that the corresponding operation performed directly against %ENV is not undef, but delete:

delete $ENV{PATH};
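
Putting those pieces together, here is a sketch that uses an environment variable that probably doesn't exist yet. DEMO_COLOR is an invented name chosen so the example won't disturb your real environment.

```perl
use strict;
use warnings;
use Env qw(DEMO_COLOR);    # hypothetical variable; it need not exist beforehand

$DEMO_COLOR = "teal";                          # stores into $ENV{DEMO_COLOR}
print "via %ENV: $ENV{DEMO_COLOR}\n";

$DEMO_COLOR .= "-ish";                         # modify it like any scalar
print "after append: $ENV{DEMO_COLOR}\n";

undef $DEMO_COLOR;                             # remove it from the environment
print exists $ENV{DEMO_COLOR} ? "still present\n" : "gone from %ENV\n";
```

Since the tie is what connects $DEMO_COLOR to %ENV, any child process started afterward sees whatever the tied variable last stored.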

Exporter--Default Import Method for Modules

# in module YourModule.pm:
package YourModule;
use Exporter ();
@ISA = qw(Exporter);
@EXPORT = qw(...);              # Symbols to export by default.
@EXPORT_OK = qw(...);           # Symbols to export on request.
%EXPORT_TAGS = (tag => [...]);  # Define names for sets of symbols.
# in other files that wish to use YourModule:
use YourModule;                 # Import default symbols into my package.
use YourModule qw(...);         # Import listed symbols into my package.
use YourModule ();              # Do not import any symbols!

Any module may define a class method called import(). Perl automatically calls a module's import() method when processing the use statement for the module. The module itself doesn't have to define the import() method, though. The Exporter module implements a default import() method that many modules choose to inherit instead. The Exporter module supplies the customary import semantics, and any other import() methods will tend to deviate from the normal import semantics in various (hopefully documented) ways. Now we'll talk about the normal import semantics.
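
To watch Perl call import() for you, consider this sketch of a module that defines its own import() method and merely records its arguments. Camel::Noisy is an invented module defined inline; the %INC assignment is just a trick to keep use from looking for a file on disk.

```perl
use strict;
use warnings;

BEGIN {
    package Camel::Noisy;                 # hypothetical module, defined inline
    our @CALLS;
    sub import {
        my ($class, @import_list) = @_;   # class name first, then the import list
        push @CALLS, $class . "->import(" . join(", ", @import_list) . ")";
    }
    $INC{'Camel/Noisy.pm'} = __FILE__;    # tell "use" the module is already loaded
}

use Camel::Noisy;                  # import() called with an empty import list
use Camel::Noisy qw(hump water);   # import() called with ("hump", "water")
use Camel::Noisy ();               # import() not called at all

print "$_\n" for @Camel::Noisy::CALLS;
```

Only two calls are recorded, since an explicit empty list tells use to skip the import() call entirely.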

Specialized import lists

Ignoring the class name, which is always the first argument to a class method, the arguments that are passed into the import() method are known as an import list. Usually the import list is nothing more than a list of subroutine or variable names, but occasionally you may want to get fancy. If the first entry in an import list begins with !, :, or /, the list is treated as a series of specifications that either add to or delete from the list of names to import. They are processed left to right. Specifications are in the form:

Symbol        Meaning
[!]name       This name only
[!]:DEFAULT   All names in @EXPORT
[!]:tag       All names in $EXPORT_TAGS{tag} anonymous list
[!]/pattern/  All names in @EXPORT and @EXPORT_OK that match pattern

A leading ! indicates that matching names should be deleted from the list of names to import. If the first specification is a deletion, it is treated as though preceded by :DEFAULT. If you just want to import extra names in addition to the default set, you will still need to include :DEFAULT explicitly.

For example, suppose that YourModule.pm says:

@EXPORT      = qw(A1 A2 A3 A4 A5);
@EXPORT_OK   = qw(B1 B2 B3 B4 B5);
%EXPORT_TAGS = (
    T1 => [qw(A1 A2 B1 B2)],
    T2 => [qw(A1 A2 B3 B4)]
);

Individual names in %EXPORT_TAGS must also appear in @EXPORT or @EXPORT_OK. Note that you cannot use the tags directly within either @EXPORT or @EXPORT_OK (though you could preprocess tags into either of those arrays, and in fact, the export_tags() and export_ok_tags() functions below do precisely that).

An application using YourModule can then say something like this:

use YourModule qw(:DEFAULT :T2 !B3 A3);

The :DEFAULT adds in A1, A2, A3, A4, and A5. The :T2 adds in only B3 and B4, since A1 and A2 were already added. The !B3 then deletes B3, and the A3 does nothing because A3 was already included. Other examples include:

use Socket qw(!/^[AP]F_/ !SOMAXCONN !SOL_SOCKET);
use POSIX  qw(:errno_h :termios_h !TCSADRAIN !/^EXIT/);

Remember that most patterns (using //) will need to be anchored with a leading ^, for example, /^EXIT/ rather than /EXIT/.
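
The left-to-right processing of specifications can be tried out in a single file. This sketch uses a cut-down, invented module (Camel::Demo, defined inline with do-nothing subroutines) rather than the YourModule of the text.

```perl
use strict;
use warnings;

BEGIN {
    package Camel::Demo;                  # hypothetical module, defined inline
    use Exporter ();
    our @ISA         = qw(Exporter);
    our @EXPORT      = qw(A1 A2);
    our @EXPORT_OK   = qw(B1 B2 B3);
    our %EXPORT_TAGS = (T2 => [qw(A1 B2 B3)]);
    sub A1 { "A1" }  sub A2 { "A2" }
    sub B1 { "B1" }  sub B2 { "B2" }  sub B3 { "B3" }
    $INC{'Camel/Demo.pm'} = __FILE__;     # tell "use" the module is already loaded
}

# :DEFAULT adds A1 and A2, :T2 adds B2 and B3, and !B3 takes B3 back out.
use Camel::Demo qw(:DEFAULT :T2 !B3);

my @imported = grep { __PACKAGE__->can($_) } qw(A1 A2 B1 B2 B3);
print "imported: @imported\n";
```

B1 never shows up because nothing asked for it, and B3 is asked for by the tag but removed again by the later deletion.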

You can say:

BEGIN { $Exporter::Verbose=1 }

in order to see how the specifications are being processed and what is actually being imported into modules.

Module version checking

The Exporter module will convert an attempt to import a number from a module into a call to $module_name->require_version($value). This can be used to validate that the version of the module being used is greater than or equal to the required version. The Exporter module also supplies a default require_version() method, which checks the value of $VERSION in the exporting module.

Since the default require_version() method treats the $VERSION number as a simple numeric value, it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that the module developer use numbers with at least two decimal places; for example, 1.09.
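
The 1.10-versus-1.9 trap is easy to demonstrate. In this sketch, Camel::Versioned is an invented module that inherits require_version() from Exporter; each check is wrapped in eval because a failed check raises an exception.

```perl
use strict;
use warnings;

BEGIN {
    package Camel::Versioned;             # hypothetical module, defined inline
    use Exporter ();
    our @ISA     = qw(Exporter);
    our $VERSION = '1.10';                # the author meant "release ten"
}

for my $wanted ('1.09', '1.9') {
    my $ok = eval { Camel::Versioned->require_version($wanted); 1 };
    print "need $wanted, have $Camel::Versioned::VERSION: ",
          ($ok ? "ok" : "too old"), "\n";
}
```

A request for 1.9 is refused even though the author thinks of 1.10 as the later release, which is why two decimal places throughout is the safer habit.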

Prior to release 5.004 or so of Perl, this only worked with modules that use the Exporter module; in particular, this means that you can't check the version of a class module that doesn't require the Exporter module.

Managing unknown symbols

In some situations you may want to prevent certain symbols from being exported. Typically this applies to extensions with functions or constants that may not exist on some systems.

The names of any symbols that cannot be exported should be listed in the @EXPORT_FAIL array.

If a module attempts to import any of these symbols, the Exporter will give the module an opportunity to handle the situation before generating an error. The Exporter will call an export_fail() method with a list of the failed symbols:

@failed_symbols = $module_name->export_fail(@failed_symbols);

If the export_fail() method returns an empty list, then no error is recorded and all requested symbols are exported. If the returned list is not empty, then an error is generated for each symbol and the export fails. The Exporter provides a default export_fail() method that simply returns the list unchanged.

Uses for the export_fail() method include giving better error messages for some symbols and performing lazy architectural checks. For a lazy check, put more symbols into @EXPORT_FAIL by default, and then take them out if someone actually tries to use them and an expensive check shows that they are usable on that platform.
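
A lazy check along these lines might be sketched as follows. Camel::Fragile and its %USABLE table are inventions for this example; the table merely stands in for whatever expensive platform test a real extension would run.

```perl
use strict;
use warnings;

BEGIN {
    package Camel::Fragile;               # hypothetical module, defined inline
    use Exporter ();
    our @ISA         = qw(Exporter);
    our @EXPORT_OK   = qw(steady flaky broken);
    our @EXPORT_FAIL = qw(flaky broken);  # doubtful until someone asks for them
    our %USABLE      = (flaky => 1);      # stand-in for an expensive platform check
    sub steady { "steady" }
    sub flaky  { "flaky" }
    sub broken { "broken" }

    # Return only the symbols that really cannot be exported here.
    sub export_fail {
        my ($class, @symbols) = @_;
        return grep { !$USABLE{$_} } @symbols;
    }
}

for my $sym (qw(flaky broken)) {
    my $ok = eval { Camel::Fragile->import($sym); 1 };
    print "$sym: ", ($ok ? "imported" : "refused"), "\n";
}
```

The refused symbol also draws a warning on STDERR from the Exporter before the import is abandoned.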

Tag handling utility functions

Since the symbols listed within %EXPORT_TAGS must also appear in either @EXPORT or @EXPORT_OK, two utility functions are provided that allow you to easily add tagged sets of symbols to @EXPORT or @EXPORT_OK:

%EXPORT_TAGS = (Bactrian => [qw(aa bb cc)], Dromedary => [qw(aa cc dd)]);

Exporter::export_tags('Bactrian');     # add aa, bb and cc to @EXPORT
Exporter::export_ok_tags('Dromedary'); # add aa, cc and dd to @EXPORT_OK

Any names that are not tags are added to @EXPORT or @EXPORT_OK unchanged, but will trigger a warning (with -w) to avoid misspelt tag names being silently added to @EXPORT or @EXPORT_OK. Future versions may regard this as a fatal error.

ExtUtils::Install--Install Files from Here to There

use ExtUtils::Install;
install($hashref, $verbose, $nonono);
uninstall($packlistfile, $verbose, $nonono);

install() and uninstall() are specific to the way ExtUtils::MakeMaker handles the platform-dependent installation and deinstallation of Perl extensions. They are not designed as general-purpose tools. If you're reading this chapter straight through (brave soul), you probably want to take a glance at the MakeMaker entry first. (Or just skip over everything in the ExtUtils package until you start writing an Ext.)

install() takes three arguments: a reference to a hash, a verbose switch, and a don't-really-do-it switch. The hash reference contains a mapping of directories; each key/value pair is a combination of directories to be copied. The key is a directory to copy from, and the value is a directory to copy to. The whole tree below the "from" directory will be copied, preserving timestamps and permissions.

There are two keys with a special meaning in the hash: "read" and "write". After the copying is done, install will write the list of target files to the file named by $hashref->{write}. If there is another file named by $hashref->{read}, the contents of this file will be merged into the written file. The read and the written file may be identical, but on the Andrew File System (AFS) it is fairly likely that people are installing to a different directory than the one where the files later appear.

uninstall() takes as first argument a file containing filenames to be unlinked. The second argument is a verbose switch, the third is a no-don't-really-do-it-now switch (useful to know what will happen without actually doing it).

ExtUtils::Liblist--Determine Libraries to Use and How to Use Them

require ExtUtils::Liblist;
ExtUtils::Liblist::ext($potential_libs, $Verbose);

This utility takes a list of libraries in the form -llib1 -llib2 -llib3 and returns lines suitable for inclusion in a Perl extension Makefile on the current platform. Extra library paths may be included with the form -L/another/path. This will affect the searches for all subsequent libraries.

ExtUtils::Liblist::ext() returns a list of four scalar values, which MakeMaker will eventually use in constructing a Makefile, among other things. The values are:

EXTRALIBS

List of libraries that need to be linked with ld (1) when linking a Perl binary that includes a static extension. Only those libraries that actually exist are included.

LDLOADLIBS

List of those libraries that can or must be linked when creating a shared library using ld (1). These may be static or dynamic libraries.

LD_RUN_PATH

A colon-separated list of the directories in LDLOADLIBS. It is passed as an environment variable to the process that links the shared library.

BSLOADLIBS

List of those libraries that are needed but can be linked in dynamically with the DynaLoader at run-time on this platform. This list is used to create a .bs (bootstrap) file. SunOS/Solaris does not need this because ld (1) records the information (from LDLOADLIBS) into the object file.

Portability

This module deals with a lot of system dependencies and has quite a few architecture-specific ifs in the code.

ExtUtils::MakeMaker--Create a Makefile for a Perl Extension

use ExtUtils::MakeMaker;
WriteMakefile( ATTRIBUTE => VALUE, ... );
# which internally is really more like...
%att = (ATTRIBUTE => VALUE, ...);
MM->new(\%att)->flush;

When you build an extension to Perl, you need to have an appropriate Makefile[3] in the extension's source directory. And while you could conceivably write one by hand, this would be rather tedious. So you'd like a program to write it for you.

[3] If you don't know what a Makefile is, or what the make (1) program does with one, you really shouldn't be reading this section. We will be assuming that you know what happens when you type a command like make foo.

Originally, this was done using a shell script (actually, one for each extension) called Makefile.SH, much like the one that writes the Makefile for Perl itself. But somewhere along the line, it occurred to the perl5-porters that, by the time you want to compile your extensions, there's already a bare-bones version of the Perl executable called miniperl, if not a fully installed perl. And for some strange reason, Perl programmers prefer programming in Perl to programming in shell. So they wrote MakeMaker, just so that you can write Makefile.PL instead of Makefile.SH.

MakeMaker isn't a program; it's a module (or it wouldn't be in this chapter). The module provides the routines you need; you just need to use the module, and then call the routines. As with any programming job, there are many degrees of freedom; but your typical Makefile.PL is pretty simple. For example, here's ext/POSIX/Makefile.PL from the Perl distribution's POSIX extension (which is by no means a trivial extension):

use ExtUtils::MakeMaker;
WriteMakefile(
    NAME         => 'POSIX',
    LIBS         => ["-lm -lposix -lcposix"],
    MAN3PODS     => ' ',    # Pods will be built by installman.
    XSPROTOARG   => '-noprototypes',       # XXX remove later?
    VERSION_FROM => 'POSIX.pm', 
);

Several things are apparent from this example, but the most important is that the WriteMakefile() function uses named parameters. This means that you can pass many potential parameters, but you're only required to pass the ones you want to be different from the default values. (And when we say "many", we mean "many"--there are about 75 of them. See the Attributes section later.)

As the synopsis above indicates, the WriteMakefile() function actually constructs an object. This object has attributes that are set from various sources, including the parameters you pass to the function. It's this object that actually writes your Makefile, meshing together the demands of your extension with the demands of the architecture on which the extension is being installed. Like many craftily crafted objects, this MakeMaker object delegates as much of its work as possible to various other subroutines and methods. Many of these may be overridden in your Makefile.PL if you need to do some fine tuning. (Generally you don't.)

But let's not lose track of the goal, which is to write a Makefile that will know how to do anything to your extension that needs doing. Now as you can imagine, the Makefile that MakeMaker writes is quite, er, full-featured. It's easy to get lost in all the details. If you look at the POSIX Makefile generated by the bit of code above, you will find a file containing about 122 macros and 77 targets. You will want to go off into a corner and curl up into a little ball, saying, "Never mind, I didn't really want to know."

Well, the fact of the matter is, you really don't want to know, nor do you have to. Most of these items take care of themselves--that's what MakeMaker is there for, after all. We'll lay out the various attributes and targets for you, but you can just pick and choose, like in a cafeteria. We'll talk about the make targets first, because they're the actions you eventually want to perform, and then work backward to the macros and attributes that feed the targets.

But before we do that, you need to know just a few more architectural features of MakeMaker to make sense of some of the things we'll say. The targets at the end of your Makefile depend on the macro definitions that are interpolated into them. Those macro definitions in turn come from any of several places. Depending on how you count, there are about five sources of information for these attributes. Ordered by increasing precedence and (more or less) decreasing permanence, they are:

The first four of these turn into attributes of the object we mentioned, and are eventually written out as macro definitions in your Makefile. In most cases, the names of the values are consistent from beginning to end. (Except that the Config database keeps the names in lowercase, as they come from Perl's config.sh file. The names are translated to uppercase when they become attributes of the object.) In any case, we'll tend to use the term attributes to mean both attributes and the Makefile macros derived from them.

The Makefile.PL and the hints may also provide overriding methods for the object, if merely changing an attribute isn't good enough.

The hints files are expected to be named like their counterparts in PERL_SRC/hints, but with a .pl filename extension (for example, next_3_2.pl), because the file consists of Perl code to be evaluated. Apart from that, the rules governing which hintsfile is chosen are the same as in Configure. The hintsfile is evaled within a routine that is a method of our MakeMaker object, so if you want to override or create an attribute, you would say something like:

$self->{LIBS} = ['-ldbm -lucb -lc'];

By and large, if your Makefile isn't doing what you want, you just trace back the name of the misbehaving attribute to its source, and either change it there or override it downstream.

Extensions may be built using the contents of either the Perl source directory tree or the installed Perl library. The recommended way is to build extensions after you have run make install on Perl itself. You can then build your extension in any directory on your hard disk that is not below the Perl source tree. The support for extensions below the ext/ directory of the Perl distribution is only good for the standard extensions that come with Perl.

If an extension is being built below the ext/ directory of the Perl source, then MakeMaker will set PERL_SRC automatically (usually to ../..). If PERL_SRC is defined and the extension is recognized as a standard extension, then other variables default to the following:

PERL_INC     = PERL_SRC
PERL_LIB     = PERL_SRC/lib
PERL_ARCHLIB = PERL_SRC/lib
INST_LIB     = PERL_LIB
INST_ARCHLIB = PERL_ARCHLIB

If an extension is being built away from the Perl source, then MakeMaker will leave PERL_SRC undefined and default to using the installed copy of the Perl library. The other variables default to the following:

PERL_INC     = $archlibexp/CORE
PERL_LIB     = $privlibexp
PERL_ARCHLIB = $archlibexp
INST_LIB     = ./blib/lib
INST_ARCHLIB = ./blib/arch

If Perl has not yet been installed, then PERL_SRC can be defined as an override on the command line.
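The defaults for a build away from the Perl source can be inspected with a short sketch like the one below. It reads only the standard Config module; the helper name default_dirs is made up for illustration and is not part of MakeMaker.

```perl
use strict;
use warnings;
use Config;    # exposes the values recorded when Perl was configured

# Hypothetical helper: compute the defaults listed above for a build
# outside the Perl source tree.
sub default_dirs {
    return (
        PERL_INC     => "$Config{archlibexp}/CORE",
        PERL_LIB     => $Config{privlibexp},
        PERL_ARCHLIB => $Config{archlibexp},
        INST_LIB     => './blib/lib',
        INST_ARCHLIB => './blib/arch',
    );
}

my %dirs = default_dirs();
printf "%-12s = %s\n", $_, $dirs{$_} for sort keys %dirs;
```

The actual directories printed depend on how your own Perl was configured.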

Targets

Far and away the most commonly used make targets are those used by the installer to install the extension. So we aim to make the normal installation very easy:

perl Makefile.PL  # generate the Makefile
make              # compile the extension
make test         # test the extension
make install      # install the extension

This assumes that the installer has dynamic linking available. If not, a couple of additional commands are also necessary:

make perl         # link a new perl statically with this extension
make inst_perl    # install that new perl appropriately

Other interesting targets in the generated Makefile are:

make config       # check whether the Makefile is up-to-date
make clean        # delete local temp files (Makefile gets renamed)
make realclean    # delete derived files (including ./blib)
make ci           # check in all files in the MANIFEST file
make dist         # see the "Distribution Support" section below

Now we'll talk about some of these commands, and how each of them is related to MakeMaker. So we'll not only be talking about things that happen when you invoke the make target, but also about what MakeMaker has to do to generate that make target. So brace yourself for some temporal whiplash.

Running MakeMaker

Running perl Makefile.PL is the step most closely related to MakeMaker, because it's the one in which you actually run MakeMaker. No temporal whiplash here. As we mentioned earlier, some of the default attribute values may be overridden by adding arguments of the form KEY=VALUE. For example:

perl Makefile.PL PREFIX=/tmp/myperl5

To get a more detailed view of what MakeMaker is doing, say:

perl Makefile.PL verbose
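To make the KEY=VALUE convention concrete, here is a rough sketch of how such arguments could be split into attribute names and values. The helper parse_args and its treatment of a bare word as a flag are assumptions made for illustration; MakeMaker's own argument handling may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: split KEY=VALUE arguments into a hash.
# A bare word such as "verbose" is treated here as a flag set to 1.
sub parse_args {
    my @args = @_;
    my %attr;
    for my $arg (@args) {
        if ($arg =~ /^(\w+)=(.*)$/s) {
            $attr{$1} = $2;
        }
        else {
            $attr{$arg} = 1;
        }
    }
    return \%attr;
}

my $attr = parse_args('PREFIX=/tmp/myperl5', 'verbose');
print "$_ => $attr->{$_}\n" for sort keys %$attr;
```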

Making whatever is needed

A make command without arguments performs any compilation needed and puts any generated files into staging directories that are named by the attributes INST_LIB, INST_ARCHLIB, INST_EXE, INST_MAN1DIR, and INST_MAN3DIR. These directories default to something below ./blib if you are not building below the Perl source directory. If you are building below the Perl source, INST_LIB and INST_ARCHLIB default to ../../lib, and INST_EXE is not defined.

Running tests

The goal of this command is to run any regression tests supplied with the extension, so MakeMaker checks for the existence of a file named test.pl in the current directory and, if it exists, adds commands to the test target of the Makefile that will execute the script with the proper set of Perl -I options (since the files haven't been installed into their final location yet).

MakeMaker also checks for any files matching glob("t/*.t"). It will add commands to the test target that execute all matching files via the Test::Harness module with the -I switches set correctly. If you pass TEST_VERBOSE=1, the test target will run the tests verbosely.
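A test script under t/ can be quite small. The sketch below assumes the plain-text protocol that Test::Harness reads (a "1..N" plan line followed by "ok N" or "not ok N" lines). The filename t/basic.t, the helper run_checks, and the checks themselves are made up for illustration.

```perl
# t/basic.t (hypothetical filename)
use strict;
use warnings;

# Each entry is a check that returns true on success.
my @checks = (
    sub { 1 + 1 == 2 },
    sub { lc('MAKE') eq 'make' },
);

# Build the report: the plan line first, then one line per check.
sub run_checks {
    my @subs  = @_;
    my @lines = ('1..' . scalar @subs);
    my $n     = 0;
    for my $check (@subs) {
        $n++;
        push @lines, ($check->() ? "ok $n" : "not ok $n");
    }
    return @lines;
}

print "$_\n" for run_checks(@checks);
```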

Installing files

Once the installer has tested the extension, the various generated files need to get put into their final resting places. The install target copies the files found below each of the INST_* directories to their INSTALL* counterparts.

INST_LIB -> INSTALLPRIVLIB[1] or INSTALLSITELIB[2]
INST_ARCHLIB -> INSTALLARCHLIB[1] or INSTALLSITEARCH[2]
INST_EXE -> INSTALLBIN
INST_MAN1DIR -> INSTALLMAN1DIR
INST_MAN3DIR -> INSTALLMAN3DIR

Footnotes:

[1] if INSTALLDIRS set to "perl"

[2] if INSTALLDIRS set to "site"

The INSTALL* attributes in turn default to their %Config counterparts, $Config{installprivlib}, $Config{installarchlib}, and so on.

If you don't set INSTALLARCHLIB or INSTALLSITEARCH, MakeMaker will assume you want them to be subdirectories of INSTALLPRIVLIB and INSTALLSITELIB, respectively. The exact relationship is determined by Configure. But you can usually just go with the defaults for all these attributes.

The PREFIX attribute can be used to redirect all the INSTALL* attributes in one go. Here's the quickest way to install a module in a nonstandard place:

perl Makefile.PL PREFIX=~

The value you specify for PREFIX replaces one or more leading pathname components in all INSTALL* attributes. The prefix to be replaced is determined by the value of $Config{prefix}, which typically has a value like /usr. (Note that the tilde expansion above is done by MakeMaker, not by perl or make.)
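The effect of PREFIX on a single directory can be pictured with the sketch below. The helper apply_prefix and the sample paths are hypothetical; on a real system the old prefix would come from $Config{prefix}, and MakeMaker's own substitution logic may be more involved.

```perl
use strict;
use warnings;

# Hypothetical helper: swap a leading $old_prefix for $new_prefix.
# Paths that do not start with $old_prefix are returned unchanged.
sub apply_prefix {
    my ($path, $old_prefix, $new_prefix) = @_;
    (my $result = $path) =~ s{^\Q$old_prefix\E(?=/|\z)}{$new_prefix};
    return $result;
}

# Example values only; on a real system they would come from %Config.
my $old = '/usr';
my $new = '/home/me';
for my $dir ('/usr/lib/perl5', '/usr/bin', '/opt/man/man3') {
    printf "%-16s -> %s\n", $dir, apply_prefix($dir, $old, $new);
}
```

The lookahead keeps a directory such as /usrlocal from being mistaken for something below /usr.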

If the user has superuser privileges and is not working under the Andrew File System (AFS) or relatives, then the defaults for INSTALLPRIVLIB, INSTALLARCHLIB, INSTALLBIN, and so on should be appropriate.

By default, make install writes some documentation of what has been done into the file given by $(INSTALLARCHLIB)/perllocal.pod. This feature can be bypassed by calling make pure_install.

If you are using AFS, you must specify the installation directories, since these most probably have changed since Perl itself was installed. Do this by issuing these commands:

perl Makefile.PL INSTALLSITELIB=/afs/here/today \
    INSTALLBIN=/afs/there/now INSTALLMAN3DIR=/afs/for/manpages
make

Be careful to repeat this procedure every time you recompile an extension, unless you are sure the AFS installation directories are still valid.

Static linking of a new Perl binary

The steps above are sufficient on a system supporting dynamic loading. On systems that do not support dynamic loading, however, the extension has to be linked together statically with everything else you might want in your perl executable. MakeMaker supports the linking process by creating appropriate targets in the Makefile. If you say:

make perl

it will produce a new perl binary in the current directory with all extensions linked in that can be found in INST_ARCHLIB, SITELIBEXP, and PERL_ARCHLIB. To do that, MakeMaker writes a new Makefile; on UNIX it is called Makefile.aperl, but the name may be system-dependent. When you want to force the creation of a new perl, we recommend that you delete this Makefile.aperl so the directories are searched for linkable libraries again.

The binary can be installed in the directory where Perl normally resides on your machine with:

make inst_perl

To produce a Perl binary with a different filename than perl, either say:

perl Makefile.PL MAP_TARGET=myperl
make myperl
make inst_perl

or say:

perl Makefile.PL
make myperl MAP_TARGET=myperl
make inst_perl MAP_TARGET=myperl

In either case, you will be asked to confirm the invocation of the inst_perl target, since this invocation is likely to overwrite your existing Perl binary in INSTALLBIN.

By default make inst_perl documents what has been done in the file given by $(INSTALLARCHLIB)/perllocal.pod. This behavior can be bypassed by calling make pure_inst_perl.

Sometimes you might want to build a statically linked Perl even though your system supports dynamic loading. In this case you may explicitly set the linktype:

perl Makefile.PL LINKTYPE=static

Attributes you can set

The following attributes can be specified as arguments to WriteMakefile() or as KEY=VALUE pairs on the command line. We give examples below in the form they would appear in your Makefile.PL, that is, as though passed as a named parameter to WriteMakefile() (including the comma that comes after it).
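A minimal Makefile.PL built from a handful of these attributes might look like the sketch below. The module name My::Module, the version number, and the library and include settings are placeholders; only the attribute names come from the descriptions in this section.

```perl
# Makefile.PL (sketch; names and values are placeholders)
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'My::Module',         # hypothetical module name
    VERSION => '0.01',               # or use VERSION_FROM to read it from a file
    LIBS    => ['-lm'],              # one complete set of ld arguments
    DEFINE  => '-DHAVE_UNISTD_H',    # additional defines
    INC     => '-I.',                # include directories, in -I form
);
```

Any of these values could equally be given on the command line, for example as VERSION=0.02.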

C

A reference to an array of *.c filenames. It's initialized by doing a directory scan and by derivation from the values of the XS attribute hash. This is not currently used by MakeMaker but may be handy in Makefile.PLs.

CONFIG

An array reference containing a list of attributes to fetch from %Config. For example:

CONFIG => [qw(archname manext)],

defines ARCHNAME and MANEXT from config.sh. MakeMaker will automatically add the following values to CONFIG:

ar            dlext        ldflags     ranlib
cc            dlsrc        libc        sitelibexp
cccdlflags    ld           lib_ext     sitearchexp
ccdlflags     lddlflags    obj_ext     so

CONFIGURE

A reference to a subroutine returning a hash reference. The hash may contain further attributes, for example, {LIBS => ...}, that have to be determined by some evaluation method. Be careful, because any attributes defined this way will override hints and WriteMakefile() parameters (but not command-line arguments).
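As an illustration of "some evaluation method", the sketch below picks a library list by looking for a directory at configuration time. The helper probe_libs and the paths it checks are invented for the example and are not recommendations.

```perl
use strict;
use warnings;

# Hypothetical probe: choose a library list by checking which
# directory exists. The paths are examples only.
sub probe_libs {
    my @candidates = @_;
    for my $dir (@candidates) {
        return { LIBS => ["-L$dir -ldbm"] } if -d $dir;
    }
    return { LIBS => ['-ldbm'] };    # fall back to the default search path
}

# In a Makefile.PL this could be passed as:
#   WriteMakefile( NAME      => 'My::Module',
#                  CONFIGURE => sub { probe_libs('/usr/local/lib', '/opt/lib') } );
my $attrs = probe_libs('/usr/local/lib', '/opt/lib');
print "LIBS => @{ $attrs->{LIBS} }\n";
```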

DEFINE

An attribute containing additional defines, such as -DHAVE_UNISTD_H.

DIR

A reference to an array of subdirectories containing Makefile.PLs. For example, SDBM_File has:

DIR => ['sdbm'],

MakeMaker will automatically do recursive MakeMaking if subdirectories contain Makefile.PL files. A separate MakeMaker class is generated for each subdirectory, so each MakeMaker object can override methods using the fake MY:: class (see below) without interfering with other MakeMaker objects. You don't even need a Makefile.PL in the top level directory if you pass one in via -M and -e:

perl -MExtUtils::MakeMaker -e 'WriteMakefile()'

DISTNAME

Your name for distributing the package (by tar file). This defaults to NAME below.

DL_FUNCS

A reference to a hash of symbol names for routines to be made available as universal symbols. Each key/value pair consists of the package name and an array of routine names in that package. This attribute is used only under AIX (export lists) and VMS (linker options) at present. The routine names supplied will be expanded in the same way as XSUB names are expanded by the XS attribute.

The default key/value pair looks like this:

"$PKG" => ["boot_$PKG"]

For a pair of packages named RPC and NetconfigPtr, you might, for example, set it to this:

DL_FUNCS => {
    RPC          => [qw(boot_rpcb rpcb_gettime getnetconfigent)],
    NetconfigPtr => ['DESTROY'],
},

DL_VARS

A reference to an array of symbol names for variables to be made available as universal symbols. It's used only under AIX (export lists) and VMS (linker options) at present. Defaults to []. A typical value might look like this:

DL_VARS => [ qw( Foo_version Foo_numstreams Foo_tree ) ],

EXE_FILES

A reference to an array of executable files. The files will be copied to the INST_EXE directory. A make realclean command will delete them from there again.

FIRST_MAKEFILE

The name of the Makefile to be produced. Defaults to the contents of MAKEFILE, but can be overridden. This is used for the second Makefile that will be produced for the MAP_TARGET.

FULLPERL

A Perl binary able to run this extension.

H

A reference to an array of *.h filenames. Similar to C.

INC

Directories containing include files, in -I form. For example:

INC => "-I/usr/5include -I/path/to/inc",

INSTALLARCHLIB

Used by make install, which copies files from INST_ARCHLIB to this directory if INSTALLDIRS is set to "perl".

INSTALLBIN

Used by make install, which copies files from INST_EXE to this directory.

INSTALLDIRS

Determines which of the two sets of installation directories to choose: installprivlib and installarchlib versus installsitelib and installsitearch. The first pair is chosen with INSTALLDIRS=perl, the second with INSTALLDIRS=site. The default is "site".

INSTALLMAN1DIR

This directory gets the command manpages at make install time. It defaults to $Config{installman1dir}.

INSTALLMAN3DIR

This directory gets the library manpages at make install time. It defaults to $Config{installman3dir}.

INSTALLPRIVLIB

Used by make install, which copies files from INST_LIB to this directory if INSTALLDIRS is set to "perl".

INSTALLSITELIB

Used by make install, which copies files from INST_LIB to this directory if INSTALLDIRS is set to "site" (default).

INSTALLSITEARCH

Used by make install, which copies files from INST_ARCHLIB to this directory if INSTALLDIRS is set to "site" (default).

INST_ARCHLIB

Same as INST_LIB, but for architecture-dependent files.

INST_EXE

Directory where executable scripts should be staged during running of make. Defaults to ./blib/bin, just to have a dummy location during testing. make install will copy the files in INST_EXE to INSTALLBIN.

INST_LIB

Directory where we put library files of this extension while building it.

INST_MAN1DIR

Directory to hold the command manpages at make time.

INST_MAN3DIR

Directory to hold the library manpages at make time.

LDFROM

Defaults to $(OBJECT) and is used in the ld(1) command to specify what files to link/load from. (Also see dynamic_lib later for how to specify ld flags.)

LIBPERL_A

The filename of the Perl library that will be used together with this extension. Defaults to libperl.a.

LIBS

An anonymous array of alternative library specifications to be searched for (in order) until at least one library is found.

For example:

LIBS => ["-lgdbm", "-ldbm -lfoo", "-L/path -ldbm.nfs"],

Note that any element of the array contains a complete set of arguments for the ld command. So do not specify:

LIBS => ["-ltcl", "-ltk", "-lX11"],

See NDBM_File/Makefile.PL for an example where an array is needed. If you specify a scalar as in:

LIBS => "-ltcl -ltk -lX11",

MakeMaker will turn it into an array with one element.

LINKTYPE

"static" or "dynamic" (the latter is the default unless usedl=undef in config.sh). Should only be used to force static linking. (Also see linkext, later in this chapter.)

MAKEAPERL

Boolean that tells MakeMaker to include the rules for making a Perl binary. This is handled automatically as a switch by MakeMaker. The user normally does not need it.

MAKEFILE

The name of the Makefile to be produced.

MAN1PODS

A reference to a hash of POD-containing files. MakeMaker will default this to all EXE_FILES files that include POD directives. The files listed here will be converted to manpages and installed as requested at Configure time.

MAN3PODS

A reference to a hash of .pm and .pod files. MakeMaker will default this to all .pod and any .pm files that include POD directives. The files listed here will be converted to manpages and installed as requested at Configure time.

MAP_TARGET

If it is intended that a new Perl binary be produced, this variable holds the name for that binary. Defaults to perl.

MYEXTLIB

If the extension links to a library that it builds, set this to the name of the library (see SDBM_File).

NAME

Perl module name for this extension (for example, DBD::Oracle). This will default to the directory name, but should really be explicitly defined in the Makefile.PL.

NEEDS_LINKING

MakeMaker will figure out whether an extension contains linkable code anywhere down the directory tree, and will set this variable accordingly. But you can speed it up a very little bit if you define this Boolean variable yourself.

NOECHO

Governs make's @ (echoing) feature. By setting NOECHO to an empty string, you can generate a Makefile that echoes all commands. Mainly used in debugging MakeMaker itself.

NORECURS

A Boolean that inhibits the automatic descent into subdirectories (see DIR above). For example:

NORECURS => 1,

OBJECT

A string containing a list of object files, defaulting to $(BASEEXT)$(OBJ_EXT). But it can be a long string containing all object files. For example:

OBJECT => "tkpBind.o tkpButton.o tkpCanvas.o",

PERL

Perl binary for tasks that can be done by miniperl.

PERLMAINCC

The command line that is able to compile perlmain.c. Defaults to $(CC).

PERL_ARCHLIB

Same as PERL_LIB for architecture-dependent files.

PERL_LIB

The directory containing the Perl library to use.

PERL_SRC

The directory containing the Perl source code. Use of this should be avoided, since it may be undefined.

PL_FILES

A reference to a hash of files to be processed as Perl programs. By default MakeMaker will turn the names of any *.PL files it finds (except Makefile.PL) into keys, and use the basenames of these files as values. For example:

PL_FILES => {'whatever.PL' => 'whatever'},

This turns into a Makefile entry resembling:

all :: whatever
whatever :: whatever.PL
        $(PERL) -I$(INST_ARCHLIB) -I$(INST_LIB) \
                -I$(PERL_ARCHLIB) -I$(PERL_LIB) whatever.PL

You'll note that there's no I/O redirection into whatever there. The *.PL files are expected to produce output to the target files themselves.
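A *.PL file that follows this convention might be sketched as below. The helpers target_for and generate are made up for the example; the point is only that the script opens and writes its own target file rather than relying on redirection.

```perl
# whatever.PL (sketch): writes the file "whatever" itself.
use strict;
use warnings;

# Derive the output name by stripping the .PL suffix.
sub target_for {
    my ($script) = @_;
    (my $target = $script) =~ s/\.PL\z//;
    return $target;
}

# Write the generated text to the target file and return its name.
sub generate {
    my ($script, $text) = @_;
    my $target = target_for($script);
    open(my $out, '>', $target) or die "Cannot write $target: $!";
    print {$out} $text;
    close($out) or die "Cannot close $target: $!";
    return $target;
}

generate('whatever.PL', "# generated file\n");
```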

PM

A reference to a hash of .pm files and .pl files to be installed. For example:

PM => {'name_of_file.pm' => '$(INST_LIBDIR)/install_as.pm'},

By default this includes *.pm and *.pl. If a lib/ subdirectory exists and is not listed in DIR (above) then any *.pm and *.pl files it contains will also be included by default. Defining PM in the Makefile.PL will override PMLIBDIRS.

PMLIBDIRS

A reference to an array of subdirectories that contain library files. Defaults to:

PMLIBDIRS => [ 'lib', '$(BASEEXT)' ],

The directories will be scanned and any files they contain will be installed in the corresponding location in the library. A libscan() method may be used to alter the behavior. Defining PM in the Makefile.PL will override PMLIBDIRS.

PREFIX

May be used to set the three INSTALL* attributes in one go (except, probably, for INSTALLMAN1DIR if it is not below PREFIX according to %Config). They will have PREFIX as a common directory node and will branch from that node into lib/, lib/ARCHNAME, or whatever Configure decided at the build time of your Perl (unless you override one of them, of course).

PREREQ

A placeholder, not yet implemented. Will eventually be a hash reference: the keys of the hash are names of modules that need to be available to run this extension (for example, Fcntl for SDBM_File); the values of the hash are the desired versions of the modules.

SKIP

An array reference specifying the names of sections of the Makefile not to write. For example:

SKIP => [qw(name1 name2)],

TYPEMAPS

A reference to an array of typemap filenames. (Typemaps are used by the XS preprocessing system.) Use this when the typemaps are in some directory other than the current directory or when they are not named typemap. The last typemap in the list takes precedence. A typemap in the current directory has highest precedence, even if it isn't listed in TYPEMAPS. The default system typemap has lowest precedence.

VERSION

Your version number for distributing the package. This number defaults to 0.1.

VERSION_FROM

Instead of specifying the VERSION in the Makefile.PL, you can let MakeMaker parse a file to determine the version number. The parsing routine requires that the file named by VERSION_FROM contain one single line to compute the version number. The first line in the file that contains the regular expression:

/(\$[\w:]*\bVERSION)\b.*=/

will be evaluated with eval and the value of the named variable after the eval will be assigned to the VERSION attribute of the MakeMaker object. The following lines will be parsed satisfactorily:

$VERSION = '1.00';
( $VERSION ) = '$Revision: 1.1 $ ' =~ /\$Revision:\s+([^\s]+)/;
$FOO::VERSION = '1.10';

but these will fail:

my $VERSION = '1.01';
local $VERSION = '1.02';
local $FOO::VERSION = '1.30';

The file named in VERSION_FROM is added as a dependency to the Makefile in order to guarantee that the Makefile contains the correct VERSION attribute after a change of the file.
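The line-selection step can be tried out in isolation. The sketch below only locates the first line that matches the pattern quoted above and reports the variable name it captured; it does not attempt the eval step, so it cannot tell the failing my and local forms from the good ones. The helper name find_version_variable is hypothetical.

```perl
use strict;
use warnings;

# Return the variable name from the first line that matches the
# pattern quoted above, or nothing if no line matches.
sub find_version_variable {
    my @lines = @_;
    for my $line (@lines) {
        return $1 if $line =~ /(\$[\w:]*\bVERSION)\b.*=/;
    }
    return;
}

my @source = (
    'package FOO;',
    '# a comment mentioning VERSION but no assignment',
    q{$FOO::VERSION = '1.10';},
);
my $name = find_version_variable(@source);
print defined $name ? "found $name\n" : "no version line\n";
```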

XS

A hash reference of .xs files. MakeMaker will default this. For example:

XS => {'name_of_file.xs' => 'name_of_file.c'},

The *.c files will automatically be included in the list of files deleted by a make clean.

XSOPT

A string of options to pass to xsubpp (the XS preprocessor). This might include -C++ or -extern. Do not include typemaps here; the TYPEMAPS parameter exists for that purpose.

XSPROTOARG

May be set to an empty string (which is identical to -prototypes) or to -noprototypes. MakeMaker defaults to the empty string.

XS_VERSION

Your version number for the .xs file of this package. This defaults to the value of the VERSION attribute.

Additional lowercase attributes

There are additional lowercase attributes that you can use to pass parameters to the methods that spit out particular portions of the Makefile. These attributes are not normally required.

clean

Extra files to clean.

clean => {FILES => "*.xyz foo"},

depend

Extra dependencies.

depend => {ANY_TARGET => ANY_DEPENDENCY, ...},

dist

Options for distribution (see "Distribution Support" below).

dist => {
    TARFLAGS => 'cvfF',
    COMPRESS => 'gzip',
    SUFFIX => 'gz',
    SHAR => 'shar -m',
    DIST_CP => 'ln',
},

If you specify COMPRESS, then SUFFIX should also be altered, since it is needed in order to specify for make the target file of the compression. Setting DIST_CP to "ln" can be useful if you need to preserve the timestamps on your files. DIST_CP can take the values "cp" (copy the file), "ln" (link the file), or "best" (copy symbolic links and link the rest). Default is "best".

dynamic_lib

Options for dynamic library support.

dynamic_lib => {
    ARMAYBE => 'ar',
    OTHERLDFLAGS => '...',
    INST_DYNAMIC_DEP => '...',
},

installpm

Some installation options having to do with AutoSplit.

installpm => {SPLITLIB => '$(INST_LIB)'},    # the default; or '$(INST_ARCHLIB)'

linkext

Linking style.

linkext => {LINKTYPE => 'static', 'dynamic', or ""},

Extensions that have nothing but *.pm files used to have to say:

linkext => {LINKTYPE => ""},

with pre-5.0 MakeMakers. With Version 5.00 of MakeMaker such a line can be deleted safely. MakeMaker recognizes when there's nothing to be linked.

macro

Extra macros to define.

macro => {ANY_MACRO => ANY_VALUE, ...},

realclean

Extra files to really clean.

realclean => {FILES => '$(INST_ARCHAUTODIR)/*.xyz'},

Useful Makefile macros

Here are some useful macros that you probably shouldn't redefine because they're derivative.

FULLEXT

Pathname for extension directory (for example, DBD/Oracle).

BASEEXT

Basename part of FULLEXT. May be just equal to FULLEXT.

ROOTEXT

Directory part of FULLEXT with leading slash (for example, /DBD).

INST_LIBDIR

$(INST_LIB)$(ROOTEXT)

INST_AUTODIR

$(INST_LIB)/auto/$(FULLEXT)

INST_ARCHAUTODIR

$(INST_ARCHLIB)/auto/$(FULLEXT)

Overriding MakeMaker methods

If you cannot achieve the desired Makefile behavior by specifying attributes, you may define private subroutines in the Makefile.PL. Each subroutine returns the text it wishes to have written to the Makefile. To override a section of the Makefile you can use one of two styles. You can just return a new value:

sub MY::c_o { "new literal text" }

or you can edit the default by saying something like:

sub MY::c_o {
    my $self = shift;
    local *c_o;
    $_=$self->MM::c_o;
    s/old text/new text/;
    $_;
}

Both methods above are available for backward compatibility with older Makefile.PLs.
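The same mechanism is often used to append extra rules rather than to edit an existing section. The sketch below assumes a section named postamble, which is not described above, and a made-up target called hello; treat both as illustrative.

```perl
use strict;
use warnings;

# Sketch: append a custom target to the generated Makefile.
# The target name "hello" is made up; the recipe line must begin with a tab.
sub MY::postamble {
    return "hello:\n\t\@echo hello from postamble\n";
}

# Show the text that would be written to the Makefile.
print MY::postamble();
```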

If you still need a different solution, try to develop another subroutine that better fits your needs and then submit the diffs to either perl5-porters@nicoh.com or comp.lang.perl.modules as appropriate.

Distribution support

For authors of extensions, MakeMaker provides several Makefile targets. Most of the support comes from the ExtUtils::Manifest module, where additional documentation can be found. Note that a MANIFEST file is basically just a list of filenames to be shipped with the kit to build the extension.

make distcheck

Reports which files are below the build directory but not in the MANIFEST file and vice versa. (See ExtUtils::Manifest::fullcheck() for details.)

make skipcheck

Reports which files are skipped due to the entries in the MANIFEST.SKIP file. (See ExtUtils::Manifest::skipcheck() for details.)

make distclean

Does a realclean first and then the distcheck. Note that this is not needed to build a new distribution as long as you are sure that the MANIFEST file is OK.

make manifest

Rewrites the MANIFEST file, adding all remaining files found. (See ExtUtils::Manifest::mkmanifest() for details.)

make distdir

Copies all files that are in the MANIFEST file to a newly created directory with the name $(DISTNAME)-$(VERSION). If that directory exists, it will be removed first.

make disttest

Makes distdir first, and runs perl Makefile.PL, make, and make test in that directory.

make tardist

First does a command $(PREOP), which defaults to a null command. Does a make distdir next and runs tar(1) on that directory into a tarfile. Then deletes the distdir. Finishes with a command $(POSTOP), which defaults to a null command.

make dist

Defaults to $(DIST_DEFAULT), which in turn defaults to tardist.

make uutardist

Runs a tardist first and uuencodes the tarfile.

make shdist

First does a command $(PREOP), which defaults to a null command. Does a distdir next and runs shar on that directory into a sharfile. Then deletes the distdir. Finishes with a command $(POSTOP), which defaults to a null command. Note: for shdist to work properly, a shar program that can handle directories is mandatory.

make ci

Does a $(CI) and a $(RCS_LABEL) on all files in the MANIFEST file.

Customization of the distribution targets can be done by specifying a hash reference to the dist attribute of the WriteMakefile() call. The following parameters are recognized:

Parameter   Default
CI          ('ci -u')
COMPRESS    ('compress')
POSTOP      ('@ :')
PREOP       ('@ :')
RCS_LABEL   ('rcs -q -Nv$(VERSION_SYM):')
SHAR        ('shar')
SUFFIX      ('Z')
TAR         ('tar')
TARFLAGS    ('cvf')

An example:

WriteMakefile( 'dist' => { COMPRESS=>"gzip", SUFFIX=>"gz" })

ExtUtils::Manifest--Utilities to Write and Check a MANIFEST File

require ExtUtils::Manifest;
ExtUtils::Manifest::mkmanifest();
ExtUtils::Manifest::manicheck();
ExtUtils::Manifest::filecheck();
ExtUtils::Manifest::fullcheck();
ExtUtils::Manifest::skipcheck();
ExtUtils::Manifest::manifind();
ExtUtils::Manifest::maniread($file);
ExtUtils::Manifest::manicopy($read, $target, $how);

These routines automate the maintenance and use of a MANIFEST file. A MANIFEST file is essentially just a list of filenames, one per line, with an optional comment on each line, separated by whitespace (usually one or more tabs). The idea is simply that you can extract the filenames by saying:

awk '{print $1}' MANIFEST

mkmanifest() writes the names of all files in and below the current directory to a file named in the global variable $ExtUtils::Manifest::MANIFEST (which defaults to MANIFEST) in the current directory. As the counterpart to the awk command above, it works much like:

find . -type f -print > MANIFEST

except that it also checks the existing MANIFEST file (if any) and copies over any comments that are found there. Also, all filenames that match any regular expression in a file MANIFEST.SKIP (if such a file exists) are ignored.

manicheck() checks whether all files listed in a MANIFEST file in the current directory really do exist.

filecheck() finds files below the current directory that are not mentioned in the MANIFEST file. An optional MANIFEST.SKIP file will be consulted, and any filename matching a regular expression in such a file will not be reported as missing in the MANIFEST file.

fullcheck() does both a manicheck() and a filecheck().

skipcheck() lists all files that are skipped due to your MANIFEST.SKIP file.

manifind() returns a hash reference. The keys of the hash are the files found below the current directory. The values are null strings, representing all the MANIFEST comments that aren't there.

maniread($file) reads a named MANIFEST file (defaults to MANIFEST in the current directory) and returns a hash reference, the keys of which are the filenames, and the values of which are the comments that are there. Er, which may be null if the comments aren't there. . . .

manicopy($read, $target, $how) copies the files that are the keys in the hash %$read to the named target directory. The hash reference $read is typically returned by the maniread() function. manicopy() is useful for producing a directory tree identical to the intended distribution tree. The third parameter $how can be used to specify a different method of "copying". Valid values are "cp", which actually copies the files, "ln", which creates hard links, and "best", which mostly links the files but copies any symbolic link to make a tree without any symbolic link. "best" is the default, though it may not be the best default.
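Here is a small sketch of maniread() in use. It writes a throwaway MANIFEST into a temporary directory so that it does not touch real files; the two entries and their comments are invented for the example.

```perl
use strict;
use warnings;
use ExtUtils::Manifest qw(maniread);
use File::Temp qw(tempdir);

# Build a throwaway MANIFEST so the example does not touch real files.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/MANIFEST";
open(my $fh, '>', $file) or die "Cannot write $file: $!";
print {$fh} "Makefile.PL\tbuild script\n";
print {$fh} "lib/Foo.pm\tthe module itself\n";
close($fh) or die "Cannot close $file: $!";

# maniread() returns a hash reference: filename => comment.
my $read = maniread($file);
for my $name (sort keys %$read) {
    print "$name: $read->{$name}\n";
}
```

The same hash reference is what you would hand to manicopy() as its first argument.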

Ignoring files

The MANIFEST.SKIP file may contain regular expressions of files that should be ignored by mkmanifest() and filecheck(). The regular expressions should appear one on each line. A typical example:

\bRCS\b
^MANIFEST\.
(?i)^makefile$
~$
\.html$
\.old$
^blib/
^MakeMaker-\d
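To see what these patterns do, you can apply them to a list of filenames yourself. The sketch below is only an illustration of regular-expression filtering, not the module's own implementation; the helper keep_files and the sample filenames are made up.

```perl
use strict;
use warnings;

# Patterns copied from the MANIFEST.SKIP example above.
my @skip = map { qr/$_/ } (
    '\bRCS\b', '^MANIFEST\.', '(?i)^makefile$', '~$',
    '\.html$', '\.old$', '^blib/', '^MakeMaker-\d',
);

# Return the filenames that match none of the patterns.
sub keep_files {
    my ($patterns, @files) = @_;
    return grep {
        my $file = $_;
        !grep { $file =~ $_ } @$patterns;
    } @files;
}

my @kept = keep_files(\@skip, qw(
    Makefile Makefile.PL MANIFEST MANIFEST.SKIP
    lib/Foo.pm blib/lib/Foo.pm notes.old Foo.pm~
));
print "$_\n" for @kept;
```

Note that MANIFEST itself survives, because the second pattern requires a dot after the name.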

Exportability

mkmanifest(), manicheck(), filecheck(), fullcheck(), maniread(), and manicopy() are exportable.

Global variables

$ExtUtils::Manifest::MANIFEST defaults to MANIFEST. Changing it results in both a different MANIFEST and a different MANIFEST.SKIP file. This is useful if you want to maintain different distributions for different audiences (say a user version and a developer version including RCS).

$ExtUtils::Manifest::Quiet defaults to 0. You can set it to a true value to get all the functions to shut up already.

Diagnostics

All diagnostic output is sent to STDERR.

Not in MANIFEST: file

A file was found below the current directory that is not listed in the MANIFEST file and is not excluded by a regular expression in MANIFEST.SKIP.

No such file: file

A file mentioned in a MANIFEST file does not exist.

MANIFEST: $!

The MANIFEST file could not be opened.

Added to MANIFEST: file

Reported by mkmanifest() if $Verbose is set and a file is added to MANIFEST. $Verbose is set to 1 by default.

See also

The ExtUtils::MakeMaker library module generates a Makefile with handy targets for most of this functionality.

ExtUtils::Miniperl--Write the C Code for perlmain.c

use ExtUtils::Miniperl;
writemain(@directories);

writemain() takes an argument list of directories containing archive libraries that are needed by Perl modules and that should be linked into a new Perl binary. It correspondingly writes to STDOUT a file intended to be compiled as perlmain.c that contains all the bootstrap code to make the modules associated with the libraries available from within Perl.

The typical usage is from within a Makefile generated by ExtUtils::MakeMaker. So under normal circumstances you won't have to deal with this module directly.

WARNING:

This entire module is automatically generated from a script called minimod.PL when Perl itself is built. So if you want to patch it, please patch minimod.PL in the Perl distribution instead.

ExtUtils::Mkbootstrap--Make a Bootstrap File for Use by DynaLoader

use ExtUtils::Mkbootstrap;
mkbootstrap();

mkbootstrap() typically gets called from an extension's Makefile. It writes a *.bs file that is needed by some architectures to do dynamic loading. It is otherwise unremarkable, and MakeMaker usually handles the details. If you need to know more about it, you've probably already read the module.

ExtUtils::Mksymlists--Write Linker Option Files for Dynamic Extension

use ExtUtils::Mksymlists;
Mksymlists(  NAME     => $name,
             DL_FUNCS => { $pkg1 => [$func1, $func2], $pkg2 => [$func3] },
             DL_VARS  => [$var1, $var2, $var3]);

ExtUtils::Mksymlists produces files used by the linker under some OSes during the creation of shared libraries for dynamic extensions. It is normally called from a MakeMaker-generated Makefile when the extension is built. The linker option file is generated by calling the function Mksymlists(), which is exported by default from ExtUtils::Mksymlists. It takes one argument, a list of key/value pairs, in which the following keys are recognized:

NAME

This gives the name of the extension (for example, Tk::Canvas) for which the linker option file will be produced.

DL_FUNCS

This is identical to the DL_FUNCS attribute available via MakeMaker, from which it is usually taken. Its value is a reference to a hash, in which each key is the name of a package, and each value is a reference to an array of function names, which should be exported by the extension. So, one might say:

DL_FUNCS => {
     Homer::Iliad   => [ qw(trojans greeks) ],
     Homer::Odyssey => [ qw(travelers family suitors) ],
},

The function names should be identical to those in the XSUB code; Mksymlists() will alter the names written to the linker option file to match the changes made by xsubpp. In addition, if none of the functions in a list begins with the string "boot_", Mksymlists() will add a bootstrap function for that package, just as xsubpp does. (If a boot_pkg function is present in the list, it is passed through unchanged.) If DL_FUNCS is not specified, it defaults to the bootstrap function for the extension specified in NAME.

DL_VARS

This is identical to the DL_VARS attribute available via MakeMaker, and, like DL_FUNCS, it is usually specified via MakeMaker. Its value is a reference to an array of variable names that should be exported by the extension.

FILE

This key can be used to specify the name of the linker option file (minus the OS-specific extension) if for some reason you do not want to use the default value, which is the last word of the NAME attribute (for example, for Tk::Canvas, FILE defaults to Canvas).

FUNCLIST

This provides an alternate means to specify function names to be exported from the extension. Its value is a reference to an array of function names to be exported. These names are passed through unaltered to the linker options file.

DLBASE

This item specifies the name by which the linker knows the extension, which may be different from the name of the extension itself (for instance, some linkers add an "_" to the name of the extension). If it is not specified, it is derived from the NAME attribute. It is presently used only by OS/2.

When calling Mksymlists(), one should always specify the NAME attribute. In most cases, this is all that's necessary. In the case of unusual extensions, however, the other attributes can be used to provide additional information to the linker.

ExtUtils::MM_OS2--Methods to Override UNIX Behavior in ExtUtils::MakeMaker

use ExtUtils::MM_OS2; # Done internally by ExtUtils::MakeMaker if needed

See ExtUtils::MM_Unix for documentation of the methods provided there. This package overrides the implementation of the methods, not the interface.

ExtUtils::MM_Unix--Methods Used by ExtUtils::MakeMaker

require ExtUtils::MM_Unix;

The methods provided by this package (and by the other MM_* packages) are designed to be used in conjunction with ExtUtils::MakeMaker. You will never require this module yourself. You would only define methods in this or a similar module if you're working on improving the porting capabilities of MakeMaker. Nevertheless, this is a laudable goal, so we'll talk about it here.

When MakeMaker writes a Makefile, it creates one or more objects that inherit their methods from package MM. MM itself doesn't provide any methods, but it inherits from the ExtUtils::MM_Unix class. However, for certain platforms, it also inherits from an OS-specific module such as MM_VMS, and it does this before it inherits from the MM_Unix module in the @ISA list. The inheritance tree of MM therefore lets the OS-specific package override any of the methods listed here. In a sense, the MM_Unix package is slightly misnamed, since it provides fundamental methods on non-UNIX systems too, to the extent that the system is like UNIX.

MM methods

We've avoided listing deprecated methods here, as well as any private methods you're unlikely to want to override.

catdir LIST

Concatenates two or more directory names to form a complete path ending with a directory. On UNIX it just glues them together with a / character.

catfile LIST

Concatenates one or more directory names and a filename to form a complete path ending with a filename. Also uses / on UNIX.

dir_target

Takes an array of directories that need to exist and returns a Makefile entry for a .exists file in these directories. Returns nothing if the entry has already been processed. We're helpless, though, if the same directory comes as $(FOO) and as bar. Both of them get an entry; that's why we use "::".

file_name_is_absolute FILENAME

Takes as argument a path and returns true if it is an absolute path.

find_perl VERSION, NAMES, DIRS, TRACE

Searches for an executable Perl that is at least the specified VERSION, named by one of the entries in NAMES (an array reference), and located in one of the entries of DIRS (also an array reference). It prints debugging info if TRACE is true.

guess_name

Guesses the name of this package by examining the working directory's name. MakeMaker calls this only if the developer has not supplied a NAME attribute. Shame on you.

has_link_code

Returns true if C, XS, MYEXTLIB or similar objects exist within this object that need a compiler. Does not descend into subdirectories as needs_linking() does.

libscan FILENAME

Takes a path to a file that is found by init_dirscan() and returns false if we don't want to include this file in the library. It is mainly used to exclude RCS/, CVS/, and SCCS/ directories from installation.

lsdir DIR, REGEXP

Takes as arguments a directory name and a regular expression. Returns all entries in the directory that match the regular expression.

maybe_command_in_dirs

Method under development. Not yet used.

maybe_command FILENAME

Returns true if the argument is likely to be a command.

needs_linking

Does this module need linking? Looks into subdirectory objects, if any. (See also has_link_code().)

nicetext TARGET

(A misnamed method.) The MM_Unix version of the method just returns the argument without further processing. On VMS, this method ensures that colons marking targets are preceded by space. Most UNIX makes don't need this, but it's necessary under VMS to distinguish the target delimiter from a colon appearing as part of a filespec.

path

Takes no argument. Returns the environment variable PATH as an array.

perl_script FILENAME

Returns true if the argument is likely to be a Perl script. With MM_Unix this is true for any ordinary, readable file.

prefixify ATTRNAME, OLDPREFIX, NEWPREFIX

Processes a path attribute in $self->{ ATTRNAME }. First it looks it up for you in %Config if it doesn't have a value yet. Then it replaces (in-place) the OLDPREFIX with the NEWPREFIX (if it matches).

replace_manpage_separator FILENAME

Takes the filename of a package, which if it's a nested package will have a name of the form "Foo/Bar" (under UNIX), and replaces the subdirectory delimiter with "::". Returns the altered name.

Methods to produce chunks of text for the Makefile

When MakeMaker thinks it has all its ducks in a row, it calls a special sequence of methods to produce the Makefile for a given MakeMaker object. The list of methods it calls is specified in the array @ExtUtils::MakeMaker::MM_Sections, one method per section. Since these routines are all called the same way, we won't document each of them separately, except to list them.

By far the most accurate and up-to-date documentation for what each method does is actually the Makefile that MakeMaker produces. Each section of the file is labeled with the name of the method that produces it, so once you see how you want to change the Makefile, it's a trivial matter to work back from the proposed change and find the method responsible for it.

You've plowed through a lot of ugly things to get here, but since you've read this far, we'll reward you by pointing out something incredibly beautiful in MakeMaker. The arguments (if any) that are passed to each method are simply the pseudo-attributes of the same name that you already saw documented under "Additional Lowercase Attributes" in the section on ExtUtils::MakeMaker. You'll recall that those pseudo-attributes were specified as anonymous hashes, which Just Happen to have exactly the same syntax inside as named parameters. Fancy that. So the arguments just come right into your method as ordinary named parameters. Assign the arguments to a hash, and off you go. And it's completely forward and backward compatible. Even if you override a method that didn't have arguments before, there's no problem. Since it's all driven off the method name, just name your new pseudo-attribute after your method, and your method will get its arguments.

The return values are also easy to understand: each method simply returns the string it wants to put into its section of the Makefile.

Two special methods are post_initialize() and postamble(), each of which returns an empty string by default. You can define them in your Makefile.PL to insert customized text near the beginning or end of the Makefile.
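As a rough sketch of how the named parameters arrive, here is a postamble() override of the kind you might put in a Makefile.PL. The parameter name TARGET and the extension name Homer::Iliad are invented for illustration, and the direct call at the end merely stands in for MakeMaker, which would normally make the call for you.

```perl
use strict;
use warnings;

# Hypothetical override, as it might appear in a Makefile.PL.
sub MY::postamble {
    my ($self, %args) = @_;                 # named parameters land in a hash
    my $target = $args{TARGET} || 'docs';   # TARGET is an invented parameter
    # Return the text for this section of the Makefile.
    return "${target}:\n\t\@echo Building ${target}\n";
}

# Stand-in for MakeMaker, which would call the method itself after, say:
#   WriteMakefile(NAME => 'Homer::Iliad', postamble => { TARGET => 'docs' });
my $text = MY->postamble(TARGET => 'docs');
print $text;
```

The method does nothing but return a string, which is all MakeMaker expects of it.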

Here are the methods. They're called in this order (reading down the columns):

post_initialize()   top_targets()   realclean()
const_config()      linkext()       dist_basics()
constants()         dlsyms()        dist_core()
const_loadlibs()    dynamic()       dist_dir()
const_cccmd()       dynamic_bs()    dist_test()
tool_autosplit()    dynamic_lib()   dist_ci()
tool_xsubpp()       static()        install()
tools_other()       static_lib()    force()
dist()              installpm()     perldepend()
macro()             installpm_x()   makefile()
depend()            manifypods()    staticmake()
post_constants()    processPL()     test()
pasthru()           installbin()    test_via_harness()
c_o()               subdirs()       test_via_script()
xs_c()              subdir_x()      postamble()
xs_o()              clean()

See also

ExtUtils::MakeMaker library module.

ExtUtils::MM_VMS--Methods to Override UNIX Behavior in ExtUtils::MakeMaker

use ExtUtils::MM_VMS; # Done internally by ExtUtils::MakeMaker if needed

See ExtUtils::MM_Unix for documentation of the methods provided there. This package overrides the implementation of the methods, not the interface.

Fcntl--Load the C fcntl.h Defines

use Fcntl;
$nonblock_flag = O_NDELAY();
$create_flag = O_CREAT();
$read_write_flag = O_RDWR();

This module is just a translation of the C fcntl.h file. Unlike the old mechanism, which required a translated fcntl.ph file, Fcntl uses the h2xs program (see the Perl source distribution) and your native C compiler. This means that it has a much better chance of getting the numbers right.

Note that only #define symbols get translated; you must still correctly pack up your own arguments to pass as arguments for locking functions and so on.

The following routines are exported by default, and each routine returns the value of the #define that is the same as the routine name:

FD_CLOEXEC F_DUPFD F_GETFD F_GETFL F_GETLK F_RDLCK
F_SETFD F_SETFL F_SETLK F_SETLKW F_UNLCK F_WRLCK
O_APPEND O_CREAT O_EXCL O_NDELAY O_NOCTTY  
O_NONBLOCK O_RDONLY O_RDWR O_TRUNC O_WRONLY  
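Here is a small sketch of the O_* values at work with sysopen. The scratch filename is made up for the occasion, and we assume a writable temporary directory named by TMPDIR or else /tmp.

```perl
use strict;
use warnings;
use Fcntl;    # exports O_RDWR, O_CREAT, O_EXCL, and friends by default

# Hypothetical scratch file; any writable location would do.
my $path = ($ENV{TMPDIR} || "/tmp") . "/fcntl_demo.$$";
unlink $path;

# O_CREAT creates the file; O_EXCL makes the call fail if it already exists.
my $flags = O_RDWR | O_CREAT | O_EXCL;
my $first_ok = sysopen(my $fh, $path, $flags, 0644) ? 1 : 0;
close $fh if $first_ok;

# The same call again should fail, since the file now exists.
my $second_ok = sysopen(my $fh2, $path, $flags, 0644) ? 1 : 0;
close $fh2 if $second_ok;

print "first open: $first_ok, second open: $second_ok\n";
unlink $path;
```

Because the values come from your system's own fcntl.h, the same script should behave the same way wherever the flags themselves are supported.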

File::Basename--Parse File Specifications

use File::Basename;
($name, $path, $suffix) = fileparse($fullname, @suffixlist);
fileparse_set_fstype($os_string);  # $os_string specifies OS type
$basename = basename($fullname, @suffixlist);
$dirname = dirname($fullname);
($name, $path, $suffix) = fileparse("lib/File/Basename.pm", '\.pm');
fileparse_set_fstype("VMS");
$basename = basename("lib/File/Basename.pm", ".pm");
$dirname = dirname("lib/File/Basename.pm");

These routines allow you to parse file specifications into useful pieces using the syntax of different operating systems.

fileparse_set_fstype

You select the syntax via the routine fileparse_set_fstype(). If the argument passed to it contains one of the substrings "VMS", "MSDOS", or "MacOS", the file specification syntax of that operating system is used in future calls to fileparse(), basename(), and dirname(). If it contains none of these substrings, UNIX syntax is used. This pattern matching is case-insensitive. If you've selected VMS syntax and the file specification you pass to one of these routines contains a /, it assumes you are using UNIX emulation and applies the UNIX syntax rules instead for that function call only. If you haven't called fileparse_set_fstype(), the syntax is chosen by examining the osname entry from the Config package according to these rules.

fileparse

The fileparse() routine divides a file specification into three parts: a leading path, a file name, and a suffix. The path contains everything up to and including the last directory separator in the input file specification. The remainder of the input file specification is then divided into name and suffix based on the optional patterns you specify in @suffixlist. Each element of this list is interpreted as a regular expression, and is matched against the end of name. If this succeeds, the matching portion of name is removed and prepended to suffix. By proper use of @suffixlist, you can remove file types or versions for examination. You are guaranteed that if you concatenate path, name, and suffix together in that order, the result will be identical to the input file specification. Using UNIX file syntax:

($name, $path, $suffix) = fileparse('/virgil/aeneid/draft.book7',
                                                  '\.book\d+');

would yield:

$name   eq 'draft'
$path   eq '/virgil/aeneid/'
$suffix eq '.book7'

(Note that the suffix pattern is in single quotes. You'd have to double the backslashes if you used double quotes, since double quotes do backslash interpretation.) Similarly, using VMS syntax:

($name, $path, $suffix) = fileparse('Doc_Root:[Help]Rhetoric.Rnh', '\..*');

would yield:

$name   eq 'Rhetoric'
$path   eq 'Doc_Root:[Help]'
$suffix eq '.Rnh'

basename

The basename() routine returns the first element of the list produced by calling fileparse() with the same arguments. It is provided for compatibility with the UNIX shell command basename (1).

dirname

The dirname() routine returns the directory portion of the input file specification. When using VMS or MacOS syntax, this is identical to the second element of the list produced by calling fileparse() with the same input file specification. When using UNIX or MS-DOS syntax, the return value conforms to the behavior of the UNIX shell command dirname (1). This is usually the same as the behavior of fileparse(), but differs in some cases. For example, for the input file specification lib/, fileparse() considers the directory name to be lib/, while dirname() considers the directory name to be . (dot).
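The following sketch exercises the three routines together, mostly to show the round-trip guarantee of fileparse() and the lib/ case where dirname() and fileparse() disagree. The file specifications are the ones used in this section; the variable names are ours, and the type string "Unix" is simply one that contains none of the special substrings.

```perl
use strict;
use warnings;
use File::Basename;

# UNIX syntax: the path keeps its trailing separator.
fileparse_set_fstype("Unix");
my $spec = '/virgil/aeneid/draft.book7';
my ($name, $path, $suffix) = fileparse($spec, '\.book\d+');
my $rejoined = $path . $name . $suffix;     # should equal $spec again

my $base = basename("lib/File/Basename.pm", ".pm");
my $dir  = dirname("lib/");                 # follows dirname (1)
my (undef, $fp_dir) = fileparse("lib/");    # fileparse() keeps "lib/"

# VMS syntax, selected explicitly for the next call only.
fileparse_set_fstype("VMS");
my @vms = fileparse('Doc_Root:[Help]Rhetoric.Rnh', '\..*');
fileparse_set_fstype("Unix");

print "name=$name path=$path suffix=$suffix\n";
print "basename=$base dirname=$dir fileparse dir=$fp_dir\n";
print "VMS: @vms\n";
```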

File::CheckTree--Run Many Tests on a Collection of Files

use File::CheckTree;
$warnings += validate( q{
    /vmunix                 -e || die
    /boot                   -e || die
    /bin                    cd
        csh                 -ex
        csh                 !-ug
        sh                  -ex
        sh                  !-ug
    /usr                    -d || warn "What happened to $file?\n"
});

The validate() routine takes a single multi-line string, each line of which contains a filename plus a file test to try on it. (The file test may be given as "cd", causing subsequent relative filenames to be interpreted relative to that directory.) After the file test you may put "|| die" to make it a fatal error if the file test fails. The default is:

|| warn

You can reverse the sense of the test by prepending "!". If you specify "cd" and then list some relative filenames, you may want to indent them slightly for readability. If you supply your own die or warn message, you can use $file to interpolate the filename.

File tests may be grouped: -rwx tests for all of -r, -w, and -x. Only the first failed test of the group will produce a warning.

validate() returns the number of warnings issued, presuming it didn't die.

File::Copy--Copy Files or Filehandles

use File::Copy;
copy("src-file", "dst-file");
copy("Copy.pm", \*STDOUT);
use POSIX;
use File::Copy 'cp';
$fh = FileHandle->new("/dev/null", "r");
cp($fh, "dst-file");

The Copy module provides one function, copy(), that takes two parameters: a file to copy from and a file to copy to. Either argument may be a string, a FileHandle reference, or a FileHandle glob. If the first argument is a filehandle of some sort, it will be read from; if it is a filename, it will be opened for reading. Likewise, the second argument will be written to (and created if need be).

An optional third parameter is a hint that requests the buffer size to be used for copying. This is the number of bytes from the first file that will be held in memory at any given time, before being written to the second file. The default buffer size depends upon the file and the operating system, but will generally be the whole file (up to 2MB), or 1KB for filehandles that do not reference files (for example, sockets).

When running under VMS, this routine performs an RMS copy of the file, in order to preserve file attributes, indexed file structure, and so on. The buffer size parameter is ignored.

You may use the syntax:

use File::Copy "cp";

to get at the cp() alias for the copy() function. The syntax is exactly the same.

copy() returns 1 on success, 0 on failure; $! will be set if an error was encountered.
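A minimal sketch of checking that return value follows. The scratch filenames are invented, and we assume a writable temporary directory named by TMPDIR or else /tmp.

```perl
use strict;
use warnings;
use File::Copy;

# Hypothetical scratch files.
my $dir = $ENV{TMPDIR} || "/tmp";
my ($src, $dst) = ("$dir/copy_demo_src.$$", "$dir/copy_demo_dst.$$");

open(my $out, ">", $src) or die "Can't write $src: $!";
print $out "arma virumque cano\n";
close $out;

# copy() is false on failure, with the reason left in $!.
copy($src, $dst) or die "Can't copy $src to $dst: $!";

open(my $in, "<", $dst) or die "Can't read $dst: $!";
my $copied = <$in>;
close $in;

# Copying from a file that does not exist should simply return false.
my $failed = copy("$src.missing", $dst) ? 0 : 1;
print "copied: $copied", "missing source failed: $failed\n";

unlink $src, $dst;
```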

File::Find--Traverse a File Tree

use File::Find;
find(\&wanted, 'dir1', 'dir2'...);
sub wanted { ... }
use File::Find;
finddepth(\&wanted, 'dir1', 'dir2'...);  # traverse depth-first
sub wanted { ... }

find() is similar to the UNIX find (1) command in that it traverses the specified directories, performing whatever tests or other actions you request. However, these actions are given in the subroutine, wanted(), which you must define (but see find2perl below). For example, to print out the names of all executable files, you could define wanted() this way:

sub wanted {
    print "$File::Find::name\n" if -x;
}

$File::Find::dir contains the current directory name, and $_ the current filename within that directory. $File::Find::name contains "$File::Find::dir/$_". You are chdired to $File::Find::dir when wanted() is called. You can set $File::Find::prune to true in wanted() in order to prune the tree; that is, find() will not descend into any directory when $File::Find::prune is set.
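To make the pruning concrete, here is a sketch that collects plain files while refusing to descend into any directory named CVS. The little tree it builds and searches is invented for the demonstration, and we assume a writable temporary directory named by TMPDIR or else /tmp.

```perl
use strict;
use warnings;
use File::Find;
use File::Path;

# Build a small hypothetical tree to search.
my $root = ($ENV{TMPDIR} || "/tmp") . "/find_demo.$$";
mkpath(["$root/src", "$root/CVS"]);
foreach my $file ("$root/src/a.pl", "$root/CVS/Entries") {
    open(my $fh, ">", $file) or die "Can't create $file: $!";
    close $fh;
}

my @found;
sub wanted {
    # $_ is the entry's name within $File::Find::dir.
    if (-d $_ and $_ eq 'CVS') {
        $File::Find::prune = 1;     # do not descend into CVS
        return;
    }
    push @found, $File::Find::name if -f $_;
}

find(\&wanted, $root);
print "$_\n" foreach sort @found;
rmtree($root);
```

Only the file under src should be reported; the Entries file is never visited because its directory was pruned.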

This library is primarily for use with the find2perl (1) command, which is supplied with the standard Perl distribution and converts a find (1) invocation to an appropriate wanted() subroutine. The command:

find2perl / -name .nfs\* -mtime +7 \
             -exec rm -f {} \; -o -fstype nfs -prune

produces something like:

sub wanted {
    /^\.nfs.*$/ &&
    (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
    int(-M _) > 7 &&
    unlink($_)
    ||
    ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
    $dev < 0 &&
    ($File::Find::prune = 1);
}

Set the variable $File::Find::dont_use_nlink if you're using the AFS.

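finddepth() is just like find(), except that it does a depth-first search: wanted() is called for the contents of each directory before it is called for the directory itself.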

Here's another interesting wanted() function. It will find all symbolic links that don't resolve:

sub wanted {
    -l and not -e and print "bogus link: $File::Find::name\n";
}

File::Path--Create or Remove a Series of Directories

use File::Path;
mkpath(['/foo/bar/baz', 'blurfl/quux'], 1, 0711);
rmtree(['/foo/bar/baz', 'blurfl/quux'], 1, 1);

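The mkpath() function provides a convenient way to create directories, even if your mkdir (2) won't create more than one level of directory at a time. mkpath() takes three arguments: the name of the path to create (or a reference to a list of paths to create), a Boolean value that, if true, causes mkpath() to print the name of each directory as it is created (the default is false), and the numeric mode to use when creating the directories (the default is 0777).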

It returns a list of all directories created, including intermediate directories, which are assumed to be delimited by the UNIX path separator, /.

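Similarly, the rmtree() function provides a convenient way to delete a subtree from the directory structure, much like the UNIX rm -r command. rmtree() takes three arguments: the root of the subtree to delete (or a reference to a list of roots), a Boolean value that, if true, causes rmtree() to print a message each time it examines a file (the default is false), and a Boolean value that, if true, causes rmtree() to skip any files to which you do not have delete access (under VMS) or write access (under other operating systems). The default for this last argument is also false.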

rmtree() returns the number of files successfully deleted. Symbolic links are treated as ordinary files.
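Here is a brief sketch of the two functions used as a pair, with both return values captured. The directory names are invented, the second argument is passed as false to keep both calls quiet, and we assume a writable temporary directory named by TMPDIR or else /tmp.

```perl
use strict;
use warnings;
use File::Path;

# Hypothetical scratch location.
my $base = ($ENV{TMPDIR} || "/tmp") . "/path_demo.$$";

# Create several nested levels in one call, quietly, with mode 0711.
my @created = mkpath("$base/foo/bar/baz", 0, 0711);
my $exists  = -d "$base/foo/bar/baz" ? 1 : 0;
print "created: $_\n" foreach @created;

# Remove the whole subtree again, quietly, without skipping anything.
my $removed = rmtree($base, 0, 0);
my $gone    = -e $base ? 0 : 1;
print "rmtree removed $removed entries\n";
```

Note that the list from mkpath() includes the intermediate directories, not just the one you asked for.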

FileCache--Keep More Files Open Than the System Permits

use FileCache;
cacheout $path;         # open the file whose path name is $path
print $path "stuff\n";  # print stuff to file given by $path

The cacheout() subroutine makes sure that the file whose name is $path is created and accessible through the filehandle also named $path. It permits you to write to more files than your system allows to be open at once, performing the necessary opens and closes in the background. By preceding each file access with:

cacheout $path;

you can be sure that the named file will be open and ready to do business. However, you do not need to invoke cacheout() between successive accesses to the same file.

cacheout() does not create directories for you. If you use it to open an existing file that FileCache is seeing for the first time, the file will be truncated to zero length with no questions asked. (However, in its opening and closing of files in the background, cacheout() keeps track of which files it has opened before and does not overwrite them, but appends to them instead.)

cacheout() checks the value of NOFILE in sys/param.h to determine the number of open files allowed. This value is incorrect on some systems, in which case you should set $FileCache::maxopen to be four less than the correct value for NOFILE.
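The sketch below scatters a few records across two output files, calling cacheout before each access. The filenames and records are invented, and we assume a writable temporary directory named by TMPDIR or else /tmp. The script deliberately runs without use strict, since the path string doubles as the filehandle name, which is a symbolic reference.

```perl
use FileCache;

# Hypothetical output files, one per record key.
my $dir   = $ENV{TMPDIR} || "/tmp";
my @paths = map { "$dir/cache_demo_$_.$$" } qw(a b);

my @records = ([a => "alpha"], [b => "beta"], [a => "again"]);
foreach my $rec (@records) {
    my ($key, $text) = @$rec;
    my $path = "$dir/cache_demo_$key.$$";
    cacheout $path;             # make sure the file is open
    print $path "$text\n";      # the path string serves as the filehandle
}

# Close the handles by name so the data is flushed before we look at it.
close($_) foreach @paths;

my %lines;
foreach my $path (@paths) {
    open(my $in, "<", $path) or die "Can't read $path: $!";
    $lines{$path} = [<$in>];
    close $in;
}
print "$_: ", scalar(@{ $lines{$_} }), " line(s)\n" foreach @paths;
unlink @paths;
```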

FileHandle--Supply Object Methods for Filehandles

use FileHandle;
$fh = new FileHandle;
if ($fh->open("< file")) {
    print <$fh>;
    $fh->close;
}
$fh = new FileHandle "> file";
if (defined $fh) {
    print $fh "bar\n";
    $fh->close;
}
$fh = new FileHandle "file", "r";
if (defined $fh) {
    print <$fh>;
    undef $fh;       # automatically closes the file
}
$fh = new FileHandle "file", O_WRONLY|O_APPEND;
if (defined $fh) {
    print $fh "stuff\n";
    undef $fh;       # automatically closes the file
}
$pos = $fh->getpos;
$fh->setpos($pos);
$fh->setvbuf($buffer_var, _IOLBF, 1024);
($readfh, $writefh) = FileHandle::pipe;
autoflush STDOUT 1;

new

Creates a FileHandle, which is a reference to a newly created symbol (see the Symbol library module). If it receives any parameters, they are passed to open(). If the open fails, the FileHandle object is destroyed. Otherwise, it is returned to the caller.

new_from_fd

Creates a FileHandle like new() does. It requires two parameters, which are passed to fdopen(); if the fdopen() fails, the FileHandle object is destroyed. Otherwise, it is returned to the caller.

open

Accepts one parameter or two. With one parameter, it is just a front end for the built-in open function. With two parameters, the first parameter is a filename that may include whitespace or other special characters, and the second parameter is the open mode in either Perl form (">", "+<", and so on) or POSIX form ("w", "r+", and so on).

fdopen

Like open() except that its first parameter is not a filename but rather a filehandle name, a FileHandle object, or a file descriptor number.

getpos

If the C functions fgetpos (3) and fsetpos (3) are available, then getpos() returns an opaque value that represents the current position of the FileHandle, and setpos() uses that value to return to a previously visited position.

setvbuf

If the C function setvbuf (3) is available, then setvbuf() sets the buffering policy for the FileHandle. The calling sequence for the Perl function is the same as its C counterpart, including the macros _IOFBF, _IOLBF, and _IONBF, except that the buffer parameter specifies a scalar variable to use as a buffer.

WARNING:

A variable used as a buffer by setvbuf() must not be modified in any way until the FileHandle is closed or until setvbuf() is called again, or memory corruption may result!

The following supported FileHandle methods are just front ends for the corresponding built-in Perl functions:

clearerr getc
close gets
eof seek
fileno tell

The following supported FileHandle methods correspond to Perl special variables:

autoflush format_page_number
format_formfeed format_top_name
format_line_break_characters input_line_number
format_lines_left input_record_separator
format_lines_per_page output_field_separator
format_name output_record_separator

Furthermore, for doing normal I/O you might need these methods:

$fh->print

See Perl's built-in print function.

$fh->printf

See Perl's built-in printf function.

$fh->getline

This method works like Perl's <FILEHANDLE> construct, except that it can be safely called in an array context, where it still returns just one line.

$fh->getlines

This method works like Perl's <FILEHANDLE> construct when called in an array context to read all remaining lines in a file. It will also croak() if accidentally called in a scalar context.
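As a quick sketch of the difference between the two, the following writes three lines to a scratch file and reads them back, one with getline() and the rest with getlines(). The filename is invented, and we assume a writable temporary directory named by TMPDIR or else /tmp.

```perl
use strict;
use warnings;
use FileHandle;

# Hypothetical scratch file.
my $file = ($ENV{TMPDIR} || "/tmp") . "/fh_demo.$$";

# Two-argument form: filename plus a POSIX-style mode.
my $out = FileHandle->new($file, "w") or die "Can't write $file: $!";
$out->print("first\n", "second\n", "third\n");
$out->close;

my $in = FileHandle->new($file, "r") or die "Can't read $file: $!";
my $first = $in->getline;     # exactly one line
my @rest  = $in->getlines;    # everything that is left
$in->close;
unlink $file;

print "first: $first";
print "rest: ", scalar(@rest), " lines\n";
```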

Bugs

Due to backward compatibility, all filehandles resemble objects of class FileHandle, or actually classes derived from that class. But they aren't. Which means you can't derive your own class from FileHandle and inherit those methods.

While it may look as though the filehandle methods corresponding to the built-in variables are unique to a particular filehandle, currently some of them are not, including the following:

input_line_number()
input_record_separator()
output_record_separator()

GDBM_File--Tied Access to GDBM Library

use GDBM_File;
tie %hash, "GDBM_File", $filename, &GDBM_WRCREAT, 0644;
# read/writes of %hash are now read/writes of $filename
untie %hash;

GDBM_File is a module that allows Perl programs to make use of the facilities provided by the GNU gdbm library. If you intend to use this module, you should have a copy of the gdbm (3) manpage at hand.

Most of the libgdbm.a functions are available as methods of the GDBM_File interface.

Availability

gdbm is available from any GNU archive. The master site is prep.ai.mit.edu, but you are strongly urged to use one of the many mirrors. You can obtain a list of mirror sites by issuing the command finger fsf@prep.ai.mit.edu. A copy is also stored on CPAN.

See also

DB_File library module.

Getopt::Long--Extended Processing of Command-Line Options

use Getopt::Long;
$result = GetOptions(option-descriptions);

The Getopt::Long module implements an extended function called GetOptions(). This function retrieves and processes the command-line options with which your Perl program was invoked, based on the description of valid options that you provide.

GetOptions() adheres to the POSIX syntax for command-line options, with GNU extensions. In general, this means that options have long names instead of single letters, and are introduced with a double hyphen --. (A single hyphen can also be used, but implies restrictions on functionality. See later in the chapter.) There is no bundling of command-line options, as was the case with the more traditional single-letter approach. For example, the UNIX ps (1) command can be given the command-line argument:

-vax

which means the combination of -v, -a and -x. With the Getopt::Long syntax, -vax would be a single option.

Command-line options can be used to set values. These values can be specified in one of two ways:

--size 24
--size=24

GetOptions() is called with a list of option descriptions, each of which consists of two elements: the option specifier and the option linkage. The option specifier defines the name of the option and, optionally, the value it can take. The option linkage is usually a reference to a variable that will be set when the option is used. For example, the following call to GetOptions():

&GetOptions("size=i" => \$offset);

will accept a command-line option "size" that must have an integer value. With a command line of --size 24 this will cause the variable $offset to be assigned the value 24.

Alternatively, the first argument to GetOptions may be a reference to a hash describing the linkage for the options. The following call is equivalent to the example above:

%optctl = (size => \$offset);
&GetOptions(\%optctl, "size=i");

Linkage may be specified using either of the above methods, or both. The linkage specified in the argument list takes precedence over the linkage specified in the hash.

The command-line options are implicitly taken from array @ARGV. Upon completion of GetOptions(), @ARGV will contain only the command-line arguments that were not options. (But see below for a way to process non-option arguments.) Each option specifier handed to GetOptions() designates the name of an option, possibly followed by an argument specifier. Values for argument specifiers are:

<none>

Option does not take an argument. If the user invokes the option, the option variable will be set to 1.

!

Option does not take an argument and may be negated, that is, prefixed by "no". For example, foo! will allow --foo (with value 1 being assigned to the option variable) and -nofoo (with value 0).

=s

Option takes a mandatory string argument. This string will be assigned to the option variable. Even if the string argument starts with - or --, it will be assigned to the option variable rather than taken as a separate option.

:s

Option takes an optional string argument. This string will be assigned to the option variable. If the string is omitted from the command invocation, "" (an empty string) will be assigned to the option variable. If the string argument starts with - or --, it will be taken as another option rather than assigned to the option variable.

=i

Option takes a mandatory integer argument. This value will be assigned to the option variable. Note that the value may start with - to indicate a negative value.

:i

Option takes an optional integer argument. This integer value will be assigned to the option variable. If the optional argument is omitted, the value 0 will be assigned to the option variable. The value may start with - to indicate a negative value.

=f

Option takes a mandatory floating-point argument. This value will be assigned to the option variable. Note that the value may start with - to indicate a negative value.

:f

Option takes an optional floating-point argument. This value will be assigned to the option variable. If the optional argument is omitted, the value 0 will be assigned to the option variable. The value may start with - to indicate a negative value.

A lone hyphen - is considered an option; the corresponding option name is the empty string.

A lone double hyphen -- terminates the processing of options and arguments. Any options following the double hyphen will remain in @ARGV when GetOptions() returns.

If an argument specifier concludes with @ (as in =s@), then the option is treated as an array. That is, multiple invocations of the same option, each with a particular value, will result in the list of values being assigned to the option variable, which is an array. See the following section for an example.
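By way of illustration, here is a sketch that combines three of these specifiers in one call. The option names and the stand-in command line are invented for the example.

```perl
use strict;
use warnings;
use Getopt::Long;

# Stand-in for a real command line.
local @ARGV = qw(--size 24 --noverbose --include lib --include t input.txt);

my ($size, $verbose, @include) = (0, 1);
my $ok = GetOptions(
    "size=i"     => \$size,       # mandatory integer
    "verbose!"   => \$verbose,    # negatable flag
    "include=s@" => \@include,    # string, may be repeated
);
die "bad options\n" unless $ok;

my @rest = @ARGV;    # whatever was not an option
print "size=$size verbose=$verbose include=@include rest=@rest\n";
```

After the call, @ARGV holds only input.txt, which is the one argument that was not an option.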

Linkage specification

The linkage specifier is optional. If no linkage is explicitly specified but a hash reference is passed, GetOptions() will place the value in the hash. For example:

%optctl = ();
&GetOptions (\%optctl, "size=i");

will perform the equivalent of the assignment:

$optctl{"size"} = 24;

For array options, a reference to an anonymous array is generated. For example:

%optctl = ();
&GetOptions (\%optctl, "sizes=i@");

with command-line arguments:

-sizes 24 -sizes 48

will perform the equivalent of the assignment:

$optctl{"sizes"} = [24, 48];
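The same idea as a runnable sketch, with a stand-in command line and option names of our own choosing:

```perl
use strict;
use warnings;
use Getopt::Long;

# Stand-in for a real command line.
local @ARGV = qw(--size 24 --dims 3 --dims 4);

# No explicit linkage: values are stored in the hash under the option names.
my %optctl;
my $ok = GetOptions(\%optctl, "size=i", "dims=i@");
die "bad options\n" unless $ok;

print "size: $optctl{size}\n";
print "dims: @{ $optctl{dims} }\n";    # an anonymous array was generated
```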

If no linkage is explicitly specified and no hash reference is passed, GetOptions() will put the value in a global variable named after the option, prefixed by opt_. To yield a usable Perl variable, characters that are not part of the syntax for variables are translated to underscores. For example, --fpp-struct-return will set the variable $opt_fpp_struct_return. (Note that this variable resides in the namespace of the calling program, not necessarily main.) For example:

&GetOptions ("size=i", "sizes=i@");

with command line:

-size 10 -sizes 24 -sizes 48

will perform the equivalent of the assignments:

$opt_size = 10;
@opt_sizes = (24, 48);

A lone hyphen (-) is considered an option; the corresponding identifier is $opt_.

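The linkage specifier can be a reference to a scalar, a reference to an array, or a reference to a subroutine. If a scalar reference is supplied, the new value is stored in the referenced variable; if the option occurs more than once, the previous value is overwritten. If an array reference is supplied, the new value is appended (pushed) onto the referenced array. If a code reference is supplied, the referenced subroutine is called with two arguments: the option name and the option value. The option name is always the true name, not an abbreviation or alias.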

Aliases and abbreviations

The option specifier may actually include a "|"-separated list of option names:

foo|bar|blech=s

In this example, foo is the true name of the option. If no linkage is specified, options -foo, -bar and -blech all will set $opt_foo.

Options may be invoked as unique abbreviations, depending on configuration variable $Getopt::Long::autoabbrev.
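As an illustration of aliases and abbreviations together, here is a sketch that uses the foo|bar|blech specifier from above plus a hypothetical verbose flag. It sets $Getopt::Long::autoabbrev explicitly (its documented default) so that the result does not depend on the environment.

```perl
use strict;
use Getopt::Long;

$Getopt::Long::autoabbrev = 1;    # the default, made explicit

my ($foo, $verbose);

# -blech is an alias of -foo; -verb is a unique abbreviation of -verbose.
@ARGV = qw(-blech hello -verb);
GetOptions(
    "foo|bar|blech=s", \$foo,
    "verbose",         \$verbose,
) or die "bad options\n";

print "foo = $foo\n";
print "verbose = $verbose\n";
```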

Non-option callback routine

A special option specifier <> can be used to designate a subroutine to handle non-option arguments. For example:

&GetOptions(..."<>", \&mysub...);

In this case GetOptions() will immediately call &mysub for every non-option it encounters in the options list. This subroutine gets the name of the non-option passed. This feature requires $Getopt::Long::order to have the value of the predefined and exported variable, $PERMUTE. See also the examples.

Option starters

On the command line, options can start with - (traditional), -- (POSIX), and + (GNU, now being phased out). The latter is not allowed if the environment variable POSIXLY_CORRECT has been defined.

Options that start with -- may have an argument appended, following an equals sign (=). For example: --foo=bar.

Return value

A return status of 0 (false) indicates that the function detected one or more errors.

Configuration variables

The following variables can be set to change the default behavior of GetOptions():

$Getopt::Long::autoabbrev

If true, then allow option names to be invoked with unique abbreviations. Default is 1 unless environment variable POSIXLY_CORRECT has been set.

$Getopt::Long::getopt_compat

If true, then allow "+" to start options. Default is 1 unless environment variable POSIXLY_CORRECT has been set.

$Getopt::Long::order

If set to $PERMUTE, then non-options are allowed to be mixed with options on the command line. If set to $REQUIRE_ORDER, then mixing is not allowed. Default is $REQUIRE_ORDER if environment variable POSIXLY_CORRECT has been set, $PERMUTE otherwise. Both $PERMUTE and $REQUIRE_ORDER are defined in the library module and automatically exported. $PERMUTE means that:

-foo arg1 -bar arg2 arg3

is equivalent to:

-foo -bar arg1 arg2 arg3

If a non-option callback routine is specified, @ARGV will always be empty upon successful return of GetOptions() since all options have been processed, except when -- is used. So, for example:

-foo arg1 -bar arg2 -- arg3

will call the callback routine for arg1 and arg2, and then terminate, leaving arg3 in @ARGV.

If $Getopt::Long::order is $REQUIRE_ORDER, option processing terminates when the first non-option is encountered. In that case:

-foo arg1 -bar arg2 arg3

is equivalent to:

-foo -- arg1 -bar arg2 arg3

$Getopt::Long::ignorecase

If true, then ignore case when matching options. Default is 1.

$Getopt::Long::VERSION

The version number of this Getopt::Long implementation is in the format major.minor. This can be used to have Exporter check the version. Example:

use Getopt::Long 2.00;

$Getopt::Long::major_version and $Getopt::Long::minor_version may be inspected for the individual components.

$Getopt::Long::error

Internal error flag. May be incremented from a callback routine to cause options parsing to fail.

$Getopt::Long::debug

Enable copious debugging output. Default is 0.
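The effect of $Getopt::Long::order can be seen by parsing the same arguments twice. This sketch assumes an installed Getopt::Long that still honors the package variables described above; the flags foo and bar are illustrative booleans.

```perl
use strict;
use Getopt::Long;    # exports $PERMUTE and $REQUIRE_ORDER

my ($foo, $bar);

# Permuted: options and non-options may be mixed freely.
$Getopt::Long::order = $PERMUTE;
@ARGV = qw(-foo arg1 -bar arg2);
($foo, $bar) = (0, 0);
GetOptions("foo", \$foo, "bar", \$bar) or die "bad options\n";
my @left_permute = @ARGV;
print "PERMUTE:       foo=$foo bar=$bar left=(@left_permute)\n";

# Required order: processing stops at the first non-option.
$Getopt::Long::order = $REQUIRE_ORDER;
@ARGV = qw(-foo arg1 -bar arg2);
($foo, $bar) = (0, 0);
GetOptions("foo", \$foo, "bar", \$bar) or die "bad options\n";
my @left_ordered = @ARGV;
print "REQUIRE_ORDER: foo=$foo bar=$bar left=(@left_ordered)\n";
```

In the second run -bar is never examined, so it stays in @ARGV along with the non-options.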

Examples

If the option specifier is one:i (which takes an optional integer argument), then the following situations are handled:

-one -two            # $opt_one = "", -two is next option
-one -2              # $opt_one = -2

Also, assume specifiers foo=s and bar:s:

-bar -xxx            # $opt_bar = "", -xxx is next option
-foo -bar            # $opt_foo = '-bar'
-foo --              # $opt_foo = '--'

In GNU or POSIX format, option names and values can be combined:

+foo=blech           # $opt_foo = 'blech'
--bar=               # $opt_bar = ""
--bar=--             # $opt_bar = '--'

Example using variable references:

$ret = &GetOptions ('foo=s', \$foo, 'bar=i', 'ar=s', \@ar);

With command-line options -foo blech -bar 24 -ar xx -ar yy this will result in:

$foo = 'blech'
$opt_bar = 24
@ar = ('xx', 'yy')

Example of using the <> option specifier:

@ARGV = qw(-foo 1 bar -foo 2 blech);
&GetOptions("foo=i", \$myfoo, "<>", \&mysub);

Results:

&mysub("bar") will be called (with $myfoo being 1)
&mysub("blech") will be called (with $myfoo being 2)

Compare this with:

@ARGV = qw(-foo 1 bar -foo 2 blech);
&GetOptions("foo=i", \$myfoo);

This will leave the non-options in @ARGV:

$myfoo becomes 2
@ARGV  becomes qw(bar blech)

If you're using the use strict pragma, which requires you to employ only lexical variables or else globals that are fully declared, you will have to use the double-colon package delimiter or else the use vars pragma. For example:

use strict;
use vars qw($opt_rows $opt_cols);
use Getopt::Long;

Getopt::Std--Process Single-Character Options with Option Clustering

use Getopt::Std;
getopt('oDI');    # -o, -D & -I take arg.  Sets opt_* as a side effect.
getopts('oif:');  # -o & -i are boolean flags, -f takes an argument.
                  # Sets opt_* as a side effect.

The getopt() and getopts() functions give your program simple mechanisms for processing single-character options. These options can be clustered (for example, -bdLc might be interpreted as four single-character options), and you can specify individual options that require an accompanying argument. When you invoke getopt() or getopts(), you pass along information about the kinds of options your program expects. These functions then analyze @ARGV, extract information about the options, and return this information to your program in a set of variables. The processing of @ARGV stops when an argument without a leading "-" is encountered, if that argument is not associated with a preceding option. Otherwise, @ARGV is processed to its end and left empty.

For each option in your program's invocation, both getopt() and getopts() define a variable $opt_x where x is the option name. If the option takes an argument, then the argument is read and assigned to $opt_x as its value; otherwise, a value of 1 is assigned to the variable.

Invoke getopt() with one argument, which should contain all options that require a following argument. For example:

getopt('dV');

If your program is then invoked as:

myscr -bfd January -V 10.4

then these variables will be set in the program:

$opt_b = 1;
$opt_f = 1;
$opt_d = "January";
$opt_V = 10.4;

Space between an option and its following argument is unnecessary. The previous command line could have been given this way:

myscr -bfdJanuary -V10.4

In general, your program can be invoked with options given in any order. All options not "declared" in the invocation of getopt() are assumed to be without accompanying argument.
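The invocation above can be reproduced in a self-contained way by filling @ARGV by hand. This is a sketch only; the use vars line is there so that the $opt_ globals pass use strict.

```perl
use strict;
use vars qw($opt_b $opt_f $opt_d $opt_V);
use Getopt::Std;

# Stand-in for: myscr -bfd January -V 10.4
@ARGV = qw(-bfd January -V 10.4);

getopt('dV');    # -d and -V take an argument; everything else is a flag

print "b=$opt_b f=$opt_f d=$opt_d V=$opt_V\n";
print "arguments left over: ", scalar(@ARGV), "\n";
```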

Where getopt() allows any single-character option, getopts() allows only those options you declare explicitly. For example, this invocation:

getopts('a:bc:');

legitimizes only the options -a, -b, and -c. The colon following the a and c means that these two options require an accompanying argument; b is not allowed to have an argument. Accordingly, here are some ways to invoke the program:

myscr -abc              # WRONG unless bc is really the argument to -a
myscr -a -bc            # WRONG, with same qualification
myscr -a foo -bc bar    # $opt_a = "foo"; $opt_b = 1; $opt_c = "bar"
myscr -bafoo -cbar      # same as previous

getopts() returns false if it encounters errors during option processing. However, it continues to process arguments and assign values as best it can to $opt_x variables. You should always check for errors before assuming that the variables hold meaningful values.
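A sketch of the checking recommended above, using the a:bc: specification from the text. The usage message and the trailing filename are invented for illustration.

```perl
use strict;
use vars qw($opt_a $opt_b $opt_c);
use Getopt::Std;

# Stand-in for: myscr -a foo -bc bar input.txt
@ARGV = qw(-a foo -bc bar input.txt);

# Refuse to continue if getopts() reports an error.
getopts('a:bc:')
    or die "usage: myscr [-b] [-a ARG] [-c ARG] [file ...]\n";

print "a=$opt_a b=$opt_b c=$opt_c\n";
print "remaining: @ARGV\n";    # whatever follows the options
```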

getopt() does not return a meaningful value.

Remember that both getopt() and getopts() halt argument processing upon reading an argument (without leading "-") where none was called for. This is not considered an error. So a user might invoke your program with invalid arguments, without your being notified of the fact. However, you can always check to see whether @ARGV has been completely emptied or not--that is, whether all arguments have been processed.

If you're using the use strict pragma, which requires you to employ only lexical variables or else globals that are fully declared, you will have to use the double-colon package delimiter or else the use vars pragma. For example:

use strict;
use vars qw($opt_o $opt_i $opt_D);
use Getopt::Std;

I18N::Collate--Compare 8-bit Scalar Data According to the Current Locale

use I18N::Collate;
setlocale(LC_COLLATE, $locale);         # uses POSIX::setlocale
$s1 = new I18N::Collate "scalar_data_1";
$s2 = new I18N::Collate "scalar_data_2";

This module provides you with objects that can be collated (ordered) according to your national character set, provided that Perl's POSIX module and the POSIX setlocale (3) and strxfrm (3) functions are available on your system. $locale in the setlocale() invocation shown above must be an argument acceptable to setlocale (3) on your system. See the setlocale (3) manpage for further information. Available locales depend upon your operating system.

Here is an example of collation within the standard `C' locale:

use I18N::Collate;
setlocale(LC_COLLATE, 'C');
$s1 = new I18N::Collate "Hello";
$s2 = new I18N::Collate "Goodbye";
# following line prints "Goodbye comes before Hello"
print "$$s2 comes before $$s1" if $s2 le $s1;

The objects returned by the new() method are references. You can get at their values by dereferencing them--for example, $$s1 and $$s2. However, Perl's built-in comparison operators are overloaded by I18N::Collate, so that they operate on the objects returned by new() without the necessity of dereference. The print line above dereferences $s1 and $s2 to access their values directly, but does not dereference the variables passed to the le operator. The comparison operators you can use in this way are the following:

<   <=  >   >=  ==  !=  <=>
lt  le  gt  ge  eq  ne  cmp

I18N::Collate uses POSIX::setlocale() and POSIX::strxfrm() to perform the collation. Unlike strxfrm(), however, I18N::Collate handles embedded NULL characters gracefully.

To determine which locales are available with your operating system, check whether the command:

locale -a

lists them. You can also check the locale (5) or nlsinfo manpages, or look at the filenames within one of these directories (or their subdirectories): /usr/lib/nls, /usr/share/lib/locale, or /etc/locale. Not all locales your vendor supports are necessarily installed. Please consult your operating system's documentation and possibly your local system administrator.

integer--Do Arithmetic in Integer Instead of Double

use integer;
$x = 10/3;   # $x is now 3, not 3.33333333333333333

This module tells the compiler to use integer operations from here to the end of the enclosing block. On many machines, this doesn't matter a great deal for most computations, but on those without floating point hardware, it can make a big difference.

This pragma does not automatically cast everything to an integer; it only forces integer operations on arithmetic. For example:

use integer; 
print sin(3);           # 0.141120008059867
print sin(3) + 4;       # 4

You can turn off the integer pragma within an inner block by using the no integer directive.
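The block scoping can be sketched as follows; the variable names are arbitrary.

```perl
use strict;

my ($int_div, $float_div);
{
    use integer;
    $int_div = 10 / 3;           # integer division inside this block
    {
        no integer;
        $float_div = 10 / 3;     # ordinary floating-point division again
    }
}

print "$int_div\n";
printf "%.3f\n", $float_div;
```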

IPC::Open2--Open a Process for Both Reading and Writing

use IPC::Open2;
# with named filehandles
$pid = open2(\*RDR, \*WTR, $cmd_with_args);
$pid = open2(\*RDR, \*WTR, $cmd, "arg1", "arg2", ...);

# with object-oriented handles
use FileHandle;
my($rdr, $wtr) = (FileHandle->new, FileHandle->new);
$pid = open2($rdr, $wtr, $cmd_with_args);

The open2() function forks a child process to execute the specified command. The first two arguments represent filehandles, one way or another. They can be FileHandle objects, or they can be references to typeglobs, which can either be explicitly named as above, or generated by the Symbol package, as in the example below. Whichever you choose, they represent handles through which your program can read from the command's standard output and write to the command's standard input, respectively. open2() differs from Perl's built-in open function in that it allows your program to communicate in both directions with the child process.

open2() returns the process ID of the child process. On failure it reports a fatal error.

Here's a simple use of open2() by which you can give the program user interactive access to the bc (1) command. (bc is an arbitrary-precision arithmetic package.) In this case we use the Symbol module to produce "anonymous" symbols:

use IPC::Open2;
use Symbol;
$WTR = gensym();  # get a reference to a typeglob
$RDR = gensym();  # and another one
$pid = open2($RDR, $WTR, 'bc');
while (<STDIN>) {            # read commands from user
     print $WTR $_;          # write a command to bc(1)
     $line = <$RDR>;         # read the output of bc(1)
     print STDOUT "$line";   # send the output to the user
}

open2() establishes unbuffered output for $WTR. However, it cannot control buffering of output from the designated command. Therefore, be sure to heed the following warning.

WARNING:

It is extremely easy for your program to hang while waiting to read the next line of output from the command. In the example just shown, bc is known to read and write one line at a time, so it is safe. But utilities like sort (1) that read their entire input stream before offering any output will cause a deadlock when used in the manner we have illustrated. You might do something like this instead:

$pid = open2($RDR, $WTR, 'sort');
while (<STDIN>) {
     print $WTR $_;
}
close($WTR);    # finish sending all output to sort(1)
while (<$RDR>) {     # now read the output of sort(1)
     print STDOUT "$_";
}

More generally, you may have to use select to determine which file descriptors are ready to read, and then sysread for the actual reading.
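One possible shape for the select-and-sysread approach is sketched below. It uses the IO::Select wrapper around select, and the child process is simply another copy of perl that uppercases its input (chosen so the example does not depend on any external utility); both choices are assumptions made for illustration.

```perl
use strict;
use IPC::Open2;
use IO::Select;

# Child: a perl one-liner that uppercases each input line.
my $pid = open2(my $rdr, my $wtr, $^X, '-pe', '$_ = uc');

print $wtr "hello\nworld\n";
close $wtr;                      # tell the child there is no more input

my $sel    = IO::Select->new($rdr);
my $output = '';
while ($sel->count) {
    my @ready = $sel->can_read(10) or last;    # give up after 10 seconds
    for my $fh (@ready) {
        my $n = sysread($fh, my $buf, 4096);
        if ($n) { $output .= $buf }
        else    { $sel->remove($fh); close $fh }   # end of file or error
    }
}
waitpid($pid, 0);                # reap the child

print $output;
```

With several handles registered in the same IO::Select object, the loop services whichever one has data, which is what avoids the deadlock.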

See also

The IPC::Open3 module shows an alternative that handles STDERR as well.

IPC::Open3--Open a Process for Reading, Writing, and Error Handling

use IPC::Open3;
$pid = open3($WTR, $RDR, $ERR, $cmd_with_args);
$pid = open3($WTR, $RDR, $ERR, $cmd, "arg1", "arg2", ...);

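IPC::Open3 works like IPC::Open2, with two differences visible in the synopsis above: the first two arguments are given in the opposite order ($WTR before $RDR), and a third filehandle, $ERR, receives the command's standard error.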

Warnings given for IPC::Open2 regarding possible program hangs apply to IPC::Open3 as well.
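Here is a small sketch of open3() capturing standard output and standard error separately. The child is again a perl one-liner, chosen only for portability. The error handle is created with gensym() beforehand, on the assumption (true of current IPC::Open3 releases) that an undefined third argument would send the child's standard error to the same handle as its standard output.

```perl
use strict;
use IPC::Open3;
use Symbol qw(gensym);

my $err = gensym();    # a real handle, so stderr is kept separate
my $pid = open3(my $wtr, my $rdr, $err,
                $^X, '-e', 'print "to stdout\n"; print STDERR "to stderr\n"');
close $wtr;            # the child reads no input

# Reading one handle to the end and then the other is safe here
# only because the output is tiny; see the warning above.
my @out  = <$rdr>;
my @errs = <$err>;
waitpid($pid, 0);
my $status = $? >> 8;

print "stdout: @out";
print "stderr: @errs";
print "exit status: $status\n";
```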

lib--Manipulate @INC at Compile-Time

use lib LIST;
no lib LIST;

This module simplifies the manipulation of Perl's special @INC variable at compile-time. It is used to add extra directories to Perl's search path so that later use or require statements will find modules not located along Perl's default search path.

Adding directories

Directories itemized in LIST are added to the start of the Perl search path. Saying:

use lib LIST;

is almost the same as saying:

BEGIN { unshift(@INC, LIST ) }

The difference is that, for each directory in LIST (called $dir here), the lib module also checks to see whether a directory called $dir/$archname/auto exists, where $archname is derived from Perl's configuration information:

use Config;
$archname = $Config{'archname'};

If so, the $dir/$archname directory is assumed to be an architecture-specific directory and is added to @INC in front of $dir.

If LIST includes both $dir and $dir/$archname, then $dir/$archname will be added to @INC twice (assuming $dir/$archname/auto exists).

Deleting directories

You should normally only add directories to @INC. If you need to delete directories from @INC, take care to delete only those you yourself added. Otherwise, be certain that the directories you delete are not needed by other modules directly or indirectly invoked by your script. Other modules may have added directories they need for correct operation.

By default the statement:

no lib LIST

deletes the first instance of each named directory from @INC. To delete multiple instances of the same name from @INC you can specify the name multiple times.

To delete all instances of all the specified names from @INC you can specify :ALL as the first parameter of LIST. For example:

no lib qw(:ALL .);

For each directory in LIST (called $dir here) the lib module also checks to see whether a directory called $dir/$archname/auto exists. If so, the $dir/$archname directory is assumed to be a corresponding architecture-specific directory and is also deleted from @INC.

If LIST includes both $dir and $dir/$archname then $dir/$archname will be deleted from @INC twice (assuming $dir/$archname/auto exists).

Restoring the original directory list

When the lib module is first loaded, it records the current value of @INC in an array @lib::ORIG_INC. To restore @INC to that value you can say:

@INC = @lib::ORIG_INC;
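The following sketch shows adding, deleting, and restoring in one place. The directory name is a placeholder that need not exist. Because use lib and no lib take effect at compile time, the intermediate states are captured in BEGIN blocks.

```perl
use strict;

my ($after_use, $after_no);

use lib '/hypothetical/project/lib';       # placeholder path
BEGIN { $after_use = $INC[0] }             # look at @INC right after the use

no lib '/hypothetical/project/lib';
BEGIN { $after_no = grep { $_ eq '/hypothetical/project/lib' } @INC }

print "first entry after use lib: $after_use\n";
print "still present after no lib: ", ($after_no ? "yes" : "no"), "\n";

@INC = @lib::ORIG_INC;    # back to the list recorded when lib was loaded
```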

See also

The AddINC module (not in the standard Perl library, but available from CPAN) deals with paths relative to the source file.

Math::BigFloat--Arbitrary-Length, Floating-Point Math Package

use Math::BigFloat;
$f = Math::BigFloat->new($string);
# NSTR is a number string; SCALE is an integer value.
# In all following cases $f remains unchanged.
# All methods except fcmp() return a number string.
$f->fadd(NSTR);          # return sum of NSTR and $f
$f->fsub(NSTR);          # return $f minus NSTR
$f->fmul(NSTR);          # return $f multiplied by NSTR
$f->fdiv(NSTR[,SCALE]);  # return $f divided by NSTR to SCALE places
$f->fneg();              # return negative of $f
$f->fabs();              # return absolute value of $f
$f->fcmp(NSTR);          # compare $f to NSTR; see below for return value
$f->fround(SCALE);       # return rounded value of $f to SCALE digits
$f->ffround(SCALE);      # return rounded value of $f at SCALEth place
$f->fnorm();             # return normalization of $f
$f->fsqrt([SCALE]);      # return sqrt of $f to SCALE places

This module allows you to use floating-point numbers of arbitrary length. For example:

$float = new Math::BigFloat "2.123123123123123123123123123123123";

Number strings (NSTRs) have the form /[+-]\d*\.?\d*E[+-]\d+/. Embedded white space is ignored, so that the number strings used in the following two lines are identical:

$f = Math::BigFloat->new("-20.0    0732");
$g = $f->fmul("-20.00732");

The return value NaN indicates either that an input parameter was "Not a Number", or else that you tried to divide by zero or take the square root of a negative number. The fcmp() method returns -1, 0, or 1 depending on whether $f is less than, equal to, or greater than the number string given as an argument. If the number string is undefined or null, the undefined value is returned.

If SCALE is unspecified, division is computed to the number of digits given by:

max($div_scale, length(dividend)+length(divisor))

A similar default scale value is computed for square roots.

When you use this module, Perl's basic math operations are overloaded with routines from Math::BigFloat. Therefore, you don't have to employ the methods shown above to multiply, divide, and so on. You can rely instead on the usual operators. Given this code:

$f = Math::BigFloat->new("20.00732");
$g = Math::BigFloat->new("1.7");

the following six lines all yield the corresponding values for $h:

$h = 20.00732 * 1.7;    # 34.012444 (ordinary math--$h is not an object)
$h = $f * $g;           # "34.012444" ($h is now a BigFloat object)
$h = $f * 1.7;          # "34.012444" ($h is now a BigFloat object)
$h = 20.00732 * $g;     # "34.012444" ($h is now a BigFloat object)
$h = $f->fmul($g);      # "+34012444E-6" ($h is just a number string)
$h = $f->fmul(1.7);     # "+34012444E-6" ($h is just a number string)
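A runnable sketch of the overloaded form follows. It deliberately sticks to the overloaded operators, on the assumption that they are less sensitive to the installed version of Math::BigFloat than the method names listed above.

```perl
use strict;
use Math::BigFloat;

my $f = Math::BigFloat->new("20.00732");
my $g = Math::BigFloat->new("1.7");

my $h = $f * $g;      # overloaded multiplication
my $k = $f * 1.7;     # a plain number is promoted automatically

print "f * g   = $h\n";
print "f * 1.7 = $k\n";
print "result is an object: ", (ref($h) ? "yes" : "no"), "\n";
```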

Math::BigInt--Arbitrary-Length Integer Math Package

use Math::BigInt;
$i = Math::BigInt->new($string);
# BINT is a big integer string; in all following cases $i remains unchanged.
# All methods except bcmp() return a big integer string, or strings.
$i->bneg;        # return negative of $i
$i->babs;        # return absolute value of $i
$i->bcmp(BINT);  # compare $i to BINT; see below for return value
$i->badd(BINT);  # return sum of BINT and $i
$i->bsub(BINT);  # return $i minus BINT
$i->bmul(BINT);  # return $i multiplied by BINT
$i->bdiv(BINT);  # return $i divided by BINT; see below for return value
$i->bmod(BINT);  # return $i modulus BINT
$i->bgcd(BINT);  # return greatest common divisor of $i and BINT
$i->bnorm;       # return normalization of $i

This module allows you to use integers of arbitrary length. Integer strings (BINTs) have the form /^\s*[+-]?[\d\s]+$/. Embedded whitespace is ignored. Output values are always in the canonical form: /^[+-]\d+$/ . For example:

'+0'                # canonical zero value
'   -123 123 123'   # canonical value:  '-123123123'
'1 23 456 7890'     # canonical value:  '+1234567890'

The return value NaN results when an input argument is not a number, or when a divide by zero is attempted. The bcmp() method returns -1, 0, or 1 depending on whether $i is less than, equal to, or greater than the number string given as an argument. If the number string is undefined or null, the undefined value is returned. In a list context the bdiv() method returns a two-element array containing the quotient of the division and the remainder; in a scalar context only the quotient is returned.

When you use this module, Perl's basic math operations are overloaded with routines from Math::BigInt. Therefore, you don't have to employ the methods shown above to multiply, divide, and so on. You can rely instead on the usual operators. Given this code:

$a = Math::BigInt->new("42 000 000 000 000");
$b = Math::BigInt->new("-111111");

the following five lines yield these string values for $c:

$c = 42000000000000 - -111111;
                          # 42000000111111; ordinary math--$c is a double
$c = $a - $b;             # "+42000000111111"; $c is now a BigInt object
$c = $a - -111111;        # "+42000000111111"; $c is now a BigInt object
$c = $a->bsub($b);        # "+42000000111111"; $c is just a string
$c = $a->bsub(-111111);   # "+42000000111111"; $c is just a string
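The overloaded operators can be exercised with a sketch like the one below. It writes the numbers without embedded whitespace and does not rely on a leading plus sign in the printed result, since both details are assumed to vary between versions of Math::BigInt.

```perl
use strict;
use Math::BigInt;

my $x = Math::BigInt->new("42000000000000");
my $y = Math::BigInt->new("-111111");

my $diff = $x - $y;          # overloaded subtraction
my $quo  = $diff / 1000;     # integer division
my $rem  = $diff % 1000;     # remainder
my $cube = $x * $x * $x;     # far beyond the native integer range

print "difference: $diff\n";
print "quotient:   $quo\n";
print "remainder:  $rem\n";
print "cube:       $cube\n";
```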

Math::Complex--Complex Numbers Package

use Math::Complex;
$cnum = new Math::Complex;

When you use this module, complex numbers declared as:

$cnum = Math::Complex->new(1, 1);

can be manipulated with overloaded math operators. The operators:

+ - * / neg ~ abs cos sin exp sqrt

are supported, and return references to new objects. Also,

"" (stringify)

is available to convert complex numbers to strings. In addition, the methods:

Re Im arg

are available. Given a complex number, $cnum:

$cnum = Math::Complex->new($x, $y);

then $cnum->Re() returns $x, $cnum->Im() returns $y, and $cnum->arg() returns atan2($y, $x).
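A brief sketch of these methods and the overloaded multiplication, using the number 1 + i from above:

```perl
use strict;
use Math::Complex;

my $z = Math::Complex->new(1, 1);    # 1 + i
my $w = $z * $z;                     # (1 + i) squared, which is 2i

printf "Re(z) = %g, Im(z) = %g\n", $z->Re, $z->Im;
printf "arg(z) = %.4f\n", $z->arg;   # atan2(1, 1), about pi/4
printf "z * z: Re = %g, Im = %g\n", $w->Re, $w->Im;
print  "as a string: $w\n";          # uses the stringify operator
```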

sqrt(), which should return two roots, returns only one.

NDBM_File--Tied Access to NDBM Files

use Fcntl;
use NDBM_File;
tie(%hash, NDBM_File, 'Op.dbmx', O_RDWR|O_CREAT, 0644);
# read/writes of %hash are now read/writes of the file, Op.dbmx.pag
untie %hash;

See Perl's built-in tie function. Also see under DB_File in this chapter for a description of a closely related module.

Net::Ping--Check Whether a Host Is Online

use Net::Ping;
$hostname = 'elvis';       # host to check
$timeout = 10;             # how long to wait for a response
print "elvis is alive\n"    if pingecho($hostname, $timeout);

pingecho() uses a TCP echo (not an ICMP one) to determine whether a remote host is reachable. This is usually adequate to tell whether a remote host is available to rsh (1), ftp (1), or telnet (1).

The parameters for pingecho() are:

hostname

The remote host to check, specified either as a hostname or as an IP address.

timeout

The timeout in seconds. If not specified it will default to 5 seconds.

WARNING:

pingecho() uses alarm to implement the timeout, so don't set another alarm while you are using it.

ODBM_File--Tied Access to ODBM Files

use Fcntl;
use ODBM_File;
tie(%hash, ODBM_File, 'Op.dbmx', O_RDWR|O_CREAT, 0644);
# read/writes of %hash are now read/writes of the file, Op.dbmx
untie %hash;

See Perl's built-in tie function. Also see under DB_File in this chapter for a description of a closely related module.

overload--Overload Perl's Mathematical Operations

# In the SomeThing module:
package SomeThing;
use overload
    '+' => \&myadd,
    '-' => \&mysub;
# In your other code:
use SomeThing;
$a = SomeThing->new(57);
$b=5+$a;
if (overload::Overloaded $b) {...}  # is $b subject to overloading?
$strval = overload::StrVal $b;

Caveat Scriptor: This interface is the subject of ongoing research. Feel free to play with it, but don't be too surprised if the interface changes subtly (or not so subtly) as it is developed further. If you rely on it for a mission-critical application, please be sure to write some good regression tests. (Or perhaps in this case we should call them "progression" tests.)

This module allows you to substitute class methods or your own subroutines for standard Perl operators. For example, the code:

package Number;
use overload
    "+"  => \&add,
    "*=" => "muas";

declares function add() for addition, and method muas() in the Number class (or one of its base classes) for the assignment form *= of multiplication.

Arguments to use overload come in key/value pairs. Legal values are values permitted inside a &{ ... } call, so the name of a subroutine, a reference to a subroutine, or an anonymous subroutine will all work. Legal keys are listed below.

The subroutine add() will be called to execute $a+$b if $a is a reference to an object blessed into the package Number, or if $a is not an object from a package with overloaded addition, but $b is a reference to a Number. It can also be called in other situations, like $a+=7, or $a++. See the section on "Autogeneration".

Calling conventions for binary operations

The functions specified with the use overload directive are typically called with three arguments. (See the "No Method" section later in this chapter for the four-argument case.) If the corresponding operation is binary, then the first two arguments are the two arguments of the operation. However, due to general object-calling conventions, the first argument should always be an object in the package, so in the situation of 7+$a, the order of the arguments gets interchanged before the method is called. It probably does not matter when implementing the addition method, but whether the arguments are reversed is vital to the subtraction method. The method can query this information by examining the third argument, which can take three different values:

false (0)

The order of arguments is as in the current operation.

true (1)

The arguments are reversed.

undefined

The current operation is an assignment variant (as in $a+=7), but the usual function is called instead. This additional information can be used to generate some optimizations.
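To make the role of the third argument concrete, here is a sketch of a small Number class. The helper num() and the internal hash layout are inventions for the example, not part of the overload interface.

```perl
use strict;

package Number;
use overload
    '+'  => \&add,
    '-'  => \&subtract,
    '""' => sub { $_[0]{value} };    # so results print as plain numbers

sub new { my ($class, $value) = @_; return bless { value => $value }, $class }

# Accept either a Number object or a plain Perl number.
sub num { my ($thing) = @_; return ref($thing) ? $thing->{value} : $thing }

sub add {
    my ($self, $other, $swapped) = @_;    # order does not matter here
    return Number->new($self->{value} + num($other));
}

sub subtract {
    my ($self, $other, $swapped) = @_;
    my ($left, $right) = ($self->{value}, num($other));
    ($left, $right) = ($right, $left) if $swapped;    # true for 7 - $n
    return Number->new($left - $right);
}

package main;

my $n  = Number->new(10);
my $r1 = $n - 7;     # arguments in natural order
my $r2 = 7 - $n;     # arguments arrive swapped
my $r3 = 7 + $n;     # swapped too, but harmless for addition
print "$r1 $r2 $r3\n";
```

Without the swap in subtract(), the second result would silently come out as 3 instead of -3.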

Calling conventions for unary operations

Unary operations are considered binary operations with the second argument being undef. Thus the function that overloads {"++"} is called with arguments ($a, undef, '') when $a++ is executed.

Overloadable operations

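The following operations can be specified with use overload:

Arithmetic operations

+   -   *   /   %   **   <<   >>   x   .
+=  -=  *=  /=  %=  **=  <<=  >>=  x=  .=

Comparison operations

<   <=  >   >=  ==  !=  <=>
lt  le  gt  ge  eq  ne  cmp

Bit and unary operations

&   ^   |   neg   !   ~

Increment and decrement

++  --

Transcendental functions

atan2  cos  sin  exp  abs  log  sqrt

Boolean, string, and numeric conversion

bool  ""  0+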

Three keys are recognized by Perl that are not covered by the above descriptions: "nomethod", "fallback", and "=".

No method

"nomethod" should be followed by a reference to a function of four parameters. If defined, it is called when the overloading mechanism cannot find a method for some operation. The first three arguments of this function coincide with the arguments for the corresponding method if it were found; the fourth argument is the symbol corresponding to the missing method. If several methods are tried, the last one is used.

For example, 1-$a can be equivalent to:

&nomethodMethod($a, 1, 1, "-")

if the pair "nomethod" => "nomethodMethod" was specified in the use overload directive.

If some operation cannot be resolved and there is no function assigned to "nomethod", then an exception will be raised via die unless "fallback" was specified as a key in a use overload directive.
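The four-argument call can be observed with a sketch such as the following. The Quantity class and its unsupported() handler are hypothetical; here the handler merely reports which operator was missing.

```perl
use strict;

package Quantity;
use overload
    '+'        => sub { Quantity->new($_[0]{n} + (ref($_[1]) ? $_[1]{n} : $_[1])) },
    'nomethod' => \&unsupported;

sub new { my ($class, $n) = @_; return bless { n => $n }, $class }

# Called with four arguments when no method (and no autogenerated
# substitute) exists for the attempted operation.
sub unsupported {
    my ($self, $other, $swapped, $op) = @_;
    my $order = $swapped ? "swapped" : "not swapped";
    die "Quantity: operator '$op' is not supported ($order)\n";
}

package main;

my $q   = Quantity->new(5);
my $sum = $q + 1;                        # handled by the '+' method
my $ok  = eval { my $x = 1 - $q; 1 };    # no '-' method: nomethod is called
my $error = $@;

print "n after addition: $sum->{n}\n";
print "error: $error";
```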

Fallback

The "fallback" key governs what to do if a method for a particular operation is not found. Three different cases are possible depending on the value of "fallback":

undefined

Perl tries to use a substituted method (see the section later on "Autogeneration"). If this fails, it then tries to call the method specified for "nomethod"; if missing, an exception will be raised.

true

The same as for the undefined value, but no exception is raised. Instead, Perl silently reverts to what it would have done were there no use overload present.

defined, but false

No autogeneration is tried. Perl tries to call the method specified for "nomethod", and if this is missing, raises an exception.
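As an illustration of the true case, the sketch below declares only a string conversion and sets "fallback" to 1. The Label class is hypothetical; the point is that operators with no method of their own quietly operate on the converted string.

```perl
use strict;

package Label;
use overload
    '""'       => sub { $_[0]{text} },
    'fallback' => 1;    # everything else reverts to Perl's normal behavior

sub new { my ($class, $text) = @_; return bless { text => $text }, $class }

package main;

my $label = Label->new("camel");

# No eq or x methods were declared; Perl applies the ordinary
# operators to the result of the string conversion.
my $same     = ($label eq "camel") ? "yes" : "no";
my $repeated = $label x 2;
my $length   = length $label;

print "same: $same, repeated: $repeated, length: $length\n";
```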

Copy constructor

The value for "=" is a reference to a function with three arguments; that is, it looks like the other values in use overload. However, it does not overload the Perl assignment operator. This would rub Camel hair the wrong way.

This operation is called when a mutator is applied to a reference that shares its object with some other reference, such as:

$a=$b;
$a++;

In order to change $a but not $b, a copy of $$a is made, and $a is assigned a reference to this new object. This operation is done during execution of the $a++, and not during the assignment (so before the increment, $$a coincides with $$b). This is only done if ++ is expressed via a method for "++" or "+=". Note that if this operation is expressed via "+" (a nonmutator):

$a=$b;
$a=$a+1;

then $a does not reference a new copy of $$a, since $$a does not appear as an lvalue when the above code is executed.

If the copy constructor is required during the execution of some mutator, but a method for "=" was not specified, it can be autogenerated as a string copy if the object is a plain scalar.

As an example, the actually executed code for:

$a=$b;
# Something else which does not modify $a or $b...
++$a;

may be:

$a=$b;
# Something else which does not modify $a or $b...
$a = $a->clone(undef, "");
$a->incr(undef, "");

This assumes $b is subject to overloading, "++" was overloaded with \&incr, and "=" was overloaded with \&clone.
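The sequence can be watched in action with a sketch like this one. The Counter class, its clone() and incr() methods, and the $clones tally are all invented for the demonstration.

```perl
use strict;

package Counter;
use overload
    '++' => \&incr,
    '='  => \&clone,
    '""' => sub { $_[0]{n} };

my $clones = 0;    # counts copy-constructor calls, for illustration only

sub new   { my ($class, $n) = @_; return bless { n => $n }, $class }
sub clone { my ($self) = @_; $clones++; return Counter->new($self->{n}) }
sub incr  { my ($self) = @_; $self->{n}++; return $self }

package main;

my $first  = Counter->new(5);
my $second = $first;    # both variables now refer to the same object
++$second;              # mutator: the copy constructor runs first

print "first=$first second=$second clones=$clones\n";
```

The plain assignment copies only the reference; the clone is made lazily, when the mutator is about to change a shared object.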

Autogeneration

If a method for an operation is not found, and the value for "fallback" is true or undefined, Perl tries to autogenerate a substitute method for the missing operation based on the defined operations. Autogenerated method substitutions are possible for the following operations:

Assignment forms of arithmetic operations

$a+=$b can use the method for "+" if the method for "+=" is not defined.

Conversion operations

String, numeric, and Boolean conversion are calculated in terms of one another if not all of them are defined.

Increment and decrement

The ++$a operation can be expressed in terms of $a+=1 or $a+1, and $a-- in terms of $a-=1 and $a-1.

abs($a)

Can be expressed in terms of $a<0 and -$a (or 0-$a).

Unary minus

Can be expressed in terms of subtraction.

Concatenation

Can be expressed in terms of string conversion.

Comparison operations

Can be expressed in terms of its three-valued counterpart: either <=> or cmp:

<,  >,  <=, >=, ==, !=    in terms of <=>
lt, gt, le, ge, eq, ne    in terms of cmp

Copy operator

Can be expressed in terms of an assignment to the dereferenced value if this value is a scalar and not a reference.
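A few of these substitutions are sketched below with a hypothetical Num class that declares only "+", "<=>", and a string conversion, leaving "fallback" undefined. The assignment form, the increment, and the two-valued comparisons are all autogenerated.

```perl
use strict;

package Num;
use overload
    '+'   => sub {
        my ($x, $y) = @_;
        return Num->new($x->{v} + (ref($y) ? $y->{v} : $y));
    },
    '<=>' => sub {
        my ($x, $y, $swapped) = @_;
        my $r = $x->{v} <=> (ref($y) ? $y->{v} : $y);
        return $swapped ? -$r : $r;
    },
    '""'  => sub { $_[0]{v} };

sub new { my ($class, $v) = @_; return bless { v => $v }, $class }

package main;

my $n = Num->new(5);
$n += 2;                                 # no "+=" method: uses "+"
++$n;                                    # no "++" method: uses "+"
my $less  = ($n < 10) ? "yes" : "no";    # no "<" method: uses "<=>"
my $equal = ($n == 8) ? "yes" : "no";    # no "==" method: uses "<=>"

print "n=$n less-than-10=$less equals-8=$equal\n";
```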

WARNING:

One restriction for the comparison operation is that even if, for example, cmp returns a blessed reference, the autogenerated lt function will produce only a standard logical value based on the numerical value of the result of cmp. In particular, a working numeric conversion is needed in this case (possibly expressed in terms of other conversions).

Similarly, .= and x= operators lose their overloaded properties if the string conversion substitution is applied.

When you chop an object that is subject to overloaded operations, the object is promoted to a string and its overloading properties are lost. The same can happen with other operations as well.

Run-time overloading

Since all use directives are executed at compile-time, the only way to change overloading during run-time is:

eval 'use overload "+" => \&addmethod';

You can also say:

eval 'no overload "+", "--", "<="';

although the use of these constructs during run-time is questionable.

Public functions

The overload module provides the following public functions:

overload::StrVal(arg)

Gives the string value of arg as it would be in the absence of stringify overloading.

overload::Overloaded(arg)

Returns true if arg is subject to overloading of some operations.

overload::Method(obj, op)

Returns the undefined value or a reference to the method that implements op.
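The three functions can be tried out with a short sketch such as the following. The Temp class and its deg field are hypothetical names chosen for the example; only the overload:: functions come from the module itself.

```perl
use strict;

package Temp;    # hypothetical class with overloaded string conversion
use overload '""' => sub { "temp:" . $_[0]{deg} };

sub new {
    my ($class, $deg) = @_;
    return bless { deg => $deg }, $class;
}

package main;

my $t = Temp->new(20);

print "overloaded string: $t\n";
# StrVal bypasses the '""' method and shows the plain reference form
print "plain string:      ", overload::StrVal($t), "\n";

print "object is subject to overloading\n" if overload::Overloaded($t);

# Method returns a code reference for '""' and undef for "+"
my $to_string = overload::Method($t, '""');
print "method result:     ", $to_string->($t), "\n";
print "no method for +\n" unless defined overload::Method($t, '+');
```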

Diagnostics

When Perl is run with the -Do switch or its equivalent, overloading induces diagnostic messages.

Bugs

Because it is used for overloading, the per-package associative array %OVERLOAD now has a special meaning in Perl.

Overloading is not yet inherited via the @ISA tree, though individual methods may be.

POSIX--Perl Interface to IEEE Std 1003.1

use POSIX;                        # import all symbols
use POSIX qw(setsid);             # import one symbol
use POSIX qw(:errno_h :fcntl_h);  # import sets of symbols
printf "EINTR is %d\n", EINTR;
$sess_id = POSIX::setsid();
$fd = POSIX::open($path, O_CREAT|O_EXCL|O_WRONLY, 0644);
# note: $fd is a filedescriptor, *NOT* a filehandle

The POSIX module permits you to access all (or nearly all) the standard POSIX 1003.1 identifiers. Many of these identifiers have been given Perl-ish interfaces.

This description gives a condensed list of the features available in the POSIX module. Consult your operating system's manpages for general information on most features. Consult the appropriate Perl built-in function whenever a POSIX routine is noted as being identical to the function.

The "Classes" section later in this chapter describes some classes for signal objects, TTY objects, and other miscellaneous objects. The "Functions" section later in this chapter describes POSIX functions from the 1003.1 specification. The remaining sections list various constants and macros in an organization that roughly follows IEEE Std 1003.1b-1993.

WARNING:

A few functions are not implemented because they are C-specific.[4] If you attempt to call one of these functions, it will print a message telling you that it isn't implemented, and will suggest using the Perl equivalent, should one exist. For example, trying to access the setjmp() call will elicit the message: "setjmp() is C-specific: use eval {} instead".

[4] The 1003.1 standard wisely recommends that other language bindings should avoid duplicating the idiosyncrasies of C. This is something we were glad to comply with.

Furthermore, some vendors will claim 1003.1 compliance without passing the POSIX Compliance Test Suites (PCTS). For example, one vendor may not define EDEADLK, or may incorrectly define the semantics of the errno values set by open (2). Perl does not attempt to verify POSIX compliance. That means you can currently say "use POSIX" successfully, and then later in your program find that your vendor has been lax and there's no usable ICANON macro after all. This could be construed to be a bug. Whose bug, we won't venture to guess.

Classes

POSIX::SigAction

new

Creates a new POSIX::SigAction object that corresponds to the C struct sigaction. This object will be destroyed automatically when it is no longer needed. The first parameter is the fully qualified name of a subroutine which is a signal handler. The second parameter is a POSIX::SigSet object. The third parameter contains the sa_flags.

$sigset = POSIX::SigSet->new;
$sigaction = POSIX::SigAction->new('main::handler', $sigset,
                 &POSIX::SA_NOCLDSTOP);

This POSIX::SigAction object should be used with the POSIX::sigaction() function.
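A minimal sketch of that pairing follows, assuming a POSIX system on which a process may signal itself. The handler name and the $caught counter are hypothetical, and the flags argument is simply 0 here.

```perl
use strict;
use POSIX ();

my $caught = 0;
sub handler { $caught++ }    # hypothetical handler: just counts signals

my $sigset    = POSIX::SigSet->new;    # no extra signals blocked in the handler
my $sigaction = POSIX::SigAction->new('main::handler', $sigset, 0);

# Install the action for SIGUSR1; sigaction returns undef on failure
defined POSIX::sigaction(&POSIX::SIGUSR1, $sigaction)
    or die "sigaction failed: $!";

kill 'USR1', $$;             # send the signal to ourselves
print "caught SIGUSR1 $caught time(s)\n";
```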

POSIX::SigSet

new

Creates a new SigSet object. This object will be destroyed automatically when it is no longer needed. Arguments may be supplied to initialize the set. Create an empty set:

$sigset = POSIX::SigSet->new;

Create a set with SIGUSR1:

$sigset = POSIX::SigSet->new(&POSIX::SIGUSR1);

addset

Adds a signal to a SigSet object. Returns undef on failure.

$sigset->addset(&POSIX::SIGUSR2);

delset

Removes a signal from the SigSet object. Returns undef on failure.

$sigset->delset(&POSIX::SIGUSR2);

emptyset

Initializes the SigSet object to be empty. Returns undef on failure.

$sigset->emptyset();

fillset

Initializes the SigSet object to include all signals. Returns undef on failure.

$sigset->fillset();

ismember

Tests the SigSet object to see whether it contains a specific signal.

if ($sigset->ismember(&POSIX::SIGUSR1)) {
    print "contains SIGUSR1\n";
}
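Taken together, the SigSet methods can be exercised as in this small sketch. It only manipulates the set object and does not change the process's signal mask; the variable names are arbitrary.

```perl
use strict;
use POSIX ();

# Start with a set that holds only SIGUSR1
my $sigset = POSIX::SigSet->new(&POSIX::SIGUSR1);

# Add a second signal, then remove the first one again
defined $sigset->addset(&POSIX::SIGUSR2) or die "addset failed";
defined $sigset->delset(&POSIX::SIGUSR1) or die "delset failed";

my $has_usr1 = $sigset->ismember(&POSIX::SIGUSR1) ? "yes" : "no";
my $has_usr2 = $sigset->ismember(&POSIX::SIGUSR2) ? "yes" : "no";
print "SIGUSR1: $has_usr1, SIGUSR2: $has_usr2\n";

# fillset puts every signal in the set; emptyset clears it again
$sigset->fillset();
my $after_fill = $sigset->ismember(&POSIX::SIGINT) ? "yes" : "no";
$sigset->emptyset();
my $after_empty = $sigset->ismember(&POSIX::SIGINT) ? "yes" : "no";
print "SIGINT after fillset: $after_fill, after emptyset: $after_empty\n";
```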

POSIX::Termios

new

Creates a new Termios object. This object will be destroyed automatically when it is no longer needed.

$termios = POSIX::Termios->new;

getattr

Gets terminal control attributes for a given fd, 0 by default. Returns undef on failure. Obtain the attributes for standard input:

$termios->getattr();

Obtain the attributes for standard output:

$termios->getattr(1);

getcc

Retrieves a value from the c_cc field of a Termios object. The c_cc field is an array, so an index must be specified.

$c_cc[1] = $termios->getcc(&POSIX::VEOF);

getcflag

Retrieves the c_cflag field of a Termios object.

$c_cflag = $termios->getcflag;

getiflag

Retrieves the c_iflag field of a Termios object.

$c_iflag = $termios->getiflag;

getispeed

Retrieves the input baud rate.

$ispeed = $termios->getispeed;

getlflag

Retrieves the c_lflag field of a Termios object.

$c_lflag = $termios->getlflag;

getoflag

Retrieves the c_oflag field of a Termios object.

$c_oflag = $termios->getoflag;

getospeed

Retrieves the output baud rate.

$ospeed = $termios->getospeed;

setattr

Sets terminal control attributes for a given fd. Returns undef on failure. The following sets attributes immediately for standard output.

$termios->setattr(1, &POSIX::TCSANOW);

setcc

Sets a value in the c_cc field of a Termios object. The c_cc field is an array, so an index must be specified.

$termios->setcc(&POSIX::VEOF, 4);

setcflag

Sets the c_cflag field of a Termios object.

$termios->setcflag(&POSIX::CLOCAL);

setiflag

Sets the c_iflag field of a Termios object.

$termios->setiflag(&POSIX::BRKINT);

setispeed

Sets the input baud rate. Returns undef on failure.

$termios->setispeed(&POSIX::B9600);

setlflag

Sets the c_lflag field of a Termios object.

$termios->setlflag(&POSIX::ECHO);

setoflag

Sets the c_oflag field of a Termios object.

$termios->setoflag(&POSIX::OPOST);

setospeed

Sets the output baud rate. Returns undef on failure.

$termios->setospeed(&POSIX::B9600);

Baud rate values

B0 B50 B75 B110 B134 B150 B200 B300 B600 B1200 B1800 B2400 B4800 B9600 B19200 B38400

Terminal interface values

TCSADRAIN TCSANOW TCOON TCIOFLUSH TCOFLUSH TCION TCIFLUSH TCSAFLUSH TCIOFF TCOOFF

c_cc index values

VEOF VEOL VERASE VINTR VKILL VQUIT VSUSP VSTART VSTOP VMIN VTIME NCCS

c_cflag field values

CLOCAL CREAD CSIZE CS5 CS6 CS7 CS8 CSTOPB HUPCL PARENB PARODD

c_iflag field values

BRKINT ICRNL IGNBRK IGNCR IGNPAR INLCR INPCK ISTRIP IXOFF IXON PARMRK

c_lflag field values

ECHO ECHOE ECHOK ECHONL ICANON IEXTEN ISIG NOFLSH TOSTOP

c_oflag field values

OPOST

While these constants are associated with the Termios class, note that they are actually symbols in the POSIX package.
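As a small sketch of how the accessor methods and the constants combine, the following reads the attributes of standard input and reports two of the c_lflag bits. The $report variable is a hypothetical name, and the code falls back to a plain message when standard input is not a terminal, since getattr returns undef in that case.

```perl
use strict;
use POSIX qw(:termios_h);

my $termios = POSIX::Termios->new;
my $report;

if (defined $termios->getattr(fileno(STDIN))) {
    # Test individual bits of the local-mode flags
    my $lflag = $termios->getlflag;
    $report = sprintf "ECHO %s, ICANON %s, EOF character code %d",
        ($lflag & ECHO)   ? "on" : "off",
        ($lflag & ICANON) ? "on" : "off",
        $termios->getcc(VEOF);
}
else {
    $report = "standard input is not a terminal";
}
print "$report\n";
```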

Here's an example of a complete program for getting unbuffered, single-character input on a POSIX system:

#!/usr/bin/perl -w
use strict;
$| = 1;
for (1..4) {
    my $got;
    print "gimme: ";
    $got = getone();
    print "--> $got\n";
}
exit;
BEGIN {
    use POSIX qw(:termios_h);
    my ($term, $oterm, $echo, $noecho, $fd_stdin);
    $fd_stdin = fileno(STDIN);
    $term     = POSIX::Termios->new();
    $term->getattr($fd_stdin);
    $oterm    = $term->getlflag();
    $echo     = ECHO | ECHOK | ICANON;
    $noecho   = $oterm & ~$echo;
    sub cbreak {
        $term->setlflag($noecho);
        $term->setcc(VTIME, 1);
        $term->setattr($fd_stdin, TCSANOW);
    }
    sub cooked {
        $term->setlflag($oterm);
        $term->setcc(VTIME, 0);
        $term->setattr($fd_stdin, TCSANOW);
    }
    sub getone {
        my $key = "";
        cbreak();
        sysread(STDIN, $key, 1);
        cooked();
        return $key;
    }
}
END { cooked() }

Functions

Table 7.12: Functions
Function Name Definition
_exit

Identical to the C function _exit (2).

abort

Identical to the C function abort (3).

abs

Identical to Perl's built-in abs function.

access

Determines the accessibility of a file. Returns undef on failure.

if (POSIX::access("/", &POSIX::R_OK)) {
    print "have read permission\n";
}
acos

Identical to the C function acos (3).

alarm

Identical to Perl's built-in alarm function.

asctime

Identical to the C function asctime (3).

asin

Identical to the C function asin (3).

assert

Similar to C macro assert (3).

atan

Identical to the C function atan (3).

atan2

Identical to Perl's built-in atan2 function.

atexit

C-specific: use END {} instead.

atof

C-specific.

atoi

C-specific.

atol

C-specific.

bsearch

Not supplied. You should probably be using a hash anyway.

calloc

C-specific.

ceil

Identical to the C function ceil (3).

chdir

Identical to Perl's built-in chdir function.

chmod

Identical to Perl's built-in chmod function.

chown

Identical to Perl's built-in chown function.

clearerr

Use method FileHandle::clearerr() instead.

clock

Identical to the C function clock (3).

close

Closes a file. This uses file descriptors such as those obtained by calling POSIX::open(). Returns undef on failure.

$fd = POSIX::open("foo", &POSIX::O_RDONLY);
POSIX::close($fd);
closedir

Identical to Perl's built-in closedir function.

cos

Identical to Perl's built-in cos function.

cosh

Identical to the C function cosh (3).

creat

Creates a new file. This returns a file descriptor like the ones returned by POSIX::open(). Use POSIX::close() to close the file.

$fd = POSIX::creat("foo", 0611);
POSIX::close($fd);
ctermid

Generates the path name for the controlling terminal.

$path = POSIX::ctermid();
ctime

Identical to the C function ctime (3).

cuserid

Gets the character login name of the user.

$name = POSIX::cuserid();
difftime

Identical to the C function difftime (3).

div

C-specific.

dup

Similar to the C function dup (2). Uses file descriptors such as those obtained by calling POSIX::open(). Returns undef on failure.

dup2

Similar to the C function dup2 (2). Uses file descriptors such as those obtained by calling POSIX::open(). Returns undef on failure.

errno

Returns the value of errno.

$errno = POSIX::errno();
execl

C-specific; use Perl's exec instead.

execle

C-specific; use Perl's exec instead.

execlp

C-specific; use Perl's exec instead.

execv

C-specific; use Perl's exec instead.

execve

C-specific; use Perl's exec instead.

execvp

C-specific; use Perl's exec instead.

exit

Identical to Perl's built-in exit function.

exp

Identical to Perl's built-in exp function.

fabs

Identical to Perl's built-in abs function.

fclose

Use method FileHandle::close() instead.

fcntl

Identical to Perl's built-in fcntl function.

fdopen

Use method FileHandle::new_from_fd() instead.

feof

Use method FileHandle::eof() instead.

ferror

Use method FileHandle::error() instead.

fflush

Use method FileHandle::flush() instead.

fgetc

Use method FileHandle::getc() instead.

fgetpos

Use method FileHandle::getpos() instead.

fgets

Use method FileHandle::gets() instead.

fileno

Use method FileHandle::fileno() instead.

floor

Identical to the C function floor (3).

fmod

Identical to the C function fmod (3).

fopen

Use method FileHandle::open() instead.

fork

Identical to Perl's built-in fork function.

fpathconf

Retrieves the value of a configurable limit on a file or directory. This uses file descriptors such as those obtained by calling POSIX::open(). Returns undef on failure. The following will determine the maximum length of a pathname on the filesystem that holds /tmp/foo.

$fd = POSIX::open("/tmp/foo", &POSIX::O_RDONLY);
$path_max = POSIX::fpathconf($fd, &POSIX::_PC_PATH_MAX);
fprintf

C-specific; use Perl's built-in printf function instead.

fputc

C-specific; use Perl's built-in print function instead.

fputs

C-specific; use Perl's built-in print function instead.

fread

C-specific; use Perl's built-in read function instead.

free

C-specific.

freopen

C-specific; use Perl's built-in open function instead.

frexp

Returns the mantissa and exponent of a floating-point number.

($mantissa, $exponent) = POSIX::frexp(3.14);
fscanf

C-specific; use <> and regular expressions instead.

fseek

Use method FileHandle::seek() instead.

fsetpos

Use method FileHandle::setpos() instead.

fstat

Gets file status. This uses file descriptors such as those obtained by calling POSIX::open(). The data returned is identical to the data from Perl's built-in stat function. Odd how that happens...

$fd = POSIX::open("foo", &POSIX::O_RDONLY);
@stats = POSIX::fstat($fd);
ftell

Use method FileHandle::tell() instead.

fwrite

C-specific; use Perl's built-in print function instead.

getc

Identical to Perl's built-in getc function.

getchar

Returns one character from STDIN.

getcwd

Returns the name of the current working directory.

getegid

Returns the effective group ID (gid).

getenv

Returns the value of the specified environment variable.

geteuid

Returns the effective user ID (uid).

getgid

Returns the user's real group ID (gid).

getgrgid

Identical to Perl's built-in getgrgid function.

getgrnam

Identical to Perl's built-in getgrnam function.

getgroups

Returns the ids of the user's supplementary groups.

getlogin

Identical to Perl's built-in getlogin function.

getpgrp

Identical to Perl's built-in getpgrp function.

getpid

Returns the process's ID (pid).

getppid

Identical to Perl's built-in getppid function.

getpwnam

Identical to Perl's built-in getpwnam function.

getpwuid

Identical to Perl's built-in getpwuid function.

gets

Returns one line from STDIN.

getuid

Returns the user's ID (uid).

gmtime

Identical to Perl's built-in gmtime function.

isalnum

Identical to the C function, except that it can apply to a single character or to a whole string. (If applied to a whole string, all characters must be of the indicated category.)

isalpha

Identical to the C function, except that it can apply to a single character or to a whole string.

isatty

Returns a Boolean indicating whether the specified filehandle is connected to a TTY.

iscntrl

Identical to the C function, except that it can apply to a single character or to a whole string.

isdigit

Identical to the C function, except that it can apply to a single character or to a whole string.

isgraph

Identical to the C function, except that it can apply to a single character or to a whole string.

islower

Identical to the C function, except that it can apply to a single character or to a whole string.

isprint

Identical to the C function, except that it can apply to a single character or to a whole string.

ispunct

Identical to the C function, except that it can apply to a single character or to a whole string.

isspace

Identical to the C function, except that it can apply to a single character or to a whole string.

isupper

Identical to the C function, except that it can apply to a single character or to a whole string.

isxdigit

Identical to the C function, except that it can apply to a single character or to a whole string.

kill

Identical to Perl's built-in kill function.

labs

C-specific; use Perl's built-in abs function instead.

ldexp

Identical to the C function ldexp (3).

ldiv

C-specific; use the division operator / and Perl's built-in int function instead.

link

Identical to Perl's built-in link function.

localeconv

Gets numeric formatting information. Returns a reference to a hash containing the current locale formatting values. The following prints the database for the de (Deutsch or German) locale:

$loc = POSIX::setlocale(&POSIX::LC_ALL, "de");
print "Locale = $loc\n";
$lconv = POSIX::localeconv();
print "decimal_point     = ", $lconv->{decimal_point},     "\n";
print "thousands_sep     = ", $lconv->{thousands_sep},     "\n";
print "grouping          = ", $lconv->{grouping},          "\n";
print "int_curr_symbol   = ", $lconv->{int_curr_symbol},   "\n";
print "currency_symbol   = ", $lconv->{currency_symbol},   "\n";
print "mon_decimal_point = ", $lconv->{mon_decimal_point}, "\n";
print "mon_thousands_sep = ", $lconv->{mon_thousands_sep}, "\n";
print "mon_grouping      = ", $lconv->{mon_grouping},      "\n";
print "positive_sign     = ", $lconv->{positive_sign},     "\n";
print "negative_sign     = ", $lconv->{negative_sign},     "\n";
print "int_frac_digits   = ", $lconv->{int_frac_digits},   "\n";
print "frac_digits       = ", $lconv->{frac_digits},       "\n";
print "p_cs_precedes     = ", $lconv->{p_cs_precedes},     "\n";
print "p_sep_by_space    = ", $lconv->{p_sep_by_space},    "\n";
print "n_cs_precedes     = ", $lconv->{n_cs_precedes},     "\n";
print "n_sep_by_space    = ", $lconv->{n_sep_by_space},    "\n";
print "p_sign_posn       = ", $lconv->{p_sign_posn},       "\n";
print "n_sign_posn       = ", $lconv->{n_sign_posn},       "\n";
localtime

Identical to Perl's built-in localtime function.

log

Identical to Perl's built-in log function.

log10

Identical to the C function log10 (3).

longjmp

C-specific; use Perl's built-in die function instead.

lseek

Moves the read/write file pointer. This uses file descriptors such as those obtained by calling POSIX::open(). Returns undef on failure.

$fd = POSIX::open("foo", &POSIX::O_RDONLY);
$off_t = POSIX::lseek($fd, 0, &POSIX::SEEK_SET);
malloc

C-specific.

mblen

Identical to the C function mblen (3).

mbstowcs

Identical to the C function mbstowcs (3).

mbtowc

Identical to the C function mbtowc (3).

memchr

C-specific; use Perl's built-in index instead.

memcmp

C-specific; use eq instead.

memcpy

C-specific; use = instead.

memmove

C-specific; use = instead.

memset

C-specific; use x instead.

mkdir

Identical to Perl's built-in mkdir function.

mkfifo

Similar to the C function mkfifo (2). Returns undef on failure.

mktime

Converts date/time information to a calendar time. Returns undef on failure. Synopsis:

mktime(sec, min, hour, mday, mon, year, wday = 0,
                                     yday = 0, isdst = 0)

The month (mon), weekday (wday), and yearday (yday) begin at zero. That is, January is 0, not 1; Sunday is 0, not 1; January 1st is 0, not 1. The year (year) is given in years since 1900. That is, the year 1995 is 95; the year 2001 is 101. Consult your system's mktime (3) manpage for details about these and the other arguments. The following computes the calendar time for December 12, 1995, at 10:30 a.m.:

$time_t = POSIX::mktime(0, 30, 10, 12, 11, 95);
print "Date = ", POSIX::ctime($time_t);
modf

Returns the integral and fractional parts of a floating-point number.

($fractional, $integral) = POSIX::modf(3.14);
nice

Similar to the C function nice (3). Returns undef on failure.

offsetof

C-specific.

open

Opens a file for reading or writing. This returns file descriptors, not Perl filehandles. Returns undef on failure. Use POSIX::close() to close the file. Open a file read-only:

$fd = POSIX::open("foo");

Open a file for reading and writing:

$fd = POSIX::open("foo", &POSIX::O_RDWR);

Open a file for writing, with truncation:

$fd = POSIX::open("foo", &POSIX::O_WRONLY | &POSIX::O_TRUNC);

Create a new file with mode 0644; set up the file for writing:

$fd = POSIX::open("foo", &POSIX::O_CREAT | &POSIX::O_WRONLY,
        0644);
opendir

Opens a directory for reading. Returns undef on failure.

$dir = POSIX::opendir("/tmp");
@files = POSIX::readdir($dir);
POSIX::closedir($dir);
pathconf

Retrieves the value of a configurable limit on a file or directory. Returns undef on failure. The following will determine the maximum length of a pathname on the filesystem that holds /tmp:

$path_max = POSIX::pathconf("/tmp", &POSIX::_PC_PATH_MAX);
pause

Similar to the C function pause (3). Returns undef on failure.

perror

Identical to the C function perror (3).

pipe

Creates an interprocess channel. Returns two file descriptors like those returned by POSIX::open(): the first is the read end and the second is the write end.

($fd0, $fd1) = POSIX::pipe();
POSIX::write($fd1, "hello", 5);
POSIX::read($fd0, $buf, 5);
pow

Computes $x raised to the power $exponent.

$ret = POSIX::pow($x, $exponent);
printf

Prints the specified arguments to STDOUT.

putc

C-specific; use Perl's built-in print function instead.

putchar

C-specific; use Perl's built-in print function instead.

puts

C-specific; use Perl's built-in print function instead.

qsort

C-specific; use Perl's built-in sort function instead.

raise

Sends the specified signal to the current process.

rand

Non-portable; use Perl's built-in rand function instead.

read

Reads from a file. This uses file descriptors such as those obtained by calling POSIX::open(). If the buffer $buf is not large enough for the read, then Perl will extend it to make room for the request. Returns undef on failure.

$fd = POSIX::open("foo", &POSIX::O_RDONLY);
$bytes = POSIX::read($fd, $buf, 3);
readdir

Identical to Perl's built-in readdir function.

realloc

C-specific.

remove

Identical to Perl's built-in unlink function.

rename

Identical to Perl's built-in rename function.

rewind

Seeks to the beginning of the file.

rewinddir

Identical to Perl's built-in rewinddir function.

rmdir

Identical to Perl's built-in rmdir function.

scanf

C-specific; use <> and regular expressions instead.

setgid

Sets the real group id for this process, like assigning to the special variable $(.

setjmp

C-specific; use eval {} instead.

setlocale

Modifies and queries the program's locale. The following will set the traditional UNIX system locale behavior.

$loc = POSIX::setlocale(&POSIX::LC_ALL, "C");
setpgid

Similar to the C function setpgid (2). Returns undef on failure.

setsid

Identical to the C function setsid (2).

setuid

Sets the real user ID for this process, like assigning to the special variable $<.

sigaction

Detailed signal management. This uses POSIX::SigAction objects for the $action and $oldaction arguments. Consult your system's sigaction (2) manpage for details. Returns undef on failure.

POSIX::sigaction($sig, $action, $oldaction)
siglongjmp

C-specific; use Perl's built-in die function instead.

sigpending

Examines signals that are blocked and pending. This uses POSIX::SigSet objects for the $sigset argument. Consult your system's sigpending (2) manpage for details. Returns undef on failure.

POSIX::sigpending($sigset)
sigprocmask

Changes and/or examines this process's signal mask. This uses POSIX::SigSet objects for the $sigset and $oldsigset arguments. Consult your system's sigprocmask (2) manpage for details. Returns undef on failure.

POSIX::sigprocmask($how, $sigset, $oldsigset)
sigsetjmp

C-specific; use eval {} instead.

sigsuspend

Installs a signal mask and suspends the process until a signal arrives. This uses POSIX::SigSet objects for the $signal_mask argument. Consult your system's sigsuspend (2) manpage for details. Returns undef on failure.

POSIX::sigsuspend($signal_mask)
sin

Identical to Perl's built-in sin function.

sinh

Identical to the C function sinh (3).

sleep

Identical to Perl's built-in sleep function.

sprintf

Identical to Perl's built-in sprintf function.

sqrt

Identical to Perl's built-in sqrt function.

srand

Identical to Perl's built-in srand function.

sscanf

C-specific; use regular expressions instead.

stat

Identical to Perl's built-in stat function.

strcat

C-specific; use .= instead.

strchr

C-specific; use index instead.

strcmp

C-specific; use eq instead.

strcoll

Identical to the C function strcoll (3).

strcpy

C-specific; use = instead.

strcspn

C-specific; use regular expressions instead.

strerror

Returns the error string for the specified errno.

strftime

Converts date and time information to a string. Returns the string. Synopsis:

strftime(fmt, sec, min, hour, mday, mon, year,
            wday = 0, yday = 0, isdst = 0)

The month (mon), weekday (wday), and yearday (yday) begin at zero. That is, January is 0, not 1; Sunday is 0, not 1; January 1st is 0, not 1. The year (year) is given in years since 1900. That is, the year 1995 is 95; the year 2001 is 101. Consult your system's strftime (3) manpage for details about these and the other arguments. The string for Tuesday, December 12, 1995:

$str = POSIX::strftime("%A, %B %d, %Y", 0, 0, 0, 12, 
                        11, 95, 2);
print "$str\n";
strlen

C-specific; use length instead.

strncat

C-specific; use .= and/or substr instead.

strncmp

C-specific; use eq and/or substr instead.

strncpy

C-specific; use = and/or substr instead.

strpbrk

C-specific.

strrchr

C-specific; use rindex and/or substr instead.

strspn

C-specific.

strstr

Identical to Perl's built-in index function.

strtod

C-specific.

strtok

C-specific.

strtol

C-specific.

strtoul

C-specific.

strxfrm

String transformation. Returns the transformed string.

$dst = POSIX::strxfrm($src);
sysconf

Retrieves values of system configurable variables. Returns undef on failure. The following will get the number of clock ticks per second.

$clock_ticks = POSIX::sysconf(&POSIX::_SC_CLK_TCK);
system

Identical to Perl's built-in system function.

tan

Identical to the C function tan (3).

tanh

Identical to the C function tanh (3).

tcdrain

Similar to the C function tcdrain (3). Returns undef on failure.

tcflow

Similar to the C function tcflow (3). Returns undef on failure.

tcflush

Similar to the C function tcflush (3). Returns undef on failure.

tcgetpgrp

Identical to the C function tcgetpgrp (3).

tcsendbreak

Similar to the C function tcsendbreak (3). Returns undef on failure.

tcsetpgrp

Similar to the C function tcsetpgrp (3). Returns undef on failure.

time

Identical to Perl's built-in time function.

times

Returns elapsed realtime since some point in the past (such as system startup), user and system times for this process, and user and system times for child processes. All times are returned in clock ticks. Note that Perl's built-in times function, by contrast, returns four values, measured in seconds.

($realtime, $user, $system, $cuser, $csystem) = POSIX::times();
tmpfile

Use method FileHandle::new_tmpfile() instead.

tmpnam

Returns a name for a temporary file.

$tmpfile = POSIX::tmpnam();
tolower

Identical to Perl's built-in lc function.

toupper

Identical to Perl's built-in uc function.

ttyname

Identical to the C function ttyname (3).

tzname

Retrieves the time conversion information from the tzname variable.

POSIX::tzset();
($std, $dst) = POSIX::tzname();
tzset

Identical to the C function tzset (3).

umask

Identical to Perl's built-in umask function.

uname

Gets the name of the current operating system.

($sysname, $nodename, $release, 
     $version, $machine) = POSIX::uname();
ungetc

Use method FileHandle::ungetc() instead.

unlink

Identical to Perl's built-in unlink function.

utime

Identical to Perl's built-in utime function.

vfprintf

C-specific.

vprintf

C-specific.

vsprintf

C-specific.

wait

Identical to Perl's built-in wait function.

waitpid

Waits for a child process to change state. This is identical to Perl's built-in waitpid function.

$pid = POSIX::waitpid(-1, &POSIX::WNOHANG);
print "status = ", ($? >> 8), "\n";
wcstombs

Identical to the C function wcstombs (3).

wctomb

Identical to the C function wctomb (3).

write

Writes to a file. Uses file descriptors such as those obtained by calling POSIX::open(). Returns undef on failure.

$fd = POSIX::open("foo", &POSIX::O_WRONLY);
$buf = "hello";
$bytes = POSIX::write($fd, $buf, 5);
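
The descriptor-based entries in Table 7.12 (open, write, read, lseek, and close) are meant to be used together. The following is a minimal round-trip sketch, assuming a writable /tmp directory; the scratch file name is hypothetical and the file is removed at the end.

```perl
use strict;
use POSIX qw(:fcntl_h);

my $path = "/tmp/posix_demo.$$";    # hypothetical scratch file

# Create the file and write five bytes through a file descriptor
my $fd = POSIX::open($path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
defined $fd or die "cannot create $path: $!";
my $written = POSIX::write($fd, "hello", 5);
POSIX::close($fd);

# Reopen read-only, read three bytes, rewind, then read all five
$fd = POSIX::open($path, O_RDONLY);
defined $fd or die "cannot reopen $path: $!";
my ($buf, $buf2);
my $bytes = POSIX::read($fd, $buf, 3);
defined POSIX::lseek($fd, 0, &POSIX::SEEK_SET) or die "lseek failed: $!";
POSIX::read($fd, $buf2, 5);
POSIX::close($fd);
unlink $path;

print "wrote $written bytes, first read gave '$buf', second gave '$buf2'\n";
```

Every failure test uses defined because a successful call may legitimately return a zero value, such as file descriptor 0 or offset 0.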

Pathname constants

_PC_CHOWN_RESTRICTED _PC_LINK_MAX _PC_MAX_CANON
_PC_MAX_INPUT _PC_NAME_MAX _PC_NO_TRUNC
_PC_PATH_MAX _PC_PIPE_BUF _PC_VDISABLE

POSIX constants

_POSIX_ARG_MAX _POSIX_CHILD_MAX _POSIX_CHOWN_RESTRICTED
_POSIX_JOB_CONTROL _POSIX_LINK_MAX _POSIX_MAX_CANON
_POSIX_MAX_INPUT _POSIX_NAME_MAX _POSIX_NGROUPS_MAX
_POSIX_NO_TRUNC _POSIX_OPEN_MAX _POSIX_PATH_MAX
_POSIX_PIPE_BUF _POSIX_SAVED_IDS _POSIX_SSIZE_MAX
_POSIX_STREAM_MAX _POSIX_TZNAME_MAX _POSIX_VDISABLE

System configuration

_SC_ARG_MAX _SC_CHILD_MAX _SC_CLK_TCK _SC_JOB_CONTROL
_SC_NGROUPS_MAX _SC_OPEN_MAX _SC_SAVED_IDS _SC_STREAM_MAX
_SC_TZNAME_MAX _SC_VERSION

Error constants

E2BIG EACCES EAGAIN EBADF EBUSY ECHILD EDEADLK
EDOM EEXIST EFAULT EFBIG EINTR EINVAL EIO
EISDIR EMFILE EMLINK ENAMETOOLONG ENFILE ENODEV ENOENT
ENOEXEC ENOLCK ENOMEM ENOSPC ENOSYS ENOTDIR ENOTEMPTY
ENOTTY ENXIO EPERM EPIPE ERANGE EROFS ESPIPE
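
The error constants are typically compared against the value returned by POSIX::errno() after a call has failed. A minimal sketch, assuming the hypothetical path below does not exist on the system:

```perl
use strict;
use POSIX qw(:errno_h);

my $missing = "/nonexistent/posix_demo_$$";    # hypothetical, assumed absent

my $fd = POSIX::open($missing, &POSIX::O_RDONLY);
my $errno = POSIX::errno();    # capture errno right after the failing call

my $message;
if (defined $fd) {
    POSIX::close($fd);
    $message = "unexpectedly opened $missing";
}
elsif ($errno == ENOENT) {
    $message = "no such file: " . POSIX::strerror($errno);
}
else {
    $message = "other failure: " . POSIX::strerror($errno);
}
print "$message\n";
```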

File control constants

FD_CLOEXEC F_DUPFD F_GETFD F_GETFL F_GETLK F_OK
F_RDLCK F_SETFD F_SETFL F_SETLK F_SETLKW F_UNLCK
F_WRLCK O_ACCMODE O_APPEND O_CREAT O_EXCL O_NOCTTY
O_NONBLOCK O_RDONLY O_RDWR O_TRUNC O_WRONLY

Floating-point constants

DBL_DIG DBL_EPSILON DBL_MANT_DIG DBL_MAX
DBL_MAX_10_EXP DBL_MAX_EXP DBL_MIN DBL_MIN_10_EXP
DBL_MIN_EXP FLT_DIG FLT_EPSILON FLT_MANT_DIG
FLT_MAX FLT_MAX_10_EXP FLT_MAX_EXP FLT_MIN
FLT_MIN_10_EXP FLT_MIN_EXP FLT_RADIX FLT_ROUNDS
LDBL_DIG LDBL_EPSILON LDBL_MANT_DIG LDBL_MAX
LDBL_MAX_10_EXP LDBL_MAX_EXP LDBL_MIN LDBL_MIN_10_EXP

Limit constants

ARG_MAX CHAR_BIT CHAR_MAX CHAR_MIN CHILD_MAX
INT_MAX INT_MIN LINK_MAX LONG_MAX LONG_MIN
MAX_CANON MAX_INPUT MB_LEN_MAX NAME_MAX NGROUPS_MAX
OPEN_MAX PATH_MAX PIPE_BUF SCHAR_MAX SCHAR_MIN
SHRT_MAX SHRT_MIN SSIZE_MAX STREAM_MAX TZNAME_MAX
UCHAR_MAX UINT_MAX ULONG_MAX USHRT_MAX

Locale constants

LC_ALL LC_COLLATE LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME

Math constants

HUGE_VAL

Signal constants

SA_NOCLDSTOP SIGABRT SIGALRM SIGCHLD SIGCONT SIGFPE
SIGHUP SIGILL SIGINT SIGKILL SIGPIPE SIGQUIT
SIGSEGV SIGSTOP SIGTERM SIGTSTP SIGTTIN SIGTTOU
SIGUSR1 SIGUSR2 SIG_BLOCK SIG_DFL SIG_ERR SIG_IGN
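
The sigprocmask and sigpending entries in Table 7.12 combine with POSIX::SigSet objects roughly as sketched below: a signal is blocked, sent, observed as pending, and then released. The variable names are hypothetical. SIG_SETMASK is not in the list above but is assumed to be available from the POSIX package, as it is in current Perl releases.

```perl
use strict;
use POSIX ();

my $got = 0;
$SIG{USR1} = sub { $got++ };    # ordinary Perl handler that counts deliveries

my $block = POSIX::SigSet->new(&POSIX::SIGUSR1);
my $old   = POSIX::SigSet->new;

# Block SIGUSR1, remembering the previous mask in $old
defined POSIX::sigprocmask(&POSIX::SIG_BLOCK, $block, $old)
    or die "sigprocmask failed: $!";

kill 'USR1', $$;                # the signal is now held back by the mask

my $pending = POSIX::SigSet->new;
defined POSIX::sigpending($pending) or die "sigpending failed: $!";
my $was_pending = $pending->ismember(&POSIX::SIGUSR1);
my $got_while_blocked = $got;

# Restore the old mask; the pending signal is delivered
defined POSIX::sigprocmask(&POSIX::SIG_SETMASK, $old)
    or die "sigprocmask failed: $!";

print "pending while blocked: ", ($was_pending ? "yes" : "no"), "\n";
print "handler calls before/after unblocking: $got_while_blocked/$got\n";
```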

Stat constants

S_IRGRP S_IROTH S_IRUSR S_IRWXG S_IRWXO S_IRWXU S_ISGID
S_ISUID S_IWGRP S_IWOTH S_IWUSR S_IXGRP S_IXOTH S_IXUSR

Stat macros

S_ISBLK S_ISCHR S_ISDIR S_ISFIFO S_ISREG
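
The stat constants and macros are applied to the mode field returned by stat. As a brief sketch, assuming a UNIX-style root directory and the :sys_stat_h import tag:

```perl
use strict;
use POSIX qw(:sys_stat_h);

my $mode = (stat "/")[2];    # third field of stat is the file mode
defined $mode or die "cannot stat /: $!";

my $is_dir  = S_ISDIR($mode) ? 1 : 0;
my $is_file = S_ISREG($mode) ? 1 : 0;
print "/ is a directory\n"    if $is_dir;
print "/ is a regular file\n" if $is_file;

# Mask off everything except the permission bits
my $perms = $mode & (S_IRWXU | S_IRWXG | S_IRWXO);
printf "permission bits: %04o\n", $perms;
print "owner may search the directory\n" if $mode & S_IXUSR;
```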

Stdlib constants

EXIT_FAILURE EXIT_SUCCESS MB_CUR_MAX RAND_MAX

Stdio constants

BUFSIZ EOF FILENAME_MAX L_ctermid L_cuserid L_tmpnam TMP_MAX

Time constants

CLK_TCK CLOCKS_PER_SEC
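
The mktime and strftime entries in Table 7.12 share the same zero-based field conventions as Perl's built-in localtime, so the three can be chained. The sketch below is an illustration under that assumption. It passes -1 as isdst explicitly, which asks the C library to work out daylight saving time itself rather than relying on the default.

```perl
use strict;
use POSIX ();

# sec, min, hour, mday, mon (0 = January), year - 1900, wday, yday, isdst
my $time_t = POSIX::mktime(0, 30, 10, 12, 11, 95, 0, 0, -1);
defined $time_t or die "mktime failed";

# localtime returns the fields in the order strftime expects
my @fields = localtime($time_t);
my $stamp  = POSIX::strftime("%Y-%m-%d %H:%M", @fields);
my $month  = $fields[4] + 1;    # convert the zero-based month for display

print "calendar time: $time_t\n";
print "formatted:     $stamp (month $month)\n";
```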

Unistd constants

R_OK SEEK_CUR SEEK_END SEEK_SET STDIN_FILENO
STDOUT_FILENO STDERR_FILENO W_OK X_OK

Wait constants

WNOHANG WUNTRACED

Wait macros

WIFEXITED WEXITSTATUS WIFSIGNALED WTERMSIG WIFSTOPPED WSTOPSIG
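
The wait macros decode the status left in $? by wait or waitpid. A minimal sketch follows, assuming a system with fork and assuming that $? holds the native wait status (true on typical UNIX systems). The exit code 3 and the variable names are arbitrary choices for the example.

```perl
use strict;
use POSIX qw(:sys_wait_h);

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    POSIX::_exit(3);    # child: leave at once with a recognizable status
}

# Parent: block until that particular child has finished
my $reaped = waitpid($pid, 0);
my $status = $?;

my $summary;
if (WIFEXITED($status)) {
    $summary = "exited with status " . WEXITSTATUS($status);
}
elsif (WIFSIGNALED($status)) {
    $summary = "killed by signal " . WTERMSIG($status);
}
else {
    $summary = "stopped or otherwise changed state";
}
print "child $reaped $summary\n";
```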

Pod::Text--Convert POD Data to Formatted ASCII Text

use Pod::Text;
pod2text("perlfunc.pod", *filehandle);  # send formatted output to file
$text = pod2text("perlfunc.pod");       # assign formatted output to $text

Pod::Text converts documentation in the POD format (such as can be found throughout the Perl distribution) into formatted ASCII text. Termcap is optionally supported for boldface/underline, and can be enabled with:

$Pod::Text::termcap=1

If termcap is not enabled, backspaces are used to simulate bold and underlined text.

The pod2text() subroutine can take one or two arguments. The first is the name of a file to read the POD from, or "<&STDIN" to read from STDIN. The second argument, if provided, is a filehandle glob where output should be sent. (Use *STDOUT to write to STDOUT.)

A separate pod2text program is included as part of the standard Perl distribution. Primarily a wrapper for Pod::Text, it can be invoked this way:

pod2text < input.pod

Safe--Create Safe Namespaces for Evaluating Perl Code

use Safe;
$cpt = new Safe;  # create a new safe compartment

The Safe extension module allows the creation of compartments in which untrusted Perl code can be evaluated. Each compartment provides a new namespace and has an associated operator mask.

The root of the namespace (that is, main::) is changed to a different package, and code evaluated in the compartment cannot refer to variables outside this namespace, even with run-time glob lookups and other tricks. Code that is compiled outside the compartment can choose to place variables into (or share variables with) the compartment's namespace, and only that data will be visible to code evaluated in the compartment.

By default, the only variables shared with compartments are the underscore variables $_ and @_ (and, technically, the much less frequently used %_, the _ filehandle and so on). This is because otherwise Perl operators that default to $_ would not work and neither would the assignment of arguments to @_ on subroutine entry.

Each compartment has an associated operator mask with which you can exclude particular Perl operators from the compartment. (The mask syntax is explained below.) Recall that Perl code is compiled into an internal format before execution. Evaluating Perl code (for example, via eval STRING or do FILE) causes the code to be compiled into an internal format and then, provided there was no error in the compilation, executed. Code evaluated in a compartment is compiled subject to the compartment's operator mask. Attempting to evaluate compartmentalized code that contains a masked operator will cause the compilation to fail with an error. The code will not be executed.

By default, the operator mask for a newly created compartment masks out all operations that give access to the system in some sense. This includes masking off operators such as system, open, chown, and shmget, but operators such as print, sysread, and <FILEHANDLE> are not masked off. These file operators are allowed since, in order for the code in the compartment to have access to a filehandle, the code outside the compartment must have explicitly placed the filehandle variable inside the compartment.

Since it is only at the compilation stage that the operator mask applies, controlled access to potentially unsafe operations can be achieved by having a handle to a wrapper subroutine (written outside the compartment) placed into the compartment. For example:

$cpt = new Safe;
sub wrapper {
    # vet arguments and perform potentially unsafe operations
}
$cpt->share('&wrapper');  # see share method below
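To make the wrapper idea concrete, here is a small sketch. The wrapper name env_lookup, the allow-list, and the SAFE_DEMO environment variable are all hypothetical. The point is only that the vetting code is compiled outside the compartment, so the operator mask does not apply to it.

```perl
use strict;
use warnings;
use Safe;

my $cpt = Safe->new;

# Hypothetical wrapper, compiled outside the compartment. It vets its
# argument before reading the real %ENV, which compartment code cannot
# reach directly.
my %allowed = (SAFE_DEMO => 1);
sub env_lookup {
    my ($name) = @_;
    return undef unless defined $name && $allowed{$name};
    return $ENV{$name};
}

$ENV{SAFE_DEMO} = 'visible';
$cpt->share('&env_lookup');

# Code inside the compartment can call the wrapper, but only vetted
# requests are honored.
my $granted = $cpt->reval(q{ env_lookup('SAFE_DEMO') });
my $refused = $cpt->reval(q{ env_lookup('HOME') });

print "granted: $granted\n";
print "refused: ", (defined $refused ? $refused : 'undef'), "\n";
```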

An operator mask exists at user-level as a string of bytes of length MAXO, each of which is either 0x00 or 0x01. Here, MAXO is the number of operators in the current version of Perl. The subroutine MAXO (available for export by package Safe) returns the number of operators in the currently running Perl executable. The presence of a 0x01 byte at offset n of the string indicates that operator number n should be masked (that is, disallowed). The Safe extension makes available routines for converting from operator names to operator numbers (and vice versa) and for converting from a list of operator names to the corresponding mask (and vice versa).
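The layout of such a mask can be illustrated with ordinary string operations. This sketch does not call Safe at all. The operator count of 10 and the operator numbers 3 and 7 are invented stand-ins for the real MAXO and real opcodes.

```perl
use strict;
use warnings;

my $maxo = 10;               # hypothetical operator count, not the real MAXO
my $mask = "\0" x $maxo;     # start with every operator allowed

# Mask (disallow) the operators numbered 3 and 7.
substr($mask, $_, 1) = "\001" for (3, 7);

# Recover the list of masked operator numbers from the string.
my @masked = grep { substr($mask, $_, 1) eq "\001" } 0 .. $maxo - 1;
print "masked operators: @masked\n";
```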

Methods in class Safe

To create a new compartment, use:

$cpt = new Safe NAMESPACE, MASK;

where NAMESPACE is the root namespace to use for the compartment (defaults to Safe::Root000000000, auto-incremented for each new compartment). MASK is the operator mask to use. Both arguments are optional.

The following methods can then be used on the compartment object returned by the above constructor. The object argument is implicit in each case.

root(NAMESPACE)

A get-or-set method for the compartment's namespace. With the NAMESPACE argument present, it sets the root namespace for the compartment. With no NAMESPACE argument present, it returns the current root namespace of the compartment.

mask(MASK)

A get-or-set method for the compartment's operator mask. With the MASK argument present, it sets the operator mask for the compartment. With no MASK argument present, it returns the current operator mask of the compartment.

trap(OP, ...)

Sets bits in the compartment's operator mask corresponding to each operator named in the list of arguments. Each OP can be either the name of an operation or its number. See opcode.h or opcode.pl in the main Perl distribution for a canonical list of operator names.

untrap(OP, ...)

Resets bits in the compartment's operator mask corresponding to each operator named in the list of arguments. Each OP can be either the name of an operation or its number. See opcode.h or opcode.pl in the main Perl distribution for a canonical list of operator names.

share(VARNAME, ...)

Shares the variables in the argument list with the compartment. Each VARNAME must be a string containing the name of a variable with a leading type identifier included. Examples of legal variable names are $foo for a scalar, @foo for an array, %foo for a hash, &foo for a subroutine and *foo for a typeglob. (A typeglob results in the sharing of all symbol table entries associated with foo, including scalar, array, hash, subroutine, and filehandle.)

varglob(VARNAME)

Returns a typeglob for the symbol table entry of VARNAME in the package of the compartment. VARNAME must be the name of a variable without any leading type marker. For example:

$cpt = new Safe 'Root';
$Root::foo = "Hello world";
# Equivalent version which doesn't need to know $cpt's package name:
${$cpt->varglob('foo')} = "Hello world";

reval(STRING)

Evaluates STRING as Perl code inside the compartment. The code can see only the compartment's namespace (as returned by the root() method). Any attempt by code in STRING to use an operator that is in the compartment's mask causes an error (at run-time of the main program, but at compile-time for the code in STRING). If the code in STRING includes an eval (and the eval operator is permitted), then the error can occur at run-time for STRING (although it is at compile-time for the eval within STRING). The error is of the form "%s trapped by operation mask ...". If an operation is trapped in this way, the code in STRING is not executed. If such a trapped operation occurs, or if any other compile-time or return error occurs, $@ is set to the error message, just as with an eval. If there is no error, the method returns the value of the last expression evaluated; a return statement may also be used, just as with subroutines and eval.

rdo(FILENAME)

Evaluates the contents of file FILENAME inside the compartment. See the reval() method earlier for further details.
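The following sketch pulls share() and reval() together. The variable names are invented for the example. It relies only on the default operator mask, which (as described above) traps system.

```perl
use strict;
use warnings;
use Safe;

our $greeting = 'hello';
our @numbers  = (1, 2, 3);
our $secret   = 'not shared';

my $cpt = Safe->new;
$cpt->share('$greeting', '@numbers');   # only these become visible inside

# Shared data can be read from inside the compartment.
my $seen = $cpt->reval(q{ "$greeting: $numbers[2]" });

# An unshared name refers to a separate, empty variable in the
# compartment's own namespace.
my $hidden = $cpt->reval(q{ defined($secret) ? 1 : 0 });

# A masked operator makes compilation fail; $@ holds the message.
my $blocked = $cpt->reval(q{ system("echo hello") });
my $error   = $@;

print "seen:   $seen\n";
print "hidden: $hidden\n";
print "error:  $error";
```

Copying $@ immediately after the failing reval() matters, since a later successful evaluation resets it.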

Subroutines in package Safe

The Safe package contains subroutines for manipulating operator names and operator masks. All are available for export by the package. The canonical list of operator names is contained in the array op_name defined and initialized in file opcode.h of the Perl source distribution.

ops_to_mask(OP, ...)

Takes a list of operator names and returns an operator mask with precisely those operators masked.

mask_to_ops(MASK)

Takes an operator mask and returns a list of operator names corresponding to those operators which are masked in MASK.

opcode(OP, ...)

Takes a list of operator names and returns the corresponding list of opcodes (which can then be used as byte offsets into a mask).

opname(OP, ...)

Takes a list of opcodes and returns the corresponding list of operator names.

fullmask

Returns a mask with all operators masked. It returns the string "\001" x MAXO().

emptymask

Returns a mask with all operators unmasked. It returns the string "\0" x MAXO(). This is useful if you want a compartment to make use of the namespace protection features but do not want the default restrictive mask.

MAXO

This returns the number of operators (hence the length of an operator mask).

op_mask

This returns the operator mask that is actually in effect at the time the invocation to the subroutine is compiled. This is probably not terribly useful.

SDBM_File--Tied Access to SDBM Files

use Fcntl;
use SDBM_File;
tie(%hash, 'SDBM_File', 'Op.dbmx', O_RDWR|O_CREAT, 0644);
# read/writes of %hash are now read/writes of the file, Op.dbmx.pag
untie %hash;

See Perl's built-in tie function. Also see the DB_File module in this chapter for a description of a closely related module.
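A round trip through an SDBM file might look like the sketch below. The file location and the key are hypothetical, and a temporary directory is used so that nothing is left behind.

```perl
use strict;
use warnings;
use Fcntl;                       # for the O_* values
use SDBM_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $base = "$dir/demo";          # SDBM adds .pag and .dir to this name

# Store a value through the tied hash, then release the file.
my %hash;
tie(%hash, 'SDBM_File', $base, O_RDWR|O_CREAT, 0644)
    or die "can't tie $base: $!";
$hash{color} = 'blue';
untie %hash;

# Tie a second hash read-only and fetch the value back from disk.
my %again;
tie(%again, 'SDBM_File', $base, O_RDONLY, 0644)
    or die "can't re-tie $base: $!";
my $color = $again{color};
untie %again;

print "color: $color\n";
```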

Search::Dict--Search for Key in Dictionary File

use Search::Dict;
look *FILEHANDLE, $key, $dict, $fold;

The look() routine sets the file position in FILEHANDLE to be the first line greater than or equal (stringwise) to $key. It returns the new file position, or -1 if an error occurs.

If $dict is true, the search is in dictionary order (ignoring everything but word characters and whitespace). If $fold is true, then case is ignored. The file must be sorted into the appropriate order, using the -d and -f flags of UNIX sort(1), or the equivalent command on non-UNIX machines. Otherwise, the results are unpredictable.
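As a rough illustration, the sketch below builds a small sorted word list (the words and the file name are invented) and positions the filehandle at the first line that is stringwise greater than or equal to the key.

```perl
use strict;
use warnings;
use Search::Dict;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/words";          # hypothetical dictionary file

# The file must already be in sorted order.
open(my $out, '>', $file) or die "can't write $file: $!";
print $out "$_\n" for qw(apple banana cherry date);
close($out) or die "can't close $file: $!";

open(DICT, '<', $file) or die "can't read $file: $!";
my $pos  = look(*DICT, 'c', 0, 0);   # plain string order, case-sensitive
my $line = <DICT>;                   # the next read starts at that position
close(DICT);
chomp $line;

print "first line >= 'c' is '$line' at byte offset $pos\n";
```

Since no line is exactly "c", the search stops at the first line that sorts after it.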

SelectSaver--Save and Restore Selected Filehandle

use SelectSaver;
select $fh_old;
{
    my $saver = new SelectSaver($fh_new); # selects $fh_new
}
# block ends; object pointed to by "my" $saver is destroyed
# previous handle, $fh_old is now selected
# alternative invocation, without filehandle argument
my $saver = new SelectSaver; # selected filehandle remains unchanged

A SelectSaver object contains a reference to the filehandle that was selected when the object was created. If its new() method is given a filehandle as an argument, then that filehandle is selected; otherwise, the selected filehandle remains unchanged.

When a SelectSaver object is destroyed, the filehandle that was selected immediately prior to the object's creation is re-selected.
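The scoping behavior can be checked with a short sketch like this one. The log file name is hypothetical. Calling select with no arguments simply reports which handle is currently selected.

```perl
use strict;
use warnings;
use SelectSaver;
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);
my $log_file = "$dir/demo.log";      # hypothetical log file
open(my $log, '>', $log_file) or die "can't write $log_file: $!";

my $before = select;                 # name of the currently selected handle
{
    my $saver = SelectSaver->new($log);   # selects $log
    print "written while the log handle is selected\n";
}                                    # $saver destroyed: old handle re-selected
my $after = select;

print "back on the original handle\n";
close($log) or die "can't close $log_file: $!";

# Read the log back to see what the bare print wrote.
open(my $in, '<', $log_file) or die "can't read $log_file: $!";
my $logged = do { local $/; <$in> };
close($in);
```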

SelfLoader--Load Functions Only on Demand

package GoodStuff;
use SelfLoader;
[initializing code]
__DATA__
sub {...};

This module is used for delayed loading of Perl functions that (unlike AutoLoader functions) are packaged within your script file. This gives the appearance of faster loading.

In the example above, SelfLoader tells its user (GoodStuff) that functions in the GoodStuff package are to be autoloaded from after the __DATA__ token.

The __DATA__ token tells Perl that the code for compilation is finished. Everything after the __DATA__ token is available for reading via the filehandle GoodStuff::DATA, where GoodStuff is the name of the current package when the __DATA__ token is reached. This token works just the same as __END__ does in package main, except that data after __END__ is retrievable only in package main, whereas data after __DATA__ is retrievable in whatever the current package is.

Note that it is possible to have __DATA__ tokens in the same package in multiple files, and that the last __DATA__ token in a given package that is encountered by the compiler is the one accessible by the filehandle. That is, whenever the __DATA__ token is parsed, any DATA filehandle previously open in the current package (opened in a different file, presumably) is closed so that the new one can be opened. (This also applies to __END__ and the main::DATA filehandle: main::DATA is reopened whenever __END__ is encountered, so any former association is lost.)

SelfLoader autoloading

The SelfLoader will read from the GoodStuff::DATA filehandle to get definitions for functions placed after __DATA__, and then eval the requested subroutine the first time it's called. The costs are the one-time parsing of the data after __DATA__, and a load delay for the first call of any autoloaded function. The benefits are a faster compilation phase, with no need to load functions that are never used.

You can use __END__ after __DATA__. The SelfLoader will stop reading from DATA if it encounters the __END__ token, just as you might expect. If the __END__ token is present, and is followed by the token DATA, then the SelfLoader leaves the GoodStuff::DATA filehandle open on the line after that token.

The SelfLoader exports the AUTOLOAD subroutine to the package using the SelfLoader, and this triggers the automatic loading of an undefined subroutine out of its DATA portion the first time that subroutine is called.
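One way to watch the deferred loading happen is sketched below. The example writes a small GoodStuff module to a temporary directory (the greet subroutine and the file location are made up for illustration), loads it, and checks whether the subroutine has been compiled before and after its first call.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a small, hypothetical module to a temporary library directory.
my $lib = tempdir(CLEANUP => 1);
open(my $pm, '>', "$lib/GoodStuff.pm") or die "can't write module: $!";
print $pm <<'END_MODULE';
package GoodStuff;
use SelfLoader;
1;
__DATA__
sub greet {
    my ($who) = @_;
    return "hello, $who";
}
END_MODULE
close($pm) or die "can't close module: $!";

unshift @INC, $lib;
require GoodStuff;

my $compiled_before = defined(&GoodStuff::greet) ? 1 : 0;   # not compiled yet
my $greeting        = GoodStuff::greet('world');            # AUTOLOAD compiles it
my $compiled_after  = defined(&GoodStuff::greet) ? 1 : 0;

print "$greeting (compiled before: $compiled_before, after: $compiled_after)\n";
```

The module text is kept in a here-document so that the __DATA__ token belongs to GoodStuff.pm rather than to the script that writes it.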

There is no advantage to putting a subroutine after the __DATA__ token if it will always be called.

Autoloading and file-scoped lexicals

A my $pack_lexical statement makes the variable $pack_lexical visible only up to the __DATA__ token. That means that subroutines declared elsewhere cannot see lexical variables. Specifically, autoloaded functions cannot see such lexicals (this applies to both the SelfLoader and the AutoLoader). The use vars pragma (see later in this chapter) provides a way to declare package-level globals that will be visible to autoloaded routines.

SelfLoader and AutoLoader

The SelfLoader can replace the AutoLoader--just change use AutoLoader to use SelfLoader[5] and the __END__ token to __DATA__.

[5] Be aware, however, that the SelfLoader exports an AUTOLOAD function into your package. But if you have your own AUTOLOAD and are using the AutoLoader too, you probably know what you're doing.

There is no need to inherit from the SelfLoader.

The SelfLoader works similarly to the AutoLoader, but picks up the subroutine definitions from after the __DATA__ instead of in the lib/auto/ directory. SelfLoader needs less maintenance at the time the module is installed, since there's no need to run AutoSplit. And it can run faster at load time because it doesn't need to keep opening and closing files to load subroutines. On the other hand, it can run slower because it needs to parse the code after the __DATA__. Details of the AutoLoader and another view of these distinctions can be found in that module's documentation.

How to read DATA from your Perl program

(This section is only relevant if you want to use the GoodStuff::DATA together with the SelfLoader.)

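The SelfLoader reads from wherever the current position of the GoodStuff::DATA filehandle is, until EOF or the __END__ token. This means that if you want to use that filehandle (and only if you want to), you should either:

Put all your subroutine declarations immediately after the __DATA__ token and put your own data after those declarations, using the __END__ token to mark the end of subroutine declarations. You must also ensure that the SelfLoader reads first, by calling SelfLoader->load_stubs(); or by using a function that is selfloaded; or

Read the GoodStuff::DATA filehandle first, leaving the handle open and positioned at the first line of subroutine declarations.

You could even conceivably do both.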

Classes and inherited methods

This section is only relevant if your module is a class, and has methods that could be inherited.

A subroutine stub (or forward declaration) looks like:

sub stub;

That is, it is a subroutine declaration without the body of the subroutine. For modules that aren't classes, there is no real need for stubs as far as autoloading is concerned.

For modules that are classes, and need to handle inherited methods, stubs are needed to ensure that the method inheritance mechanism works properly. You can load the stubs into the module at require time, by adding the statement SelfLoader->load_stubs(); to the module to do this.

The alternative is to put the stubs in before the __DATA__ token before releasing the module, and for this purpose the Devel::SelfStubber module is available. However, this does require the extra step of ensuring that the stubs are in the module. If you do this, we strongly recommend that you do it before releasing the module and not at install time.

Multiple packages and fully qualified subroutine names

Subroutines in multiple packages within the same file are supported--but you should note that this requires exporting SelfLoader::AUTOLOAD to every package which requires it. This is done automatically by the SelfLoader when it first loads the subs into the cache, but you should really specify it in the initialization before the __DATA__ by putting a use SelfLoader statement in each package.

Fully qualified subroutine names are also supported. For example:

__DATA__
sub foo::bar {23}
package baz;
sub dob {32}

will all be loaded correctly by the SelfLoader, and the SelfLoader will ensure that the packages "foo" and "baz" correctly have the SelfLoader::AUTOLOAD method when the data after __DATA__ is first parsed.

See the discussion of autoloading in Chapter 5, Packages, Modules, and Object Classes. Also see the AutoLoader module, a utility that handles modules that have been split into a collection of files for autoloading.

Shell--Run Shell Commands Transparently Within Perl

use Shell qw(date cp ps);  # list shell commands you want to use
$date = date();   # put the output of the date(1) command into $date
cp("-p", "/etc/passwd", "/tmp/passwd");  # copy password file to a tmp file
print ps("-ww");  # print the results of a "ps -ww" command

This module allows you to invoke UNIX utilities accessible from the shell command line as if they were Perl subroutines. Arguments (including switches) are passed to the utilities as strings.

The Shell module essentially duplicates the built-in backtick functionality of Perl. The module was written so that its implementation could serve as a demonstration of autoloading. It also shows how function calls can be mapped to subprocesses.

sigtrap--Enable Stack Backtrace on Unexpected Signals

use sigtrap;       # initialize default signal handlers
use sigtrap LIST;  # LIST example:  qw(BUS SEGV PIPE SYS ABRT TRAP)

The sigtrap pragma initializes a signal handler for the signals specified in LIST, or (if no list is given) for a set of default signals. The signal handler prints a stack dump of the program and then issues a (non-trapped) ABRT signal.

In the absence of LIST, the signal handler is set up to deal with the ABRT, BUS, EMT, FPE, ILL, PIPE, QUIT, SEGV, SYS, TERM, and TRAP signals.

Socket--Load the C socket.h Defines and Structure Manipulators

use Socket;
$proto = getprotobyname('udp');
socket(Socket_Handle, PF_INET, SOCK_DGRAM, $proto);
$iaddr = gethostbyname('hishost.com');
$port = getservbyname('time', 'udp');
$sin = sockaddr_in($port, $iaddr);
send(Socket_Handle, 0, 0, $sin);
$proto = getprotobyname('tcp');
socket(Socket_Handle, PF_INET, SOCK_STREAM, $proto);
$port = getservbyname('smtp', 'tcp');
$sin = sockaddr_in($port, inet_aton("127.1"));
$sin = sockaddr_in(7, inet_aton("localhost"));
$sin = sockaddr_in(7, INADDR_LOOPBACK);
connect(Socket_Handle, $sin);
($port, $iaddr) = sockaddr_in(getpeername(Socket_Handle));
$peer_host = gethostbyaddr($iaddr, AF_INET);
$peer_addr = inet_ntoa($iaddr);
socket(Socket_Handle, PF_UNIX, SOCK_STREAM, 0);
unlink('/tmp/usock');
$sun = sockaddr_un('/tmp/usock');
bind(Socket_Handle, $sun);

This module is just a translation of the C socket.h file. Unlike the old mechanism of requiring a translated socket.ph file, this uses the h2xs program (see the Perl source distribution) and your native C compiler. This means that it has a far better chance of getting the numbers right. The module includes all of the commonly used preprocessor-defined constants, such as AF_INET, SOCK_STREAM, and so on.

In addition, some structure manipulation functions are available:

inet_aton HOSTNAME

Takes a string giving the name of a host, and translates that to a four-byte, packed string (structure). Takes arguments of both the rtfm.mit.edu and 18.181.0.24 types. If the host name cannot be resolved, returns the undefined value.

inet_ntoa IP_ADDRESS

Takes a four-byte IP address (as returned by inet_aton()) and translates it into a string of the form d.d.d.d where the ds are numbers less than 256 (the normal, readable, dotted-quad notation for Internet addresses).

INADDR_ANY

Note: This function does not return a number, but a packed string. Returns the four-byte wildcard IP address that specifies any of the host's IP addresses. (A particular machine can have more than one IP address, each address corresponding to a particular network interface. This wildcard address allows you to bind to all of them simultaneously.) Normally equivalent to inet_aton('0.0.0.0').

INADDR_LOOPBACK

Note: does not return a number, but a packed string. Returns the four-byte loopback address. Normally equivalent to inet_aton('localhost').

INADDR_NONE

Note: does not return a number, but a packed string. Returns the four-byte invalid IP address. Normally equivalent to inet_aton('255.255.255.255').

sockaddr_in PORT, ADDRESS

sockaddr_in SOCKADDR_IN

In a list context, unpacks its SOCKADDR_IN argument and returns a list consisting of (PORT, ADDRESS). In a scalar context, packs its (PORT, ADDRESS) arguments as a SOCKADDR_IN and returns it. If this is confusing, use pack_sockaddr_in() and unpack_sockaddr_in() explicitly.

pack_sockaddr_in PORT, IP_ADDRESS

Takes two arguments, a port number and a four-byte IP_ADDRESS (as returned by inet_aton()). Returns the sockaddr_in structure with those arguments packed in with AF_INET filled in. For Internet domain sockets, this structure is normally what you need for the arguments in bind, connect, and send, and is also returned by getpeername, getsockname, and recv.

unpack_sockaddr_in SOCKADDR_IN

Takes a sockaddr_in structure (as returned by pack_sockaddr_in()) and returns a list of two elements: the port and the four-byte IP address. This function will croak if the structure does not have AF_INET in the right place.

sockaddr_un PATHNAME

sockaddr_un SOCKADDR_UN

In a list context, it unpacks its SOCKADDR_UN argument and returns a list consisting of (PATHNAME). In a scalar context, it packs its PATHNAME argument as a SOCKADDR_UN and returns it. If this is confusing, use pack_sockaddr_un() and unpack_sockaddr_un() explicitly. These functions are only supported if your system has <sys/un.h>.

pack_sockaddr_un PATH

Takes one argument, a pathname. Returns the sockaddr_un structure with that path packed in with AF_UNIX filled in. For UNIX domain sockets, this structure is normally what you need for the arguments in bind, connect, and send, and is also returned by getpeername, getsockname and recv.

unpack_sockaddr_un SOCKADDR_UN

Takes a sockaddr_un structure (as returned by pack_sockaddr_un()) and returns the pathname. Will croak if the structure does not have AF_UNIX in the right place.
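The Internet-domain functions above can be exercised without opening a socket, as in this sketch. The port number 7 and the loopback address are arbitrary choices for the example.

```perl
use strict;
use warnings;
use Socket;

# Text address to four-byte packed string and back again.
my $packed = inet_aton('127.0.0.1');
my $dotted = inet_ntoa($packed);

# The INADDR_* functions return packed strings, not numbers.
my $loop = inet_ntoa(INADDR_LOOPBACK);
my $any  = inet_ntoa(INADDR_ANY);

# Explicit packing and unpacking of a sockaddr_in structure.
my $sin = pack_sockaddr_in(7, $packed);
my ($port, $iaddr) = unpack_sockaddr_in($sin);

# sockaddr_in() does either job, depending on context.
my $sin_again         = sockaddr_in(7, $packed);   # scalar context: pack
my ($port2, $iaddr2)  = sockaddr_in($sin);         # list context: unpack

print "address $dotted, loopback $loop, wildcard $any\n";
print "port $port at ", inet_ntoa($iaddr), "\n";
```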

strict--Restrict Unsafe Constructs

use strict;        # apply all possible restrictions
use strict 'vars'; # restrict unsafe use of variables for rest of block
use strict 'refs'; # restrict unsafe use of references for rest of block
use strict 'subs'; # restrict unsafe use of barewords for rest of block
no strict 'vars';  # relax restrictions on variables for rest of block
no strict 'refs';  # relax restrictions on references for rest of block
no strict 'subs';  # relax restrictions on barewords for rest of block

If no import list is given to use strict, all possible restrictions upon unsafe Perl constructs are imposed. (This is the safest mode to operate in, but is sometimes too strict for casual programming.) Currently, there are three possible things to be strict about: refs, vars, and subs.

In all cases the restrictions apply only until the end of the immediately enclosing block.

strict 'refs'

This generates a run-time error if you use symbolic references.

use strict 'refs';
$ref = \$foo;
print $$ref;        # ok
$ref = "foo";
print $$ref;        # run-time error; normally ok

strict 'vars'

This generates a compile-time error if you access a variable that wasn't declared via my, or fully qualified, or imported.

use strict 'vars';
use vars '$foe';
$SomePack::fee = 1;  # ok, fully qualified
my $fie = 10;        # ok, my() var
$foe = 7;            # ok, pseudo-imported by 'use vars'
$foo = 9;            # blows up--did you mistype $foe maybe?

The last line generates a compile-time error because you're touching a global name without fully qualifying it. Since the purpose of this pragma is to encourage use of my variables, using local on a variable isn't good enough to declare it. You can, however, use local on a variable that you declared with use vars.

strict 'subs'

This generates a compile-time error if you try to use a bareword identifier that's not a predeclared subroutine.

use strict 'subs';
$SIG{PIPE} = Plumber;     # blows up (assuming Plumber sub not declared yet)
$SIG{PIPE} = "Plumber";   # okay, means "main::Plumber" really
$SIG{PIPE} = \&Plumber;   # preferred form

The no strict 'vars' statement negates any preceding use strict 'vars' for the remainder of the innermost enclosing block. Likewise, no strict 'refs' negates any preceding invocation of use strict 'refs', and no strict 'subs' negates use strict 'subs'.
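The block scoping can be observed directly with strict 'refs', since its violations are run-time errors that eval can trap. In this sketch the variable names are arbitrary.

```perl
use strict;
use warnings;

our $foo  = 'value';
my  $name = 'foo';

# Under strict 'refs', using a string as a reference dies at run time.
my $strict_result = eval { $$name };
my $strict_error  = $@;

my $relaxed_result;
{
    no strict 'refs';             # relaxed only inside this block
    $relaxed_result = $$name;     # symbolic reference to $main::foo
}

# Outside the block the restriction is back in force.
my $after_result = eval { $$name };
my $after_error  = $@;

print "inside the relaxed block: $relaxed_result\n";
print "outside it: $after_error";
```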

The arguments to use strict are sometimes given as barewords--that is, without surrounding quotes. Be aware, however, that the following sequence will not work:

use strict;      # or just:  use strict subs;
...
no strict subs;  # WRONG!  Should be:  no strict 'subs';
...

The problem here is that giving subs as a bareword is no longer allowed after the use strict statement. :-)

subs--Predeclare Subroutine Names

use subs qw(sub1 sub2 sub3);
sub1 $arg1, $arg2;

This predeclares the subroutines whose names are in the list, allowing you to use them without parentheses even before they're defined. It has the additional benefit of allowing you to override built-in functions, since you may only override built-ins via an import, and this pragma does a pseudo-import.
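For instance, in the following sketch (the subroutine name shout is made up), the call compiles as a list operator even though the definition appears later in the file:

```perl
use strict;
use warnings;
use subs qw(shout);

# Called before its definition and without parentheses.
my $result = shout 'hello';
print "$result\n";

sub shout {
    my ($text) = @_;
    return uc($text) . '!';
}
```

Without the use subs line, the bareword call would fail to compile under use strict.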

See also the vars module.

Symbol--Generate Anonymous Globs; Qualify Variable Names

use Symbol;
$sym = gensym;
open($sym, "filename");
$_ = <$sym>;
ungensym $sym;      # no effect
print qualify("x");              # "main::x"
print qualify("x", "FOO");       # "FOO::x"
print qualify("BAR::x");         # "BAR::x"
print qualify("BAR::x", "FOO");  # "BAR::x"
print qualify("STDOUT", "FOO");  # "main::STDOUT" (global)
print qualify(\*x);              # \*x--for example: GLOB(0x99530)
print qualify(\*x, "FOO");       # \*x--for example: GLOB(0x99530)

gensym() creates an anonymous glob and returns a reference to it. Such a glob reference can be used as a filehandle or directory handle.

For backward compatibility with older implementations that didn't support anonymous globs, ungensym() is also provided. But it doesn't do anything.

qualify() turns unqualified symbol names into qualified variable names (for example, myvar becomes MyPackage::myvar). If it is given a second parameter, qualify() uses it as the default package; otherwise, it uses the package of its caller. Regardless, global variable names (for example, STDOUT, %ENV, %SIG) are always qualified with main::.

Qualification applies only to symbol names (strings). References are left unchanged under the assumption that they are glob references, which are qualified by their nature.
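A typical use of gensym() is to open several files in a loop without inventing a filehandle name for each. The sketch below uses made-up file names and also collects a few qualify() results for comparison with the list above.

```perl
use strict;
use warnings;
use Symbol;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# One anonymous glob per file; the file names are hypothetical.
my %handle;
for my $name (qw(first second)) {
    my $sym = gensym;
    open($sym, '>', "$dir/$name.txt") or die "can't write $name.txt: $!";
    $handle{$name} = $sym;
}

print { $handle{first} }  "written through an anonymous glob\n";
print { $handle{second} } "so was this\n";
close($_) or die "close failed: $!" for values %handle;

# qualify() called from package main.
my @names = (
    qualify('x'),
    qualify('x', 'FOO'),
    qualify('BAR::x', 'FOO'),
    qualify('STDOUT', 'FOO'),
);
print "@names\n";
```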

Sys::Hostname--Try Every Conceivable Way to Get Hostname

use Sys::Hostname;
$host = hostname();

Attempts several methods of getting the system hostname and then caches the result. It tries syscall(SYS_gethostname), `hostname`, `uname -n`, and the file /com/host. If all that fails, it croak()s.

All nulls, returns, and newlines are removed from the result.
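A brief sketch of the call follows. It assumes only that the machine has a hostname that at least one of the methods above can find.

```perl
use strict;
use warnings;
use Sys::Hostname;

my $host  = hostname();
my $again = hostname();   # answered from the cached result

print "running on $host\n";
```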

Sys::Syslog--Perl Interface to UNIX syslog(3) Calls

use Sys::Syslog;
openlog $ident, $logopt, $facility;
syslog $priority, $format, @args;
$oldmask = setlogmask $mask_priority;
closelog;

Sys::Syslog is an interface to the UNIX syslog(3) library routine. Call syslog() with a string priority and a list of printf arguments, just as with syslog(3). Sys::Syslog needs syslog.ph, which must be created with h2ph by your system administrator.

Sys::Syslog provides these functions:

openlog $ident, $logopt, $facility

$ident is prepended to every message. $logopt contains one or more of the words pid, ndelay, cons, nowait. $facility specifies the part of the system making the log entry.

syslog $priority, $format, @args

If $priority permits (that is, if it passes the current log mask), logs a message formed as if by sprintf($format, @args), with the addition that %m is replaced with "$!" (the latest error message).

setlogmask $mask_priority

Sets log mask to $mask_priority and returns the old mask.

closelog

Closes the log file.

Examples

openlog($program, 'cons, pid', 'user');
syslog('info', 'this is another test');
syslog('mail|warning', 'this is a better test: %d', time);
closelog();

syslog('debug', 'this is the last test');
openlog("$program $$", 'ndelay', 'user');
syslog('notice', 'fooprogram: this is really done');
$! = 55;
syslog('info', 'problem was %m'); # %m == $! in syslog(3)

Term::Cap--Terminal Capabilities Interface

require Term::Cap;
$terminal = Tgetent Term::Cap { TERM => undef, OSPEED => $ospeed };
$terminal->Trequire(qw/ce ku kd/);
$terminal->Tgoto('cm', $col, $row, $FH);
$terminal->Tputs('dl', $count, $FH);

These are low-level functions to extract and use capabilities from a terminal capability (termcap) database. For general information about the use of this database, see the termcap(5) manpage.

The "new" function of Term::Cap is Tgetent(), which extracts the termcap entry for the specified terminal type and returns a reference to a terminal object. If the value associated with the TERM key in the Tgetent() argument list is false or undefined, then it defaults to the environment variable TERM.

Tgetent() looks in the environment for a TERMCAP variable. If it finds one, and if the value does not begin with a slash and looks like a termcap entry in which the terminal type name is the same as the environment string TERM, then the TERMCAP string is used directly as the termcap entry and there is no search for an entry in a termcap file somewhere.

Otherwise, Tgetent() looks in a sequence of files for the termcap entry. The sequence consists of the filename in TERMCAP, if any, followed by either the files listed in the TERMPATH environment variable, if any, or otherwise the files $HOME/.termcap, /etc/termcap, and /usr/share/misc/termcap, in that order. (Filenames in TERMPATH may be separated by either a colon or a space.) Whenever multiple files are searched and a tc field occurs in the requested entry, the entry named in the tc field must be found in the same file or one of the succeeding files. If there is a tc field in the TERMCAP environment variable string, Tgetent() continues searching as indicated above.

OSPEED is the terminal output bit rate (often mistakenly called the baud rate). OSPEED can be specified as either a POSIX termios/SYSV termio speed (where 9600 equals 9600) or an old BSD-style speed (where 13 equals 9600). See the next section, "Getting Terminal Output Speed", for code illustrating how to obtain the output speed.

Tgetent() returns a reference to a blessed object ($terminal in the examples above). The actual termcap entry is available as $terminal->{TERMCAP}. Failure to find an appropriate termcap entry results in a call to Carp::croak().

Once you have invoked Tgetent(), you can manage a terminal by sending control strings to it with Tgoto() and Tputs(). You can also test for the existence of particular terminal capabilities with Trequire().

Trequire() checks to see whether the named capabilities have been specified in the terminal's termcap entry. For example, this line:

$terminal->Trequire(qw/ce ku kd/);

checks whether the ce (clear to end of line), ku (keypad up-arrow), and kd (keypad down-arrow) capabilities have been defined. Any undefined capabilities will result in a listing of those capabilities and a call to Carp::croak().

Tgoto() produces a control string to move the cursor relative to the screen. For example, to move the cursor to the fifth line and forty-fifth column on the screen, you can say:

$row = 5; $col = 45;
$terminal->Tgoto('cm', $col, $row, STDOUT);

The first argument in this call must always be cm. If a filehandle is given as the final argument, then Tgoto() sends the appropriate control string to that handle. With or without a handle, the routine returns the control string, so you could achieve the same effect this way:

$str = $terminal->Tgoto('cm', $col, $row);
print STDOUT $str;

Tgoto() performs the necessary % interpolation on the control strings. (See the termcap(5) manpage for details.)

The Tputs() routine allows you to exercise other terminal capabilities. For example, the following code deletes one line at the cursor's present position, and then turns on the bold text attribute:

$count = 1;
$terminal->Tputs('dl', $count, $FILEHANDLE);  # delete one line
$terminal->Tputs('md', $count, $FILEHANDLE);  # turn on bold attribute

Again, Tputs() returns the terminal control string, and the filehandle can be omitted. The $count for such calls should normally be 1, unless padding is required. (Padding involves the output of "no-op" characters in order to effect a delay required by the terminal device. It is most commonly required for hardcopy devices.) A count greater than 1 is taken to specify the amount of padding. See the termcap(5) manpage for more about padding.

Tputs() does not perform % interpolation. This means that the following will not work:

$terminal->Tputs('DC', 1, $FILEHANDLE);  # delete one character (WRONG!)

If the terminal control string requires numeric parameters, then you must do the interpolation yourself:

$str = $terminal->Tputs('DC', 1);
$str =~ s/%d/7/;
print STDOUT $str;        # delete seven characters

The output strings for Tputs() are cached for counts of 1. Tgoto() does not cache. $terminal->{_xx} is the raw termcap data and $terminal->{xx} is the cached version (where xx is the two-character terminal capability code).

Getting terminal output speed

You can use the POSIX module to get your terminal's output speed for use in the Tgetent() call:

require POSIX;
my $termios = new POSIX::Termios;
$termios->getattr;
my $ospeed = $termios->getospeed;

An alternative method, using ioctl(2), works like this:

require 'ioctl.pl';
ioctl(TTY, $TIOCGETP, $sgtty);
($ispeed, $ospeed) = unpack('cc', $sgtty);

Term::Complete--Word Completion Module

use Term::Complete;
$input = Complete('prompt_string', \@completion_list);
$input = Complete('prompt_string', @completion_list);

The Complete() routine sends the indicated prompt string to the currently selected filehandle, reads the user's response, and places the response in $input. What the user types is read one character at a time, and certain characters result in special processing as follows:

TAB

The tab character causes Complete() to match what the user has typed so far against the list of strings in @completion_list. If the user's partial input uniquely matches one of these strings, then the rest of the matched string is output. However, input is still not finished until the user presses the return key. If the user's partial input does not uniquely match one string in @completion_list when the tab character is pressed, then the partial input remains unchanged and the bell character is output.

CTRL-D

If the user types CTRL-D, the current matches between the user's partial input string and the completion list are printed out. If the partial input string is null, then the entire completion list is printed. In any case, the prompt string is then reissued, along with the partial input. You can substitute a different character for CTRL-D by defining $Term::Complete::complete. For example:

$Term::Complete::complete = "\001";  # use ctrl-a instead of ctrl-d

CTRL-U

Typing CTRL-U erases any partial input. You can substitute a different character for CTRL-U by defining $Term::Complete::kill.

DEL, BS

The delete and backspace characters both erase one character from the partial input string. You can redefine them by assigning a different character value to $Term::Complete::erase1 and $Term::Complete::erase2.

The user is not prevented from providing input that differs from all strings in the completion list, or from adding to input that has been completed from the list. The final input (determined when the user presses the return key) is the string returned by Complete().

The TTY driver is put into raw mode using the system command stty raw -echo and restored using stty -raw echo. When Complete() is called multiple times, it offers the user's immediately previous response as the default response to each prompt.

Test::Harness--Run Perl Standard Test Scripts with Statistics

use Test::Harness;
runtests(@tests);

This module is used by MakeMaker. If you're building a Perl extension and if you have test scripts with filenames matching t/*.t in the extension's subdirectory, then you can run those tests by executing the shell command, make test.

runtests(@tests) runs all test scripts named as arguments and checks standard output for the expected "ok n" strings. (Standard Perl test scripts print "ok n" for each single test, where n is an integer incremented by one each time around.) After all tests have been performed, runtests() prints some performance statistics that are computed by the Benchmark module.

runtests() is exported by Test::Harness by default.

The test script output

The first line output by a standard test script should be 1..m, with m being the number of tests that the test script attempts to run. Any output from the test script to standard error is not examined by runtests(); it is passed through unchanged, and thus will be seen by the user. Lines written to standard output that look like Perl comments (that is, lines matching /^\s*\#/) are discarded. Lines matching /^(not\s+)?ok\b/ are interpreted as feedback for runtests().
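
As an illustration of this output format, here is a minimal sketch of a test script. The filename, the report() helper, and the three checks are hypothetical and chosen only for the example; a real script would test the extension being built.

```perl
#!/usr/bin/perl
# t/basic.t (hypothetical name): a test script in the "ok n" format
use strict;

# Each check is a description plus a code reference returning true on success.
my @checks = (
    [ 'addition',      sub { 1 + 1 == 2 } ],
    [ 'concatenation', sub { ('a' . 'b') eq 'ab' } ],
    [ 'string length', sub { length('perl') == 4 } ],
);

# Build the lines to print: the 1..m plan, then one result line per check.
sub report {
    my @list  = @_;
    my @lines = ('1..' . scalar(@list));
    my $n = 0;
    for my $check (@list) {
        my ($name, $code) = @$check;
        $n++;
        if ($code->()) {
            push @lines, "ok $n";
        }
        else {
            # The comment line is discarded by runtests()
            push @lines, "not ok $n", "# failed: $name";
        }
    }
    return @lines;
}

print "$_\n" for report(@checks);
```

Building the lines in a helper before printing them is only a convenience for the sketch; a script may just as well print each line as it goes.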

The global variable $Test::Harness::verbose is exportable and can be used to let runtests() display the standard output of the script without altering the behavior otherwise.

A script may omit the test numbers after ok. In this case Test::Harness maintains its own counter. So the following script output:

1..6
not ok
ok
not ok
ok
ok

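will generate the following report (test 6 is counted as a failure because the script announced six tests but reported on only five):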

FAILED tests 1, 3, 6
Failed 3/6 tests, 50.00% okay
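
The counting rules just described can be sketched in a few lines. This is not the code Test::Harness itself uses; failed_tests() is a hypothetical helper written only to show how the plan line, the comment lines, and the unnumbered ok lines combine to give the list of failures.

```perl
use strict;

# Hypothetical helper: given the standard output lines of one test script,
# return the numbers of the tests that count as failed.
sub failed_tests {
    my @lines = @_;
    my ($planned, $next, %ok) = (0, 1);
    for my $line (@lines) {
        next if $line =~ /^\s*\#/;             # comment lines are discarded
        if ($line =~ /^1\.\.(\d+)/) {          # the plan line: 1..m
            $planned = $1;
        }
        elsif ($line =~ /^(not\s+)?ok\b\s*(\d+)?/) {
            # Fall back on our own counter when the number is omitted
            my $num = defined $2 ? $2 : $next;
            $ok{$num} = 1 unless $1;
            $next = $num + 1;
        }
    }
    # A planned test with no "ok" line counts as failed
    return grep { !$ok{$_} } 1 .. $planned;
}

my @failed = failed_tests('1..6', 'not ok', 'ok', 'not ok', 'ok', 'ok');
print "FAILED tests ", join(', ', @failed), "\n";    # FAILED tests 1, 3, 6
```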

Diagnostics

All tests successful.\nFiles=%d, Tests=%d, %s

If all tests are successful, some statistics about the performance are printed.

FAILED tests %s\n\tFailed %d/%d tests, %.2f%% okay.

For any single script that has failing subtests, these statistics are printed.

Test returned status %d (wstat %d)

For scripts that return a non-zero exit status, both $?>>8 and $? are printed in a message similar to the above.

Failed 1 test, %.2f%% okay.

Failed %d/%d tests, %.2f%% okay.

If not all tests were successful, the script dies with one of the above messages.

Notes

Test::Harness uses $^X to determine which Perl binary to run the tests with. Test scripts running via the shebang (#!) line may not be portable because $^X is not consistent for shebang scripts across platforms. This is no problem when Test::Harness is run with an absolute path to the Perl binary or when $^X can be found in the path.

Text::Abbrev--Create an Abbreviation Table from a List

use Text::Abbrev;
%hash = ();
abbrev(*hash, LIST);

The abbrev() routine takes each string in LIST and constructs all unambiguous abbreviations (truncations) of the string with respect to the other strings in LIST. Each such truncation (including the null truncation consisting of the entire string) is used as a key in %hash for which the associated value is the non-truncated string.

So, if good is the only string in LIST beginning with g, the following key/value pairs will be created:

g    => good,
go   => good,
goo  => good,
good => good

If, on the other hand, the string go is also in the list, then good yields these key/value pairs:

goo  => good,
good => good

and go yields only:

go => go
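
The second case can be checked with a short script. This is only a sketch: the list of words is the one used in the example above, and the use of the vars module to declare %hash under use strict is our own choice.

```perl
use strict;
use Text::Abbrev;
use vars qw(%hash);

%hash = ();
abbrev(*hash, qw(good go));

# "g" is ambiguous, so it does not appear as a key
for my $key (sort keys %hash) {
    print "$key => $hash{$key}\n";
}
# prints:
#   go => go
#   goo => good
#   good => good
```

A table like this is typically used to let a user type any unambiguous prefix of a command name and have the program look up the full name.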

Text::ParseWords--Parse Text into a List of Tokens

use Text::ParseWords;
@words = quotewords($delim, $keep, @lines);

quotewords() accepts a delimiter (which can be a regular expression) and a list of lines, and then breaks those lines up into a list of delimiter-separated words. It ignores delimiters that appear inside single or double quotes.

The $keep argument is a Boolean flag. If it is false, then quotes are removed from the list of words returned by quotewords(); otherwise, quotes are retained.

The value of $keep also affects the interpretation of backslashes. If $keep is true, then backslashes are fully preserved in the returned list of words. Otherwise, a single backslash disappears and a double backslash is returned as a single backslash. (Be aware, however, that, regardless of the value of $keep, a single backslash occurring within quotes causes a Perl syntax error--presumably a bug.)
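
The following sketch shows the effect of $keep on quotes. The sample line is invented for the example, and it deliberately contains no backslashes.

```perl
use strict;
use Text::ParseWords;

my $line = q{alpha "beta gamma" 'delta epsilon' zeta};

# Split on whitespace; the quoted phrases stay together as single words.
my @plain  = quotewords('\s+', 0, $line);    # quotes removed
my @quoted = quotewords('\s+', 1, $line);    # quotes retained

print join('|', @plain),  "\n";    # alpha|beta gamma|delta epsilon|zeta
print join('|', @quoted), "\n";    # alpha|"beta gamma"|'delta epsilon'|zeta
```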

Text::Soundex--The Soundex Algorithm Described by Knuth

use Text::Soundex;
$code = soundex $string;  # get soundex code for a string
@codes = soundex @list;   # get list of codes for list of strings
# set value to be returned for strings without soundex code
$soundex_nocode = 'Z000';

This module implements the soundex algorithm as described by Donald Knuth in Volume 3 of The Art of Computer Programming. The algorithm is intended to hash words (in particular surnames) into a small space using a simple model that approximates the sound of the word when spoken by an English speaker. Each word is reduced to a four-character string, the first character being an uppercase letter and the remaining three being digits.

If there is no soundex code representation for a string, then the value of $soundex_nocode is returned. This variable is initially set to the undefined value, but many people seem to prefer an unlikely value like Z000. (How unlikely this is depends on the data set being dealt with.) Any value can be assigned to $soundex_nocode.

In a scalar context soundex() returns the soundex code of its first argument, and in a list context it returns a list in which each element is the soundex code for the corresponding argument passed to soundex().

For example:

@codes = soundex qw(Mike Stok);

leaves @codes containing ('M200', 'S320').

Here are Knuth's examples of various names and the soundex codes they map to:

Names                      Code
Euler, Ellery              E460
Gauss, Ghosh               G200
Hilbert, Heilbronn         H416
Knuth, Kant                K530
Lloyd, Ladd                L300
Lukasiewicz, Lissajous     L222

So we have:

$code = soundex 'Knuth';              # $code contains 'K530'
@list = soundex qw(Lloyd Gauss);      # @list contains 'L300', 'G200'

As the soundex algorithm was originally used a long time ago in the United States, it considers only the English alphabet and pronunciation.

As it is mapping a large space (arbitrary-length strings) onto a small space (single letter plus three digits), no inference can be made about the similarity of two strings that end up with the same soundex code. For example, both Hilbert and Heilbronn end up with a soundex code of H416.
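
To make the mapping concrete, here is a pure-Perl sketch of the algorithm. It is a simplified illustration and not necessarily the module's own code; my_soundex() is a hypothetical name, and unlike soundex() it returns undef rather than $soundex_nocode for a string with no letters.

```perl
use strict;

# Hypothetical helper: compute the soundex code for one word.
sub my_soundex {
    my $word = uc shift;
    $word =~ tr/A-Z//cd;                     # keep letters only
    return undef unless length $word;        # no code for an empty string

    my $first = substr($word, 0, 1);
    # Replace every letter by its digit; vowels, H, W, and Y become 0
    (my $digits = $word)
        =~ tr/AEHIOUWYBFPVCGJKQSXZDTLMNR/00000000111122222222334556/;

    my $lead = substr($digits, 0, 1);
    $digits =~ s/^$lead+//;                  # drop the first letter's own code
    $digits =~ tr/0-6//s;                    # squeeze runs of the same digit
    $digits =~ tr/0//d;                      # letters coded 0 carry no digit
    return substr($first . $digits . '000', 0, 4);
}

for my $name (qw(Euler Ellery Knuth Kant Lukasiewicz Lissajous)) {
    print "$name ", my_soundex($name), "\n";
}
```

Note that adjacent letters with the same code are squeezed before the zeros are removed, which is why the doubled L in Lloyd contributes nothing beyond the leading letter.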

Text::Tabs--Expand and Unexpand Tabs

use Text::Tabs;
$tabstop = 8;                            # set tab spacing to 8 (default)
print expand("Hello\tworld");            # convert tabs to spaces in output
print unexpand("Hello,        world");   # convert spaces to tabs in output
$tabstop = 4;                            # set tab spacing to 4
print join("\n", expand(split(/\n/,
                "Hello\tworld, \nit's a nice day.\n")));

This module expands tabs into spaces and "unexpands" spaces into tabs, in the manner of the UNIX expand(1) and unexpand(1) programs. All tabs and spaces--not only leading ones--are subject to being expanded and unexpanded.

Both expand() and unexpand() take as argument an array of strings, which are returned with tabs or spaces transformed. Newlines may not be included in the strings, and should be used to split strings into separate elements before they are passed to expand() and unexpand().

expand(), unexpand(), and $tabstop are imported into your program when you use this module.
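
Here is a small sketch of expand() applied to a multiline string that is first split into lines, as recommended above. The sample text and the square brackets used to make the spacing visible are our own.

```perl
use strict;
use Text::Tabs;

$tabstop = 4;                        # imported from Text::Tabs

my $text  = "a\tb\nabc\td";
my @lines = expand(split /\n/, $text);

# Each tab is replaced by enough spaces to reach the next multiple of 4
print "[$_]\n" for @lines;
# prints:
#   [a   b]
#   [abc d]
```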

Text::Wrap--Wrap Text into a Paragraph

use Text::Wrap;
$Text::Wrap::columns = 20; # default is 76
$pre1 = "\t";              # prepend this to first line of paragraph
$pre2 = "";                # prepend this to subsequent lines
print wrap($pre1, $pre2, "Hello, world, it's a nice day, isn't it?");

This module is a simple paragraph formatter that wraps text into a paragraph and indents each line. The single exported function, wrap(), takes three arguments: a string to prepend to the first output line; a string to prepend to each subsequent output line; and the text to be wrapped.

$columns is exported on request.
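
A short sketch of requesting $columns explicitly follows. The two-space indent is an arbitrary choice for the example; the exact line breaks depend on the module, so none are shown here.

```perl
use strict;
use Text::Wrap qw(wrap $columns);    # $columns must be asked for by name

$columns = 20;

my $text    = "Hello, world, it's a nice day, isn't it?";
my $wrapped = wrap("  ", "", $text); # indent only the first line

print $wrapped, "\n";
```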

Tie::Hash, Tie::StdHash--Base Class Definitions for Tied Hashes

package NewHash;
require Tie::Hash;
@ISA = (Tie::Hash);
sub DELETE { ... }          # Provides additional method
sub CLEAR { ... }           # Overrides inherited method
package NewStdHash;
require Tie::Hash;
@ISA = (Tie::StdHash);
sub DELETE { ... }
package main;
tie %new_hash, "NewHash";
tie %new_std_hash, "NewStdHash";

This module provides some skeletal methods for hash-tying classes. (See Chapter 5, Packages, Modules, and Object Classes for a list of the functions required in order to tie a hash to a package.) The basic Tie::Hash package provides a new() method, as well as methods TIEHASH(), EXISTS() and CLEAR(). The Tie::StdHash package provides most methods required for hashes. It inherits from Tie::Hash, and causes tied hashes to behave exactly like standard hashes, allowing for selective overriding of methods. The new() method is provided for backward compatibility, in case a class neglects to define its own TIEHASH() method.

For developers wishing to write their own tied hashes, the required methods are briefly defined below. (Chapter 5, Packages, Modules, and Object Classes not only documents these methods, but also has sample code.)

TIEHASH ClassName, LIST

The method invoked by the command:

tie %hash, ClassName, LIST

Associates a new hash instance with the specified class. LIST would represent additional arguments (along the lines of AnyDBM_File and compatriots) needed to complete the association.

STORE this, key, value

Store value into key for the tied hash this.

FETCH this, key

Retrieve the value associated with key for the tied hash this.

FIRSTKEY this

Return the first key in the tied hash this.

NEXTKEY this, lastkey

Return the key that follows lastkey in the tied hash this.

EXISTS this, key

Return true if key exists in the tied hash this.

DELETE this, key

Delete key from the tied hash this.

CLEAR this

Clear all values from the tied hash this.

Chapter 5, Packages, Modules, and Object Classes includes a method called DESTROY() as a "necessary" method for tied hashes. However, it is not actually required, and neither Tie::Hash nor Tie::StdHash defines a default for this method.
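
As a sketch of selective overriding, the following class inherits from Tie::StdHash and replaces only the methods that take a key, so that keys are treated case-insensitively. The class name UpperHash and its behavior are hypothetical, invented for this example.

```perl
use strict;

package UpperHash;                   # hypothetical class name
require Tie::Hash;
use vars qw(@ISA);
@ISA = ('Tie::StdHash');

# Fold every key to uppercase; TIEHASH, FIRSTKEY, NEXTKEY, and CLEAR
# are inherited unchanged from Tie::StdHash.
sub STORE  { my ($this, $key, $value) = @_; $this->{uc($key)} = $value }
sub FETCH  { my ($this, $key) = @_; return $this->{uc($key)} }
sub EXISTS { my ($this, $key) = @_; return exists $this->{uc($key)} }
sub DELETE { my ($this, $key) = @_; return delete $this->{uc($key)} }

package main;

tie my %h, 'UpperHash';
$h{Foo} = 1;
print "found\n" if exists $h{foo};   # any capitalization finds the entry
print join(',', keys %h), "\n";      # FOO
```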

See also

The library modules relating to various DBM-related implementations (DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File) show examples of general tied hashes, as does the Config module. While these modules do not utilize Tie::Hash, they serve as good working examples.

Tie::Scalar, Tie::StdScalar--Base Class Definitions for Tied Scalars

package NewScalar;
require Tie::Scalar;
@ISA = (Tie::Scalar);
sub FETCH { ... }           # Provides additional method
sub TIESCALAR { ... }       # Overrides inherited method
package NewStdScalar;
require Tie::Scalar;
@ISA = (Tie::StdScalar);
sub FETCH { ... }
package main;
tie $new_scalar, "NewScalar";
tie $new_std_scalar, "NewStdScalar";

This module provides some skeletal methods for scalar-tying classes. (See Chapter 5, Packages, Modules, and Object Classes for a list of the functions required in tying a scalar to a package.) The basic Tie::Scalar package provides a new() method, as well as methods TIESCALAR(), FETCH() and STORE(). The Tie::StdScalar package provides all methods specified in Chapter 5, Packages, Modules, and Object Classes. It inherits from Tie::Scalar and causes scalars tied to it to behave exactly like the built-in scalars, allowing for selective overriding of methods. The new() method is provided for backward compatibility, for classes that neglect to define their own TIESCALAR() method.

For developers wishing to write their own tied-scalar classes, methods are summarized below. (Chapter 5, Packages, Modules, and Object Classes not only documents these, but also has sample code.)

TIESCALAR ClassName, LIST

The method invoked by the command:

tie $scalar, ClassName, LIST

Associates a new scalar instance with the specified class. LIST would represent additional arguments (along the lines of the AnyDBM_File library module and associated modules) needed to complete the association.

FETCH this

Retrieve the value of the tied scalar referenced by this.

STORE this, value

Store value in the tied scalar referenced by this.

DESTROY this

Free the storage associated with the tied scalar referenced by this. This is rarely needed, since Perl manages its memory well. But the option exists, should a class wish to perform specific actions upon the destruction of an instance.
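
The following sketch overrides a single method of Tie::StdScalar. The class name ReadCounter and its behavior (each read increments and returns the stored value) are hypothetical, chosen only to show that FETCH() is called once per read.

```perl
use strict;

package ReadCounter;                 # hypothetical class name
require Tie::Scalar;
use vars qw(@ISA);
@ISA = ('Tie::StdScalar');

# Tie::StdScalar objects are references to a plain scalar, so $$this is
# the stored value. TIESCALAR and STORE are inherited unchanged.
sub FETCH { my $this = shift; return ++$$this }

package main;

tie my $reads, 'ReadCounter', 0;     # the extra argument is the initial value
my $first  = $reads;
my $second = $reads;
print "$first $second\n";            # 1 2
```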

See also

Chapter 5, Packages, Modules, and Object Classes has a good example using tied scalars to associate process IDs with priority.

Tie::SubstrHash--Fixed-table-size, Fixed-key-length Hashing

require Tie::SubstrHash;
tie %myhash, "Tie::SubstrHash", $key_len, $value_len, $table_size;

The Tie::SubstrHash package provides a hash table-like interface to an array of determinate size, with constant key size and record size.

Upon tying a new hash to this package, the developer must specify the size of the keys that will be used, the size of the value fields that the keys will index, and the size of the overall table (in terms of the number of key/value pairs, not hard memory). These values will not change for the duration of the tied hash. The newly allocated hash table may now have data stored and retrieved. Efforts to store more than $table_size elements will result in a fatal error, as will efforts to store a value not exactly $value_len characters in length, or to reference through a key not exactly $key_len characters in length. While these constraints may seem excessive, the result is a hash table using much less internal memory than an equivalent freely allocated hash table.

Because the current implementation uses the table and key sizes for the hashing algorithm, there is no means by which to dynamically change the value of any of the initialization parameters.
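
A brief sketch of these constraints in use follows. The sizes (two-character keys, five-character values, ten entries) and the sample data are arbitrary choices for the example.

```perl
use strict;
require Tie::SubstrHash;

my %h;
tie %h, 'Tie::SubstrHash', 2, 5, 10; # key length, value length, table size

$h{ab} = 'hello';                    # exactly 2 and 5 characters: accepted
$h{cd} = 'world';
print "$h{ab} $h{cd}\n";             # hello world

# A value of the wrong length raises a fatal error, which eval traps
eval { $h{ef} = 'too long' };
my $error = $@;
print "store failed: $error" if $error;
```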

Time::Local--Efficiently Compute Time from Local and GMT Time

use Time::Local;
$time = timelocal($sec, $min, $hours, $mday, $mon, $year);
$time = timegm($sec, $min, $hours, $mday, $mon, $year);

These routines take a series of arguments specifying a local (timelocal()) or Greenwich (timegm()) time, and return the number of seconds elapsed between January 1, 1970, and the specified time. The arguments are defined like the corresponding values returned by Perl's gmtime and localtime functions; in particular, $mon is in the range 0..11 and $year is the number of years since 1900.

The routines are very efficient and yet are always guaranteed to agree with the gmtime and localtime functions. That is, if you pass the value returned by time to localtime, and if you then pass the values returned by localtime to timelocal(), the returned value from timelocal() will be the same as the value originally returned from time.

Both routines return -1 if the integer limit is hit. On most machines (those with a signed 32-bit time value) this applies to dates after January 19, 2038.
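
The round trip can be sketched as follows, using timegm() and gmtime so that the result does not depend on the local time zone. The variable names are our own; the last call passes the year as years since 1900.

```perl
use strict;
use Time::Local;

my $now  = time;
my @gm   = gmtime($now);             # ($sec, $min, $hours, $mday, $mon, $year, ...)
my $back = timegm(@gm[0 .. 5]);      # pass the first six values straight back

print "round trip ok\n" if $back == $now;

# Midnight, January 1, 2000, Greenwich time: month 0, year 100
my $y2k = timegm(0, 0, 0, 1, 0, 100);
print "$y2k\n";                      # 946684800
```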

vars--Predeclare Global Variable Names

use vars qw($frob @mung %seen);

This module predeclares all variables whose names are in the list, allowing you to use them under use strict, and disabling any typo warnings.

Packages such as the AutoLoader and SelfLoader that delay loading of subroutines within packages can create problems with file-scoped lexicals defined using my. This is because they move the subroutines outside the scope of the lexical variables. While the use vars pragma cannot duplicate the effect of file-scoped lexicals (total transparency outside of the file), it can act as an acceptable substitute by pre-declaring global symbols, ensuring their availability to the routines whose loading was delayed.
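
A minimal sketch of the pragma in use under use strict follows. The values assigned are arbitrary; the point is that the predeclared names are ordinary package globals, so any routine in the package, however it is loaded, can reach them.

```perl
use strict;
use vars qw($frob @mung %seen);

# No "my" and no package qualifier needed, even under use strict
$frob = 'widget';
push @mung, 1, 2, 3;
$seen{$frob}++;

# The same variables, by their fully qualified names in package main
print "$main::frob ", scalar(@main::mung), " $main::seen{widget}\n";  # widget 3 1
```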

See also the subs module.

