Perl Language Reference Manual
by Larry Wall and others
Paperback (6"x9"), 724 pages
ISBN 9781906966027
RRP £29.95 ($39.95)

8.8 Prototypes

Perl supports a very limited kind of compile-time argument checking using function prototyping. If you declare

sub mypush (\@@)

then mypush() takes arguments exactly like push() does. The function declaration must be visible at compile time. The prototype affects only interpretation of new-style calls to the function, where new-style is defined as not using the & character. In other words, if you call it like a built-in function, then it behaves like a built-in function. If you call it like an old-fashioned subroutine, then it behaves like an old-fashioned subroutine. It naturally falls out from this rule that prototypes have no influence on subroutine references like \&foo or on indirect subroutine calls like &{$subref} or $subref->().
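As a minimal sketch of that rule, the following defines mypush() before its first use and then calls it both ways. The body of mypush() and the @stack variable are illustrative assumptions, not part of the original text.

```perl
use strict;
use warnings;

# Illustrative body for mypush(); it is declared before the calls below
# so that the prototype is visible at compile time.
sub mypush (\@@) {
    my $aref = shift;      # \@ delivers a reference to the caller's array
    push @$aref, @_;
    return scalar @$aref;
}

my @stack = (1);

# New-style call: the prototype applies, so @stack is passed by reference.
mypush @stack, 2, 3;

# Old-style call with &: the prototype is ignored, so the reference
# has to be supplied by hand.
&mypush(\@stack, 4);

print "@stack\n";          # prints "1 2 3 4"
```

Both calls reach the same code; only the way the argument list is compiled differs.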

Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance.
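A short sketch of the difference, using a hypothetical Shape package invented for this example: the same subroutine sees a flattened list when invoked as a method, but has its prototype applied when called as a plain function.

```perl
use strict;
use warnings;

package Shape;    # hypothetical package, for illustration only

# ($$) asks for exactly two scalar arguments.
sub describe ($$) {
    my ($class, @args) = @_;
    return scalar @args;    # number of arguments after the class name
}

package main;

my @dims = (3, 4, 5);

# Method call: the prototype is ignored and @dims is flattened.
my $as_method = Shape->describe(@dims);

# Function call: the prototype applies, so @dims is evaluated in
# scalar context and only its element count is passed.
my $as_function = Shape::describe('Shape', @dims);

print "$as_method $as_function\n";    # prints "3 1"
```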

Because the intent of this feature is primarily to let you define subroutines that work like built-in functions, here are prototypes for some other functions that parse almost exactly like the corresponding built-in.

Declared as              Called as
sub mylink ($$)          mylink $old, $new
sub myvec ($$$)          myvec $var, $offset, 1
sub myindex ($$;$)       myindex &getstring, "substr"
sub mysyswrite ($$$;$)   mysyswrite $buf, 0, length($buf) - $off, $off
sub myreverse (@)        myreverse $a, $b, $c
sub myjoin ($@)          myjoin ":", $a, $b, $c
sub mypop (\@)           mypop @array
sub mysplice (\@$$@)     mysplice @array, 0, 2, @pushme
sub mykeys (\%)          mykeys %{$hashref}
sub myopen (*;$)         myopen HANDLE, $name
sub mypipe (**)          mypipe READHANDLE, WRITEHANDLE
sub mygrep (&@)          mygrep { /foo/ } $a, $b, $c
sub myrand (;$)          myrand 42
sub mytime ()            mytime

Any backslashed prototype character represents an actual argument that absolutely must start with that character. The value passed as part of @_ will be a reference to the actual argument given in the subroutine call, obtained by applying \ to that argument.
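For instance, two backslashed arrays let a subroutine receive both arrays intact instead of one flattened list. The name same_length() and the sample data are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: each \@ slot must be given something starting with @.
sub same_length (\@\@) {
    my ($x, $y) = @_;          # two array references, not a flat list
    return @$x == @$y;
}

my @left  = (1, 2, 3);
my @right = qw(a b c);

print same_length(@left, @right) ? "same\n" : "different\n";   # prints "same"

# same_length(@left, 3);       # would not compile: argument 2 must be an array
```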

You can also backslash several argument types simultaneously by using the \[] notation:

sub myref (\[$@%&*])

will allow calling myref() as

myref $var
myref @array
myref %hash
myref &sub
myref *glob

and the first argument of myref() will be a reference to a scalar, an array, a hash, a code, or a glob.
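One way to see this is to have myref() report what kind of reference it received. The body shown here and the helper() subroutine are illustrative assumptions.

```perl
use strict;
use warnings;

# Illustrative body: report the type of reference found in the first slot.
sub myref (\[$@%&*]) {
    return ref $_[0];
}

sub helper { return 1 }        # hypothetical subroutine, never actually called

my ($var, @array, %hash);

print join(" ",
    myref($var),               # SCALAR
    myref(@array),             # ARRAY
    myref(%hash),              # HASH
    myref(&helper),            # CODE
    myref(*STDOUT),            # GLOB
), "\n";
```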

Unbackslashed prototype characters have special meanings. Any unbackslashed @ or % eats all remaining arguments, and forces list context. An argument represented by $ forces scalar context. An & requires an anonymous subroutine, which, if passed as the first argument, does not require the sub keyword or a subsequent comma.
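The following sketch illustrates these rules with two hypothetical subroutines, first_and_rest() and apply_to(), which are not part of the original text.

```perl
use strict;
use warnings;

# $ forces scalar context on the first argument; @ takes whatever is left.
sub first_and_rest ($@) {
    my ($head, @rest) = @_;
    return "$head | @rest";
}

# An & that is not in the first position needs an explicit "sub" and a comma.
sub apply_to ($&) {
    my ($value, $code) = @_;
    return $code->($value);
}

my @list = (7, 8, 9);

# The first @list is evaluated in scalar context (its length, 3);
# the second is flattened into the remaining arguments.
print first_and_rest(@list, @list), "\n";      # prints "3 | 7 8 9"

print apply_to(5, sub { $_[0] * 2 }), "\n";    # prints "10"
```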

A * allows the subroutine to accept a bareword, constant, scalar expression, typeglob, or a reference to a typeglob in that slot. The value will be available to the subroutine either as a simple scalar, or (in the latter two cases) as a reference to the typeglob. If you wish to always convert such arguments to a typeglob reference, use Symbol::qualify_to_ref() as follows:

use Symbol 'qualify_to_ref';
sub foo (*) {
    my $fh = qualify_to_ref(shift, caller);
    ...
}
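As a rough, runnable illustration of the same idea, the hypothetical handle_name() below accepts a handle in several forms and reports the type of the normalized value.

```perl
use strict;
use warnings;
use Symbol 'qualify_to_ref';

# Hypothetical helper: normalize whatever arrived in the * slot and
# report what kind of reference came back.
sub handle_name (*) {
    my $fh = qualify_to_ref(shift, caller);
    return ref $fh;
}

# A bareword, a typeglob, and a typeglob reference are all accepted.
print join(" ",
    handle_name(STDOUT),
    handle_name(*STDOUT),
    handle_name(\*STDOUT),
), "\n";                       # prints "GLOB GLOB GLOB"
```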

A semicolon (;) separates mandatory arguments from optional arguments. It is redundant before @ or %, which gobble up everything else.
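A small sketch of an optional third argument, using an illustrative body for myindex() that simply forwards to the built-in index():

```perl
use strict;
use warnings;

# Two mandatory scalars, then one optional scalar after the semicolon.
sub myindex ($$;$) {
    my ($string, $substr, $pos) = @_;
    return defined $pos
        ? index($string, $substr, $pos)
        : index($string, $substr);
}

print myindex("hello world", "o"), "\n";       # prints "4"
print myindex("hello world", "o", 5), "\n";    # prints "7"

# myindex("hello world");    # would not compile: not enough arguments
```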

As the last character of a prototype, or just before a semicolon, you can use _ in place of $: if this argument is not provided, $_ will be used instead.
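For example, assuming Perl 5.10 or later, a hypothetical mylength() can default to $_ in the same way the built-in length() does:

```perl
use 5.010;
use strict;
use warnings;

# _ behaves like $, except that $_ is supplied when the argument is omitted.
sub mylength (_) {
    return length $_[0];
}

for ("camel") {
    print mylength(), "\n";            # no argument given, so $_ is used: prints "5"
}

print mylength("dromedary"), "\n";     # explicit argument: prints "9"
```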

Note how the last three examples in the table above are treated specially by the parser. mygrep() is parsed as a true list operator, myrand() is parsed as a true unary operator with unary precedence the same as rand(), and mytime() is truly without arguments, just like time(). That is, if you say

mytime +2;

you'll get mytime() + 2, not mytime(2), which is how it would be parsed without a prototype.
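The difference can be observed directly. In this sketch mytime() returns a fixed stand-in value so that the result is predictable, and untyped() is a hypothetical subroutine declared without a prototype for comparison.

```perl
use strict;
use warnings;

sub mytime () { return 40 }         # fixed stand-in value, not the real time
sub untyped   { return scalar @_ }  # no prototype: reports its argument count

my $sum  = mytime +2;               # parsed as mytime() + 2
my $args = untyped +2;              # parsed as untyped(+2)

print "$sum $args\n";               # prints "42 1"
```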

The interesting thing about & is that you can generate new syntax with it, provided it's in the initial position:

sub try (&@) {
    my($try,$catch) = @_;
    eval { &$try };
    if ($@) {
        local $_ = $@;
        &$catch;
    }
}
sub catch (&) { $_[0] }
try {
    die "phooey";
} catch {
    /phooey/ and print "unphooey\n";
};

That prints "unphooey". (Yes, there are still unresolved issues having to do with visibility of @_. I'm ignoring that question for the moment. (But note that if we make @_ lexically scoped, those anonymous subroutines can act like closures... (Gee, is this sounding a little Lispish? (Never mind.))))

And here's a reimplementation of the Perl grep operator:

sub mygrep (&@) {
    my $code = shift;
    my @result;
    foreach $_ (@_) {
        push(@result, $_) if &$code;
    }
    @result;
}
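The same (&@) pattern works for other list operators. As a sketch, here is a hypothetical myfirst() that returns the first element for which the block is true, together with a call that uses the block syntax:

```perl
use strict;
use warnings;

# Hypothetical list operator built on the same (&@) prototype.
sub myfirst (&@) {
    my $code = shift;
    foreach (@_) {
        return $_ if $code->();     # the block tests the current $_
    }
    return;                         # nothing matched
}

# No "sub" keyword and no comma after the block, just as with grep.
my $hit = myfirst { /^b/ } qw(apple banana blueberry);

print "$hit\n";                     # prints "banana"
```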

Some folks would prefer full alphanumeric prototypes. Alphanumerics have been intentionally left out of prototypes for the express purpose of someday in the future adding named, formal parameters. The current mechanism's main goal is to let module writers provide better diagnostics for module users. Larry feels the notation quite understandable to Perl programmers, and that it will not intrude greatly upon the meat of the module, nor make it harder to read. The line noise is visually encapsulated into a small pill that's easy to swallow.

If you try to use an alphanumeric sequence in a prototype you will generate an optional warning - "Illegal character in prototype...". Unfortunately earlier versions of Perl allowed the prototype to be used as long as its prefix was a valid prototype. The warning may be upgraded to a fatal error in a future version of Perl once the majority of offending code is fixed.

It's probably best to prototype new functions, not retrofit prototyping into older ones. That's because you must be especially careful about silent impositions of differing list versus scalar contexts. For example, if you decide that a function should take just one parameter, like this:

sub func ($) {
    my $n = shift;
    print "you gave me $n\n";
}

and someone has been calling it with an array or expression returning a list:

func(@foo);
func( split /:/ );

then you've just supplied an automatic scalar in front of their argument, which can be more than a bit surprising. The old @foo which used to hold one thing doesn't get passed in. Instead, func() now gets passed in a 1; that is, the number of elements in @foo. And the split gets called in scalar context so it starts scribbling on your @_ parameter list. Ouch!
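The effect on @foo is easy to reproduce. In this sketch func() returns its message instead of printing it, purely so the two results can be compared; the contents of @foo are an assumption.

```perl
use strict;
use warnings;

sub func ($) {
    my $n = shift;
    return "you gave me $n";
}

my @foo = ("apple");               # an array that holds one thing

# New-style call: the ($) prototype evaluates @foo in scalar context.
print func(@foo), "\n";            # prints "you gave me 1"

# Old-style call with &: the prototype is bypassed and the element arrives.
print &func(@foo), "\n";           # prints "you gave me apple"
```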

This is all very powerful, of course, and should be used only in moderation to make the world a better place.
