This file was automatically generated from http://svn.pugscode.org/pugs/docs/tutorial/ch05_subroutines.pod on Sat Aug 1 14:01:20 2009 GMT, revision 27701.
=head0 Subroutines
Subroutines are reusable units of code, and one of the fundamental building blocks of modern programming languages. They can be called from just about anywhere, and return control to the point of the call when they finish executing. They can be passed zero or more arguments and return zero or more results. Some programmers may be more familiar with languages that allow only a single return value from a subroutine, or languages in which subroutines return nothing and only "functions" may return a value. Perl 6 generalizes the concept to allow subroutines to return as many, or as few, values as needed. We think it just makes more sense to let programmers do what they want in this regard.
Subroutines can be named or anonymous. They can be lexically scoped, package scoped, or globally scoped. Subroutines can be "Multi" subs, which allow multiple subroutines to have the same name as long as they have different parameter lists.
Blocks of reusable code can also be called the "methods" of a particular
class of object. Methods have a few significant differences from
subroutines, and a few significant differences from those found in Perl 5.
For instance, in Perl 6 they're distinguished by a separate keyword,
method. Because of these differences, they'll be discussed in
Chapter 6.
The most basic subroutine declaration is simply the sub keyword,
followed by the name of the sub, followed by the block that defines
the sub:
sub alert {
print "We have normality.";
}
The simplest subroutine call is just the subroutine name followed by a parenthesized, comma-separated list of variables or values:
$result = sum($a, $b, 42, 57);
Arguments are also sometimes passed as anonymous pairs:
$result = sum(first => 12, second => 21);
Parentheses are optional on calls to subroutines that have been
predeclared, but required for all other calls. Including the &
sigil before the subroutine name in a call will not turn off signature
checking as it did in Perl 5. In fact, in most contexts prefixing
the subroutine name with & will return a reference to the
subroutine instead of actually calling the subroutine.
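The difference between calling a sub and referring to it with & can be sketched as follows. This is a minimal illustration in current Raku (the language formerly named Perl 6), whose syntax differs in places from the draft described in this chapter; the variable $ref is a name introduced only for the example.

```raku
sub alert { "We have normality." }

my $ref = &alert;     # & yields the Sub object; nothing is called yet
say $ref.^name;       # Sub
say $ref();           # We have normality.
say alert();          # We have normality.
```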
One of the most significant additions to subroutines in Perl 6 is formal parameters. The parameter list, often called the signature of the subroutine, is part of the subroutine declaration:
sub standardize ($text, $method) {
my $clean;
given $method {
when 'length' { $clean = wrap($text, 72); }
when 'lower' { $clean = lowercase($text); }
...
}
return $clean;
}
The subroutine standardize has two scalar parameters, $text and
$method, so it is called with two arguments, each a scalar value.
The parameters are lexical variables within the body of the sub. They
never need to be explicitly declared with my, even under use strict,
because they're declared by the subroutine declaration.
In a sub with no parameter list, all arguments are passed in the @_
array:
sub sum {
my $sum;
for @_ -> $number {
$sum += $number;
}
return $sum;
}
Subroutines with defined parameter lists don't get an @_ array. If you want a subroutine that takes no arguments
(and complains when arguments are passed), define it with an empty
parameter list ().
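As a rough sketch of the two cases above, assuming current Raku behaves as this chapter describes for signature-less subs: total and no_args are hypothetical names (total avoids clashing with the built-in sum).

```raku
# No signature: every argument lands in @_.
sub total {
    my $sum = 0;
    for @_ -> $number {
        $sum += $number;
    }
    return $sum;
}

# Empty signature: the sub accepts no arguments at all.
sub no_args () { 'nothing passed' }

say total(1, 2, 42, 57);   # 102
say no_args();             # nothing passed
# no_args(1);              # rejected: the signature allows no arguments
```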
Subroutine calls normally provide a non-flattening list context, which means that any array or hash passed into a subroutine is treated as a single argument. An array parameter in the signature expects to be passed an actual array or arrayref argument, and a hash parameter expects a hash or hashref argument:
sub whole (@names, %flags) {
...
}
whole(@array, %hash);
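To make the non-flattening behavior concrete, here is a small sketch in current Raku. The body of whole and the sample data are invented for illustration; only the signature comes from the text above.

```raku
sub whole (@names, %flags) {
    # Each container arrives as one argument, not as a flattened list.
    return @names.elems ~ ' names, ' ~ %flags.elems ~ ' flags';
}

my @array = 'Arthur', 'Ford', 'Zaphod';
my %hash  = towel => 'required', luggage => 'lost';

say whole(@array, %hash);   # 3 names, 2 flags
```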
Every subroutine call checks its signature to make sure the arguments
it gets match the declared parameter list. A mismatch in the number or
kind of arguments is an error. But since requiring every declared
parameter to be passed in on every call isn't nearly flexible enough
for the average programmer, Perl 6 also allows optional parameters.
Optional parameters can be included or ignored without causing any
errors. Each optional parameter is marked with a ? after the
parameter name:
sub someopt ($required1, $required2, $optional1?, $optional2?) {
...
}
So, someopt will accept anywhere from 2 to 4 arguments. You can
have any number of required and optional parameters, but all the
required parameters must come before the first optional parameter.
This is largely a common sense restriction. If you want to leave some
elements off a list of arguments, it has to be the ones at the end,
because positional arguments are bound to the parameters in strict
linear order. All these calls to someopt are fine:
someopt('req1', 'req2', 'opt1', 'opt2');
someopt('req1', 'req2', 'opt1');
someopt('req1', 'req2');
Notice that it is still an error to pass too few or too many arguments to a subroutine with optional parameters:
someopt('req1') # WRONG!
someopt('req1', 'req2', 'opt1', 'opt2', 'extra') # WRONG!
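The accepted range of argument counts can be checked with a short sketch, written in current Raku. The body of someopt is an assumption made for the example; .arity and .count are the introspection methods Raku provides on routines.

```raku
sub someopt ($required1, $required2, $optional1?, $optional2?) {
    # Unpassed optional parameters are undefined; count the ones that arrived.
    return 2 + $optional1.defined + $optional2.defined;
}

say &someopt.arity;   # 2  (required positional parameters)
say &someopt.count;   # 4  (maximum positional parameters)

say someopt('req1', 'req2');                   # 2
say someopt('req1', 'req2', 'opt1');           # 3
say someopt('req1', 'req2', 'opt1', 'opt2');   # 4
```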
Any argument can be passed either by position with an ordered list of
arguments, or by name with an unordered list of pairs (see
"Named Argument Passing" later in this chapter for
more details). Sometimes you'll want to specify that certain
parameters will be passed only by name, never by position. Named
parameters are marked with a : before the parameter name:
sub namedparams ($first, :$second, :$third) {
...
}
namedparams(1, second => 2, third => 3);
Named parameters are always optional. They must come after all
positional parameters--that is, after the unmarked required parameters
and the optional parameters marked with a ?. Again, this is largely
a matter of common sense. Though named parameters are completely
ignored when binding a list of positional arguments, the parser and
the person maintaining your code will both be profoundly grateful they
don't have to sort through a mixed bag of positional and named
parameters to find the positional parameter list.
If it reads better, you can also use the alternate pair syntax for passing named arguments:
namedparams(1, :second(2), :third(3)) # Right
Because named parameters aren't positional, you can pass them in any order. So, we can pass the third parameter second and the second parameter third:
namedparams(1, third => 3, second => 2)
The point of having a name for your parameter is that you don't have to worry about the position of it.
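A brief sketch of order-independent named arguments, in current Raku. The sub body and the 'unset' fallback are illustrative assumptions; they only make the optional nature of named parameters visible.

```raku
sub namedparams ($first, :$second, :$third) {
    # Named parameters that were not passed are undefined.
    return ($first, $second // 'unset', $third // 'unset').join(' ');
}

say namedparams(1, second => 2, third => 3);   # 1 2 3
say namedparams(1, :third(3), :second(2));     # 1 2 3
say namedparams(1, third => 3);                # 1 unset 3
```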
Another element of flexibility Perl developers will expect is the
ability to pull a list of arguments into an array or hash parameter.
These are known as variadic parameters because they can take a
variable number of arguments. In Perl 6, an array parameter with a
* before the parameter name will slurp up all the positional
arguments that haven't already been bound to another positional
parameter. So, the following call to transport binds
$arthur to @names[0], and $ford to @names[1]:
sub transport ($planet, *@names) {
...
}
transport('Magrathea', $arthur, $ford);
If the variadic array parameter is the only positional parameter in the signature, it will take all the positional arguments.
sub simple (*@_) {...}
# is the same as
sub simple {...}
A hash parameter with a * before the name will slurp up all the
named arguments that haven't already been bound to another
parameter. So, the following call to transport binds the value of
the pair argument with the key 'planet' to the parameter
:$planet, but all the other pairs become part of the %flags hash
(more on this in "Named Argument Passing" later in
this chapter).
sub transport (:$planet, *%flags) {...}
transport(:name('Arthur'),
:luggage('lost'),
:planet('Magrathea'),
:towel('required'));
When they're combined with other kinds of parameters, variadic positional parameters must come after all positional parameters in the signature. They can either precede or follow the named parameters. Variadic named parameters only slurp up the named parameters that aren't bound already, so they can appear anywhere in the signature.
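The two kinds of variadic parameter can be combined in one signature. The following is a sketch in current Raku with an invented body that simply reports what each parameter received.

```raku
sub transport ($planet, *@names, *%flags) {
    # @names slurps leftover positionals, %flags slurps leftover named arguments.
    return $planet ~ ': ' ~ @names.join(', ')
         ~ ' [' ~ %flags.keys.sort.join(', ') ~ ']';
}

say transport('Magrathea', 'Arthur', 'Ford',
              :luggage('lost'), :towel('required'));
# Magrathea: Arthur, Ford [luggage, towel]
```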
Signature checking is sensitive not only to the number of arguments
and the variable type (defined by the $, @, %, or &
symbol), but also to the value type. The parameter
type is defined before the parameter name and before any symbols for
optional, named, or variadic parameters:
sub typedparams (Int $first, Str $second?) {...}
The parameter type declares the type of argument that can be bound to it. The parameter and argument types have to be compatible, but not identical.
Type checking happens at compile time whenever possible, because it's faster and can be optimized. Otherwise, type checking happens at run time. So, if all the arguments passed to the subroutine are explicitly typed, the types will be checked at compile time. If the arguments aren't explicitly typed, the run-time checks will make sure the scalars contain an integer value and a string value.
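A minimal sketch of the run-time case, assuming current Raku: $untyped is a hypothetical variable with no declared type, so the mismatch can only be detected when the call is made.

```raku
sub typedparams (Int $first, Str $second?) {
    return $first + 1;
}

say typedparams(41);            # 42
say typedparams(41, 'towel');   # 42

my $untyped = 'forty-one';      # no declared type, so checked at run time
try typedparams($untyped);      # the failed binding is caught and stored in $!
say 'rejected at run time' if $!;   # rejected at run time
```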
By default, parameters are aliases to the original arguments
(pass-by-reference), but they're marked as constant so they cannot be
modified within the body of the subroutine. The is rw property
marks a parameter as modifiable, so changes to the parameter within
the body of the sub modify the original variable passed in:
sub modifyparams ($first is rw, $second is rw) {...}
Use the is rw property only when it's necessary. Inadvertent changes
in a subroutine can alter data in the caller's scope that the caller
doesn't expect to change. At the
very least, it's polite to document it when your subroutine is monkeying
around in somebody else's scope.
The is copy property marks a parameter as pass-by-value, so the
parameter is a lexically scoped copy of the original value passed in:
sub passbyvalue ($first is copy, $second is copy) {...}
This means that the parameter is not an alias for the value in the caller's scope, but it's not read-only either. It's a new variable, a copy of the original and free to be used however your subroutine sees fit.
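The contrast between the two properties can be sketched like this in current Raku; bump and bumped are hypothetical names.

```raku
sub bump   ($n is rw)   { $n++ }        # changes the caller's variable
sub bumped ($n is copy) { $n++; $n }    # changes only a private copy

my $count = 1;
bump($count);
say $count;           # 2
say bumped($count);   # 3
say $count;           # 2
```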
Sometimes it is useful to be able to define a default value for an
optional or named parameter. The = operator marks the default
value. The parameter takes the default value only
if the call doesn't pass an argument for that parameter.
sub default_vals ($required, $optional? = 5) {...}
Default values are only used with optional parameters. This should make sense because required positional values are always required, and so they always have a value passed.
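As a small sketch in current Raku: there, giving a parameter a default is what makes it optional, so the ? marker used in this chapter's draft syntax is left out of the example.

```raku
sub default_vals ($required, $optional = 5) {
    return $required + $optional;
}

say default_vals(1);       # 6   ($optional falls back to 5)
say default_vals(1, 10);   # 11
```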
Placeholder variables are a simple alternative to formal parameter
lists. They have many of the advantages of ordinary parameters,
without the inconvenience of declaring a signature. You just use
variables with a caret after the sigil--$^name, @^name,
%^name, or &^name--within the subroutine's block, and the
arguments passed into the subroutine are bound to them.
@sorted = sort { $^a <=> $^b } @array;
The order of the parameters is determined by the Unicode sorting order
of the placeholders' names, so the example below acts as if it has a
formal parameter list of ($^milk, $^sugar, $^tealeaves):
$make_tea = {
my $tea = boil $^tealeaves;
combine $tea, $^sugar, $^milk;
$tea;
}
Placeholders are handy in short subroutines and bare blocks, but soon become unwieldy in anything more complicated.
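The ordering rule is easy to trip over, so here is a sketch in current Raku. The block $divide is invented for the example: because "denominator" sorts before "numerator", the first argument becomes the denominator.

```raku
my @sorted = sort { $^a <=> $^b }, 10, 9, 100;
say @sorted;            # [9 10 100]

# Placeholders bind in name order: $^denominator first, $^numerator second.
my $divide = { $^numerator / $^denominator };
say $divide(2, 10);     # 5
```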
In addition to a signature for the incoming parameters to a
subroutine, you can also declare a partial signature, or siglet,
for the values returned from a subroutine. Return siglets declare the
type of each return value, but they don't bind a named variable to the
returned value and can't define a default value for the return. In the
declaration, the return signature goes before the sub keyword, after
the parameter list attached with the returns keyword, or inside the
parameter list after the --> arrow.
sub get_value (Int $incoming) returns Int {...}
# same as
Int sub get_value (Int $incoming) {...}
# same as
sub get_value (Int $incoming --> Int) {...}
All three syntaxes have exactly the same effect, but using the returns
keyword is usually clearer when the sub has multiple return values:
sub get_values (Str $incoming) returns (Int, Str) {...}
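A sketch of declared return types in current Raku, which accepts the returns and --> forms. The bodies are assumptions, and get_other is a hypothetical second sub added so both forms can be shown side by side.

```raku
sub get_value (Int $incoming) returns Int { return $incoming * 2 }
sub get_other (Int $incoming --> Int)     { return $incoming + 1 }

say get_value(21);   # 42
say get_other(41);   # 42

# The declared return type is available through the signature.
say &get_other.signature.returns === Int;   # True
```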
The standard way of passing arguments is by position. The first argument passed in goes to the first parameter, the second to the second, and so on:
sub matchparams ($first, $second) {...}
matchparams($one, $two); # $one is bound to $first
# $two is bound to $second
You can also pass arguments in by name, using a list of anonymous pairs. The key of each pair gives the parameter's name and the value of the pair gives the value to be bound to the parameter. When passed by name, the arguments can come in any order. Optional parameters can be left out, even if they come in the middle of the parameter list. This is particularly useful for subroutines with a large number of optional parameters:
sub namedparams ($first, $second?, $third? is rw) {...}
namedparams(third => 'Trillian', first => $name);
Sometimes the option syntax for pairs is clearer than the pair constructor syntax:
namedparams :third('Trillian'), :first($name);
To get the Perl 5-style behavior where the elements of an array (or
the pairs of a hash) flatten out into the parameter list, use the
flattening operator | in the call to the subroutine. Here,
$first binds to @array[0] and $second binds to @array[1]:
sub flat ($first, $second) {...}
flat(|@array);
A flattened hash argument acts as a list of pairs, which are bound to
the parameters just like ordinary named arguments. So, $first is
bound to %hash{'first'}, and $second is bound to
%hash{'second'}:
sub flat_hash (:$first, :$second) {...}
%hash = (first => 1, second => 2);
flat_hash(|%hash);
Flattened hash arguments are useful for building up hashes of named arguments to pass in all at once.
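Both kinds of flattening can be tried with a short sketch in current Raku; the sub bodies and sample data are invented for illustration.

```raku
sub flat ($first, $second) { return "$first, then $second" }
my @array = 'a', 'b';
say flat(|@array);        # a, then b

sub flat_hash (:$first, :$second) { return $first + $second }
my %hash = first => 1, second => 2;
say flat_hash(|%hash);    # 3
```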
Arguments to subroutine calls have a standard general order. Positional arguments, if there are any, always go first. Named arguments go after any positional arguments. Variadic arguments always go at the end of the list.
order($positional, named => 1, 'va', 'ri', 'ad', 'ic');
Positional arguments are first so the parser and the person maintaining the code have an easier time associating them with positional parameters. Variadic arguments are at the end because they're open-ended lists.
If a subroutine has only required and variadic parameters, you can
always call it with a simple list of positional arguments. In this
example 'a' is bound to $req and the rest of the arguments go to
the slurpy array:
sub straight_line ($req, *@slurpy) {...}
straight_line('a', 'b', 'c', 'd', 'e');
If a subroutine has some optional parameters and a variadic array you
can call it with a simple list of positional arguments, but only if
you have arguments for all the optional parameters. In this example,
'a' is bound to $req, 'b' is bound to $opt and the rest of the
arguments go to the slurpy array:
sub mixed ($req, $opt?, *@slurpy) {...}
mixed('a', 'b', 'c', 'd', 'e');
If you want to skip some of the optional parameters, you have two
choices. When the argument list has at least one named argument, the
parser knows to start the variadic list right after the named
arguments end. This example binds 'a' to $req, binds 1 to $opt by name,
skips $another, and puts the rest of the arguments in the variadic
array:
sub mixed ($req, $opt?, $another?, *@slurpy) {...}
mixed('a', 'opt' => 1, 'b', 'c', 'd', 'e');
If you want to skip all the optional parameters you need to use the
<== operator in place of the comma to mark where the variadic
list starts. This example binds 'a' to $req, skips $opt and
$another, and puts all the rest of the arguments in the variadic
array:
mixed('a' <== 'b', 'c', 'd', 'e');
You have to watch out for optional and variadic parameters when you
modify subroutines already in use. Adding an extra optional parameter
to a signature with a variadic array will break any calls that passed
all positional arguments. You could suggest that all users call your
subroutines with <== in case you decide to change them later,
or you could just add the new parameters as named parameters instead
of optional parameters. Named parameters ignore positional arguments,
so this version of the subroutine puts 'b' through 'e' in the variadic
array with or without any named arguments in the call:
sub mixed ($req, :$opt, :$another, *@slurpy) {...}
mixed('a', 'opt' => 1, 'b', 'c', 'd', 'e');
mixed('a', 'b', 'c', 'd', 'e');
As usual, there's more than one way to do it.
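The named-parameter approach can be sketched in current Raku as follows. Two assumptions differ from the draft syntax above: the named parameters are written after the slurpy array, and the named argument is passed with an unquoted key, since a quoted key would be treated as a positional pair.

```raku
sub mixed ($req, *@slurpy, :$opt, :$another) {
    return @slurpy.join('');
}

# The slurpy array gets 'b' through 'e' with or without named arguments.
say mixed('a', 'b', 'c', 'd', 'e');            # bcde
say mixed('a', 'b', 'c', 'd', 'e', :opt(1));   # bcde
```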
To declare a subroutine without defining it you give it a body
consisting of nothing but the ... (or "yada, yada, yada") operator,
optionally followed by a message.
So, all the preceding examples that look like pseudocode with {...}
for their body are actually valid subroutine declarations.
sub stubbly (Str $name, Int $days?) {...}
You can include a message that appears in the error message if you try to execute a stub subroutine.
sub stubbly (Str $name, Int $days?) { ... "Don't call me" }
When you later define the subroutine, the signature and defined traits must exactly match the declaration.
sub stubbly (Str $name, Int $days?) {
print "$name hasn't shaved in $days day";
print "s" if $days > 1;
}
Just like variables, subroutine names are simply entries in a symbol table or lexical scratchpad. So, all subroutines live in a particular scope, whether it's lexical, package or global scope.
Package scope is the default scope for subs. A sub that is declared
without any scope marking is accessible within the module or class
where it's defined with an unqualified call, like subname(), and
accessible elsewhere with a fully-qualified call using the
Package::Name::subname() syntax.
module My::Module {
sub firstsub ($param) {...}
sub secondsub {
firstsub('arg'); # call the subroutine
}
}
module Other::Module {
use My::Module;
sub thirdsub {
My::Module::firstsub('arg');
}
}
This example declares two modules, My::Module and Other::Module.
My::Module declares a subroutine firstsub and calls it from within
secondsub. Other::Module declares a subroutine thirdsub that
calls firstsub using its fully qualified name.
Subroutines can also be lexically scoped, just like variables. A
my-ed subroutine makes an entry in the current lexical scratchpad
with a & sigil. Lexically scoped subs are called just like a normal
subroutine:
if $dining {
my sub dine ($who, $where) {
...
}
dine($zaphod, "Milliways");
}
dine($arthur, "Nutri-Matic"); # error - not in scope!
The first call to the lexically scoped dine is fine, but the second
would be a compile-time error because dine doesn't exist in the outer
scope.
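A runnable version of the lexical case, sketched in current Raku with an invented body for dine. Note that current Raku scopes plain sub declarations lexically by default, so the explicit my is there to match this chapter.

```raku
my $dining = True;

if $dining {
    my sub dine ($who, $where) { return "$who dines at $where" }
    say dine('Zaphod', 'Milliways');   # Zaphod dines at Milliways
}

# dine('Arthur', 'Nutri-Matic');   # compile-time error: dine is out of scope
```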
The our keyword declares a lexically scoped alias to a package scoped
subroutine (it has an entry both in the symbol table of the current
package and in the current lexical scratchpad). This is useful under
certain levels of strictness.
if $dining {
our sub pay ($when, $what) {
...
}
pay($tuesday, "hamburger");
}
Globally scoped subroutines are visible everywhere, unless they're
overridden by a lexical or package scoped subroutine of the same name.
They are declared with the * symbol before the name of the
subroutine:
sub *seen_by_all ($why, $how) {...}
Most built-ins will be globally scoped.
Anonymous subroutines do everything that ordinary subroutines do. They can declare a formal parameter list with optional and required parameters, take positional and named arguments, and do variadic slurping. The only difference is that they don't define a name. But since you can't call a subroutine if you have no way to refer to it, they have to get the equivalent of a name somewhere, whether they're assigned to a variable, passed as a parameter, or aliased to another subroutine.
$make_tea = sub ($tealeaves, :$sugar, :$milk) {...}
The arrow operator used with for and given is just a way
of defining anonymous blocks. The arrow doesn't require
parentheses around its parameter list. It can't declare named subs,
and can't declare a return type.
$make_tea = -> $tealeaves, :$sugar, :$milk {...}
A bare block can also define an anonymous subroutine, but it can't define a formal parameter list on the sub and can't define a named sub.
$make_tea = {
my $tea = boil 'tealeaves';
combine $tea, 'sugar', 'milk';
}
You can't use the return statement within an arrow block or bare
block sub to return from that block. Blocks and arrow subs are
commonly used for ordinary control flow, so return ignores them and
only returns from methods or from subroutines defined with the sub
keyword.
Instead you can leave a block:
my $make_tea = {
leave unless teatime();
my $tea = boil 'tealeaves';
combine $tea, 'sugar', 'milk';
}
The simple rule is that everything declared with a sub or method
keyword uses the return statement, while blocks declared without such a
keyword can use leave.
You can define multiple routines with the same name but different
signatures in the same scope. These are known as "multisubs" and
defined with the multi keyword before sub. They're useful if
you want a routine that can handle different types of arguments in
different ways, but still appear as a single subroutine to the user.
For example, you might define an add multisub with different behavior
for integers, floats, and certain types of numeric objects:
multi sub add (Int $first, Int $second) {...}
multi sub add (Num $first, Num $second) {...}
multi sub add (Imaginary $first, Imaginary $second) {...}
multi sub add (MyNum $first, MyNum $second) {...}
When you later call the routine:
add($apples, $oranges);
it will dispatch to the right version of add based on the types of
the arguments passed to it. The parameters used for dispatch selection
are called invocants. If you want to use a limited set
of parameters as invocants, mark the boundary between invocant
parameters and the rest of the signature with a semi-colon:
multi sub add (Int $first, Int $second; Int $third) {...}
This version of add will dispatch based on the types of the first
two arguments passed in, and ignore the type of the third.
Multisubs can also differ in the number of arguments as long as no ambiguities arise.
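Dispatch on both type and argument count can be sketched in current Raku like this; the candidates and their bodies are invented for the example and are not the add variants listed above.

```raku
multi sub add (Int $first, Int $second) { return "Int: "  ~ ($first + $second) }
multi sub add (Str $first, Str $second) { return "Str: "  ~ $first ~ $second }
multi sub add ($only)                   { return "one argument: " ~ $only }

say add(1, 2);        # Int: 3
say add('a', 'b');    # Str: ab
say add(5);           # one argument: 5
```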
Currying allows you to create a
shortcut for calling a subroutine with some preset parameter values.
The assuming method takes a list of named arguments and returns a
subroutine reference, with each of the named arguments bound to the
original subroutine's parameter list. If you have a subroutine
multiply that multiplies two numbers, you might create a subref
$six_times that sets the value for the $multiplier parameter, so
you can reuse it several times:
sub multiply ($multiplicand, $multiplier) {
return $multiplicand * $multiplier;
}
$six_times = &multiply.assuming(multiplier => 6);
$six_times(9); # 54
$six_times(7); # 42
...
You can also use binding assignment to alias a curried subroutine to an ordinary subroutine name instead of a scalar variable:
&six_times := &multiply.assuming(multiplier => 6);
six_times(7); # 42
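For a version that runs today, here is a sketch in current Raku. It makes one assumption that departs from the text: the value is primed positionally with .assuming(6), which fixes the first parameter, because current Raku does not bind positional parameters by name.

```raku
sub multiply ($multiplicand, $multiplier) {
    return $multiplicand * $multiplier;
}

my $six_times = &multiply.assuming(6);   # primes the first positional parameter
say $six_times(9);    # 54
say $six_times(7);    # 42

# Binding to a &-sigiled name lets the curried sub be called like a named one.
my &six_times := &multiply.assuming(6);
say six_times(7);     # 42
```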
Sometimes you might want to wrap extra functionality around a
subroutine that was already defined (perhaps in a standard module),
but still call it with the same name. The .wrap method is similar
to the .assuming method, but more powerful. It takes a subroutine
reference as an argument and returns an ID object. Inside the subref
wrapper, the call statement marks the point where the original
subroutine will be executed.
$id = &subname.wrap ({
# preprocess arguments
# or execute additional code
call;
# postprocess return value
# or execute additional code
});
subname(...); # call the wrapped subroutine
By default, the inner subroutine is passed the same arguments as the
wrapping subroutine, and the wrapping subroutine returns the same
result as the inner subroutine. You can alter the arguments passed to
the inner subroutine by adding an explicit argument list to call,
and alter the outer return value by capturing the result from call
and explicitly returning a value in the wrapper.
$id = &subname.wrap (sub (*@args) {
# preprocess arguments
$result = call('modified', 'arguments');
# postprocess return value
return $result;
})
A subroutine can have multiple wrappers at the same time. Each new
wrapper wraps around the previous one, and the outermost wrapper
executes first. The ID object returned by .wrap allows the
.unwrap method to remove a specific wrapper:
&subname.unwrap($id);
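Wrapping and unwrapping can be sketched in current Raku, with one naming difference to keep in mind: the role this chapter gives to call is played there by callwith (explicit arguments) and callsame (same arguments). The sub greet is hypothetical.

```raku
sub greet ($name) { return "Hello, $name" }

my $id = &greet.wrap(sub ($name) {
    # Pass a modified argument to the original, then postprocess its result.
    my $result = callwith($name.uc);
    return $result ~ '!';
});
say greet('ford');    # Hello, FORD!

&greet.unwrap($id);   # remove that specific wrapper
say greet('ford');    # Hello, ford
```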
If you'd rather not manually unwrap your sub, wrap a temped version
instead. The temp automatically removes the wrapper at the end of
its scope.
{
temp &subname.wrap ({...});
subname(...);
}
Lvalue subroutines pretend to be assignable values, just like any
ordinary variable. They do this by returning a proxy variable which
handles the lvalue behavior for the subroutine (fetch, store, etc.). You
declare an lvalue subroutine with the is rw property.
sub storage is rw {...}
storage() = 5;
An lvalue sub can return an ordinary variable which acts as a proxy, return the return value from another lvalue sub, or it can return a tied proxy variable defined within the sub.
my sub assignable is rw {
my $proxy is Proxy(
FETCH => {...},
STORE => {...},
...
);
return $proxy;
}
This example defines an lvalue sub named assignable. It creates a
proxy variable tied to a Proxy class that defines FETCH and
STORE tie methods on the fly.
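Both styles of lvalue sub can be tried with a sketch in current Raku, where the proxy is built with Proxy.new rather than the is Proxy(...) form shown above. The names $stored and doubled are invented for the example.

```raku
my $stored = 0;

# Returning an ordinary variable: the last expression hands back the container.
sub storage is rw { $stored }
storage() = 5;
say $stored;      # 5

# Returning a proxy that defines FETCH and STORE on the fly.
sub doubled is rw {
    Proxy.new(
        FETCH => method ()     { $stored * 2 },
        STORE => method ($new) { $stored = $new },
    );
}
doubled() = 21;
say $stored;      # 21
say doubled();    # 42
```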
Macros are a powerful way of manipulating source code at compile time.
Macros must be declared before they're called. A call to a macro
routine executes as soon as it's parsed. The parser substitutes the
return value from the macro into the parse tree in place of the macro
call. If a macro
returns undef, it makes no entry in the parse tree. So, the macro
disappear takes a single string argument and returns undef.
Any call to disappear is replaced at compile time with nothing,
just as if it were commented out.
macro disappear (Str $thinair) {
return;
}
...
disappear("Some text you'll never see");
This technique might seem like a nice way to add custom comment operators,
but as we will see in Chapter 7: "Grammars and Rules", there are
better ways to add custom commenting operators.
If a macro returns a string, the string is parsed as
Perl source code, and the resulting parse tree replaces the macro call.
So, anywhere the macro twice is called, it is replaced at compile time
by a for modifier.
macro twice {
return "for 1..2";
}
...
print "\n" twice; # same as: print "\n" for 1..2;
If a macro returns a block, that block is parsed as a closure, and the
resulting parse tree replaces the macro call. So, when the
reverse_numeric macro is called, the parser substitutes the block
{ $^b <=> $^a } in place of the call.
macro reverse_numeric {
return { $^b <=> $^a };
}
...
sort reverse_numeric, @values;
If a macro returns a parse tree, the parser substitutes it directly for the macro call. The returned tree may be the original parse tree, a modified parse tree, or a manufactured parse tree.
By default, a call to a macro is parsed just like an ordinary
subroutine call, so it can take no arguments or a comma-separated list
of arguments. But, macros can also modify the way their arguments are
parsed, by adding an is parsed trait. The trait takes a rule as an
argument, and will parse whatever code follows using that rule instead of
the normal rule for parsing subroutine arguments. So, the macro
funky essentially translates a "ValSpeak" subroutine call into an
ordinary Perl subroutine call. It takes a single string argument,
which it parses as a sequence of word-forming characters, surrounded
by the strings "like" and ", you know". It then returns a block that
will call the plain subroutine with the single argument passed to
funky.
macro funky (Str $whatever)
is parsed (/:s like (\w+), you know/)
{
return { plain($whatever); };
}
...
funky like whatever, you know