
This file was automatically generated from http://svn.pugscode.org/pugs/docs/tutorial/ch05_subroutines.pod on Sat Aug 1 14:01:20 2009 GMT, revision 27701.


Subroutines

Subroutines are reusable units of code, and one of the fundamental building blocks of modern programming languages. They can be called from just about anywhere, and return control to the point of the call when they finish executing. They can be passed zero or more arguments and return zero or more results. Some programmers may be more familiar with languages that allow only a single return value from a subroutine, or languages in which subroutines have exactly zero return values and "functions" may have one. Perl 6 generalizes the concept to allow subroutines to return as many, or as few, return values as needed. We think it just makes more sense to let the programmer do what they want to do in this regard.

Subroutines can be named or anonymous. They can be lexically scoped, package scoped, or globally scoped. Subroutines can be "Multi" subs, which allow multiple subroutines to have the same name as long as they have different parameter lists.

Blocks of reusable code can also be called the "methods" of a particular class of object. Methods have a few significant differences from subroutines, and a few significant differences from those found in Perl 5. For instance, in Perl 6 they're distinguished by a separate keyword, method. Because of these differences, they'll be discussed in Chapter 6.

Using Subroutines

The most basic subroutine declaration is simply the sub keyword, followed by the name of the sub, followed by the block that defines the sub:

  sub alert {
      print "We have normality.";
  }

The simplest subroutine call is just the subroutine name followed by a comma-separated list of variables or values:

  $result = sum($a, $b, 42, 57);

Arguments are also sometimes passed as anonymous pairs:

  $result = sum(first => 12, second => 21);

Parentheses are optional on calls to subroutines that have been predeclared, but required for all other calls. Including the & sigil before the subroutine name in a call will not turn off signature checking as it did in Perl 5. In fact, in most contexts prefixing the subroutine name with & will return a reference to the subroutine instead of actually calling the subroutine.

Parameters

One of the most significant additions to subroutines in Perl 6 is formal parameters. The parameter list, often called the signature of the subroutine, is part of the subroutine declaration:

  sub standardize ($text, $method) {
      my $clean;
      given $method {
          when 'length' { $clean = wrap($text, 72); }
          when 'lower' { $clean = lowercase($text); }
          ...
      }
      return $clean;
  }

The subroutine standardize has two scalar parameters, $text and $method, so it is called with two arguments, each a scalar value. The parameters are lexical variables within the body of the sub. They never need to be explicitly declared with my, even under use strict, because the subroutine declaration already declares them.

In a sub with no parameter list, all arguments are passed in the @_ array:

  sub sum {
      my $sum;
      for @_ -> $number {
          $sum += $number;
      }
      return $sum;
  }

Subroutines with defined parameter lists don't get an @_ array. If you want a subroutine that takes no arguments (and complains when arguments are passed), define it with an empty parameter list ().

Subroutine calls normally provide a non-flattening list context, which means that any array or hash passed into a subroutine is treated as a single argument. An array parameter in the signature expects to be passed an actual array or arrayref argument, and a hash parameter expects a hash or hashref argument:

  sub whole (@names, %flags) {
      ...
  }

  whole(@array, %hash);

Optional Parameters

Every subroutine call checks its signature to make sure the arguments it gets match the declared parameter list. A mismatch in the number or kind of arguments is an error. But since requiring every declared parameter to be passed in on every call isn't nearly flexible enough for the average programmer, Perl 6 also allows optional parameters. Optional parameters can be included or ignored without causing any errors. Each optional parameter is marked with a ? after the parameter name:

  sub someopt ($required1, $required2, $optional1?, $optional2?) {
      ...
  }

So, someopt will accept anywhere from 2 to 4 arguments. You can have any number of required and optional parameters, but all the required parameters must come before the first optional parameter. This is largely a common sense restriction. If you want to leave some elements off a list of arguments, it has to be the ones at the end, because positional arguments are bound to the parameters in strict linear order. All these calls to someopt are fine:

  someopt('req1', 'req2', 'opt1', 'opt2');
  someopt('req1', 'req2', 'opt1');
  someopt('req1', 'req2');

Notice that it is still an error to pass too many or too few arguments to a subroutine with optional parameters:

  someopt('req1')                                  # WRONG!
  someopt('req1', 'req2', 'opt1', 'opt2', 'extra') # WRONG!
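
As a rough sketch of how the valid calls behave, the sub below reports how many of its parameters received a value. It assumes a current Rakudo compiler, and count_bound is a hypothetical name chosen for illustration.

```raku
# count_bound is a hypothetical name; it reports how many parameters received a value.
sub count_bound($required1, $required2, $optional1?, $optional2?) {
    my $count = 2;                       # the two required parameters are always bound
    $count++ if $optional1.defined;      # an omitted optional parameter is undefined
    $count++ if $optional2.defined;
    return $count;
}

say count_bound('req1', 'req2');                  # 2
say count_bound('req1', 'req2', 'opt1');          # 3
say count_bound('req1', 'req2', 'opt1', 'opt2');  # 4
```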

Named Parameters

Any argument can be passed either by position with an ordered list of arguments, or by name with an unordered list of pairs (see "Named Argument Passing" later in this chapter for more details). Sometimes you'll want to specify that certain parameters will be passed only by name, never by position. Named parameters are marked with a : before the parameter name:

  sub namedparams ($first, :$second, :$third) {
      ...
  }

  namedparams(1, second => 2, third => 3);

Named parameters are always optional. They must come after all positional parameters--that is, after the unmarked required parameters and the optional parameters marked with a ?. Again, this is largely a matter of common sense. Though named parameters are completely ignored when binding a list of positional arguments, the parser and the person maintaining your code will both be profoundly grateful they don't have to sort through a mixed bag of positional and named parameters to find the positional parameter list.

If it makes more sense to do it, you can also use the alternate key syntax for passing parameters:

  namedparams(1, :second(2), :third(3));

Because named parameters aren't positional, you can pass them in any order. So, we can pass third in the second position and second in the third position:

  namedparams(1, third => 3, second => 2);

The point of having a name for your parameter is that you don't have to worry about the position of it.
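
A small sketch can make this concrete. It assumes a current Rakudo compiler; the body of namedparams is filled in here purely for illustration.

```raku
# The body is an illustrative assumption; the signature matches the one used above.
sub namedparams($first, :$second, :$third) {
    return "first=$first second=$second third=$third";
}

# Both pair syntaxes work, and the order of the named arguments is irrelevant.
say namedparams(1, second => 2, third => 3);   # first=1 second=2 third=3
say namedparams(1, :third(3), :second(2));     # first=1 second=2 third=3
```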

Variadic Parameters

Another element of flexibility Perl developers will expect is the ability to pull a list of arguments into an array or hash parameter. These are known as variadic parameters because they can take a variable number of arguments. In Perl 6, an array parameter with a * before the parameter name will slurp up all the positional arguments that haven't already been bound to another positional parameter. So, the following call to transport binds $arthur to @names[0], and $ford to @names[1]:

  sub transport ($planet, *@names) {
      ...
  }
  
  transport('Magrathea', $arthur, $ford);

If the variadic array parameter is the only positional parameter in the signature, it will take all the positional arguments.

  sub simple (*@_) {...}
  # is the same as
  sub simple {...}

A hash parameter with a * before the name will slurp up all the named arguments that haven't already been bound to another parameter. So, the following call to transport binds the value of the pair argument with the key 'planet' to the parameter :$planet, but all the other pairs become part of the %flags hash (more on this in "Named Argument Passing" later in this chapter).

  sub transport (:$planet, *%flags) {...}

  transport(:name('Arthur'),
            :luggage('lost'),
            :planet('Magrathea'),
            :towel('required'));

When they're combined with other kinds of parameters, variadic positional parameters must come after all positional parameters in the signature. They can either precede or follow the named parameters. Variadic named parameters only slurp up the named parameters that aren't bound already, so they can appear anywhere in the signature.
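
The following sketch combines a required parameter with both kinds of variadic parameter. It assumes a current Rakudo compiler, and the body of transport is a hypothetical summary written only to show what each parameter receives.

```raku
# Hypothetical body: summarizes what each parameter received.
sub transport($planet, *@names, *%flags) {
    my $who   = @names.join(', ');             # leftover positional arguments
    my $flags = %flags.keys.sort.join(', ');   # leftover named arguments, sorted by key
    return $planet ~ ': ' ~ $who ~ ' [' ~ $flags ~ ']';
}

say transport('Magrathea', 'Arthur', 'Ford', :towel('required'), :luggage('lost'));
# Magrathea: Arthur, Ford [luggage, towel]
```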

Typed Parameters

Signature checking is sensitive not only to the number of arguments and the variable type (defined by the $, @, %, or & symbol), but also to the value type. The parameter type is defined before the parameter name and before any symbols for optional, named, or variadic parameters:

  sub typedparams (Int $first, Str $second?) {...}

The parameter type declares the type of argument that can be bound to it. The parameter and argument types have to be compatible, but not identical.

Type checking happens at compile time whenever possible, because it's faster and can be optimized. Otherwise, type checking happens at run time. So, if all the arguments passed to the subroutine are explicitly typed, the types will be checked at compile time. If the arguments aren't explicitly typed, the run-time checks will make sure the scalars contain an integer value and a string value.
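
To see the run-time check in action, here is a minimal sketch that passes untyped variables to a typed parameter. It assumes a current Rakudo compiler; double_it is a hypothetical sub, and try is used only to keep the failed call from ending the program.

```raku
# double_it is a hypothetical sub with a single typed parameter.
sub double_it(Int $number) { return $number * 2 }

my $good = 21;
my $bad  = 'twenty-one';

say double_it($good);               # 42

# The variable is untyped, so the mismatch is only caught when the call runs.
my $result = try double_it($bad);   # try swallows the error and yields an undefined value
say $result.defined ?? $result !! 'rejected: not an Int';   # rejected: not an Int
```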

Properties on Parameters

By default, parameters are aliases to the original arguments (pass-by-reference), but they're marked as constant so they cannot be modified within the body of the subroutine. The is rw property marks a parameter as modifiable, so changes to the parameter within the body of the sub modify the original variable passed in:

  sub modifyparams ($first is rw, $second is rw) {...}

Don't use the is rw property unless it's necessary. Inadvertent changes in a subroutine can alter data in the caller's scope that the caller doesn't expect to change. At the very least, it's polite to document it when your subroutine is monkeying around in somebody else's scope.

The is copy property marks a parameter as pass-by-value, so the parameter is a lexically scoped copy of the original value passed in:

  sub passbyvalue ($first is copy, $second is copy) {...}

This means that the parameter is not an alias for the value in the caller's scope, but it's not read-only either. It's a new variable, a copy of the original and free to be used however your subroutine sees fit.
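
The difference between the two properties shows up in what happens to the caller's variable. This sketch assumes a current Rakudo compiler; increment and incremented are hypothetical names.

```raku
# increment changes the caller's variable; incremented only changes its own copy.
sub increment($number is rw)     { $number++ }
sub incremented($number is copy) { $number++; return $number }

my $count = 1;
increment($count);
say $count;                # 2  (the original variable was modified)
say incremented($count);   # 3  (the copy was modified and returned)
say $count;                # 2  (the original is untouched this time)
```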

Default Values for Parameters

Sometimes it is useful to be able to define a default value for an optional or named parameter. The = operator marks the default value. The parameter takes the default value only if the call doesn't pass an argument for that parameter.

  sub default_vals ($required, $optional? = 5) {...}

Default values are only used with optional parameters. This should make sense because required positional values are always required, and so they always have a value passed.
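
A short sketch of a default in use follows. It assumes a current Rakudo compiler, where giving a positional parameter a default is by itself enough to make it optional, so the ? marker is left off here. The body is an illustrative assumption.

```raku
# The default of 5 applies only when the second argument is omitted.
sub default_vals($required, $optional = 5) {
    return $required + $optional;
}

say default_vals(1);       # 6
say default_vals(1, 10);   # 11
```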

Placeholder Variables

Placeholder variables are a simple alternative to formal parameter lists. They have many of the advantages of ordinary parameters, without the inconvenience of declaring a signature. You just use variables with a caret after the sigil--$^name, @^name, %^name, or &^name--within the subroutine's block, and the arguments passed into the subroutine are bound to them.

  @sorted = sort { $^a <=> $^b } @array;

The order of the parameters is determined by the Unicode sorting order of the placeholders' names, so the example below acts as if it has a formal parameter list of ($^milk, $^sugar, $^tealeaves):

  $make_tea = {
      my $tea = boil $^tealeaves;
      combine $tea, $^sugar, $^milk;
      $tea;   # the block's last value; return is reserved for sub and method
  };

Placeholders are handy in short subroutines and bare blocks, but soon become unwieldy in anything more complicated.
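
The ordering rule is easiest to see by calling such a block. This sketch assumes a current Rakudo compiler, where sort is an ordinary routine and needs a comma after its block. The tea-making block here just builds a string, which is an illustrative stand-in.

```raku
# Numeric sort with two placeholders; $^a sorts before $^b, so it gets the first argument.
my @sorted = sort { $^a <=> $^b }, 10, 9, 100;
say @sorted;   # [9 10 100]

# Placeholders are bound in name order: milk, sugar, tealeaves.
my $make_tea = { $^tealeaves ~ ' with ' ~ $^milk ~ ' and ' ~ $^sugar };
say $make_tea('milk', 'sugar', 'oolong');   # oolong with milk and sugar
```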

Return Values

In addition to a signature for the incoming parameters to a subroutine, you can also declare a partial signature, or siglet, for the values returned from a subroutine. Return siglets declare the type of each return value, but they don't bind a named variable to the returned value and can't define a default value for the return. In the declaration, the return signature goes before the sub keyword, after the parameter list attached with the returns keyword, or inside the parameter list after a --> arrow.

  sub get_value (Int $incoming) returns Int {...}
  # same as
  Int sub get_value (Int $incoming) {...}
  # same as
  sub get_value (Int $incoming --> Int) {...}

All of these forms have exactly the same effect, but using the returns keyword is usually clearer when the sub has multiple return values:

  sub get_values (Str $incoming) returns (Int, Str) {...}
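
Here is a minimal sketch of two of these forms with real bodies. It assumes a current Rakudo compiler and sticks to single return types; get_label and both bodies are hypothetical.

```raku
# Return type declared inside the signature with -->.
sub get_value(Int $incoming --> Int) { return $incoming * 2 }

# Return type declared with the returns keyword.
sub get_label(Int $incoming) returns Str { return "value $incoming" }

say get_value(21);   # 42
say get_label(7);    # value 7
```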

Arguments

The standard way of passing arguments is by position. The first argument passed in goes to the first parameter, the second to the second, and so on:

  sub matchparams ($first, $second) {...}
  
  matchparams($one, $two);  # $one is bound to $first
                            # $two is bound to $second

Named Argument Passing

You can also pass arguments in by name, using a list of anonymous pairs. The key of each pair gives the parameter's name and the value of the pair gives the value to be bound to the parameter. When passed by name, the arguments can come in any order. Optional parameters can be left out, even if they come in the middle of the parameter list. This is particularly useful for subroutines with a large number of optional parameters:

  sub namedparams ($first, $second?, $third? is rw) {...}
  
  namedparams(third => 'Trillian', first => $name);

Sometimes the option syntax for pairs is clearer than the pair constructor syntax:

  namedparams :third('Trillian'), :first($name);

Flattening Arguments

To get the Perl 5-style behavior where the elements of an array (or the pairs of a hash) flatten out into the parameter list, use the flattening operator | in the call to the subroutine. Here, $first binds to @array[0] and $second binds to @array[1]:

  sub flat ($first, $second) {...}
  
  flat(|@array);

A flattened hash argument acts as a list of pairs, which are bound to the parameters just like ordinary named arguments. So, $first is bound to %hash{'first'}, and $second is bound to %hash{'second'}:

  sub flat_hash (:$first, :$second) {...}


  %hash = (first => 1, second => 2);
  flat_hash(|%hash);

Flattened hash arguments are useful for building up hashes of named arguments to pass in all at once.
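
Both kinds of flattening can be sketched together. The example assumes a current Rakudo compiler, and the bodies of flat and flat_hash are illustrative assumptions.

```raku
sub flat($first, $second)        { return "$first then $second" }
sub flat_hash(:$first, :$second) { return $first + $second }

my @array = 'a', 'b';
my %hash  = first => 1, second => 2;

say flat(|@array);        # a then b   (elements become positional arguments)
say flat_hash(|%hash);    # 3          (pairs become named arguments)
```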

Argument Order Constraints

Arguments to subroutine calls have a standard general order. Positional arguments, if there are any, always go first. Named arguments go after any positional arguments. Variadic arguments always go at the end of the list.

  order($positional, named => 1, 'va', 'ri', 'ad', 'ic');

Positional arguments are first so the parser and the person maintaining the code have an easier time associating them with positional parameters. Variadic arguments are at the end because they're open-ended lists.

If a subroutine has only required and variadic parameters, you can always call it with a simple list of positional arguments. In this example 'a' is bound to $req and the rest of the arguments go to the slurpy array:

  sub straight_line ($req, *@slurpy) {...}

  straight_line('a', 'b', 'c', 'd', 'e');

If a subroutine has some optional parameters and a variadic array you can call it with a simple list of positional arguments, but only if you have arguments for all the optional parameters. In this example, 'a' is bound to $req, 'b' is bound to $opt and the rest of the arguments go to the slurpy array:

  sub mixed ($req, $opt?, *@slurpy) {...}

  mixed('a', 'b', 'c', 'd', 'e');

If you want to skip some of the optional parameters, you have two choices. When the argument list has at least one named argument, the parser knows to start the variadic list right after the named arguments end. This example binds 'a' to $req, binds the value 1 to $opt by name, skips $another, and puts the rest of the arguments in the variadic array:

  sub mixed ($req, $opt?, $another?, *@slurpy) {...}

  mixed('a', 'opt' => 1, 'b', 'c', 'd', 'e');

If you want to skip all the optional parameters you need to use the <== operator in place of the comma to mark where the variadic list starts. This example binds 'a' to $req, skips $opt and $another, and puts all the rest of the arguments in the variadic array:

  mixed('a' <== 'b', 'c', 'd', 'e');

You have to watch out for optional and variadic parameters when you modify subroutines already in use. Adding an extra optional parameter to a signature with a variadic array will break any calls that passed all positional arguments. You could suggest that all users call your subroutines with <== in case you decide to change them later, or you could just add the new parameters as named parameters instead of optional parameters. Named parameters ignore positional arguments, so this version of the subroutine puts 'b' through 'e' in the variadic array with or without any named arguments in the call:

  sub mixed ($req, :$opt, :$another, *@slurpy) {...}

  mixed('a', 'opt' => 1, 'b', 'c', 'd', 'e');
  mixed('a', 'b', 'c', 'd', 'e');

As usual, there's more than one way to do it.

Subroutine Stubs

To declare a subroutine without defining it you give it a body consisting of nothing but the ... (or "yada, yada, yada") operator, optionally followed by a message. So, all the preceding examples that look like pseudocode with {...} for their body are actually valid subroutine declarations.

  sub stubbly (Str $name, Int $days?) {...}

You can include a message that appears in the error message if you try to execute a stub subroutine.

  sub stubbly (Str $name, Int $days?) { ... "Don't call me" }

When you later define the subroutine, the signature and defined traits must exactly match the declaration.

  sub stubbly (Str $name, Int $days?) {
      print "$name hasn't shaved in $days day";
      print "s" if $days > 1;
  }

Subroutine Scope

Just like variables, subroutine names are simply entries in a symbol table or lexical scratchpad. So, all subroutines live in a particular scope, whether it's lexical, package or global scope.

Package Scoped Subroutines

Package scope is the default scope for subs. A sub that is declared without any scope marking is accessible within the module or class where it's defined with an unqualified call, like subname(), and accessible elsewhere with a fully-qualified call using the Package::Name::subname() syntax.

  module My::Module {
    sub firstsub ($param) {...}

    sub secondsub {
      firstsub('arg'); # call the subroutine
    }
  }

  module Other::Module {
    use My::Module;

    sub thirdsub {
       My::Module::firstsub('arg');
    }
  }

This example declares two modules, My::Module and Other::Module. My::Module declares a subroutine firstsub and calls it from within secondsub. Other::Module declares a subroutine thirdsub that calls firstsub using its fully qualified name.

Lexically Scoped Subroutines

Subroutines can also be lexically scoped, just like variables. A my-ed subroutine makes an entry in the current lexical scratchpad with a & sigil. Lexically scoped subs are called just like a normal subroutine:

  if $dining {
      my sub dine ($who, $where) {
          ...
      }
  
      dine($zaphod, "Milliways");
  }
  
  dine($arthur, "Nutri-Matic");  # error - not in scope!

The first call to the lexically scoped dine is fine, but the second would be a compile-time error because dine doesn't exist in the outer scope.

The our keyword declares a lexically scoped alias to a package scoped subroutine (it has an entry both in the symbol table of the current package and in the current lexical scratchpad). This is useful under certain levels of strictness.

  if $dining {
      our sub pay ($when, $what) {
          ...
      }
  
      pay($tuesday, "hamburger");
  }

Globally Scoped Subroutines

Globally scoped subroutines are visible everywhere, unless they're overridden by a lexical or package scoped subroutine of the same name. They are declared with the * symbol before the name of the subroutine:

  sub *seen_by_all ($why, $how) {...}

Most built-ins will be globally scoped.

Anonymous Subroutines

Anonymous subroutines do everything that ordinary subroutines do. They can declare a formal parameter list with optional and required parameters, take positional and named arguments, and do variadic slurping. The only difference is that they don't define a name. But since you can't call a subroutine if you have no way to refer to it, they have to get the equivalent of a name somewhere, whether they're assigned to a variable, passed as a parameter, or aliased to another subroutine.

  $make_tea = sub ($tealeaves, :$sugar, :$milk) {...}

The arrow operator used with for and given is just a way of defining anonymous blocks. The arrow doesn't require parentheses around its parameter list. It can't declare named subs, and can't declare a return type.

  $make_tea = -> $tealeaves, :$sugar, :$milk {...}

A bare block can also define an anonymous subroutine, but it can't define a formal parameter list on the sub and can't define a named sub.

  $make_tea = { 
      my $tea = boil 'tealeaves';
      combine $tea, 'sugar', 'milk';
  }
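
The three ways of writing an anonymous subroutine can be compared side by side. This sketch assumes a current Rakudo compiler; all three variable names and bodies are hypothetical.

```raku
my $add    = sub ($x, $y) { return $x + $y };   # anonymous sub with a signature
my $double = -> $n { $n * 2 };                  # arrow block with one parameter
my $shout  = { $_.uc };                         # bare block; its argument arrives in $_

say $add(2, 3);       # 5
say $double(21);      # 42
say $shout('tea');    # TEA
```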

You can't use the return statement within an arrow block or bare block sub to return from that block. Blocks and arrow subs are commonly used for ordinary control flow, so return ignores them and only returns from a subroutine defined with the sub keyword or from a method.

Instead you can leave a block:

  my $make_tea = {
      leave unless teatime();
      my $tea = boil 'tealeaves';
      combine $tea, 'sugar', 'milk';
  };

The simple rule is that everything declared with the sub or method keyword uses the return statement, while blocks declared without such a keyword use leave.

Multi Subroutines

You can define multiple routines with the same name but different signatures in the same scope. These are known as "multisubs" and defined with the multi keyword before sub. They're useful if you want a routine that can handle different types of arguments in different ways, but still appear as a single subroutine to the user. For example, you might define an add multisub with different behavior for integers, floats, and certain types of numeric objects:

  multi sub add (Int $first, Int $second) {...}
  multi sub add (Num $first, Num $second) {...}
  multi sub add (Imaginary $first, Imaginary $second) {...}
  multi sub add (MyNum $first, MyNum $second) {...}

When you later call the routine:

  add($apples, $oranges);

it will dispatch to the right version of add based on the types of the arguments passed to it. The parameters used for dispatch selection are called invocants. If you want to use a limited set of parameters as invocants, mark the boundary between invocant parameters and the rest of the signature with a semi-colon:

  multi sub add (Int $first, Int $second; Int $third) {...}

This version of add will dispatch based on the types of the first two arguments passed in, and ignore the type of the third.

Multisubs can also differ in the number of arguments as long as no ambiguities arise.
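
A small sketch shows dispatch by both type and argument count. It assumes a current Rakudo compiler; describe is a hypothetical multisub.

```raku
# Three candidates share one name; the arguments decide which one runs.
multi sub describe(Int $value)      { return "an integer: $value" }
multi sub describe(Str $value)      { return "a string: $value" }
multi sub describe($first, $second) { return "two values" }

say describe(42);        # an integer: 42
say describe('towel');   # a string: towel
say describe(1, 2);      # two values
```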

Curried Subroutines

Currying allows you to create a shortcut for calling a subroutine with some preset parameter values. The assuming method takes a list of named arguments and returns a subroutine reference, with each of the named arguments bound to the original subroutine's parameter list. If you have a subroutine multiply that multiplies two numbers, you might create a subref $six_times that sets the value for the $multiplier parameter, so you can reuse it several times:

  sub multiply ($multiplicand, $multiplier) {
      return $multiplicand * $multiplier;
  }
  
  $six_times = &multiply.assuming(multiplier => 6);
  
  $six_times(9); # 54
  $six_times(7); # 42
  ...

You can also use binding assignment to alias a curried subroutine to an ordinary subroutine name instead of a scalar variable:

  &six_times := &multiply.assuming(multiplier => 6);

  six_times(7); # 42
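
For comparison, here is a sketch that runs under a current Rakudo compiler. There, positional parameters cannot be filled in by name, so the argument is preset by position instead. This fixes $multiplicand rather than $multiplier, which gives the same products because multiplication is commutative.

```raku
sub multiply($multiplicand, $multiplier) {
    return $multiplicand * $multiplier;
}

# Preset the first positional argument; the result takes the remaining one.
my &six_times = &multiply.assuming(6);

say six_times(9);   # 54
say six_times(7);   # 42
```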

Wrapped Subroutines

Sometimes you might want to wrap extra functionality around a subroutine that was already defined (perhaps in a standard module), but still call it with the same name. The .wrap method is similar to the .assuming method, but more powerful. It takes a subroutine reference as an argument and returns an ID object. Inside the subref wrapper, the call statement marks the point where the original subroutine will be executed.

  $id = &subname.wrap ({
      # preprocess arguments
      # or execute additional code
      call;
      # postprocess return value
      # or execute additional code
  });

  subname(...); # call the wrapped subroutine

By default, the inner subroutine is passed the same arguments as the wrapping subroutine, and the wrapping subroutine returns the same result as the inner subroutine. You can alter the arguments passed to the inner subroutine by adding an explicit argument list to call, and alter the outer return value by capturing the result from call and explicitly returning a value in the wrapper.

  $id = &subname.wrap (sub (*@args) {
      # preprocess arguments
      $result = call('modified', 'arguments');
      # postprocess return value
      return $result;
  })

A subroutine can have multiple wrappers at the same time. Each new wrapper wraps around the previous one, and the outermost wrapper executes first. The ID object returned by .wrap allows the .unwrap method to remove a specific wrapper:

  &subname.unwrap($id);

If you'd rather not manually unwrap your sub, wrap a temped version instead. The temp automatically removes the wrapper at the end of its scope.

  {
    temp &subname.wrap ({...});

    subname(...);
  }

Lvalue Subroutines

Lvalue subroutines pretend to be assignable values, just like any ordinary variable. They do this by returning a proxy variable which handles the lvalue behavior for the subroutine (fetch, store, etc.). You declare an lvalue subroutine with the is rw property.

  sub storage is rw {...}

  storage() = 5;
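
The simplest case, an lvalue sub that hands back an ordinary variable, can be sketched like this. It assumes a current Rakudo compiler, where the variable must be the last value of the sub rather than the operand of return. $stored is a hypothetical variable.

```raku
my $stored = 0;

# The sub's last value is the variable itself, so assigning to the call assigns to it.
sub storage() is rw { $stored }

storage() = 5;
say $stored;   # 5
```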

An lvalue sub can return an ordinary variable which acts as a proxy, return the return value from another lvalue sub, or it can return a tied proxy variable defined within the sub.

  my sub assignable is rw {
      my $proxy is Proxy(
          FETCH => {...},
          STORE => {...},
          ...
      );
      return $proxy;
  }

This example defines an lvalue sub named assignable. It creates a proxy variable tied to a Proxy class that defines FETCH and STORE tie methods on the fly.

Macros

Macros are a powerful way of manipulating source code at compile time. Macros must be declared before they're called. A call to a macro routine executes as soon as it's parsed. The parser substitutes the return value from the macro into the parse tree in place of the macro call. If a macro returns undef, it makes no entry in the parse tree. So, the macro disappear takes a single string argument and returns undef. Any call to disappear is replaced at compile time with nothing, just as if it were commented out.

  macro disappear (Str $thinair) {
      return;
  }

  ...

  disappear("Some text you'll never see");

This technique might seem like a nice way to add custom comment operators, but as we will see in Chapter 7: "Grammars and Rules", there are better ways to add custom commenting operators.

If a macro returns a string, the string is parsed as Perl source code, and the resulting parse tree replaces the macro call. So, anywhere the macro twice is called, it is replaced at compile time by a for modifier.

  macro twice {
      return "for 1..2";
  }

  ...

  print "\n" twice;     # same as:  print "\n" for 1..2;

If a macro returns a block, that block is parsed as a closure, and the resulting parse tree replaces the macro call. So, when the reverse_numeric macro is called, the parser substitutes the block { $^b <=> $^a } in place of the call.

  macro reverse_numeric {
      return { $^b <=> $^a };
  }

  ...

  sort reverse_numeric, @values;

If a macro returns a parse tree, the parser substitutes it directly for the macro call. The returned tree may be the original parse tree, a modified parse tree, or a manufactured parse tree.

By default, a call to a macro is parsed just like an ordinary subroutine call, so it can take no arguments or a comma-separated list of arguments. But macros can also modify the way their arguments are parsed by adding an is parsed trait. The trait takes a rule as an argument, and parses whatever code follows using that rule instead of the normal rule for parsing subroutine arguments.

So, the macro funky essentially translates a "ValSpeak" subroutine call into an ordinary Perl subroutine call. It takes a single string argument, which it parses as a sequence of word-forming characters, surrounded by the strings "like" and ", you know". It then returns a block that will call the plain subroutine with the single argument passed to funky.

  macro funky (Str $whatever)
      is parsed (/:s like (\w+), you know/)
  {
      return { plain($whatever); };
  }

  ...

  funky like whatever, you know