This file was automatically generated from http://svn.pugscode.org/pugs/docs/Perl6/Spec/S07-iterators.pod on Sat Aug 1 14:01:18 2009 GMT, revision 27701.
Synopsis 7: Iterators and Laziness
Tim Nelson <wayland@wayland.id.au>
Daniel Ruoso <daniel@ruoso.com>
Created: 27 Nov 2008
Last Modified: 20 Apr 2009
Version: 5
As we all know, one of the primary virtues of the Perl programmer is laziness. This is also one of the virtues of Perl itself. However, Perl 6 knows better than to succumb to false laziness, and so is eager sometimes, and lazy others.
One thing that Perl understands is the difference between Laziness and Eagerness. When something is Lazy, it says "just give me what you've got; I'll get the rest later", whereas when it's eager, it says "More! More! Give me everything you can get!".
Perl 6 defines 4 levels of laziness:
It's important to realize that the responsibility of determining the level of laziness/eagerness in each operation is external to each lazy object, the runtime, depending on which operation is being performed, is going to assume the level of laziness and perform the needed operations to apply that level.
In order to provide p5-like behavior in list assignment, this operation is performed in the Mostly Eager level, meaning that if you do
my @a = grep { ... }, @b;
the grep will be evaluated as long as @b is not infinite.
my @a = grep { ... }, 1, 2, 3, 4..*
will give grep an infinite list (even if the first elements are known), therefore it will also be lazy. On the other hand
my @a = grep { ... }, 1, 2, 3, 4;
will be eagerly evaluated.
The feed operator is strictly lazy, meaning that no operation should be performed before the user requests any element. That's how
my @a <== grep { ... } <== map { ... } <== grep { ... } <== 1, 2, 3
is completely lazy, even if 1,2,3 is a fairly small known compact list.
But it's important to notice that eagerness takes precedence over laziness, meaning that
my @a = grep { ... } <== map { ... } <== grep { ... } <== 1, 2, 3
Will be eagerly evaluated, but that is still different from
my @d = 1,2,3;
my @c = grep { ... }, @d;
my @b = map { ... }, @c;
my @a = grep { ... }, @b;
Because in the first, the processing would be made as a flow, avoiding the creation of the intermediary eager lists that the second example creates. On the other hand
my @d <== 1,2,3;
my @c <== grep { ... }, @d;
my @b <== map { ... }, @c;
my @a = grep { ... }, @b;
provides the same laziness level of the first example.
The iterator role represents the lazy access to a list, walking through:
It's important to realize that the iterator of a list can be accessed by the .Iterator() method (but only the runtime will be calling that most of the time), and the implementation of each iterator is private to the list and implementation specific.
This is a minimal API that should allow custom iterator implementations, but this spec should be expanded in the future to provide additional API for batch-aware iterators.
The methods in this role are:
Returns the next item for that iteration. The grouping of elements returned in each iteration is visible if this iterator is being used to build a slice. While building a List, the items will be flattened.
When it runs out of items, it will return Nil.
This role defines an iterator that knows how to receive values back to be consumed again as if they were never consumed. The iterator is free to refuse values that were not consumed first and in the correct order, since this role is not intended to modify the original data, only to modify the traversal of that data.
This pushes $value back to the array. If this iterator is proxying immutable data, it is free to refuse a value that is either out of the original order or if it wasn't consumed. In that case, it should fail with UnorderedPushBackException or BadPushBackException.
This role defines an iterator that can receive new elements in the top of the stack, it works like pushback, but it should be able to store objects that are completely new to this stream. Either by being backed by a mutable list or by providing a local stack. This role implies Iterator::PushBack, where pushback will be handled by unshift.
The purpose of defining two different Roles is to give the user the "expect to fail" or "expect to succeed" semantics.
This unshifts $value either to the backed list or to a local stack of the iterator.
Perl's built-ins require that a number of auxiliary types implement Iterators. These are available for general use, and are instantiated by ending a feed at a scalar, array, or sliced array.
To create a generic item iterator, end a feed in a scalar:
my $it <== @a;
You can later do:
my $item = $it.get; my @items = @$it;
(XXX TODO: effect of type constraints of $it -- maybe to control flattening? )
(XXX TODO:
What's this do? my $it <== @a; my $it2 <== $it; my $it3 <== $it; get $it2; get $it3; ...allow tag team suckling? Is it worth the dwim, or do we just forbid it?
)
To obtain a generic lazy list, end a feed in a Positional.
my @it <== 1..2000;
The generic lazy list consumes the feed on demand. As the elements of
the list are accessed, values that were in each subsequently returned
Capture are placed at the end of the Positional. If a Capture
with more than one value is returned, each value is stored at its own
index. Empty Captures have no effect, and so the information that
they ever existed at all is lost.
You can later do things like:
@c = @it[1..5];
$s = @it[4];
...and this may run the iterator to find any values which are only lazily present. Values which have already been retrieved from the iterator are kept available unless they are removed, so if you are working with a large sequence you may want to:
shift(@it);
Reading and writing indices in a generic lazy list iterator will force consumption and caching of values for all indices up to and including the accessed one.
@it[4] = 2; # @it[0..3] are cached now, @it[4] was consumed/destroyed
Adding to the beginning of the list just stores values without touching the iterators.
@it.unshift(<a b c>); # works as expected
Trying to find the number of elements in any way may force eager consumption of the remainder of the values in the iterator, and trying to access the end of the list certainly will.
@it.pop; # Eagerness happens. Maybe bad things, too.
...but you had better be very sure that the list is finite. Attempts to
fiddle with the end of an infinite list may result in the interpreter
raising quite the exception, in the luckiest cases it may just deliver an
Inf, but in some cases you may find yourself spending a very, very
long time doing laps around the yard. The exact behavior may be
implementation specific.
So don't do that.
If the list is known to be finite, the generic array iterator will revert to a normal array once the feed is exhausted. So it is legal to:
@it.push; # eagerly exhausts the feed, then turns into to a normal array
...with the same caveats about infinity.
(XXX TODO: effect of type constraints/shape of @it? )
(XXX TODO:
my @a = (1,2); @a <<== ... promotes an occupied Positional to iterator, right? Should probably be explicitly mentioned. )
The generic lazy slice consumes the Captures from an iterator but
stores the results as a bi-dimensional list, where the first dimension
corresponds to an iteration, and the second contains the values in
the Capture returned for that iteration. Empty Captures are
stored just like the rest of the iterations.
To obtain a generic lazy slice, end a feed in a sliced Positional.
my @@it <== map { ... }, 1,2,3;
(XXX TODO:
@@it <== (1,mysub(),2;1,2,3);
@@it[0];
@@it[0;1];
exactly when does mysub() get called?)
Perl6 does not have a formally defined sub-style coroutine. Doubtless there will be external modules to do so in different flavors. Such a construct, where many calls made to the name of a sub with different parameters, expecting to reach the same sub every time, is not really compatible with multi-method-dispatch. While it may suit some programming styles which only use a subset of the Perl6 language, it cannot be made generally applicable across the Perl6 feature set.
This is not to say you cannot do coroutines in Perl6. In fact, the gather/take construct is a simple coroutine. But if you want to pass values back and forth Lua-style, you have to use a suplimentary object:
sub my_coro (*@slurp) {
gather do {
my Int $initval;
my Str $secondval;
my Int $thirdval;
$initval = shift(@slurp);
$initval++;
take $initval;
$secondval = shift(@slurp);
take "$initval $secondval";
$thirdval = shift(@slurp);
take $thirdval + $initval;
}
}
my @a = (1);
my $it;
$it <== my_coro() <== while 1 { shift(@a) };
say "First result: " ~ get $it;
@a.push("Hello World");
say "Second result: " ~ get $it;
@a.push(500);
say "Third result: " ~ get $it;
...if you want to pass multiple parameters on each call, you can
use a slice slurpy instead, to pass a Capture.
The iterator and array can of course be bundled up to give a more natural feel:
class my_sub2coro {
has $!it = Failure;
has @!data = ();
has $.coro;
submethod BUILD {
$!it <== &($.coro)() <== while 1 { shift(@!data) };
}
method corocall($message) {
@!data.push($message);
get $!it;
}
}
my $c = my_sub2coro.new(coro => &my_coro);
say "First result" ~ $c.corocall(1);
say "Second result" ~ $c.corocall("Hello World");
say "Third result" ~ $c.corocall(500);
(XXX TODO: As feeds are not implemented yet, above example code not tested)
Please post errors and feedback to perl6-language. If you are making a general laundry list, please separate messages by topic.