This file was automatically generated from http://svn.pugscode.org/pugs/docs/notes/context_coercion_traits.pod on Wed Jun 6 22:16:47 2007 GMT, revision 16639.
Context and Coercion - The interplay of the method call and the call chain
We've flip-flopped back and forth on whether type and context are equivalent. In this proposal, they are not, but every context is also a type. Context is also a runtime concept, except when we can optimize it away (as in the case of a known signature for a method). Context is an argument that is passed in along with a call. For now, until we get to coercion, the return type of that call must be covariantly compatible with (i.e. a subtype of) the context. This will allow us to overload multimethods based on return type, but it will sidestep the paradoxes that are often associated with that.
So how do we compute the context to pass to a call? It comes from one of the "signature functions" of the routine. These operate on invocant lists. For the routine:
proto foo(A $a, B $b: C $c: D $d, E $e) {...}
The invocant lists are () (empty), ($a,$b), ($c) (and not
($d, $e)). The first one is always empty. Every routine has a
context method that returns an object that contains the contexts for
the first invocant set and another method for the second set. For
example, in the call:
foo($a, $b, $c, $d, $e);
to find the context of $d, you would write something like:
&foo.context().context($a,$b).context($c)[0];
(Of course these wouldn't be real method calls when used by the runtime, because they would have to do context chaining too, and to find those you'd have to do method calls, etc. etc. etc.)
For methods whose signatures are not known at all at compile time, the context object would also have to tell you how big the next set of invocants is.
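The chain of lookups above can be modelled outside Perl. What follows is a minimal Python sketch, not the runtime's actual mechanism: the class ContextStage, the function foo_context, and the use of plain strings for type names are all hypothetical, and the signature functions here ignore the invocants they receive, whereas real ones could inspect them.

```python
class ContextStage:
    """One step in a routine's context chain (hypothetical model)."""

    def __init__(self, contexts, next_stage=None):
        self.contexts = contexts        # contexts for the upcoming parameter set
        self._next_stage = next_stage   # callable that yields the following stage

    def __getitem__(self, index):
        return self.contexts[index]

    def __len__(self):
        # Size of the next parameter set, which a caller needs when the
        # signature is not known at compile time.
        return len(self.contexts)

    def context(self, *invocants):
        if self._next_stage is None:
            raise TypeError("no further parameter sets")
        return self._next_stage(*invocants)


def foo_context():
    # Models: proto foo(A $a, B $b: C $c: D $d, E $e) {...}
    final = ContextStage(["D", "E"])
    second = ContextStage(["C"], lambda c: final)
    first = ContextStage(["A", "B"], lambda a, b: second)
    return first


# Analogue of &foo.context().context($a,$b).context($c)[0]
context_of_d = foo_context().context("a", "b").context("c")[0]
```

Here len(foo_context()) plays the role of "how big the next set of invocants is".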
Let's look at a standard paradox in multimethods that dispatch on return types:
multi doit(--> Int) { 42 }
multi doit(--> Str) { "hello world" }
multi tryit(Int $x) { say "Got an int" }
multi tryit(Str $x) { say "Got a str" }
tryit(doit());
Which variant of tryit() gets called? Well, execute doit() to
find out. But in order to know in which context doit() is executed,
we have to know which variant of tryit() we're going to call. Both
variants of tryit() are perfectly consistent, so we can't pick one to
call.
In order to solve this paradox, we just use the system outlined above.
tryit() has no proto, so it defaults to:
proto tryit(Any $x) {...}
So &tryit.context[0] is Any. This is passed as an argument to
doit(), resulting in an ambiguous multimethod call, and the program
dies.
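To make the resolution concrete, here is a small Python model of dispatching on return type with the context passed as an argument. It is only a sketch under assumptions: the classes Any, Int and Str, the DOIT table, and call_in_context are hypothetical stand-ins, and the model simply treats more than one matching variant as ambiguous without ranking variants by specificity.

```python
class Any:
    """Root of the hypothetical type hierarchy used in this sketch."""


class Int(Any):
    pass


class Str(Any):
    pass


class AmbiguousDispatch(Exception):
    """Raised when more than one variant fits the context."""


# Each variant of doit is recorded as (declared return type, body).
DOIT = [
    (Int, lambda: 42),
    (Str, lambda: "hello world"),
]


def call_in_context(variants, context):
    # A variant matches when its return type is a subtype of the context.
    matches = [body for ret, body in variants if issubclass(ret, context)]
    if not matches:
        raise LookupError("no variant fits the context")
    if len(matches) > 1:
        raise AmbiguousDispatch("context too general to choose a variant")
    return matches[0]()
```

With the default proto, tryit supplies Any, so call_in_context(DOIT, Any) raises AmbiguousDispatch. A proto that declared its parameter as Int would supply Int instead and select the first variant.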
This is all simple and expected, which is a good thing. One of the
interesting results of this system is that lazy parameters can
only be non-invocants, because if they were invocants, they would be
evaluated immediately upon the call to &foo.context.
Perl has a concept of "list context", in which multiple lists get flattened together and passed as a single argument. How does that work in this system?
The context hierarchy is rooted with a tagged union:
union Context (Singular {}, Plural {}) # made-up syntax
That is, every context is either Singular or Plural, and they
have no common supercontext! It is of course illegal to return
something Singular when your context is Plural and vice-versa. If
you have a function whose signature looks like:
proto foo(*@a) {...}
Then the call &foo.context[$n] for any nonnegative integer $n will
return some subtype of Plural. When the foo() routine gets its
arguments, it will concatenate all of them into the @a array.
The alert reader may notice a violation of the principle of
orthogonality here. The context system allows, say, the third argument
to a routine to be in Singular context and all the rest to be
Plural. However, the language (or at least the standard declaration
system) does not. According to the language, once one parameter is
Plural, all following parameters are Plural.
We can at long last understand references. We know that Scalar and
Array are containers. The thing that has perplexed us is the idea
that:
(\@array).push(42);
Works. The reference doesn't have a push method, so it must be
delegating. But in fact this is not the case.
Every time you mention a variable in an rvalue context[1], you are
calling the container's fetch() method. This method is polymorphic
on context. That is, in Singular context, Scalars fetch the value
they contain, and in Plural context they fetch a singleton List of
the value contained. In Singular context, Arrays fetch
themselves and in Plural context they fetch a List of the
values in the array (that is, they don't fetch themselves in Plural
context, because Array objects are singular).
Finally, the \ operator simply puts the variable in lvalue context,
so that fetch() is not called. This still leaves the one remaining
hole:
my $x = \\1;
That should not return "1 itself", but rather a reference to a scalar to
a reference to a scalar containing 1. How come? Well, the fundamental
thing to realize is that the \ operator works differently on
variables and on non-variables. For non-variables, it puts the value in
Singular context, creates an anonymous scalar and returns it.
Also note that:
@array.push(42)
Still works, because method calls impose a Singular context on their
invocant.
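The behaviour of fetch() and of the \ operator described in this section can be sketched in Python. This is an illustrative model only: the constants SINGULAR and PLURAL and the helpers ref and ref_value are hypothetical names, and Python lists stand in for List.

```python
SINGULAR = "Singular"
PLURAL = "Plural"


class Scalar:
    def __init__(self, value):
        self.value = value

    def fetch(self, context):
        # Singular: the contained value; Plural: a singleton list of it.
        if context == SINGULAR:
            return self.value
        return [self.value]


class Array:
    def __init__(self, items):
        self.items = list(items)

    def fetch(self, context):
        # Singular: the array itself; Plural: a list of its values.
        if context == SINGULAR:
            return self
        return list(self.items)

    def push(self, item):
        self.items.append(item)


def ref(container):
    # Models \ applied to a variable: lvalue context, so fetch() is
    # never called and the container itself is the result.
    return container


def ref_value(value):
    # Models \ applied to a non-variable: wrap the value in an
    # anonymous scalar and return that.
    return Scalar(value)
```

In this model (\@array).push(42) corresponds to ref(array).fetch(SINGULAR).push(42), where the fetch stands for the Singular context a method call imposes on its invocant, and \\1 corresponds to ref_value(ref_value(1)).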
What about assignment? That is an example of a method whose non-invocant types change based on the type of the invocant, which we know is okay from the system described in the first part of this proposal. For the container interface:
role Container {
method infix:<=> (None $value) {...} # [2]
}
And then for the Scalar and Array container interfaces:
role Scalar {
does Container;
method infix:<=> ($value) {...}
}
role Array {
does Container;
method infix:<=> (*@value) {...}
}
Which gives the context-propagation that epitomizes =.
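As a rough Python illustration of how the invocant's type selects the context of the right-hand side, consider the sketch below. The class names and the value_context attribute are hypothetical, Python's None stands in for the None type, and lists stand in for Plural values.

```python
class Container:
    # Context imposed on the value being assigned. None models the None
    # type: nothing may be assigned into a general container.
    value_context = None

    def assign(self, *values):
        raise TypeError("cannot assign into a general container")


class ScalarContainer(Container):
    value_context = "Singular"

    def __init__(self):
        self.value = None

    def assign(self, value):
        # One value, taken as a single item even if it is a list.
        self.value = value


class ArrayContainer(Container):
    value_context = "Plural"

    def __init__(self):
        self.items = []

    def assign(self, *values):
        # Every argument arrives as a list; concatenate them all.
        self.items = [item for value in values for item in value]
```

A caller would read value_context off the container before evaluating the right-hand side, which mirrors asking the signature function for the context once the invocant is known.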
[1] This document is supposed to explain context, so I can't just handwave over that. Have a nice day. (XXX)
[2] Multis require that if a variant A is more specific than a variant
B, then all non-invocant parameters of A must in fact be more general
than the corresponding parameters of B. That allows us to give the
$value parameter the type None (not allowing assignment at all
into a general container), and if anything is known about the container,
the compiler can assume a more general type.
At the beginning of the proposal, I said that type and context are not the same thing. But exactly how are they different? Every context is a type, and they share the same subtype relations. However, the context tree has a constraint: No multiple inheritance[3]. We need this so we can do well-defined multi dispatch on contexts[4].
Recall that we also required that there be no common supercontext of
Singular and Plural (and there are no contexts that are not
subtypes of one of these two). This means that operators need only
decide what to do in those two contexts and the dispatch is completely
well-defined. Subs by default do the same thing in both contexts and
propagate the decision to their return value, which will eventually
propagate to something that will make the decision.
In order to define a type to be a context, declare it with is context
and derive from some other context (both of these things are necessary):
class Foo does Int is context {...}
class Bar does Plural is context {...}
When you introduce a new context, you must be sure that you don't derive from two other contexts (if you do, the compiler will reject your program). Perhaps there is a way to set the "primary context" of a type derived from two contexts, so that the subtype relation will ignore the other when doing context calculations:
class IntStr does Int does Str is context(Str) {...}
The reason that explicit marking of is context is required is so that
we don't kill multiple inheritance altogether. You're allowed to
multiply inherit, as long as all contexts you inherit from form a chain
of subtypes (A <: B <: C <: ...). Basically, you can't be an Int and
a Str at once; you can't be a Ref and a Plural at once, etc.
Classes and roles by default are Singular. It is unclear how to
create a Plural type, but there should be a way (Plural might
have some mandatory interface).
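The chain requirement can be checked mechanically. The following Python sketch shows one way a compiler might test it; the CONTEXT_PARENTS table is an assumed example hierarchy (placing Int and Str under Singular is an assumption made for illustration), and both function names are hypothetical.

```python
# Assumed example context hierarchy: each context maps to its parent contexts.
CONTEXT_PARENTS = {
    "Singular": [],
    "Plural": [],
    "Int": ["Singular"],
    "Str": ["Singular"],
    "Foo": ["Int"],
}


def is_subcontext(a, b):
    """Reflexive, transitive test for a <: b."""
    if a == b:
        return True
    return any(is_subcontext(parent, b) for parent in CONTEXT_PARENTS[a])


def forms_chain(contexts):
    """True when every pair of the given contexts is comparable."""
    return all(
        is_subcontext(a, b) or is_subcontext(b, a)
        for a in contexts
        for b in contexts
    )
```

A class inheriting from Foo, Int and Singular passes the check, while one inheriting from both Int and Str, or from Int and Plural, does not.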
[3] Formally, if A <: B and A <: C then B <: C or C <: B.
[4] In order to do well-defined multi dispatch, you must be sure that the variant you pick is more specific than any other matching variant, or the context was too general to decide. The thing we want to avoid is having a context that is too specific to decide; equivalently, that it is possible to define a completely unambiguous dispatch. In the language of logic:
Given a variant's context type C and its return value r <: C, another variant's context type D and its return value s <: D, and the current context cxt. Show that if both variants match, then one must be more specific than the other; i.e. that if r <: cxt and s <: cxt then C <: D, D <: C, or C <: cxt and D <: cxt (the context was too general).
Proof: By the "no multiple inheritance" constraint in [3], r <: cxt and r <: C implies cxt <: C or C <: cxt. Similarly, cxt <: D or D <: cxt. By cases:
if cxt <: C and cxt <: D, then (by [3]) C <: D or D <: C;
if cxt <: C and D <: cxt, then D <: C (by transitivity);
if C <: cxt and cxt <: D, then C <: D (by transitivity);
if C <: cxt and D <: cxt, then the third alternative of the goal holds.
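The claim in [4] can also be checked by brute force on a small hierarchy. The Python sketch below is a sanity check under assumptions, not a substitute for the proof: TREE and DIAMOND are made-up example hierarchies, and the function names are hypothetical.

```python
from itertools import product


def is_subtype(parents, a, b):
    """Reflexive, transitive test for a <: b over a parent map."""
    if a == b:
        return True
    return any(is_subtype(parents, p, b) for p in parents[a])


def lemma_holds(parents):
    """Check the claim of [4] for every choice of r, s, C, D and cxt."""
    types = list(parents)
    sub = {(a, b): is_subtype(parents, a, b) for a in types for b in types}
    for r, s, c, d, cxt in product(types, repeat=5):
        # Premises: r <: C, s <: D, and both variants match the context.
        if not (sub[r, c] and sub[s, d] and sub[r, cxt] and sub[s, cxt]):
            continue
        # Goal: C <: D, or D <: C, or the context was too general.
        if not (sub[c, d] or sub[d, c] or (sub[c, cxt] and sub[d, cxt])):
            return False
    return True


# Single inheritance only, so constraint [3] is satisfied.
TREE = {
    "Singular": [],
    "Plural": [],
    "Int": ["Singular"],
    "Str": ["Singular"],
    "Foo": ["Int"],
}

# Adding a context with two unrelated parents violates [3].
DIAMOND = dict(TREE, IntStr=["Int", "Str"])
```

On TREE the check succeeds. On DIAMOND it fails, for instance with r = s = cxt = IntStr, C = Int and D = Str: both variants match, yet neither is more specific and the context is not too general.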
We defined what scalars and arrays do in the two base contexts
Singular and Plural. But what happens when the context is more
specific, and the value that the scalar holds does not match? This
violates the guarantee that the return has to be a subtype of the
context. Perl could just fail (and does in general), but there is one
out: &*COERCE[5]. This gets called whenever a value is not a
subtype of the context it is in. Just define it as a regular
return-value-polymorphic function:
multi *COERCE (MyString $s --> Str) {
$s.string_val;
}
Incidentally, this single-invocant multi also allows you to define
COERCE as a method.
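The fallback to a coercion callback can be sketched in Python as follows. This is a hypothetical model rather than how &*COERCE would be implemented: the COERCIONS registry, the coercion decorator and in_context are invented names, Python's built-in str plays the role of Str, and the lookup is by exact source type only.

```python
# Registry of coercions keyed by (source type, target context).
COERCIONS = {}


def coercion(source, target):
    """Register a function that coerces `source` values to `target`."""
    def register(func):
        COERCIONS[(source, target)] = func
        return func
    return register


class MyString:
    def __init__(self, string_val):
        self.string_val = string_val


@coercion(MyString, str)
def my_string_to_str(s):
    return s.string_val


def in_context(value, context):
    # A value that already fits the context is returned untouched.
    if isinstance(value, context):
        return value
    # Otherwise fall back to a registered coercion, or fail.
    func = COERCIONS.get((type(value), context))
    if func is None:
        raise TypeError("value does not fit the context and cannot be coerced")
    return func(value)
```

For example, in_context(MyString("hi"), str) goes through the registered coercion, while in_context(MyString("hi"), int) fails because no coercion to int is registered.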
[5] Note that it isn't coerce:<as>, as it's not really a
syntactic category at all; it's just infix. Better to have a callback
sub and let the syntax transform into it.
How are lvalue and rvalue encoded in context? How is a hash different from an array?