Now that async/await has been released, attention has drifted back to refining stackless coroutines (the unstable language feature that makes async/await possible). Alas, the latest RFC has shown that there is still a lot of disagreement on what exactly coroutines in Rust should look like beyond async/await. I felt it would be useful to flesh out what coroutines could be so we can better discuss what they should be.
Edit from future Sam! After almost a year of discussions with other people, we settled on language team MCP-49 as the “best” version of coroutines. Further discussion is available in a series of design notes on the language team website.
Before we go any further, a quick reminder of how coroutines work. In theory, a coroutine is just a kind of function which has the ability to pause and resume without blocking any OS threads. For example, an async function will pause when you .await on a future which hasn’t resolved yet. In languages like Go or Python, coroutines are hosted by some sort of runtime which maintains each call stack and automatically jumps around as needed.
Rust doesn’t have a heavy runtime, so coroutines have to be assembled somehow out of ordinary functions and structs. In fact, an anonymous struct is exactly what async produces. The coroutine is executed by calling a resume function which can later pause execution by saving variables into the struct and returning back to the caller. This is why they are called “stackless coroutines”. It also means that any coroutine syntax in Rust (e.g. async/await) is just a fancy way of creating a state machine! For example, consider the following async code:
let my_coroutine: Future<Output=u32> = async {
    some_function().await + 3
};
This coroutine can be in one of 3 states. In the “Start” state, the coroutine has just been created but hasn’t had the chance to do anything yet. In the “Await” state, the coroutine has called some_function but the returned future isn’t ready yet. In the “Done” state, the coroutine has received a number from some_function and added 3 to it. We can draw out a diagram to make it clearer:
A state machine can run all day but it won’t do us any good unless we can get useful data out of it. So Rust allows each resume call (an edge on the graph) to output something back to the caller. For futures like this one, it will output Ready once the function properly returns or Pending if execution was paused while waiting on some other future.
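To make that concrete, here is a rough hand-rolled sketch of the kind of state machine the compiler might generate for the example above. This is hypothetical and glosses over the pinning and unsafety a real desugaring needs; SomeFuture is just a stand-in for whatever some_function actually returns (and is assumed to be Unpin):

use std::pin::Pin;
use std::task::{Context, Poll};

enum MyCoroutine {
    Start,
    Await(SomeFuture), // the future returned by some_function
    Done,
}

impl MyCoroutine {
    // The "resume" function: each call either pauses (Pending) or finishes (Ready).
    fn resume(&mut self, ctx: &mut Context<'_>) -> Poll<u32> {
        loop {
            match self {
                // First resume: start the inner future, then loop around to poll it.
                MyCoroutine::Start => *self = MyCoroutine::Await(some_function()),
                MyCoroutine::Await(future) => match Pin::new(future).poll(ctx) {
                    Poll::Ready(x) => {
                        *self = MyCoroutine::Done;
                        return Poll::Ready(x + 3);
                    }
                    Poll::Pending => return Poll::Pending,
                },
                // Resuming after completion; more on this choice in a moment.
                MyCoroutine::Done => panic!("resumed after completion"),
            }
        }
    }
}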
Since our coroutines can now generate output, we might call them “generators”. There are a lot of advantages to this terminology but I’ll stick to “stackless coroutine” in this post to distance it from any specific proposal beyond async/await.
Notice that users of this coroutine could keep calling the resume function even after it has reached the “Done” state. The futures API in Rust basically just voids your warranty once Ready is returned. I’d like something more general purpose than that. Could we simply ban the concept of a last state since it causes problems like this? If so, what happens when resume is called on “Done”? It panics! Thus it never returns and never requires us to come up with a state after “Done”. Problem solved. A different state machine could return to a previous state instead, but either way coroutines in Rust don’t need to define a last state or a last output type. If the caller should stop, each coroutine can signal that however it pleases.
Futures in Rust have some additional complexity beyond the ability to pause, resume, and generate output. If you read the documentation on the Future trait, you’ll see that the resume function (called poll) takes an additional context parameter. This parameter gives futures the ability to communicate with complex schedulers that decide when to resume.
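For reference, here is roughly how that trait is defined in std::future; poll is the resume function and Context is the extra parameter:

pub trait Future {
    type Output;

    // Resume the coroutine: Ready means it finished, Pending means it paused
    // and will use the Context to arrange to be woken up again later.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}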
As you’ll see, we now have a very general definition of stackless coroutines which we can carry through the rest of this post.
A stackless coroutine is a function which mutates state, takes input, and generates output.
Wait, that means there really isn’t much difference between a stackless coroutine and a FnMut! Both can be called many times; each time taking some input and returning some output. The only real difference is the syntax which creates the FnMut implementations:

- A FnMut which runs all the way through the closure body when called, outputting the return value of the block.
- A FnMut which can pause and save execution mid-way through the body of the async block, outputting intermediary results.

Really, we’d want coroutines to implement some sort of FnPinMut since they will likely contain self-references. But this representation of stackless coroutines really is completely general. Even today, we could write something like:
use std::future::Future;
use std::task::{Context, Poll};
use std::ops::{Generator, GeneratorState};
use std::pin::Pin;

impl<C, T> Future for C where C: FnPinMut(&mut Context) -> Poll<T> {
    type Output = T;
    fn poll(self: Pin<&mut Self>, ctx: &mut Context) -> Poll<T> {
        self(ctx)
    }
}

impl<C, Y, R> Generator for C where C: FnPinMut() -> GeneratorState<Y, R> {
    type Yield = Y;
    type Return = R;
    unsafe fn resume(&mut self) -> GeneratorState<Y, R> {
        Pin::new_unchecked(self)()
    }
}
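To be clear, FnPinMut is not a real trait and the closure-style bounds above are not real syntax. If it had to be spelled out by hand, a rough (entirely hypothetical) sketch might look like:

use std::pin::Pin;

// Purely hypothetical: like FnMut, but the receiver is pinned so the state
// machine is free to hold self-references across calls.
trait FnPinMut<Input> {
    type Output;
    fn call_pin_mut(self: Pin<&mut Self>, input: Input) -> Self::Output;
}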
Again, I’m not trying to say that Rust should literally adopt some FnPinMut as the canonical definition of a stackless coroutine (although I don’t think that’s a bad idea). Only that it could, and that we can compare and contrast coroutine proposals by asking “what sort of FnPinMut would this be?”.
Just like the general definition of coroutines above, it will be useful to have a general syntax for creating coroutines. Async/await can’t do everything which is why people are interested in a dedicated coroutine syntax in the first place. Our hypothetical syntax doesn’t have to be pretty but it gives us a place to start. Here are the rules:
- A coroutine { ... } block produces a FnPinMut(I) -> O. I and O are inferred.
- A coroutine { ... } block can capture data in the same way as a closure.
- A coroutine { ... } block contains a yield postfix operator for every state it can occupy.
- A coroutine { ... } block is not allowed to end and must therefore evaluate to !.

With this in mind, let’s translate our async example to this syntax. And before you come to my house and murder me, I’m using yield as a postfix operator to avoid the same precedence questions that plagued await, but that isn’t set in stone or anything.
Also notice that we need to start our coroutine block with a yield since the coroutine is created in the “Start” state. The first yield evaluates to the argument to the first resume call and the output of that resume call isn’t produced until the next yield. That means that the first yield actually ignores its input. I’ll just have it take () for now.
let my_coroutine: FnPinMut(&mut Context) -> Poll<u32> = coroutine {
    let mut ctx = ().yield; // Start
    let mut future = some_function();
    let value = loop {
        match /* pin stuff here */ future.poll(ctx) {
            Ready(x) => break x,
            Pending => ctx = Pending.yield, // Await
        }
    };
    Ready(value + 3).yield; // Done
    panic!() // -> !
};
That is clearly a lot less legible than the async/await syntax but it’s not too bad. Just try creating a similar state machine by hand using an enum! Maybe we can see the utility of a dedicated coroutine syntax if we aim for something simpler. Here we are using a coroutine to do a sum:
let mut summer = coroutine {
    let mut sum = ().yield; // Start
    loop {
        sum += sum.yield; // Add
    } // -> !
};
assert_eq!(Pin::new(&mut summer)(5), 5);
assert_eq!(Pin::new(&mut summer)(4), 9);
assert_eq!(Pin::new(&mut summer)(3), 12);
Hmmm, my proof-of-concept syntax seems even more odd now. It’s frankly hard to see exactly why it works the way it works. To demonstrate, here are two different state-machine diagrams. Which one do you think is a correct representation of this coroutine?
Trick question! They are both correct. Each is just the “line graph” of the other (nodes and edges are swapped). The graph on the left makes a lot more sense when looking at the coroutine syntax: you pass some output to yield and then receive some input which you use to do work before deciding what to yield next. The graph on the right makes a lot more sense from outside the coroutine: the coroutine is in some state and when you resume it with an input that state changes and you get some output.
We can now see something really interesting about coroutines. We are trying to decide what to output before we begin (empty tuple, I guess?) just like we had to decide earlier how to handle input after we end (it will end with a !). Either case leaves a strange edge from/to nowhere in a state machine diagram. Could we ban the concept of a first state since it also causes problems? Actually yes! We can make every coroutine a loop like an ouroboros eating its own tail. All that’s needed is one state which we can create out of thin air because it lacks hidden internal data. That gives us something like this:
I mean something like this:
let mut sum = 0;
let mut summer = coroutine loop { sum += sum.yield };
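Resuming it (still in the imaginary syntax) would go something like this:

assert_eq!(Pin::new(&mut summer)(1), 1); // the initial 0 is never output
assert_eq!(Pin::new(&mut summer)(2), 3);
assert_eq!(Pin::new(&mut summer)(3), 6);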
Ouroboroutines are even stranger. Because execution of the loop starts with the yield receiving 1 and proceeds to the sum +=, the coroutine never outputs 0. Then the loop comes around and we output sum (now equal to 1) for the first time. A coroutine which panics on the second resume call might look like:
let resume_if_you_dare: FnPinMut(()) -> () = coroutine loop {
    panic!().yield;
    ().yield;
};
This loop syntax should never actually be accepted because it’s super stupid. I just wanted to share it because it’s a cool exercise and because it demonstrates some of the trade-offs every proposal has to make if it allows for unique begin or end states.
Now that we’ve completed my tour of what coroutines could be, here is a comparison of different proposals for what generators should be. Note that this is a pretty high-level rundown. I’m trying to avoid duplicating proposals that differ only on bikeshed-ish decisions. I’m also not thinking super hard about lifetime or compatibility issues beyond the obvious. These are just here to start a conversation. If you come across anything interesting I’ve missed, please let me know via my email or anywhere I post this!
This is the unstable generator syntax as it exists in Rust today. It got the job done while async/await was being shipped but leaves a lot to be desired. Notably, the only way to get new data into the coroutine is by mutating some kind of shared state (very un-Rust-like if you ask me).
// This works great.
let mut gen = || {
    yield 1;
    return "foo"
};
assert_eq!(
    Pin::new(&mut gen).resume(),
    GeneratorState::Yielded(1));
assert_eq!(
    Pin::new(&mut gen).resume(),
    GeneratorState::Complete("foo"));

// Good luck doing something like this.
let mut queue = VecDeque::new();
queue.extend(0..2);
let mut gen: FnPinMut(u32) -> u32 = generator loop {
    queue.push_back(queue.pop_front().unwrap_or(0).yield);
};
assert_eq!(Pin::new(&mut gen)(6), 0);
assert_eq!(Pin::new(&mut gen)(7), 1);
assert_eq!(Pin::new(&mut gen)(8), 6);
assert_eq!(Pin::new(&mut gen)(9), 7);
This proposal sidesteps the first-input problem and avoids adding new syntax by always receiving resume arguments as closure arguments. This does a good job unifying syntax that produces a Fn(Pin)?Mut but relies on controversial “magic mutation”. I admit I hated it at first but the simplicity has grown on me. Like the existing syntax, a generator can’t “naturally” hold on to passed-in values from previous invocations (Edit: this is a lie, see MCP-49). Although, this has the benefit that users are less likely to accidentally leave unused Drop types un-dropped across yields. This proposal also has the best support for passing multiple resume arguments. As proposed, these generators always produce GeneratorState, which slightly defeats the attempt to be one-to-one with closures. This could be easily changed using the “always end in a !” trick discussed above.
let gen = |name: &'static str| {
    yield name;
    yield "Hello";
    return name;
};
dbg!(gen.resume("A")); // Yielded("A")
dbg!(gen.resume("B")); // Yielded("Hello")
dbg!(gen.resume("C")); // Finished("C")

let gen = |name: &'static str| {
    yield name;
    yield "Hello";
    yield name;
    unreachable!()
};
dbg!(gen.resume("A")); // "A"
dbg!(gen.resume("B")); // "Hello"
dbg!(gen.resume("C")); // "C"
This comes up a lot as an alternative to magic mutation. The generator gets to keep passed-in arguments at the cost of some confusion over “where-does-input-come-in?” and around multiple arguments. A yield expression also feels more Rusty than a yield statement but could open a lot of the same order-of-operations problems that await had to deal with.
let gen = |name1: &'static str| {
    let name2 = yield name1;
    let name3 = yield "Hello";
    return (name1, name2, name3);
};
dbg!(gen.resume("A")); // Yielded("A")
dbg!(gen.resume("B")); // Yielded("Hello")
dbg!(gen.resume("C")); // Finished(("A", "B", "C"))
This is basically my proof-of-concept syntax plus some bikeshedding. Should yield be a postfix operator à la await? Do we need a new keyword for generator/coroutine blocks? What type does the first yield take? It’s strange and might require an edition boundary. But it does do everything except take multiple arguments without a tuple.
let gen = coroutine {
    let name1 = yield;
    let name2 = yield vec![name1];
    let name3 = yield vec!["Hello"];
    yield vec![name1, name2, name3];
    unreachable!()
};
dbg!(gen.resume("A")); // vec!["A"]
dbg!(gen.resume("B")); // vec!["Hello"]
dbg!(gen.resume("C")); // vec!["A", "B", "C"]
This is the dedicated generator syntax which people often picture. However, the first resume argument is always dropped, making it incompatible with futures (which require the context on every resume call).
let gen = || {
    let name1 = yield "Hello";
    let name2 = yield name1;
    return (name1, name2);
};
dbg!(gen.resume("A")); // Yielded("Hello")
dbg!(gen.resume("B")); // Yielded("B")
dbg!(gen.resume("C")); // Finished(("B", "C"))
Perfectly general, perfectly consistent, and perfectly grotesque. Tell me what you hate about it most!
/* Censored to preserve human life */
This integrates yield as a sort of escape hatch into the existing async syntax, avoiding an edition boundary and allowing await and yield to stand side-by-side like the brothers they are. With the right clever type inference, I think it can also be just as general as my proof-of-concept syntax. The price? I see no pretty way to take resume arguments into the coroutine since .await needs to be able to use them too.
let my_coroutine: Future<Output=u32>;

// This future...
my_coroutine = async {
    some_function().await + 3
};

// ...is the same as this coroutine.
my_coroutine = async {
    yield Ready(some_function().await + 3);
    panic!()
};
// Returns Ready repeatedly after the given future resolves.
async fn fuse_clone<T: Clone>(future: impl Future<Output = T>) -> T {
    let value = future.await;
    loop {
        yield Ready(value.clone());
    }
}
// Always returns Pending at least once.
async fn pending_once<T>(future: impl Future<Output = T>) -> T {
    #[async_input]
    let mut ctx: &mut Context;
    match /* pin stuff here */ future.poll(ctx) {
        Ready(value) => {
            yield Pending;
            value
        },
        Pending => future.await,
    }
}
// We can also produce a more traditional generator.
let my_generator: Generator<Yield=usize, Return=()> = async {
    for i in 0..10 {
        yield GeneratorState::Yielded(i);
    }
    yield GeneratorState::Complete(());
    // Because of hypothetical ! coercion rules, the yield type is allowed to be
    // GeneratorState although the block ends with `yield Poll::Ready(!)`.
    panic!()
};
// This only works for async blocks. An async fn must be a Future.
let sum_coroutine: FnPinMut(u32) -> u32 = async {
    #[async_input]
    let mut n: u32;
    let mut sum = 0;
    loop {
        sum += n;
        yield sum;
    }
};
Coroutines are very hard but very cool.