Sam Sartor
# Generalizing Coroutines in Rust

2019-11-11

Now that async/await has been released, attention has drifted back to refining stackless coroutines (the unstable language feature that makes async/await possible). Alas, the latest RFC has shown that there is still a lot of disagreement about what exactly coroutines in Rust should look like beyond async/await. I felt it would be useful to flesh out what coroutines could be so we can better discuss what they should be.

Edit from future Sam! After almost a year of discussions with other people, we settled on language team MCP-49 as the “best” version of coroutines. Further discussion is available in a series of design notes on the language team website.

Before we go any further, a quick reminder of how coroutines work. In theory, a coroutine is just a kind of function which has the ability to pause and resume without blocking any OS threads. For example, an async function will pause when you .await on a future which hasn’t resolved yet. In languages like Go or Python, coroutines are hosted by some sort of runtime which maintains each call stack and automatically jumps around as needed.

Rust doesn’t have a heavy runtime so coroutines have to be assembled somehow out of ordinary functions and structs. In fact, an anonymous struct is exactly what async produces. The coroutine is executed by calling a resume function which can later pause execution by saving variables into the struct and returning back to the caller. This is why they are called “stackless coroutines”. It also means that any coroutine syntax in Rust (e.g. async/await) is just a fancy way of creating a state machine! For example, consider the following async code:

let my_coroutine: Future<Output=u32> = async {
    some_function().await + 3
};

This coroutine can be in one of 3 states. In the “Start” state, the coroutine has just been created but hasn’t had the chance to do anything yet. In the “Await” state, the coroutine has called some_function but the returned future isn’t ready yet. In the “Done” state, the coroutine has received a number from some_function and added 3 to it. We can draw out a diagram to make it clearer:

[Diagram: a state machine with states “Start”, “Await”, and “Done”]

A state machine can run all day but it won’t do us any good unless we can get useful data out of it. So Rust allows each resume call (an edge on the graph) to output something back to the caller. For futures like this one, it will output Ready once the function properly returns, or Pending if execution was paused while waiting on some other future.

Since our coroutines can now generate output, we might call them “generators”. There are a lot of advantages to this terminology but I’ll stick to “stackless coroutine” in this post to distance it from any specific proposal beyond async/await.

Notice that users of this coroutine could keep calling the resume function even after it has reached the “Done” state. The futures API in Rust basically just voids your warranty once Ready is returned. I’d like something more general purpose than that. Could we simply ban the concept of a last state since it causes problems like this? If so, what happens when resume is called on “Done”? It panics! Thus it never returns and never requires us to come up with a state after “Done”. Problem solved. A different state machine could return to a previous state instead but either way coroutines in Rust don’t need to define a last state or a last output type. If the caller should stop, each coroutine can signal that however they please.

[Diagram: the same machine with outputs on the resume edges — Pending on the edges into “Await”, Ready on the edges into “Done”, and ! (never) on the edge out of “Done”]
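This kind of machine can already be hand-written as a plain enum. Here is a sketch of the Start/Await/Done machine above; the `MyCoroutine` name, the stand-in `Poll` type, and the counter faking the pending inner future are all my own illustration:

```rust
// Stand-in for std::task::Poll, to keep the sketch self-contained.
enum Poll<T> {
    Ready(T),
    Pending,
}

enum MyCoroutine {
    Start,
    Await { polls_left: u32 }, // fakes an inner future that isn't ready yet
    Done,
}

impl MyCoroutine {
    fn resume(&mut self) -> Poll<u32> {
        match self {
            MyCoroutine::Start => {
                // "Call some_function()": pretend its future needs 2 polls.
                *self = MyCoroutine::Await { polls_left: 2 };
                Poll::Pending
            }
            MyCoroutine::Await { polls_left } => {
                if *polls_left > 1 {
                    *polls_left -= 1;
                    Poll::Pending
                } else {
                    *self = MyCoroutine::Done;
                    // Pretend some_function()'s future resolved to 39, then + 3.
                    Poll::Ready(39 + 3)
                }
            }
            // No state after "Done": resuming again just panics.
            MyCoroutine::Done => panic!("resumed after completion"),
        }
    }
}

fn main() {
    let mut c = MyCoroutine::Start;
    assert!(matches!(c.resume(), Poll::Pending));
    assert!(matches!(c.resume(), Poll::Pending));
    assert!(matches!(c.resume(), Poll::Ready(42)));
    println!("ok");
}
```

Note how panicking in the “Done” arm means we never have to invent a state (or an output type) for what comes after the end.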

Futures in Rust have some additional complexity beyond the ability to pause, resume, and generate output. If you read the documentation on the Future trait, you’ll see that the resume function (called poll) takes an additional context parameter. This parameter gives futures the ability to communicate with complex schedulers that decide when to resume.

[Diagram: the same machine again, now with a Context passed in on every resume edge alongside the Pending/Ready outputs]
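The Context-on-every-edge shape can be seen in a real hand-written future. This is only a sketch: `PendingOnce` and `noop_waker` are my own names, and the no-op waker exists purely so we can drive the future by hand without an executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A hand-written future that reports Pending once before resolving.
struct PendingOnce {
    polled: bool,
}

impl Future for PendingOnce {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.polled {
            Poll::Ready(42)
        } else {
            self.polled = true;
            // Use the Context to ask the scheduler to poll us again; a real
            // future would hand the waker to some event source instead.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

// Minimal no-op waker so we can build a Context without an executor.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = PendingOnce { polled: false };
    let mut pinned = Pin::new(&mut fut); // PendingOnce is Unpin
    assert!(pinned.as_mut().poll(&mut cx).is_pending());
    assert!(matches!(pinned.as_mut().poll(&mut cx), Poll::Ready(42)));
    println!("ok");
}
```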

So now we have a very general definition of stackless coroutines, which we can carry through the rest of this post:

A stackless coroutine is a function which mutates state, takes input, and generates output.

Wait, that means there really isn’t much difference between a stackless coroutine and a FnMut! Both can be called many times, each time taking some input and returning some output. The only real difference is the syntax which creates the FnMut implementations.
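For the non-self-referential case, the correspondence is easy to see with a plain closure on stable Rust today (the accumulator here is my own illustration):

```rust
// A "coroutine" with no self-references is just an FnMut: the mutable
// captured state plays the role of the saved coroutine state.
fn main() {
    let mut count = 0u32;
    let mut ticker = move |step: u32| -> u32 {
        count += step; // "resume": mutate the saved state...
        count          // ...and produce an output
    };
    assert_eq!(ticker(2), 2);
    assert_eq!(ticker(3), 5);
    println!("ok");
}
```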

Really, we’d want coroutines to implement some sort of FnPinMut since they will likely contain self-references. But this representation of stackless coroutines really is completely general. Even today, we could write something like:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use std::ops::{Generator, GeneratorState};

impl<C, T> Future for C where C: FnPinMut(&mut Context) -> Poll<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, ctx: &mut Context) -> Poll<T> {
        self(ctx)
    }
}

impl<C, Y, R> Generator for C where C: FnPinMut() -> GeneratorState<Y, R> {
    type Yield = Y;
    type Return = R;

    fn resume(self: Pin<&mut Self>) -> GeneratorState<Y, R> {
        self()
    }
}

Again, I’m not trying to say that Rust should literally adopt some FnPinMut as the canonical definition of a stackless coroutine (although I don’t think that’s a bad idea). Only that it could and that we can compare and contrast coroutine proposals by asking “what sort of FnPinMut would this be?”.

## A Very General Syntax

Just like the general definition of coroutines above, it will be useful to have a general syntax for creating coroutines. Async/await can’t do everything which is why people are interested in a dedicated coroutine syntax in the first place. Our hypothetical syntax doesn’t have to be pretty but it gives us a place to start. Here are the rules:

With this in mind, let’s translate our async example to this syntax. And before you come to my house and murder me, I’m using yield as a postfix operator to avoid the same precedence questions that plagued await but that isn’t set in stone or anything.

Also notice that we need to start our coroutine block with a yield, since the coroutine is created in the “Start” state. The first yield evaluates to the argument of the first resume call, and the output of that resume call isn’t produced until the next yield. That means the operand of the first yield is never actually output anywhere. I’ll just have it take () for now.

let my_coroutine: FnPinMut(&mut Context) -> Poll<u32> = coroutine {
    let mut ctx = ().yield; // Start
    let mut future = some_function();
    let value = loop {
        match /* pin stuff here */ future.poll(ctx) {
            Ready(x) => break x,
            Pending => ctx = Pending.yield, // Await
        }
    };
    Ready(value + 3).yield; // Done
    panic!() // -> !
};

That is clearly a lot less legible than the async/await syntax but it’s not too bad. Just try creating a similar state machine by hand using an enum! Maybe we can see the utility of a dedicated coroutine syntax if we aim for something simpler. Here we are using a coroutine to do a sum:

let mut summer = coroutine {
    let mut sum = ().yield; // Start
    loop {
        sum += sum.yield; // Add
    } // -> !
};

assert_eq!(Pin::new(&mut summer)(5), 5);
assert_eq!(Pin::new(&mut summer)(4), 9);
assert_eq!(Pin::new(&mut summer)(3), 12);
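For comparison, the same machine hand-rolled in stable Rust: `summer` only ever sits in its “Add” state, so the struct just stores the running total and a `resume` method (my own stand-in name) plays the role of the hypothetical FnPinMut call.

```rust
// Hand-rolled equivalent of the `summer` coroutine: one state, so the
// whole state machine collapses to a struct holding the running sum.
struct Summer {
    sum: u32,
}

impl Summer {
    fn resume(&mut self, n: u32) -> u32 {
        self.sum += n;
        self.sum
    }
}

fn main() {
    let mut summer = Summer { sum: 0 };
    assert_eq!(summer.resume(5), 5);
    assert_eq!(summer.resume(4), 9);
    assert_eq!(summer.resume(3), 12);
    println!("ok");
}
```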

Hmmm, my proof-of-concept syntax seems even odder now. It’s frankly hard to see exactly why it works the way it does. To demonstrate, here are two different state-machine diagrams. Which one do you think is a correct representation of this coroutine?

[Two diagrams of this coroutine: one with “Start” and “Add” as nodes and the n/sum values on the edges, and its “line graph” with nodes and edges swapped]

Trick question! They are both correct. Each is just the “line graph” of the other (nodes and edges are swapped). The graph on the left makes a lot more sense when looking at the coroutine syntax: you pass some output to yield and then receive some input which you use to do work before deciding what to yield next. The graph on the right makes a lot more sense from outside the coroutine: the coroutine is in some state and when you resume it with an input that state changes and you get some output.

We can now see something really interesting about coroutines. We had to decide what to output before we begin (empty tuple, I guess?) just like we had to decide earlier how to handle input after we end (it will end with a !). Either case leaves a strange edge from/to nowhere in a state-machine diagram. Could we ban the concept of a first state, since it also causes problems? Actually yes! We can make every coroutine a loop, like an ouroboros eating its own tail. All that’s needed is one state which we can create out of thin air because it lacks hidden internal data. That gives us something like this:

I mean something like this:

let mut sum = 0;
let summer = coroutine loop { sum += sum.yield };

Ouroboroutines are even stranger. Because execution of the loop starts with the yield receiving the first resume argument (1, say) and proceeds to the sum +=, the coroutine never outputs 0. Then the loop comes around and we output sum (now equal to 1) for the first time. A coroutine which panics on the second resume call might look like:

let resume_if_you_dare: FnPinMut(()) -> () = coroutine loop {
    panic!().yield;
    ().yield;
};

This loop syntax should never actually be accepted because it’s super stupid. I just wanted to share it because it’s a cool exercise and because it demonstrates some of the trade-offs every proposal has to make if it allows for unique begin or end states.

## Comparison of Proposals

Now that we’ve completed my tour of what coroutines could be, here is a comparison of different proposals for what generators should be. Note that this is a pretty high-level rundown. I’m trying to avoid duplicating proposals that differ only on bikeshed-ish decisions. I’m also not thinking super hard about lifetime or compatibility issues beyond the obvious. These are just here to start a conversation. If you come across anything interesting I’ve missed, please let me know via my email or anywhere I post this!

### eRFC 2033 (Experimental coroutines)

This is the unstable generator syntax as it exists in Rust today. It got the job done while async/await was being shipped but leaves a lot to be desired. Notably, the only way to get new data into the coroutine is by mutating some kind of shared state (very un-Rust-like, if you ask me).

// This works great.
let mut gen = || {
    yield 1;
    return "foo"
};

assert_eq!(
    Pin::new(&mut gen).resume(),
    GeneratorState::Yielded(1));
assert_eq!(
    Pin::new(&mut gen).resume(),
    GeneratorState::Complete("foo"));

// Good luck doing something like this.
let mut queue = VecDeque::new();
queue.extend(0..2);

let mut gen: FnPinMut(u32) -> u32 = generator loop {
    queue.push_back(queue.pop_front().unwrap_or(0).yield);
};

assert_eq!(Pin::new(&mut gen)(6), 0);
assert_eq!(Pin::new(&mut gen)(7), 1);
assert_eq!(Pin::new(&mut gen)(8), 6);
assert_eq!(Pin::new(&mut gen)(9), 7);
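The shared-state workaround the first sentence complains about looks something like this on stable Rust (hand-rolled as a closure, since the generator feature itself is unstable; the `input`/`slot` names are my own):

```rust
use std::cell::Cell;
use std::rc::Rc;

// Input only reaches the "generator" through a captured Rc<Cell<_>>
// that the caller must remember to mutate before each resume.
fn main() {
    let input = Rc::new(Cell::new(0u32));
    let slot = Rc::clone(&input);
    let mut sum = 0u32;
    let mut gen = move || {
        sum += slot.get(); // read whatever the caller stashed for us
        sum
    };

    input.set(5);
    assert_eq!(gen(), 5);
    input.set(4);
    assert_eq!(gen(), 9);
    println!("ok");
}
```

Nothing in the types ties `input` to `gen`, which is exactly why this pattern is so easy to get wrong.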

### RFC 2781 (Unified coroutines)

This proposal sidesteps the first input problem and avoids adding new syntax by always receiving resume arguments as closure arguments. This does a good job unifying syntax that produces a Fn(Pin)?Mut but relies on controversial “magic mutation”. I admit I hated it at first but the simplicity has grown on me. Like the existing syntax, a generator can’t “naturally” hold on to passed-in values from previous invocations (Edit: this is a lie, see MCP-49). Although, this has the benefit that users are less likely to accidentally leave unused Drop types un-dropped across yields. This proposal also has the best support for passing multiple resume arguments. As proposed, these generators always produce GeneratorState which slightly defeats the attempt to be one-to-one with closures. This could be easily changed using the “always end in a !” trick discussed above.

let gen = |name: &'static str| {
    yield name;
    yield "Hello";
    return name;
};

dbg!(gen.resume("A")); // Yielded("A")
dbg!(gen.resume("B")); // Yielded("Hello")
dbg!(gen.resume("C")); // Finished("C")

let gen = |name: &'static str| {
    yield name;
    yield "Hello";
    yield name;
    unreachable!()
};

dbg!(gen.resume("A")); // "A"
dbg!(gen.resume("B")); // "Hello"
dbg!(gen.resume("C")); // "C"

### Yield is an Expression

This comes up a lot as an alternative to magic mutation. The generator gets to keep passed-in arguments at the cost of some confusion over “where-does-input-come-in?” and around multiple arguments. A yield expression also feels more Rusty than a yield statement but could open a lot of the same order-of-operations problems that await had to deal with.

let gen = |name1: &'static str| {
    let name2 = yield name1;
    let name3 = yield "Hello";
    return (name1, name2, name3);
};

dbg!(gen.resume("A")); // Yielded("A")
dbg!(gen.resume("B")); // Yielded("Hello")
dbg!(gen.resume("C")); // Finished(("A", "B", "C"))

### Yield at the Start

This is basically my proof-of-concept syntax plus some bikeshedding. Should yield be a postfix operator à la await? Do we need a new keyword for generator/coroutine blocks? What type does the first yield take? It’s strange and might require an edition boundary. But it does do everything except take multiple arguments without a tuple.

let gen = coroutine {
    let name1 = yield;
    let name2 = yield vec![name1];
    let name3 = yield vec!["Hello"];
    yield vec![name1, name2, name3];
    unreachable!()
};

dbg!(gen.resume("A")); // vec!["A"]
dbg!(gen.resume("B")); // vec!["Hello"]
dbg!(gen.resume("C")); // vec!["A", "B", "C"]

### Drop the First

This is the dedicated generator syntax which people often picture. However, the first resume argument is always dropped, making it incompatible with futures (which require the context on each iteration).

let gen = || {
    let name1 = yield "Hello";
    let name2 = yield name1;
    return (name1, name2);
};

dbg!(gen.resume("A")); // Yielded("Hello")
dbg!(gen.resume("B")); // Yielded("B")
dbg!(gen.resume("C")); // Finished(("B", "C"))

### Ouroboroutines

Perfectly general, perfectly consistent, and perfectly grotesque. Tell me what you hate about it most!

/* Censored to preserve human life */

### Extend Async

This integrates yield as a sort of escape hatch into the existing async syntax, avoiding an edition boundary and allowing await and yield to stand side by side like the brothers they are. With the right clever type inference, I think it can also be just as general as my proof-of-concept syntax. The price? I see no pretty way to take resume arguments into the coroutine, since .await needs to be able to use them too.

let my_coroutine: Future<Output=u32>;

// This future...
my_coroutine = async {
    some_function().await + 3
};

// ...is the same as this coroutine.
my_coroutine = async {
    yield Ready(some_function().await + 3);
    panic!()
};

// Returns Ready repeatedly after the given future resolves.
async fn fuse_clone<T: Clone>(future: impl Future<Output = T>) -> T {
    let value = future.await;
    loop {
        yield Ready(value.clone());
    }
}

// Always returns Pending at least once.
async fn pending_once<T>(future: impl Future<Output = T>) -> T {
    #[async_input]
    let mut ctx: &mut Context;

    match /* pin stuff here */ future.poll(ctx) {
        Ready(value) => {
            yield Pending;
            value
        },
        Pending => future.await,
    }
}

// We can also produce a more traditional generator.
let my_generator: Generator<Yield=usize, Return=()> = async {
    for i in 0..10 {
        yield GeneratorState::Yielded(i);
    }
    yield GeneratorState::Complete(());

    // Because of hypothetical ! coercion rules, the yield type is allowed to
    // be GeneratorState even though the block ends with a `panic!()` of type !.
    panic!()
};

// This only works for async blocks. An async fn must be a Future.
let sum_coroutine: FnPinMut(u32) -> u32 = async {
    #[async_input]
    let mut n: u32;

    let mut sum = 0;
    loop {
        sum += n;
        yield sum;
    }
};

## Conclusion

Coroutines are very hard but very cool.