Rust AsyncRuntime
这一章,我们会一步步实现自己的 Futures。如果不想一步一步地看,你也可以直接从下方直接获取到完整的例子。
完整代码
use std::{
future::Future, pin::Pin, sync::{mpsc::{channel, Sender}, Arc, Mutex},
task::{Context, Poll, RawWaker, RawWakerVTable, Waker},
thread::{self, JoinHandle}, time::{Duration, Instant}
};
fn main() {
let start = Instant::now();
let reactor = Reactor::new();
let reactor = Arc::new(Mutex::new(reactor));
let future1 = Task::new(reactor.clone(), 1, 1);
let future2 = Task::new(reactor.clone(), 2, 2);
let fut1 = async {
let val = future1.await;
let dur = (Instant::now() - start).as_secs_f32();
println!("Future got {} at time: {:.2}.", val, dur);
};
let fut2 = async {
let val = future2.await;
let dur = (Instant::now() - start).as_secs_f32();
println!("Future got {} at time: {:.2}.", val, dur);
};
let mainfut = async {
fut1.await;
fut2.await;
};
block_on(mainfut);
reactor.lock().map(|mut r| r.close()).unwrap();
}
// ============================= EXECUTOR ====================================
fn block_on<F: Future>(mut future: F) -> F::Output {
let mywaker = Arc::new(MyWaker{ thread: thread::current() });
let waker = waker_into_waker(Arc::into_raw(mywaker));
let mut cx = Context::from_waker(&waker);
// SAFETY: we shadow `future` so it can't be accessed again.
let mut future = unsafe { Pin::new_unchecked(&mut future) };
let val = loop {
match Future::poll(future.as_mut(), &mut cx) {
Poll::Ready(val) => break val,
Poll::Pending => thread::park(),
};
};
val
}
// ====================== FUTURE IMPLEMENTATION ==============================
#[derive(Clone)]
struct MyWaker {
thread: thread::Thread,
}
#[derive(Clone)]
pub struct Task {
id: usize,
reactor: Arc<Mutex<Reactor>>,
data: u64,
is_registered: bool,
}
fn mywaker_wake(s: &MyWaker) {
let waker_ptr: *const MyWaker = s;
let waker_arc = unsafe {Arc::from_raw(waker_ptr)};
waker_arc.thread.unpark();
}
fn mywaker_clone(s: &MyWaker) -> RawWaker {
let arc = unsafe { Arc::from_raw(s) };
std::mem::forget(arc.clone()); // increase ref count
RawWaker::new(Arc::into_raw(arc) as *const (), &VTABLE)
}
const VTABLE: RawWakerVTable = unsafe {
RawWakerVTable::new(
|s| mywaker_clone(&*(s as *const MyWaker)), // clone
|s| mywaker_wake(&*(s as *const MyWaker)), // wake
|s| mywaker_wake(*(s as *const &MyWaker)), // wake by ref
|s| drop(Arc::from_raw(s as *const MyWaker)), // decrease refcount
)
};
fn waker_into_waker(s: *const MyWaker) -> Waker {
let raw_waker = RawWaker::new(s as *const (), &VTABLE);
unsafe { Waker::from_raw(raw_waker) }
}
impl Task {
fn new(reactor: Arc<Mutex<Reactor>>, data: u64, id: usize) -> Self {
Task {
id,
reactor,
data,
is_registered: false,
}
}
}
impl Future for Task {
type Output = usize;
fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let mut r = self.reactor.lock().unwrap();
if r.is_ready(self.id) {
Poll::Ready(self.id)
} else if self.is_registered {
Poll::Pending
} else {
r.register(self.data, cx.waker().clone(), self.id);
drop(r);
self.is_registered = true;
Poll::Pending
}
}
}
// =============================== REACTOR ===================================
struct Reactor {
dispatcher: Sender<Event>,
handle: Option<JoinHandle<()>>,
readylist: Arc<Mutex<Vec<usize>>>,
}
#[derive(Debug)]
enum Event {
Close,
Timeout(Waker, u64, usize),
}
impl Reactor {
fn new() -> Self {
let (tx, rx) = channel::<Event>();
let readylist = Arc::new(Mutex::new(vec![]));
let rl_clone = readylist.clone();
let mut handles = vec![];
let handle = thread::spawn(move || {
// This simulates some I/O resource
for event in rx {
println!("REACTOR: {:?}", event);
let rl_clone = rl_clone.clone();
match event {
Event::Close => break,
Event::Timeout(waker, duration, id) => {
let event_handle = thread::spawn(move || {
thread::sleep(Duration::from_secs(duration));
rl_clone.lock().map(|mut rl| rl.push(id)).unwrap();
waker.wake();
});
handles.push(event_handle);
}
}
}
for handle in handles {
handle.join().unwrap();
}
});
Reactor {
readylist,
dispatcher: tx,
handle: Some(handle),
}
}
fn register(&mut self, duration: u64, waker: Waker, data: usize) {
self.dispatcher
.send(Event::Timeout(waker, duration, data))
.unwrap();
}
fn close(&mut self) {
self.dispatcher.send(Event::Close).unwrap();
}
fn is_ready(&self, id_to_check: usize) -> bool {
self.readylist
.lock()
.map(|rl| rl.iter().any(|id| *id == id_to_check))
.unwrap()
}
}
impl Drop for Reactor {
fn drop(&mut self) {
self.handle.take().map(|h| h.join().unwrap()).unwrap();
}
}
首先,我们需要从 std 引入一些关键依赖,然后开始写代码。或者
use std::{
future::Future, pin::Pin, sync::{mpsc::{channel, Sender}, Arc, Mutex},
task::{Context, Poll, RawWaker, RawWakerVTable, Waker},
thread::{self, JoinHandle}, time::{Duration, Instant}
};
Executor
Executor 负责运行一个或多个 Futures 直到它们完成,接收到 Future 之后的第一步是开始轮询。此时可能遇到三种情况:
- Future 的状态是 Ready,开始执行链式操作;
- Future 还没有被 Poll 过,我们暂停它,将它传给一个 Waker;
- Future 的状态是 Pending;
Rust提供了一种让 Reactor 和 Executor 通过Waker进行通信的方式,Reactor 存储此 Waker,并在 Future 解析后再次轮询时调用 Waker::wake()。
看一下我们的 executor 代码
// Our executor takes any object which implements the `Future` trait
fn block_on<F: Future>(mut future: F) -> F::Output {
// the first thing we do is to construct a `Waker` which we'll pass on to
// the `reactor` so it can wake us up when an event is ready.
let mywaker = Arc::new(MyWaker{ thread: thread::current() });
let waker = waker_into_waker(Arc::into_raw(mywaker));
// The context struct is just a wrapper for a `Waker` object. Maybe in the
// future this will do more, but right now it's just a wrapper.
let mut cx = Context::from_waker(&waker);
// So, since we run this on one thread and run one future to completion
// we can pin the `Future` to the stack. This is unsafe, but saves an
// allocation. We could `Box::pin` it too if we wanted. This is however
// safe since we shadow `future` so it can't be accessed again and will
// not move until it's dropped.
let mut future = unsafe { Pin::new_unchecked(&mut future) };
// We poll in a loop, but it's not a busy loop. It will only run when
// an event occurs, or a thread has a "spurious wakeup" (an unexpected wakeup
// that can happen for no good reason).
let val = loop {
match Future::poll(pinned, &mut cx) {
// when the Future is ready we're finished
Poll::Ready(val) => break val,
// If we get a `pending` future we just go to sleep...
Poll::Pending => thread::park(),
};
};
val
}
在上面的例子中,我们的 Executor 首先创建了 Waker,然后在一个 loop 当中调用了 Future::poll,等待其状态变为 Ready,在 Ready 之前,线程会持续休眠。同时,为了允许 Future 自引用,我们为它添加了 Pin 保证。
这里使用了 std::thread::park 来暂停线程,由于 std::thread 会存在死锁的风险,如果在多处调用 block_on 会导致线程被错误唤醒(和 Future 状态不同步)。
Future
接下来,我们实现一下 Waker 和 Future
// This is the definition of our `Waker`. We use a regular thread-handle here.
// It works but it's not a good solution. It's easy to fix though, I'll explain
// after this code snippet.
#[derive(Clone)]
struct MyWaker {
thread: thread::Thread,
}
// This is the definition of our `Future`. It keeps all the information we
// need. This one holds a reference to our `reactor`, that's just to make
// this example as easy as possible. It doesn't need to hold a reference to
// the whole reactor, but it needs to be able to register itself with the
// reactor.
#[derive(Clone)]
pub struct Task {
id: usize,
reactor: Arc<Mutex<Reactor>>,
data: u64,
is_registered: bool,
}
// These are function definitions we'll use for our waker. Remember the
// "Trait Objects" chapter earlier.
fn mywaker_wake(s: &MyWaker) {
let waker_ptr: *const MyWaker = s;
let waker_arc = unsafe {Arc::from_raw(waker_ptr)};
waker_arc.thread.unpark();
}
// Since we use an `Arc` cloning is just increasing the refcount on the smart
// pointer.
fn mywaker_clone(s: &MyWaker) -> RawWaker {
let arc = unsafe { Arc::from_raw(s) };
std::mem::forget(arc.clone()); // increase ref count
RawWaker::new(Arc::into_raw(arc) as *const (), &VTABLE)
}
// This is actually a "helper funtcion" to create a `Waker` vtable. In contrast
// to when we created a `Trait Object` from scratch we don't need to concern
// ourselves with the actual layout of the `vtable` and only provide a fixed
// set of functions
const VTABLE: RawWakerVTable = unsafe {
RawWakerVTable::new(
|s| mywaker_clone(&*(s as *const MyWaker)), // clone
|s| mywaker_wake(&*(s as *const MyWaker)), // wake
|s| mywaker_wake(*(s as *const &MyWaker)), // wake by ref
|s| drop(Arc::from_raw(s as *const MyWaker)), // decrease refcount
)
};
// Instead of implementing this on the `MyWaker` oject in `impl Mywaker...` we
// just use this pattern instead since it saves us some lines of code.
fn waker_into_waker(s: *const MyWaker) -> Waker {
let raw_waker = RawWaker::new(s as *const (), &VTABLE);
unsafe { Waker::from_raw(raw_waker) }
}
impl Task {
fn new(reactor: Arc<Mutex<Reactor>>, data: u64, id: usize) -> Self {
Task {
id,
reactor,
data,
is_registered: false,
}
}
}
// This is our `Future` implementation
impl Future for Task {
// The output for our kind of `leaf future` is just an `usize`. For other
// futures this could be something more interesting like a byte array.
type Output = usize;
fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let mut r = self.reactor.lock().unwrap();
// we check with the `Reactor` if this future is in its "readylist"
// i.e. if it's `Ready`
if r.is_ready(self.id) {
// if it is, we return the data. In this case it's just the ID of
// the task since this is just a very simple example.
Poll::Ready(self.id)
} else if self.is_registered {
// If the future is registered alredy, we just return `Pending`
Poll::Pending
} else {
// If we get here, it must be the first time this `Future` is polled
// so we register a task with our `reactor`
r.register(self.data, cx.waker().clone(), self.id);
// oh, we have to drop the lock on our `Mutex` here because we can't
// have a shared and exclusive borrow at the same time
drop(r);
self.is_registered = true;
Poll::Pending
}
}
}
这一段代码有点长并且难以读懂,我们拆解一下。大致可以分为 Waker 和 Task 两个部分。
- Task 是我们自己实现的一个 Leaf Future,当它的
poll被调用时,会从 Reactor 中获取一下当前任务的状态,然后选择注册并执行操作(register)或是直接返回状态。 - Waker 是对工作线程的一层包装,Executor 将自己所在线程传给 Waker,调用
poll之后会让工作线程暂停(thread.park),Waker 暴露了一个wake方法来唤醒这个线程(thread.unpark),让 Executor 继续loop。 - Waker 通过 Arc 来进行内存管理,声明 Waker 使用了 vtable 而不是常规的 impl trait,两者实际上没有什么差别。
Reactor
第三部分是 Reactor
// This is a "fake" reactor. It does no real I/O, but that also makes our
// code possible to run in the book and in the playground
struct Reactor {
// we need some way of registering a Task with the reactor. Normally this
// would be an "interest" in an I/O event
dispatcher: Sender<Event>,
handle: Option<JoinHandle<()>>,
// This is a list of tasks that are ready, which means they should be polled
// for data.
readylist: Arc<Mutex<Vec<usize>>>,
}
// We just have two kind of events. An event called `Timeout`
// and a `Close` event to close down our reactor.
#[derive(Debug)]
enum Event {
Close,
Timeout(Waker, u64, usize),
}
impl Reactor {
fn new() -> Self {
// The way we register new events with our reactor is using a regular
// channel
let (tx, rx) = channel::<Event>();
let readylist = Arc::new(Mutex::new(vec![]));
let rl_clone = readylist.clone();
// This `Vec` will hold handles to all the threads we spawn so we can
// join them later on and finish our programm in a good manner
let mut handles = vec![];
// This will be the "Reactor thread"
let handle = thread::spawn(move || {
for event in rx {
let rl_clone = rl_clone.clone();
match event {
// If we get a close event we break out of the loop we're in
Event::Close => break,
Event::Timeout(waker, duration, id) => {
// When we get an event we simply spawn a new thread
// which will simulate some I/O resource...
let event_handle = thread::spawn(move || {
//... by sleeping for the number of seconds
// we provided when creating the `Task`.
thread::sleep(Duration::from_secs(duration));
// When it's done sleeping we put the ID of this task
// on the "readylist"
rl_clone.lock().map(|mut rl| rl.push(id)).unwrap();
// Then we call `wake` which will wake up our
// executor and start polling the futures
waker.wake();
});
handles.push(event_handle);
}
}
}
// When we exit the Reactor we first join all the handles on
// the child threads we've spawned so we catch any panics and
// release any resources.
for handle in handles {
handle.join().unwrap();
}
});
Reactor {
readylist,
dispatcher: tx,
handle: Some(handle),
}
}
fn register(&mut self, duration: u64, waker: Waker, data: usize) {
// registering an event is as simple as sending an `Event` through
// the channel.
self.dispatcher
.send(Event::Timeout(waker, duration, data))
.unwrap();
}
fn close(&mut self) {
self.dispatcher.send(Event::Close).unwrap();
}
// We need a way to check if any event's are ready. This will simply
// look through the "readylist" for an event macthing the ID we want to
// check for.
fn is_ready(&self, id_to_check: usize) -> bool {
self.readylist
.lock()
.map(|rl| rl.iter().any(|id| *id == id_to_check))
.unwrap()
}
}
// When our `Reactor` is dropped we join the reactor thread with the thread
// owning our `Reactor` so we catch any panics and release all resources.
// It's not needed for this to work, but it really is a best practice to join
// all threads you spawn.
impl Drop for Reactor {
fn drop(&mut self) {
self.handle.take().map(|h| h.join().unwrap()).unwrap();
}
}
我们的 Reactor 只是一个玩具,本身并不会进行任何 IO,只是负责让代码运作起来。register 方法实际上可以看作是 IO 的起点,例如发起读取某个文件或者流的请求。创建 Reactor 时,会开启一个新的线程用于从 channel 当中不断轮询需要处理的 IO 指令。由于 Reactor 的 readylist 状态需要在多个 Future 之间复用,这里使用了一个 Mutex 对它进行包装。
运行一下代码
最后,我们补上一个 `main` 函数,让代码运行起来。
fn main() {
// This is just to make it easier for us to see when our Future was resolved
let start = Instant::now();
// Many runtimes create a glocal `reactor` we pass it as an argument
let reactor = Reactor::new();
// Since we'll share this between threads we wrap it in a
// atmically-refcounted- mutex.
let reactor = Arc::new(Mutex::new(reactor));
// We create two tasks:
// - first parameter is the `reactor`
// - the second is a timeout in seconds
// - the third is an `id` to identify the task
let future1 = Task::new(reactor.clone(), 1, 1);
let future2 = Task::new(reactor.clone(), 2, 2);
// an `async` block works the same way as an `async fn` in that it compiles
// our code into a state machine, `yielding` at every `await` point.
let fut1 = async {
let val = future1.await;
let dur = (Instant::now() - start).as_secs_f32();
println!("Future got {} at time: {:.2}.", val, dur);
};
let fut2 = async {
let val = future2.await;
let dur = (Instant::now() - start).as_secs_f32();
println!("Future got {} at time: {:.2}.", val, dur);
};
// Our executor can only run one and one future, this is pretty normal
// though. You have a set of operations containing many futures that
// ends up as a single future that drives them all to completion.
let mainfut = async {
fut1.await;
fut2.await;
};
// This executor will block the main thread until the futures is resolved
block_on(mainfut);
// When we're done, we want to shut down our reactor thread so our program
// ends nicely.
reactor.lock().map(|mut r| r.close()).unwrap();
}
main 函数当中包含了两个 Task 来模拟 Leaf Future,在 mainfut 中还有两个 Non-leaf Future 负责调用 Task 的 poll。Non-leaf Future 也有 poll 方法,仅对其内部的 futures 进行轮询,这些状态机被轮询直到最终某个 Leaf Future 返回 Ready 或 Pending。
从例子当中,我们可以看到,async 关键字可以用于函数,如 async fn(...),也可以用于块,如 async { ... }。两者都会将您的函数或块转换为 Future 。这些例子当中的 Future 就像是之前的 Generator,await 和 yield 一样。
我们现在例子的输出是:
Future got 1 at time: 1.00.
Future got 2 at time: 3.00.
如果我们的 Futures 是异步执行的,我们应该会看到:
Future got 1 at time: 1.00.
Future got 2 at time: 2.00.
总结
到这里,我们的简单 Async Runtime 就实现完成了,如果要继续深入的话,我们应该要了解更高级的运行时是如何工作的,以及它们如何实现不同的 Futures 运行方式。
以下是一些参考资料:
- The Async Book:官方的 Async Rust 指引
- The async-std Book:
async-std官方文档 - Designing futures for Rust:tokio 作者的文章
- Rust's Journey to Async/await:Rust 实现 Async/await 的历史
- Making the Tokio scheduler 10x faster
- Stjepan's blog:Smol 创建者的博客,有一些很好的学习异步的文章
- Withoutboat's blog:Rust 核心开发者的博客