
update original

pull/1402/merge
funkill2, 11 months ago
commit 7119ae1346
Changed files:

1. rustbook-en/ci/dictionary.txt (3 lines changed)
2. rustbook-en/src/SUMMARY.md (8 lines changed)
3. rustbook-en/src/appendix-01-keywords.md (4 lines changed)
4. rustbook-en/src/ch17-00-async-await.md (194 lines changed)
5. rustbook-en/src/ch17-01-futures-and-syntax.md (386 lines changed)
6. rustbook-en/src/ch17-02-concurrency-with-async.md (224 lines changed)
7. rustbook-en/src/ch17-03-more-futures.md (289 lines changed)
8. rustbook-en/src/ch17-04-streams.md (355 lines changed)
9. rustbook-en/src/ch17-05-traits-for-async.md (456 lines changed)
10. rustbook-en/src/ch17-06-futures-tasks-threads.md (137 lines changed)
11. rustbook-en/src/ch20-01-unsafe-rust.md (14 lines changed)

rustbook-en/ci/dictionary.txt (3 lines changed)

@@ -253,6 +253,7 @@ interoperate
IntoFuture
IntoIterator
intra
intratask
InvalidDigit
invariants
ioerror
@@ -360,6 +361,7 @@ nondeterministic
nonequality
nongeneric
noplayground
NoStarch
NotFound
nsprust
null's
@@ -523,6 +525,7 @@ suboptimal
subpath
subslices
substring
subtasks
subteams
subtree
subtyping

rustbook-en/src/SUMMARY.md (8 lines changed)

@@ -101,12 +101,12 @@
- [Shared-State Concurrency](ch16-03-shared-state.md)
- [Extensible Concurrency with the `Sync` and `Send` Traits](ch16-04-extensible-concurrency-sync-and-send.md)
- [Async and Await](ch17-00-async-await.md)
- [Fundamentals of Asynchronous Programming: Async, Await, Futures, and Streams](ch17-00-async-await.md)
- [Futures and the Async Syntax](ch17-01-futures-and-syntax.md)
- [Concurrency With Async](ch17-02-concurrency-with-async.md)
- [Applying Concurrency with Async](ch17-02-concurrency-with-async.md)
- [Working With Any Number of Futures](ch17-03-more-futures.md)
- [Streams](ch17-04-streams.md)
- [Digging Into the Traits for Async](ch17-05-traits-for-async.md)
- [Streams: Futures in Sequence](ch17-04-streams.md)
- [A Closer Look at the Traits for Async](ch17-05-traits-for-async.md)
- [Futures, Tasks, and Threads](ch17-06-futures-tasks-threads.md)
- [Object Oriented Programming Features of Rust](ch18-00-oop.md)

rustbook-en/src/appendix-01-keywords.md (4 lines changed)

@@ -69,9 +69,7 @@ Rust for potential future use.
- `box`
- `do`
- `final`
* `gen`
- `gen`
- `macro`
- `override`
- `priv`

rustbook-en/src/ch17-00-async-await.md (194 lines changed)

@@ -1,122 +1,139 @@
# Async and Await
Many operations we ask the computer to do can take a while to finish. For
example, if you used a video editor to create a video of a family celebration,
exporting it could take anywhere from minutes to hours. Similarly, downloading a
video shared by someone in your family might take a long time. It would be nice
if we could do something else while we are waiting for those long-running
processes to complete.
The video export will use as much CPU and GPU power as it can. If you only had
one CPU core, and your operating system never paused that export until it
completed, you couldn’t do anything else on your computer while it was running.
That would be a pretty frustrating experience, though. Instead, your computer’s
operating system can—and does!—invisibly interrupt the export often enough to
let you get other work done along the way.
The file download is different. It does not take up very much CPU time. Instead,
the CPU needs to wait on data to arrive from the network. While you can start
reading the data once some of it is present, it might take a while for the rest
to show up. Even once the data is all present, a video can be quite large, so it
might take some time to load it all. Maybe it only takes a second or two—but
that’s a very long time for a modern processor, which can do billions of
operations every second. It would be nice to be able to put the CPU to use for
other work while waiting for the network call to finish—so, again, your
operating system will invisibly interrupt your program so other things can
happen while the network operation is still ongoing.
> Note: The video export is the kind of operation which is often described as
> “CPU-bound” or “compute-bound”. It’s limited by the speed of the computer’s
> ability to process data within the _CPU_ or _GPU_, and how much of that speed
> it can use. The video download is the kind of operation which is often
> described as “IO-bound,” because it’s limited by the speed of the computer’s
> _input and output_. It can only go as fast as the data can be sent across the
> network.
# Fundamentals of Asynchronous Programming: Async, Await, Futures, and Streams
Many operations we ask the computer to do can take a while to finish. It would
be nice if we could do something else while we are waiting for those
long-running processes to complete. Modern computers offer two techniques for
working on more than one operation at a time: parallelism and concurrency. Once
we start writing programs that involve parallel or concurrent operations,
though, we quickly encounter new challenges inherent to *asynchronous
programming*, where operations may not finish sequentially in the order they
were started. This chapter builds on Chapter 16’s use of threads for parallelism
and concurrency by introducing an alternative approach to asynchronous
programming: Rust’s Futures, Streams, the `async` and `await` syntax that
supports them, and the tools for managing and coordinating between asynchronous
operations.
Let’s consider an example. Say you’re exporting a video you’ve created of a
family celebration, an operation that could take anywhere from minutes to hours.
The video export will use as much CPU and GPU power as it can. If you had only
one CPU core and your operating system didn’t pause that export until it
completed—that is, if it executed the export _synchronously_—you couldn’t do
anything else on your computer while that task was running. That would be a
pretty frustrating experience. Fortunately, your computer’s operating system
can, and does, invisibly interrupt the export often enough to let you get other
work done simultaneously.
Now say you’re downloading a video shared by someone else, which can also take a
while but does not take up as much CPU time. In this case, the CPU has to wait
for data to arrive from the network. While you can start reading the data once
it starts to arrive, it might take some time for all of it to show up. Even once
the data is all present, if the video is quite large, it could take at least a
second or two to load it all. That might not sound like much, but it’s a very
long time for a modern processor, which can perform billions of operations every
second. Again, your operating system will invisibly interrupt your program to
allow the CPU to perform other work while waiting for the network call to
finish.
The video export is an example of a _CPU-bound_ or _compute-bound_ operation.
It’s limited by the computer’s potential data processing speed within the CPU or
GPU, and how much of that speed it can dedicate to the operation. The video
download is an example of an _IO-bound_ operation, because it’s limited by the
speed of the computer’s _input and output_; it can only go as fast as the data
can be sent across the network.
In both of these examples, the operating system’s invisible interrupts provide a
form of concurrency. That concurrency only happens at the level of a whole
form of concurrency. That concurrency happens only at the level of the entire
program, though: the operating system interrupts one program to let other
programs get work done. In many cases, because we understand our programs at a
much more granular level than the operating system does, we can spot lots of
opportunities for concurrency that the operating system cannot see.
much more granular level than the operating system does, we can spot
opportunities for concurrency that the operating system can’t see.
For example, if we’re building a tool to manage file downloads, we should be
able to write our program in such a way that starting one download does not lock
up the UI, and users should be able to start multiple downloads at the same
time. Many operating system APIs for interacting with the network are
_blocking_, though. That is, these APIs block the program’s progress until the
data that they are processing is completely ready.
> Note: This is how _most_ function calls work, if you think about it! However,
> we normally reserve the term “blocking” for function calls which interact with
able to write our program so that starting one download won’t lock up the UI,
and users should be able to start multiple downloads at the same time. Many
operating system APIs for interacting with the network are _blocking_, though;
that is, they block the program’s progress until the data they’re processing is
completely ready.
> Note: This is how _most_ function calls work, if you think about it. However,
> the term _blocking_ is usually reserved for function calls that interact with
> files, the network, or other resources on the computer, because those are the
> places where an individual program would benefit from the operation being
> cases where an individual program would benefit from the operation being
> _non_-blocking.
We could avoid blocking our main thread by spawning a dedicated thread to
download each file. However, we would eventually find that the overhead of those
threads was a problem. It would also be nicer if the call were not blocking in
the first place. Last but not least, it would be better if we could write in the
same direct style we use in blocking code. Something similar to this:
download each file. However, the overhead of those threads would eventually
become a problem. It would be preferable if the call didn’t block in the first
place. It would also be better if we could write in the same direct style we use
in blocking code, similar to this:
```rust,ignore,does_not_compile
let data = fetch_data_from(url).await;
println!("{data}");
```
That is exactly what Rust’s async abstraction gives us. Before we see how this
works in practice, though, we need to take a short detour into the differences
between parallelism and concurrency.
That is exactly what Rust’s _async_ (short for _asynchronous_) abstraction gives
us. In this chapter, you’ll learn all about async as we cover the following
topics:
- How to use Rust’s `async` and `await` syntax
- How to use the async model to solve some of the same challenges we looked at
in Chapter 16
- How multithreading and async provide complementary solutions that you can
combine in many cases
Before we see how async works in practice, though, we need to take a short
detour to discuss the differences between parallelism and concurrency.
### Parallelism and Concurrency
In the previous chapter, we treated parallelism and concurrency as mostly
interchangeable. Now we need to distinguish between them more precisely, because
the differences will show up as we start working.
We’ve treated parallelism and concurrency as mostly interchangeable so far. Now
we need to distinguish between them more precisely, because the differences will
show up as we start working.
Consider the different ways a team could split up work on a software project. We
could assign a single individual multiple tasks, or we could assign one task per
team member, or we could do a mix of both approaches.
Consider the different ways a team could split up work on a software project.
You could assign a single member multiple tasks, assign each member one task, or
use a mix of the two approaches.
When an individual works on several different tasks before any of them is
complete, this is _concurrency_. Maybe you have two different projects checked
out on your computer, and when you get bored or stuck on one project, you switch
to the other. You’re just one person, so you can’t make progress on both tasks
at the exact same time—but you can multi-task, making progress on multiple
tasks by switching between them.
at the exact same time, but you can multi-task, making progress on one at a time
by switching between them (see Figure 17-1).
<figure>
<img alt="Concurrent work flow" src="img/trpl17-01.svg" class="center" />
<img src="img/trpl17-01.svg" class="center" alt="A diagram with boxes labeled Task A and Task B, with diamonds in them representing subtasks. There are arrows pointing from A1 to B1, B1 to A2, A2 to B2, B2 to A3, A3 to A4, and A4 to B3. The arrows between the subtasks cross the boxes between Task A and Task B." />
<figcaption>Figure 17-1: A concurrent workflow, switching between Task A and Task B.</figcaption>
<figcaption>Figure 17-1: A concurrent workflow, switching between Task A and Task B</figcaption>
</figure>
When you agree to split up a group of tasks between the people on the team, with
each person taking one task and working on it alone, this is _parallelism_. Each
person on the team can make progress at the exact same time.
When the team splits up a group of tasks by having each member take one task and
work on it alone, this is _parallelism_. Each person on the team can make
progress at the exact same time (see Figure 17-2).
<figure>
<img alt="Concurrent work flow" src="img/trpl17-02.svg" class="center" />
<img src="img/trpl17-02.svg" class="center" alt="A diagram with boxes labeled Task A and Task B, with diamonds in them representing subtasks. There are arrows pointing from A1 to A2, A2 to A3, A3 to A4, B1 to B2, and B2 to B3. No arrows cross between the boxes for Task A and Task B." />
<figcaption>Figure 17-2: A parallel workflow, where work happens on Task A and Task B independently.</figcaption>
<figcaption>Figure 17-2: A parallel workflow, where work happens on Task A and Task B independently</figcaption>
</figure>
With both of these situations, you might have to coordinate between different
tasks. Maybe you _thought_ the task that one person was working on was totally
independent from everyone else’s work, but it actually needs something finished
by another person on the team. Some of the work could be done in parallel, but
some of it was actually _serial_: it could only happen in a series, one thing
after the other, as in Figure 17-3.
In both of these workflows, you might have to coordinate between different
tasks. Maybe you _thought_ the task assigned to one person was totally
independent from everyone else’s work, but it actually requires another person
on the team to finish their task first. Some of the work could be done in
parallel, but some of it was actually _serial_: it could only happen in a
series, one task after the other, as in Figure 17-3.
<figure>
<img alt="Concurrent work flow" src="img/trpl17-03.svg" class="center" />
<img src="img/trpl17-03.svg" class="center" alt="A diagram with boxes labeled Task A and Task B, with diamonds in them representing subtasks. There are arrows pointing from A1 to A2, A2 to a pair of thick vertical lines like a “pause” symbol, from that symbol to A3, B1 to B2, B2 to B3, which is below that symbol, B3 to A3, and B3 to B4." />
<figcaption>Figure 17-3: A partially parallel workflow, where work happens on Task A and Task B independently until task A3 is blocked on the results of task B3.</figcaption>
<figcaption>Figure 17-3: A partially parallel workflow, where work happens on Task A and Task B independently until Task A3 is blocked on the results of Task B3.</figcaption>
</figure>
@@ -130,24 +147,17 @@ coworker are no longer able to work in parallel, and you’re also no longer abl
to work concurrently on your own tasks.
The same basic dynamics come into play with software and hardware. On a machine
with a single CPU core, the CPU can only do one operation at a time, but it can
still work concurrently. Using tools such as threads, processes, and async, the
computer can pause one activity and switch to others before eventually cycling
back to that first activity again. On a machine with multiple CPU cores, it can
also do work in parallel. One core can be doing one thing while another core
does something completely unrelated, and those actually happen at the same
time.
with a single CPU core, the CPU can perform only one operation at a time, but it
can still work concurrently. Using tools such as threads, processes, and async,
the computer can pause one activity and switch to others before eventually
cycling back to that first activity again. On a machine with multiple CPU cores,
it can also do work in parallel. One core can be performing one task while
another core performs a completely unrelated one, and those operations actually
happen at the same time.
When working with async in Rust, we’re always dealing with concurrency.
Depending on the hardware, the operating system, and the async runtime we are
using—more on async runtimes shortly!—that concurrency may also use parallelism
using (more on async runtimes shortly), that concurrency may also use parallelism
under the hood.
Now, let’s dive into how async programming in Rust actually works! In the rest
of this chapter, we will:
- see how to use Rust’s `async` and `await` syntax
- explore how to use the async model to solve some of the same challenges we
looked at in Chapter 16
- look at how multithreading and async provide complementary solutions, which
you can even use together in many cases
Now, let’s dive into how async programming in Rust actually works.

rustbook-en/src/ch17-01-futures-and-syntax.md (386 lines changed)

@@ -3,59 +3,56 @@
The key elements of asynchronous programming in Rust are _futures_ and Rust’s
`async` and `await` keywords.
A _future_ is a value which may not be ready now, but will become ready at some
A _future_ is a value that may not be ready now but will become ready at some
point in the future. (This same concept shows up in many languages, sometimes
under other names such as “task” or “promise”.) Rust provides a `Future` trait
as a building block so different async operations can be implemented with
different data structures, but with a common interface. In Rust, we say that
types which implement the `Future` trait are futures. Each type which
implements `Future` holds its own information about the progress that has been
made and what "ready" means.
The `async` keyword can be applied to blocks and functions to specify that they
under other names such as _task_ or _promise_.) Rust provides a `Future` trait
as a building block so that different async operations can be implemented with
different data structures but with a common interface. In Rust, futures are
types that implement the `Future` trait. Each future holds its own information
about the progress that has been made and what "ready" means.
You can apply the `async` keyword to blocks and functions to specify that they
can be interrupted and resumed. Within an async block or async function, you can
use the `await` keyword to wait for a future to become ready, called _awaiting a
future_. Each place you await a future within an async block or function is a
place that async block or function may get paused and resumed. The process of
checking with a future to see if its value is available yet is called _polling_.
Some other languages also use `async` and `await` keywords for async
programming. If you’re familiar with those languages, you may notice some
significant differences in how Rust does things, including how it handles the
syntax. That’s for good reason, as we’ll see!
Most of the time when writing async Rust, we use the `async` and `await`
keywords. Rust compiles them into equivalent code using the `Future` trait, much
as it compiles `for` loops into equivalent code using the `Iterator` trait.
Because Rust provides the `Future` trait, though, you can also implement it for
your own data types when you need to. Many of the functions we’ll see
throughout this chapter return types with their own implementations of `Future`.
We’ll return to the definition of the trait at the end of the chapter and dig
into more of how it works, but this is enough detail to keep us moving forward.
That may all feel a bit abstract. Let’s write our first async program: a little
web scraper. We’ll pass in two URLs from the command line, fetch both of them
concurrently, and return the result of whichever one finishes first. This
example will have a fair bit of new syntax, but don’t worry. We’ll explain
use the `await` keyword to _await a future_ (that is, wait for it to become
ready). Any point where you await a future within an async block or function is
a potential spot for that async block or function to pause and resume. The
process of checking with a future to see if its value is available yet is called
_polling_.
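As a minimal illustration of that syntax (a sketch only: `std::future::ready` produces an already-ready future from the standard library, and actually running an async function still requires a runtime, which we discuss shortly):
```rust
use std::future::ready;

// Marking a function `async` means it can be paused and resumed.
async fn forty_two() -> u32 {
    // Each `.await` is a potential pause point, where the runtime polls
    // the future; `ready` just happens to be ready immediately.
    let n = ready(41).await;
    n + 1
}
```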
Some other languages, such as C# and JavaScript, also use `async` and `await`
keywords for async programming. If you’re familiar with those languages, you may
notice some significant differences in how Rust does things, including how it
handles the syntax. That’s for good reason, as we’ll see!
When writing async Rust, we use the `async` and `await` keywords most of the
time. Rust compiles them into equivalent code using the `Future` trait, much as
it compiles `for` loops into equivalent code using the `Iterator` trait. Because
Rust provides the `Future` trait, though, you can also implement it for your own
data types when you need to. Many of the functions we’ll see throughout this
chapter return types with their own implementations of `Future`. We’ll return to
the definition of the trait at the end of the chapter and dig into more of how
it works, but this is enough detail to keep us moving forward.
This may all feel a bit abstract, so let’s write our first async program: a
little web scraper. We’ll pass in two URLs from the command line, fetch both of
them concurrently, and return the result of whichever one finishes first. This
example will have a fair bit of new syntax, but don’t worry—we’ll explain
everything you need to know as we go.
### Our First Async Program
## Our First Async Program
To keep this chapter focused on learning async, rather than juggling parts of
the ecosystem, we have created the `trpl` crate (`trpl` is short for “The Rust
To keep the focus of this chapter on learning async rather than juggling parts
of the ecosystem, we’ve created the `trpl` crate (`trpl` is short for “The Rust
Programming Language”). It re-exports all the types, traits, and functions
you’ll need, primarily from the [`futures`][futures-crate]<!-- ignore --> and
[`tokio`][tokio]<!-- ignore --> crates.
- The `futures` crate is an official home for Rust experimentation for async
code, and is actually where the `Future` type was originally designed.
- Tokio is the most widely used async runtime in Rust today, especially (but
not only!) for web applications. There are other great runtimes out there,
and they may be more suitable for your purposes. We use Tokio under the hood
for `trpl` because it’s well-tested and widely used.
In some cases, `trpl` also renames or wraps the original APIs to let us stay
[`tokio`][tokio]<!-- ignore --> crates. The `futures` crate is an official home
for Rust experimentation for async code, and it’s actually where the `Future`
trait was originally designed. Tokio is the most widely used async runtime in
Rust today, especially for web applications. There are other great runtimes out
there, and they may be more suitable for your purposes. We use the `tokio` crate
under the hood for `trpl` because it’s well tested and widely used.
In some cases, `trpl` also renames or wraps the original APIs to keep you
focused on the details relevant to this chapter. If you want to understand what
the crate does, we encourage you to check out [its source
code][crate-source]<!-- ignore -->. You’ll be able to see what crate each
@@ -72,12 +69,14 @@ $ cargo add trpl
```
Now we can use the various pieces provided by `trpl` to write our first async
program. We’ll build a little command line tool which fetches two web pages,
program. We’ll build a little command line tool that fetches two web pages,
pulls the `<title>` element from each, and prints out the title of whichever
finishes that whole process first.
page finishes that whole process first.
### Defining the page_title Function
Let’s start by writing a function that takes one page URL as a parameter, makes
a request to it, and returns the text of the title element:
a request to it, and returns the text of the title element (see Listing 17-1).
<Listing number="17-1" file-name="src/main.rs" caption="Defining an async function to get the title element from an HTML page">
@@ -87,53 +86,53 @@ a request to it, and returns the text of the title element:
</Listing>
In Listing 17-1, we define a function named `page_title`, and we mark it with
the `async` keyword. Then we use the `trpl::get` function to fetch whatever URL
is passed in, and we await the response by using the `await` keyword. Then we
get the text of the response by calling its `text` method, and once again await
it with the `await` keyword. Both of these steps are asynchronous. For `get`,
we need to wait for the server to send back the first part of its response,
which will include HTTP headers, cookies, and so on. That part of the response
can be delivered separately from the body of the request. Especially if the
body is very large, it can take some time for it all to arrive. Thus, we have
to wait for the _entirety_ of the response to arrive, so the `text` method is
also async.
First, we define a function named `page_title` and mark it with the `async`
keyword. Then we use the `trpl::get` function to fetch whatever URL is passed in
and add the `await` keyword to await the response. To get the text of the
response, we call its `text` method, and once again await it with the `await`
keyword. Both of these steps are asynchronous. For the `get` function, we have
to wait for the server to send back the first part of its response, which will
include HTTP headers, cookies, and so on, and can be delivered separately from
the response body. Especially if the body is very large, it can take some time
for it all to arrive. Because we have to wait for the _entirety_ of the response
to arrive, the `text` method is also async.
We have to explicitly await both of these futures, because futures in Rust are
_lazy_: they don’t do anything until you ask them to with `await`. (In fact,
Rust will show a compiler warning if you don’t use a future.) This should
remind you of our discussion of iterators [back in Chapter 13][iterators-lazy]<!--
ignore -->.
Iterators do nothing unless you call their `next` method—whether directly, or
using `for` loops or methods such as `map` which use `next` under the hood. With
futures, the same basic idea applies: they do nothing unless you explicitly ask
them to. This laziness allows Rust to avoid running async code until it’s
actually needed.
> Note: This is different from the behavior we saw when using `thread::spawn` in
> the previous chapter, where the closure we passed to another thread started
> running immediately. It’s also different from how many other languages
> approach async! But it’s important for Rust. We’ll see why that is later.
Once we have `response_text`, we can then parse it into an instance of the
`Html` type using `Html::parse`. Instead of a raw string, we now have a data
type we can use to work with the HTML as a richer data structure. In particular,
we can use the `select_first` method to find the first instance of a given CSS
selector. By passing the string `"title"`, we’ll get the first `<title>`
element in the document, if there is one. Because there may not be any matching
element, `select_first` returns an `Option<ElementRef>`. Finally, we use the
_lazy_: they don’t do anything until you ask them to with the `await` keyword.
(In fact, Rust will show a compiler warning if you don’t use a future.) This
might remind you of Chapter 13’s discussion of iterators in the section
[Processing a Series of Items With Iterators][iterators-lazy]<!-- ignore -->.
Iterators do nothing unless you call their `next` method—whether directly or by
using `for` loops or methods such as `map` that use `next` under the hood.
Likewise, futures do nothing unless you explicitly ask them to. This laziness
allows Rust to avoid running async code until it’s actually needed.
> Note: This is different from the behavior we saw in the previous chapter when
> using `thread::spawn` in [Creating a New Thread with
> spawn][thread-spawn]<!--ignore-->, where the closure we passed to another
> thread started running immediately. It’s also different from how many other
> languages approach async. But it’s important for Rust, and we’ll see why
> later.
Once we have `response_text`, we can parse it into an instance of the `Html`
type using `Html::parse`. Instead of a raw string, we now have a data type we
can use to work with the HTML as a richer data structure. In particular, we can
use the `select_first` method to find the first instance of a given CSS
selector. By passing the string `"title"`, we’ll get the first `<title>` element
in the document, if there is one. Because there may not be any matching element,
`select_first` returns an `Option<ElementRef>`. Finally, we use the
`Option::map` method, which lets us work with the item in the `Option` if it’s
present, and do nothing if it isn’t. (We could also use a `match` expression
here, but `map` is more idiomatic.) In the body of the function we supply to
`map`, we call `inner_html` on the `title_element` to get its content, which is
a `String`. When all is said and done, we have an `Option<String>`.
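Putting those steps together, the function described here looks roughly like the following (a sketch assuming the `trpl` crate’s `get` and `Html` APIs as described above; Listing 17-1 itself is the authoritative version):
```rust
extern crate trpl; // the `trpl` crate is assumed, as throughout this chapter

use trpl::Html;

async fn page_title(url: &str) -> Option<String> {
    // First await: the server sends back the start of its response.
    let response = trpl::get(url).await;
    // Second await: the entire response body arrives.
    let response_text = response.text().await;
    // Parse the body and pull out the first <title> element, if any.
    Html::parse(&response_text)
        .select_first("title")
        .map(|title_element| title_element.inner_html())
}
```
Because futures are lazy, merely calling `page_title(url)` performs no network request; nothing happens until the future it returns is awaited.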
Notice that Rust’s `await` keyword goes after the expression you’re awaiting,
not before it. That is, it’s a _postfix keyword_. This may be different from
what you might be used to if you have used async in other languages. Rust chose
this because it makes chains of methods much nicer to work with. As a result, we
can change the body of `page_url_for` to chain the `trpl::get` and `text`
function calls together with `await` between them, as shown in Listing 17-2:
Notice that Rust’s `await` keyword goes _after_ the expression you’re awaiting,
not before it. That is, it’s a _postfix_ keyword. This may differ from what
you’re used to if you’ve used `async` in other languages, but in Rust it makes
chains of methods much nicer to work with. As a result, we can change the body
of `page_title` to chain the `trpl::get` and `text` function calls together
with `await` between them, as shown in Listing 17-2.
<Listing number="17-2" file-name="src/main.rs" caption="Chaining with the `await` keyword">
@@ -148,15 +147,15 @@ some code in `main` to call it, let’s talk a little more about what we’ve
written and what it means.
When Rust sees a block marked with the `async` keyword, it compiles it into a
unique, anonymous data type which implements the `Future` trait. When Rust sees
a function marked with `async`, it compiles it into a non-async function whose
unique, anonymous data type that implements the `Future` trait. When Rust sees a
function marked with `async`, it compiles it into a non-async function whose
body is an async block. An async function’s return type is the type of the
anonymous data type the compiler creates for that async block.
Thus, writing `async fn` is equivalent to writing a function which returns a
_future_ of the return type. When the compiler sees a function definition such
as the `async fn page_title` in Listing 17-1, it’s equivalent to a non-async
function defined like this:
Thus, writing `async fn` is equivalent to writing a function that returns a
_future_ of the return type. To the compiler, a function definition such as the
`async fn page_title` in Listing 17-1 is equivalent to a non-async function
defined like this:
```rust
# extern crate trpl; // required for mdbook test
// (rest of the listing collapsed in this diff view)
```

@@ -175,34 +174,38 @@ fn page_title(url: &str) -> impl Future<Output = Option<String>> + '_ {
Let’s walk through each part of the transformed version:
- It uses the `impl Trait` syntax we discussed back in the [“Traits as
Parameters”][impl-trait]<!-- ignore --> section in Chapter 10.
- The returned trait is a `Future`, with an associated type of `Output`. Notice
that the `Output` type is `Option<String>`, which is the same as the
original return type from the `async fn` version of `page_title`.
- It uses the `impl Trait` syntax we discussed back in Chapter 10 in the
[“Traits as Parameters”][impl-trait]<!-- ignore --> section.
- The returned trait is a `Future` with an associated type of `Output`. Notice
that the `Output` type is `Option<String>`, which is the same as the original
return type from the `async fn` version of `page_title`.
- All of the code called in the body of the original function is wrapped in an
`async move` block. Remember that blocks are expressions. This whole block is
the expression returned from the function.
- This async block produces a value with the type `Option<String>`, as described
above. That value matches the `Output` type in the return type. This is just
like other blocks you have seen.
- This async block produces a value with the type `Option<String>`, as just
described. That value matches the `Output` type in the return type. This
is just like other blocks you have seen.
- The new function body is an `async move` block because of how it uses the
`url` parameter. (We’ll talk about `async` vs. `async move` much more later
`url` parameter. (We’ll talk much more about `async` versus `async move` later
in the chapter.)
- The new version of the function has a kind of lifetime we haven’t seen before
in the output type: `'_`. Because the function returns a `Future` which refers
to a reference—in this case, the reference from the `url` parameter—we need to
tell Rust that we mean for that reference to be included. We don’t have to
name the lifetime here, because Rust is smart enough to know there is only one
reference which could be involved, but we _do_ have to be explicit that the
resulting `Future` is bound by that lifetime.
Now we can call `page_title` in `main`. To start, we’ll just get the title
for a single page. In Listing 17-3, we follow the same pattern we used for
getting command line arguments back in Chapter 12. Then we pass the first URL
`page_title`, and await the result. Because the value produced by the future is
an `Option<String>`, we use a `match` expression to print different messages to
account for whether the page had a `<title>`.
in the output type: `'_`. Because the function returns a future that refers to
a reference—in this case, the reference from the `url` parameter—we need to
tell Rust that we want that reference to be included. We don’t have to name
the lifetime here, because Rust is smart enough to know there’s only one
reference that could be involved, but we _do_ have to be explicit that the
resulting future is bound by that lifetime.
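Putting the walkthrough together, the transformed function has roughly this shape (illustrative only; the compiler’s actual generated type is anonymous):
```rust
extern crate trpl; // assumed, as before

use std::future::Future;
use trpl::Html;

fn page_title(url: &str) -> impl Future<Output = Option<String>> + '_ {
    // The entire original body moves into an `async move` block, and that
    // block—an expression—is what the function returns.
    async move {
        let text = trpl::get(url).await.text().await;
        Html::parse(&text)
            .select_first("title")
            .map(|title_element| title_element.inner_html())
    }
}
```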
Now we can call `page_title` in `main`.
## Determining a Single Page’s Title
To start, we’ll just get the title for a single page. In Listing 17-3, we follow
the same pattern we used in Chapter 12 to get command line arguments in the
[Accepting Command Line Arguments][cli-args]<!-- ignore --> section. Then we
pass the first URL to `page_title` and await the result. Because the value
produced by the future is an `Option<String>`, we use a `match` expression to
print different messages to account for whether the page had a `<title>`.
<Listing number="17-3" file-name="src/main.rs" caption="Calling the `page_title` function from `main` with a user-supplied argument">
@@ -212,7 +215,7 @@ account for whether the page had a `<title>`.
</Listing>
Unfortunately, this doesn’t compile. The only place we can use the `await`
Unfortunately, this code doesn’t compile. The only place we can use the `await`
keyword is in async functions or blocks, and Rust won’t let us mark the
special `main` function as `async`.
@@ -231,33 +234,32 @@ error[E0752]: `main` function is not allowed to be `async`
```
The reason `main` can’t be marked `async` is that async code needs a _runtime_:
a Rust crate which manages the details of executing asynchronous code. A
a Rust crate that manages the details of executing asynchronous code. A
program’s `main` function can _initialize_ a runtime, but it’s not a runtime
_itself_. (We’ll see more about why this is a bit later.) Every Rust program
that executes async code has at least one place where it sets up a runtime and
executes the futures.
Most languages which support async bundle a runtime with the language. Rust does
not. Instead, there are many different async runtimes available, each of which
makes different tradeoffs suitable to the use case they target. For example, a
high-throughput web server with many CPU cores and a large amount of RAM has
very different needs than a microcontroller with a single core, a small amount
of RAM, and no ability to do heap allocations. The crates which provide those
runtimes also often supply async versions of common functionality such as file
or network I/O.
Here, and throughout the rest of this chapter, we’ll use the `run` function
from the `trpl` crate, which takes a future as an argument and runs it to
completion. Behind the scenes, calling `run` sets up a runtime to use to run the
future passed in. Once the future completes, `run` returns whatever value the
future produced.
We could pass the future returned by `page_title` directly to `run`. Once it
completed, we would be able to match on the resulting `Option<String>`, the way
_itself_. (We’ll see more about why this is the case in a bit.) Every Rust
program that executes async code has at least one place where it sets up a
runtime and executes the futures.
Most languages that support async bundle a runtime, but Rust does not. Instead,
there are many different async runtimes available, each of which makes different
tradeoffs suitable to the use case it targets. For example, a high-throughput
web server with many CPU cores and a large amount of RAM has very different
needs than a microcontroller with a single core, a small amount of RAM, and no
heap allocation ability. The crates that provide those runtimes also often
supply async versions of common functionality such as file or network I/O.
Here, and throughout the rest of this chapter, we’ll use the `run` function from
the `trpl` crate, which takes a future as an argument and runs it to completion.
Behind the scenes, calling `run` sets up a runtime that’s used to run the future
passed in. Once the future completes, `run` returns whatever value the future
produced.
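Concretely, that usage looks something like this (a sketch, assuming the `page_title` function from Listing 17-1):
```rust
fn main() {
    // `run` blocks until the future passed to it completes, then
    // returns the future's output.
    let maybe_title = trpl::run(page_title("https://www.rust-lang.org"));
    println!("{maybe_title:?}");
}
```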
We could pass the future returned by `page_title` directly to `run`, and once it
completed, we could match on the resulting `Option<String>`, as
we tried to do in Listing 17-3. However, for most of the examples in the chapter
(and most async code in the real world!), we’ll be doing more than just one
(and most async code in the real world), we’ll be doing more than just one
async function call, so instead we’ll pass an `async` block and explicitly
await the result of calling `page_title`, as in Listing 17-4.
await the result of the `page_title` call, as in Listing 17-4.
<Listing number="17-4" caption="Awaiting an async block with `trpl::run`" file-name="src/main.rs">
@@ -269,7 +271,7 @@ await the result of calling `page_title`, as in Listing 17-4.
</Listing>
When we run this, we get the behavior we might have expected initially:
When we run this code, we get the behavior we expected initially:
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-04
@@ -286,16 +288,16 @@ The title for https://www.rust-lang.org was
Rust Programming Language
```
Phew: we finally have some working async code! This now compiles, and we can run
it. Before we add code to race two sites against each other, let’s briefly turn
our attention back to how futures work.
Phew—we finally have some working async code! But before we add the code to race
the two sites against each other, let’s briefly turn our attention back to how
futures work.
Each _await point_—that is, every place where the code uses the `await`
keyword—represents a place where control gets handed back to the runtime. To
keyword—represents a place where control is handed back to the runtime. To
make that work, Rust needs to keep track of the state involved in the async
block, so that the runtime can kick off some other work and then come back when
it’s ready to try advancing this one again. This is an invisible state machine,
as if you wrote an enum in this way to save the current state at each `await`
block so that the runtime can kick off some other work and then come back when
it’s ready to try advancing the first one again. This is an invisible state machine,
as if you’d written an enum like this to save the current state at each await
point:
```rust
// (listing collapsed in this diff view)
```

@@ -303,33 +305,36 @@ point:
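That collapsed listing is a state-machine enum of roughly this shape (a hedged reconstruction; the type name and variants are illustrative, since the compiler’s real type is anonymous):
```rust
extern crate trpl; // assumed for the `trpl::Response` type

// One variant per state, including a state for each await point.
enum PageTitleFuture<'a> {
    Initial { url: &'a str },
    GetAwaitPoint { url: &'a str },
    TextAwaitPoint { response: trpl::Response },
}
```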
Writing the code to transition between each state by hand would be tedious and
error-prone, especially when adding more functionality and more states to the
code later. Instead, the Rust compiler creates and manages the state machine
data structures for async code automatically. If you’re wondering: yep, the
normal borrowing and ownership rules around data structures all apply. Happily,
the compiler also handles checking those for us, and has good error messages.
We’ll work through a few of those later in the chapter!
Ultimately, something has to execute that state machine. That something is a
runtime. (This is why you may sometimes come across references to _executors_
error-prone, however, especially when you need to add more functionality and
more states to the code later. Fortunately, the Rust compiler creates and
manages the state machine data structures for async code automatically. The
normal borrowing and ownership rules around data structures all still apply, and
happily, the compiler also handles checking those for us and provides useful
error messages. We’ll work through a few of those later in the chapter.
Ultimately, something has to execute this state machine, and that something is a
runtime. (This is why you may come across references to _executors_
when looking into runtimes: an executor is the part of a runtime responsible for
executing the async code.)
Now we can understand why the compiler stopped us from making `main` itself an
async function back in Listing 17-3. If `main` were an async function, something
else would need to manage the state machine for whatever future `main` returned,
but `main` is the starting point for the program! Instead, we call the
`trpl::run` function in `main`, which sets up a runtime and runs the future
returned by the `async` block until it returns `Ready`.
Now you can see why the compiler stopped us from making `main` itself an async
function back in Listing 17-3. If `main` were an async function, something else
would need to manage the state machine for whatever future `main` returned, but
`main` is the starting point for the program! Instead, we called the `trpl::run`
function in `main` to set up a runtime and run the future returned by the
`async` block until it returns `Ready`.
> Note: some runtimes provide macros to make it so you _can_ write an async main
> Note: Some runtimes provide macros so you _can_ write an async `main`
> function. Those macros rewrite `async fn main() { ... }` to be a normal `fn
> main` which does the same thing we did by hand in Listing 17-5: call a
> function which runs a future to completion the way `trpl::run` does.
> main`, which does the same thing we did by hand in Listing 17-5: call a
> function that runs a future to completion the way `trpl::run` does.
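For example, with the Tokio runtime (one concrete possibility; `trpl` uses Tokio under the hood, though this chapter’s examples stick with `trpl::run`):
```rust
// The attribute macro rewrites this into a normal `fn main` that sets
// up a Tokio runtime and runs the async body to completion.
#[tokio::main]
async fn main() {
    println!("hello from an async main");
}
```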
Now let’s put these pieces together and see how we can write concurrent code.
### Racing Our Two URLs Against Each Other
Let’s put these pieces together and see how we can write concurrent code, by
calling `page_title` with two different URLs passed in from the command line
and racing them.
In Listing 17-5, we call `page_title` with two different URLs passed in from the
command line and race them.
<Listing number="17-5" caption="" file-name="src/main.rs">
@@ -341,24 +346,23 @@ and racing them.
</Listing>
In Listing 17-5, we begin by calling `page_title` for each of the user-supplied
URLs. We save the futures produced by calling `page_title` as `title_fut_1` and
`title_fut_2`. Remember, these don’t do anything yet, because futures are lazy,
and we haven’t yet awaited them. Then we pass the futures to `trpl::race`,
which returns a value to indicate which of the futures passed to it finishes
first.
We begin by calling `page_title` for each of the user-supplied URLs. We save the
resulting futures as `title_fut_1` and `title_fut_2`. Remember, these don’t do
anything yet, because futures are lazy and we haven’t yet awaited them. Then we
pass the futures to `trpl::race`, which returns a value to indicate which of the
futures passed to it finishes first.
> Note: Under the hood, `race` is built on a more general function, `select`,
> which you will encounter more often in real-world Rust code. A `select`
> function can do a lot of things that `trpl::race` function can’t, but it also
> has some additional complexity that we can skip over for now.
> function can do a lot of things that the `trpl::race` function can’t, but it
> also has some additional complexity that we can skip over for now.
Either future can legitimately “win,” so it doesn’t make sense to return a
`Result`. Instead, `race` returns a type we haven’t seen before,
`trpl::Either`. The `Either` type is somewhat similar to a `Result`, in that it
`trpl::Either`. The `Either` type is somewhat similar to a `Result` in that it
has two cases. Unlike `Result`, though, there is no notion of success or
failure baked into `Either`. Instead, it uses `Left` and `Right` to indicate
“one or the other”.
“one or the other”:
```rust
enum Either<A, B> {
    Left(A),
    Right(B),
}
```

@@ -367,25 +371,27 @@ enum Either<A, B> {
The `race` function returns `Left` if the first argument finishes first, with
that future’s output, and `Right` with the second future argument’s output if
_that_ one finishes first. This matches the order the arguments appear when
calling the function: the first argument is to the left of the second argument.
The `race` function returns `Left` with that future’s output if the first
argument wins, and `Right` with the second future argument’s output if _that_
one wins. This matches the order the arguments appear in when calling the
function: the first argument is to the left of the second argument.
We also update `page_title` to return the same URL passed in. That way, if
the page which returns first does not have a `<title>` we can resolve, we can
the page that returns first does not have a `<title>` we can resolve, we can
still print a meaningful message. With that information available, we wrap up by
updating our `println!` output to indicate both which URL finished first and
what the `<title>` was for the web page at that URL, if any.
what, if any, the `<title>` is for the web page at that URL.
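Unpacking the winner therefore looks something like this (a sketch of the pattern inside the async block; Listing 17-5 has the complete code):
```rust
use trpl::Either;

// `race` returns an `Either`, so we take whichever side finished first,
// along with the URL that `page_title` now passes back through.
let (url, maybe_title) = match trpl::race(title_fut_1, title_fut_2).await {
    Either::Left(left) => left,
    Either::Right(right) => right,
};
```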
You have built a small working web scraper now! Pick a couple URLs and run the
command line tool. You may discover that some sites are reliably faster than
others, while in other cases which site “wins” varies from run to run. More
importantly, you’ve learned the basics of working with futures, so we can now
dig into even more of the things we can do with async.
command line tool. You may discover that some sites are consistently faster than
others, while in other cases the faster site varies from run to run. More
importantly, you’ve learned the basics of working with futures, so now we can
dig deeper into what we can do with async.
[impl-trait]: ch10-02-traits.html#traits-as-parameters
[iterators-lazy]: ch13-02-iterators.html
[thread-spawn]: ch16-01-threads.html#creating-a-new-thread-with-spawn
[cli-args]: ch12-01-accepting-command-line-arguments.html
<!-- TODO: map source link version to version of Rust? -->

rustbook-en/src/ch17-02-concurrency-with-async.md (224 lines changed)

@@ -1,4 +1,7 @@
## Concurrency With Async
## Applying Concurrency with Async
<!-- Old headings. Do not remove or links may break. -->
<a id="concurrency-with-async"></a>
In this section, we’ll apply async to some of the same concurrency challenges
we tackled with threads in Chapter 16. Because we already talked about a lot of
@@ -6,20 +9,24 @@ the key ideas there, in this section we’ll focus on what’s different between
threads and futures.
In many cases, the APIs for working with concurrency using async are very
similar to those for using threads. In other cases, they end up being shaped
quite differently. Even when the APIs _look_ similar between threads and async,
they often have different behavior—and they nearly always have different
performance characteristics.
similar to those for using threads. In other cases, they end up being quite
different. Even when the APIs _look_ similar between threads and async, they
often have different behavior—and they nearly always have different performance
characteristics.
<!-- Old headings. Do not remove or links may break. -->
<a id="counting"></a>
### Counting
### Creating a New Task with `spawn_task`
The first task we tackled in Chapter 16 was counting up on two separate threads.
The first operation we tackled in [Creating a New Thread with
Spawn][thread-spawn]<!-- ignore --> was counting up on two separate threads.
Let’s do the same using async. The `trpl` crate supplies a `spawn_task` function
which looks very similar to the `thread::spawn` API, and a `sleep` function
which is an async version of the `thread::sleep` API. We can use these together
to implement the same counting example as with threads, in Listing 17-6.
that looks very similar to the `thread::spawn` API, and a `sleep` function
that is an async version of the `thread::sleep` API. We can use these together
to implement the counting example, as shown in Listing 17-6.
<Listing number="17-6" caption="Using `spawn_task` to count with two" file-name="src/main.rs">
<Listing number="17-6" caption="Creating a new task to print one thing while the main task prints something else" file-name="src/main.rs">
```rust
{{#rustdoc_include ../listings/ch17-async-await/listing-17-06/src/main.rs:all}}
@@ -27,21 +34,21 @@ to implement the same counting example as with threads, in Listing 17-6.
</Listing>
As our starting point, we set up our `main` function with `trpl::run`, so
that our top-level function can be async.
As our starting point, we set up our `main` function with `trpl::run` so that
our top-level function can be async.
> Note: From this point forward in the chapter, every example will include this
> exact same wrapping code with `trpl::run` in `main`, so we’ll often skip it
> just as we do with `main`. Don’t forget to include it in your code!
Then we write two loops within that block, each with a `trpl::sleep` call in it,
Then we write two loops within that block, each containing a `trpl::sleep` call,
which waits for half a second (500 milliseconds) before sending the next
message. We put one loop in the body of a `trpl::spawn_task` and the other in a
top-level `for` loop. We also add an `await` after the `sleep` calls.
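The overall shape is roughly as follows (a sketch; Listing 17-6 is the authoritative version):
```rust
extern crate trpl; // assumed, as throughout this chapter

use std::time::Duration;

fn main() {
    trpl::run(async {
        // One loop runs on a newly spawned task...
        trpl::spawn_task(async {
            for i in 1..10 {
                println!("hi number {i} from the first task!");
                trpl::sleep(Duration::from_millis(500)).await;
            }
        });

        // ...while the other runs in the top-level async block.
        for i in 1..5 {
            println!("hi number {i} from the second task!");
            trpl::sleep(Duration::from_millis(500)).await;
        }
    });
}
```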
This does something similar to the thread-based implementation—including the
This code behaves similarly to the thread-based implementation—including the
fact that you may see the messages appear in a different order in your own
terminal when you run it.
terminal when you run it:
<!-- Not extracting output because changes to this output aren't significant;
the changes are likely to be due to the threads running differently rather than
@@ -59,9 +66,9 @@ hi number 4 from the second task!
hi number 5 from the first task!
```
This version stops as soon as the for loop in the body of the main async block
finishes, because the task spawned by `spawn_task` is shut down when the main
function ends. If you want to run all the way to the completion of the task, you
This version stops as soon as the `for` loop in the body of the main async block
finishes, because the task spawned by `spawn_task` is shut down when the `main`
function ends. If you want it to run all the way to the task’s completion, you
will need to use a join handle to wait for the first task to complete. With
threads, we used the `join` method to “block” until the thread was done running.
In Listing 17-7, we can use `await` to do the same thing, because the task
@@ -76,7 +83,7 @@ after awaiting it.
</Listing>
This updated version runs till _both_ loops finish.
This updated version runs until _both_ loops finish.
<!-- Not extracting output because changes to this output aren't significant;
the changes are likely to be due to the threads running differently rather than
@@ -108,14 +115,15 @@ async blocks compile to anonymous futures, we can put each loop in an async
block and have the runtime run them both to completion using the `trpl::join`
function.
In Chapter 16, we showed how to use the `join` method on the `JoinHandle` type
returned when you call `std::thread::spawn`. The `trpl::join` function is
similar, but for futures. When you give it two futures, it produces a single new
future whose output is a tuple with the output of each of the futures you passed
in once _both_ complete. Thus, in Listing 17-8, we use `trpl::join` to wait for
both `fut1` and `fut2` to finish. We do _not_ await `fut1` and `fut2`, but
instead the new future produced by `trpl::join`. We ignore the output, because
it’s just a tuple with two unit values in it.
In the section [Waiting for All Threads to Finish Using `join`
Handles][join-handles]<!-- ignore -->, we showed how to use the `join` method on
the `JoinHandle` type returned when you call `std::thread::spawn`. The
`trpl::join` function is similar, but for futures. When you give it two futures,
it produces a single new future whose output is a tuple containing the output of
each future you passed in once they _both_ complete. Thus, in Listing 17-8, we
use `trpl::join` to wait for both `fut1` and `fut2` to finish. We do _not_ await
`fut1` and `fut2` but instead the new future produced by `trpl::join`. We ignore
the output, because it’s just a tuple containing two unit values.
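In outline, that looks like this (a sketch; the loop bodies are the same counting loops as before):
```rust
let fut1 = async {
    // first counting loop, as in the earlier listings
};
let fut2 = async {
    // second counting loop, as in the earlier listings
};

// Await the single combined future; here we ignore its output, a
// tuple of two unit values.
trpl::join(fut1, fut2).await;
```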
<Listing number="17-8" caption="Using `trpl::join` to await two anonymous futures" file-name="src/main.rs">
@@ -147,7 +155,7 @@ hi number 8 from the first task!
hi number 9 from the first task!
```
Here, you’ll see the exact same order every time, which is very different from
Now, you’ll see the exact same order every time, which is very different from
what we saw with threads. That is because the `trpl::join` function is _fair_,
meaning it checks each future equally often, alternating between them, and never
lets one race ahead if the other is ready. With threads, the operating system
@@ -156,11 +164,10 @@ runtime decides which task to check. (In practice, the details get complicated
because an async runtime might use operating system threads under the hood as
part of how it manages concurrency, so guaranteeing fairness can be more work
for a runtime—but it’s still possible!) Runtimes don’t have to guarantee
fairness for any given operation, and runtimes often offer different APIs to let
you choose whether you want fairness or not.
fairness for any given operation, and they often offer different APIs to let you
choose whether or not you want fairness.
Try some of these different variations on awaiting the futures and see what they
do:
Try some of these variations on awaiting the futures and see what they do:
- Remove the async block from around either or both of the loops.
- Await each async block immediately after defining it.
@@ -170,14 +177,18 @@ do:
For an extra challenge, see if you can figure out what the output will be in
each case _before_ running the code!
### Message Passing
<!-- Old headings. Do not remove or links may break. -->
<a id="message-passing"></a>
### Counting Up on Two Tasks Using Message Passing
Sharing data between futures will also be familiar: we’ll use message passing
again, but this with async versions of the types and functions. We’ll take a
slightly different path than we did in Chapter 16, to illustrate some of the key
differences between thread-based and futures-based concurrency. In Listing 17-9,
we’ll begin with just a single async block—_not_ spawning a separate task as
we spawned a separate thread.
again, but this time with async versions of the types and functions. We’ll take
a slightly different path than we did in [Using Message Passing to Transfer Data
Between Threads][message-passing-threads]<!-- ignore --> to illustrate some of
the key differences between thread-based and futures-based concurrency. In
Listing 17-9, we’ll begin with just a single async block—_not_ spawning a
separate task as we spawned a separate thread.
<Listing number="17-9" caption="Creating an async channel and assigning the two halves to `tx` and `rx`" file-name="src/main.rs">
@@ -205,18 +216,18 @@ because the channel we’re sending it into is unbounded.
> Note: Because all of this async code runs in an async block in a `trpl::run`
> call, everything within it can avoid blocking. However, the code _outside_ it
> will block on the `run` function returning. That is the whole point of the
> will block on the `run` function returning. Thats the whole point of the
> `trpl::run` function: it lets you _choose_ where to block on some set of async
> code, and thus where to transition between sync and async code. In most async
> runtimes, `run` is actually named `block_on` for exactly this reason.
Notice two things about this example: First, the message will arrive right away!
Notice two things about this example. First, the message will arrive right away.
Second, although we use a future here, there’s no concurrency yet. Everything
in the listing happens in sequence, just as it would if there were no futures
involved.
Let’s address the first part by sending a series of messages, and sleep in
between them, as shown in Listing 17-10:
Let’s address the first part by sending a series of messages and sleeping in
between them, as shown in Listing 17-10.
<!-- We cannot test this one because it never stops! -->
@@ -228,26 +239,26 @@ between them, as shown in Listing 17-10:
</Listing>
In addition to sending the messages, we need to receive them. In this case, we
could do that manually, by just doing `rx.recv().await` four times, because we
know how many messages are coming in. In the real world, though, we’ll
generally be waiting on some _unknown_ number of messages. In that case, we need
to keep waiting until we determine that there are no more messages.
In addition to sending the messages, we need to receive them. In this case,
because we know how many messages are coming in, we could do that manually by
calling `rx.recv().await` four times. In the real world, though, we’ll generally
be waiting on some _unknown_ number of messages, so we need to keep waiting
until we determine that there are no more messages.
In Listing 16-10, we used a `for` loop to process all the items received from a
synchronous channel. However, Rust doesn’t yet have a way to write a `for` loop
over an _asynchronous_ series of items. Instead, we need to use a new kind of
loop we haven’t seen before, the `while let` conditional loop. A `while let`
loop is the loop version of the `if let` construct we saw back in Chapter 6. The
loop will continue executing as long as the pattern it specifies continues to
match the value.
The `rx.recv` call produces a `Future`, which we await. The runtime will pause
the `Future` until it is ready. Once a message arrives, the future will resolve
to `Some(message)`, as many times as a message arrives. When the channel closes,
synchronous channel. Rust doesn’t yet have a way to write a `for` loop over an
_asynchronous_ series of items, however, so we need to use a loop we haven’t
seen before: the `while let` conditional loop. This is the loop version of the
`if let` construct we saw back in the section [Concise Control Flow with `if
let` and `let else`][if-let]<!-- ignore -->. The loop will continue executing as
long as the pattern it specifies continues to match the value.
The `rx.recv` call produces a future, which we await. The runtime will pause the
future until it is ready. Once a message arrives, the future will resolve to
`Some(message)` as many times as a message arrives. When the channel closes,
regardless of whether _any_ messages have arrived, the future will instead
resolve to `None` to indicate that there are no more values, and we should stop
polling—that is, stop awaiting.
resolve to `None` to indicate that there are no more values and thus we should
stop polling—that is, stop awaiting.
The `while let` loop pulls all of this together. If the result of calling
`rx.recv().await` is `Some(message)`, we get access to the message and we can
@@ -256,16 +267,16 @@ use it in the loop body, just as we could with `if let`. If the result is
again, so the runtime pauses it again until another message arrives.
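In code, the receiving side described here comes down to just a few lines (a sketch, assuming the `rx` half of a `trpl::channel` as in the listings):

```rust
// Keep polling until the channel closes and `recv` resolves to `None`.
while let Some(value) = rx.recv().await {
    println!("received '{value}'");
}
```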
The code now successfully sends and receives all of the messages. Unfortunately,
there are still a couple problems. For one thing, the messages do not arrive at
half-second intervals. They arrive all at once, two seconds (2,000 milliseconds)
after we start the program. For another, this program also never exits! Instead,
it waits forever for new messages. You will need to shut it down using <span
there are still a couple of problems. For one thing, the messages do not arrive
at half-second intervals. They arrive all at once, 2 seconds (2,000 milliseconds) after
we start the program. For another, this program also never exits! Instead, it
waits forever for new messages. You will need to shut it down using <span
class="keystroke">ctrl-c</span>.
Let’s start by understanding why the messages all come in at once after the full
delay, rather than coming in with delays in between each one. Within a given
async block, the order that `await` keywords appear in the code is also the
order they happen when running the program.
Let’s start by examining why the messages come in all at once after the full
delay, rather than coming in with delays between each one. Within a given async
block, the order in which `await` keywords appear in the code is also the order
in which they’re executed when the program runs.
There’s only one async block in Listing 17-10, so everything in it runs
linearly. There’s still no concurrency. All the `tx.send` calls happen,
@@ -273,13 +284,13 @@ interspersed with all of the `trpl::sleep` calls and their associated await
points. Only then does the `while let` loop get to go through any of the `await`
points on the `recv` calls.
To get the behavior we want, where the sleep delay happens between receiving
each message, we need to put the `tx` and `rx` operations in their own async
blocks. Then the runtime can execute each of them separately using `trpl::join`,
just as in the counting example. Once again, we await the result of calling
`trpl::join`, not the individual futures. If we awaited the individual futures
in sequence, we would just end up back in a sequential flow—exactly what we’re
trying _not_ to do.
To get the behavior we want, where the sleep delay happens between each message,
we need to put the `tx` and `rx` operations in their own async blocks, as shown
in Listing 17-11. Then the runtime can execute each of them separately using
`trpl::join`, just as in the counting example. Once again, we await the result
of calling `trpl::join`, not the individual futures. If we awaited the
individual futures in sequence, we would just end up back in a sequential
flow—exactly what we’re trying _not_ to do.
<!-- We cannot test this one because it never stops! -->
@@ -292,25 +303,25 @@ trying _not_ to do.
</Listing>
With the updated code in Listing 17-11, the messages get printed at
500-millisecond intervals, rather than all in a rush after two seconds.
500-millisecond intervals, rather than all in a rush after 2 seconds.
The program still never exits, though, because of the way the `while let` loop
interacts with `trpl::join`:
- The future returned from `trpl::join` only completes once _both_ futures
- The future returned from `trpl::join` completes only once _both_ futures
passed to it have completed.
- The `tx` future completes once it finishes sleeping after sending the last
message in `vals`.
- The `rx` future won’t complete until the `while let` loop ends.
- The `while let` loop won’t end until awaiting `rx.recv` produces `None`.
- Awaiting `rx.recv` will only return `None` once the other end of the channel
- Awaiting `rx.recv` will return `None` only once the other end of the channel
is closed.
- The channel will only close if we call `rx.close` or when the sender side,
- The channel will close only if we call `rx.close` or when the sender side,
`tx`, is dropped.
- We don’t call `rx.close` anywhere, and `tx` won’t be dropped until the
outermost async block passed to `trpl::run` ends.
- The block can’t end because it is blocked on `trpl::join` completing, which
takes us back to the top of this list!
takes us back to the top of this list.
We could manually close `rx` by calling `rx.close` somewhere, but that doesn’t
make much sense. Stopping after handling some arbitrary number of messages would
@@ -318,18 +329,20 @@ make the program shut down, but we could miss messages. We need some other way
to make sure that `tx` gets dropped _before_ the end of the function.
Right now, the async block where we send the messages only borrows `tx` because
sending a message doesn’t require ownership, but if we could move `tx` into
that async block, it would be dropped once that block ends. In Chapter 13, we
learned how to use the `move` keyword with closures, and in Chapter 16, we saw
that we often need to move data into closures when working with threads. The
sending a message doesn’t require ownership, but if we could move `tx` into that
async block, it would be dropped once that block ends. In the Chapter 13 section
[Capturing References or Moving Ownership][capture-or-move]<!-- ignore -->, you
learned how to use the `move` keyword with closures, and, as discussed in the
Chapter 16 section [Using `move` Closures with Threads][move-threads]<!-- ignore
-->, we often need to move data into closures when working with threads. The
same basic dynamics apply to async blocks, so the `move` keyword works with
async blocks just as it does with closures.
In Listing 17-12, we change the async block for sending messages from a plain
`async` block to an `async move` block. When we run _this_ version of the code,
it shuts down gracefully after the last message is sent and received.
In Listing 17-12, we change the block used to send messages from `async` to
`async move`. When we run _this_ version of the code, it shuts down gracefully
after the last message is sent and received.
<Listing number="17-12" caption="A working example of sending and receiving messages between futures which correctly shuts down when complete" file-name="src/main.rs">
<Listing number="17-12" caption="A revision of the code from Listing 17-11 that correctly shuts down when complete" file-name="src/main.rs">
```rust
{{#rustdoc_include ../listings/ch17-async-await/listing-17-12/src/main.rs:with-move}}
@@ -338,18 +351,8 @@ it shuts down gracefully after the last message is sent and received.
</Listing>
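The crucial change is the single keyword on the sending block. A sketch of the shape, with illustrative strings and delay, and assuming `std::time::Duration` is in scope:

```rust
// Because of `async move`, `tx` is moved into the block and dropped
// when the block finishes, which closes the channel and lets the
// receiving `while let` loop see `None` and end.
let tx_fut = async move {
    let vals = vec![
        String::from("hi"),
        String::from("from"),
        String::from("the"),
        String::from("future"),
    ];

    for val in vals {
        tx.send(val).unwrap();
        trpl::sleep(Duration::from_millis(500)).await;
    }
};
```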
This async channel is also a multiple-producer channel, so we can call `clone`
on `tx` if we want to send messages from multiple futures. In Listing 17-13, we
clone `tx`, creating `tx1` outside the first async block. We move `tx1` into
that block just as we did before with `tx`. Then, later, we move the original
`tx` into a _new_ async block, where we send more messages on a slightly slower
delay. We happen to put this new async block after the async block for receiving
messages, but it could go before it just as well. The key is the order of the
futures are awaited in, not the order they are created in.
Both of the async blocks for sending messages need to be `async move` blocks, so
that both `tx` and `tx1` get dropped when those blocks finish. Otherwise we’ll
end up back in the same infinite loop we started out in. Finally, we switch from
`trpl::join` to `trpl::join3` to handle the additional future.
on `tx` if we want to send messages from multiple futures, as shown in Listing
17-13.
<Listing number="17-13" caption="Using multiple producers with async blocks" file-name="src/main.rs">
@@ -359,7 +362,19 @@ end up back in the same infinite loop we started out in. Finally, we switch from
</Listing>
Now we see all the messages from both sending futures. Because the sending
First, we clone `tx`, creating `tx1` outside the first async block. We move
`tx1` into that block just as we did before with `tx`. Then, later, we move the
original `tx` into a _new_ async block, where we send more messages on a
slightly slower delay. We happen to put this new async block after the async
block for receiving messages, but it could go before it just as well. The key is
the order in which the futures are awaited, not in which they’re created.
Both of the async blocks for sending messages need to be `async move` blocks so
that both `tx` and `tx1` get dropped when those blocks finish. Otherwise, we’ll
end up back in the same infinite loop we started out in. Finally, we switch from
`trpl::join` to `trpl::join3` to handle the additional future.
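In outline, the structure looks like this (a sketch; the message strings and delays are illustrative, and `std::time::Duration` is assumed to be in scope):

```rust
let (tx, mut rx) = trpl::channel();

let tx1 = tx.clone();
let tx1_fut = async move {
    for val in ["hi", "from", "the", "future"] {
        tx1.send(val.to_string()).unwrap();
        trpl::sleep(Duration::from_millis(500)).await;
    }
    // `tx1` is dropped here.
};

let rx_fut = async {
    while let Some(value) = rx.recv().await {
        println!("received '{value}'");
    }
};

let tx_fut = async move {
    for val in ["more", "messages", "for", "you"] {
        tx.send(val.to_string()).unwrap();
        trpl::sleep(Duration::from_millis(1500)).await;
    }
    // `tx` is dropped here.
};

trpl::join3(tx1_fut, tx_fut, rx_fut).await;
```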
Now we see all the messages from both sending futures, and because the sending
futures use slightly different delays after sending, the messages are also
received at those different intervals.
@@ -380,3 +395,10 @@ received 'you'
This is a good start, but it limits us to just a handful of futures: two with
`join`, or three with `join3`. Let’s see how we might work with more futures.
[thread-spawn]: ch16-01-threads.html#creating-a-new-thread-with-spawn
[join-handles]: ch16-01-threads.html#waiting-for-all-threads-to-finish-using-join-handles
[message-passing-threads]: ch16-02-message-passing.html
[if-let]: ch06-03-if-let.html
[capture-or-move]: ch13-01-closures.html#capturing-references-or-moving-ownership
[move-threads]: ch16-01-threads.html#using-move-closures-with-threads

289
rustbook-en/src/ch17-03-more-futures.md

@@ -1,4 +1,4 @@
## Working With Any Number of Futures
## Working with Any Number of Futures
When we switched from using two futures to three in the previous section, we
also had to switch from using `join` to using `join3`. It would be annoying to
@@ -6,7 +6,7 @@ have to call a different function every time we changed the number of futures we
wanted to join. Happily, we have a macro form of `join` to which we can pass an
arbitrary number of arguments. It also handles awaiting the futures itself.
Thus, we could rewrite the code from Listing 17-13 to use `join!` instead of
`join3`, as in Listing 17-14:
`join3`, as in Listing 17-14.
<Listing number="17-14" caption="Using `join!` to wait for multiple futures" file-name="src/main.rs">
@@ -16,17 +16,18 @@ Thus, we could rewrite the code from Listing 17-13 to use `join!` instead of
</Listing>
This is definitely a nice improvement over needing to swap between `join` and
`join3` and `join4` and so on! However, even this macro form only works when we
know the number of futures ahead of time. In real-world Rust, though, pushing
futures into a collection and then waiting on some or all the futures in that
collection to complete is a common pattern.
This is definitely an improvement over swapping between `join` and
`join3` and `join4` and so on! However, even this macro form only works
when we know the number of futures ahead of time. In real-world Rust,
though, pushing futures into a collection and then waiting on some or
all of them to complete is a common pattern.
To check all the futures in some collection, we’ll need to iterate over and
join on _all_ of them. The `trpl::join_all` function accepts any type which
implements the `Iterator` trait, which we learned about back in Chapter 13, so
it seems like just the ticket. Let’s try putting our futures in a vector, and
replace `join!` with `join_all`.
join on _all_ of them. The `trpl::join_all` function accepts any type that
implements the `Iterator` trait, which you learned about back in [The Iterator
Trait and the `next` Method][iterator-trait]<!-- ignore --> in Chapter 13, so
it seems like just the ticket. Let’s try putting our futures in a vector and
replacing `join!` with `join_all`, as shown in Listing 17-15.
<Listing number="17-15" caption="Storing anonymous futures in a vector and calling `join_all`">
@@ -36,7 +37,7 @@ replace `join!` with `join_all`.
</Listing>
Unfortunately, this doesn’t compile. Instead, we get this error:
Unfortunately, this code doesn’t compile. Instead, we get this error:
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-15/
@@ -55,7 +56,8 @@ error[E0308]: mismatched types
| ----- the found `async` block
...
45 | let futures = vec![tx1_fut, rx_fut, tx_fut];
| ^^^^^^ expected `async` block, found a different `async` block
| ^^^^^^ expected `async` block, found a different `async` block
|
= note: expected `async` block `{async block@src/main.rs:10:23: 10:33}`
found `async` block `{async block@src/main.rs:24:22: 24:27}`
@@ -63,31 +65,31 @@ error[E0308]: mismatched types
= help: consider pinning your async block and casting it to a trait object
```
This might be surprising. After all, none of them return anything, so each
block produces a `Future<Output = ()>`. However, `Future` is a trait, not a
concrete type. The concrete types are the individual data structures generated
by the compiler for async blocks. You can’t put two different hand-written
structs in a `Vec`, and the same thing applies to the different structs
generated by the compiler.
This might be surprising. After all, none of the async blocks returns anything,
so each one produces a `Future<Output = ()>`. Remember that `Future` is a trait,
though, and that the compiler creates a unique enum for each async block. You
can’t put two different hand-written structs in a `Vec`, and the same rule
applies to the different enums generated by the compiler.
To make this work, we need to use _trait objects_, just as we did in [“Returning
Errors from the run function”][dyn]<!-- ignore --> in Chapter 12. (We’ll cover trait objects
in detail in Chapter 18.) Using trait objects lets us treat each of the
anonymous futures produced by these types as the same type, because all of them
implement the `Future` trait.
Errors from the run function”][dyn]<!-- ignore --> in Chapter 12. (We’ll cover
trait objects in detail in Chapter 18.) Using trait objects lets us treat each
of the anonymous futures produced by these types as the same type, because all
of them implement the `Future` trait.
> Note: In Chapter 8, we discussed another way to include multiple types in a
> `Vec`: using an enum to represent each of the different types which can
> appear in the vector. We can’t do that here, though. For one thing, we have
> no way to name the different types, because they are anonymous. For another,
> the reason we reached for a vector and `join_all` in the first place was to be
> able to work with a dynamic collection of futures where we don’t know what
> they will all be until runtime.
> Note: In the Chapter 8 section [Using an Enum to Store Multiple
> Values][enum-alt]<!-- ignore -->, we discussed another way to include multiple
> types in a `Vec`: using an enum to represent each type that can appear in the
> vector. We can’t do that here, though. For one thing, we have no way to name
> the different types, because they are anonymous. For another, the reason we
> reached for a vector and `join_all` in the first place was to be able to work
> with a dynamic collection of futures where we only care that they have the
> same output type.
We start by wrapping each of the futures in the `vec!` in a `Box::new`, as shown
in Listing 17-16.
We start by wrapping each future in the `vec!` in a `Box::new`, as shown in
Listing 17-16.
<Listing number="17-16" caption="Trying to use `Box::new` to align the types of the futures in a `Vec`" file-name="src/main.rs">
<Listing number="17-16" caption="Using `Box::new` to align the types of the futures in a `Vec`" file-name="src/main.rs">
```rust,ignore,does_not_compile
{{#rustdoc_include ../listings/ch17-async-await/listing-17-16/src/main.rs:here}}
@@ -95,11 +97,11 @@ in Listing 17-16.
</Listing>
Unfortunately, this still doesn’t compile. In fact, we have the same basic
error we did before, but we get one for both the second and third `Box::new`
calls, and we also get new errors referring to the `Unpin` trait. We will come
back to the `Unpin` errors in a moment. First, let’s fix the type errors on the
`Box::new` calls, by explicitly annotating the type of the `futures` variable:
Unfortunately, this code still doesn’t compile. In fact, we get the same basic
error we got before for both the second and third `Box::new` calls, as well as
new errors referring to the `Unpin` trait. We’ll come back to the `Unpin` errors
in a moment. First, let’s fix the type errors on the `Box::new` calls by
explicitly annotating the type of the `futures` variable (see Listing 17-17).
<Listing number="17-17" caption="Fixing the rest of the type mismatch errors by using an explicit type declaration" file-name="src/main.rs">
@@ -109,17 +111,18 @@ back to the `Unpin` errors in a moment. First, let’s fix the type errors on th
</Listing>
The type we had to write here is a little involved, so let’s walk through it:
This type declaration is a little involved, so let’s walk through it:
- The innermost type is the future itself. We note explicitly that the output of
the future is the unit type `()` by writing `Future<Output = ()>`.
- Then we annotate the trait with `dyn` to mark it as dynamic.
- The entire trait reference is wrapped in a `Box`.
- Finally, we state explicitly that `futures` is a `Vec` containing these items.
1. The innermost type is the future itself. We note explicitly that the output
of the future is the unit type `()` by writing `Future<Output = ()>`.
2. Then we annotate the trait with `dyn` to mark it as dynamic.
3. The entire trait reference is wrapped in a `Box`.
4. Finally, we state explicitly that `futures` is a `Vec` containing these
items.
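Putting those pieces together, the annotation reads like this (a sketch; the three futures are the same async blocks from the previous listings):

```rust
use std::future::Future;

let futures: Vec<Box<dyn Future<Output = ()>>> =
    vec![Box::new(tx1_fut), Box::new(rx_fut), Box::new(tx_fut)];
```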
That already made a big difference. Now when we run the compiler, we only have
the errors mentioning `Unpin`. Although there are three of them, notice that
each is very similar in its contents.
That already made a big difference. Now when we run the compiler, we get only
the errors mentioning `Unpin`. Although there are three of them, their contents
are very similar.
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-16
@@ -236,10 +239,10 @@ note: required by a bound in `futures_util::future::join_all::JoinAll`
That is a _lot_ to digest, so let’s pull it apart. The first part of the message
tells us that the first async block (`src/main.rs:8:23: 20:10`) does not
implement the `Unpin` trait, and suggests using `pin!` or `Box::pin` to resolve
implement the `Unpin` trait and suggests using `pin!` or `Box::pin` to resolve
it. Later in the chapter, we’ll dig into a few more details about `Pin` and
`Unpin`. For the moment, though, we can just follow the compiler’s advice to get
unstuck! In Listing 17-18, we start by updating the type annotation for
unstuck. In Listing 17-18, we start by updating the type annotation for
`futures`, with a `Pin` wrapping each `Box`. Second, we use `Box::pin` to pin
the futures themselves.
@@ -270,20 +273,20 @@ received 'you'
Phew!
There’s a bit more we can explore here. For one thing, using `Pin<Box<T>>`
comes with a small amount of extra overhead from putting these futures on the
heap with `Box`—and we’re only doing that to get the types to line up. We don’t
actually _need_ the heap allocation, after all: these futures are local to this
particular function. As noted above, `Pin` is itself a wrapper type, so we can
get the benefit of having a single type in the `Vec`—the original reason we
reached for `Box`—without doing a heap allocation. We can use `Pin` directly
with each future, using the `std::pin::pin` macro.
There’s a bit more to explore here. For one thing, using `Pin<Box<T>>` adds a
small amount of overhead from putting these futures on the heap with `Box`—and
we’re only doing that to get the types to line up. We don’t actually _need_ the
heap allocation, after all: these futures are local to this particular function.
As noted before, `Pin` is itself a wrapper type, so we can get the benefit of
having a single type in the `Vec`—the original reason we reached for
`Box`—without doing a heap allocation. We can use `Pin` directly with each
future, using the `std::pin::pin` macro.
However, we must still be explicit about the type of the pinned reference;
otherwise Rust will still not know to interpret these as dynamic trait objects,
otherwise, Rust will still not know to interpret these as dynamic trait objects,
which is what we need them to be in the `Vec`. We therefore `pin!` each future
when we define it, and define `futures` as a `Vec` containing pinned mutable
references to the dynamic `Future` type, as in Listing 17-19.
references to the dynamic future type, as in Listing 17-19.
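In outline, that change looks like the following sketch (loop bodies elided):

```rust
use std::{
    future::Future,
    pin::{pin, Pin},
};

let tx1_fut = pin!(async move {
    // --snip--
});

let rx_fut = pin!(async {
    // --snip--
});

let tx_fut = pin!(async move {
    // --snip--
});

// Pinned mutable references to the dynamic future type: one concrete
// type for the `Vec`, with no heap allocation.
let futures: Vec<Pin<&mut dyn Future<Output = ()>>> =
    vec![tx1_fut, rx_fut, tx_fut];
```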
<Listing number="17-19" caption="Using `Pin` directly with the `pin!` macro to avoid unnecessary heap allocations" file-name="src/main.rs">
@@ -306,19 +309,20 @@ types. For example, in Listing 17-20, the anonymous future for `a` implements
</Listing>
We can use `trpl::join!` to await them, because it allows you to pass in
multiple future types and produces a tuple of those types. We _cannot_ use
`trpl::join_all`, because it requires the futures passed in all to have the same
type. Remember, that error is what got us started on this adventure with `Pin`!
We can use `trpl::join!` to await them, because it allows us to pass in multiple
future types and produces a tuple of those types. We _cannot_ use
`trpl::join_all`, because it requires all of the futures passed in to have the
same type. Remember, that error is what got us started on this adventure with
`Pin`!
This is a fundamental tradeoff: we can either deal with a dynamic number of
futures with `join_all`, as long as they all have the same type, or we can deal
with a set number of futures with the `join` functions or the `join!` macro,
even if they have different types. This is the same as working with any other
types in Rust, though. Futures are not special, even though we have some nice
syntax for working with them, and that is a good thing.
even if they have different types. This is the same scenario we’d face when
working with any other types in Rust. Futures are not special, even though we
have some nice syntax for working with them, and that’s a good thing.
### Racing futures
### Racing Futures
When we “join” futures with the `join` family of functions and macros, we
require _all_ of them to finish before we move on. Sometimes, though, we only
@@ -326,13 +330,7 @@ need _some_ future from a set to finish before we move on—kind of similar to
racing one future against another.
In Listing 17-21, we once again use `trpl::race` to run two futures, `slow` and
`fast`, against each other. Each one prints a message when it starts running,
pauses for some amount of time by calling and awaiting `sleep`, and then prints
another message when it finishes. Then we pass both to `trpl::race` and wait for
one of them to finish. (The outcome here won’t be too surprising: `fast` wins!)
Unlike when we used `race` back in [“Our First Async Program”][async-program]<!--
ignore -->, we just ignore the `Either` instance it returns here, because all of
the interesting behavior happens in the body of the async blocks.
`fast`, against each other.
<Listing number="17-21" caption="Using `race` to get the result of whichever future finishes first" file-name="src/main.rs">
@@ -342,28 +340,36 @@ the interesting behavior happens in the body of the async blocks.
</Listing>
Each future prints a message when it starts running, pauses for some amount of
time by calling and awaiting `sleep`, and then prints another message when it
finishes. Then we pass both `slow` and `fast` to `trpl::race` and wait for one
of them to finish. (The outcome here isn’t too surprising: `fast` wins.) Unlike
when we used `race` back in [“Our First Async Program”][async-program]<!--
ignore -->, we just ignore the `Either` instance it returns here, because all of
the interesting behavior happens in the body of the async blocks.
Notice that if you flip the order of the arguments to `race`, the order of the
“started” messages changes, even though the `fast` future always completes
first. That’s because the implementation of this particular `race` function is
not fair. It always runs the futures passed as arguments in the order they’re
passed. Other implementations _are_ fair, and will randomly choose which future
to poll first. Regardless of whether the implementation of race we’re using is
fair, though, _one_ of the futures will run up to the first `await` in its body
before another task can start.
Recall from [Our First Async Program][async-program]<!-- ignore --> that at each await point,
Rust gives a runtime a chance to pause the task and switch to another one if the
future being awaited isn’t ready. The inverse is also true: Rust _only_ pauses
async blocks and hands control back to a runtime at an await point. Everything
between await points is synchronous.
not fair. It always runs the futures passed in as arguments in the order in
which they’re passed. Other implementations _are_ fair and will randomly choose
which future to poll first. Regardless of whether the implementation of race
we’re using is fair, though, _one_ of the futures will run up to the first
`await` in its body before another task can start.
Recall from [Our First Async Program][async-program]<!-- ignore --> that at each
await point, Rust gives a runtime a chance to pause the task and switch to
another one if the future being awaited isn’t ready. The inverse is also true:
Rust _only_ pauses async blocks and hands control back to a runtime at an await
point. Everything between await points is synchronous.
That means if you do a bunch of work in an async block without an await point,
that future will block any other futures from making progress. You may sometimes
hear this referred to as one future _starving_ other futures. In some cases,
that may not be a big deal. However, if you are doing some kind of expensive
setup or long-running work, or if you have a future which will keep doing some
particular task indefinitely, you’ll need to think about when and where to
hand control back to the runtime.
setup or long-running work, or if you have a future that will keep doing some
particular task indefinitely, you’ll need to think about when and where to hand
control back to the runtime.
By the same token, if you have long-running blocking operations, async can be a
useful tool for providing ways for different parts of the program to relate to
@@ -371,13 +377,13 @@ each other.
But _how_ would you hand control back to the runtime in those cases?
### Yielding
<!-- Old headings. Do not remove or links may break. -->
<a id="yielding"></a>
### Yielding Control to the Runtime
Let’s simulate a long-running operation. Listing 17-22 introduces a `slow`
function. It uses `std::thread::sleep` instead of `trpl::sleep` so that calling
`slow` will block the current thread for some number of milliseconds. We can use
`slow` to stand in for real-world operations which are both long-running and
blocking.
function.
<Listing number="17-22" caption="Using `thread::sleep` to simulate slow operations" file-name="src/main.rs">
@@ -387,9 +393,13 @@ blocking.
</Listing>
This code uses `std::thread::sleep` instead of `trpl::sleep` so that calling
`slow` will block the current thread for some number of milliseconds. We can use
`slow` to stand in for real-world operations that are both long-running and
blocking.
In Listing 17-23, we use `slow` to emulate doing this kind of CPU-bound work in
a pair of futures. To begin, each future only hands control back to the runtime
_after_ carrying out a bunch of slow operations.
a pair of futures.
<Listing number="17-23" caption="Using `thread::sleep` to simulate slow operations" file-name="src/main.rs">
@@ -399,7 +409,8 @@ _after_ carrying out a bunch of slow operations.
</Listing>
If you run this, you will see this output:
To begin, each future only hands control back to the runtime _after_ carrying
out a bunch of slow operations. If you run this code, you will see this output:
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-23/
@@ -423,15 +434,16 @@ copy just the output
As with our earlier example, `race` still finishes as soon as `a` is done.
There’s no interleaving between the two futures, though. The `a` future does all
of its work until the `trpl::sleep` call is awaited, then the `b` future does
all of its work until its own `trpl::sleep` call is awaited, and then the `a`
all of its work until its own `trpl::sleep` call is awaited, and finally the `a`
future completes. To allow both futures to make progress between their slow
tasks, we need await points so we can hand control back to the runtime. That
means we need something we can await!
We can already see this kind of handoff happening in Listing 17-23: if we
removed the `trpl::sleep` at the end of the `a` future, it would complete
without the `b` future running _at all_. Maybe we could use the `sleep` function
as a starting point?
without the `b` future running _at all_. Let’s try using the `sleep` function as
a starting point for letting operations switch off making progress, as shown in
Listing 17-24.
<Listing number="17-24" caption="Using `sleep` to let operations switch off making progress" file-name="src/main.rs">
@@ -465,8 +477,8 @@ copy just the output
The `a` future still runs for a bit before handing off control to `b`, because
it calls `slow` before ever calling `trpl::sleep`, but after that the futures
swap back and forth each time one of them hits an await point. In this case, we
have done that after every call to `slow`, but we could break up the work
however makes the most sense to us.
have done that after every call to `slow`, but we could break up the work in
whatever way makes the most sense to us.
We don’t really want to _sleep_ here, though: we want to make progress as fast
as we can. We just need to hand back control to the runtime. We can do that
@@ -481,20 +493,16 @@ directly, using the `yield_now` function. In Listing 17-25, we replace all those
</Listing>
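The change itself is tiny; each `trpl::sleep` call in the `a` and `b` blocks becomes a `yield_now` call, as in this fragment (assuming the `slow` helper from Listing 17-22):

```rust
slow("a", 30);
trpl::yield_now().await; // hand control back without waiting on a timer
slow("a", 10);
trpl::yield_now().await;
```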
This is both clearer about the actual intent and can be significantly faster
than using `sleep`, because timers such as the one used by `sleep` often have
limits to how granular they can be. The version of `sleep` we are using, for
example, will always sleep for at least a millisecond, even if we pass it a
This code is both clearer about the actual intent and can be significantly
faster than using `sleep`, because timers such as the one used by `sleep` often
have limits on how granular they can be. The version of `sleep` we are using,
for example, will always sleep for at least a millisecond, even if we pass it a
`Duration` of one nanosecond. Again, modern computers are _fast_: they can do a
lot in one millisecond!
You can see this for yourself by setting up a little benchmark, such as the one
in Listing 17-26. (This isn’t an especially rigorous way to do performance
testing, but it suffices to show the difference here.) Here, we skip all the
status printing, pass a one-nanosecond `Duration` to `trpl::sleep`, and let
each future run by itself, with no switching between the futures. Then we run
for 1,000 iterations and see how long the future using `trpl::sleep` takes
compared to the future using `trpl::yield_now`.
testing, but it suffices to show the difference here.)
<Listing number="17-26" caption="Comparing the performance of `sleep` and `yield_now`" file-name="src/main.rs">
@@ -504,6 +512,11 @@ compared to the future using `trpl::yield_now`.
</Listing>
Here, we skip all the status printing, pass a one-nanosecond `Duration` to
`trpl::sleep`, and let each future run by itself, with no switching between the
futures. Then we run for 1,000 iterations and see how long the future using
`trpl::sleep` takes compared to the future using `trpl::yield_now`.
The version with `yield_now` is _way_ faster!
This means that async can be useful even for compute-bound tasks, depending on
@@ -516,20 +529,19 @@ operating systems, this is the _only_ kind of multitasking!
In real-world code, you won’t usually be alternating function calls with await
points on every single line, of course. While yielding control in this way is
relatively inexpensive, it’s not free! In many cases, trying to break up a
relatively inexpensive, it’s not free. In many cases, trying to break up a
compute-bound task might make it significantly slower, so sometimes it’s better
for _overall_ performance to let an operation block briefly. You should always
for _overall_ performance to let an operation block briefly. Always
measure to see what your code’s actual performance bottlenecks are. The
underlying dynamic is an important one to keep in mind if you _are_ seeing a
lot of work happening in serial that you expected to happen concurrently,
though!
underlying dynamic is important to keep in mind, though, if you _are_ seeing a
lot of work happening in serial that you expected to happen concurrently!
### Building Our Own Async Abstractions
We can also compose futures together to create new patterns. For example, we can
build a `timeout` function with async building blocks we already have. When
we’re done, the result will be another building block we could use to build up
yet further async abstractions.
we’re done, the result will be another building block we could use to create
still more async abstractions.
Listing 17-27 shows how we would expect this `timeout` to work with a slow
future.
@@ -571,17 +583,14 @@ need: we want to race the future passed in against the duration. We can use
`trpl::sleep` to make a timer future from the duration, and use `trpl::race` to
run that timer with the future the caller passes in.
We also know that `race` is not fair, and polls arguments in the order they are
passed. Thus, we pass `future_to_try` to `race` first so it gets a chance to
complete even if `max_time` is a very short duration. If `future_to_try`
finishes first, `race` will return `Left` with the output from `future`. If
`timer` finishes first, `race` will return `Right` with the timer’s output of
`()`.
We also know that `race` is not fair, polling arguments in the order in which
they are passed. Thus, we pass `future_to_try` to `race` first so it gets a
chance to complete even if `max_time` is a very short duration. If
`future_to_try` finishes first, `race` will return `Left` with the output from
`future_to_try`. If `timer` finishes first, `race` will return `Right` with the
timer’s output of `()`.
In Listing 17-29, we match on the result of awaiting `trpl::race`. If the
`future_to_try` succeeded and we get a `Left(output)`, we return `Ok(output)`.
If the sleep timer elapsed instead and we get a `Right(())`, we ignore the `()`
with `_` and return `Err(max_time)` instead.
In Listing 17-29, we match on the result of awaiting `trpl::race`.
<Listing number="17-29" caption="Defining `timeout` with `race` and `sleep`" file-name="src/main.rs">
@@ -591,7 +600,11 @@ with `_` and return `Err(max_time)` instead.
</Listing>
With that, we have a working `timeout`, built out of two other async helpers. If
If the `future_to_try` succeeds and we get a `Left(output)`, we return
`Ok(output)`. If the sleep timer elapses instead and we get a `Right(())`, we
ignore the `()` with `_` and return `Err(max_time)` instead.
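Putting the pieces together, a sketch of the whole function (assuming `trpl::race`, `trpl::sleep`, and the `Either` type described here):

```rust
use std::{future::Future, time::Duration};

use trpl::Either;

async fn timeout<F: Future>(
    future_to_try: F,
    max_time: Duration,
) -> Result<F::Output, Duration> {
    // `race` is unfair, so pass `future_to_try` first: it gets polled
    // first and has a chance to complete even on a very short timeout.
    match trpl::race(future_to_try, trpl::sleep(max_time)).await {
        Either::Left(output) => Ok(output),
        Either::Right(_) => Err(max_time),
    }
}
```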
With that, we have a working `timeout` built out of two other async helpers. If
we run our code, it will print the failure mode after the timeout:
```text
@@ -599,28 +612,30 @@ Failed after 2 seconds
```
Because futures compose with other futures, you can build really powerful tools
using smaller async building blocks. For example, you can use this same
approach to combine timeouts with retries, and in turn use those with things
such as network calls—one of the examples from the beginning of the chapter!
using smaller async building blocks. For example, you can use this same approach
to combine timeouts with retries, and in turn use those with operations such as
network calls (one of the examples from the beginning of the chapter).
In practice, you will usually work directly with `async` and `await`, and
secondarily with functions and macros such as `join`, `join_all`, `race`, and
so on. You’ll only need to reach for `pin` now and again to use them with those
In practice, youll usually work directly with `async` and `await`, and
secondarily with functions and macros such as `join`, `join_all`, `race`, and so
on. You’ll only need to reach for `pin` now and again to use futures with those
APIs.
We’ve now seen a number of ways to work with multiple futures at the same
time. Up next, we’ll look at how we can work with multiple futures in a
sequence over time, with _streams_. Here are a couple more things you might want
sequence over time with _streams_. Here are a couple more things you might want
to consider first, though:
- We used a `Vec` with `join_all` to wait for all of the futures in some group
to finish. How could you use a `Vec` to process a group of futures in
sequence, instead? What are the tradeoffs of doing that?
sequence instead? What are the tradeoffs of doing that?
- Take a look at the `futures::stream::FuturesUnordered` type from the `futures`
crate. How would using it be different from using a `Vec`? (Don’t worry about
the fact that it is from the `stream` part of the crate; it works just fine
the fact that its from the `stream` part of the crate; it works just fine
with any collection of futures.)
[dyn]: ch12-03-improving-error-handling-and-modularity.html
[enum-alt]: ch08-01-vectors.html#using-an-enum-to-store-multiple-types
[async-program]: ch17-01-futures-and-syntax.html#our-first-async-program
[iterator-trait]: ch13-02-iterators.html#the-iterator-trait-and-the-next-method

355
rustbook-en/src/ch17-04-streams.md

@@ -1,26 +1,33 @@
## Streams
## Streams: Futures in Sequence
So far in this chapter, we have mostly stuck to individual futures. The one big
<!-- Old headings. Do not remove or links may break. -->
<a id="streams"></a>
So far in this chapter, we’ve mostly stuck to individual futures. The one big
exception was the async channel we used. Recall how we used the receiver for our
async channel in the [“Message Passing”][17-02-messages]<!-- ignore --> earlier in the chapter.
The async `recv` method produces a sequence of items over time. This is an
instance of a much more general pattern, often called a _stream_.
A sequence of items is something we’ve seen before, when we looked at the
`Iterator` trait in Chapter 13. There are two differences between iterators and
the async channel receiver, though. The first is the element of time: iterators
are synchronous, while the channel receiver is asynchronous. The second is the
API. When working directly with an `Iterator`, we call its synchronous `next`
method. With the `trpl::Receiver` stream in particular, we called an
asynchronous `recv` method instead. These APIs otherwise feel very similar.
That similarity isn’t a coincidence. A stream is similar to an asynchronous
form of iteration. Whereas the `trpl::Receiver` specifically waits to receive
messages, though, the general-purpose stream API is much more general: it
provides the next item the way `Iterator` does, but asynchronously. The
similarity between iterators and streams in Rust means we can actually create a
stream from any iterator. As with an iterator, we can work with a stream by
calling its `next` method and then awaiting the output, as in Listing 17-30.
async channel earlier in this chapter in the [“Message
Passing”][17-02-messages]<!-- ignore --> section. The async `recv` method
produces a sequence of items over time. This is an instance of a much more
general pattern known as a _stream_.
We saw a sequence of items back in Chapter 13, when we looked at the `Iterator`
trait in [The Iterator Trait and the `next` Method][iterator-trait]<!-- ignore
--> section, but there are two differences between iterators and the async
channel receiver. The first difference is time: iterators are synchronous, while
the channel receiver is asynchronous. The second is the API. When working
directly with `Iterator`, we call its synchronous `next` method. With the
`trpl::Receiver` stream in particular, we called an asynchronous `recv` method
instead. Otherwise, these APIs feel very similar, and that similarity
isn’t a coincidence. A stream is like an asynchronous form of iteration. Whereas
the `trpl::Receiver` specifically waits to receive messages, though, the
general-purpose stream API is much broader: it provides the next item the
way `Iterator` does, but asynchronously.
The similarity between iterators and streams in Rust means we can actually
create a stream from any iterator. As with an iterator, we can work with a
stream by calling its `next` method and then awaiting the output, as in Listing
17-30.
<Listing number="17-30" caption="Creating a stream from an iterator and printing its values" file-name="src/main.rs">
@@ -32,11 +39,10 @@ calling its `next` method and then awaiting the output, as in Listing 17-30.
We start with an array of numbers, which we convert to an iterator and then call
`map` on to double all the values. Then we convert the iterator into a stream
using the `trpl::stream_from_iter` function. Then we loop over the items in the
using the `trpl::stream_from_iter` function. Next, we loop over the items in the
stream as they arrive with the `while let` loop.
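For reference, a sketch of the attempt (as written, this is the version the compiler rejects below):

```rust,ignore,does_not_compile
fn main() {
    trpl::run(async {
        let values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
        let iter = values.iter().map(|n| n * 2);
        let mut stream = trpl::stream_from_iter(iter);

        // Fails: no `next` method without the right trait in scope.
        while let Some(value) = stream.next().await {
            println!("The value was: {value}");
        }
    });
}
```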
Unfortunately, when we try to run the code, it doesn’t compile. Instead, as we
can see in the output, it reports that there is no `next` method available.
Unfortunately, when we try to run the code, it doesn’t compile, but instead it reports that there’s no `next` method available:
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-30
@@ -70,20 +76,20 @@ help: there is a method `try_next` with a similar name
| ~~~~~~~~
```
As the output suggests, the reason for the compiler error is that we need the
As this output explains, the reason for the compiler error is that we need the
right trait in scope to be able to use the `next` method. Given our discussion
so far, you might reasonably expect that to be `Stream`, but the trait we need
here is actually `StreamExt`. The `Ext` there is for “extension”: this is a
common pattern in the Rust community for extending one trait with another.
Why do we need `StreamExt` instead of `Stream`, and what does the `Stream` trait
itself do? Briefly, the answer is that throughout the Rust ecosystem, the
`Stream` trait defines a low-level interface which effectively combines the
`Iterator` and `Future` traits. The `StreamExt` trait supplies a higher-level
set of APIs on top of `Stream`, including the `next` method as well as other
utility methods similar to those provided by the `Iterator` trait. We’ll return
to the `Stream` and `StreamExt` traits in a bit more detail at the end of the
chapter. For now, this is enough to let us keep moving.
so far, you might reasonably expect that trait to be `Stream`, but it’s actually
`StreamExt`. Short for _extension_, `Ext` is a common pattern in the
Rust community for extending one trait with another.
We’ll explain the `Stream` and `StreamExt` traits in a bit more detail at the
end of the chapter, but for now all you need to know is that the `Stream` trait
defines a low-level interface that effectively combines the `Iterator` and
`Future` traits. `StreamExt` supplies a higher-level set of APIs on top of
`Stream`, including the `next` method as well as other utility methods similar
to those provided by the `Iterator` trait. `Stream` and `StreamExt` are not yet
part of Rust’s standard library, but most ecosystem crates use the same
definition.
The fix to the compiler error is to add a `use` statement for `trpl::StreamExt`,
as in Listing 17-31.
@@ -101,7 +107,7 @@ more, now that we have `StreamExt` in scope, we can use all of its utility
methods, just as with iterators. For example, in Listing 17-32, we use the
`filter` method to filter out everything but multiples of three and five.
<Listing number="17-32" caption="Filtering a `Stream` with the `StreamExt::filter` method" file-name="src/main.rs">
<Listing number="17-32" caption="Filtering a stream with the `StreamExt::filter` method" file-name="src/main.rs">
```rust
{{#rustdoc_include ../listings/ch17-async-await/listing-17-32/src/main.rs:all}}
@@ -109,31 +115,24 @@ methods, just as with iterators. For example, in Listing 17-32, we use the
</Listing>
Of course, this isn’t very interesting. We could do that with normal iterators
and without any async at all. So let’s look at some of the other things we can
do which are unique to streams.
Of course, this isn’t very interesting, since we could do the same with normal
iterators and without any async at all. Let’s look at what
we can do that _is_ unique to streams.
### Composing Streams
Many concepts are naturally represented as streams: items becoming available in
a queue, or working with more data than can fit in a computer’s memory by only
pulling chunks of it from the file system at a time, or data arriving over the
a queue, chunks of data being pulled incrementally from the filesystem when the
full data set is too large for the computer’s memory, or data arriving over the
network over time. Because streams are futures, we can use them with any other
kind of future, too, and we can combine them in interesting ways. For example,
we can batch up events to avoid triggering too many network calls, set timeouts
on sequences of long-running operations, or throttle user interface events to
avoid doing needless work.
kind of future and combine them in interesting ways. For example, we can batch
up events to avoid triggering too many network calls, set timeouts on sequences
of long-running operations, or throttle user interface events to avoid doing
needless work.
Let’s start by building a little stream of messages, as a stand-in for a stream
Let’s start by building a little stream of messages as a stand-in for a stream
of data we might see from a WebSocket or another real-time communication
protocol. In Listing 17-33, we create a function `get_messages` which returns
`impl Stream<Item = String>`. For its implementation, we create an async
channel, loop over the first ten letters of the English alphabet, and send them
across the channel.
We also use a new type: `ReceiverStream`, which converts the `rx` receiver from
the `trpl::channel` into a `Stream` with a `next` method. Back in `main`, we use
a `while let` loop to print all the messages from the stream.
protocol, as shown in Listing 17-33.
<Listing number="17-33" caption="Using the `rx` receiver as a `ReceiverStream`" file-name="src/main.rs">
@@ -143,6 +142,14 @@ a `while let` loop to print all the messages from the stream.
</Listing>
First, we create a function called `get_messages` that returns `impl Stream<Item
= String>`. For its implementation, we create an async channel, loop over the
first 10 letters of the English alphabet, and send them across the channel.
We also use a new type: `ReceiverStream`, which converts the `rx` receiver from
the `trpl::channel` into a `Stream` with a `next` method. Back in `main`, we use
a `while let` loop to print all the messages from the stream.
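A sketch of the two pieces just described (assuming `trpl::channel` and the `ReceiverStream`, `Stream`, and `StreamExt` re-exports from `trpl`):

```rust
use trpl::{ReceiverStream, Stream, StreamExt};

fn get_messages() -> impl Stream<Item = String> {
    let (tx, rx) = trpl::channel();

    let messages = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
    for message in messages {
        tx.send(format!("Message: '{message}'")).unwrap();
    }

    // Adapt the channel receiver into a `Stream` with a `next` method.
    ReceiverStream::new(rx)
}

fn main() {
    trpl::run(async {
        let mut messages = get_messages();

        while let Some(message) = messages.next().await {
            println!("{message}");
        }
    });
}
```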
When we run this code, we get exactly the results we would expect:
<!-- Not extracting output because changes to this output aren't significant;
@@ -162,19 +169,12 @@ Message: 'i'
Message: 'j'
```
We could do this with the regular `Receiver` API, or even the regular `Iterator`
API, though. Let’s add something that requires streams: adding a timeout
which applies to every item in the stream, and a delay on the items we emit.
Again, we could do this with the regular `Receiver` API or even the regular
`Iterator` API, though, so let’s add a feature that requires streams: adding a
timeout that applies to every item in the stream, and a delay on the items we
emit, as shown in Listing 17-34.
In Listing 17-34, we start by adding a timeout to the stream with the `timeout`
method, which comes from the `StreamExt` trait. Then we update the body of the
`while let` loop, because the stream now returns a `Result`. The `Ok` variant
indicates a message arrived in time; the `Err` variant indicates that the
timeout elapsed before any message arrived. We `match` on that result and either
print the message when we receive it successfully, or print a notice about the
timeout. Finally, notice that we pin the messages after applying the timeout to
them, because the timeout helper produces a stream which needs to be pinned to
be polled.
<Listing number="17-34" caption="Using the `StreamExt::timeout` method to set a time limit on the items in a stream" file-name="src/main.rs">
@@ -184,14 +184,19 @@ be polled.
</Listing>
We start by adding a timeout to the stream with the `timeout` method, which
comes from the `StreamExt` trait. Then we update the body of the `while let`
loop, because the stream now returns a `Result`. The `Ok` variant indicates a
message arrived in time; the `Err` variant indicates that the timeout elapsed
before any message arrived. We `match` on that result and either print the
message when we receive it successfully or print a notice about the timeout.
Finally, notice that we pin the messages after applying the timeout to them,
because the timeout helper produces a stream that needs to be pinned to be
polled.
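
A sketch of that loop, assuming the `std::time::Duration`, `std::pin::pin`,
and `trpl::StreamExt` imports this chapter has been using:

```rust,ignore
let mut messages =
    pin!(get_messages().timeout(Duration::from_millis(200)));

while let Some(result) = messages.next().await {
    match result {
        Ok(message) => println!("{message}"),
        // The timeout elapsed before the next message arrived.
        Err(reason) => eprintln!("Problem: {reason:?}"),
    }
}
```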
However, because there are no delays between messages, this timeout does not
change the behavior of the program. Let’s add a variable delay to the messages
we send, as shown in Listing 17-35.
<Listing number="17-35" caption="Sending messages through `tx` with an async delay without making `get_messages` an async function" file-name="src/main.rs">
</Listing>
In `get_messages`, we use the `enumerate` iterator method with the `messages`
array so that we can get the index of each item we’re sending along with the
item itself. Then we apply a 100-millisecond delay to even-index items and a
300-millisecond delay to odd-index items to simulate the different delays we
might see from a stream of messages in the real world. Because our timeout is
for 200 milliseconds, this should affect half of the messages.
To sleep between messages in the `get_messages` function without blocking, we
need to use async. However, we can’t make `get_messages` itself into an async
function, because then we’d return a `Future<Output = Stream<Item = String>>`
instead of a `Stream<Item = String>`. The caller would have to await
`get_messages` itself to get access to the stream. But remember: everything in a
given future happens linearly; concurrency happens _between_ futures. Awaiting
`get_messages` would require it to send all the messages, including the sleep
delay between each message, before returning the receiver stream. As a result,
the timeout would be useless. There would be no delays in the stream itself;
they would all happen before the stream was even available.
Instead, we leave `get_messages` as a regular function that returns a stream,
and we spawn a task to handle the async `sleep` calls.
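
Put together, a sketch of the reworked `get_messages` looks something like
this:

```rust,ignore
fn get_messages() -> impl Stream<Item = String> {
    let (tx, rx) = trpl::channel();

    // Spawn a task so the sleeps happen *after* the stream is returned.
    trpl::spawn_task(async move {
        let messages = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];

        for (index, message) in messages.into_iter().enumerate() {
            let time_to_sleep = if index % 2 == 0 { 100 } else { 300 };
            trpl::sleep(Duration::from_millis(time_to_sleep)).await;

            tx.send(format!("Message: '{message}'")).unwrap();
        }
    });

    ReceiverStream::new(rx)
}
```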
> Note: Calling `spawn_task` in this way works because we already set up our
> runtime; had we not, it would cause a panic. Other implementations choose
> different tradeoffs: they might spawn a new runtime and avoid the panic but
> end up with a bit of extra overhead, or they may simply not provide a
> standalone way to spawn tasks without reference to a runtime. Make sure you
> know what tradeoff your runtime has chosen and write your code accordingly!
Now our code has a much more interesting result. Between every other pair of
messages, we see a `Problem: Elapsed(())` error.
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-35
-->

```
--snip--
Problem: Elapsed(())
Message: 'j'
```
The timeout doesn’t prevent the messages from arriving in the end. We still get
all of the original messages, because our channel is _unbounded_: it can hold as
many messages as we can fit in memory. If the message doesn’t arrive before the
timeout, our stream handler will account for that, but when it polls the stream
again, the message may now have arrived.
You can get different behavior if needed by using other kinds of channels or
other kinds of streams more generally. Let’s see one of those in practice by
combining a stream of time intervals with this stream of messages.
### Merging Streams
First, let’s create another stream, which will emit an item every millisecond if
we let it run directly. For simplicity, we can use the `sleep` function to send
a message on a delay and combine it with the same approach we used in
`get_messages` of creating a stream from a channel. The difference is that this
time, we’re going to send back the count of intervals that have elapsed, so the
return type will be `impl Stream<Item = u32>`, and we can call the function
`get_intervals` (see Listing 17-36).
<Listing number="17-36" caption="Creating a stream with a counter that will be emitted once every millisecond" file-name="src/main.rs">
</Listing>
We start by defining a `count` in the task. (We could define it outside the
task, too, but it’s clearer to limit the scope of any given variable.) Then we
create an infinite loop. Each iteration of the loop asynchronously sleeps for
one millisecond, increments the count, and then sends it over the channel.
Because this is all wrapped in the task created by `spawn_task`, all of
it—including the infinite loop—will get cleaned up along with the runtime.
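
A sketch of `get_intervals` along those lines:

```rust,ignore
fn get_intervals() -> impl Stream<Item = u32> {
    let (tx, rx) = trpl::channel();

    trpl::spawn_task(async move {
        let mut count = 0;
        loop {
            // The await point here is what keeps this infinite loop from
            // blocking the rest of the program.
            trpl::sleep(Duration::from_millis(1)).await;
            count += 1;

            tx.send(count).unwrap();
        }
    });

    ReceiverStream::new(rx)
}
```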
This kind of infinite loop, which ends only when the whole runtime gets torn
down, is fairly common in async Rust: many programs need to keep running
indefinitely. With async, this doesn’t block anything else, as long as there is
at least one await point in each iteration through the loop.
Now, back in our main function’s async block, we can attempt to merge the
`messages` and `intervals` streams, as shown in Listing 17-37.
<Listing number="17-37" caption="Attempting to merge streams of messages and intervals" file-name="src/main.rs">
<Listing number="17-37" caption="Attempting to the `messages` and `intervals` streams" file-name="src/main.rs">
```rust,ignore,does_not_compile
{{#rustdoc_include ../listings/ch17-async-await/listing-17-37/src/main.rs:main}}
```
</Listing>
We start by calling `get_intervals`. Then we merge the `messages` and
`intervals` streams with the `merge` method, which combines multiple streams
into one stream that produces items from any of the source streams as soon as
the items are available, without imposing any particular ordering. Finally, we
loop over that combined stream instead of over `messages`.
At this point, neither `messages` nor `intervals` needs to be pinned or mutable,
because both will be combined into the single `merged` stream. However, this
call to `merge` doesn’t compile! (Neither does the `next` call in the `while
let` loop, but we’ll come back to that.) This is because the two streams have
different types. The `messages` stream has the type `Timeout<impl Stream<Item =
String>>`, where `Timeout` is the type that implements `Stream` for a `timeout`
call. The `intervals` stream has the type `impl Stream<Item = u32>`. To merge
these two streams, we need to transform one of them to match the other. We’ll
rework the intervals stream, because messages is already in the basic format we
want and has to handle timeout errors (see Listing 17-38).
<!-- We cannot directly test this one, because it never stops. -->
<Listing number="17-38" caption="Aligning the types of the the `intervals` stream with the type of the `messages` stream" file-name="src/main.rs">
<Listing number="17-38" caption="Aligning the type of the the `intervals` stream with the type of the `messages` stream" file-name="src/main.rs">
```rust,ignore
{{#rustdoc_include ../listings/ch17-async-await/listing-17-38/src/main.rs:main}}
```
</Listing>
First, we can use the `map` helper method to transform the `intervals` into a
string. Second, we need to match the `Timeout` from `messages`. Because we don’t
actually _want_ a timeout for `intervals`, though, we can just create a timeout
which is longer than the other durations we are using. Here, we create a
10-second timeout with `Duration::from_secs(10)`. Finally, we need to make
`stream` mutable, so that the `while let` loop’s `next` calls can iterate
through the stream, and pin it so that it’s safe to do so. That gets us _almost_
to where we need to be. Everything type checks. If you run this, though, there
will be two problems. First, it will never stop! You’ll need to stop it with
<span class="keystroke">ctrl-c</span>. Second, the messages from the English
alphabet will be buried in the midst of all the interval counter messages:
<!-- Not extracting output because changes to this output aren’t significant;
the changes are likely to be due to the tasks running differently rather than
changes in the compiler -->

```
--snip--
Interval: 43
--snip--
```
Listing 17-39 shows one way to solve these last two problems.
<Listing number="17-39" caption="Using `throttle` and `take` to manage the merged streams" file-name="src/main.rs">
</Listing>
First, we use the `throttle` method on the `intervals` stream so that it doesn’t
overwhelm the `messages` stream. _Throttling_ is a way of limiting the rate at
which a function will be called—or, in this case, how often the stream will be
polled. Once every 100 milliseconds should do, because that’s roughly how often
our messages arrive.
To limit the number of items we will accept from a stream, we apply the `take`
method to the `merged` stream, because we want to limit the final output, not
just one stream or the other.
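
In sketch form, the stream setup in `main` now looks roughly like this:

```rust,ignore
let messages = get_messages().timeout(Duration::from_millis(200));
let intervals = get_intervals()
    .map(|count| format!("Interval: {count}"))
    // Poll the underlying stream at most once every 100 milliseconds.
    .throttle(Duration::from_millis(100))
    .timeout(Duration::from_secs(10));
let merged = messages.merge(intervals).take(20);
let mut stream = pin!(merged);
```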
Now when we run the program, it stops after pulling 20 items from the stream,
and the intervals don’t overwhelm the messages. We also don’t get `Interval:
100` or `Interval: 200` or so on, but instead get `Interval: 1`, `Interval: 2`,
and so on—even though we have a source stream that _can_ produce an event every
millisecond. That’s because the `throttle` call produces a new stream that wraps
the original stream so that the original stream gets polled only at the throttle
rate, not its own “native” rate. We don’t have a bunch of unhandled interval
messages we’re choosing to ignore. Instead, we never produce those interval
messages in the first place! This is the inherent “laziness” of Rust’s futures
at work again, allowing us to choose our performance characteristics.
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-39
-->

```
--snip--
Interval: 12
--snip--
```
There’s one last thing we need to handle: errors! With both of these
channel-based streams, the `send` calls could fail when the other side of the
channel closes—and that’s just a matter of how the runtime executes the futures
that make up the stream. Up until now, we’ve ignored this possibility by calling
`unwrap`, but in a well-behaved app, we should explicitly handle the error, at
minimum by ending the loop so we don’t try to send any more messages. Listing
17-40 shows a simple error strategy: print the issue and then `break` from the
loops.
<Listing number="17-40" caption="Handling errors and shutting down the loops">
</Listing>
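
Sketched out, the change inside each spawned task looks something like this,
using the message-sending call from `get_messages` as the example:

```rust,ignore
if let Err(send_error) = tx.send(format!("Message: '{message}'")) {
    eprintln!("Cannot send message '{message}': {send_error}");
    // The other side of the channel is gone; stop sending.
    break;
}
```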
As usual, the correct way to handle a message send error will vary; just make
sure you have a strategy.
Now that we’ve seen a bunch of async in practice, let’s take a step back and dig
into a few of the details of how `Future`, `Stream`, and the other key traits
Rust uses to make async work.
[17-02-messages]: ch17-02-concurrency-with-async.html#message-passing
[iterator-trait]: ch13-02-iterators.html#the-iterator-trait-and-the-next-method

456
rustbook-en/src/ch17-05-traits-for-async.md

## A Closer Look at the Traits for Async
<!-- Old headings. Do not remove or links may break. -->
<a id="digging-into-the-traits-for-async"></a>
Throughout the chapter, we’ve used the `Future`, `Pin`, `Unpin`, `Stream`, and
`StreamExt` traits in various ways. So far, though, we’ve avoided getting too
far into the details of how they work or how they fit together, which is fine
most of the time for your day-to-day Rust work. Sometimes, though, you’ll
encounter situations where you’ll need to understand a few more of these
details. In this section, we’ll dig in just enough to help in those scenarios,
still leaving the _really_ deep dive for other documentation.
<!-- Old headings. Do not remove or links may break. -->
<a id="future"></a>
### The `Future` Trait
Let’s start by taking a closer look at how the `Future` trait works. Here’s how
Rust defines it:
```rust
use std::pin::Pin;
use std::task::{Context, Poll};

pub trait Future {
    type Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
```

First, `Future`’s associated type `Output` says what the future resolves to.
This is analogous to the `Item` associated type for the `Iterator` trait.
Second, `Future` also has the `poll` method, which takes a special `Pin`
reference for its `self` parameter and a mutable reference to a `Context` type,
and returns a `Poll<Self::Output>`. We’ll talk more about `Pin` and
`Context` in a moment. For now, let’s focus on what the method returns,
the `Poll` type:
```rust
enum Poll<T> {
    Ready(T),
    Pending,
}
```
This `Poll` type is similar to an `Option`. It has one variant that has a value,
`Ready(T)`, and one that does not, `Pending`. `Poll` means something quite
different from `Option`, though! The `Pending` variant indicates that the future
still has work to do, so the caller will need to check again later. The `Ready`
variant indicates that the future has finished its work and the `T` value is
available.
> Note: With most futures, the caller should not call `poll` again after the
> future has returned `Ready`. Many futures will panic if polled again after
> becoming ready. Futures that are safe to poll again will say so explicitly in
> their documentation. This is similar to how `Iterator::next` behaves.
When you see code that uses `await`, Rust compiles it under the hood to code
that calls `poll`. If you look back at Listing 17-4, where we printed out the
page title for a single URL once it resolved, Rust compiles it into something
kind of (although not exactly) like this:
```rust,ignore
match page_title(url).poll() {
    Ready(page_title) => match page_title {
        Some(title) => println!("The title for {url} was {title}"),
        None => println!("{url} had no title"),
    }
    Pending => {
        // But what goes here?
    }
}
```
What should we do when the future is still `Pending`? We need some way to try
again, and again, and again, until the future is finally ready. In other words,
we need a loop:
```rust,ignore
let mut page_title_fut = page_title(url);
loop {
    match page_title_fut.poll() {
        Ready(page_title) => match page_title {
            Some(title) => println!("The title for {url} was {title}"),
            None => println!("{url} had no title"),
        }
        Pending => {
            // Continue the loop and poll again.
        }
    }
}
```
If Rust compiled it to exactly that code, though, every `await` would be
blocking—exactly the opposite of what we were going for! Instead, Rust makes
sure that the loop can hand off control to something that can pause work on this
future to work on other futures and then check this one again later. As we’ve
seen, that something is an async runtime, and this scheduling and coordination
work is one of its main jobs.
Earlier in the chapter, we described waiting on `rx.recv`. The `recv` call
returns a future, and awaiting the future polls it. We noted that a runtime will
pause the future until it’s ready with either `Some(message)` or `None` when the
channel closes. With our deeper understanding of the `Future` trait, and
specifically `Future::poll`, we can see how that works. The runtime knows the
future isn’t ready when it returns `Poll::Pending`. Conversely, the runtime
knows the future _is_ ready and advances it when `poll` returns
`Poll::Ready(Some(message))` or `Poll::Ready(None)`.
The exact details of how a runtime does that are beyond the scope of this book,
but the key is to see the basic mechanics of futures: a runtime _polls_ each
future it is responsible for, putting the future back to sleep when it is not
yet ready.
<!-- Old headings. Do not remove or links may break. -->
<a id="pinning-and-the-pin-and-unpin-traits"></a>
### The `Pin` and `Unpin` Traits
When we introduced the idea of pinning in Listing 17-16, we ran into a very
gnarly error message. Here is the relevant part of it again:
<!-- manual-regeneration
cd listings/ch17-async-await/listing-17-16
-->

```text
--snip--
note: required by a bound in `futures_util::future::join_all::JoinAll`
| ^^^^^^ required by this bound in `JoinAll`
```
This error message tells us not only that we need to pin the values but also why
pinning is required. The `trpl::join_all` function returns a struct called
`JoinAll`. That struct is generic over a type `F`, which is constrained to
implement the `Future` trait. Directly awaiting a future with `await` pins the
future implicitly. That’s why we don’t need to use `pin!` everywhere we want to
await futures.
However, we’re not directly awaiting a future here. Instead, we construct a new
future, `JoinAll`, by passing a collection of futures to the `join_all`
function. The signature for `join_all` requires that the types of the items in
the collection all implement the `Future` trait, and `Box<T>` implements
`Future` only if the `T` it wraps is a future that implements the `Unpin` trait.
That’s a lot to absorb! To really understand it, let’s dive a little further
into how the `Future` trait actually works, in particular around _pinning_.
Look again at the definition of the `Future` trait:
```rust
use std::pin::Pin;
use std::task::{Context, Poll};

pub trait Future {
    type Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
```
The `cx` parameter and its `Context` type are the key to how a runtime actually
knows when to check any given future while still being lazy. Again, the details
of how that works are beyond the scope of this chapter, and you generally only
need to think about this when writing a custom `Future` implementation. We’ll
focus instead on the type for `self`, as this is the first time we’ve seen a
method where `self` has a type annotation. A type annotation for `self` works
like type annotations for other function parameters, but with two key
differences:
- It tells Rust what type `self` must be for the method to be called.
- It can’t be just any type. It’s restricted to the type on which the method is
implemented, a reference or smart pointer to that type, or a `Pin` wrapping a
reference to that type.
We’ll see more on this syntax in [Chapter 18][ch-18]<!-- ignore -->. For now,
it’s enough to know that if we want to poll a future to check whether it is
`Pending` or `Ready(Output)`, we need a `Pin`-wrapped mutable reference to the
type.
`Pin` is a wrapper for pointer-like types such as `&`, `&mut`, `Box`, and `Rc`.
(Technically, `Pin` works with types that implement the `Deref` or `DerefMut`
traits, but this is effectively equivalent to working only with pointers.) `Pin`
is not a pointer itself and doesn’t have any behavior of its own like `Rc` and
`Arc` do with reference counting; it’s purely a tool the compiler can use to
enforce constraints on pointer usage.
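
For example, we can pin a `Box<i32>` and, because `i32` happens to implement
the `Unpin` trait we discuss next, take the pointer back out again. This is a
quick illustration, not one of the chapter’s listings:

```rust
use std::pin::Pin;

fn main() {
    let boxed = Box::new(42);
    // `Pin` wraps the pointer type (`Box`); conceptually it pins the value
    // the pointer refers to.
    let pinned: Pin<Box<i32>> = Pin::new(boxed);
    // `Pin::new` and `Pin::into_inner` are only available because `i32`
    // implements `Unpin`.
    let unpinned: Box<i32> = Pin::into_inner(pinned);
    assert_eq!(*unpinned, 42);
}
```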
Recalling that `await` is implemented in terms of calls to `poll` starts to
explain the error message we saw earlier, but that was in terms of `Unpin`, not
`Pin`. So how exactly does `Pin` relate to `Unpin`, and why does `Future` need
`self` to be in a `Pin` type to call `poll`?
Remember from earlier in this chapter a series of await points in a future get
compiled into a state machine, and the compiler makes sure that state machine
follows all of Rust’s normal rules around safety, including borrowing and
ownership. To make that work, Rust looks at what data is needed between one
await point and either the next await point or the end of the async block. It
then creates a corresponding variant in the compiled state machine. Each variant
gets the access it needs to the data that will be used in that section of the
source code, whether by taking ownership of that data or by getting a mutable or
immutable reference to it.
So far, so good: if we get anything wrong about the ownership or references in a
given async block, the borrow checker will tell us. When we want to move around
the future that corresponds to that block—like moving it into a `Vec` to pass
the future that corresponds to that block—like moving it into a `Vec` to pass to
`join_all`—things get trickier.
When we move a future—whether by pushing it into a data structure to use as an
iterator with `join_all` or by returning it from a function—that actually means
moving the state machine Rust creates for us. And unlike most other types in
Rust, the futures Rust creates for async blocks can end up with references to
themselves in the fields of any given variant, as shown in the simplified illustration in Figure 17-4.
<figure>
<img alt="Concurrent work flow" src="img/trpl17-04.svg" class="center" />
<img alt="A single-column, three-row table representing a future, fut1, which has data values 0 and 1 in the first two rows and an arrow pointing from the third row back to the second row, representing an internal reference within the future." src="img/trpl17-04.svg" class="center" />
<figcaption>Figure 17-4: A self-referential data type.</figcaption>
</figure>
By default, though, any object that has a reference to itself is unsafe to move,
because references always point to the actual memory address of whatever they
refer to (see Figure 17-5). If you move the data structure itself, those
internal references will be left pointing to the old location. However, that
memory location is now invalid. For one thing, its value will not be updated
when you make changes to the data structure. For another—more important—thing,
the computer is now free to reuse that memory for other purposes! You could end
up reading completely unrelated data later.
<figure>
<img alt="Concurrent work flow" src="img/trpl17-05.svg" class="center" />
<img alt="Two tables, depicting two futures, fut1 and fut2, each of which has one column and three rows, representing the result of having moved a future out of fut1 into fut2. The first, fut1, is grayed out, with a question mark in each index, representing unknown memory. The second, fut2, has 0 and 1 in the first and second rows and an arrow pointing from its third row back to the second row of fut1, representing a pointer that is referencing the old location in memory of the future before it was moved." src="img/trpl17-05.svg" class="center" />
<figcaption>Figure 17-5: The unsafe result of moving a self-referential data type</figcaption>
</figure>
Theoretically, the Rust compiler could try to update every reference to an
object whenever it gets moved, but that could add a lot of performance overhead,
especially if a whole web of references needs updating. If we could instead make
sure the data structure in question _doesn’t move in memory_, we wouldn’t have
to update any references. This is exactly what Rust’s borrow checker requires:
in safe code, it prevents you from moving any item with an active reference to
it.
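
The same rule is easy to see in ordinary synchronous code:

```rust,ignore,does_not_compile
fn main() {
    let s1 = String::from("hello");
    let r = &s1;
    let s2 = s1; // error[E0505]: cannot move out of `s1` because it is borrowed
    println!("{r} {s2}");
}
```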
`Pin` builds on that to give us the exact guarantee we need. When we _pin_ a
value by wrapping a pointer to that value in `Pin`, it can no longer move. Thus,
if you have `Pin<Box<SomeType>>`, you actually pin the `SomeType` value, _not_
the `Box` pointer. Figure 17-6 illustrates this process.
<figure>
<img alt="Concurrent work flow" src="img/trpl17-06.svg" class="center" />
<img alt="Three boxes laid out side by side. The first is labeled “Pin”, the second “b1”, and the third “pinned”. Within “pinned” is a table labeled “fut”, with a single column; it represents a future with cells for each part of the data structure. Its first cell has the value “0”, its second cell has an arrow coming out of it and pointing to the fourth and final cell, which has the value “1” in it, and the third cell has dashed lines and an ellipsis to indicate there may be other parts to the data structure. All together, the “fut” table represents a future which is self-referential. An arrow leaves the box labeled “Pin”, goes through the box labeled “b1” and has terminates inside the “pinned” box at the “fut” table." src="img/trpl17-06.svg" class="center" />
<figcaption>Figure 17-6: Pinning a `Box` that points to a self-referential future type.</figcaption>
</figure>
In fact, the `Box` pointer can still move around freely. Remember: we care about
making sure the data ultimately being referenced stays in place. If a pointer
moves around, _but the data it points to is in the same place_, as in Figure
17-7, there’s no potential problem. As an independent exercise, look at the docs
for the types as well as the `std::pin` module and try to work out how you’d do
this with a `Pin` wrapping a `Box`. The key is that the self-referential type
itself cannot move, because it is still pinned.
<figure>
<img alt="Concurrent work flow" src="img/trpl17-07.svg" class="center" />
<img alt="Four boxes laid out in three rough columns, identical to the previous diagram with a change to the second column. Now there are two boxes in the second column, labeled “b1” and “b2”, “b1” is grayed out, and the arrow from “Pin” goes through “b2” instead of “b1”, indicating that the pointer has moved from “b1” to “b2”, but the data in “pinned” has not moved." src="img/trpl17-07.svg" class="center" />
<figcaption>Figure 17-7: Moving a `Box` which points to a self-referential future type.</figcaption>
</figure>
However, most types are perfectly safe to move around, even if they happen to be
behind a `Pin` pointer. We only need to think about pinning when items have
internal references. Primitive values such as numbers and Booleans don’t have
any internal references, so they’re obviously safe. Neither do most types you
normally work with in Rust. You can move around
a `Vec`, for example, without worrying. Given only what we have seen so far, if
you have a `Pin<Vec<String>>`, you’d have to do everything via the safe but
restrictive APIs provided by `Pin`, even though a `Vec<String>` is always safe
to move if there are no other references to it. We need a way to tell the
compiler that it’s fine to move items around in cases like this—and that’s
where `Unpin` comes into play.
`Unpin` is a marker trait, similar to the `Send` and `Sync` traits we saw in
Chapter 16, and thus has no functionality of its own. Marker traits exist only
to tell the compiler it’s safe to use the type implementing a given trait in a
particular context. `Unpin` informs the compiler that a given type does _not_
need to uphold any guarantees about whether the value in question can be safely
moved.
<!--
The inline `<code>` in the next block is to allow the inline `<em>` inside it,
matching what NoStarch does style-wise, and emphasizing within the text here
that it is something distinct from a normal type.
-->
Just as with `Send` and `Sync`, the compiler implements `Unpin` automatically
for all types where it can prove it is safe. The special case, again similar to
`Send` and `Sync`, is the case where `Unpin` is _not_ implemented for a type.
The notation for this is `impl !Unpin for SomeType`, where `SomeType` is the
name of a type which _does_ need to uphold those guarantees to be safe whenever
a pointer to that type is used in a `Pin`.
for all types where it can prove it is safe. A special case, again similar to
`Send` and `Sync`, is where `Unpin` is _not_ implemented for a type. The
notation for this is <code>impl !Unpin for <em>SomeType</em></code>, where
<code><em>SomeType</em></code> is the name of a type that _does_ need to uphold
those guarantees to be safe whenever a pointer to that type is used in a `Pin`.
In other words, there are two things to keep in mind about the relationship
between `Pin` and `Unpin`. First, `Unpin` is the “normal” case, and `!Unpin` is
the special case. Second, whether a type implements `Unpin` or `!Unpin` _only_
matters when you’re using a pinned pointer to that type like <code>Pin<&mut
<em>SomeType</em>></code>.
To make that concrete, think about a `String`: it has a length and the Unicode
characters that make it up. We can wrap a `String` in `Pin`, as seen in Figure
17-8. However, `String` automatically implements `Unpin`, as do most other types
in Rust.
<figure>
<img alt="Concurrent work flow" src="img/trpl17-08.svg" class="center" />
<figcaption>Figure 17-8: Pinning a `String`; the dotted line indicates that the `String` implements the `Unpin` trait, and thus is not pinned.</figcaption>
</figure>
As a result, we can do things that would be illegal if `String` implemented
`!Unpin` instead, such as replacing one string with another at the exact same
location in memory as in Figure 17-9. This doesn’t violate the `Pin` contract,
because `String` has no internal references that make it unsafe to move around!
That is precisely why it implements `Unpin` rather than `!Unpin`.
<figure>
<img alt="Concurrent work flow" src="img/trpl17-09.svg" class="center" />
<figcaption>Figure 17-9: Replacing the `String` with an entirely different `String` in memory.</figcaption>
</figure>
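
In code, that kind of replacement might look like the following, as an
illustration of the idea rather than one of the chapter’s listings:

```rust
use std::pin::Pin;

fn main() {
    let mut s = String::from("first");
    // `Pin::new` is available here precisely because `String: Unpin`.
    let mut pinned: Pin<&mut String> = Pin::new(&mut s);
    // Replacing the value behind the pinned pointer is fine for `Unpin` types.
    pinned.set(String::from("second"));
    assert_eq!(s, "second");
}
```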
Now we know enough to understand the error messages from the `join_all` call.
We originally tried to move the futures produced by async blocks into a
`Vec<Box<dyn Future<Output = ()>>>`, but as we’ve seen, those futures may have
internal references, so they don’t implement `Unpin`.
They need to be pinned, and then we can pass the `Pin` type into the `Vec`,
confident that the underlying data in the futures will _not_ be moved.
`Pin` and `Unpin` are mostly important for building lower-level libraries, or
when you’re building a runtime itself, rather than for day-to-day Rust code.
When you see these traits in error messages, though, now you’ll have a better
idea of how to fix your code!
> Note: This combination of `Pin` and `Unpin` makes it possible to safely
> implement a whole class of complex types in Rust that would otherwise prove
> challenging because they’re self-referential. Types that require `Pin` show up
> most commonly in async Rust today, but every once in a while, you might see
> them in other contexts, too.
>
> The specifics of how `Pin` and `Unpin` work, and the rules they’re required
> to uphold, are covered extensively in the API documentation for `std::pin`, so
> if you’re interested in learning more, that’s a great place to start.
>
> If you want to understand how things work under the hood in even more detail,
> see Chapters [2][under-the-hood] and [4][pinning] of [_Asynchronous
> Programming in Rust_][async-book].
### The `Stream` Trait
Now that you have a deeper grasp on the `Future`, `Pin`, and `Unpin` traits, we
can turn our attention to the `Stream` trait. As you learned earlier in the
chapter, streams are similar to asynchronous iterators. Unlike `Iterator` and
`Future`, however, `Stream` has no definition in the standard library as of this
writing, but there _is_ a very common definition from the `futures` crate used
throughout the ecosystem.
Let’s review the definitions of the `Iterator` and `Future` traits before
looking at how a `Stream` trait might merge them together. From `Iterator`, we
have the idea of a sequence: its `next` method provides an `Option<Self::Item>`.
From `Future`, we have the idea of readiness over time: its `poll` method
provides a `Poll<Self::Output>`. To represent a sequence of items that become
ready over time, we define a `Stream` trait that puts those features together:
```rust
use std::pin::Pin;
use std::task::{Context, Poll};

trait Stream {
    type Item;

    fn poll_next(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Option<Self::Item>>;
}
```
The `Stream` trait defines an associated type called `Item` for the type of the
items produced by the stream. This is similar to `Iterator`, where there may be
zero to many items, and unlike `Future`, where there is always a single
`Output`, even if it’s the unit type `()`.
`Stream` also defines a method to get those items. We call it `poll_next`, to
make it clear that it polls in the same way `Future::poll` does and produces a
sequence of items in the same way `Iterator::next` does. Its return type
combines `Poll` with `Option`. The outer type is `Poll`, because it has to be
checked for readiness, just as a future does. The inner type is `Option`,
because it needs to signal whether there are more messages, just as an iterator
does.
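
Concretely, each call to `poll_next` resolves to one of three cases. In this
sketch, `stream` stands in for a pinned stream and `cx` for a `Context`:

```rust,ignore
match stream.as_mut().poll_next(cx) {
    Poll::Ready(Some(_item)) => { /* a new item is ready */ }
    Poll::Ready(None) => { /* the stream is finished */ }
    Poll::Pending => { /* nothing ready yet; check again later */ }
}
```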
Something very similar to this definition will likely end up as part of Rust’s
standard library. In the meantime, it’s part of the toolkit of most runtimes, so
you can rely on it, and everything we cover next should generally apply!
In the example we saw in the section on streaming, though, we didn’t use
`poll_next` _or_ `Stream`, but instead used `next` and `StreamExt`. We _could_
work directly in terms of the `poll_next` API by hand-writing our own `Stream`
state machines, of course, just as we _could_ work with futures directly via
their `poll` method. Using `await` is much nicer, though, and the `StreamExt`
trait supplies the `next` method so we can do just that:
```rust
{{#rustdoc_include ../listings/ch17-async-await/no-listing-stream-ext/src/lib.rs:here}}
```

<!--
in traits, since the lack thereof is the reason they do not yet have this.
-->
> Note: The actual definition we used earlier in the chapter looks slightly
> different than this, because it supports versions of Rust that did not yet
> support using async functions in traits. As a result, it looks like this:
>
> ```rust,ignore
> fn next(&mut self) -> Next<'_, Self> where Self: Unpin;
> ```
>
> That `Next` type is a `struct` that implements `Future` and allows us to name
> the lifetime of the reference to `self` with `Next<'_, Self>`, so that `await`
> can work with this method.
The `StreamExt` trait is also the home of all the interesting methods available
to use with streams. `StreamExt` is automatically implemented for every type
that implements `Stream`, but these traits are defined separately to enable the
community to iterate on convenience APIs without affecting the foundational
trait.
In the version of `StreamExt` used in the `trpl` crate, the trait not only
defines the `next` method but also supplies a default implementation of `next`
that correctly handles the details of calling `Stream::poll_next`. This means
that even when you need to write your own streaming data type, you _only_ have
to implement `Stream`, and then anyone who uses your data type can use
`StreamExt` and its methods with it automatically.
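
For example, here is a sketch of a hand-written stream using the `futures`
crate’s `Stream` trait; the `Counter` type is hypothetical:

```rust,ignore
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::Stream;

/// A stream that yields 1..=max and then ends.
struct Counter {
    count: u32,
    max: u32,
}

impl Stream for Counter {
    type Item = u32;

    fn poll_next(
        mut self: Pin<&mut Self>,
        _cx: &mut Context<'_>,
    ) -> Poll<Option<Self::Item>> {
        // `Counter` has no internal references, so it is `Unpin` and we can
        // freely mutate it through the pinned reference.
        if self.count < self.max {
            self.count += 1;
            Poll::Ready(Some(self.count))
        } else {
            Poll::Ready(None)
        }
    }
}
```

With only that implementation in place, `StreamExt` comes for free, so callers
can write `counter.next().await`, `counter.map(...)`, and so on.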
That’s all we’re going to cover for the lower-level details on these traits. To
wrap up, let’s consider how futures (including streams), tasks, and threads all
fit together!
[ch-18]: ch18-00-oop.html
[async-book]: https://rust-lang.github.io/async-book/
[under-the-hood]: https://rust-lang.github.io/async-book/02_execution/01_chapter.html
[pinning]: https://rust-lang.github.io/async-book/04_pinning/01_chapter.html

137
rustbook-en/src/ch17-06-futures-tasks-threads.md

## Putting It All Together: Futures, Tasks, and Threads
As we saw in [Chapter 16][ch16]<!-- ignore -->, threads provide one approach to
concurrency. We’ve seen another approach in this chapter: using async with
futures and streams. If you’re wondering when to choose one method over the other,
the answer is: it depends! And in many cases, the choice isn’t threads _or_
async but rather threads _and_ async.
Many operating systems have supplied threading-based concurrency models for
decades now, and many programming languages support them as a result. However,
these models are not without their tradeoffs. On many operating systems, they
use a fair bit of memory for each thread, and they come with some overhead for
starting up and shutting down. Threads are also only an option when your
operating system and hardware support them. Unlike mainstream desktop and mobile
computers, some embedded systems don’t have an OS at all, so they also don’t
have threads.
The async model provides a different—and ultimately complementary—set of
tradeoffs. In the async model, concurrent operations don’t require their own
threads. Instead, they can run on tasks, as when we used `trpl::spawn_task` to
kick off work from a synchronous function in the streams section. A task is
similar to a thread, but instead of being managed by the operating system, it’s
managed by library-level code: the runtime.
In the previous section, we saw that we could build a stream by using an async
channel and spawning an async task we could call from synchronous code. We can
do the exact same thing with a thread. In Listing 17-40, we used
`trpl::spawn_task` and `trpl::sleep`. In Listing 17-41, we replace those with
the `thread::spawn` and `thread::sleep` APIs from the standard library in the
`get_intervals` function.
@@ -37,15 +37,16 @@ the `thread::spawn` and `thread::sleep` APIs from the standard library in the
</Listing>
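Since the listing appears here in abbreviated form, the following sketch shows
the shape the thread-based `get_intervals` takes. It assumes the `trpl` helpers
used earlier in the chapter (`trpl::channel` and `ReceiverStream`); Listing
17-41 remains the authoritative version:

```rust
use std::{thread, time::Duration};

use trpl::{ReceiverStream, Stream};

// A sketch of the thread-based `get_intervals`; see Listing 17-41
// for the book's version.
fn get_intervals() -> impl Stream<Item = u32> {
    let (tx, rx) = trpl::channel();

    // `thread::spawn` and `thread::sleep` stand in for
    // `trpl::spawn_task` and `trpl::sleep`; the overall shape of
    // the function is unchanged.
    thread::spawn(move || {
        let mut count = 0;
        loop {
            thread::sleep(Duration::from_millis(1));
            count += 1;
            if tx.send(count).is_err() {
                // The receiver was dropped, so stop producing intervals.
                break;
            }
        }
    });

    // Wrap the receiver so callers can consume it as a `Stream`.
    ReceiverStream::new(rx)
}
```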
If you run this code, the output is identical to that of Listing 17-40. And
notice how little changes here from the perspective of the calling code. What’s
more, even though one of our functions spawned an async task on the runtime and
the other spawned an OS thread, the resulting streams were unaffected by the
differences.
Despite their similarities, these two approaches behave very differently,
although we might have a hard time measuring it in this very simple example. We
could spawn millions of async tasks on any modern personal computer. If we tried
to do that with threads, we would literally run out of memory!
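As a rough illustration, and a sketch that assumes the `trpl` helpers rather
than reproducing any listing from the book, the async version of spawning a
million units of work is unremarkable:

```rust
fn main() {
    trpl::run(async {
        let mut handles = Vec::new();

        // A million tasks: each one is just a small heap allocation.
        for i in 0..1_000_000 {
            handles.push(trpl::spawn_task(async move { i }));
        }

        for handle in handles {
            handle.await.unwrap();
        }

        // Swapping `trpl::spawn_task` for `std::thread::spawn` here
        // would try to create a million OS threads, each with its own
        // stack, and run out of memory instead.
    });
}
```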
However, there’s a reason these APIs are so similar. Threads act as a boundary
for sets of synchronous operations; concurrency is possible _between_ threads.
@@ -58,48 +59,45 @@ that regard, tasks are similar to lightweight, runtime-managed threads with
added capabilities that come from being managed by a runtime instead of by the
operating system.
This doesn’t mean that async tasks are always better than threads (or vice
versa). Concurrency with threads is in some ways a simpler programming model
than concurrency with `async`. That can be a strength or a weakness. Threads are
somewhat “fire and forget”; they have no native equivalent to a future, so they
simply run to completion without being interrupted except by the operating
system itself. That is, they have no built-in support for _intratask
concurrency_ the way futures do. Threads in Rust also have no mechanisms for
cancellation—a subject we haven’t covered explicitly in this chapter, but one
implied by the fact that whenever we ended a future, its state got cleaned up
correctly.
These limitations also make threads harder to compose than futures. It’s much
more difficult, for example, to use threads to build helpers such as the
`timeout` and `throttle` methods we built earlier in this chapter. The fact that
futures are richer data structures means they can be composed together more
naturally, as we have seen.
Tasks, then, give us _additional_ control over futures, allowing us to choose
where and how to group them. And it turns out that threads and tasks often work
very well together, because tasks can (at least in some runtimes) be moved
around between threads. In fact, under the hood, the runtime we’ve been
using—including the `spawn_blocking` and `spawn_task` functions—is multithreaded
by default! Many runtimes use an approach called _work stealing_ to
transparently move tasks around between threads, based on how the threads are
currently being utilized, to improve the system’s overall performance. That
approach actually requires threads _and_ tasks, and therefore futures.
When thinking about which method to use when, consider these rules of thumb:
- If the work is _very parallelizable_, such as processing a bunch of data where
each part can be processed separately, threads are a better choice.
- If the work is _very concurrent_, such as handling messages from a bunch of
different sources that may come in at different intervals or different rates,
async is a better choice.
And if you need both parallelism and concurrency, you don’t have to choose
between threads and async. You can use them together freely, letting each one
play the part it’s best at. For example, Listing 17-42 shows a fairly common
example of this kind of mix in real-world Rust code.
<Listing number="17-42" caption="Sending messages with blocking code in a thread and awaiting the messages in an async block" file-name="src/main.rs">
@@ -109,32 +107,33 @@ common example of this kind of mix in real-world Rust code.
</Listing>
We begin by creating an async channel, then spawn a thread that takes
ownership of the sender side of the channel. Within the thread, we send the
numbers 1 through 10, sleeping for a second between each. Finally, we run a
future created with an async block passed to `trpl::run` just as we have
throughout the chapter. In that future, we await those messages, just as in
the other message-passing examples we have seen.
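In sketch form, and assuming the `trpl` channel and runtime helpers (Listing
17-42 has the actual code), the pattern that paragraph describes looks roughly
like this:

```rust
use std::{thread, time::Duration};

// A sketch of the thread-plus-async mix described above; see
// Listing 17-42 for the book's version.
fn main() {
    let (tx, mut rx) = trpl::channel();

    // Blocking code runs in a dedicated OS thread...
    thread::spawn(move || {
        for i in 1..11 {
            tx.send(i).unwrap();
            thread::sleep(Duration::from_secs(1));
        }
    });

    // ...while a future awaits the messages as they arrive.
    trpl::run(async {
        while let Some(message) = rx.recv().await {
            println!("{message}");
        }
    });
}
```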
To return to the scenario we opened the chapter with, imagine running a set of
video encoding tasks using a dedicated thread (because video encoding is
compute-bound) but notifying the UI that those operations are done with an async
channel. There are countless examples of these kinds of combinations in
real-world use cases.
## Summary
This isn’t the last you’ll see of concurrency in this book. The project in
[Chapter 21][ch21] will apply these concepts in a more realistic situation
than the simpler examples discussed here and compare solving these kinds of
problems with threads versus with tasks and futures more directly.
No matter which of these approaches you choose, Rust gives you the tools you need to write safe, fast, concurrent
code—whether for a high-throughput web server or an embedded operating system.
Next, we’ll talk about idiomatic ways to model problems and structure solutions
as your Rust programs get bigger. In addition, we’ll discuss how Rust’s idioms
relate to those you might be familiar with from object-oriented programming.
[ch16]: ch16-00-concurrency.html
[ch21]: ch21-00-final-project-a-web-server.html
14
rustbook-en/src/ch20-01-unsafe-rust.md
@@ -186,14 +186,11 @@ With the `unsafe` block, we’re asserting to Rust that we’ve read the functio
documentation, we understand how to use it properly, and we’ve verified that
we’re fulfilling the contract of the function.
To perform unsafe operations in the body of an unsafe function, you still need
to use an `unsafe` block just as within a regular function, and the compiler
will warn you if you forget. This helps to keep `unsafe` blocks as small as
possible, as unsafe operations may not be needed across the whole function
body.
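For example, here is a minimal sketch, not one of the book’s listings, of an
unsafe function whose body mixes ordinary safe code with a single unsafe
operation:

```rust
/// # Safety
///
/// `ptr` must point to an array of `i32` values in which `offset`
/// is in bounds.
unsafe fn read_at(ptr: *const i32, offset: usize) -> i32 {
    // Ordinary safe code can live in an unsafe fn without any ceremony.
    println!("reading at offset {offset}");

    // Only the unsafe operations themselves go in the block; the
    // compiler warns if it is missing.
    unsafe { *ptr.add(offset) }
}

fn main() {
    let values = [10, 20, 30];
    // SAFETY: the pointer comes from a live array and the offset is
    // in bounds.
    let second = unsafe { read_at(values.as_ptr(), 1) };
    println!("{second}");
}
```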
#### Creating a Safe Abstraction over Unsafe Code
@@ -550,5 +547,6 @@ Rust’s official guide to the subject, the [Rustonomicon][nomicon].
[the-slice-type]: ch04-03-slices.html#the-slice-type
[reference]: ../reference/items/unions.html
[miri]: https://github.com/rust-lang/miri
[editions]: appendix-05-editions.html
[nightly]: appendix-07-nightly-rust.html
[nomicon]: https://doc.rust-lang.org/nomicon/