AsyncSequence & AsyncStream Tutorial for iOS

source link: https://www.raywenderlich.com/34044359-asyncsequence-asyncstream-tutorial-for-ios

Learn how to use Swift concurrency’s AsyncSequence protocol and AsyncStream type to process asynchronous sequences.

By Audrey Tam Jun 29 2022 · Article (20 mins) · Intermediate

You’ve embraced async/await as the newest and safest way to code for concurrency in Swift. You’re loving how eliminating a lot of the nested completion handlers reduces the amount of code you write and simplifies that code’s logic so it’s easier to get it right.

And what’s the next step in your Swift concurrency journey? Asynchronous loops. Using Swift concurrency’s AsyncSequence protocol and AsyncStream type, this is as easy as looping over an ordinary sequence.

In this tutorial, you’ll:

  • Compare the speed and memory use when synchronously and asynchronously reading a very large file.
  • Create and use a custom AsyncSequence.
  • Create and use pull-based and push-based AsyncStreams.

Note: This is an intermediate-level tutorial. You should be familiar with “traditional” concurrency (Grand Central Dispatch and URLSession) and with basic Swift concurrency features like those presented in the tutorials “async/await in SwiftUI” and “SwiftUI and Structured Concurrency”.

Getting Started

Use the Download Materials button at the top or bottom of this tutorial to download the starter project. Open it in Xcode to see what you have to work with.

Data Files

The purpose of ActorSearch is to help you solve puzzles that ask for actor names by searching the name.basics.tsv.gz dataset from IMDb Datasets. This file contains a header line to describe the information for each name:

  • nconst (string) – alphanumeric unique identifier of the name/person
  • primaryName (string) – name by which the person is most often credited
  • birthYear – in YYYY format
  • deathYear – in YYYY format if applicable, else ‘\N’
  • primaryProfession (array of strings) – the top three professions of the person
  • knownForTitles (array of tconsts) – titles the person is known for

To reduce the demand on your network and make it straightforward to read in line by line, the starter project already contains data.tsv: This is the unzipped name.basics.tsv.gz, with the header line removed. It’s a tab-separated-values (TSV) file, formatted in the UTF-8 character set.

Note: Don’t try to view data.tsv by selecting it in the project navigator. It takes a long time to open, and Xcode becomes unresponsive.

In this tutorial, you’ll explore different ways to read the file contents into an array of Actor values. data.tsv contains 11,445,101 lines and takes a very long time to read in, so you’ll use it only to compare memory use. You’ll try out most of your code on the smaller files data-100.tsv and data-1000.tsv, which contain the first 100 and 1000 lines, respectively.

Note: These files are only in the starter project. Copy them into the final project if you want to build and run that project.

Models

Open ActorAPI.swift. Actor is a super-simple structure with only two properties: id and name.

In this file, you’ll implement different methods to read a data file. The ActorAPI initializer takes a filename argument and creates the url. It’s an ObservableObject that publishes an Actor array.

The starter contains a basic synchronous method:

func readSync() throws {
  let start = Date.now
  let contents = try String(contentsOf: url)
  var counter = 0
  contents.enumerateLines { _, _ in
    counter += 1
  }
  print("\(counter) lines")
  print("Duration: \(Date.now.timeIntervalSince(start))")
}

This just creates a String from the contentsOf the file’s url, then counts the lines and prints this number and how long it took.

Open ContentView.swift. ContentView creates an ActorAPI object with a specific filename and displays the Actor array, with a search field.

First, add this view modifier below the searchable(text:) closure:

.onAppear {
  do {
    try model.readSync()
  } catch let error {
    print(error.localizedDescription)
  }
}

You call readSync() when the view appears, catching and printing any errors readSync() throws.

Now, look at the memory use when you run this app. Open the Debug navigator, then build and run. When the gauges appear, select Memory and watch:

Synchronous read memory spike

On my Mac, reading in this 685MB file took 8.9 seconds and produced a 1.9GB spike in memory use.

Next, you’ll try out a Swift concurrency way to read the file. You’ll iterate over an asynchronous sequence.

AsyncSequence

You work with the Sequence protocol all the time: arrays, dictionaries, strings, ranges and Data are all sequences. They come with a lot of convenient methods, like contains(), filter(), map() and more. Looping over a sequence uses its built-in iterator, calling the iterator’s next() method until it returns nil.
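
To make the iterator mechanics concrete, here is a minimal sketch of what a for-in loop does behind the scenes. The function name sumByHand is hypothetical and only serves the illustration.

```swift
// A for-in loop over a Sequence, written out by hand.
func sumByHand(_ numbers: [Int]) -> Int {
  var total = 0
  // for-in first asks the sequence for an iterator...
  var iterator = numbers.makeIterator()
  // ...then calls next() until it returns nil.
  while let number = iterator.next() {
    total += number
  }
  return total
}
```

The asynchronous version you meet next follows the same shape, except that each call to next() can suspend.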

The AsyncSequence protocol works like Sequence, but an asynchronous sequence returns each element asynchronously (duh!). You can iterate over its elements asynchronously as more elements become available over time.

  • You await each element, so the sequence can suspend while getting or calculating the next value.
  • The sequence might generate elements faster than your code can use them: One kind of AsyncStream buffers its values, so your app can read them when it needs them.

AsyncSequence provides language support for asynchronously processing collections of data. There are built-in AsyncSequences like NotificationCenter.Notifications, URLSession.bytes(from:delegate:) and its subsequences lines and characters. And you can create your own custom asynchronous sequences with AsyncSequence and AsyncIteratorProtocol or use AsyncStream.

Note: Apple’s AsyncSequence documentation page lists all the built-in asynchronous sequences.

Reading a File Asynchronously

To process a dataset directly from a URL, Foundation’s URL type provides a built-in AsyncSequence: its lines property delivers the resource’s contents asynchronously, one line at a time.

Open ActorAPI.swift and add this method to ActorAPI:

// Asynchronous read
func readAsync() async throws {
  let start = Date.now

  var counter = 0
  for try await _ in url.lines {
    counter += 1
  }
  print("\(counter) lines")

  print("Duration: \(Date.now.timeIntervalSince(start))")
}

You iterate asynchronously over the asynchronous sequence, counting lines as you go.

Here’s some Swift concurrency magic: url.lines has its own asynchronous iterator, and the for loop calls its next() method until the sequence signals it’s finished by returning nil.
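
If you’re curious what that magic roughly expands to, the sketch below writes the loop out by hand for any AsyncSequence. The helper name countElements(of:) is an assumption made for this illustration, not part of the starter project.

```swift
// Counts the elements of any AsyncSequence without using for-await.
func countElements<S: AsyncSequence>(of sequence: S) async throws -> Int {
  var counter = 0
  // for try await first asks the sequence for its asynchronous iterator...
  var iterator = sequence.makeAsyncIterator()
  // ...then awaits next() until it returns nil.
  while try await iterator.next() != nil {
    counter += 1
  }
  return counter
}
```

Called as try await countElements(of: url.lines), this should report the same count as readAsync(), though in practice you’d keep the for try await loop because it’s shorter and harder to get wrong.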

Note: URLSession has a method that gets an asynchronous sequence of bytes and the usual URLResponse object. You can check the response status code, then call lines on this sequence of bytes to convert it into an asynchronous sequence of lines.

let (stream, response) = try await URLSession.shared.bytes(from: url)
guard (response as? HTTPURLResponse)?.statusCode == 200 else {
  // String isn't an Error by default, so throw a Foundation error instead.
  throw URLError(.badServerResponse)
}
for try await line in stream.lines { 
  // ... 
}

Calling an Asynchronous Method From a View

To call an asynchronous method from a SwiftUI view, you use the task(priority:_:) view modifier.

In ContentView, comment out the onAppear(perform:) closure and add this code:

.task {
  do {
    try await model.readAsync()
  } catch let error {
    print(error.localizedDescription)
  }
}

Open the Debug navigator, then build and run. When the gauges appear, select Memory and watch:

Asynchronous read memory use

On my Mac, reading in the file took 3.7 seconds, and memory use was a steady 68MB. Quite a difference!

On each iteration of the for loop, the lines sequence reads more data from the URL. Because this happens in chunks, memory usage stays constant.

Getting Actors

It’s time to fill the actors array so the app has something to display.

Add this method to ActorAPI:

func getActors() async throws {
  for try await line in url.lines {
    let name = line.components(separatedBy: "\t")[1]
    await MainActor.run {
      actors.append(Actor(name: name))
    }
  }
}

Instead of counting lines, you extract the name from each line, use it to create an Actor instance, then append this to actors. Because actors is a published property used by a SwiftUI view, modifying it must happen on the main thread, which is why you wrap the append in MainActor.run.
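
Indexing the split line with [1] traps if a line has fewer than two columns. That’s fine for the known-good data files here, but for less predictable input you might prefer something like the sketch below. The helper name primaryName(in:) and the sample line in the usage note are hypothetical.

```swift
/// Returns the second tab-separated column of a line (primaryName in
/// this dataset), or nil when the line has fewer than two columns.
func primaryName(in line: String) -> String? {
  // Keep empty fields so column positions stay stable.
  let fields = line.split(separator: "\t", omittingEmptySubsequences: false)
  guard fields.count > 1 else { return nil }
  return String(fields[1])
}
```

For a made-up line such as "nm0000000\tExample Name\t1900\t\\N", this returns "Example Name". For an empty line it returns nil instead of crashing.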

Now, in ContentView, in the task closure, replace try await model.readAsync() with this:

try await model.getActors()

Also, update the declaration of model with one of the smaller data files, either data-100.tsv or data-1000.tsv:

@StateObject private var model = ActorAPI(filename: "data-100")

Build and run.

List of actors with search field

The list appears pretty quickly. Pull down the screen to see the search field and try out some searches. Use the simulator’s software keyboard (Command-K) to make it easier to uncapitalize the first letter of the search term.

Custom AsyncSequence

So far, you’ve been using the asynchronous sequence built into the URL API. You can also create your own custom AsyncSequence, like an AsyncSequence of Actor values.

To define an AsyncSequence over a dataset, you conform to its protocol and construct an AsyncIterator that returns the next element of the sequence of data in the collection.

AsyncSequence of Actors

You need two structures — one conforms to AsyncSequence and the other conforms to AsyncIteratorProtocol.

In ActorAPI.swift, outside ActorAPI, add these minimal structures:

struct ActorSequence: AsyncSequence {
  // 1
  typealias Element = Actor
  typealias AsyncIterator = ActorIterator

  // 2
  func makeAsyncIterator() -> ActorIterator {
    return ActorIterator()
  }
}

struct ActorIterator: AsyncIteratorProtocol {
  // 3
  mutating func next() -> Actor? {
    return nil
  }
}

Note: If you prefer, you can define the iterator structure inside the AsyncSequence structure.

Here’s what each part of this code does:

  1. Your AsyncSequence generates an Element sequence. In this case, ActorSequence is a sequence of Actors. AsyncSequence expects an AsyncIterator, which you typealias to ActorIterator.
  2. The AsyncSequence protocol requires a makeAsyncIterator() method, which returns an instance of ActorIterator. This method cannot contain any asynchronous or throwing code. Code like that goes into ActorIterator.
  3. The AsyncIteratorProtocol protocol requires a next() method to return the next sequence element or nil, to signal the end of the sequence.
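
Before filling in the actor-specific details, it may help to see the same three parts in a complete, self-contained example. This sketch uses a hypothetical Countdown sequence, unrelated to the project, with the iterator nested inside the sequence type.

```swift
// A hypothetical sequence that counts down from a start value to 1.
struct Countdown: AsyncSequence {
  typealias Element = Int

  let start: Int

  struct AsyncIterator: AsyncIteratorProtocol {
    var current: Int

    // Returns the next value, or nil once the countdown is exhausted.
    mutating func next() async -> Int? {
      guard current > 0 else { return nil }
      defer { current -= 1 }
      return current
    }
  }

  // Synchronous and non-throwing: it only creates the iterator.
  func makeAsyncIterator() -> AsyncIterator {
    AsyncIterator(current: start)
  }
}

// Collects every value the countdown produces.
func collectCountdown(from start: Int) async -> [Int] {
  var values: [Int] = []
  for await value in Countdown(start: start) {
    values.append(value)
  }
  return values
}
```

Awaiting collectCountdown(from: 3) yields [3, 2, 1]. Because next() doesn’t throw, the loop needs only for await, not for try await.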

Now, to fill in the structures, add these lines to ActorSequence:

let filename: String
let url: URL

init(filename: String) {
  self.filename = filename
  self.url = Bundle.main.url(forResource: filename, withExtension: "tsv")!
}

This sequence needs an argument for the file name and a property to store the file’s URL. You set these in the initializer.

The iterator that makeAsyncIterator() returns will iterate over url.lines.

Add these lines to ActorIterator:

let url: URL
var iterator: AsyncLineSequence<URL.AsyncBytes>.AsyncIterator

init(url: URL) {
  self.url = url
  iterator = url.lines.makeAsyncIterator()
}

You explicitly get hold of the asynchronous iterator of url.lines so next() can call the iterator’s next() method.

Now, fix the ActorIterator() call in makeAsyncIterator():

return ActorIterator(url: url)

Next, replace next() with the following:

mutating func next() async -> Actor? {
  do {
    if let line = try await iterator.next(), !line.isEmpty {
      let name = line.components(separatedBy: "\t")[1]
      return Actor(name: name)
    }
  } catch let error {
    print(error.localizedDescription)
  }
  return nil
}

You add the async keyword to the signature because this method uses an asynchronous sequence iterator. Just for a change, you handle errors here instead of throwing them.

Now, in ActorAPI, modify getActors() to use this custom AsyncSequence:

func getActors() async {
  for await actor in ActorSequence(filename: filename) {
    await MainActor.run {
      actors.append(actor)
    }
  }
}

The next() method of ActorIterator handles any errors, so getActors() doesn’t throw, and you don’t have to try await the next element of ActorSequence.

You iterate over ActorSequence(filename:), which returns Actor values for you to append to actors.
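
Handling errors inside next() is one design choice. The alternative is to let next() throw and make the caller use for try await. Here’s a minimal sketch of that variant, using a hypothetical Numbers sequence and ParseError type rather than the project’s file-reading code.

```swift
// Hypothetical error type for this sketch.
struct ParseError: Error {}

// A sequence that parses strings into integers, throwing on bad input.
struct Numbers: AsyncSequence {
  typealias Element = Int

  let texts: [String]

  struct AsyncIterator: AsyncIteratorProtocol {
    var remaining: ArraySlice<String>

    // Throws instead of swallowing the error, so the caller decides.
    mutating func next() async throws -> Int? {
      guard let text = remaining.popFirst() else { return nil }
      guard let number = Int(text) else { throw ParseError() }
      return number
    }
  }

  func makeAsyncIterator() -> AsyncIterator {
    AsyncIterator(remaining: texts[...])
  }
}

// The loop now needs try, and the function must be marked throws.
func sum(of texts: [String]) async throws -> Int {
  var total = 0
  for try await number in Numbers(texts: texts) {
    total += number
  }
  return total
}
```

The trade-off: a throwing sequence lets the caller tell “finished” apart from “failed”, while the tutorial’s non-throwing version keeps the call site simpler but ends silently on an error.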

Finally, in ContentView, replace the task closure with this:

.task {
  await model.getActors()
}

The code is much simpler, now that getActors() doesn’t throw.

Build and run.

List of actors matching search term

Everything works the same.

AsyncStream

The only downside of custom asynchronous sequences is the need to create and name structures, which adds to your app’s namespace. AsyncStream lets you create asynchronous sequences “on the fly”.

Instead of using a typealias, you just initialize your AsyncStream with your element type, then create the sequence in its trailing closure.

There are actually two kinds of AsyncStream. One has an unfolding closure. Like an AsyncIterator’s next() method, it supplies the next element. It creates a sequence of values, one at a time, only when the task asks for one. Think of it as pull-based or demand-driven.

AsyncStream: Pull-based

First, you’ll create the pull-based AsyncStream version of ActorSequence.

Add this method to ActorAPI:

// AsyncStream: pull-based
func pullActors() async {
  // 1
  var iterator = url.lines.makeAsyncIterator()
  
  // 2
  let actorStream = AsyncStream<Actor> {
    // 3
    do {
      if let line = try await iterator.next(), !line.isEmpty {
        let name = line.components(separatedBy: "\t")[1]
        return Actor(name: name)
      }
    } catch let error {
      print(error.localizedDescription)
    }
    return nil
  }

  // 4
  for await actor in actorStream {
    await MainActor.run {
      actors.append(actor)
    }
  }
}

Here’s what you’re doing with this code:

  1. You still create an AsyncIterator for url.lines.
  2. Then you create an AsyncStream, specifying the Element type Actor.
  3. And copy the contents of the next() method of ActorIterator into the closure.
  4. Now, actorStream is an asynchronous sequence, exactly like ActorSequence, so you loop over it just like you did in getActors().
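
The same pull-based pattern, stripped of the file reading, can be sketched as follows. The Counter actor and pulledNumbers(upTo:) are hypothetical names. Keeping the mutable state in an actor is an assumption of this sketch that lets the closure stay safe under stricter concurrency checking.

```swift
// Hypothetical state holder: hands out 1, 2, ... up to a limit, then nil.
actor Counter {
  private var value = 0
  private let limit: Int

  init(limit: Int) {
    self.limit = limit
  }

  func next() -> Int? {
    guard value < limit else { return nil }
    value += 1
    return value
  }
}

func pulledNumbers(upTo limit: Int) async -> [Int] {
  let counter = Counter(limit: limit)
  // The unfolding closure runs once per requested element.
  // Returning nil ends the stream.
  let stream = AsyncStream<Int>(unfolding: {
    await counter.next()
  })
  var values: [Int] = []
  for await value in stream {
    values.append(value)
  }
  return values
}
```

Awaiting pulledNumbers(upTo: 3) yields [1, 2, 3]. The explicit unfolding: label selects the same initializer that the trailing-closure form in pullActors() uses.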

In ContentView, call pullActors() instead of getActors():

await model.pullActors()

Build and run, then check that it still works the same.

List of actors matching search term

AsyncStream: Push-based

The other kind of AsyncStream has a build closure. It creates a sequence of values and buffers them until someone asks for them. Think of it as push-based or supply-driven.

Add this method to ActorAPI:

// AsyncStream: push-based
func pushActors() async {
  // 1
  let actorStream = AsyncStream<Actor> { continuation in
    // 2
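    Task {
      do {
        for try await line in url.lines {
          let name = line.components(separatedBy: "\t")[1]
          // 3
          continuation.yield(Actor(name: name))
        }
      } catch {
        // A read error ends the loop early; log it and still finish below.
        print(error.localizedDescription)
      }
      // 4
      continuation.finish()
    }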
  }

  for await actor in actorStream {
    await MainActor.run {
      actors.append(actor)
    }
  }
}

Here’s what you’re doing in this method:

  1. You don’t need to create an iterator. Instead, you get a continuation.
  2. The build closure isn’t asynchronous, so you must create a Task to loop over the asynchronous sequence url.lines.
  3. For each line, you call the continuation’s yield(_:) method to push the Actor value into the buffer.
  4. When you reach the end of url.lines, you call the continuation’s finish() method.

Note: Because the build closure isn’t asynchronous, you can use this version of AsyncStream to interact with non-asynchronous APIs like fread(_:_:_:_:).
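
As a rough illustration of that note, the sketch below wraps a synchronous, callback-based function in a push-based AsyncStream. The function forEachWord(in:_:) is a hypothetical stand-in for whatever non-asynchronous API you need to bridge.

```swift
// Hypothetical synchronous, callback-based API standing in for legacy code.
func forEachWord(in text: String, _ handler: (String) -> Void) {
  for word in text.split(separator: " ") {
    handler(String(word))
  }
}

// Exposes the callback API as an asynchronous sequence of words.
func words(in text: String) -> AsyncStream<String> {
  AsyncStream<String> { continuation in
    // The build closure is synchronous, so it can call the API directly.
    forEachWord(in: text) { word in
      continuation.yield(word)
    }
    // Without finish(), a for-await loop over the stream would never end.
    continuation.finish()
  }
}

func collectWords(in text: String) async -> [String] {
  var result: [String] = []
  for await word in words(in: text) {
    result.append(word)
  }
  return result
}
```

Awaiting collectWords(in: "push based stream") yields ["push", "based", "stream"]. No Task is needed here because nothing inside the build closure awaits.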

In ContentView, call pushActors() instead of pullActors():

await model.pushActors()

Build and run and confirm that it works.

Continuations

Since Apple first introduced Grand Central Dispatch, it has advised developers on how to avoid the dangers of thread explosion.

When there are more threads than CPUs, the scheduler timeshares the CPUs among the threads, performing context switches to swap out a running thread and swap in a blocked thread. Every thread has a stack and associated kernel data structures, so context-switching takes time.

When an app creates a very large number of threads — say, when it’s downloading hundreds or thousands of images — the CPUs spend too much time context-switching and not enough time doing useful work.

In the Swift concurrency system, there are at most only as many threads as there are CPU cores.

When threads execute work under Swift concurrency, the system uses a lightweight object known as a continuation to track where to resume work on a suspended task. Switching between task continuations is much cheaper and more efficient than performing thread context switches.

Threads with continuations

Note: This image of threads with continuations is from WWDC21 Session 10254.

When a task suspends, it captures its state in a continuation. The thread is then free to resume another task, recreating that task’s state from the continuation the task captured when it suspended. The cost of this is a function call.

This all happens behind the scenes when you use async functions.

But you can also get your hands on a continuation to manually resume execution. The buffering form of AsyncStream uses a continuation to yield stream elements.

A different continuation API helps you reuse existing code like completion handlers and delegate methods. To see how, check out Modern Concurrency in Swift, Chapter 5, “Intermediate async/await & CheckedContinuation”.
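
To give a flavor of that other API, here’s a minimal sketch using withCheckedContinuation. The completion-handler function fetchGreeting(completion:) is hypothetical. Real code would typically call its handler later, from another thread.

```swift
// Hypothetical completion-handler API standing in for existing code.
func fetchGreeting(completion: @escaping (String) -> Void) {
  completion("Hello")
}

// An async wrapper around the completion-handler API.
func greeting() async -> String {
  await withCheckedContinuation { continuation in
    fetchGreeting { text in
      // Resume exactly once; a checked continuation reports misuse at runtime.
      continuation.resume(returning: text)
    }
  }
}
```

Awaiting greeting() returns "Hello". Unlike an AsyncStream continuation, which can yield many values, this kind of continuation must be resumed exactly once.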

Push or Pull?

Push-based is like a factory making clothes and storing them in warehouses or stores until someone buys them. Pull-based is like ordering clothes from a tailor.

When choosing between pull-based and push-based, consider the potential mismatch with your use case:

  • Pull-based (unfolding) AsyncStream: Your code wants values faster than the asynchronous sequence can make them.
  • Push-based (buffering) AsyncStream: The asynchronous sequence generates elements faster than your code can read them, or at irregular or unpredictable intervals, like updates from background monitors such as notifications, location or custom monitors.

When downloading a large file, a pull-based AsyncStream — downloading more bytes only when your code asks for them — gives you more control over memory and network use. A push-based AsyncStream — downloading the whole file without pausing — could create spikes in memory or network use.
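
The memory concern with a push-based stream comes from its default buffering policy, which is unbounded. If dropping older elements is acceptable for your use case, you can cap the buffer with the bufferingPolicy parameter. This is a minimal sketch under that assumption, with the hypothetical name newestTwo().

```swift
// Yields five values into a stream that buffers only the newest two.
func newestTwo() async -> [Int] {
  let stream = AsyncStream(Int.self, bufferingPolicy: .bufferingNewest(2)) { continuation in
    // All five yields happen before anyone reads from the stream.
    for number in 1...5 {
      continuation.yield(number)
    }
    continuation.finish()
  }

  var values: [Int] = []
  for await value in stream {
    values.append(value)
  }
  return values
}
```

Awaiting newestTwo() returns [4, 5], because the earlier values were dropped from the buffer before the loop started reading. A policy like this would be a poor fit for the actor list, where every line matters, but it suits streams where only recent values are useful.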

To see another difference between the two kinds of AsyncStream, see what happens if your code doesn’t use actorStream.

In ActorAPI, comment out this code in both pullActors() and pushActors():

for await actor in actorStream {
  await MainActor.run {
    actors.append(actor)
  }
}

Next, place breakpoints at this line in both methods:

let name = line.components(separatedBy: "\t")[1]

Edit both breakpoints to log the breakpoint name and hit count, then continue:

Log breakpoint name and hit count.

Now, in ContentView, set task to call pullActors():

.task {
  await model.pullActors()
}

Build and run, then open the Debug console:

Pull-based actorStream: No log messages

No log messages appear because the code in the pull-based actorStream doesn’t run when your code doesn’t ask for its elements. It doesn’t read from the file unless you ask for the next element.

Now, switch the task to call pushActors():

.task {
  await model.pushActors()
}

Build and run, with the Debug console open:

Push-based actorStream: Log message for every data line

The push-based actorStream runs even though your code doesn’t ask for any elements. It reads the entire file and buffers the sequence elements.
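
You can reproduce the same contrast without breakpoints or data files. This sketch counts how often each kind of closure runs when nothing consumes the stream. CallLog and both function names are hypothetical.

```swift
// Hypothetical helper that records how many times a closure ran.
actor CallLog {
  private(set) var count = 0

  func record() {
    count += 1
  }
}

// Pull-based: the unfolding closure runs only when an element is requested.
func unfoldingCallsWithoutConsumer() async -> Int {
  let log = CallLog()
  let stream = AsyncStream<Int>(unfolding: {
    await log.record()
    return nil
  })
  _ = stream  // Created but never iterated.
  return await log.count
}

// Push-based: the build closure runs as soon as the stream is created.
func buildCallsWithoutConsumer() -> Int {
  var calls = 0
  let stream = AsyncStream<Int> { continuation in
    calls += 1
    continuation.finish()
  }
  _ = stream  // Created but never iterated.
  return calls
}
```

The first function returns 0 and the second returns 1, matching what the log messages show for pullActors() and pushActors().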

Where to Go From Here?

Download the final project using the Download Materials button at the top or bottom of the tutorial.

Note: The data files are only in the starter project. Copy them into the final project if you want to build and run that project.

In this tutorial, you:

  • Compared the speed and memory use when synchronously and asynchronously reading a very large file.
  • Created and used a custom AsyncSequence.
  • Created and used pull-based and push-based AsyncStreams.
  • Showed that the pull-based AsyncStream does nothing until the code asks for sequence elements, while the push-based AsyncStream runs whether or not the code asks for sequence elements.

You can use AsyncSequence and AsyncStream to generate asynchronous sequences from your existing code — any closures that you call multiple times, as well as delegate methods that just report new values and don’t need a response back. You’ll find examples in our book Modern Concurrency in Swift.
