Data Streams

Reactive programming is programming with asynchronous data streams

This is useful because anything can be represented as a stream of data, including user input, network responses, and even internal program state. By treating everything as a stream, it becomes possible to use a single set of abstractions to handle all types of data, making it simpler to reason about and manage the flow of data through an application.

Manipulating Streams

Reactive programming allows for easy manipulation of streams with functions such as map, filter, and reduce, which can be used to transform, filter, and aggregate data in a stream.

These functions are similar to their counterparts in functional programming and allow for a simple and declarative way to manipulate streams.

  • The map function can be used to transform data in a stream, such as converting temperatures from Celsius to Fahrenheit.
  • The filter function can be used to filter out unwanted data, such as removing invalid data points.
  • The reduce function can be used to aggregate data in a stream, such as calculating the sum of all elements in a stream.

These functions can also be combined and composed to create more complex operations, making it easy to manipulate and process streams in a flexible and powerful way.

Visualising Streams

A stream is a sequence of events occurring over time.

The idea of data streams happening over time can be hard to visualise. A useful tool is the marble diagram. Imagine a data stream derived from a user tapping on the iPhone. This could be visualised like so:

marble

The diagram depicts time on a line starting from the left and proceeding to the right. The circles represent taps (two close together, then a third a little later).

3 Event Types

A data stream can emit three types of events: data, error and complete.

  • Data events are emitted when new data is available in the stream.
  • Error events are emitted when an error occurs while processing the stream.
  • Complete events are emitted when the stream has finished emitting all data.

In the earlier marble diagram, we represented 3 data events in the form of circles. Marble diagrams also depict error events.

marble error

And completion events.

marble complete

Reacting to Events

We can define functions to run when specific events are emitted from a stream. These functions are called “subscribers” and/or “operators”. They are used to handle and operate on the different types of events emitted by a stream.

For example, a subscriber for a stream of button taps might define a function that updates a label when a button is tapped, and another function that logs an error message if the stream emits an error event.

Reactive libraries often include many functions to handle events such as map, filter, scan etc. Each returns a new stream and does not modify the original stream. This immutability makes it easier to compose and reuse concurrent streams.

Swift Combine

Combine is Apple’s reactive framework.

Publishers, Subscribers, and Subscriptions

Combine uses the concepts of publishers, subscribers, and subscriptions to handle data streams.

  • A publisher is an object that emits values over time, such as a stream of data from a network request. In Combine, publishers conform to the Publisher protocol and are responsible for emitting values to subscribers.
  • A subscriber is an object that receives values emitted by a publisher and can perform some action, such as updating a user interface. In Combine, subscribers conform to the Subscriber protocol and are responsible for receiving values from a publisher and performing some action.
  • A subscription is the connection between a publisher and a subscriber, which can be established, modified, or cancelled. In Combine, a subscription is represented by the Subscription protocol and is responsible for managing the connection between a publisher and a subscriber.

One of the key differences between Combine and other reactive frameworks is that publishers and subscribers in Combine are separate types, unlike in other frameworks where a single object often serves as both a publisher and a subscriber. This separation allows for greater flexibility and composability in managing data streams.

A Simple Combine Example

Imagine we have a UILabel called label. We want to create a data stream from taps on the label. Every tap will emit an event. We’ll use Combine operators to count the number of taps and print the result.

Here’s how that might look.

cancellable = label
    .publisher(for: tapGestureRecogniser)
    .map { _ in 1 }
    .scan(0, +)
    .sink { count in
        print(count)
    }

Let’s step through this code and then provide the marble diagram.

  • Firstly, the chain of events is assigned to a property called cancellable. This property stores the subscription. It is a type called AnyCancellable. It enables us to manage the memory of the data streams and cancel the subscription as needed.
  • The label.publisher(for: tapGestureRecogniser) creates a publisher (Combine’s name for a data stream) for the tap gesture event on the label. This publisher will emit an event every time the label is tapped.
  • The .map { _ in 1 } operator is applied on the publisher. It maps the event emitted by the publisher to a new stream emitting the integer 1.
  • The .scan(0, +) operator is applied on the stream, it applies an accumulator function that takes the current value and the next value from the stream and returns a new value. In this case, it’s adding up the values. It starts with the initial value of 0 and adds up the 1s emitted by the publisher, creating a new stream with the count of taps.
  • Finally, the .sink { count in print(count) } operator is applied on the stream, it’s used to subscribe to the stream and handle the values emitted. In this case, it’s printing the count to the console every time the label is tapped.

Here’s how that looks visually, starting with the taps, then the map to integer 1 then the accumulation using scan.

simple marble scan

This demonstrates a simple reactive data flow using Combine. Notice how the code describes what we want to happen from input (taps) to output (print). This is the declarative nature of reactive programming. Operators are chained together from input to output.

input operate output

A More Complex Combine Example

Let’s look at a more complex example that leverages multiple operators.

Take a look at this gif.

combine search example

Here we’re taking a search string, triggering an API call and loading a list of characters. Let’s think of this in terms of inputs and outputs:

  • Input: user inputting text
  • Output: list of marvel characters

In between the input and output are operators that transform the data and create new data streams.

Here are some of the operations we want to perform:

  • Avoid excessive network requests. If someone types very quickly, wait till they slow down or stop to search.
  • Don’t send a network request for empty searches.
  • Only return a network request for the most recent search.

Below is a simplified version of a view model that takes the search data stream, operates on it and then outputs the character array data stream.

final class MarvelViewModel {
    private var cancellables = Set<AnyCancellable>()

    // Inputs
    @Published var search: String = ""
    
    // Outputs
    @Published var marvelCharacters: [MarvelCharacter] = []
    
    init(marvelAPI: @escaping API = MarvelAPI.searchCharacters) {
        $search
            .debounce(for: .seconds(0.3), scheduler: RunLoop.main)
            .removeDuplicates()
            .filter { !$0.isEmpty }
            .map(marvelAPI)
            .switchToLatest()
            .assign(to: \.marvelCharacters, on: self)
            .store(in: &cancellables)
    }
}

If you skim down to $search you can see a chain of operations written using Combine. Let’s overview the rest of the code first.

  1. @Published synthesises a Publisher for the property it is assigned to.
  2. The input data stream is the search: String published property.
  3. The output is the marvelCharacters: [MarvelCharacter] published property.
  4. The init takes a marvelAPI. This function does our network fetch and returns a new Publisher of [MarvelCharacter] that does not emit an error.

That is a simple explanation of the class architecture.

Let’s look at the chain of operations declared in the init, starting at the $search.

  1. $search: This is the syntax for accessing the publisher from @Published var search, it will emit values as the user types in the search bar.
  2. debounce: This is delaying the execution of the following operations for 0.3 seconds, and only allows the latest search term to pass through.
  3. removeDuplicates: It’s removing any duplicates of the same search term that may come through.
  4. filter: It’s removing any empty search terms.
  5. map: It’s sending the non-empty, non-duplicate search term to the “marvelAPI” function, which performs the API fetch
  6. switchToLatest: It’s only allowing the most recent search result from the API to pass through.
  7. assign: It’s assigning the final search result to a “marvelCharacters” property on the current object, which is used to drive the UI
  8. store: It’s storing the above subscription in a set of “cancellables”, which can be used to cancel the operation if needed, and will be dropped from memory when the view model is deallocated

Here is a marble diagram of the data flow.

long marble diagram

Notice that the marvelAPI function does an asynchronous network call. We can’t guarantee the order of responses from the network. In the diagram I made it look like the API manages to fetch “d” before finishing the “c” fetch. That is why we use .switchToLatest(), to ensure we get the most recent publisher from the prior data stream. This isn’t exactly how it works but hopefully gives a sense of managing asynchronous code in a data stream.

Hopefully, this short example demonstrates the power of Combine and reactive programming. The code describes a complex set of synchronous and asynchronous operations in a declarative and linear fashion. This makes it easier to read, test and reason about.

Imagine achieving the same functionality without Combine or reactive programming. There’d at least be multiple stores of state to keep track of different properties, probably multiple functions, delegates or callbacks, and likely different abstractions.

Combine attempts to reduce those complexities.

Conclusion

Reactive Programming with Swift and Combine is a powerful way to handle asynchronous data streams in iOS development. By treating everything as a stream, it becomes possible to use a single set of abstractions to handle all types of data, making it simpler to reason about and manage the flow of data through an application.

It’s also a new and challenging paradigm to learn. This post only touches on some of the new theories and syntax to master. More to come on that. But hopefully, it helps to understand what and why reactive programming has come to the Apple ecosphere via Combine.