Understanding SwiftNIO by building a text modifying server
Aug 20 2020

Building a network application requires a good amount of effort, not only because of the complexity of the application itself but also because of the nature of network architecture. We have to define how we are going to handle connections, which abstractions we’ll use to separate network code from application code, etcetera. This is where SwiftNIO comes in: it provides an efficient, non-blocking, event-driven model that is easy to use and extend. If we follow SwiftNIO’s model, we can take a lot of the boilerplate setup away and focus on building the logic of our applications. In this post, I’ll show you how to use SwiftNIO and understand its workflow by creating a server that receives text from clients and returns a modified version of the text.

As with any other networking project, much of the complexity comes from understanding networking concepts. This post assumes a basic understanding of networks. I’ll try to explain as much as possible, but if there is a term that you are unfamiliar with, do a quick search and then come back and continue where you left off.

Ok, let’s get started.

* You can check the full code in the GitHub Repository

What is SwiftNIO

SwiftNIO is a Server-Side Swift framework whose main focus is building network applications. The NIO part of the name comes from the non-blocking I/O model it uses. To understand what this means, let’s first look at a blocking operation. Imagine the following Swift script:

#!/usr/bin/env swift
import Foundation

print ("Welcome to our server!")
var count = 0
while (true) {
    print("Server uptime: \(count) seconds")
    sleep(1)
    count += 1
}

If we run that script:

$ chmod u+x fakeserver.swift
$ ./fakeserver.swift

We’ll see the server uptime increase every second.

What happens if we ask a user of the server for their name? We can do that by using the readLine(strippingNewline:) function:

print("Please enter your name:")
let name = readLine(strippingNewline: true) ?? "anonymous"
print("Welcome \(name)")

Let’s add that code to our while loop:

#!/usr/bin/env swift
import Foundation

print ("Welcome to our server!")
var count = 0
while (true) {
    print("Server uptime: \(count) seconds")
    sleep(1)
    count += 1
    print("Please enter your name:")
    let name = readLine(strippingNewline: true) ?? "anonymous"
    print("Welcome \(name)")
}

Now, if we execute the script, the blocking operation is evident. When we are waiting for readLine to get the data from the user, it blocks the whole program execution. What we could do to fix this is to add another execution thread. We’ll have one thread to keep track of the count and another thread to read the user input.

That model of one thread per client is the basic idea behind thread-based servers. As you can imagine, each thread still operates in a blocking manner. What can we do to avoid that? Enter the reactor pattern¹.

What we want is a non-blocking way to handle input/output. To get non-blocking communication, we could designate someone to be responsible for all I/O operations. That someone will be in charge of the blocking operations, while we continue without waiting. Let’s call that someone the Reactor. We could also call it a demultiplexer, but that’s too wordy.

In our program, we just tell the Reactor to read or write, and we can designate a handler to be notified when there is data ready to be read, or when the data has been written. We don’t have to wait for the read or the write operations to complete. The Reactor is responsible for waiting and then notifying the interested party (the handler). And that is the idea behind the non-blocking part of SwiftNIO.
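
To make the idea concrete, here is a toy, single-threaded sketch of a reactor in plain Swift. It is not how SwiftNIO is implemented: ToyReactor and its methods are made-up names, and the events are enqueued by hand instead of being reported by the operating system. It only shows the shape of the pattern, where handlers register their interest and the reactor calls them back when something is ready.

```swift
/// A toy reactor: handlers register interest in a named source, and the
/// reactor calls them back when an event for that source is ready.
final class ToyReactor {
    private var handlers: [String: (String) -> Void] = [:]
    private var pending: [(source: String, payload: String)] = []

    /// Designate the handler to notify when `source` has data.
    func register(source: String, handler: @escaping (String) -> Void) {
        handlers[source] = handler
    }

    /// Stands in for the operating system reporting that data is ready.
    func enqueue(source: String, payload: String) {
        pending.append((source: source, payload: payload))
    }

    /// One pass of the loop: dispatch every ready event to its handler.
    func runOnce() {
        let ready = pending
        pending.removeAll()
        for event in ready {
            handlers[event.source]?(event.payload)
        }
    }
}

var log: [String] = []
let reactor = ToyReactor()
reactor.register(source: "client-1") { log.append("client-1 sent \($0)") }
reactor.register(source: "client-2") { log.append("client-2 sent \($0)") }

// Nobody blocks waiting for a specific client; events are handled as they arrive.
reactor.enqueue(source: "client-2", payload: "hi")
reactor.enqueue(source: "client-1", payload: "hello")
reactor.runOnce()

print(log)   // ["client-2 sent hi", "client-1 sent hello"]
```

Neither handler ever waits on its own client. The only place that would wait is the reactor itself, which is the role the EventLoop plays in SwiftNIO.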

Ok, with the name of the framework out of the way, we can move to more interesting parts.

SwiftNIO architecture

SwiftNIO’s general architecture looks like the following:

SwiftNIO uses the RunLoop model (renamed to EventLoop) to handle and dispatch events. If you are unfamiliar with RunLoops, you can check my post “Understanding the RunLoop model by creating a basic shell”. In SwiftNIO’s architecture, EventLoops can be grouped together in EventLoopGroups. The idea is to distribute the load efficiently between the EventLoops’ resources.

We have EventLoops that are in charge of managing events. The events come from a socket (or a file descriptor). The socket is wrapped inside a Channel. And each channel has, in turn, a Channel Pipeline. This pipeline has sequences of Channel Handlers that can either be inbound, outbound, or both (More on this when we create our handlers). And finally, the channel handlers are executed in order.

SwiftNIO comes with built-in Bootstrap objects that we can use to associate a channel with an EventLoop. We could manually create the I/O objects (e.g. sockets) and associate them with the EventLoop and Channel, but the Bootstrap objects simplify the task.

That is the general architecture. I didn’t explain deeply how each element of the architecture works because we want to understand the whole workflow first, and then if we need something more specific we know where to search. For example, now you know that if you want to specify the behaviour of the TCP or UDP sockets, you would probably have a look at the Bootstrap objects.

Ok, let’s have a look at the workflow and what handlers are all about.

Workflow

Just by looking at the architecture, we now have an understanding of which parts depend on which. With that, we can infer the general workflow:

We need to create an EventLoopGroup
Bootstrap our sockets (it could be any IO capable resource, but from now on, I’ll just refer to sockets)
Initialise our Channel
Initialise our Handlers
And we are ready to handle events

That seems easy. It is easy, and it isn’t. Now comes the part where your general networking knowledge comes in handy. SwiftNIO is a networking framework: it gives you a lot of flexibility and the chance to customise your setup to your heart’s content, but you should know what you are doing. This post’s intention is not to teach you general networking concepts. I won’t give an in-depth explanation of every aspect of the setup, but I’ll try to provide you with enough context so you can search for more information if you need it.

Ok, let’s use SwiftNIO to build a basic server that takes text as input from the client and returns a modified version of the text.

Building a text modifying server with SwiftNIO

Our server will run on port 8888. When a client connects to the server, it will be able to send text to the server and receive the modified version of the text, like an echo server but with some additional juice.

Our server will modify the text in the following way:

Change the text to uppercase
Replace all the vowels with asterisks
Change the colour to green using Escape Sequences

We’ll have a handler for each of the text modifications. Remember the handlers are set up in a pipeline, where they receive some data, modify it, and pass it to the next handler in the pipeline.

To summarise, our server echoes back the text the client sent, but in uppercase, with the vowels replaced by asterisks, and in green. The client doesn’t know, and doesn’t need to know, the internals of how our server works. This architecture makes adding middleware layers as easy as adding a new handler.
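
Before involving SwiftNIO at all, the three modifications can be sketched as plain Swift functions chained one after the other. This is only an illustration of the pipeline idea: the Transform type alias and the names upcase, maskVowels, and green are made up for this sketch and are not part of the server we build below.

```swift
// Each step takes the text produced by the previous step and returns a new text.
typealias Transform = (String) -> String

let upcase: Transform = { $0.uppercased() }

let maskVowels: Transform = { text in
    let vowels: Set<Character> = ["A", "E", "I", "O", "U", "a", "e", "i", "o", "u"]
    let star: Character = "*"
    return String(text.map { vowels.contains($0) ? star : $0 })
}

// ESC[32m switches the terminal to green, ESC[0m resets it.
let green: Transform = { "\u{1B}[32m\($0)\u{1B}[0m" }

// The order of the array is the order of the pipeline.
let steps: [Transform] = [upcase, maskVowels, green]
let output = steps.reduce("hello") { text, step in step(text) }

print(output)   // H*LL* (shown in green on a terminal that supports escape sequences)
```

Adding another modification means appending one more function to the array, which is the same property the handlers give us in the real server.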

Let’s create the server. It’ll help you understand the architecture.

Creating our Server’s entry point

Let’s begin by creating the server using the Swift Package Manager. We’ll call our server niots (NIO Text Server).

$ mkdir niots
$ cd niots
$ swift package init --type executable

We need to add SwiftNIO as a dependency in the package manifest (Package.swift). Your manifest should look like the following:

// swift-tools-version:5.2
// The swift-tools-version declares the minimum version of Swift required to build this package.

import PackageDescription

let package = Package(
    name: "niots",
    dependencies: [
         .package(url: "https://github.com/apple/swift-nio.git", from: "2.0.0"),
    ],
    targets: [
        .target(
            name: "niots",
            dependencies: [.product(name:"NIO", package: "swift-nio")]),
        .testTarget(
            name: "niotsTests",
            dependencies: ["niots"]),
    ]
)

Now let’s open Sources/niots/main.swift and begin our initial set up. Remember our general workflow:

We need to create an EventLoopGroup
Bootstrap our sockets (it could be any IO capable resource, but from now on, I’ll just refer to sockets)
Initialise our Channel
Initialise our Handlers
And we are ready to handle events

Begin by importing Foundation and NIO frameworks:

import Foundation
import NIO

We need to create our EventLoopGroup:

let group = MultiThreadedEventLoopGroup(numberOfThreads: System.coreCount)

We are going to try to make use of all of our CPU cores, so in the best-case scenario, we could have an execution thread on each of our CPU cores processing our clients’ requests. That is more than we need here, because our example will only have one client. We could use just 1 thread, but it’s up to you.
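
If you want to see an EventLoop doing work before wiring up any sockets, the following minimal sketch may help. It assumes the NIO dependency from the manifest above and uses two threads only for illustration. It asks the group for one of its loops with next() and hands it a small task with submit, which returns a future for the result.

```swift
import NIO

// Two threads are plenty for this sketch; the server uses System.coreCount.
let group = MultiThreadedEventLoopGroup(numberOfThreads: 2)
defer {
    try! group.syncShutdownGracefully()
}

// The group hands out its event loops one at a time.
let loop = group.next()

// submit runs the closure on the loop's thread and returns an EventLoopFuture.
let answer = try loop.submit { 6 * 7 }.wait()

// Work submitted to a loop really does run on that loop.
let ranOnLoop = try loop.submit { loop.inEventLoop }.wait()

print(answer)      // 42
print(ranOnLoop)   // true
```

Calling wait() is fine here because we are on the main thread. Inside a handler, which runs on an event loop, you would chain on the future instead of blocking.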

Step number two, we need to Bootstrap our socket.

let bootstrap = ServerBootstrap(group: group)

We are using the ServerBootstrap object that will create a listening socket that we can later bind and get a ServerSocketChannel to work with.

Easy enough, but before we bind the socket to a specific host (our host) and port to start listening, we need to configure it properly. We need to configure the listening socket and channel, represented by the ServerSocketChannel, as well as the child socket and channel that represent each accepted connection.

If you read the SwiftNIO examples, you’ll see that they are written in a very Swifty way, which means a lot of composition. You can split it into multiple commands if you feel that makes it easier to understand. I’ll follow the same pattern as the examples, but I’ll add comments to add some context to the setup. The following is how the whole setup looks:

let bootstrap = ServerBootstrap(group: group)
    // ① Set up our ServerChannel
    .serverChannelOption(ChannelOptions.backlog, value: 256)
    .serverChannelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)

    // ② Set up the closure that will be used to initialise Child channels
    // (when a connection is accepted to our server)
    .childChannelInitializer { channel in
        // ③ Add handlers to the pipeline
        channel.pipeline.addHandlers([BackPressureHandler(), UpcaseHandler(), VowelsHandler(), ColourHandler()])
    }

    // ④ Set up child channel options
    .childChannelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)
    .childChannelOption(ChannelOptions.maxMessagesPerRead, value: 16)
    .childChannelOption(ChannelOptions.recvAllocator, value: AdaptiveRecvByteBufferAllocator())

① For our server channel, we define the backlog option, which specifies the maximum length of the queue of pending connections. If a new connection arrives when the queue is full, the client may get an error. Then we set the so_reuseaddr option, which specifies the rule for when a socket can bind to an address.

I’ll give you a little more context for this option. You can skip to the next step ② if you don’t want to get distracted with this level of detail.

SO_REUSEADDR affects the rules for when a socket can be bound to the same address:port used by another socket. The rule allows the socket to bind to an address if there is no actively listening socket bound to that address. That is the same behaviour you’d expect of any bind call. The difference shows when the previous socket is still in its lingering time after being closed or after its process crashed. How can we benefit from using it? When using TCP, the primary purpose is to be able to reattach a process to the same address if the previous process was closed or killed, without having to wait for the expiration time (TIME_WAIT). When using UDP, it allows multiple sockets to bind to the same port. For more information, you can check the socket(7) man page.

② We define the child channel initialiser. The following is the signature of the childChannelInitializer function:

public func childChannelInitializer(_ initializer: @escaping (Channel) -> EventLoopFuture<Void>) -> Self

As you can see, we need to pass a function that accepts a Channel and returns an EventLoopFuture<Void>. That future will be completed once the channel has been initialised.
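
If futures are new to you, this small sketch shows the mechanics outside of any channel. It assumes the NIO dependency from the manifest above; the string values are placeholders. A promise is the writable side, its futureResult is the readable side, and callbacks such as map run once the promise is fulfilled.

```swift
import NIO

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
defer {
    try! group.syncShutdownGracefully()
}

let loop = group.next()

// A promise is fulfilled by whoever does the work...
let promise = loop.makePromise(of: String.self)

// ...and the matching future lets others react to the result.
let shouted = promise.futureResult.map { $0.uppercased() }

promise.succeed("channel ready")

// Blocking with wait() is acceptable here because we are not on an event loop.
let value = try shouted.wait()
print(value)   // CHANNEL READY
```

The closure we pass to childChannelInitializer works the same way: it returns the future produced by addHandlers, and SwiftNIO continues setting up the connection once that future completes.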

③ Inside the function, we add the handlers to the channel’s pipeline. The addHandler and addHandlers functions both return an EventLoopFuture<Void>, which will be completed when the handlers have been added. The BackPressureHandler provided by SwiftNIO takes care of managing back-pressure on the pipeline. It stops reading from the client when we can’t write back fast enough, and continues reading when the writing catches up. We can add handlers one at a time, or we can add several at once by passing an array of handlers. The other handlers you see being added (UpcaseHandler, VowelsHandler, and ColourHandler) are handlers we will create, so you will see an error until we implement them.

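④ We define the childChannelOptions. These options are more self-explanatory: we allow address reuse on the accepted sockets, limit the number of reads performed per read loop to 16 (maxMessagesPerRead), and let an AdaptiveRecvByteBufferAllocator size the receive buffers based on how much data previous reads returned.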

Ok, that should be it for setting up our EventLoopGroup and its channels. Because remembering to shut down the EventLoopGroup gracefully is essential, let’s make sure we always do it by adding the call to a defer clause.

defer {
    try! group.syncShutdownGracefully()
}

That’s it for the initial setup. It might seem like a lot of boilerplate, but it is necessary if you would like to configure your server to the most efficient configuration for your needs. I prefer to have the possibility to tune all those settings than having defaults hidden from me that might not be the most performant for my needs.

One more thing left: binding our server to the host and port. Our server will have the host and port hard-coded, because I don’t want to distract you from SwiftNIO by parsing arguments. If you want to learn more about parsing arguments, check my post on “Understanding the Swift Argument Parser and working with STDIN”.

Ok, let’s create our channel and bind it to localhost (::1 in IPv6) on port 8888:

let defaultHost = "::1"
let defaultPort = 8888


let channel = try bootstrap.bind(host: defaultHost, port: defaultPort).wait()

print("Server started and listening on \(channel.localAddress!)")

try channel.closeFuture.wait()

print("Server closed") 

That’s it. The following is the complete main.swift file:

import Foundation
import NIO

let group = MultiThreadedEventLoopGroup(numberOfThreads: System.coreCount)
let bootstrap = ServerBootstrap(group: group)
    // Set up our ServerChannel
    .serverChannelOption(ChannelOptions.backlog, value: 256)
    .serverChannelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)

    // Set up the closure that will be used to initialise Child channels
    // (when a connection is accepted to our server)
    .childChannelInitializer { channel in
        channel.pipeline.addHandlers([BackPressureHandler(), UpcaseHandler(), VowelsHandler(), ColourHandler()])
    }

    // Set up child channel options
    .childChannelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)
    .childChannelOption(ChannelOptions.maxMessagesPerRead, value: 16)
    .childChannelOption(ChannelOptions.recvAllocator, value: AdaptiveRecvByteBufferAllocator())

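// Make sure the EventLoopGroup is always shut down gracefully
defer {
    try! group.syncShutdownGracefully()
}

let defaultHost = "::1"
let defaultPort = 8888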


let channel = try bootstrap.bind(host: defaultHost, port: defaultPort).wait()

print("Server started and listening on \(channel.localAddress!)")

try channel.closeFuture.wait()

print("Server closed")

With our entry point out of the way, let’s work on our handlers.

Creating Channel Handlers

Remember each handler is in a specific position on our channel pipeline. If we want to pass data from one handler to the next, we need to make sure that the output type of one handler matches the input type of the following handler. Also, remember that we have two types of handlers:

ChannelOutboundHandler
ChannelInboundHandler

The difference is the direction in which the events travel. A ChannelInboundHandler works with events that originate at the channel source, a socket in our case: for example, data is ready to be read, or the channel became active or inactive. A ChannelOutboundHandler works with events that we want to pass to the channel source, again a socket in our case: for example, bind, write, flush, or read.

When choosing which type of handler to implement, remember the following:

If the event originated from the source, then use ChannelInboundHandler.
If you want to pass an event to the source, then use ChannelOutboundHandler.

In our case, because the event comes from the source (the client connecting to our server), we’ll work with ChannelInboundHandler. The idea is similar when working with ChannelOutboundHandlers. If you want to see an example of a ChannelOutboundHandler, look at the SwiftNIO NIOMulticastChat example.
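
For comparison, here is a minimal sketch of what an outbound handler can look like. ExclaimHandler is a made-up handler that is not part of our server; it appends "!" to every message on its way out. The sketch uses EmbeddedChannel, SwiftNIO’s in-memory channel meant for testing, so no socket is needed to try it.

```swift
import NIO

// Hypothetical outbound handler: appends "!" to every outgoing message.
final class ExclaimHandler: ChannelOutboundHandler {
    typealias OutboundIn = ByteBuffer
    typealias OutboundOut = ByteBuffer

    func write(context: ChannelHandlerContext, data: NIOAny, promise: EventLoopPromise<Void>?) {
        var buffer = self.unwrapOutboundIn(data)
        buffer.writeString("!")
        // Pass the modified message on towards the socket.
        context.write(self.wrapOutboundOut(buffer), promise: promise)
    }
}

let channel = EmbeddedChannel(handler: ExclaimHandler())

var message = channel.allocator.buffer(capacity: 8)
message.writeString("hello")

// writeOutbound sends the message through the outbound side of the pipeline.
_ = try channel.writeOutbound(message)

// readOutbound returns what would have been written to the socket.
var sent = ""
if var written = try channel.readOutbound(as: ByteBuffer.self) {
    sent = written.readString(length: written.readableBytes) ?? ""
}
print(sent)   // hello!

_ = try channel.finish()
```

Notice the mirror image of the inbound handlers we are about to write: the type aliases are OutboundIn and OutboundOut, the event is write instead of channelRead, and the data travels towards the socket instead of away from it.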

Ok, back to our implementation. We are going to have the following workflow:

Client: hello
      |
      v
   Server
      |
      v
BackPressureHandler (Receives a ByteBuffer - passes a ByteBuffer)
      |
      v
UpcaseHandler(Receives a ByteBuffer - passes a [CChar])
      |
      v
VowelsHandler(Receives a [CChar] - passes a ByteBuffer)
      |
      v
ColourHandler(Receives a ByteBuffer - passes a ByteBuffer)
      |
      v
Client: receives
H*LL* (In green colour)

ByteBuffer is a struct that contains the raw bytes received from the source, and that is what BackPressureHandler passes to the rest of our handlers. We can pass different types from handler to handler, but we need to make sure that the types match. That will be clear with the implementation of our UpcaseHandler that passes an array of CChar (that means a C string) to the next handler. In reality, the handlers pass data between them wrapped in the NIOAny type, but we’ll see that shortly.

Ok, let’s create our handlers.

Implementing `UpcaseHandler`

Begin by creating a new file inside the Sources/niots/ directory with the name UpcaseHandler.swift.

import NIO

final class UpcaseHandler: ChannelInboundHandler {
}

Our class UpcaseHandler conforms to the protocol ChannelInboundHandler, which in turn inherits from _ChannelInboundHandler. For ChannelInboundHandler, we need to define the following associated types:

InboundIn - the type of the inbound data that will be passed to the handler
InboundOut - the type of the data that the handler will pass to the next handler

So in our case, we are going to receive a ByteBuffer and pass a [CChar].

    public typealias InboundIn = ByteBuffer
    public typealias InboundOut = [CChar]

In an inbound channel pipeline, the _ChannelInboundHandler protocol defines the methods that will be called depending on the event triggered. In our case, we want to handle the channelRead event. That means that the function channelRead(context:data:) will be called when some data has been read from the socket. So let’s implement the function, replacing the default implementation:

    public func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        let inBuff = self.unwrapInboundIn(data)
        let str = inBuff.getString(at: 0, length: inBuff.readableBytes)

        let result = str?.uppercased() ?? ""

        let cresult = result.cString(using: .utf8) ?? [] 
        context.fireChannelRead(self.wrapInboundOut(cresult))
    }

The protocol ChannelInboundHandler provides the functions unwrapInboundIn and wrapInboundOut.

unwrapInboundIn - unwraps the data wrapped in a NIOAny to the InboundIn type
wrapInboundOut - wraps the data of type InboundOut in a NIOAny to be passed to the next handler.

In our case, because we know that BackPressureHandler will pass us a ByteBuffer, we know that when we unwrap the data, we’ll get a ByteBuffer struct. ByteBuffer has some convenience methods to convert the data to the most common types. I advise you to check the ByteBuffer documentation.

After we get the string representation from the ByteBuffer, we simply transform it using the String function uppercased. From the uppercased String, we generate a C string and pass it to the next handler by triggering a ChannelRead. Using fireChannelRead is how we trigger a read on the channel pipeline, and because our next handler implements the channelRead function, it’ll be executed next.

And that’s it. The following is the whole UpcaseHandler.swift file:

import NIO

final class UpcaseHandler: ChannelInboundHandler {
    public typealias InboundIn = ByteBuffer
    public typealias InboundOut = [CChar]

    public func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        let inBuff = self.unwrapInboundIn(data)
        let str = inBuff.getString(at: 0, length: inBuff.readableBytes)

        let result = str?.uppercased() ?? ""

        let cresult = result.cString(using: .utf8) ?? [] 
        context.fireChannelRead(self.wrapInboundOut(cresult))
    }
}
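
If you would like to try the handler before the rest of the server exists, one option is SwiftNIO’s EmbeddedChannel, an in-memory channel intended for testing. The following is a sketch under that assumption, not something the server itself needs. The handler is repeated so the snippet compiles on its own, and the conversion back to a String at the end is only there so we can print the result.

```swift
import Foundation
import NIO

// Same handler as in the post, repeated so this sketch compiles on its own.
final class UpcaseHandler: ChannelInboundHandler {
    typealias InboundIn = ByteBuffer
    typealias InboundOut = [CChar]

    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        let inBuff = self.unwrapInboundIn(data)
        let str = inBuff.getString(at: 0, length: inBuff.readableBytes)
        let cresult = (str?.uppercased() ?? "").cString(using: .utf8) ?? []
        context.fireChannelRead(self.wrapInboundOut(cresult))
    }
}

// EmbeddedChannel runs the pipeline in memory, with no sockets involved.
let channel = EmbeddedChannel(handler: UpcaseHandler())

var input = channel.allocator.buffer(capacity: 5)
input.writeString("hello")

// Simulate bytes arriving from a client.
_ = try channel.writeInbound(input)

// Whatever reaches the end of the pipeline can be read back out.
let output = try channel.readInbound(as: [CChar].self) ?? []

// Drop the trailing NUL of the C string before turning it back into a String.
let text = String(decoding: output.dropLast().map { UInt8(bitPattern: $0) }, as: UTF8.self)
print(text)   // HELLO

_ = try channel.finish()
```

Reading the value back as [CChar] only works because that is the handler’s InboundOut type. Asking for a different type would fail, which is the type-matching rule between handlers in action.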

Implementing `VowelsHandler`

The idea is the same, so I’ll show you the complete file and then give some additional pointers. If you are following along, create the file VowelsHandler.swift inside your Sources/niots/ directory and add the following content:

import NIO

final class VowelsHandler: ChannelInboundHandler {
    public typealias InboundIn = [CChar]
    public typealias InboundOut = ByteBuffer

    public func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        let inBuff = self.unwrapInboundIn(data)
        let str = String(cString: inBuff)

        let vowels: [Character] = ["a","e","i","o","u", "A", "E", "I", "O", "U"]
        let result = String(str.map { return vowels.contains($0) ? Character("*") : $0 })
        
        var buffOut = context.channel.allocator.buffer(capacity: result.count )
        buffOut.writeString(result)

        context.fireChannelRead(self.wrapInboundOut(buffOut))
    }
}

As you can see, the InboundIn type is [CChar], which matches the InboundOut of our previous handler. We implement the channelRead function as before, and that function will be triggered by the channel read event. The implementation is straightforward: we have an array containing the vowels, and we replace each vowel with an asterisk.

The interesting part of this function is the following lines:

        var buffOut = context.channel.allocator.buffer(capacity: result.count )
        buffOut.writeString(result)

We are going to pass a ByteBuffer to the next handler, so we need to create a ByteBuffer. The way to create it is by using the channel’s allocator. You can read the documentation here. Once we have the ByteBuffer, we write a string to it using the function writeString. Now we can pass that ByteBuffer to the next handler by calling fireChannelRead with the ByteBuffer wrapped in a NIOAny by wrapInboundOut.
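
To get a feel for ByteBuffer on its own, here is a short sketch that uses a standalone ByteBufferAllocator instead of the channel’s allocator (an assumption made so the snippet runs without a channel). It shows the difference between the get methods, which peek at the bytes, and the read methods, which consume them.

```swift
import NIO

// Inside a handler you would use context.channel.allocator instead.
let allocator = ByteBufferAllocator()
var buffer = allocator.buffer(capacity: 8)

// Writing moves the writer index forward.
buffer.writeString("H*LL*")
let readable = buffer.readableBytes

// get methods peek at the bytes without consuming them.
let peeked = buffer.getString(at: buffer.readerIndex, length: readable)
let stillReadable = buffer.readableBytes

// read methods consume the bytes and move the reader index forward.
let consumed = buffer.readString(length: 3)
let remaining = buffer.readableBytes

print(readable, stillReadable, remaining)   // 5 5 2
print(peeked ?? "", consumed ?? "")         // H*LL* H*L
```

Our handlers use getString because they only need to look at the incoming bytes. The capacity passed to the allocator is a starting size, and the buffer grows if we write more than that.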

Implementing `ColourHandler`

The next handler is more of the same. Create the file ColourHandler.swift inside the Sources/niots/ directory and add the following content:

import NIO

final class ColourHandler: ChannelInboundHandler {
    public typealias InboundIn = ByteBuffer
    public typealias InboundOut = ByteBuffer

    public func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        let inBuff = self.unwrapInboundIn(data)
        let str = inBuff.getString(at: 0, length: inBuff.readableBytes) ?? ""

        // Wrap the text in escape sequences: green on, then reset
        let result = "\u{1B}[32m\(str)\u{1B}[0m"

        var buff = context.channel.allocator.buffer(capacity: result.count )
        buff.writeString(result)

        // write only queues the data on the channel; it is sent on flush
        context.write(self.wrapInboundOut(buff), promise: nil)
    }

    public func channelReadComplete(context: ChannelHandlerContext) {
        // Flush the queued writes so the reply actually reaches the socket
        context.flush()
    }
}

We surround the string we get from the previous handler with Escape Sequences so that it shows in green when displayed in a terminal, and we add that string to a ByteBuffer. Nothing new there. Notice that we don’t pass it to the next handler, because there is no next handler. Instead, we write back to the socket through the channel context, using the write function. In SwiftNIO, write only queues the data, so we also implement channelReadComplete and call flush there to send what was queued. The following is the signature of the write function:

public func write(_ data: NIOAny, promise: EventLoopPromise<Void>?)

We pass the data wrapped in a NIOAny type. We could also pass a promise if we wanted to be notified when the write completes. We don’t need that notification, so we pass nil. I encourage you to check the documentation for ChannelHandlerContext; it has a lot of useful information.
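
One detail worth seeing in isolation is that, in SwiftNIO, a write and a flush are separate steps: write queues the data, and flush sends what has been queued. The following sketch tries to make that visible with EmbeddedChannel, SwiftNIO’s in-memory test channel. It reflects my understanding of recent SwiftNIO 2 releases, where I would expect nothing to be readable on the outbound side until flush is called, so treat it as an experiment to run rather than a guarantee.

```swift
import NIO

let channel = EmbeddedChannel()

var reply = channel.allocator.buffer(capacity: 5)
reply.writeString("H*LL*")

// write only queues the data on the channel...
channel.write(NIOAny(reply), promise: nil)
let beforeFlush = try channel.readOutbound(as: ByteBuffer.self)

// ...and flush is what hands the queued data over to be sent.
channel.flush()
let afterFlush = try channel.readOutbound(as: ByteBuffer.self)

print("readable before flush:", beforeFlush?.readableBytes ?? 0)
print("readable after flush:", afterFlush?.readableBytes ?? 0)

_ = try channel.finish()
```

If you prefer a single call, the context also offers writeAndFlush, which does both steps at once.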

Running the Server

And that’s all the code we need to build and run our server. Let’s do that:

$ swift run

You should see the following message on screen:

Server started and listening on [IPv6]::1/::1:8888

Now, how do we connect to the server? We could use the rdncat application we created using Apple’s Network framework (you can read the post here: Building a server-client application using Apple’s Network framework). Or, if you don’t have it, just use ncat(1).

Open a new shell and connect to the server using ncat(1) or rdncat:

$ ncat ::1 8888

Type a message and press enter. You should see the echo back with the text modified.

And that’s it. You can close both applications by pressing <CTRL-C>.

There you have it: a simple SwiftNIO server application.

Final Thoughts

As you can see, after a straightforward setup, we can just focus on the logic of our network applications. And that is what makes SwiftNIO such a good option for building network applications. It makes the setup straightforward, but it also allows us to add customisations at every step. We can create our own Bootstrap objects to create the sockets just how we need them, or build our own pipeline logic.

The idea of the Channel pipeline is one that is easy to overlook in simple examples like this. But imagine how easy it is to add a new handler to the pipeline to add functionality without having to rewrite any of the other handlers. For example, suppose we would like to add authentication capabilities to our server. In that case, we only need to build a handler that validates that a user and password are supplied and that they are correct. Depending on the validation result, we could either move to the next handler in the pipeline or return an authentication error.
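
As a rough sketch of that idea, the following hypothetical handler lets nothing through until the client’s first message matches a shared secret. SharedSecretHandler, the secret, and the helper makeBuffer are all made up for illustration, and a single hard-coded secret is a deliberate simplification of a real user and password check. It is exercised with EmbeddedChannel, SwiftNIO’s in-memory test channel.

```swift
import Foundation
import NIO

// Hypothetical handler: forwards data only after the first message
// matches a shared secret; otherwise it closes the connection.
final class SharedSecretHandler: ChannelInboundHandler {
    typealias InboundIn = ByteBuffer
    typealias InboundOut = ByteBuffer

    private let secret: String
    private var authenticated = false

    init(secret: String) {
        self.secret = secret
    }

    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        if authenticated {
            // Already validated: move on to the next handler untouched.
            context.fireChannelRead(data)
            return
        }
        let buffer = self.unwrapInboundIn(data)
        let line = buffer.getString(at: buffer.readerIndex, length: buffer.readableBytes) ?? ""
        if line.trimmingCharacters(in: .whitespacesAndNewlines) == secret {
            authenticated = true
        } else {
            // A real server would probably send an error message first.
            context.close(promise: nil)
        }
    }
}

let channel = EmbeddedChannel(handler: SharedSecretHandler(secret: "swordfish"))

func makeBuffer(_ text: String) -> ByteBuffer {
    var buffer = channel.allocator.buffer(capacity: text.utf8.count)
    buffer.writeString(text)
    return buffer
}

// The first message is consumed by the handler and never forwarded.
_ = try channel.writeInbound(makeBuffer("swordfish\n"))
let swallowed = try channel.readInbound(as: ByteBuffer.self)

// Later messages pass straight through to the next handler.
_ = try channel.writeInbound(makeBuffer("hello"))
let forwarded = try channel.readInbound(as: ByteBuffer.self)

print(swallowed == nil)                                    // true
print(forwarded?.getString(at: 0, length: 5) ?? "")        // hello

_ = try channel.finish()
```

In our server, a handler like this would sit right after BackPressureHandler in the array passed to addHandlers, and none of the other handlers would need to change.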

Adding new functionality is as easy as adding a new handler. If you need support for TLS, add a handler that supports that. For more handler examples, check the handlers provided by SwiftNIO. The only complexity (and not that complex IMO) might be working with Futures and Promises, but that is a small price to pay for all the flexibility you gain.

Ok, that’s it for this post. It’s time for you to go build some cool network applications.