Multithreading with pthreads in Swift Oct 29 2019

Long gone are the days when a single process with a single thread was the norm. We now take multithreading for granted: we expect our applications not to lock up when we interact with them, to handle multiple users at the same time, and so on. In this post, I'll explain what multithreading is and how to use threads (using pthreads) in Swift.

Before someone tells me that GCD (Grand Central Dispatch) was introduced in 2009: I know. This post is aimed at anyone who wants to understand how multithreading works using threads. To use threads directly in Swift, we need to interoperate with C code and go down to using pthreads directly. We'll start with the basics of threads and then write a Server simulator using threads.

Ok, let's begin by understanding what a process is and where multithreading comes into play.

**Note: You can find the full code on the GitHub Repository

Processes and execution threads

When we think of a process, we think of a program being executed. In reality, a process is an abstract entity that encapsulates control structures and at least one execution thread. So the process is not the executing code. The process is the structure that contains it.

When we want to execute a program, we ask our Operating System to do it for us, typically through one of the exec(3) family of functions, which specify the file we want to execute. The operating system sets up the process and starts executing it.

As we said before the process is a structure that represents executable code. A process keeps track of:

In a single-threaded process, the process keeps track of only one set of control structures. In a process with multiple threads, each thread needs its own control structures.

Let's see all the combinations:

Single Process Single Thread

+----------------------+
|  One Process         |
|  One Thread          |
|   ~~~~~~~~~          |
|   Resources          |
|                      |
|   Control Structures |
|                      |
|   Code               |
+----------------------+


Multiple Processes Single Thread

+----------------------+   +---------------------+  +---------------------+
|  Process 1           |   | Process 2           |  | Process n           |
|  One Thread          |   | One Thread          |  | One Thread          |
|   ~~~~~~~~~          |   |  ~~~~~~~~~          |  |  ~~~~~~~~~          |
|   Resources          |   |  Resources          |  |  Resources          |
|                      |   |                     |  |                     |
|   Control Structures |   |  Control Structures |  |  Control Structures |
|                      |   |                     |  |                     |
|   Code               |   |  Code               |  |  Code               |
+----------------------+   +---------------------+  +---------------------+


Single Process Multiple Threads

+-------------------------------------------------------------------------+
| Single Process                                                          |
|                                                                         |
| Resources                                                               |
|                                                                         |
| Code                                                                    |
|                                                                         |
|  +--------------------+ +--------------------+  +--------------------+  |
|  |     Thread 1       | |     Thread 2       |  |     Thread n       |  |
|  |     ~~~~~~~~~      | |     ~~~~~~~~~      |  |     ~~~~~~~~~      |  |
|  | Control Structures | | Control Structures |  | Control Structures |  |
|  +--------------------+ +--------------------+  +--------------------+  |
|                                                                         |
+-------------------------------------------------------------------------+


Multiple Processes Multiple Threads

+-----------------------------------+ +-----------------------------------+
| Process 1  +--------------------+ | | Process 2  +--------------------+ |
|            |     Thread 1       | | |            |     Thread 1       | |
| Resources  |     ~~~~~~~~~      | | | Resources  |     ~~~~~~~~~      | |
|            | Control Structures | | |            | Control Structures | |
| Code       +--------------------+ | | Code       +--------------------+ |
|            +--------------------+ | |            +--------------------+ |
|            |     Thread 2       | | |            |     Thread 2       | |
|            |     ~~~~~~~~~      | | |            |     ~~~~~~~~~      | |
|            | Control Structures | | |            | Control Structures | |
|            +--------------------+ | |            +--------------------+ |
|            +--------------------+ | |            +--------------------+ |
|            |     Thread n       | | |            |     Thread n       | |
|            |     ~~~~~~~~~      | | |            |     ~~~~~~~~~      | |
|            | Control Structures | | |            | Control Structures | |
|            +--------------------+ | |            +--------------------+ |
+-----------------------------------+ +-----------------------------------+

+-----------------------------------+ +-----------------------------------+
| Process 3  +--------------------+ | | Process n  +--------------------+ |
|            |     Thread 1       | | |            |     Thread 1       | |
| Resources  |     ~~~~~~~~~      | | | Resources  |     ~~~~~~~~~      | |
|            | Control Structures | | |            | Control Structures | |
| Code       +--------------------+ | | Code       +--------------------+ |
|            +--------------------+ | |            +--------------------+ |
|            |     Thread 2       | | |            |     Thread 2       | |
|            |     ~~~~~~~~~      | | |            |     ~~~~~~~~~      | |
|            | Control Structures | | |            | Control Structures | |
|            +--------------------+ | |            +--------------------+ |
|            +--------------------+ | |            +--------------------+ |
|            |     Thread n       | | |            |     Thread n       | |
|            |     ~~~~~~~~~      | | |            |     ~~~~~~~~~      | |
|            | Control Structures | | |            | Control Structures | |
|            +--------------------+ | |            +--------------------+ |
+-----------------------------------+ +-----------------------------------+

Threads have the advantage that they are lightweight compared to Processes. Spawning a new thread and stopping a thread takes less time than creating and stopping a full Process. So we can spawn threads to do additional work with little overhead.

Let's see an example of how to use threads in Swift, more specifically pthreads.

Creating a multithreading program in Swift

Back in the day, every platform had its own implementation of threads. To improve compatibility between platforms, the POSIX Threads execution model was defined. Many POSIX-conformant Operating Systems provide an implementation of pthreads, and macOS is one of them.

We are going to create a program that simulates a server/client scenario. The server has a queue of clients to process, and processing each client takes a set amount of time. We are going to explore how to improve the processing time of the clients by parallelising the work using threads.

I'll use Swift Package Manager to create the application. Let's start by creating the directory rdknitting (Threads... Knitting.. get it?) and initialising the Swift package.

$ mkdir rdknitting
$ cd rdknitting
$ swift package init --type executable

As we discussed previously, we are going to be using a Queue, so let's start there. Create a file in your Sources/rdknitting/ directory with the name Queue.swift and add the following content:

struct Queue<T> {
    var clients = [T]()

    mutating func enqueue(_ client: T) {
        clients.append(client)
    }

    mutating func dequeue() -> T? {
        guard !clients.isEmpty else { return nil }
        return clients.removeFirst()
    }

    func peek() -> T? {
        guard !clients.isEmpty else { return nil }
        return clients[0]
    }

    func count() -> Int {
        return clients.count
    }

    func isEmpty() -> Bool {
        return clients.isEmpty
    }
}

Alright, that's our implementation of a Queue structure. We'll use our Queue to handle our clients.
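To see the first-in, first-out behaviour on its own, here is a small standalone sketch. It repeats a trimmed copy of the Queue above so that it compiles by itself, and it uses plain Int values instead of clients; the variable names are made up for the illustration.

```swift
// Trimmed copy of the Queue above, so the snippet compiles on its own.
struct Queue<T> {
    var clients = [T]()

    mutating func enqueue(_ client: T) {
        clients.append(client)
    }

    mutating func dequeue() -> T? {
        guard !clients.isEmpty else { return nil }
        return clients.removeFirst()
    }
}

var queue = Queue<Int>()
for id in 1...3 {
    queue.enqueue(id)
}

// Drain the queue; elements come out in the order they went in.
var served = [Int]()
while let next = queue.dequeue() {
    served.append(next)
}
print(served)  // prints "[1, 2, 3]"
```

The `while let` loop is the same shape the Server will use: dequeue returns nil once the queue is empty, which ends the loop.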

Let us now create a structure that will represent our Clients. Create a new file inside Sources/rdknitting/ directory, with the name Client.swift:

struct Client {
  var id: Int
}

Now let's create the Server logic. Create a new file in Sources/rdknitting/, give it the name Server.swift and add the following content:

import Foundation

class Server {

  var clients: Queue<Client>

  init(numberOfClients: Int) {
    clients = Queue<Client>()
    for clientId in 1...numberOfClients {
      clients.enqueue(Client(id: clientId))
    }
  } 

  func start() {
    while let client = clients.dequeue() {
      print("client to process:\(client)")
      processClient(client: client, thread: "main")
    }
  }

  func processClient(client: Client, thread: String) {
    print("Processing client: \(client) in Thread: \(thread)")
    sleep(1)
    print("Done processing client: \(client) in Thread: \(thread)")
  }
}

We initialise our Server with the number of clients that the server will begin with. Then we declare the start method that will begin the processing of clients. The processClient function emulates slow processing of the client by sleeping for one second.

We can now put all the parts together in our main.swift. Our main instantiates our server and starts it. Here is the code for main.swift:

// Foundation provides CFAbsoluteTimeGetCurrent, used below for timing
import Foundation

print("Welcome to our Server simulator")

let server = Server(numberOfClients: 10)
let startTime = CFAbsoluteTimeGetCurrent()
server.start()
let endTime = CFAbsoluteTimeGetCurrent()
print("Duration:\(endTime - startTime) seconds")

Now we can build and run the code to test how long it takes for our single thread Server.

$ swift run

You'll see the Server running and display at the end how long it takes to serve all the clients, in my case:

Duration:10.038536071777344 seconds

Now that our server is running in a single thread, we can improve it and make it multithreaded.

Using pthreads in Swift

We are going to use the system's pthread API, which means we will need to interact with C code. If you are not familiar with Swift and C interoperability, check my post on using BSD Sockets in Swift, especially the first section, "C Interoperability", for a refresher on the topic.

To create a thread, we are going to use the pthread_create function. The C signature is as follows:

int pthread_create (pthread_t *threadID, const pthread_attr_t *attr,
    void *(*startRoutine)(void *), void *arg);

The Swift equivalent is:

public func pthread_create(_: UnsafeMutablePointer<pthread_t?>!, _: UnsafePointer<pthread_attr_t>?, _: @convention(c) (UnsafeMutableRawPointer) -> UnsafeMutableRawPointer?, _: UnsafeMutableRawPointer?) -> Int32

The function receives the pointer to the pthread that will be created as the first argument. The second argument is the thread attributes (attr), which allow you to customise how your thread works. If you want to have a look at the possible settings, check the man page for pthread_attr(3). I suggest using the default settings by passing nil unless you know exactly what you are doing. The third argument is a pointer to the function that will be executed in the thread. The last argument is a pointer to the arguments we want to pass to the function.
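To make the four arguments concrete, here is a minimal sketch. It assumes macOS, where pthread_t is imported as an optional pointer (matching the Swift signature above). The function name computeOnThread and the "21 * 2" task are made up for the illustration.

```swift
import Foundation

/// Runs a tiny made-up task on a new thread and returns its result.
/// Assumes macOS, where pthread_create takes UnsafeMutablePointer<pthread_t?>.
func computeOnThread() -> Int? {
    // Memory for the worker to write into, handed over as the fourth argument.
    let slot = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    slot.initialize(to: 0)
    defer { slot.deallocate() }

    var thread: pthread_t? = nil
    // Second argument nil = default thread attributes.
    let status = pthread_create(&thread, nil, { arg in
        // The closure captures nothing; everything it needs arrives through `arg`.
        arg.assumingMemoryBound(to: Int.self).pointee = 21 * 2
        return nil
    }, slot)

    guard status == 0, let handle = thread else { return nil }

    // Wait for the worker; only after this is it safe to read `slot`.
    pthread_join(handle, nil)
    return slot.pointee
}

print(computeOnThread() ?? -1)  // prints "42"
```

The sketch already waits for the worker with pthread_join, so the result is read only once the thread has finished.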

One limitation is that the pointer to the function will only accept a global function or a closure that doesn't capture any context. With this limitation, you might ask, how can we pass some of our context to the function?

Well, the answer is the fourth parameter. We are going to create an object that will encapsulate the context we need to pass to the function.

Create a new file in the Sources/rdknitting/ directory and name it ThreadParameter.swift. We are only interested in an identifier for the thread (we'll create that struct next) and the server object. Here is the content of ThreadParameter.swift:

class ThreadParameter {

  var threadIdentifier: ThreadIdentifier
  var server: Server

  init(threadIdentifier: ThreadIdentifier, server: Server) {
    self.threadIdentifier = threadIdentifier
    self.server = server
  }
}

Our ThreadIdentifier is a simple struct that contains a String. Create a new file, ThreadIdentifier.swift, with the following content:

struct ThreadIdentifier {
  var id: String
}

Ok, now we can work on the function that we are going to run in our thread. This function is the one that we are going to pass as the third argument to pthread_create. Remember, we could have passed a closure, but I wanted to show you how to create a global function.

Let's create a global function. In the file main.swift, add the following function (that is, it is not contained in any type; the function lives in the global scope):

func threadedFunction(pointer: UnsafeMutableRawPointer) -> UnsafeMutableRawPointer? {
  // Read back the ThreadParameter reference that the caller stored behind the pointer
  let threadParameter = pointer.load(as: ThreadParameter.self)
  let server = threadParameter.server
  let threadIdentifier = threadParameter.threadIdentifier
  while let client = server.clients.dequeue() {
    server.processClient(client: client, thread: threadIdentifier.id)
  }
  return nil
}

The function runs until there are no more clients in the queue. If you look carefully at the code, you'll notice that we are using pointer.load to "unwrap" the pointer to our ThreadParameter object (which is what we expect to receive).
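A different way to hand a class instance through a raw pointer is Swift's Unmanaged type. This is not what the rest of the post uses; it is only a sketch of the alternative, assuming macOS, and the names Context and runWorker are made up for the illustration.

```swift
import Foundation

// Hypothetical context type standing in for ThreadParameter.
final class Context {
    let name: String
    var visited = false
    init(name: String) { self.name = name }
}

func runWorker(with context: Context) -> Bool {
    // passRetained adds one retain so the object stays alive while the thread owns the pointer.
    let opaque = Unmanaged.passRetained(context).toOpaque()

    var thread: pthread_t? = nil
    let status = pthread_create(&thread, nil, { arg in
        // takeRetainedValue balances the retain above and gives back a normal reference.
        let received = Unmanaged<Context>.fromOpaque(arg).takeRetainedValue()
        received.visited = true
        return nil
    }, opaque)

    guard status == 0, let handle = thread else {
        // The thread never started, so release the extra retain ourselves.
        Unmanaged<Context>.fromOpaque(opaque).release()
        return false
    }
    pthread_join(handle, nil)
    return true
}

let context = Context(name: "myThread")
let started = runWorker(with: context)
print(context.name, started, context.visited)  // prints "myThread true true"
```

The trade-off is that no manual allocation is needed, but every passRetained has to be matched by exactly one takeRetainedValue or release.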

Let's put the thread code inside one function in the Server object. Add the following function:

  func createThread() {
    var myThread: pthread_t? = nil

    let threadParameter = ThreadParameter(threadIdentifier: ThreadIdentifier(id: "myThread"), server: self)
    let pThreadParameter = UnsafeMutablePointer<ThreadParameter>.allocate(capacity: 1)
    // Freshly allocated memory is uninitialised, so use initialize(to:) instead of assigning to pointee
    pThreadParameter.initialize(to: threadParameter)
    let result = pthread_create(&myThread, nil, threadedFunction, pThreadParameter)

    if result != 0 {
      print("Error creating thread--")
      exit(EXIT_FAILURE)
    }
  }

This function is in charge of creating the thread (and here is where we pass our threadedFunction). Now let's update our start function:

  func start() {
    createThread()  
  }

If we run the program now, we won't be able to see the execution of the thread. The main thread will finish, and because there is nothing for it to do, it'll stop the whole process, including its threads. We need to use pthread_join to "join" our main thread and wait for the thread to finish. In our createThread function after the thread has been created add the following line:

    pthread_join(myThread!,nil)

The whole function will look like the following:

  func createThread() {
    var myThread: pthread_t? = nil

    let threadParameter = ThreadParameter(threadIdentifier: ThreadIdentifier(id: "myThread"), server: self)
    let pThreadParameter = UnsafeMutablePointer<ThreadParameter>.allocate(capacity: 1)
    // Freshly allocated memory is uninitialised, so use initialize(to:) instead of assigning to pointee
    pThreadParameter.initialize(to: threadParameter)
    let result = pthread_create(&myThread, nil, threadedFunction, pThreadParameter)

    if result != 0 {
      print("Error creating thread--")
      exit(EXIT_FAILURE)
    }
    pthread_join(myThread!,nil)
  }

Alright, now we can build and run our code:

$ swift run

You'll be able to see our Server processing all the clients in the thread!

Ok, let's make it multithreaded by adding more than one thread to our server.

We are going to use a variable to keep track of how many threads have been created, and we'll use that number as part of the thread identifier. Add the variable to our server class:

  var threadCounter: Int = 0

We need to update our createThread function to use the threadCounter as part of the thread identifier and increment it when the thread has been successfully created. The whole function will look like this:

  func createThread() {
    var myThread: pthread_t? = nil

    let threadParameter = ThreadParameter(threadIdentifier: ThreadIdentifier(id: "myThread-\(threadCounter)"), server: self)
    let pThreadParameter = UnsafeMutablePointer<ThreadParameter>.allocate(capacity: 1)
    // Freshly allocated memory is uninitialised, so use initialize(to:) instead of assigning to pointee
    pThreadParameter.initialize(to: threadParameter)
    let result = pthread_create(&myThread, nil, threadedFunction, pThreadParameter)

    if result != 0 {
      print("Error creating thread--")
      exit(EXIT_FAILURE)
    }
    threadCounter += 1
    pthread_join(myThread!,nil)
  }

And our start function will look like this (the default will be to use only one thread):

  func start(numberOfThreads: Int = 1) {
    for _ in 1...numberOfThreads {
      createThread()
    }
  }

We can build and run it:

$ swift run

If you look at the output, there seems to be a problem: only one thread appears to be running. Why is this?

Well, it is because in our createThread function we join the newly created thread from the main thread, so the main thread waits there until that thread finishes. That is not what we want. We want to spawn as many threads as we define and let them process all the clients. We'll have to create all the threads first and, once all of them are created, join them from our main thread.

We are going to use an array to keep track of the created threads.

  var threads: Array<pthread_t> = Array<pthread_t>()

Every time we create a new thread, we are going to add it to the array:

    threads.append(myThread!)

And in our start function, after the creation of all the threads, we'll join all of them:

  func start(numberOfThreads: Int = 1) {
    for _ in 1...numberOfThreads {
      createThread()
    }
    for thread in threads {
      pthread_join(thread,nil)
    }
  }
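The create-everything-then-join pattern can also be looked at in isolation. The following is only a sketch, assuming macOS; runWorkers is a made-up name, and each worker writes to its own slot of a buffer so that no locking is needed here.

```swift
import Foundation

/// Starts `count` threads, then joins them all. Each worker writes only to its
/// own slot, so the workers never touch the same memory.
func runWorkers(count: Int) -> [Int] {
    let slots = UnsafeMutablePointer<Int>.allocate(capacity: count)
    slots.initialize(repeating: 0, count: count)
    defer { slots.deallocate() }

    var threads = [pthread_t]()
    for index in 0..<count {
        var thread: pthread_t? = nil
        let status = pthread_create(&thread, nil, { arg in
            let slot = arg.assumingMemoryBound(to: Int.self)
            usleep(100_000)          // pretend to work for 0.1 s
            slot.pointee = 1
            return nil
        }, slots + index)
        if status == 0, let handle = thread {
            threads.append(handle)   // remember the handle; do not join yet
        }
    }

    // Only now wait, so the workers overlap instead of running one after another.
    for thread in threads {
        pthread_join(thread, nil)
    }
    return Array(UnsafeBufferPointer(start: slots, count: count))
}

print(runWorkers(count: 4))  // prints "[1, 1, 1, 1]"
```

If the join were moved inside the first loop, the result would be the same but the four 0.1 s waits would happen back to back.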

The complete Server.swift file will have the following content:

import Foundation


class Server {

  public var clients: Queue<Client>
  var threadCounter: Int = 0
  var threads: Array<pthread_t> = Array<pthread_t>()

  init(numberOfClients: Int) {
    clients = Queue<Client>()
    for clientId in 1...numberOfClients {
      clients.enqueue(Client(id: clientId))
    }
  } 

  func createThread() {
    var myThread: pthread_t? = nil

    let threadParameter = ThreadParameter(threadIdentifier: ThreadIdentifier(id: "myThread-\(threadCounter)"), server: self)
    let pThreadParameter = UnsafeMutablePointer<ThreadParameter>.allocate(capacity: 1)
    // Freshly allocated memory is uninitialised, so use initialize(to:) instead of assigning to pointee
    pThreadParameter.initialize(to: threadParameter)
    let result = pthread_create(&myThread, nil, threadedFunction, pThreadParameter)

    if result != 0 {
      print("Error creating thread--")
      exit(EXIT_FAILURE)
    }
    threadCounter += 1
    threads.append(myThread!)
  }

  func start(numberOfThreads: Int = 1) {
    for _ in 1...numberOfThreads {
      createThread()
    }
    for thread in threads {
      pthread_join(thread,nil)
    }
  }

  func processClient(client: Client, thread: String) {
    print("Processing client: \(client) in Thread: \(thread)")
    sleep(1)
    print("Done processing client: \(client) in Thread: \(thread)")
  }
}

And our main.swift will have the following content:

import Foundation

print("Welcome to our Server simulator")

func threadedFunction(pointer: UnsafeMutableRawPointer) -> UnsafeMutableRawPointer? {
  // Read back the ThreadParameter reference that createThread stored behind the pointer
  let threadParameter = pointer.load(as: ThreadParameter.self)
  let server = threadParameter.server
  let threadIdentifier = threadParameter.threadIdentifier
  while let client = server.clients.dequeue() {
    server.processClient(client: client, thread: threadIdentifier.id)
  }
  return nil
}

let server = Server(numberOfClients: 10)
let startTime = CFAbsoluteTimeGetCurrent()
server.start(numberOfThreads: 5)
let endTime = CFAbsoluteTimeGetCurrent()
print("Duration:\(endTime - startTime) seconds")

As you can see in main.swift, we are setting the number of threads to five. Run it and see how long it takes to serve ten clients.

In my case, the processing of clients took:

Duration:2.0020689964294434 seconds

That's an excellent speedup. We improved from the previous ten seconds for a single thread to two seconds. You can now play with different values for the number of clients and the number of threads.
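The numbers line up with a simple back-of-the-envelope estimate: the clients are served in "rounds" of one client per thread. The sketch below assumes every client takes the same time and ignores the cost of creating threads; estimatedSeconds is a made-up helper, not part of the project.

```swift
/// Rough wall-clock estimate: clients are handled in rounds of `threads` at a time.
/// Assumes equal work per client and ignores thread start-up overhead.
func estimatedSeconds(clients: Int, threads: Int, secondsPerClient: Int = 1) -> Int {
    precondition(threads > 0, "need at least one thread")
    let rounds = (clients + threads - 1) / threads   // ceiling division
    return rounds * secondsPerClient
}

print(estimatedSeconds(clients: 10, threads: 1))   // prints "10"
print(estimatedSeconds(clients: 10, threads: 5))   // prints "2"
print(estimatedSeconds(clients: 10, threads: 4))   // prints "3"
```

Note the last line: with four threads the tenth client forces a third round, so the gain is not always proportional to the number of threads.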

Final thoughts

Now you know the basics of how to use pthreads in Swift. You'll probably use Grand Central Dispatch to handle multithreading in any modern code, and it is easy to see why GCD would be easier to use. The use of pthreads and interoperability with C doesn't feel Swifty. We have to work around some limitations, like the context on the thread function. But still, I think it is a good idea to know how to interact with code that uses pthreads.

We didn't cover race conditions or synchronisation. In this demo each dequeue is tiny compared with the one-second sleep, so problems are unlikely to show up and the program appears to work. But nothing makes our access to the Queue atomic: several threads mutating the same array at the same time is a data race, and real code should guard the queue with a mechanism like a mutex/lock or a semaphore. I encourage you to read up on concurrency: it's a fun topic, and it's also the source of some bugs that are hard to replicate and debug.
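As a starting point for that research, here is a sketch of what guarding the queue with a pthread mutex could look like. It assumes macOS and is not part of the project above: LockedQueue, WorkerContext and drain are made-up names, and plain Int values stand in for clients.

```swift
import Foundation

/// Hypothetical wrapper: an array-backed queue whose dequeue is serialised by a mutex.
final class LockedQueue {
    private var items: [Int]
    // Heap-allocated so the mutex has a stable address for its whole lifetime.
    private let mutex = UnsafeMutablePointer<pthread_mutex_t>.allocate(capacity: 1)

    init(items: [Int]) {
        self.items = items
        mutex.initialize(to: pthread_mutex_t())
        pthread_mutex_init(mutex, nil)   // nil = default mutex attributes
    }

    deinit {
        pthread_mutex_destroy(mutex)
        mutex.deallocate()
    }

    func dequeue() -> Int? {
        pthread_mutex_lock(mutex)
        defer { pthread_mutex_unlock(mutex) }   // unlock on every exit path
        return items.isEmpty ? nil : items.removeFirst()
    }
}

/// Per-thread state: `taken` is touched only by its own thread and read after join.
final class WorkerContext {
    let queue: LockedQueue
    var taken = [Int]()
    init(queue: LockedQueue) { self.queue = queue }
}

func drain(_ queue: LockedQueue, threadCount: Int) -> [Int] {
    // Created up front and kept in this array, so they outlive the threads.
    let contexts = (0..<threadCount).map { _ in WorkerContext(queue: queue) }
    var threads = [pthread_t]()

    for context in contexts {
        var thread: pthread_t? = nil
        let status = pthread_create(&thread, nil, { arg in
            let worker = Unmanaged<WorkerContext>.fromOpaque(arg).takeUnretainedValue()
            while let item = worker.queue.dequeue() {
                worker.taken.append(item)
            }
            return nil
        }, Unmanaged.passUnretained(context).toOpaque())
        if status == 0, let handle = thread {
            threads.append(handle)
        }
    }
    for thread in threads {
        pthread_join(thread, nil)
    }
    return contexts.flatMap { $0.taken }.sorted()
}

let drained = drain(LockedQueue(items: Array(1...100)), threadCount: 4)
print(drained.count)  // prints "100"
```

Every item comes out exactly once, whichever thread happens to take it, because only one thread at a time can be inside dequeue.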

Ok, that's it for this post. I hope you find it useful.


Related topics/notes of interest

There are many interesting related topics and resources. I'll list a few that you might want to explore if you find this post interesting:


** If you want to check what else I'm currently doing, be sure to follow me on twitter @rderik or subscribe to the newsletter. If you want to send me a direct message, you can send it to derik@rderik.com.