Multitasking in Nim
3rd September 2021 - Guide , Nim , Programming
This is a series of articles about a part of the Nim ecosystem that has so far had fairly little written about it: threading, asynchronous operation, and communication.
To begin we first need to have a good grasp of the concepts we're going to discuss. There is quite a bit of confusion around all the terms in this space, so let's start with some simple definitions. To put them in context, let's first consider how a modern computer works; it's time for some computer science 101.
Modern computers work by having hardware like disks, network cards (now typically built into the motherboard), graphics cards, etc. connected to the CPU. These communicate through interrupts which, as the name implies, interrupt the execution of the CPU to let it handle data from a device. This is done because the hardware is typically many orders of magnitude slower than the CPU itself, and it avoids superfluous polling. Within the CPU there are one or more physical cores. Each physical core can only do one thing at a time, but even on single-core machines we can run multiple threads. A thread is one strand of execution, typically an entire program or a specific part of a program. One or more threads belong to the same process, which is what is created when you run an executable.
Now with that out of the way, let's consider the different concepts of concurrency, parallelism, asynchronous operation, and communication. Even on machines with only a single physical core it's possible to run multiple programs at the same time. The way this works is not by actually having the programs execute at the same time, but rather by switching between them really fast, which creates the illusion of simultaneous execution. This is what is known as concurrency. While it doesn't offer any performance benefit, it allows us to do things like keeping a GUI responsive while our program performs some calculations in the background.
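To make that a bit more concrete, here is a minimal sketch of cooperative concurrency in Nim using the asyncdispatch module: two tasks take turns on a single thread, and their output interleaves even though nothing runs in parallel. The names worker and main are just placeholders for this illustration.

```nim
import std/asyncdispatch

proc worker(name: string) {.async.} =
  for step in 1 .. 3:
    echo name, " step ", step
    await sleepAsync(100)  # give up the thread so the other task can run

proc main() {.async.} =
  # Both tasks are started before either one has finished...
  let a = worker("A")
  let b = worker("B")
  # ...and they complete interleaved with each other.
  await a
  await b

waitFor main()
```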
On the other hand we have parallelism, which is only available when the CPU has more than one physical core (or, to some extent, with hyper-threading and similar architectures). As with concurrency the execution is split into threads, but instead of being switched between really quickly the threads can now actually run at the same time. Modern OSes of course use both of these in conjunction, so your threads can either run on the same core or on two different cores. This means we can use them to actually gain performance in our programs.
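As a rough sketch of what this looks like in Nim (compiled with --threads:on; the proc and file names here are just illustrative), the following starts two OS threads that can genuinely run at the same time on a multi-core machine:

```nim
# compile with: nim c --threads:on -r parallel.nim
import std/os

proc work(id: int) {.thread.} =
  # Each call runs on its own OS thread, so on a multi-core CPU the two
  # loops below can execute truly in parallel rather than interleaved.
  for step in 1 .. 3:
    echo "thread ", id, ": step ", step
    sleep(100)

var threads: array[2, Thread[int]]
for i in 0 .. threads.high:
  createThread(threads[i], work, i)
joinThreads(threads)
```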
In addition to this we have asynchronous operation. Since file and network operations are so extremely slow compared to the CPU we can use asynchronous versions of the regular file and network procedures. These poll the OS for whether the operation has completed and allow us to do something else while we wait. This can be used to increase performance, but only for I/O-bound problems.
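To sketch what this might look like in practice, here is a small example using the asyncdispatch and httpclient modules; the URLs are only placeholders, and the point is that both requests are in flight at once while the single thread stays free to do other work:

```nim
# compile with: nim c -d:ssl -r fetch.nim
import std/[asyncdispatch, httpclient]

proc fetch(url: string): Future[string] {.async.} =
  let client = newAsyncHttpClient()
  try:
    # `await` suspends this procedure until the OS reports that data is
    # ready, letting the dispatcher run other tasks in the meantime.
    result = await client.getContent(url)
  finally:
    client.close()

proc main() {.async.} =
  # Both requests are started before either one has finished.
  let first = fetch("https://example.com")
  let second = fetch("https://nim-lang.org")
  echo "first response: ", (await first).len, " bytes"
  echo "second response: ", (await second).len, " bytes"

waitFor main()
```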
When working with multiple threads of execution that run in parallel we face some additional issues that we don't normally have in sequential programs. All of these stem from the fact that we are now sharing our resources and memory. The most obvious issues are memory related: we can pass pointers to memory between threads, but if we're not careful two threads can end up reading and writing the same memory at the same time, a data race which creates weird and hard-to-reproduce bugs. This is typically solved by using locks, atomic operations, or special data structures that are guaranteed to be safe in this context.
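As a small sketch of the lock-based approach (using std/locks; the counter example is just an illustration, not something from the articles that follow), two threads increment the same global counter and the lock makes sure they don't step on each other:

```nim
# compile with: nim c --threads:on -r counter.nim
import std/locks

var
  counterLock: Lock
  counter = 0

proc increment() {.thread.} =
  for _ in 1 .. 100_000:
    withLock counterLock:
      inc counter  # only one thread at a time gets past the lock

initLock(counterLock)
var threads: array[2, Thread[void]]
for t in threads.mitems:
  createThread(t, increment)
joinThreads(threads)
echo counter  # reliably 200000; without the lock the result varies between runs
deinitLock(counterLock)
```

Atomic operations from std/atomics would work just as well for a plain counter like this; locks become more interesting once the shared state is something bigger than a single integer.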
Now that we have that out of the way it's time to have a look at how we can do these things in Nim. To keep this as readable as possible I've split these topics into multiple articles, possibly with more to follow in the future:
- Asynchronous execution
- Multi-threading
- Communication
NOTE: This is an article series that I've had lying around on my hard drive for quite a while now, as I haven't gotten further than this primer and the asynchronous article. I wanted to write at least the multi-threading one before publishing any of them in order to make the articles more cohesive. But questions about async keep cropping up, and I've decided that having an article on async is better than having none of the articles in the series. This is why I'm publishing this now. Hopefully I will find time to write the threading and communication articles soon enough.