Wrapping C libraries in Nim

17th April 2023 - Guide, Nim, Programming

As we discovered in my last article, Nim will by default generate C code and then call on a C compiler to actually produce a binary. This might seem like an odd choice, especially in the age of LLVM. However, it’s actually not uncommon for languages to compile or transpile into another language. Initially the choice to not use LLVM was simply because it wasn’t as mature back when Nim was created, but going through C also has some quite substantial benefits. For example, you’re able to run Nim anywhere you can run C, which means you can run it pretty much anywhere: from the tiniest microcontrollers, to mobile apps, to normal desktop/server targets¹. There is another benefit though, which this article focuses on, namely the ability to super easily interface with C libraries or programs. This is what’s called a foreign function interface, or FFI, and just means we’re calling functions from another language, in this case Nim from C, or C from Nim. Of course the Nim language has quite a few extra features and systems compared to C, so when using a library written in C we often want to do a little bit of translation work in order to make the interface a bit more familiar to Nim users², especially when it comes to things like memory management and async. In this article I’ll outline how this works in practice, and show some examples from wrapping libcoap, a C library for the Constrained Application Protocol (CoAP), in Nim.

Telling Nim what’s available

Making use of a C library in Nim is pretty easy: all you need to do is tell Nim the signature of a procedure, then tell it that this is something that exists in C, and you’re good to go. Take a very simple example like this:

proc hello(x: string) {.importc.}

hello("world")

This compiles just fine from Nim’s point of view, and will generate C code which calls the procedure hello with a Nim string object. Of course if this procedure doesn’t exist anywhere in the C standard library, or in any library we import through switches, this will create an undefined reference to 'hello' error from the C compiler³. You can further tell Nim which header this procedure comes from, or which dynamic library to link against to get it, with additional pragmas. This will cause Nim to automatically import or link to these when the procedure is used. When wrapping a C library you likely want to use the c prefixed types in Nim. For example cint, cuint, clong, and cfloat, which all map to the corresponding C type without the prefix, as well as cstring (a char pointer) and csize_t (size_t). You will also have to get used to using a lot of ptr types. Type signatures and objects have to match exactly between C and Nim, but you can use Nim’s much better type system to improve on the basic signatures. For example a C procedure which should take an int, where the special numbers you are allowed to pass are defined using #define, will happily accept an enum with cint sized members. This means that instead of accepting any integer you can now only pass the valid options. Similarly, if your C procedure takes (or returns) a pointer to the start of an array you could use a ptr UncheckedArray[T] in Nim. This would allow you to use the value in loops and index lookups. It is generally also helpful to use distinct types to make the entire library more type safe.
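To make the points above a bit more concrete, here is a minimal sketch that only relies on the C standard library plus a tiny C snippet embedded with the emit pragma. strlen, calloc and free are the real libc functions. The names cCalloc, cFree, Mode, modeCost and the C function mode_cost are made up for illustration and don't come from any real library.

```nim
# strlen is real libc; the header pragma makes Nim emit `#include <string.h>`.
proc strlen(s: cstring): csize_t {.importc, header: "<string.h>".}

# calloc/free imported under different Nim names by giving the C name explicitly.
proc cCalloc(count, size: csize_t): pointer {.importc: "calloc", header: "<stdlib.h>".}
proc cFree(p: pointer) {.importc: "free", header: "<stdlib.h>".}

# Hypothetical C API, embedded here only so the sketch is self-contained.
{.emit: """
#define MODE_FAST 0
#define MODE_SAFE 1
static int mode_cost(int mode) { return mode == MODE_FAST ? 1 : 10; }
""".}

type
  # An enum the size of a C int replaces the loose #define constants.
  Mode {.size: sizeof(cint).} = enum
    modeFast = 0, modeSafe = 1

# nodecl: the C definition already exists above, so Nim must not declare it again.
proc modeCost(mode: Mode): cint {.importc: "mode_cost", nodecl.}

proc squares(n: int): seq[int] =
  # A C array is viewed through ptr UncheckedArray so it can be indexed and looped over.
  let raw = cast[ptr UncheckedArray[cint]](cCalloc(csize_t(n), csize_t(sizeof(cint))))
  if raw.isNil:
    return
  for i in 0 ..< n:
    raw[i] = cint(i * i)
  for i in 0 ..< n:
    result.add int(raw[i])
  cFree(raw)

echo strlen("world")
echo modeCost(modeSafe)
echo squares(4)
```

With the enum in place, modeCost(3) no longer compiles, whereas the C version would happily accept any integer.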

Writing all these definitions by hand might work well if you only have a handful of procedures you need. But if you share your code with someone else who might want to use different procedures, they have to wrap them themselves. Another issue is that Nim will 100% trust you when you tell it what exists in C. So if you accidentally or purposefully give Nim incorrect information you will end up either with C compiler errors, or even weird runtime behaviour and crashes. For example if a procedure takes a pointer to a specific object, and you accidentally write the wrong object in your definition, neither the Nim nor the C compiler will pick up on this. This means that you end up with undefined behaviour which can cause crashes and hard to debug issues. And of course if the library you’re wrapping changes you will have to go through and painstakingly make sure that every piece of your wrapper is still correct. If you don’t you will end up back at square one.

Outsourcing the job

So instead of writing all these definitions by hand it is a good idea to let the computer do these conversions for us. After all, computers are great at doing repetitive tasks with high precision.

Since the early days of Nim there has existed a tool called c2nim which aims at converting C files, and in particular header files, into Nim. However this tool is pretty lacking: it doesn’t handle imports, it doesn’t properly work with defines, and it is a bit limited in what it is able to parse. In addition to this the official documentation clearly states that the output is intended to be tweaked by hand after the translation. This might sound like a good idea, after all sometimes we might be able to do a better job. But in reality it means that if the C library has an update (or you decide to throw in another define) you have to remember to repeat all the tweaks. This is similar to the problem with manually writing wrappers.

Another tool for creating wrappers is nimterop. It uses a more robust tree-sitter based parser, and aims to make the generation automatic. However it still doesn’t have a great grasp of C defines, and while it’s meant as a tool to generate bindings it still recommends that users run it as part of a pre-compile step. It also hasn’t received an update in about two years, so whether the project is still being actively maintained is uncertain.

Enter Futhark

After having tried both the manual way and the two automated versions for wrapping the rather large⁴ Unbound codebase, I grew tired. I spent about a week on each of c2nim and nimterop trying to get them to work, but ultimately gave up. Then I wrapped the parts I needed by hand. This was tedious work, but at least I managed to get it working in the end. The project essentially involved writing a dynamic library in Nim that Unbound would load. Unbound wasn’t originally written with dynamic library support (something I added myself in a PR to the project), so the API wasn’t exactly well defined and stable. Unbound also updates fairly regularly, and I didn’t want to use an outdated version in production. This led to Unbound slowly drifting away from the binding, and weird errors started to crop up. Going through all the bindings, tiny alignment errors and such started to show themselves. Clearly, making manual bindings wasn’t the way either.

But there had to be a better way, and that’s why I wrote Futhark. Because you know what’s really good at understanding C code? A C compiler! The idea came to me while digging up details on how C interop works in Zig. It doesn’t compile to C, but is still able to import C headers and use things from the language. Zig is an LLVM-based language, and it turns out that Clang (the C language frontend for LLVM) has a library which allows programmatically getting all the procedures, structs, variables, and other information. The idea behind Futhark was then born: use this library to parse the C headers into something useful, then let a Nim macro generate the code required for us! Apart from the general difficulty of using the aforementioned tools, Futhark was built to overcome a couple of gripes I had with both nimterop and c2nim. First and foremost, it should just work. Secondly, if it didn’t work it should be trivial to manually provide a workaround. Thirdly, it should be as transparent as possible to use; preferably the user shouldn’t really think much about the fact that a translation was going on at all. And last but not least, the bindings should be so easy to use afterwards that you should be able to follow along with a C tutorial and just port the syntax to Nim. A tall order for sure, but apart from a few snags I believe I’ve managed to hit all my goals.

Using Futhark

So, now that we’ve found our tool it’s time to wield it. The basics of Futhark are fairly simple, after all that was one of the design goals. In this article I’ll show how I wrapped libcoap, from the very basics of using the C library in Nim to building a more complex binding.

Simple bindings

To start out, let’s look at how libcoap is used in C: simply include the coap.h file and then use -lcoap-3 during compilation to dynamically link to libcoap-3.so. The process is similar in Nim with Futhark:

import futhark

importc:
  path "/usr/include/coap3"
  "coap.h"

This takes a short while to process (3.5 seconds exactly on my machine⁵) and prints out some useful hints in our compilation log. It is shortened here for brevity, but it looks a bit like this:

Hint: Running: opir -I/usr/include/coap3 -I/usr/lib/clang/15.0.7/include /home/peter/.cache/nim/coaptest_d/futhark-includes.h [User]
Hint: Parsing Opir output [User]
Hint: Caching Opir output in /home/peter/.cache/nim/coaptest_d/opir_FB377077491B2805.json [User]
Hint: Generating Futhark output [User]
Hint: Renaming "addr" to "addrfield" in structcoapaddresst [User]
Hint: Renaming "addr" to "addrarg" [User]
Hint: Renaming "type" to "typefield" in structcoaptlsversiont [User]
Hint: Renaming "type" to "typearg" [User]
Hint: Renaming "block" to "blockarg" [User]
Hint: Renaming "method" to "methodarg" [User]
Hint: Caching Futhark output in /home/peter/.cache/nim/coaptest_d/futhark_6BEC84F620D201F5.nim [User]
Hint: Declaration of stderr already exists, not redeclaring [User]
Hint: Declaration of close already exists, not redeclaring [User]
Hint: Declaration of stdout already exists, not redeclaring [User]

This tells us a couple of things. First it invokes the helper tool Øpir⁶, given the path we defined along with the system path and a dummy header file. This creates a JSON file (of about 38k lines pretty printed) in our cache directory which Futhark reads. Then Futhark renames a couple of things which would conflict with built-in names. What they are renamed to depends on the context, and as we can see here fields in structures get a field postfix, and arguments to procedures get an arg postfix. You will also notice that everything is lowercase and has no underscores. This is simply for internal purposes, and since Nim is style insensitive you can choose if you want to use e.g. coap_address_t or coapAddressT in your code. You might wonder why the structs have a struct prefix. This is simply to be compatible with how C can require the struct keyword, but most libraries typedef their structs to the same name without the struct part. This typedef is of course also translated by Futhark. After everything is converted we see Futhark storing its output, which for this version of the library is 4.8k lines long, into the cache directory. The reason why this file is so large is hinted at in the next three lines: “Declaration of X already exists, not redeclaring” comes from a check added around every identifier to ensure that the wrapper doesn’t try to override any of Nim’s existing types, or procedures and objects we might have manually defined ourselves. These checks make the output a bit hard to read, so Futhark also has the option of not adding them, but this is more likely to cause issues. And since design goal #1 was “it should just work”, everything is guarded this way by default.
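The guarding and the style insensitivity can both be illustrated with plain Nim. The following is only a sketch of the idea and not Futhark’s literal output; the struct here is a one-field stand-in, and the hint texts merely imitate the log above.

```nim
# Sketch of the idea only; this is not Futhark's literal output.
type
  structcoapaddresst = object   # stand-in for a generated struct type
    size: cint

# Only declare the typedef'd name if nothing with that name exists yet.
when not declared(coapaddresst):
  type coapaddresst = structcoapaddresst
else:
  {.hint: "Declaration of coapaddresst already exists, not redeclaring".}

# cint already exists in Nim's system module, so this takes the else branch.
when not declared(cint):
  type cint = int32
else:
  {.hint: "Declaration of cint already exists, not redeclaring".}

# Style insensitivity: all of these spellings name the same type.
var a: coapaddresst
var b: coap_address_t
var c: coapAddressT
a.size = 1
b = a
c = b
echo c.size
```

Note that only the first character of an identifier is compared case sensitively in Nim, which is why an all-lowercase name works as the canonical form.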

Basic usage

Now that we have our wrapper built we can write some CoAP code! Just a short note on what CoAP actually is before we get going though. Essentially it is a simplified version of HTTP for constrained devices (in terms of memory, CPU, program storage, etc). Not sure how common it is, but I needed it for interfacing with an IKEA smart light gateway. Let’s start with just trying to call something from the coap library, adding this in after our importc block:

var context = coapNewContext(nil)

Et voilà, our first call from Nim into a C library is done! Exciting stuff, but not very useful yet. This simply allocates and instantiates a new coap_context_t and returns its pointer. We’ve also got our first memory leak, and if we try to compile our code we’re met with a rather nasty error from C saying that the reference to coap_new_context is undefined. What gives? Futhark was supposed to be easy, right? The problem here is what I mentioned earlier: Nim trusts us completely when we say that things exist in C. So it happily generated a call to coap_new_context for us. But since we never actually told Nim to link against the dynamic library (remember that -lcoap-3 switch we talked about?) the C compiler doesn’t know what we’re talking about. We’ve basically lied to the Nim compiler, and it happily generated C code which it doesn’t know how to compile. The fix for this is easy though: just like passing -lcoap-3 to the C compiler we need to pass --passL:"-lcoap-3" to the Nim compiler. The passL flag simply means “pass this on to the C linker”. The reason Futhark requires the linking switch and doesn’t add it automatically is that this keeps it agnostic to whether you want to statically or dynamically link to the C project.

It is also considered good practice to add such flags to a Nim configuration file so you don’t have to remember to type it every time. There’s little worse than knowing that your project works but you can’t remember how to build it. Another upside of this is that it allows your editor to know about it, so if you have a “compile and run” system in your editor it should also be able to build it.
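As a small sketch, assuming a NimScript configuration file named config.nims sitting next to the main source file, the linker flag from above could be recorded like this:

```nim
# config.nims, picked up automatically by the Nim compiler
switch("passL", "-lcoap-3")   # same effect as --passL:"-lcoap-3" on the command line
```

Nim also has a passL pragma, so writing {.passL: "-lcoap-3".} in the wrapper module itself is another option if you would rather keep the flag next to the code that needs it.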

Speaking of editor tools, you might have noticed that your code lights up like a Christmas tree with errors when you try to use things from the C library. This is simply because your editor doesn’t like to call a macro to get an include statement (which is all the importc block boils down to). While I recommend reading the “Shipping wrappers” section in the Futhark documentation for a more proper way to do this, a simple workaround is to hide the importc call behind a when defined(useFuthark) switch and manually include the cache file like so:

when defined(useFuthark):
  importc:
    path "/usr/include/coap3"
    "coap.h"
else:
  include "/home/peter/.cache/nim/coaptest_d/futhark_6BEC84F620D201F5.nim"

This means you need to add -d:useFuthark when compiling to actually use Futhark, but it means the editor can easily find the actual definitions we’re working with and not complain.

But now that our code actually compiles, and both the Nim and C compilers agree on what comes from where, let’s get back to coding. As I mentioned, we have a memory leak after all! Remember we’re now dealing with a C library, which means we have to manually manage the memory that it uses. This might be a bit foreign if you’re only used to writing code in Nim, or other garbage collected/reference counted languages. But all it means is that when we tell the C library to create a resource, we also need to tell the C library to destroy the resource once we’re done with it. So at the end of our code we need to add something like the last line here:

var context = coapNewContext(nil)
coapFreeContext(context)

Great, no more memory leak! But manually managing memory is no fun, after all it’s one of Nim’s great features that it comes with a memory management system so we don’t have to deal with any of that. Fortunately it’s fairly easy to add destructors to our C objects. In this case context is going to be a pointer to a coap_context_t. However if you look through all of the headers in libcoap you won’t find a full definition for this. Essentially we’re not supposed to use or manipulate this structure, so its implementation is hidden and the library is free to change it at any time. What Futhark does in this case is to create a simple type coapcontextt = object definition which we can have pointers to. But we aren’t allowed to just add a destructor to a pointer type, so how are we going to make this work? There are multiple ways of doing it, but we’re going to stick with the easiest one here.

when defined(useFuthark):
  importc:
    path "/usr/include/coap3"
    "coap.h"
else:
  include "/home/peter/.cache/nim/coaptest_d/futhark_6BEC84F620D201F5.nim"

type Context = distinct ptr coapcontextt

converter toBase(c: Context): ptr coapcontextt = cast[ptr coapcontextt](c)

proc `=destroy`(x: var Context) =
  echo "Destroyed"
  coapFreeContext(x)

proc newContext(): Context =
  echo "Created"
  Context(coapNewContext(nil))

var context = newContext()

While we can’t attach a destructor to a pointer type, we can attach a destructor to a distinct pointer type. If you run the above code, you should see that the execution prints out “Created” followed by “Destroyed”. With the small converter that converts back to the base type we also ensure that we can call C procedures which take the original pointer type. Just be sure to not store these raw pointers in Nim code, because they won’t count as a reference.
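One thing to watch out for with this pattern is copying: a plain copy of the distinct pointer would make the destructor run twice on the same C resource. Below is a self-contained sketch of the same technique which also forbids copies. It uses malloc and free from the C standard library as stand-ins for coapNewContext and coapFreeContext; Buffer, newBuffer, freedCount and demo are hypothetical names used only for this illustration.

```nim
proc cMalloc(size: csize_t): pointer {.importc: "malloc", header: "<stdlib.h>".}
proc cFree(p: pointer) {.importc: "free", header: "<stdlib.h>".}

type Buffer = distinct pointer   # plays the role of `distinct ptr coapcontextt`

var freedCount = 0               # only here so the sketch can show what happened

# `var Buffer` matches the signature used above; newer Nim versions also accept `Buffer`.
proc `=destroy`(b: var Buffer) =
  if pointer(b) != nil:
    cFree(pointer(b))
    inc freedCount

# Copying would lead to a double free, so turn any copy into a compile-time error.
proc `=copy`(dest: var Buffer, src: Buffer) {.error.}

proc newBuffer(size: int): Buffer =
  Buffer(cMalloc(csize_t(size)))

proc demo() =
  let buf = newBuffer(64)
  doAssert pointer(buf) != nil
  # `buf` goes out of scope here and the destructor frees the C memory.

demo()
echo freedCount
```

Moves are still allowed, so a Buffer can be returned from procedures or handed over to a sink parameter without tripping the copy error.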

While it’s not a problem for this library, you do need to keep in mind the fact that the Nim memory management schemes only work in Nim code. So if you have a destructor but pass a pointer into a C function which keeps that pointer around, Nim won’t see this pointer, and will free the data as soon as it detects that it is no longer in use in the Nim code. Typically though, C libraries make you free your own resources, so this isn’t often a concern, and if you really need to you can always use GC_ref and GC_unref to manually increase and decrease the reference count.

The only real downside to this approach is that now we need to create our own constructor procedures, but since we’re wrapping C stuff there is a high likelihood we would want to do this anyways. In this case for example coapNewContext takes a pointer to the address to listen to, for use in servers. In C we need to pass this as a NULL pointer when we don’t want to listen, but in Nim we can easily modify this to just take a default value instead:

proc newContext(listenAddr: ptr coapAddressT = nil): Context =
  Context(coapNewContext(listenAddr))

Speaking of addresses, the coap_address_t structure can be a bit prickly to set up. Essentially you’re just supposed to initialise them, and then fill them in with data from other low-level system procs. This is a perfect opportunity to hide the low-level C logic in more ergonomic Nim procedures. I’ll spare you the details here, if you want to see how it’s done you can check my published bindings.

Complicating matters with async

Now we’ve dealt with creating more ergonomic bindings, and with creating garbage collected types out of simple C types. But there are many more differences between C and Nim. This article would get way too long if I were to go into every single detail, but there is one topic I’d like to explore in a bit more detail, which will also help cement some of the more basic concepts: asynchronous programming. Nim has an async module which allows a more modern async/await pattern, but C doesn’t have anything like this.

As I outline in my article on asynchronous programming, the entire concept is built around co-operative multitasking. We tell the underlying system that we’re waiting for something, and the system will hand us back control once that thing occurs. Of course this means that something else is responsible for making progress towards that thing. This is typically used to wait for things which are handled by the operating system, such as file reads or network activity. In my last article however I only showed how the system worked using async sleeps to simulate work, so this is a prime opportunity to show how such a system works in a more realistic scenario.

In fact this way of asynchronous programming is nothing new. If you imagine a typical blocking file read for example, the concept is similar: we tell the OS that we want to read a file into memory, and it will take control away from our program and hand it back once the file is done reading. But if we want other parts of our program to perform actions while we wait, we can use a non-blocking read and then poll the operating system once in a while to see if the file loading is done. Under the hood this is exactly what the async system in Nim does. When you await a read on a resource, the file handle of that resource is registered with the async system, which will poll the operating system at prudent intervals and decide which piece of execution to continue with.

This underlying system is still available to us in Nim, and the CoAP library, along with many other C libraries, has support for getting a file handle which supports the required polling. This means that we can actually integrate such libraries into the Nim asynchronous system and have true async/await support for C libraries.
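To show the registration mechanism in isolation, here is a small POSIX-only sketch using the standard asyncdispatch module. A pipe stands in for the handle a C library would give us; waitReadable and readViaDispatcher are names made up for this example and are not part of libcoap or my bindings.

```nim
import std/[asyncdispatch, posix]

proc waitReadable(fd: AsyncFD): Future[void] =
  ## Returns a future that completes once `fd` has data to read.
  let fut = newFuture[void]("waitReadable")
  proc onReadable(watched: AsyncFD): bool =
    fut.complete()
    true  # true removes the read watch, false would keep it installed
  addRead(fd, onReadable)
  fut

proc readViaDispatcher(): char =
  var fds: array[2, cint]
  doAssert pipe(fds) == 0          # fds[0] is the read end, fds[1] the write end
  let readEnd = AsyncFD(fds[0])
  register(readEnd)                # make the handle known to the dispatcher

  let ready = waitReadable(readEnd)
  var msg = 'x'
  doAssert posix.write(fds[1], addr msg, 1) == 1

  waitFor ready                    # polls the OS until the callback has fired
  doAssert posix.read(fds[0], addr result, 1) == 1

  unregister(readEnd)
  discard posix.close(fds[0])
  discard posix.close(fds[1])

echo readViaDispatcher()
```

Inside an async procedure the same future could simply be awaited, which is what lets a C library’s socket take part in ordinary async/await code.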

Divulging all the details of how this is done would again require a much longer article, but there are a couple of things which might be of special interest. The system requires files (or sockets for that matter) to be opened properly as non-blocking, their handles to be registered with the async system, and callbacks to be registered for the state of the file. So for example when waiting for a reply the CoAP socket (which is already opened with the required flags) has to be registered with the asynchronous system and a read callback has to be registered. The callback then returns true or false depending on whether the file handle should still be registered with the asynchronous system after the callback.

This is pretty straightforward, but it’s not always trivial to get it right, especially whether to leave the file registered or not. The trick I use in the CoAP bindings is to register a pointer to Nim handled memory in the session so I can call upon it later. Care must be taken when storing such pointers to managed memory. It only works in this case because the object the actual data resides in will always outlive the CoAP session. This facility of passing a pointer to arbitrary data along with a callback is very common in C libraries, so using a similar system in other libraries should be possible.
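The user data pattern itself can be sketched without any C library at all. In the following example FakeSession plays the part of the C library’s session object, and every name in it is hypothetical; the real bindings store the pointer through libcoap’s own API instead.

```nim
type
  ReplyHandler = proc (userData: pointer) {.cdecl.}

  # Stand-in for a C library session that stores a callback plus opaque user data.
  FakeSession = object
    handler: ReplyHandler
    userData: pointer

  # Nim-side state; it must outlive every session that points at it.
  Client = object
    replies: int

proc onReply(userData: pointer) {.cdecl.} =
  # Recover the Nim object from the opaque pointer the "library" hands back.
  let client = cast[ptr Client](userData)
  inc client.replies

proc deliver(session: FakeSession) =
  # What the C library would do internally when a reply arrives.
  session.handler(session.userData)

proc demo(): int =
  var client = Client()
  let session = FakeSession(handler: onReply, userData: addr client)
  deliver(session)
  deliver(session)
  client.replies

echo demo()
```

The cast is entirely unchecked, so the lifetime rule from above is the only thing keeping this safe: if client went away before the session did, onReply would write into freed memory.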

If you go through the code for the CoAP library you will see that it actually uses an indirect system for the callbacks. The callback simply calls a procedure called ioProcess from the CoAP library, which then calls a master callback registered earlier, and then it checks the cache of waiting messages to determine if the file handle should be removed. This is simply done to better fit in with the architecture of the library.

Final remarks

Wrapping C libraries in Nim can be tedious work. With Futhark the basic legwork is done for you, and you can focus solely on building more proper Nim bindings on top of the C library. The Futhark bindings should ensure that the calling from Nim to C is done properly, allowing you to build better systems on top instead of wrangling C. In the time since I wrote Futhark I’ve wrapped many a library and it is quite astonishing how well it works. Even complex libraries like Gtk with Webkit2 support can be used directly from Nim with nothing but the automatic binding done by Futhark.

¹ Nim is entirely capable of running without the GC, so it can really fit anywhere C can. The macros are even quite handy for making zero-cost abstractions on otherwise limited systems.
² Unlike Python with its word Pythonic we haven’t really settled on a similar term in Nim. My personal favourite however is “royale”, but I fear it’s overly obtuse.
³ This would be fairly likely, seeing how I can’t think of any library you might’ve imported which takes a Nim string object in a C codebase.
⁴ About 8k lines of header files, sans comments.
⁵ While 3.5 seconds isn’t long it can get annoying in the long run to rebuild the wrapper over and over. Luckily Futhark caches its output, so as long as nothing changes in the importc block the cache is simply included in your file instead.
⁶ Futhark is named after the old runic alphabet, because it reads the script of our past (runic inscriptions in the case of the actual Futhark, C code in the case of the project Futhark). Øpir the helper tool is named after the most well known rune master, because he would also be able to read runes, much like our helper tool reads C code. The binary is called opir though, to avoid issues with special characters. The Ø is pronounced like the U in “thunder” or “pur”.
