Metaprogramming and read- and maintainability in Nim
4th June 2019 - Guide , Nim , Programming
One thing I've mentioned about metaprogramming in Nim, both in posts on this site and in talks, is that metaprogramming in Nim can enhance read- and maintainability. Opponents of metaprogramming would probably sneer at that and remark that it's quite the opposite. And sure, metaprogramming is a powerful tool, and with any sufficiently powerful tool you have to be careful. The same way a chainsaw can take your leg off if you're not careful, a poorly written macro can seriously mess up your program. But that doesn't mean we shouldn't use chainsaws, or macros! Maybe where I uses templates and macros the most in Nim is not to do some strange and crazy things, it's simply to rewrite my code so that it becomes more readable, without sacrificing performance. Since they are resolved on compile-time, and not while our program is running, we can add as much abstraction and complexity we want as long as we make sure the output is nice and efficient.
In this post I want to show a small example of refactoring your code with macros and templates in order to make it more readable. I'll be using a small hobby project I'm working on as the example, namely a small RPN calculator that I wrote the other night and want to expand into a spiritual HP-41C successor. The program is quite simple, but I've removed some of the operators to shorten it down a bit.
import math, strutils
# We define an enum with our built in commands
type
Commands = enum
Plus = "+", Minus = "-", Multiply = "*", Divide = "/", Pop = "pop",
StackSwap = "swap", StackRotate = "rot", Help = "help", Exit = "exit"
# Give them some documentation strings
const docstrings: array[Commands, string] = [
"Adds two numbers",
"Subtract two numbers",
"Multiplies two numbers",
"Divides two numbers",
"Pops a number off the stack",
"Swaps the two bottom elements on the stack",
"Rotates the stack one level",
"Lists all the commands with documentation",
"Exits the program"
]
# Then create a simple "stack" implementation
var stack: seq[float]
proc push[T](stack: var seq[T], value: T) =
stack.add value
proc pop[T](stack: var seq[T]): T =
result = stack[^1]
stack.setLen(stack.len - 1)
# Convenience template to execute an operation over two operands from the stack
template execute[T](stack: var seq[T], operation: untyped): untyped {.dirty.} =
let
a = stack[^1]
b = stack[^2]
stack.setLen(stack.len - 2)
stack.push(operation)
# Program main loop, read input from stdin, try to parse it as a command and
# run it's corresponding operation, if that fails try to push it as a number.
# Print out our "stack" for every iteration of the loop
while true:
for command in stdin.readLine.split(" "):
try:
case parseEnum[Commands](command):
of Plus: stack.execute(a + b)
of Minus: stack.execute(b - a)
of Multiply: stack.execute(a * b)
of Divide: stack.execute(b / a)
of Pop: discard stack.pop
of StackSwap:
let
a = stack[^1]
b = stack[^2]
stack[^1] = b
stack[^2] = a
of StackRotate:
stack.insert(stack.pop, 0)
of Help:
echo "Commands:"
for command in Commands:
echo "\t", command, "\t", docstrings[command]
of Exit: quit 0
except:
stack.push parseFloat(command)
echo stack, " - ", command
A primer on RPN
Now to make sure you know what this code does I'll quickly explain how a RPN calculator works. Instead of infixing the operators like "2 + 3 =" and get a result, in this case 5, these kinds of calculators work with a stack. Plus is an operator that takes two arguments, so we need to push two numbers onto the stack and then hit plus "2 3 +". This will pop the two elements off the stack, add them together and push the result back onto the stack. This makes it very easy to solve more complex pieces like "(2 + 3) * (8 / 2)" without the need for parenthesis. Simply run "2 3 +" and the stack now contains the number 5, then run "8 2 /" and the stack contains the numbers 5 and 4, then hitting "*" will multiply these together and leave 20 on the stack. Not only are these calculators easy to use once you get used to them, but also super easy to implement as you've seen above. Note that if you run the example you can either enter each number or command separated by hitting enter, or by spaces.
Refactoring our code to use macros
So now that we have a grasp on what we're actually working on, lets start with the refactoring. As you can see I've taken the liberty of adding a simple execute template already. It simply takes a statement which has the symbols "a" and "b" available to it and pushes the result to the stack. This is to avoid having to manually do all the poping, pushing, and storing of variables. The next step however is a bit more interesting. Notice that if we want to add a new operator we need to do three modifications in three different places. First add it to the enum, with the string command that is used to run it. Then write a docstring at the correct place in the array. And finally implement the new feature in the case switch in our main loop. Thankfully Nim's type system is able to save us from a lot of things that could go wrong. Case switches are strict, meaning that the compiler will complain if we haven't implement all cases, or if we implement too many. And arrays defined over an enum type will make sure we don't put too many, or too few elements in it. But this is still not quite enough, we might end up writing our doctsring in the wrong spot, which would shift over all the others. And having to jump around the file to three different places isn't optimal. So let's try to solve this with a little pinch of metaprogramming.
So we want to create a simple definition syntax for our operations. It should include the name of the operation, the string to call it by, the docstring, and of course the operation itself. I ended up with going for a this:
Plus = "+"; "Adds two numbers":
stack.execute(a + b)
Minus = "-"; "Subtract two numbers":
stack.execute(b - a)
It's compact, it's understandable, and easily maintainable. Now we just need to make it work! Nim sees this input as this syntax tree (only the plus is represented here for brevity)
StmtList Asgn Ident "Plus" StrLit "+" Call StrLit "Adds two numbers" StmtList Call DotExpr Ident "stack" Ident "execute" Infix Ident "+" Ident "a" Ident "b"
After having figured out our input, we need to figure out what code we want this to generate. The type definition and docstring are both on the top level, so those can be generated directly. But the case switch is inside our main loop, so let's put that in a template, and then simply call that from our loop.
Creating the enum
Running dumpTree
on a standard enum type definition shows us how to generate one:
StmtList TypeSection TypeDef Ident "Commands" Empty EnumTy Empty EnumFieldDef Ident "Plus" StrLit "+" EnumFieldDef Ident "Minus" StrLit "-"
The part here that is interesting to us is the EnumFieldDef
nodes. By creating an EnumTy
node with the first Empty
node, and then iterating over the Asgn
nodes in our input and copying the Ident
and StrLit
into an EnumFieldDef
node which is added to the EnumTy
node we can build up the inner structure of this tree. Then it's just a matter of putting it into the outer shell and our enum is done.
Although we're still just echoing out our resulting enum we can see that we now create what we wanted:
import macros
macro defineCommands(definitions: untyped): untyped =
result = newstmtlist()
echo "This is the input to our macro"
echo definitions.treeRepr
var enumDef = nnkTypeSection.newTree(
nnkTypeDef.newTree(
newIdentNode("Commands"),
newEmptyNode(),
nnkEnumTy.newTree(newEmptyNode())))
for i in countup(0, definitions.len - 1, 2):
let
enumInfo = definitions[i]
commandInfo = definitions[i+1].treeRepr
enumDef[0][2].add nnkEnumFieldDef.newtree(enumInfo[0], enumInfo[1])
echo "This is the enum we create"
echo enumDef.treeRepr
static: echo "This is what we want"
dumpTree:
type Commands = enum
Plus = "+", Minus = "-"
defineCommands:
Plus = "+"; "Adds two numbers":
stack.execute(a + b)
Minus = "-"; "Subtract two numbers":
stack.execute(b - a)
Creating the docstring array
Next up is the docstring array. The structure for this has a lot more static parts. Notice how above most of our stuff was generated in a loop, but this will only require us to copy the StrLit
nodes into a shell:
ConstSection ConstDef Ident "docstrings" BracketExpr Ident "array" Ident "Commands" Ident "string" Bracket StrLit "Adds two numbers" StrLit "Subtract two numbers"
Doing the same as we did with our enum and creating an empty tree that we populate in the loop we end up with something like this:
import macros
macro defineCommands(definitions: untyped): untyped =
result = newstmtlist()
echo "This is the input to our macro"
echo definitions.treeRepr
var
enumDef = nnkTypeSection.newTree(
nnkTypeDef.newTree(
newIdentNode("Commands"),
newEmptyNode(),
nnkEnumTy.newTree(newEmptyNode())))
docstrings = nnkConstSection.newTree(
nnkConstDef.newTree(
newIdentNode("docstrings"),
nnkBracketExpr.newTree(
newIdentNode("array"),
newIdentNode("Commands"),
newIdentNode("string")
),
nnkBracket.newTree()))
for i in countup(0, definitions.len - 1, 2):
let
enumInfo = definitions[i]
commandInfo = definitions[i+1]
enumDef[0][2].add nnkEnumFieldDef.newtree(enumInfo[0], enumInfo[1])
docstrings[0][2].add commandInfo[0]
echo "This is the enum we create"
echo enumDef.treeRepr
echo "This is the docstring array we create"
echo docstrings.treeRepr
static: echo "This is what we want"
dumpTree:
const docstrings: array[Commands, string] = [
"Adds two numbers",
"Subtract two numbers"
]
defineCommands:
Plus = "+"; "Adds two numbers":
stack.execute(a + b)
Minus = "-"; "Subtract two numbers":
stack.execute(b - a)
Creating our case switch template
In a familiar vein we can now create our case switch by first creating an empty tree of a case statement, then adding OfBranch
nodes to it in our loop.
CaseStmt Call BracketExpr Ident "parseEnum" Ident "Commands" Ident "command" OfBranch Ident "Plus" StmtList <nodes of the actual code comes here>
And we end up with something like this:
import macros
macro defineCommands(definitions: untyped): untyped =
result = newstmtlist()
echo "This is the input to our macro"
echo definitions.treeRepr
var
enumDef = nnkTypeSection.newTree(
nnkTypeDef.newTree(
newIdentNode("Commands"),
newEmptyNode(),
nnkEnumTy.newTree(newEmptyNode())))
docstrings = nnkConstSection.newTree(
nnkConstDef.newTree(
newIdentNode("docstrings"),
nnkBracketExpr.newTree(
newIdentNode("array"),
newIdentNode("Commands"),
newIdentNode("string")
),
nnkBracket.newTree()))
caseSwitch = nnkCaseStmt.newTree(
nnkCall.newTree(
nnkBracketExpr.newTree(
newIdentNode("parseEnum"),
newIdentNode("Commands")
),
newIdentNode("command")))
for i in countup(0, definitions.len - 1, 2):
let
enumInfo = definitions[i]
commandInfo = definitions[i+1]
enumDef[0][2].add nnkEnumFieldDef.newtree(enumInfo[0], enumInfo[1])
docstrings[0][2].add commandInfo[0]
caseSwitch.add nnkOfBranch.newTree(
enumInfo[0],
commandInfo[1])
echo "This is the enum we create"
echo enumDef.treeRepr
echo "This is the docstring array we create"
echo docstrings.treeRepr
echo "This is the case statement we create"
echo caseSwitch.treeRepr
static: echo "This is what we want"
dumpTree:
case parseEnum[Commands](command):
of Plus: stack.execute(a + b)
defineCommands:
Plus = "+"; "Adds two numbers":
stack.execute(a + b)
Minus = "-"; "Subtract two numbers":
stack.execute(b - a)
Tying it all together and generating the correct code
So far we've just focused on creating the expected output. But we have some extra considerations to make. What if we want to call the macro multiple times, say from different modules? And what about the template we talked about? So let's change our macro definition a bit to take some more arguments that we can use to name the things our code creates. We add arguments to name the enum, the array of docstrings, and the name of the template we want to create. Then we tie everything together in a call to quote
which adds our sections and the template. Then adding back all our definitions and our main loop, and our final program ends up looking like this:
import math, strutils
import macros
macro defineCommands(enumName, docarrayName, runnerName,
definitions: untyped): untyped =
var
enumDef = nnkTypeSection.newTree(
nnkTypeDef.newTree(
enumName,
newEmptyNode(),
nnkEnumTy.newTree(newEmptyNode())))
docstrings = nnkConstSection.newTree(
nnkConstDef.newTree(
docarrayName,
nnkBracketExpr.newTree(
newIdentNode("array"),
enumName,
newIdentNode("string")
),
nnkBracket.newTree()))
templateArgument = newIdentNode("command")
caseSwitch = nnkCaseStmt.newTree(
nnkCall.newTree(
nnkBracketExpr.newTree(
newIdentNode("parseEnum"),
enumName
),
templateArgument))
for i in countup(0, definitions.len - 1, 2):
let
enumInfo = definitions[i]
commandInfo = definitions[i+1]
enumDef[0][2].add nnkEnumFieldDef.newtree(enumInfo[0], enumInfo[1])
docstrings[0][2].add commandInfo[0]
caseSwitch.add nnkOfBranch.newTree(
enumInfo[0],
commandInfo[1])
result = quote do:
`enumDef`
`docstrings`
template `runnerName`(`templateArgument`: untyped): untyped =
`caseSwitch`
echo result.repr
# First create a simple "stack" implementation
var stack: seq[float]
proc push[T](stack: var seq[T], value: T) =
stack.add value
proc pop[T](stack: var seq[T]): T =
result = stack[^1]
stack.setLen(stack.len - 1)
# Convenience template to execute an operation over two operands from the stack
template execute[T](stack: var seq[T], operation: untyped): untyped {.dirty.} =
let
a = stack[^1]
b = stack[^2]
stack.setLen(stack.len - 2)
stack.push(operation)
# Then define all our commands using our macro
defineCommands(Commands, docstrings, runCommand):
Plus = "+"; "Adds two numbers":
stack.execute(a + b)
Minus = "-"; "Subtract two numbers":
stack.execute(b - a)
Multiply = "*"; "Multiplies two numbers":
stack.execute(a * b)
Divide = "/"; "Divides two numbers":
stack.execute(b / a)
Pop = "pop"; "Pops a number off the stack":
discard stack.pop
StackSwap = "swap"; "Swaps the two bottom elements on the stack":
let
a = stack[^1]
b = stack[^2]
stack[^1] = b
stack[^2] = a
StackRotate = "rot"; "Rotates the stack one level":
stack.insert(stack.pop, 0)
Help = "help"; "Lists all the commands with documentation":
echo "Commands:"
for command in Commands:
echo "\t", command, "\t", docstrings[command]
Exit = "exit"; "Exits the program":
quit 0
# Program main loop, read input from stdin, run our template to parse the
# command and run the corresponding operation. if that fails try to push it as
# a number. Print out our "stack" for every iteration of the loop
while true:
for command in stdin.readLine.split(" "):
try:
runCommand(command)
except:
stack.push parseFloat(command)
echo stack, " - ", command
Final remarks
As we can see from the final result our operation definitions are now collected in one place. We create the enum, the docstring array, and the case switch in the same place. Since all of this is simple rewritten to valid Nim code at the AST level it will still evaluate to the same code as we started with, which means it will still have all the type safety and speed as the original version. Since we also made our macro accept arguments we can easily create multiple different classes of commands and put them in different modules or treat them differently in our main loop. Now you might say that the macro itself is just more code that needs to be maintained and that is true. But maintaining one piece of code that does a simple transformation, and another that does high-level program logic is in my opinion easier than mainting the original spaghetti. Especially if we take into consideration that after we've split this across modules it would be easy to change the format of every such block by simply changing the macro. So I hope this helps to show how a powerful tool like metaprogramming can indeed increase code read- and maintainability instead of being a hindrance to both.