Systems Programming with Nim


Nim is a statically typed, compiled systems programming language. I’ve seen a few posts about Nim popping up on Hacker News recently and was curious enough to give it a go.

These days, my go-to project when trying out a new language is a simple CLI tool.

Recently I had a go at trying Rust – creating fstojson-rust. Sticking to systems programming languages, Nim came up next.

Here are the things I initially liked about Nim:

  • Statically typed, compiled language.
  • The Nim compiler and generated executables work on all the major operating systems: Linux, macOS, Windows, and BSD.
  • Lots of documentation and tutorials.
  • An online Playground.

Getting Started With Nim

Installation is really simple regardless of your OS. On macOS I chose to run the choosenim installation script. (Remember, always check and inspect scripts that you cURL / run directly from online sources):

curl https://nim-lang.org/choosenim/init.sh -sSf | sh

You can also download pre-built binaries for your OS at the main Nim Download page, or use a myriad of other methods to get it running such as Homebrew, Docker images, etc…

Nimble is a package manager that is bundled with Nim (since version 0.15.0). It does much the same job for Nim that npm does for Node.js.

If you wanted to upgrade nimble itself, you can simply do:

nimble install nimble

Just like an npm update!

To create a new project, you can use the init command. An interactive CLI wizard guides you through creating a Library, Binary, or Hybrid project type.

nimble init [pkgname]

A Nim project can use a .nimble file, which is equivalent to package.json in the Node.js world. Nimble is used to work with the project’s .nimble file.

To check the validity of your project and dependencies, you can run the check command:

nimble check

To install a new package, use nimble’s install command. By default it installs the latest version, but you can also ask for a specific version of a package, e.g. the latest commit from its git repository:

$ nimble install package@#head

Getting Acquainted with Nim

I started off with a really simple program that reads a name from stdin and then echoes back a message.

echo "What's your name? "
var name: string = readLine(stdin)
echo "Hi, ", name, "!"

Building and running locally is simple. You can run nimble build or nimble install. The build command will create a debug version of your program which includes stack traces, while install will create a release version.

You can also pass Nim flags to these commands, though for anything permanent you can add a configuration file and specify them there.

You use nimble run to build and run your program in one go.

There are many more command options available for nimble. Just run the command without anything else to access the list.

fstojson-nim

Next, I jumped into writing my go-to application: one that traverses the file system, collects the directories and files under a designated path, and outputs them as a JSON hierarchy.

You can take a look at my code in GitHub.

My overall experience was that I found Nim to be a lot more forgiving than Rust in terms of safety checks and recursive functions.

I was able to use a single function (in Nim these are called procedures and are defined with the proc keyword), which could recursively call itself for traversing a nested directory hierarchy.

I created a simple object (PathObjectNode) which stores the information about each file or directory, with a children property that holds a list (seq) of more PathObjectNode objects if the object is a directory (files of course do not have children).

A single hierarchy is created at the start of traversal and all directory or file object nodes are added to this.

var hierarchy = newSeq[PathObjectNode]()

At the end of traversal I simply echo out the JSON representation of the root hierarchy node (along with all children). Optionally, this can be prettified JSON.

Speaking of options, I used a handy package called docopt to provide the CLI interface. It was a case of simply adding this to my project’s .nimble file dependencies.

The interface can then be specified by providing a docopt string. For example:

let doc = """
traversefs

Usage:
  traversefs [-p | --pretty] [-r | --recurse] PATH

Arguments:
  PATH          The path to begin traversing

Options:
  -h --help     Show this screen.
  --version     Show version.
  -p --pretty   Pretty print JSON
  -r --recurse  Recursively traverse paths.
"""

With that done, and a simple nimble build later, I could run the binary directly to use it.

Closing Thoughts

Nim feels like an easily accessible systems programming language to me. Although I haven’t really used it enough to have an informed opinion, my thoughts are that it strikes a nice balance between usability, ease of entry, performance, and type safety.

I’ll definitely consider exploring it further for other projects when I get the chance.

My next Apple Mac will most likely be Intel


I currently use an Intel Apple Mac Mini that I upgraded as a daily driver for my work. I’ve been quite excited about the prospect of the M1 processor and the performance and power efficiency it has on offer.

However, I just can’t bring myself to buy one of these yet. Why? It’s like those older Windows Service Pack updates you would always hold off on installing. Sometimes you’re setting yourself up for disaster by adopting something new and shiny without it proving itself first.

Apple M1 turns 1, but still has room for improvement

Tomorrow, the Apple M1 will have officially been on the market for exactly 1 year (Apple M1 was officially released on November 11, 2020). I still don’t think it has proven itself yet, or had the kind of mass adoption needed to bring about software maturity.

I’ve read countless threads, articles, and comments around the interweb that highlight problems and shortfalls that I just don’t see on my Intel based Mac Mini 2018 model.

I’ve also spoken with colleagues who upgraded to M1 and have reported that certain Node.js applications are much slower on M1 than they were on their older, Intel based macOS systems.

I use node a lot for work, and this is a worrying thing to hear. I looked it up, and sure enough there are plenty of experiences from others that report issues with Node, and NVM on Apple silicon.

UMA and our right to repair

I am not convinced about Apple’s Unified Memory Architecture (UMA) being all sunshine and roses. The memory bandwidth up for offer sure is enticing, but what about upgradeability and repair?

Apple is sacrificing one thing for another. Now, if the memory built into the logic board fails, the entire board needs replacing.

What about upgradeability? With my Mac Mini 2018, I bought the (much cheaper) 8GB model. I then purchased 32GB SODIMM RAM modules at half the price of what Apple wanted to charge me. I performed the RAM upgrade myself.

With M1 I cannot do that anymore.

Intel is where it’s at for me

My current Mac Mini is great in terms of memory and SSD performance. However, it has an anaemic Core i3 processor that is crippling overall performance. For example, Slack’s renderer process brings all 4 logical cores to their knees if I share my screen.

I definitely should have opted for the i5 when I purchased this model.

However, there is light at the end of the tunnel. Apple still sell Intel-based Mac Minis. They do hide it out of sight where you need to scroll to the bottom of their product page, but at least they’re there.

As a society we should be more cautious with early adoption and “jumping on the band wagon”. We should wait for issues to be resolved, or at least “day 1” patches to be released in the case of software. Assess the severity of problems at launch and wait for product maturity before taking the plunge. I try to use this same philosophy with games. The industry has moved to “early access” and we’re all being taken along for the bumpy ride.

For now my hardware upgrade path is clear – an Intel-based Mac Mini with a 3.0GHz 6-core 8th-generation Intel Core i5 processor.

Rust Cross Compile Linux to macOS using GitHub Actions


It’s easy enough to add extra targets using the cargo command when building your Rust project. However, the Rust cross compile process gets a little trickier when linking is done for platforms different to the host platform.

I wanted to set up a GitHub Actions workflow that would build binaries for different platforms from the same actions runner.

While it might be possible to use GitHub Actions Matrix to run a build across multiple operating systems and install Rust / rustup / cargo on each, performing the build in each place, I opted for a different strategy.

Using a base Rust musl build container, on top of Debian Buster, I’ve added osxcross and the required build tools. This supports building and linking macOS binaries from the Linux container.

Rust Cross Compile GitHub Action

I’ve built a Docker image for GitHub Actions to use. It is based on the popular rustup image. Currently I’ve built a musl-1.0.53 variant. My version copies in and sets up osxcross. This bakes all the heavy lifting into the image so that GitHub Actions can quickly build targets for Linux and macOS (x86).

You can find the action on the Rust Cross Compile GitHub Action here.

Usage example

Set up a .cargo/config file to designate the target to linker mapping. For example macOS x86:

[target.x86_64-apple-darwin]
linker = "x86_64-apple-darwin14-clang"
ar = "x86_64-apple-darwin14-ar"

Add a GitHub Actions workflow:

name: Rust static build macOS and Linux
on:
  push:
    branches:
      - main
jobs:
  build:
    name: build for all platforms
    runs-on: ubuntu-latest
    env:
      CARGO_TERM_COLOR: always
      BINARY_NAME: rust-test1
    steps:
    - uses: actions/checkout@v2
    - name: Build-musl macOS x86
      uses: Shogan/rust-musl-action@v1.0.2
      with:
        args: cargo build --target x86_64-apple-darwin --release
    - name: Build-musl Linux x86
      uses: Shogan/rust-musl-action@v1.0.2
      with:
        args: cargo build --target x86_64-unknown-linux-musl --release

Release binaries can now easily be built from a single Ubuntu Linux GitHub Actions runner. For example, get the Cargo.toml version and create a release with the built binaries by adding a couple of extra steps:

steps:
    - uses: actions/checkout@v2
    - name: Set build version
      id: version
      shell: bash
      run: |
        VERSION="$(cat Cargo.toml | grep 'version =' -m 1 | sed 's@version =@@' | xargs)"
        echo "RELEASE_VERSION=$VERSION" >> $GITHUB_ENV
        echo "::notice::publish build version $VERSION"
    - name: Upload macOS x86 binary to release
      uses: Spikatrix/upload-release-action@b713c4b73f0a8ddda515820c124efc6538685492
      with:
        repo_token: ${{ secrets.GITHUB_TOKEN }}
        file: target/x86_64-apple-darwin/release/${{ env.BINARY_NAME }}
        asset_name: ${{ env.BINARY_NAME }}-macos-x86
        target_commit: ${{ github.sha }}
        tag: v${{ env.RELEASE_VERSION }}
        release_name: v${{ env.RELEASE_VERSION }}
        prerelease: false
        overwrite: true
        body: ${{ env.BINARY_NAME }} release
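
The “Set build version” step above pulls the version out of Cargo.toml with a grep / sed / xargs pipeline. As a sketch of what that pipeline is doing (not something the workflow itself uses), roughly the same logic can be written in Rust. The extract_version function name and the sample manifest are made up for this example:

```rust
/// Returns the value on the first line containing `version =`, with
/// surrounding whitespace and double quotes removed. Like the shell
/// pipeline, it is naive: a `rust-version = ...` line or a dependency
/// version appearing earlier in the file would be picked up instead.
fn extract_version(cargo_toml: &str) -> Option<String> {
    cargo_toml
        .lines()
        .find(|line| line.contains("version ="))
        .map(|line| {
            line.replacen("version =", "", 1)
                .trim()
                .trim_matches('"')
                .to_string()
        })
}

fn main() {
    // Hypothetical manifest contents, for illustration only.
    let sample = "[package]\nname = \"rust-test1\"\nversion = \"0.1.0\"\n";
    match extract_version(sample) {
        Some(version) => println!("v{}", version),
        None => eprintln!("no version line found"),
    }
}
```

For anything beyond a quick CI script, parsing the manifest with a proper TOML parser would be the more robust choice.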

There are many ways to achieve an automated CI process that can do Rust cross compile and linking. Setting up this GitHub Action package was an interesting investigation into custom Docker containers for GitHub Actions and the Rust toolchain.

Feel free to contribute or improve the GitHub Action by sending a pull request on GitHub.

Beginning Rust: Writing a Small CLI Tool


My first go at writing an application in Rust has been slightly frustrating. Coming in from using mostly dynamic languages every day I quickly found myself butting heads with Rust’s borrow checker. However, I’ve found that this is a fair price to pay for a statically typed language with a focus on memory safety and performance.

While these Rust features do increase the barrier of entry for newcomers such as myself, they also help to keep your code in check and are certainly major contributing factors to the language’s success.

Another interesting point is that Rust doesn’t have a GC (garbage collector). As soon as a value goes out of scope (for example, when a function call returns), the memory it owns is cleaned up. Rust inserts Drop::drop calls at compile time to do this. I imagine this is similar in concept to the way that IL or code weaving is done in the .NET world. This means that Rust doesn’t suffer from the performance hits that languages with a GC sometimes encounter. Discord wrote an interesting article on how they improved performance by switching from Go to Rust that touches on this particular point.
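
To make the scope-based cleanup a bit more concrete, here is a minimal sketch that records the order in which two values are dropped as their scopes end. The Noisy type and the drop_order function are invented purely for illustration:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A value that writes its name into a shared log when it is dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Creates two values in nested scopes and returns the order they were dropped in.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Noisy { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Noisy { name: "inner", log: Rc::clone(&log) };
        } // `_inner` goes out of scope here, so its drop runs first
    } // `_outer` goes out of scope here
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```

The inner value is dropped before the outer one, at the closing brace of its scope, without any collector running in the background.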

Goals

To take a look at the Rust language and ecosystem at a really high-level, I decided to write a simple tool. My goals were to:

  • Write a CLI tool, small in scope. The tool will traverse a target directory in the file system recursively and print the structure to stdout as JSON.
  • Get a feeling for the language’s syntax.
  • See how package management and dependencies work.
  • Look at what the options are for cross-compiling to other platforms.

The tool – fstojson

Here is the small tool I wrote to achieve the above list of goals: fstojson-rust.

I’ve compiled my first app on macOS, Linux and Windows all from the same source, with no issues whatsoever.
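
As a rough illustration of the idea, and not the actual fstojson-rust source, a recursive traversal can be sketched using only the standard library. The Node type, its field names, and the hand-rolled JSON output are all assumptions made for this example; a real tool would more likely lean on a crate such as serde_json for serialisation.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// One file or directory in the tree (hypothetical shape for this sketch).
struct Node {
    name: String,
    is_dir: bool,
    children: Vec<Node>,
}

/// Recursively walks `path`, building a `Node` for it and everything below it.
/// Note: `is_dir` follows symlinks, so a symlink loop would recurse forever.
fn traverse(path: &Path) -> io::Result<Node> {
    let name = path
        .file_name()
        .map(|n| n.to_string_lossy().into_owned())
        .unwrap_or_else(|| path.display().to_string());
    let is_dir = path.is_dir();
    let mut children = Vec::new();
    if is_dir {
        for entry in fs::read_dir(path)? {
            children.push(traverse(&entry?.path())?);
        }
        // read_dir order is platform-dependent, so sort for stable output
        children.sort_by(|a, b| a.name.cmp(&b.name));
    }
    Ok(Node { name, is_dir, children })
}

/// Escapes the characters that must not appear raw inside a JSON string.
fn escape(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serialises a node and its children to compact JSON.
fn to_json(node: &Node) -> String {
    let kids: Vec<String> = node.children.iter().map(to_json).collect();
    format!(
        "{{\"name\":\"{}\",\"is_dir\":{},\"children\":[{}]}}",
        escape(&node.name),
        node.is_dir,
        kids.join(",")
    )
}

fn main() -> io::Result<()> {
    let root = traverse(Path::new("."))?;
    println!("{}", to_json(&root));
    Ok(())
}
```

Each call returns an owned Node, so the tree is assembled from the bottom up and no shared mutable state has to be passed around, which keeps the borrow checker out of the way.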

Rust Packages

On my first look at Rust, packages were simple to understand and use. Rust uses “crates” and they work very similarly to JavaScript packages.

To add a crate to your project you simply add the dependency to your Cargo.toml file (akin to a package.json file in Node.js).

For example:

[dependencies]
serde_json = "1.0.68"

Once crates are installed with the cargo command, you’ll even get a lock file (Cargo.lock), just like with npm or yarn in a Node.js project.

Rust cross compiling

The first time you install Rust with rustup, the standard library for your current platform is installed. If you want to cross compile to other platforms you need to add those target platforms separately.

Use the rustup target add command to add other platform targets. Use rustup target list to show all possible targets.

To cross-compile you’ll often also need to install a linker. For example if you were trying to compile for x86_64-unknown-linux-gnu on Windows you would need the cc linker.
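
One small sanity check that can help when juggling several targets is to have the binary report what it was compiled for. This sketch uses the constants in std::env::consts, which are fixed at compile time for the target platform; the target_description helper is just a name chosen for this example:

```rust
use std::env::consts::{ARCH, OS};

/// Describes the platform this binary was compiled for, e.g. "linux/x86_64".
fn target_description() -> String {
    format!("{}/{}", OS, ARCH)
}

fn main() {
    println!("built for {}", target_description());
}
```

Because the values are baked in at build time, running the same source built for different targets should print a different description for each.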

Thoughts and impressions

To get a really simple “hello-world” application up and running in Rust was trivial. The cargo command makes things really easy for you to scaffold out a project.

However, I honestly struggled with anything more complex for a couple of hours after that. Mostly fighting the “borrow checker”. This is my fault because I didn’t really spend much time getting acquainted with the language initially via the documentation. I dove right in with trying to write a small app.

The last time I wrote something in a systems programming language was at least 7 or 8 years ago – I wrote a tool in C++ to quiesce the file system in preparation for snapshots to be taken. Aside from that, the last time I really had to concern myself with memory management was with Objective-C (iOS), before ARC was introduced (see my first serious attempt at creating an iOS game, Cosmosis).

In my opinion, some of Rust’s great benefits also mean it has a high barrier of entry. It has a really strong emphasis on memory safety. I came at my first application trying to do all the things I can easily do in Typescript / Javascript or C#.

I very quickly realised how different things are in the Rust world, and how this opinionated approach helps to keep your code bug-free and your apps safe on memory.

Closing thoughts

After years of dynamic language use, my first introduction to Rust has been a little bit shaky. It’s a high barrier of entry, but with that said, I did find it satisfying that if there were no compiler warnings my code was pretty much guaranteed to run without issue.

The Rust ecosystem is active and thriving from what I can tell. You can use crates.io to search online for packages. You can use rustup to install toolchains and targets.

There are tons of stackoverflow questions and answers and the documentation page for Rust is full of good information.

Going forward I’ll try to dig into the Rust language a bit more. I’m on a little bit of a journey to try different programming languages (I’ve had a fair bit of experience in C# and Typescript / JavaScript, so I’m branching out from those now).

I discovered this post recently – A half-hour to learn Rust. In hindsight it would have been great to have found that before diving in.

Update: thanks to noah04 on GitHub for their improvements PR on applying some Rust idioms.

Packing Executable Files to Reduce Distribution Size with UPX

Recently I’ve been playing around with Ultimate Packer for Executables (UPX) to reduce a distributable CLI application’s size.

The application is built and stored as an asset for multiple target platforms as a GitHub Release.

I started using UPX as a build step to pack the executable release binaries and it made a big difference in final output size. Important, as the GitHub Release assets cost money to store.

UPX has some great advantages. It supports many different executable formats, multiple types of compression (and a strong compression ratio), it’s performant when compressing and decompressing, and it supports runtime decompression. You can even plug in your own compression algorithm if you like. (Probably a reason that malware authors tend to leverage UPX for packing too).

In my case I had a Node.js application that was being bundled into an executable binary file using nexe. It is possible to compress / pack the Node.js executable before nexe combines it with your Node.js code using UPX. I saw a 30% improvement in size after using UPX.

UPX Packing Example

Let’s demonstrate UPX in action with a simple example.

Create a simple C application called hello.c that will print the string “Hello there.”:

#include <stdio.h>

int main(void) {
  printf("Hello there.\n");
  return 0;
}

Compile the application using static linking with gcc:

gcc -static -o hello hello.c

Note the static linked binary size of your new hello executable (around 876 KB):

sean@DESKTOP-BAO9C6F:~/hello$ gcc -static -o hello hello.c
sean@DESKTOP-BAO9C6F:~/hello$ ls -la
total 908
drwxr-xr-x  2 sean sean   4096 Oct 24 21:27 .
drwxr-xr-x 26 sean sean   4096 Oct 24 21:27 ..
-rwxr-xr-x  1 sean sean 896336 Oct 24 21:27 hello
-rw-r--r--  1 sean sean  23487 Oct 21 21:33 hello.c
sean@DESKTOP-BAO9C6F:~/hello$

This may be a paltry example, but we’ll take a look at the compression ratio achieved, which can generally be extrapolated to larger files.

Analysing our Executable Before Packing

Before we pack this 876 KB executable, let’s analyse its entropy using binwalk. The entropy will be higher in parts where the bytes of the file are more random.

Generate an entropy graph of hello with binwalk:

binwalk --entropy --save hello

Figure: entropy analysis with binwalk before running upx to pack the executable.

The lower points of entropy should compress fairly well when upx packs the binary file.
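
For a feel of what such an entropy figure measures, here is a minimal sketch of Shannon entropy over bytes. This is not binwalk’s implementation, only the textbook formula, and the shannon_entropy function name is my own. It returns bits per byte, from 0 for completely repetitive data up to 8 for data where every byte value is equally likely:

```rust
/// Shannon entropy of `data` in bits per byte.
/// Constant data scores 0.0; uniformly distributed bytes score 8.0.
fn shannon_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Count how often each of the 256 possible byte values occurs.
    let mut counts = [0usize; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let len = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            // p * log2(1/p) is the contribution of this byte value
            p * (1.0 / p).log2()
        })
        .sum()
}

fn main() {
    let constant = vec![0u8; 1024];
    let uniform: Vec<u8> = (0..=255u8).collect();
    println!("constant: {:.2} bits/byte", shannon_entropy(&constant));
    println!("uniform:  {:.2} bits/byte", shannon_entropy(&uniform));
}
```

Regions of a file that score low on this measure contain a lot of repetition, which is exactly what a compressor can exploit.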

UPX Packing

Finally, let’s pack the hello executable with UPX. We’ll choose standard lzma compression, which should be a ‘friendlier’ option that anti-virus packages are more likely to support.

upx --best --lzma -o hello-upx hello

Look at that, the packed file is just 31.49% of the original size! Not bad considering the code itself is really small and most of the original hello executable size is a result of static linking.

sean@DESKTOP-BAO9C6F:~/hello$ upx --best --lzma -o hello-upx hello
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2020
UPX 3.96        Markus Oberhumer, Laszlo Molnar & John Reiser   Jan 23rd 2020

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
    871760 ->    274516   31.49%   linux/amd64   hello-upx

Packed 1 file.
sean@DESKTOP-BAO9C6F:~/hello$
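
The Ratio column is the packed size expressed as a percentage of the original size. As a quick check of the arithmetic, using the two sizes from the output above (the ratio_percent helper is just a name for this sketch):

```rust
/// Packed size as a percentage of the original size.
fn ratio_percent(original: u64, packed: u64) -> f64 {
    packed as f64 / original as f64 * 100.0
}

fn main() {
    // Sizes in bytes, taken from the UPX output above.
    let (original, packed) = (871_760u64, 274_516u64);
    let ratio = ratio_percent(original, packed);
    println!("ratio:  {:.2}%", ratio);         // 31.49%
    println!("saving: {:.2}%", 100.0 - ratio); // 68.51%
}
```

In other words, the packed binary is a little over two thirds smaller than the original.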

Running the packed binary still works perfectly fine. UPX cleverly re-arranges the binary file to place the compressed contents in a specific location, adds a new entrypoint and a bit of logic to decompress the data when the file is executed.

sean@DESKTOP-BAO9C6F:~/hello$ ./hello-upx
Hello there.

UPX is a great option to pack / compress your files for distribution. It’s performant and supports many different executable formats, including Windows and 64-bit executables.

A great use case, as demonstrated in this post is to reduce executable size for binary distributions, especially when (for example) cloud storage costs, or download sizes are a concern.