Systems Programming with Nim

Nim is a statically typed, compiled systems programming language. I’ve seen a few posts about Nim popping up on Hacker News recently and was curious enough to give it a go.

My go-to tool / application to implement when trying out a new language these days is a simple CLI tool.

Recently I had a go at trying Rust – creating fstojson-rust. Sticking to systems programming languages, Nim came up next.

Here are the things I initially liked about Nim:

  • Statically typed, compiled language.
  • The Nim compiler and generated executables work on all the major operating systems, i.e. Linux, macOS, Windows, and BSD.
  • Lots of documentation and tutorials.
  • An online Playground.

Getting Started With Nim

Installation is really simple regardless of your OS. On macOS I chose to run the choosenim installation script. (Remember, always check and inspect scripts that you cURL / run directly from online sources):

curl https://nim-lang.org/choosenim/init.sh -sSf | sh

You can also download pre-built binaries for your OS at the main Nim Download page, or use a myriad of other methods to get it running such as Homebrew, Docker images, etc…

Nimble is the package manager that is bundled with Nim (since version 0.15.0). It does for Nim much the same job that npm does for Node.js.

If you wanted to upgrade nimble itself, you can simply do:

nimble install nimble

Just like an npm update!

To create a new project, you can use the init command. An interactive CLI wizard guides you through creating a Library, Binary, or Hybrid project type.

nimble init [pkgname]

A Nim project can use a .nimble file, which is equivalent to package.json in the Node.js world. Nimble is used to work with the project’s .nimble file.

To check the validity of your project and dependencies, you can run the check command:

nimble check

Then, to install a new package, use nimble’s install command. The default is to install the latest version, but you can also install a specific version of a package, or the latest commit from its git repository as shown here:

nimble install package@#head

Getting Acquainted with Nim

I started off with a really simple program that reads from stdin and then echoes back a message.

echo "What's your name? "
var name: string = readLine(stdin)
echo "Hi, ", name, "!"

Building and running locally is simple. You can run nimble build or nimble install. The build command will create a debug version of your program which includes stack traces, while install will create a release version.

You can also pass Nim flags to these commands, though for anything permanent you can add a configuration file and specify them there.

You use nimble run to build and run your program in one go.

There are many more command options available for nimble. Just run the command without anything else to access the list.

fstojson-nim

Next, I jumped into writing my go-to application: it traverses the file system, collects the directories and files under a designated path, and outputs them as a JSON hierarchy.

You can take a look at my code on GitHub.

My overall experience was that I found Nim to be a lot more forgiving than Rust in terms of safety checks and recursive functions.

I was able to use a single function (in Nim these are called procedures and are defined with the proc keyword), which could recursively call itself for traversing a nested directory hierarchy.

I created a simple object (PathObjectNode) which stores the information about each file or directory, with a children property that holds a list (seq) of more PathObjectNode objects if the object is a directory (files of course do not have children).

A single hierarchy is created at the start of traversal and all directory or file object nodes are added to this.

var hierarchy = newSeq[PathObjectNode]()

At the end of traversal I simply echo out the JSON representation of the root hierarchy node (along with all children). Optionally, this can be prettified JSON.
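
For readers more at home in Node.js, the same shape can be sketched roughly in JavaScript. This is not the Nim code from the repository: the field names, the listDir helper, and the in-memory fakeFs object are all made up for illustration, so that the example runs without touching a real disk.

```javascript
// Hypothetical in-memory "file system": directories map names to entries,
// files are represented by null. This stands in for real directory reads.
const fakeFs = {
  src: { "main.nim": null, util: { "paths.nim": null } },
  "README.md": null,
};

// Returns the entries of one directory in the fake file system.
// path is an array of directory names leading to that directory.
function listDir(path) {
  const dir = path.reduce((node, name) => node[name], fakeFs);
  return Object.entries(dir).map(([name, value]) => ({
    name,
    isDirectory: value !== null,
  }));
}

// Builds the list of nodes for one directory, calling itself for each
// subdirectory when recurse is true. Files never get children.
function traverse(path, recurse) {
  return listDir(path).map(entry => {
    const node = {
      name: entry.name,
      type: entry.isDirectory ? "directory" : "file",
      children: [],
    };
    if (entry.isDirectory && recurse) {
      node.children = traverse([...path, entry.name], recurse);
    }
    return node;
  });
}

const hierarchy = traverse([], true);
// Passing an indent to JSON.stringify gives the "prettified" variant
console.log(JSON.stringify(hierarchy, null, 2));
```

The single recursive function mirrors the single recursive proc described above; only the directory listing would need to change to work against a real file system.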

Speaking of options, I used a handy package called docopt to provide the CLI interface. It was a case of simply adding this to my project’s .nimble file dependencies.

The interface can then be specified by providing a docopt string. For example:

let doc = """
traversefs

Usage:
  traversefs [-p | --pretty] [-r | --recurse] PATH

Arguments:
  PATH          The path to begin traversing

Options:
  -h --help     Show this screen.
  --version     Show version.
  -p --pretty   Pretty print JSON
  -r --recurse  Recursively traverse paths.
"""

With that done, and a simple nimble build later, I could run the binary directly to use it.

Closing Thoughts

Nim feels like an easily accessible systems programming language to me. Although I haven’t really used it enough to have an informed opinion, my thoughts are that it strikes a nice balance between usability, ease of entry, performance, and type safety.

I’ll definitely consider exploring it further for other projects when I get the chance.

Rust Cross Compile Linux to macOS using GitHub Actions

It’s easy enough to add extra targets using the cargo command when building your Rust project. However, the Rust cross compile process gets a little trickier when linking is done for platforms different to the host platform.

I wanted to set up a GitHub Actions workflow that would build binaries for different platforms from the same actions runner.

While it might be possible to use GitHub Actions Matrix to run a build across multiple operating systems and install Rust / rustup / cargo on each, performing the build in each place, I opted for a different strategy.

Using a base Rust musl build container, on top of Debian Buster, I’ve added osxcross and the required build tools. This supports building and linking macOS binaries from the Linux container.

Rust Cross Compile GitHub Action

I’ve built a Docker image for GitHub Actions to use. It is based on the popular rustup image. Currently I’ve built a musl-1.0.53 variant. My version copies in and sets up osxcross. This bakes all the heavy lifting into the image so that GitHub Actions can quickly build targets for Linux and macOS (x86).

You can find the action on the Rust Cross Compile GitHub Action here.

Usage example

Set up a .cargo/config file to designate the target-to-linker mapping. For example, for macOS x86:

[target.x86_64-apple-darwin]
linker = "x86_64-apple-darwin14-clang"
ar = "x86_64-apple-darwin14-ar"

Add a GitHub Actions workflow:

name: Rust static build macOS and Linux
on:
  push:
    branches:
      - main
jobs:
  build:
    name: build for all platforms
    runs-on: ubuntu-latest
    env:
      CARGO_TERM_COLOR: always
      BINARY_NAME: rust-test1
    steps:
    - uses: actions/checkout@v2
    - name: Build-musl macOS x86
      uses: Shogan/rust-musl-action@v1.0.2
      with:
        args: cargo build --target x86_64-apple-darwin --release
    - name: Build-musl Linux x86
      uses: Shogan/rust-musl-action@v1.0.2
      with:
        args: cargo build --target x86_64-unknown-linux-musl --release

Release binaries can now easily be built from a single Ubuntu Linux GitHub Actions runner. For example, get the Cargo.toml version and create a release with the built binaries by adding a couple of extra steps:

steps:
    - uses: actions/checkout@v2
    - name: Set build version
      id: version
      shell: bash
      run: |
        VERSION="$(cat Cargo.toml | grep 'version =' -m 1 | sed 's@version =@@' | xargs)"
        echo "RELEASE_VERSION=$VERSION" >> $GITHUB_ENV
        echo "::notice::publish build version $VERSION"
    - name: Upload macOS x86 binary to release
      uses: Spikatrix/upload-release-action@b713c4b73f0a8ddda515820c124efc6538685492
      with:
        repo_token: ${{ secrets.GITHUB_TOKEN }}
        file: target/x86_64-apple-darwin/release/${{ env.BINARY_NAME }}
        asset_name: ${{ env.BINARY_NAME }}-macos-x86
        target_commit: ${{ github.sha }}
        tag: v${{ env.RELEASE_VERSION }}
        release_name: v${{ env.RELEASE_VERSION }}
        prerelease: false
        overwrite: true
        body: ${{ env.BINARY_NAME }} release
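
To make the grep / sed / xargs pipeline in the "Set build version" step easier to follow, here is a rough JavaScript equivalent of what each stage does. It is only an illustration of the logic, not part of the workflow; the extractVersion name and the sample manifest are made up.

```javascript
// Mirrors: grep 'version =' -m 1 | sed 's@version =@@' | xargs
function extractVersion(cargoToml) {
  // grep -m 1: take the first line containing the literal text "version ="
  const line = cargoToml.split("\n").find(l => l.includes("version ="));
  if (line === undefined) return "";
  return line
    .replace("version =", "") // sed: drop the key and the equals sign
    .replace(/"/g, "")        // xargs removes the surrounding double quotes
    .trim();                  // xargs also trims leading/trailing whitespace
}

// Hypothetical manifest, for illustration only
const manifest = `[package]
name = "rust-test1"
version = "0.1.0"

[dependencies]
serde_json = "1.0.68"
`;

console.log(extractVersion(manifest));
```

Because only the first match is used, this relies on the package version appearing before any other line that contains "version =", which is the usual layout of a Cargo.toml file.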

There are many ways to achieve an automated CI process that can cross compile and link Rust. Setting up this GitHub Action was an interesting investigation into custom Docker containers for GitHub Actions and the Rust toolchain.

Feel free to contribute or improve the GitHub Action by sending a pull request on GitHub.

Beginning Rust: Writing a Small CLI Tool

My first go at writing an application in Rust has been slightly frustrating. Coming in from using mostly dynamic languages every day I quickly found myself butting heads with Rust’s borrow checker. However, I’ve found that this is a fair price to pay for a statically typed language with a focus on memory safety and performance.

While these Rust features do increase the barrier of entry for newcomers such as myself, they also help to keep your code in check and are certainly major contributing factors to the language’s success.

Another interesting point is that Rust doesn’t have a GC (garbage collector). As soon as a value goes out of scope (for example, when a function call returns), the memory associated with it is cleaned up. Rust inserts Drop::drop calls at compile time to do this. I imagine this is similar in concept to the way that IL or code weaving is done in the .NET world. It means that Rust doesn’t suffer from the performance hits that languages with a GC sometimes encounter. Discord wrote an interesting article on how they improved performance by switching from Go to Rust that touches on this particular point.

Goals

To take a look at the Rust language and ecosystem at a really high-level, I decided to write a simple tool. My goals were to:

  • Write a CLI tool, small in scope. The tool will traverse a target directory in the file system recursively and print the structure to stdout as JSON.
  • Get a feeling for the language’s syntax.
  • See how package management and dependencies work.
  • Look at what the options are for cross-compiling to other platforms.

The tool – fstojson

Here is the small tool I wrote to achieve the above list of goals: fstojson-rust.

I’ve compiled my first app on macOS, Linux and Windows all from the same source, with no issues whatsoever.

Rust Packages

On my first look at Rust, packages were simple to understand and use. Rust uses “crates” and they work very similarly to JavaScript packages.

To add a crate to your project you simply add the dependency to your Cargo.toml file (akin to a package.json file in Node.js).

For example:

[dependencies]
serde_json = "1.0.68"

Once crates are installed with the cargo command, you’ll even get a lock file (Cargo.lock), just like with npm or yarn in a Node.js project.

Rust cross compiling

The first time you install Rust with rustup, the standard library for your current platform is installed. If you want to cross compile to other platforms you need to add those target platforms separately.

Use the rustup target add command to add other platform targets. Use rustup target list to show all possible targets.

To cross-compile you’ll often also need to install a linker. For example if you were trying to compile for x86_64-unknown-linux-gnu on Windows you would need the cc linker.

Thoughts and impressions

To get a really simple “hello-world” application up and running in Rust was trivial. The cargo command makes things really easy for you to scaffold out a project.

However, I honestly struggled with anything more complex for a couple of hours after that. Mostly fighting the “borrow checker”. This is my fault because I didn’t really spend much time getting acquainted with the language initially via the documentation. I dove right in with trying to write a small app.

The last time I wrote something in a systems programming language was at least 7 or 8 years ago – I wrote a tool in C++ to quiesce the file system in preparation for snapshots to be taken. Aside from that, the last time I really had to concern myself with memory management was with Objective-C (iOS), before ARC was introduced (see my first serious attempt at creating an iOS game, Cosmosis).

In my opinion, some of Rust’s great benefits also mean it has a high barrier of entry. It has a really strong emphasis on memory safety. I came at my first application trying to do all the things I can easily do in TypeScript / JavaScript or C#.

I very quickly realised how different things are in the Rust world, and how this opinionated approach helps to keep your code bug-free and your apps safe on memory.

Closing thoughts

After years of dynamic language use, my first introduction to Rust has been a little bit shaky. It’s a high barrier of entry, but with that said, I did find it satisfying that if there were no compiler warnings my code was pretty much guaranteed to run without issue.

The Rust ecosystem is active and thriving from what I can tell. You can use crates.io to search online for packages. You can use rustup to install toolchains and targets.

There are tons of Stack Overflow questions and answers, and the documentation page for Rust is full of good information.

Going forward I’ll try to dig into the Rust language a bit more. I’m on a little bit of a journey to try different programming languages (I’ve had a fair bit of experience in C# and TypeScript / JavaScript, so I’m branching out from those now).

I discovered this post recently – A half-hour to learn Rust. In hindsight it would have been great to have found that before diving in.

Update: thanks to noah04 on GitHub for their improvements PR on applying some Rust idioms.

Generating Music with Markov Chains and Alda

A while ago I read a fantastic article by Alex Bainter about how he used Markov chains to generate new versions of Aphex Twin’s track ‘aisatsana’. After reading it I also wanted to try my hand at generating music using Markov chains, but mix it up by trying out alda.

‘aisatsana’ is very different to the rest of Aphex Twin’s album Syro, released in 2014. It’s a calm, soothing piano piece that could easily place you into a meditative state after listening to it.

Alda is a text-based programming language for music composition. If you haven’t tried it before, you’ll get a feel for how it works in this post. If you want to learn about it through some much simpler examples, this quick start guide is a good place to start.

Generating music is made easier using the simple language that alda provides.

As an example, here are four sample ‘phrases’ that were generated with Markov chains (based off the aisatsana starting state) and played back with alda. I’ve picked four random phrases out of 32 that sounded similar to me, but were different in each case. A generated track will not necessarily consist of all similar sounding phrases, but might contain a number of these.

Markov Chains 101

On my journey, the first stop was to learn more about markov chains.

Markov chains are mathematical “stochastic” systems that change from one “state” (a situation or set of values) to another. In addition to this, a Markov chain tells you the probability of transitioning from one state to another.

Using a honey bee worker as an example we might say: A honey bee has a bunch of different states:

  • At the hive
  • Leaving the hive
  • Collecting pollen
  • Make honey
  • Returning to hive
  • Cleaning hive
  • Defending hive

After observing honey bees for a while, you might model their behaviour using a markov chain like so:

  • When at the hive they have:
    • 50% chance to make honey
    • 40% chance to leave the hive
    • 10% chance to clean the hive
  • When leaving the hive they have:
    • 95% chance to collect pollen
    • 5% chance to defend the hive
  • When collecting pollen they have:
    • 85% chance of collecting pollen
    • 10% chance of returning to hive
    • 5% chance to defend the hive
  • etc…

The above illustrates what is needed to create a markov chain. A list of states (the “state space”), and the probabilities of transitioning between them.
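
As a minimal sketch of that idea, the honey bee model can be written as a transition table plus a function that samples the next state. The nextState and walk names are made up for illustration, and only the three states whose probabilities are listed above are included; any other state simply ends the walk.

```javascript
// Transition table for the three states spelled out above. States whose
// probabilities are not listed are absent from the table.
const transitions = {
  "At the hive": [
    ["Make honey", 0.5],
    ["Leaving the hive", 0.4],
    ["Cleaning hive", 0.1],
  ],
  "Leaving the hive": [
    ["Collecting pollen", 0.95],
    ["Defending hive", 0.05],
  ],
  "Collecting pollen": [
    ["Collecting pollen", 0.85],
    ["Returning to hive", 0.1],
    ["Defending hive", 0.05],
  ],
};

// Picks the next state by stepping through the cumulative probabilities.
// random is injectable so that a walk can be made repeatable.
function nextState(state, random = Math.random) {
  const options = transitions[state];
  if (!options) return null; // no transitions listed for this state
  let r = random();
  for (const [target, probability] of options) {
    if (r < probability) return target;
    r -= probability;
  }
  return options[options.length - 1][0]; // guard against rounding error
}

// Walks the chain from a start state for at most maxSteps transitions.
function walk(start, maxSteps, random = Math.random) {
  const path = [start];
  let current = start;
  for (let i = 0; i < maxSteps; i++) {
    current = nextState(current, random);
    if (current === null) break;
    path.push(current);
  }
  return path;
}

console.log(walk("At the hive", 5).join(" -> "));
```

Each run prints a different sequence of states, but over many runs the proportions follow the table.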

To play around with Markov chains and simple string generation, I created a small codebase (Node.js / TypeScript). The app takes a list of ‘chat message logs’ (really any line-separated list of strings) as input. It then uses random selection to find any lines containing the ‘seed’ string.

With the seed string, it generates new and potentially unique ‘chat messages’ based on this input seed and the ‘state’ (which is the list of chat messages fed in).

Using a random function and initial filtering means that the generation probability is constrained to the size of the input and filtered list, but it still helped me understand some of the concepts.

Converting Aisatsana MIDI to Alda Format

To start, the first thing I needed was a list of musical segments from the original track. These are what we refer to as ‘phrases’.

As Alex did in his implementation, I grabbed a MIDI version of Aisatsana. I then fed it into a MIDI to JSON converter, yielding a breakdown of the track into individual notes. Here is what the first two notes look like:

[
  {
    "name": "E3",
    "midi": 52,
    "time": 0,
    "velocity": 0.30708661417322836,
    "duration": 0.5882355
  },
  {
    "name": "G3",
    "midi": 55,
    "time": 0.5882355,
    "velocity": 0.31496062992125984,
    "duration": 0.5882355
  }
]

From there I wrote some JavaScript to take these notes in JSON format, parse the time values and order them into the 32 ‘phrases’ that aisatsana is made up of.

That is, there are 32 ‘phrases’, each consisting of 32 ‘half-beats’ at 0.294117647058824 seconds per half-beat, for a total of roughly 301 seconds.

const notes = [] // <-- MIDI to JSON notes here

// constants specific to the aisatsana track
const secPerHalfBeat = 0.294117647058824;
const phraseHalfBeats = 32;

// Array to store quantized phrases
let phrases = [];

notes.forEach(n => {
  const halfBeat = Math.round(n.time / secPerHalfBeat);
  const phraseIndex = Math.floor(halfBeat / phraseHalfBeats);
  const note = n.name.substring(0, 1).toLowerCase();
  const octave = n.name.substring(1, 2);
  const time = n.time;
  const duration = n.duration;

  // Store note in correct 'phrase'
  if (!phrases[phraseIndex]) {
    phrases[phraseIndex] = [];
  }

  phrases[phraseIndex].push({ note: note, octave: octave, time: time, duration: duration });
});

It also gathers information such as the note symbol, octave, and duration for each note and stores it in a phrases array, which also happens to be ordered by phrase index.
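
As a small worked example of the quantization arithmetic, the snippet below applies the same rounding to the two sample notes shown earlier and checks the phrase and track lengths. The locate function name and the phraseCount constant are made up for illustration; the other constants are the ones from the script.

```javascript
const secPerHalfBeat = 0.294117647058824;
const phraseHalfBeats = 32;
const phraseCount = 32; // number of phrases in the track

// Maps a note's start time (in seconds) to its half-beat number and the
// index of the phrase it falls in, using the same rounding as the script.
function locate(time) {
  const halfBeat = Math.round(time / secPerHalfBeat);
  return { halfBeat, phraseIndex: Math.floor(halfBeat / phraseHalfBeats) };
}

console.log(locate(0));         // first sample note (E3)
console.log(locate(0.5882355)); // second sample note (G3)

// Length of one phrase and of the whole track, in seconds
const phraseSeconds = phraseHalfBeats * secPerHalfBeat;
const trackSeconds = phraseCount * phraseSeconds;
console.log(phraseSeconds.toFixed(2), trackSeconds.toFixed(2));
```

The second note lands on half-beat 2 of phrase 0, and 32 phrases of about 9.41 seconds each give the track length of roughly 301 seconds.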

Grouping by Chord

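Next, the script runs through each phrase and groups the notes by time. If notes are played at the same timestamp, they are part of the same chord. To play correctly with alda, I need to know this, so a chords array is set up for each phrase.

// One possible groupBy helper (the snippet itself does not show its
// definition): returns a function that groups array items by the given key.
const groupBy = key => array =>
  array.reduce((groups, item) => {
    (groups[item[key]] = groups[item[key]] || []).push(item);
    return groups;
  }, {});

phrases.forEach(phrase => {
  const groupByTime = groupBy('time');
  const chordGrouping = groupByTime(phrase);

  // Each group of notes sharing a timestamp becomes one chord
  phrase.chords = Object.values(chordGrouping);
});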

Generating alda Compatible Strings

With chord grouping done, we can now convert the track into 32 phrases that alda will understand.

phrases.forEach(phrase => {
  let aldaStr = "piano: (tempo 51) (quant 90) ";
  phrase.chords.forEach(chord => {
    // Alda plays notes together as a chord when separated by a '/'
    // character, so the notes of a chord are joined with ' / '. A
    // single note needs no separator and is emitted on its own.
    aldaStr += chord
      .map(note => `o${note.octave} ${note.note} ${note.duration}s`)
      .join(" / ") + " ";
  });
  // Output the phrase as an alda-compatible / playable string (you can
  // also copy this directly into alda's REPL to play it)
  console.log(aldaStr);
});

Here is the full script to convert the MIDI to alda phrase strings.

Generating Music with Markov Chains

There are different entry points that I could have used to create the markov chain initial state, but I went with feeding in the alda strings directly to see what patterns would emerge.

Here are the first four phrases from aisatsana in alda-compatible format:

piano: (tempo 51) (quant 90) o3 e 0.5882355s o3 g 0.5882355s o3 c 0.5882354999999999s o4 c 7.6470615s
piano: (tempo 51) (quant 90) o3 e 0.5882354999999997s o3 g 0.5882354999999997s o3 c 0.5882354999999997s o4 c 0.5882354999999997s o3 b 2.3529420000000005s o4 e 4.705884000000001s
piano: (tempo 51) (quant 90) o3 e 0.5882354999999997s o3 g 0.5882354999999997s o3 c 0.5882354999999997s o4 c 0.5882354999999997s o3 b 7.058826s
piano: (tempo 51) (quant 90) o3 e 0.5882354999999997s o3 g 0.5882354999999997s o3 c 0.5882354999999997s o4 c 0.5882354999999997s o3 b 1.1764709999999994s o4 e 5.882354999999997s

If you like, you can drop those right into alda’s REPL to play them, or drop them into a text file and play them with:

alda play --file first-four-phrases.alda

The strings are quite ugly to look at, but it turns out that they can still be used to generate new and original phrases based off the aisatsana track phrases using markov chains.

Using the markov-chains npm package, I wrote a small Node.js app to generate new phrases. It takes the 32 alda-compatible phrase strings from the original MIDI track of ‘aisatsana’ as a list of states and walks the chain to create new phrases.

E.g.

import Chain from 'markov-chains';

const states = [
  // [ alda phrase strings here ],
  // [ alda phrase strings here ],
  // [ alda phrase strings here ]
  // etc...
];

const chain = new Chain(states);

// generate new phrase(s)
const newPhrases = chain.walk();
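
Conceptually, a chain like this is a table of "what was seen to follow what", built from the input sequences and then sampled. The following is a minimal first-order sketch of that idea, not the markov-chains package's actual implementation. The buildChain and walkChain names and the start/end markers are made up, and the short token arrays are simplified stand-ins for the real alda phrases (durations dropped).

```javascript
// Hypothetical markers for the beginning and end of a phrase
const START = "__start__";
const END = "__end__";

// Builds a map: token -> array of tokens observed directly after it.
// A follower seen several times appears several times, which is what
// makes it proportionally more likely to be picked later.
function buildChain(sequences) {
  const chain = new Map();
  for (const sequence of sequences) {
    const tokens = [START, ...sequence, END];
    for (let i = 0; i < tokens.length - 1; i++) {
      if (!chain.has(tokens[i])) chain.set(tokens[i], []);
      chain.get(tokens[i]).push(tokens[i + 1]);
    }
  }
  return chain;
}

// Walks from START, picking a random observed follower at each step,
// until END is picked or maxLength tokens have been produced.
function walkChain(chain, random = Math.random, maxLength = 64) {
  const result = [];
  let current = START;
  while (result.length < maxLength) {
    const followers = chain.get(current);
    const next = followers[Math.floor(random() * followers.length)];
    if (next === END) break;
    result.push(next);
    current = next;
  }
  return result;
}

// Toy input: each inner array is one phrase split into note tokens
const examplePhrases = [
  ["o3 e", "o3 g", "o3 c", "o4 c"],
  ["o3 e", "o3 g", "o3 c", "o4 c", "o3 b", "o4 e"],
  ["o3 e", "o3 g", "o3 c", "o4 c", "o3 b"],
];

const chainTable = buildChain(examplePhrases);
console.log(walkChain(chainTable).join(" "));
```

With input this small the output is always a recombination of the three toy phrases; with all 32 real phrases there is far more room for new material.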

I threw together a small function that you can run directly to generate new phrases. Give it a try here. Hitting this URL in the browser will give you new phrases from the markov generation.

If you want a text version that you can drop right into the alda REPL or into a file for alda to play, try this:

curl -s https://solitary-mountain-114.fly.dev/ | jq -r '.phrases[]'

I’ve uploaded the code here that does the markov chain generation using the initial alda phrase strings as input state.

Results and Alda Serverless

Generated Music

The results from generating music off the phrases from the original track are certainly fun and interesting to listen to. The new phrases play out in different ways to the original track, but still have the feeling of belonging to the same piece of music.

Going forward I’ll definitely experiment further with Markov chains and music generation using alda.

Experimenting with alda and Serverless

Something I got side-tracked on during this experiment was hosting the alda player in a serverless function. I got pretty far along using AWS Lambda Layers, but the road was bumpy. Alda requires some fairly chunky dependencies.

Even after managing to squeeze Java and the Alda binaries into Lambda layers, the audio playback engine was failing to start in a serverless function.

I managed to clear through a number of problems but eventually my patience wore down and I settled with writing my own serverless function to generate the strings to feed into alda directly.

My goal here was to generate unique phrases, output them to MIDI, and then convert them to audio to be played almost instantaneously. For now, though, it’s easy enough to take the generated strings and drop them directly into the alda REPL or play them directly from a file.

It will be nice to see alda develop further and offer an online REPL – which would mean the engine itself would be light enough to perform the above too.

Using JSONPath Queries on JSON Data

JSONPath does for JSON processing what XPath (defined as a W3C standard) does for XML. JSONPath queries can be super useful, and are a great addition to any developer or ops person’s toolbox.

You may want to do a quick data query, test, or run through some JSON parsing scenarios for your code. If you have your data easily available in JSON format, then using JSONPath queries or expressions can be a great way to filter your data quickly and efficiently.

JSONPath 101

JSONPath expressions use $ to refer to the outer level object. If for example you have an array at the root, $ would refer to that array.

When writing JSONPath expressions, you can use dot notation or bracket notation. For example:

  • $.animals.land[0].weight
  • $['animals']['land'][0]['weight']

You can use filter expressions to select only the items that match a condition. The syntax is: ?(<bool expression>)

Here is an example that would filter our collection of land animals to show only those heavier than 50.0, returning their names:

$.animals.land[?(@.weight > 50.0)].name

Note the @ symbol, which refers to the ‘current’ item being evaluated in the boolean expression.

The wildcard character * is used to select all objects or elements.

There are more JSONPath syntax elements to learn about, but the above are what I find most useful and commonly required.
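
To make the expressions above more concrete, here is what they would select, written as plain JavaScript rather than with a JSONPath library. The data object is hypothetical and only shaped to fit the example paths.

```javascript
// Hypothetical data shaped to match the example expressions
const data = {
  animals: {
    land: [
      { name: "elephant", weight: 5400.0 },
      { name: "fox", weight: 6.5 },
      { name: "lion", weight: 190.0 },
    ],
  },
};

// $.animals.land[0].weight  (same as $['animals']['land'][0]['weight'])
const firstWeight = data.animals.land[0].weight;

// $.animals.land[?(@.weight > 50.0)].name
// '@' plays the role of the arrow function's parameter here.
const heavyNames = data.animals.land
  .filter(animal => animal.weight > 50.0)
  .map(animal => animal.name);

// $.animals.land[*].name  (wildcard: every element of the array)
const allNames = data.animals.land.map(animal => animal.name);

console.log(firstWeight, heavyNames, allNames);
```

The JSONPath versions are handy precisely because they express the same selection as a single string that can be pasted into a tool or passed to a library.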

JSONPath Query Example

Here is a chunk of JSON data, and some basic queries that show how you can easily filter down the dataset and select what you need.

JSONPath Queries – Example 1

Find all “Report runs” where root.id is equal to a specific value:

$.runs[?(@.root.id=="af1bcd6b-406f-43f9-86b3-9f01ee211ddc")]

JSONPath Queries – Example 2 (AND operator)

Find all “Report runs” where root.id is equal to a specific value, and shell.id is equal to a specific value:

$..runs[?(@.root.id=="af1bcd6b-406f-43f9-86b3-9f01ee211ddc" && @.shell.id=='d743537e393d')]
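
For comparison, both queries can be written as ordinary array filters in JavaScript. The report object below is hypothetical: it assumes a top-level runs array and includes only the fields the queries touch, since the real dataset is not reproduced here.

```javascript
// Hypothetical report data; only the fields used by the queries are shown
const report = {
  runs: [
    { root: { id: "af1bcd6b-406f-43f9-86b3-9f01ee211ddc" }, shell: { id: "d743537e393d" } },
    { root: { id: "af1bcd6b-406f-43f9-86b3-9f01ee211ddc" }, shell: { id: "0000000000aa" } },
    { root: { id: "11111111-2222-3333-4444-555555555555" }, shell: { id: "d743537e393d" } },
  ],
};

const rootId = "af1bcd6b-406f-43f9-86b3-9f01ee211ddc";
const shellId = "d743537e393d";

// Example 1: keep runs whose root.id matches
const byRoot = report.runs.filter(run => run.root.id === rootId);

// Example 2: both conditions must hold, as with && in the filter expression
const byRootAndShell = report.runs.filter(
  run => run.root.id === rootId && run.shell.id === shellId
);

console.log(byRoot.length, byRootAndShell.length);
```

Note that the second query starts with $..runs, which searches for runs at any depth, whereas this sketch only looks at the top level.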

Useful JSONPath Resources

Use this webapp to write and test JSONPath expressions live in your browser.