Using Node.js Streams to Create a Toy Version of jq

To play around with Node.js streams, I made a simple ‘toy’ version of jq, a handy command-line JSON processor.

To be clear, the real jq utility is very lightweight and has a ton of functionality. In this post I’ll be mimicking some of its functionality using Node.js.

This will of course result in a bloated tool, with far more bundled in than we actually need.

That being said, it’s a useful exercise for learning a little about Node.js streams.

In the GitHub repository, you’ll see how easy it is to hack together a very simple command-line tool to process data streamed in from stdin.

[Image: Node.js stream Transform example with PowerShell pipeline processing. JSON goes into the tool, is filtered, and is then converted to an object in PowerShell through the pipeline.]

Node.js Streams

The Node.js stream documentation describes them as:

A stream is an abstract interface for working with streaming data in Node.js. The stream module provides an API for implementing the stream interface.

For this example I’ll be jumping straight into the Transform stream, stream.Transform. This is a duplex stream, where the output is usually related to the input in some way.

Transforming Input from stdin

The basic use of a Transform stream in Node.js (to process input from stdin) looks like this:

const { Transform } = require('stream');

const TransformStream = new Transform();

// _transform is called for each chunk of data read from the input
TransformStream._transform = (chunk, encoding, callback) => {
    // do something with the chunk, then signal that it has been processed
    console.log(chunk.toString().toUpperCase());
    callback();
};

process.stdin.pipe(TransformStream);
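
If you save that snippet as, say, uppercase.js (the filename here is just for illustration), you can see the transform in action by piping some text through it:

echo "hello streams" | node uppercase.js

HELLO STREAMS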

Toy jq Utility With Transform Streams

I created a ‘toy’ version of jq using a Node.js Transform stream. It’s a quickly hacked-together example, so don’t expect it to do everything the real jq can do. I’m also fully aware that jq itself is a very lightweight tool and that doing this in the Node.js runtime adds a lot of unnecessary bloat!

This is purely for demonstration purposes.
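
To give a rough idea of the approach, the core of such a tool can be a Transform stream that buffers stdin and applies a filter once the input ends. The following is a simplified sketch rather than the actual toyjq source; the buffering strategy and the bare '.field' filter handling are assumptions for illustration:

const { Transform } = require('stream');
const { inspect } = require('util');

const filter = process.argv[2]; // e.g. '.type'
const chunks = [];

const jsonFilterStream = new Transform({
    transform(chunk, encoding, callback) {
        // buffer the raw chunks until all of stdin has been read
        chunks.push(chunk);
        callback();
    },
    flush(callback) {
        const input = JSON.parse(Buffer.concat(chunks).toString());
        // apply a very naive '.field'-style filter if one was given
        const result = filter && filter.startsWith('.')
            ? input[filter.slice(1)]
            : input;
        // quote bare strings; pretty-print objects the way console.log would
        this.push(typeof result === 'string' ? JSON.stringify(result) : inspect(result));
        this.push('\n');
        callback();
    }
});

process.stdin.pipe(jsonFilterStream).pipe(process.stdout);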

Packaging up the Node.js app with pkg, we get a platform-specific binary called toyjq.
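
The exact pkg invocation depends on the project layout; assuming an index.js entry point (the targets and output name here are illustrative), it looks something like:

pkg index.js --targets node14-linux-x64,node14-win-x64 --output toyjq

With multiple targets, pkg appends a platform suffix to each binary, giving names like toyjq-linux and toyjq-win.exe, which are used in the examples below.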

Examples

Pretty-print the input JSON:

cat ./example.json | toyjq-linux

{
  name: 'directoryobject',
  path: '/path/to/directoryobject',
  type: 'Directory',
  children: [ { foo: 'bar' }, { foo: 'bar1' } ]
}

Output only the `type` field from the input JSON:

cat ./example.json | toyjq-linux '.type'

"Directory"

Output just the `name` and `children` fields from the input JSON:

cat ./example.json | toyjq-linux '{name, children}'

{
  name: 'directoryobject',
  children: [ { foo: 'bar' }, { foo: 'bar1' } ]
}

Now, switching to the Windows build of toyjq and using PowerShell cmdlets in the pipeline:

Select the `children` array, convert it to objects in PowerShell, and then select the `foo` value of the last item in that array:

(cat .\example.json | .\toyjq-win.exe '.children' | ConvertFrom-Json).foo | Select -Last 1

bar1

The above examples show how easily you can take data from an input stream (stdin in this case), process it, and send it along through the pipeline using Node.js streams.

You can find the example toyjq app in my GitHub repository.