Jez Higgins

Freelance software generalist
software created
extended or repaired


Older posts are available in the archive or through tags.

Feed

Follow me on Twitter
My code on GitHub

Contact
About

Tuesday 09 May 2017 Rillet.js - a JavaScript streams library

In my recent talk at ACCU 2017 I spent some time talking about generators. Generators are, of all the things introduced in ES2015, probably the most important single feature. While it's always possible to do things in a programming language, often we don't because it seems like hard work or even because it simply doesn't occur to us. ES2015 generators say, explicitly, hey kids, this is a good idea right here. They open the door to a new type of programming in JavaScript, and will, I hope, lower the impedance mismatch that often exists between different libraries.

By a new type of programming in JavaScript I, of course, mean a type of programming that's been around approximately for ever but which is now easy and attractive in JavaScript. I'm talking about functional programming, a style of programming that avoids mutable data and changing state, widely popularised by Lisp (and then subsequently ignored for a long time because John McCarthy was just so absurdly clever it took the rest of us an age to really understand what he was talking about). More specifically, I'm talking about functional programming with streams which, in addition to sidestepping mutable state, can also be splendidly expressive and concise. Here's an actual question I found on Stack Overflow -

a=[{'report1':{'name':'Texas'}},{'report2':{'name':'Virginia'}}]

From the above javascript object I need to extract only 'report1' object. I can use foreach to iterate over 'a' but somehow need to extract 'report1' object.

One of the suggested answers is

var report1 = null;

for ( var i = 0, len = a.length; i < len; i++ )
{
  if ( 'report1' in a[i] )
  {
    report1 = a[i].report1;
    break;
  }
}
which, while correct and maybe even obvious, can't match the lovely expressiveness of
var obj = a.find(o => o['report1']);
This answer uses the find method which ES2015 adds to the Array class. JavaScript already had map, filter and reduce methods and together with find they provide the core functional programming tools that have delighted Lisp programmers for so long.
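For anyone who hasn't met these methods before, here's a small sketch of all four working on the array from the question. The variable names are mine, invented for illustration. One thing worth noticing is that find returns the matching array element, which here is the wrapper object, so getting at the report itself needs one more property access.

```javascript
const a = [{ report1: { name: 'Texas' } }, { report2: { name: 'Virginia' } }];

// find returns the first element for which the predicate is truthy
const wrapper = a.find(o => o.report1);     // { report1: { name: 'Texas' } }
const report1 = wrapper.report1;            // { name: 'Texas' }

// map transforms every element, producing a new array of the same length
const names = a.map(o => Object.values(o)[0].name);

// filter keeps only the elements matching the predicate
const startsWithV = names.filter(name => name.startsWith('V'));

// reduce folds the whole array down to a single value
const totalLength = names.reduce((sum, name) => sum + name.length, 0);

console.log(report1.name);   // Texas
console.log(names);          // [ 'Texas', 'Virginia' ]
console.log(startsWithV);    // [ 'Virginia' ]
console.log(totalLength);    // 13
```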

So problem solved, right? Why am I banging on about generators?

Array.find, Array.map, Array.reduce and friends are great and all, but the clue is in the name - Array. They work with arrays, so your data needs to be in an array. There's also an issue in the implementation. Consider this example, finding the first even number in an array filled with random numbers -

const arr = function_returning_an_array_of_random_numbers();

const first_even_number = arr.filter(n => n%2==0)[0];
Neat and elegant, right? Well, yes and no. Think about the filter method - what does it return? As the [0] on the end there shows, it returns an array. The filter method is, therefore, operating over the whole of arr, building a new array which it returns and which we, in this case, index into to grab the first item. That's unnecessary, redundant work. If arr isn't that long, it probably isn't an issue, but what if arr is long? That extra work might start to take noticeable time or memory. What if arr is infinitely long?
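To make that redundant work visible, here's a rough sketch that counts how often the predicate actually runs. I've swapped the random numbers for a fixed array of 1 to 1000 so the counts are predictable, and the counter variables are purely illustrative. For comparison it also counts the calls made by find, which does stop early but still needs the data to be in an array.

```javascript
// Fixed data rather than random numbers, so the counts are repeatable
const arr = Array.from({ length: 1000 }, (_, i) => i + 1);  // 1, 2, ..., 1000

let filterCalls = 0;
const viaFilter = arr.filter(n => { filterCalls++; return n % 2 === 0; })[0];

let findCalls = 0;
const viaFind = arr.find(n => { findCalls++; return n % 2 === 0; });

console.log(viaFilter, filterCalls);  // 2 1000 - every element was tested
console.log(viaFind, findCalls);      // 2 2    - stopped at the first match
```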

Crazy talk, right? An array can't be infinitely long, it's true, but a stream can be. And this is where generators enter the picture.

JavaScript generators are really an interface to a sequence, potentially infinitely long, of data. The source of the data could be a database cursor, a web socket, a file, a calculation, almost anything. Suppose that, in the example above, instead of a function returning an array of random numbers we had a generator producing a sequence of random numbers, and we wanted to find the first even number in that sequence -

function* randomInterval() {
    for(;;)
        yield Math.floor(Math.random()*101);  // random number between 0 and 100
}

let first_even_number;
for (const n of randomInterval())
  if (n%2 == 0) {
    first_even_number = n;
    break;
  }
Oh, that's worse.

This is where (at last!) rillet.js comes in. Through the well established technique of applying another level of indirection - in this case throwing more generators at things - we can rillet-ify (not a real word) this to

const from = require('./rillet.js').from;      // pull in module

function* randomInterval() {
    for(;;)
        yield Math.floor(Math.random()*101);  // random number between 0 and 100
}

const first_even_number = from(randomInterval()).filter(n => n%2==0).first();
What the Rillet library provides is a thin wrapper around a generator with some additional methods on it. Rillet methods include map, filter, and so on, and they operate similarly to the Array methods with the same name, with one important exception. Instead of returning an array, they return another Rillet instance, ie another generator.

In the code above, from is simply a factory method that calls the Rillet constructor. The class itself is the thinnest possible wrapper around the generator.

class Rillet {
    constructor(iterable) {
        this.iterable = iterable;
    } // constructor

    *[Symbol.iterator]() {
        yield* this.iterable;
    }

    // more of the class
}
A filter generator that wraps around a generator and only yields values matching the supplied predicate is straightforward
function* filter(iterable, predicate) {
  for (const a of iterable)
    if (predicate(a))
      yield a;
} // filter
Rillet's filter method just throws a new Rillet object around that ...
class Rillet {
  // constructor, etc

  filter(predicate) { return new Rillet(filter(this.iterable, predicate)); }

  // more of the class
}
Rillet's map and other methods work similarly. In this way, we can build up arbitrarily complex calculation pipelines. The effect of this is twofold - first, as I said above, we can handle arbitrarily long sequences of data, and second, no work is done until we call a method that actually exercises the pipeline. Furthermore, when we do finally exercise the pipeline, the smallest possible amount of work is done. Working with arrays is maximalist - everything is eagerly evaluated - while working with streams is minimalist - everything is lazily evaluated.
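The laziness is easier to believe when you can watch it happen, so here's a self-contained sketch. MiniRillet is a hypothetical stand-in I've written for this example, not the real Rillet class, and the map generator, the counting naturals source and the behaviour of first on an empty sequence are all my assumptions rather than the library's actual code.

```javascript
// Generator helpers: each wraps an iterable and yields values on demand
function* filter(iterable, predicate) {
  for (const a of iterable)
    if (predicate(a))
      yield a;
}

function* map(iterable, fn) {
  for (const a of iterable)
    yield fn(a);
}

// MiniRillet is a hypothetical stand-in, not the real Rillet class
class MiniRillet {
  constructor(iterable) { this.iterable = iterable; }

  *[Symbol.iterator]() { yield* this.iterable; }

  filter(predicate) { return new MiniRillet(filter(this.iterable, predicate)); }
  map(fn) { return new MiniRillet(map(this.iterable, fn)); }

  // Terminal method: pulls values only until the first one arrives.
  // Returning undefined for an empty sequence is an assumption.
  first() {
    for (const a of this.iterable)
      return a;
    return undefined;
  }
}

// Infinite source that records how many values it has produced
let produced = 0;
function* naturals() {
  for (let n = 1; ; n++) {
    produced++;
    yield n;
  }
}

const pipeline = new MiniRillet(naturals())
  .map(n => n * 3)
  .filter(n => n % 2 === 0);

const producedBefore = produced;   // building the pipeline does no work
const result = pipeline.first();   // exercising it pulls just enough values
const producedAfter = produced;

console.log(producedBefore, result, producedAfter);  // 0 6 2
```

Only the values 1 and 2 are ever generated: 1 maps to 3 and is rejected by the filter, 2 maps to 6 and is returned, and the infinite source is never asked for anything more.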

At time of writing, it's a library in progress so rather than describe every method I'm going to embed the README here so it'll always be up to date.

...


The Rillet source is available on GitHub. It's also available as an npm package

  npm install rillet


Tagged javascript, fp, code, and rillet

