Jez Higgins

Freelance software generalist
software created
extended or repaired


Older posts are available in the archive or through tags.

Feed

Follow me on Twitter
My code on GitHub

Contact
About

Friday 07 July 2017 From X to Y

A month or so back I spent a hectic couple of weeks on a client job and have realised I'm now in a position to write one of those 'We reimplemented a piece of software originally written in language X using language Y and you won't believe what happened next' articles. Well, you may well not believe it but you can probably guess it, because articles like that are ten-a-penny and the reported results are always that some problem with the original code was solved in the most wonderful way. That problem could be speed, maintainability, memory use, lack of suitable job applicants, almost anything, but you can be sure it was resolved by the simple expedient of chucking out a load of code and writing some new code. The new code is invariably shorter and more expressive, somehow lacking the cruftiness of the old. The author invariably had lots of fun writing the new code, and often declares their chosen language to be their favourite. Frequently the article concludes with a little coda encouraging others to follow suit, or roundly insulting those so blinkered they choose not to.

This is not that article.

It is the case that, by myself and in nine days, I replaced a piece of production code developed by several people over a number of months. The new code is visibly faster, in the way that people notice without having to run benchmarks and draw graphs. It's leaner too - the VM it's deployed to has half the CPU and a quarter of the memory of the previous version. It doesn't time out, exceed its memory quota, or crash.

I must be ace on the old keyboard, right?

Well, perhaps. But if there's any lesson to be learned here, it's that reimplementing a system is generally substantially easier than building it the first time round. In this case, a data access layer for a website, the existing code told me exactly which endpoints I needed to support, and which parameters they took. The production logs told me which of those endpoints were actually being used, and with which parameter combinations. Almost straightaway, therefore, I knew where to concentrate my efforts, and what I could ignore completely. The existing code had a reasonable set of unit tests, which saved me the bother of having to come up with my own. I was able to look at the existing database queries, and translate them across into my new code. I could examine how the existing code had addressed a particular issue and evaluate whether I wanted to bring that into the new code. I had the ridiculous luxury of being able to run up the old code and my new code side by side, prod them with the same requests, and compare the results, one with the other.

In short then, I benefited hugely from the fact that the system already existed. There was just a ton of stuff I had to think a whole lot less about, and quite a lot I didn't have to think about at all. Consequently, I was able to spend more of my effort thinking about the things that did matter, and the result, unsurprisingly, was less code.

While I rarely suggest an all-out rewrite, I did one here because I was asked to. The language I worked in wasn't my choice; it was the client's. The new code is quicker, not because of any special property of that language, but because I was able to look at the old code, see the inefficiencies, and was given time and space to do something about it. If they'd chosen another language, the results would have been similar. Hell, if they'd chosen the same language as before, the results would have been similar. If I'd been left to work on the original code for the same amount of time, I probably could have produced something similar¹.

Even when you're throwing it out, there's a lot of value in old code. Be grateful for it.

1. Given their lack of faith in the existing code, I suspect that if I'd suggested just letting me have at it, I may well have been turned down. Had that suggestion been accepted, I probably would have only been given half as much time.
Tagged code

Monday 29 May 2017 Extending Rillet

Since first releasing Rillet.js, my Javascript streams library, back at the start of May, I've carried on working on it and somehow managed to produce, at time of writing, a further 10 releases of one sort or another. While some of those are trivial documentation fixes, I have added quite a number of additional terminal methods and a stream modifier too.

The motivation for part of this work has been to provide parity with JavaScript's Array class. If you can do it eagerly with an Array, you should be able to do it lazily with Rillet - hence the addition of every, some, none, and join. I've drawn inspiration (by which I mean copied) from other streams libraries, including .NET's Linq, Java 8 Streams, Python iterators, and my own previous work, which so far has resulted in the max, min, and sum terminators. Lastly, and perhaps most importantly, coming as it does from work I've been doing, is the modifier method uniq, which strips duplicates from the stream.

Of course, all modifiers can be written in terms of filter or map, and all terminators can be written in terms of reduce, sometimes trivially so. Here's max

  seq.reduce((a,b) => (a > b) ? a : b, -Infinity);

and sum, which is even simpler

  seq.reduce((a,b) => Number(a)+Number(b), 0);

Why bother, then? The first reason, and it's a pretty important one, is that it's simply more expressive. What a call to sum does is pretty obvious, while reduce((a,b) => Number(a)+Number(b), 0), although in no way obscure, still requires you to think a bit. This is even more the case for something like

  seq.filter(function() {
    const seen = {};
    return i => {
      if(seen[i]) return false;
      seen[i] = true;
      return true;
    }
  }())....

which is a whole lot wordier, and significantly less obvious, than

  seq.uniq()....

and also contains a subtle bug. Correctness, then, is another reason. Each time you have to write an extra bit of code, you risk getting it wrong. The more the library can do for you, the fewer bugs you'll write. Finally, there are cases where a specialised implementation can be more efficient. Rillet's none, some, and every can all return early when appropriate, while the same tests expressed in terms of reduce would have to continue until the sequence was exhausted. For most cases I expect the execution time difference to be trivial, but when it does count, it could be significant.
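As a sketch of how that early return might work (an illustration of the technique, not Rillet's actual source) - a some over any iterable can stop consuming the sequence at the first match -

```javascript
// an early-returning some over any iterable - a sketch of the
// approach, not Rillet's actual implementation
function some(iterable, predicate) {
  for (const a of iterable)
    if (predicate(a))
      return true;  // stop here - the rest of the sequence is never consumed
  return false;
}

// with an infinite sequence, early return is the difference between
// an answer and a program that never terminates
function* naturals() { for (let n = 1; ; ++n) yield n; }
const anyOver100 = some(naturals(), n => n > 100);  // true, after 101 elements
```

Expressed as a reduce, the same test would have to run the whole sequence, which here never ends.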

When I first put Rillet together it wasn't really much more than a little demonstration to support my ACCU talk. Rather to my surprise, my recent work has all been in JavaScript, and so I've been using it in anger. It's working out rather well so far.


Tagged javascript, fp, code, and rillet

Tuesday 09 May 2017 Rillet.js - a JavaScript streams library

In my recent talk at ACCU 2017 I spent some time talking about generators. Generators are, of all the things introduced in ES2015, probably the most important single feature. While it's always possible to do things in a programming language, often we don't because it seems like hard work, or even because it simply doesn't occur to us. The ES2015 generators say, explicitly, hey kids, this is a good idea right here. They open the door to a new type of programming in JavaScript, and will, I hope, lower the impedance mismatch that often exists between different libraries.

By a new type of programming in JavaScript I, of course, mean a type of programming that's been around approximately for ever but which is now easy and attractive in JavaScript. I'm talking about functional programming, a style of programming that avoids mutable data and changing state, widely popularised by Lisp (and then subsequently ignored for a long time because John McCarthy was just so absurdly clever it took the rest of us an age to really understand what he was talking about). More specifically, I'm talking about functional programming with streams which, in addition to sidestepping mutable state, can also be splendidly expressive and concise. Here's an actual question I found on Stack Overflow -

a=[{'report1':{'name':'Texas'}},{'report2':{'name':'Virginia'}}]

From the above javascript object I need to extract only 'report1' object. I can use foreach to iterate over 'a' but somehow need to extract 'report1' object.

One of the suggested answers is

var report1 = null;

for ( var i = 0, len = a.length; i < len; i++ )
{
  if ( 'report1' in a[i] )
  {
    report1 = a[i].report1;
    break;
  }
}
which, while correct and maybe even obvious, can't match the lovely expressiveness of

  var obj = a.find(o => o['report1']);

This answer uses the find method which ES2015 adds to the Array class. JavaScript already had map, filter and reduce methods, and together with find they provide the core functional programming tools that have delighted Lisp programmers for so long.
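As a quick illustration of those four working together (the data here is made up for the example) -

```javascript
const orders = [
  { id: 1, total: 40, paid: true },
  { id: 2, total: 25, paid: false },
  { id: 3, total: 60, paid: true }
];

// filter, map and reduce chained - the total value of the paid orders
const paidTotal = orders
  .filter(o => o.paid)              // keep only the paid orders
  .map(o => o.total)                // extract their totals
  .reduce((sum, t) => sum + t, 0);  // 100

// find - the first unpaid order
const firstUnpaid = orders.find(o => !o.paid);  // the id 2 order
```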

So problem solved, right? Why am I banging on about generators?

Array.find, Array.map, Array.reduce and friends are great and all, but the clue is in the name - Array. They work with arrays, so your data needs to be in an array. There's also an issue in the implementation. Consider this example, finding the first even number in an array filled with random numbers -

const arr = function_returning_an_array_of_random_numbers();

const first_even_number = arr.filter(n => n%2==0)[0];

Neat and elegant, right? Well, yes and no. Think about the filter method - what does it return? As the [0] on the end there shows, it returns an array. The filter method is, therefore, operating over the whole of arr, building a new array which it returns, and which we, in this case, index into to grab the first item. That's unnecessary, redundant work. If arr isn't that long, it probably isn't an issue, but what if arr is long? That extra work might start to take noticeable time or memory. What if arr is infinitely long?
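That extra work is easy to see - count the predicate calls (a small made-up array standing in for the random one) -

```javascript
const arr = [1, 3, 4, 6, 8, 9];  // a stand-in for the random array

let calls = 0;
const first_even = arr.filter(n => { ++calls; return n % 2 == 0; })[0];

// first_even is 4, sitting at the third element, but filter carried
// on regardless - calls ends up at 6, one per element of the array
```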

Crazy talk, right? An array can't be infinitely long, it's true, but a stream can be. And this is where generators enter the picture.

JavaScript generators are really an interface to a sequence of data, potentially infinitely long. The source of the data could be a database cursor, a web socket, a file, a calculation, almost anything. In the example above, suppose that instead of a function returning an array of random numbers we had a generator producing a sequence of random numbers, and we wanted to find the first even number in that sequence -

function* randomInterval() {
  for(;;)
    yield Math.floor(Math.random()*101);  // random number between 0 and 100
}

let first_even_number;
for (const n of randomInterval())
  if (n%2 == 0) {
    first_even_number = n;
    break;
  }

Oh, that's worse.

This is where (at last!) rillet.js comes in. Through the well established technique of applying another level of indirection - in this case throwing more generators at things - we can rillet-ify (not a real word) this to

const from = require('./rillet.js').from;      // pull in module

function* randomInterval() {
  for(;;)
    yield Math.floor(Math.random()*101);  // random number between 0 and 100
}

const first_even_number = from(randomInterval()).filter(n => n%2==0).first();

What the Rillet library provides is a thin wrapper around a generator with some additional methods on it. Rillet methods include map, filter, and so on, and they operate similarly to the Array methods with the same name, with one important exception. Instead of returning an array, they return another Rillet instance, ie another generator.

In the code above, from is simply a factory method that calls the Rillet constructor. The class itself is the thinnest possible wrapper around the generator.

class Rillet {
  constructor(iterable) {
    this.iterable = iterable;
  } // constructor

  *[Symbol.iterator]() {
    yield* this.iterable;
  }

  // more of the class
}

A filter generator that wraps around a generator and only yields values matching the supplied predicate is straightforward -

function* filter(iterable, predicate) {
  for (const a of iterable)
    if (predicate(a))
      yield a;
} // filter

Rillet's filter method just throws a new Rillet object around that ...

class Rillet {
  // constructor, etc

  filter(predicate) { return new Rillet(filter(this.iterable, predicate)); }

  // more of the class
}
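By analogy, a map generator can be sketched in just the same way (an illustration, not the library's actual source) -

```javascript
// a map generator that wraps an iterable and yields the result of
// applying fn to each value in turn - a sketch by analogy with
// the filter generator, not Rillet's own code
function* map(iterable, fn) {
  for (const a of iterable)
    yield fn(a);
} // map

// and, following the same pattern, the corresponding method would be
//   map(fn) { return new Rillet(map(this.iterable, fn)); }
```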

Rillet's map and other methods work similarly. In this way, we can build up arbitrarily complex calculation pipelines. The effect of this is twofold - first, as I said above, we can handle arbitrarily long sequences of data, and second, no work is done until we call a method that actually exercises the pipeline. Furthermore, when we do finally exercise the pipeline, the smallest possible amount of work is done. Working with arrays is maximalist - everything is eagerly evaluated - while working with streams is minimalist - everything is lazily evaluated.
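That minimalism is easy to see with a stripped-down pipeline over an infinite sequence (standalone sketches of the same generators, rather than Rillet itself) -

```javascript
function* naturals() { for (let n = 1; ; ++n) yield n; }  // an infinite sequence

// the same filter generator as above
function* filter(iterable, predicate) {
  for (const a of iterable)
    if (predicate(a))
      yield a;
}

// pull a single value from the pipeline, then stop
function first(iterable) {
  for (const a of iterable)
    return a;
}

// nothing runs until first() pulls a value, and naturals() is only
// advanced seven times - an eager filter over this sequence would
// never return at all
const firstMultipleOf7 = first(filter(naturals(), n => n % 7 == 0));  // 7
```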

At time of writing it's a library in progress, so rather than describe every method I'm going to embed the README here, where it'll always be up to date.

...


The Rillet source is available on GitHub. It's also available as an npm package

  npm install rillet


Tagged javascript, fp, code, and rillet

Saturday 29 April 2017 Talk: A Browse Through ES6

ES6 is (almost) the most recent version of the language most commonly known as Javascript. Its publication in 2015 was the first update to Javascript since 2009 and brought a number of pretty radical revisions to both language and library.

This session takes a look at some of the most significant features, the impact they have on the way we write Javascript, how we can start using them today, why we should, and a look forward to Javascript’s future evolution.


I presented this talk at ACCU Conference 2017.


Tagged code, javascript, talk, accu-conference, nordevcon, and rillet

Sunday 26 February 2017 First ice cream van of the year ...

A longitudinal study

Recorded this year at 15:55 on Sunday, 26 February. That's early - the fourth earliest recorded - and reflects the relatively mild winter.

Analysis!

The full ice cream van data is available in this spreadsheet

Previous years:


Tagged icecream