Adding flexibility to build processes

 

Knowing where to break the chain

One of the ways we can improve a build, release and deploy process is by changing where in the chain we make use of dependencies. For example, people quite commonly want to deploy as frequently as they can, so that they can iterate quickly and get faster feedback on how valuable users find a feature or other change.

The first iteration of this might be to implement some Python scripts that build and package the application from source code, then push and install the packages onto the target machines. Alternatively, you might end up deploying to AWS, using Chef at instance start time to build and then configure some Docker containers, which then get run by a container scheduler.

In each case, we defer a significant amount of work to deployment or instance start-up time respectively, and end up with a large amount of redundant work, since the same build or configuration steps are repeated for every deployment or every new instance. As an additional risk, if any services on the critical path (eg: a Chef server or registry) are liable to fail, then not only does that increase the risk of a deployment failure, it can also lengthen the time to recovery if you need to redeploy to fix an error.

So in typical software engineering terms, we’ve ended up with a high degree of temporal coupling in our build process, in that we have several items that may change for different reasons (eg: a base system configured with Chef and an application Docker image) that are forced to change in lockstep. [1]

In each process, we end up with two parts of the system that can change at different rates, ie: the artifacts output from the build process, and the state of the target systems. For example, you might create build artifacts from feature branches for testing purposes, but only deploy work from the master branch to production, and then to many identically configured machines. Following Plato’s notion of carving nature at the joints, this difference in rates of change implies we can introduce a degree of freedom into our process.

That’s all a rather long-winded way of saying that we can split our process into stages.

So, for example, rather than using chef-client to configure an AWS machine whenever a new instance is created, we could create an AMI that is pre-configured with all of the changes that Chef would make, and set up ahead of time to start the requisite services on boot. For Docker images, we can have a single instance of a registry, which can be seeded with image tarballs (ie: the output of docker save) stored on a well-known web server (eg: S3).
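
The registry seeding described above might be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name `seed_registry`, the registry address and the tarball-to-image mapping are all hypothetical, and it assumes the tarballs have already been fetched from the web server to local disk. Only the standard `docker load`, `docker tag` and `docker push` commands are used.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Mapping, Sequence

# A runner takes an argument vector and executes it (or records it, for a dry run).
Runner = Callable[[Sequence[str]], None]


def run_checked(argv: Sequence[str]) -> None:
    """Run a command, raising CalledProcessError if it exits non-zero."""
    subprocess.run(list(argv), check=True)


def seed_registry(
    tarballs: Mapping[Path, str],
    registry: str,
    run: Runner = run_checked,
) -> List[str]:
    """Load each saved image and push it to `registry`.

    `tarballs` maps a local `docker save` tarball to the image reference it
    contains (hypothetical example: "myapp:1.0"). Returns the pushed references.
    """
    pushed = []
    for tarball, image in tarballs.items():
        target = f"{registry}/{image}"
        run(["docker", "load", "--input", str(tarball)])  # import the tarball
        run(["docker", "tag", image, target])             # name it for the registry
        run(["docker", "push", target])                   # upload to the registry
        pushed.append(target)
    return pushed
```

Passing a different `run` callable (for example `print`) gives a dry run that shows the commands without needing a Docker daemon.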

Or, instead of re-building packages every time we deploy, we could publish signed packages to a well-known repository that can be consumed by apt-get, yum or similar. This also means that even if you don’t have bit-for-bit reproducible builds, you can still minimise accidental differences in configuration, because every machine installs the same built artifact, as well as making software provenance more legible.
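
One way to picture the build-once, deploy-many split is a publish step that records a checksum alongside the artifact, and a deploy-side step that verifies it before installing. The sketch below rests on several assumptions: a local directory plays the role of the package repository, a SHA-256 manifest stands in for real package signing (apt and yum repositories use GPG signatures instead), and the names `publish` and `fetch_verified` are hypothetical.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def publish(artifact: Path, repo: Path) -> Path:
    """Build stage: copy the artifact into the repository and record its checksum."""
    repo.mkdir(parents=True, exist_ok=True)
    published = repo / artifact.name
    shutil.copy2(artifact, published)
    manifest = repo / (artifact.name + ".manifest.json")
    manifest.write_text(
        json.dumps({"name": artifact.name, "sha256": sha256_of(published)})
    )
    return published


def fetch_verified(name: str, repo: Path, dest: Path) -> Path:
    """Deploy stage: copy a published artifact to `dest` if its checksum matches."""
    manifest = json.loads((repo / (name + ".manifest.json")).read_text())
    source = repo / name
    if sha256_of(source) != manifest["sha256"]:
        raise ValueError(f"checksum mismatch for {name}")
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, dest / name))
```

Because the deploy step only copies and verifies, every target machine receives the same bytes, whether or not the build itself is reproducible.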

The downside to this is that each degree of staging can introduce additional complexity: package repositories need to be considered when thinking about software security, for one. So as with any kind of system, there’s a trade-off between the additional flexibility granted by the degrees of freedom, and the carrying cost.

As a postscript, this was somewhat inspired by ideas from J. B. Rainsberger’s Demystifying the Dependency Inversion Principle, and Donella Meadows’ Leverage Points applied to software supply chains.


  1. Technically, this might well be the dual of coupling, or perhaps de-coherence, as we have components that change for different reasons forced together, rather than having components that have common reasons to change forced apart.