Design and Entropy

The bulk of the work I’ve done over the years has been in line-of-business and back office applications: systems that are often called “enterprise” systems, where the people using the system are often not the same people as those paying to build it. These systems are usually some sort of data gathering application, replacing what were once manual paper forms in some way (payroll, leave application, inventory): those kinds of problems where the larger form of the solution is similar to other systems, but the devil is in the details.

In those systems, there is usually a set of common idioms for how to structure the system: there is code that defines the domain objects or nouns of the system; above that is a layer of data access code whose job is to broker between the data store and the system at large; above that is a different layer of code, services, whose job is to orchestrate the different business operations that define the application’s purpose. And then finally, on top of that layer cake, there’s the code responsible for translating and responding to the user’s actions: the user interface.
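
To make the layer cake concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration (the `LeaveRequest` noun, the in-memory repository, the 15-day limit, the form handler); it is not taken from any real system, only a picture of how the four layers usually stack.

```python
from dataclasses import dataclass


# Domain layer: the nouns of the system (hypothetical example).
@dataclass
class LeaveRequest:
    employee_id: str
    days: int
    approved: bool = False


# Data access layer: brokers between the data store and everything else.
# A dict stands in for a real database here.
class LeaveRequestRepository:
    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def save(self, request):
        request_id = self._next_id
        self._rows[request_id] = request
        self._next_id += 1
        return request_id

    def get(self, request_id):
        return self._rows[request_id]


# Service layer: orchestrates the business operations.
class LeaveService:
    MAX_DAYS = 15  # assumed business rule, purely illustrative

    def __init__(self, repository):
        self._repository = repository

    def apply_for_leave(self, employee_id, days):
        if days <= 0 or days > self.MAX_DAYS:
            raise ValueError(f"days must be between 1 and {self.MAX_DAYS}")
        return self._repository.save(LeaveRequest(employee_id, days))

    def approve(self, request_id):
        request = self._repository.get(request_id)
        request.approved = True
        return request


# User interface layer: translates user input into service calls.
# It talks to the service only, never to the repository directly.
def handle_leave_form(service, form):
    try:
        request_id = service.apply_for_leave(form["employee_id"], int(form["days"]))
    except ValueError as error:
        return f"Error: {error}"
    return f"Leave request {request_id} submitted"
```

The point of the sketch is the direction of the dependencies: each layer only knows about the one directly beneath it. Most of the details that follow are about what happens when that rule quietly stops being true.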

But, as I said: the devil is in the details. The map of the system may look the same from up above, but once you look at it more closely, the differences suddenly become massive. You start questioning things: why should this class be over here? Shouldn’t this bit of code be in the method over there? Can we split up this method?

Plans

“No plan of operations extends with certainty beyond the first encounter with the enemy’s main strength”

Helmuth von Moltke the Elder

The way we structure our systems is often a fight against entropy and expedience. I’m pretty sure a lot of other systems’ designs have slowly decayed as bugs are fixed, or features implemented: the neat little separations of concerns slowly disappear as we try to wrestle with the limits of our own knowledge and understanding.

The system I’m currently working on was built by several people, over a couple of years, and I’ve been tasked with untangling it from its dependencies on AppEngine as its original host environment. There is a lot to be done, a lot of technical debt to be paid off: bits of code in the wrong layer of the system; parts of the UI code reaching down directly into the database, bypassing the service layer; code that no longer has any purpose, some commented out, some still floating in the corpus, but vestigial and shrunken from its former state.

Whenever I work on such systems, I try to figure out the lay of the land, where each bit of functionality ought to be, and where it is now. I perform a sort of code archaeology: looking into the system’s history through what it is now.

Understanding the lay of the land gives me a way to figure out how to fight entropy: it gives me a way to recover the lost structure of the system, covered in all of the scar tissue from real world use. I often fight my urge to criticize the system’s design as it is now, as often things are where they are for very good reason: a bug needed to be fixed here, causing a ripple of changes which duplicated code over there, and there wasn’t any time to clean things up; that module used to be part of a particular feature, but that feature was papered over in the time since, duplicated by a different feature.

Sometimes, though, you have to accept that the true design of the system is only evident in hindsight: only in the crucible of real world usage will you actually understand what code is necessary, and what isn’t. No plan survives first contact with the enemy.

From Here To There

There’s a very informative book I read early in my career titled Working Effectively with Legacy Code, by Michael Feathers. In it, he details various tools and strategies for working with legacy codebases, and he defines legacy code as any code that lacks tests. I’d argue further: all code eventually becomes legacy code, even with tests, as the whole design of the system slowly succumbs to entropy and chaos. Tests only prevent regression: they prevent you from reintroducing known bugs; they cannot stop you from introducing new ones. New bugs are often introduced because new code interacts in a novel way with old code: code paths never thought about, never really tested, combinations of state that have not really been considered. Entropy.

So, a lot of the strategies described in that book can still be effective in dealing with entropy in the large: finding the seams, breaking large bits of code into smaller collaborators, making small changes.
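
As a rough illustration of what finding a seam can look like, here is a small Python sketch. The `InvoiceReminder` class, its `send_message` collaborator, and the `clock` parameter are all hypothetical names made up for this example; the idea is only that the hard-to-control parts (the current date, the outgoing message) are passed in rather than reached for directly.

```python
from datetime import date


class InvoiceReminder:
    """Hypothetical example: sends a reminder when an invoice is overdue."""

    def __init__(self, send_message, clock=date.today):
        # Both collaborators are seams: production code passes the real
        # sender and keeps the default clock, tests pass in stand-ins.
        self._send_message = send_message
        self._clock = clock

    def remind(self, customer, due_date):
        if self._clock() > due_date:
            self._send_message(customer, f"Invoice due {due_date.isoformat()} is overdue")
            return True
        return False


# In a test, the seams let us pin down "today" and capture what was sent.
sent = []
reminder = InvoiceReminder(
    send_message=lambda customer, text: sent.append((customer, text)),
    clock=lambda: date(2024, 1, 10),
)
reminder.remind("acme", date(2024, 1, 1))   # overdue, so a message is recorded
reminder.remind("acme", date(2024, 1, 31))  # not yet due, nothing is sent
```

The change itself is small, which is the point: behaviour stays the same for existing callers, but there is now a place to stand when writing tests.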

Tests are only a part of the story; the design and structure of the system is another. Structure is anti-entropy in action: a way of reducing complexity and chaos. But tests allow you to have more confidence when restructuring the system: again, tests prevent regression, so you know at least you aren’t reintroducing problems, but we have to accept the trade-off that new problems may still arise.

Eventually, we have to accept that no system can be designed perfectly: entropy cannot be completely beaten. No business can afford to stay still for long. Eventually, other bugs will need to be fixed, and new features will need to be implemented.

We fight entropy in our systems every day: I mean, the whole endeavour of software is fundamentally a fight against entropy, come to think of it. We just need to choose which battles to fight.
