I recently re-read The Soul of a New Machine, and one of these days I should write about it in a little more detail – it’s one of those books that I would definitely recommend to anyone in the industry, even if some of the experiences chronicled there aren’t necessarily the best in hindsight. I recall a passage where the team is trying to debug their machine, which struck me as pretty relevant even now.
… The machine has failed. They have their pictures. They pull up their chairs and start studying snapshots of signals.
They are trying to figure out exactly what Gollum is doing when it fails. The pictures and the printed “listing” of the steps in the diagnostic program give them the answer.
Debugging is hard. It involves constructing a mental model of the system’s state, and a model of
what the system is doing. Having your software tell you exactly what
it’s doing is quite useful; that may mean inserting logging
statements, or even inserting a
Sure, having modern tools – debuggers integrated into your development environment, providing watches and breakpoints – gives you a leg-up into understanding, inspecting, and sometimes altering the state of the system at any particular time. I’m not averse to using an IDE to debug the software I work on: I’m not a Luddite.
In a lot of cases though, it’s much simpler to insert a
Personally, I find it useful to use
Like I mentioned previously, I’m not averse to using a debugger, but
debuggers often introduce additional difficulties – separate
build modes, or separate execution modes. So, as a first-pass tool,
the simplicity of
However, there are times when you simply can’t use a debugger, or setting up your system for debugging is overkill: maybe the conditions that trigger the bug you’re trying to fix involve running several iterations through data, or waiting several minutes, and stepping through each iteration is tedious. Maybe you’re hunting a concurrency bug, a pretty time-sensitive affair, and the insertion of breakpoints or watches might perturb things enough to make things suddenly work.
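To make the "several iterations" case concrete, here is a minimal sketch in Python, chosen purely for illustration. The `running_totals` function, its input data, and the "total went negative" condition are all hypothetical stand-ins for whatever state you actually suspect. The idea is to let the loop run at full speed and only report the iterations that look wrong, rather than stepping through each one.

```python
import logging

# Send debug output to stderr with a compact format.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("trace")


def running_totals(values):
    """Hypothetical workload: accumulate values and return each running total."""
    total = 0
    totals = []
    for i, value in enumerate(values):
        total += value
        # Only report suspicious state, so the many normal iterations stay quiet.
        if total < 0:
            log.debug("iteration=%d value=%r total=%r", i, value, total)
        totals.append(total)
    return totals


if __name__ == "__main__":
    running_totals([5, 3, -10, 4, 1])
```

With this input, only the third iteration (index 2) trips the condition, so a single line of output points at the step worth looking at.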
In those cases, knowing that you can at least fall back to a much
simpler tool for figuring out the state of the system is
useful. Sometimes, all you really need is