Don't understand some code? Delete it.
We've all done it. Knee-deep in some code, we've seen a block we don't really understand and thought "I'll just leave that there because I'm not entirely sure what it does, but it must be doing something". It's the less risky option, right? I mean, if you don't touch the code then it won't break.
In the short term, yes, but long-term the risks build up and up. I was sitting in the newly-opened Waterloo Tap on Sunday, enjoying a very nice IPA from a local brewery and discussing this with a friend. The starting point had been a talk on system flexibility by Gerald Jay Sussman that he'd recently been to in New York (worth a watch), but we moved on to talking about systemic risk in technology. Having both worked in an investment bank during the financial crisis, we could easily draw the (possibly overstretched) parallels.
Software engineers, much like traders, are often incentivised to focus on the immediately-apparent risks rather than the long-term ones. A trader pockets their bonus at the end of the year, in spite of the badly-understood, ticking time-bomb of a complex derivative that's going to sit on the books for thirty years. Similarly, leaving in place a piece of code you didn't really understand means today's release may not break anything, and everyone's happy with a seemingly successful deploy, but the cruft builds up over time, creating huge overall risk. Others then build systems that talk to that system, and eventually everything goes pop.
The RBS systems failure in 2012 is a prime example: some customers couldn't access their bank accounts for weeks on end. I fear this is just the tip of the iceberg, and we're going to see a lot more of this as more and more layers are added to the house of cards.
So, what do we do about it? One thing is obvious: be bold in ripping out code you don't understand. Inaction is still a decision, and it carries its own risk. See some code smell? Rip it out and see what breaks. Better something small break now than everything come tumbling down catastrophically a few years down the line. This is equally important at the architecture level. It's easy to overlook that shady interaction between systems that you know really shouldn't exist. But, again, these build up over time and create a spider's web of badly-understood dependencies.
If you can't measure it, you can't improve it.
Put some quality-management software in place to measure code quality. We're using SonarQube, and are in the process of figuring out the best ways to integrate it into our existing projects. Tools such as Scientist, which GitHub have just released, also let you be bolder in your changes, confident that there's a controlled way to roll them out.
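Scientist itself is a Ruby library, but the idea behind it translates anywhere: run the old (control) code path and the candidate replacement side by side, always return the control's result, and report any mismatch. A minimal sketch of that pattern in Python (the `experiment` function and its parameter names here are hypothetical, not Scientist's actual API):

```python
import random


def experiment(name, use, try_, publish=print):
    """Run the old (control) and new (candidate) code paths in random
    order, report any mismatch, and always return the control's result
    so callers see no behaviour change while the candidate is on trial."""
    results = {}
    # Randomise execution order so neither path consistently warms caches
    # for the other.
    for label, fn in random.sample([("control", use), ("candidate", try_)], 2):
        try:
            results[label] = ("ok", fn())
        except Exception as exc:  # a broken candidate must never leak to callers
            results[label] = ("error", exc)
    control, candidate = results["control"], results["candidate"]
    if control != candidate:
        publish(f"experiment {name!r}: mismatch {control} != {candidate}")
    status, value = control
    if status == "error":
        raise value  # re-raise only the control's exception
    return value


# Usage: keep the legacy path as the source of truth while trialling a rewrite.
legacy = lambda: sum(range(10))
rewrite = lambda: 45
assert experiment("sum-rewrite", use=legacy, try_=rewrite) == 45
```

The key design point, which Scientist shares, is that the candidate can be wrong or even raise without breaking production: mismatches are published for later inspection, and only the control's result (or exception) ever reaches the caller.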
How do you manage long-term software risk? What tools do you use? Do let me know your thoughts.
This originally appeared on our company blog. Picture credit (cropped): Austin Gruenweller