Researchers are still working to understand the impact of the Juniper incident – the details of how the VPN traffic decryption backdoor works are still not fully understood. That such devastating backdoors could make it into such a security-critical product, and remain undetected for years, has shocked many (and pushed many others deeper into their cynicism).
There are, though, some questions that are far more important in the long run:
- How was the code inserted?
- Who inserted the code?
- Was someone directed to make these changes?
- Why wasn’t it detected?
Today a member of my development team pushed a commit that caught my eye – our policy on commits is simple: small, logical commits. This keeps the history easy to study and simplifies following changes during code reviews. The diff for this change was 72,395 lines. Seventy-two thousand lines.
When I saw this it made me think about Juniper and how easy it would be to insert a backdoor into a complicated code base. What controls are in place to catch it, and just how effective are they?
The change he made was to add support for a new caching layer – a simple change that touched code throughout the application. Even fairly simple refactoring can lead to massive changes; tens of thousands of lines is easy to achieve.
With such large changes, it would be simple to add an extra line of code to insert a backdoor – when performing a peer code review, how many people would catch that one line among thousands of legitimate lines? How many people are dedicated enough to pay that much attention to each and every change?
Can a code review even be trusted?
For many, the code review is a critical control – a detailed review of each change. A review often checks for many things at once – style, correctness, performance, and of course security – which can complicate or even compromise it. Yet this first line of defense can be the weakest if not managed properly; code reviews rely on people being focused, taking their time, and scrutinizing every detail.
There are some real issues here:
Deadlines: Are code reviews rushed to make a shipping deadline? If a developer lands a change too late, is the change rejected, or the release delayed? Or is the review rushed so it can ship on time?
Review Timeline: Does the reviewer have enough time to perform the review in enough detail to catch everything? If presented with a 72,000 line commit, do they have the time to read each and every line?
Check Every Line: In the Juniper case, the backdoor password was chosen to look like a format string, in the hope that it would be overlooked – which seems to have been a brilliant decision. Skipping debugging and error-handling code may save time, but it can cause the reviewer to miss something critical.
Attention: Is the reviewer focused on the code review, or squeezing it in between more pressing tasks? If they aren’t able to give it their full attention, it can be hard to spot subtle backdoors, especially in complex areas.
Training: Is the reviewer properly trained to understand the impact of security-related changes? How well do they understand the technologies being used? For example, if they are reviewing crypto-related code, would they understand the consequences of making a change to a random number generator?
Escalation: Does the reviewer have resources available should they run into something they aren’t sure about? If they encounter something that is beyond their expertise, can they go to an expert that can provide the answers, or are they expected to figure everything out on their own?
Of all the checks that are performed during the development process, the code review is the single best chance to catch backdoor attempts before they go too far – you have a person analyzing each change, and a chance to discuss the intentions behind it. That is, if the review is being done properly.
What about automated tools?
If there are questions about the quality of code reviews, what other options are there? There are of course static and dynamic analysis tools that provide insight a person may miss – but they are far from perfect. False negatives happen all the time, leading to an assumption of safety that may well be false – which can lead a reviewer to just skim changes. False positives tend to be even more common, leading reviewers to discount the results and potentially ignore legitimate critical findings.
Performing detailed taint analysis could potentially have discovered the Juniper SSH backdoor – though it wouldn't have identified the backdoor added to allow decryption of VPN traffic.
The VPN backdoor was a combination of an existing bug that exposed raw output from Dual EC DRBG and a change to a single constant. No tool would have detected that this was a backdoor: it's simple, it's subtle, and it was well thought out. Only a person who understood what they were looking at would catch it. That simple.
Automation helps, it makes life easier, it can show a reviewer places that need attention – but automation alone is not a replacement for dedicated, properly trained, focused people reviewing code changes line by line.
There are problems with code reviews; much is expected of those who perform them, and they don't always have the resources to do the job properly. If there's a failure at the code review level, a developer with a clue can easily slip a well-thought-out backdoor past static and dynamic analysis, security testing, and the like.
Cutting corners on code reviews may be one of the worst mistakes you can make when it comes to ensuring the security of your code.