Shrinking your code review

by alex_gaynor

It's an unfortunate reality, but one of the few things we know about software quality is that lines of code is positive correlated with bugs, or as Notorious B.I.G. would say, "Mo Code Mo Problems". Code review faces a similar challenge: the larger a patch you're reviewing, the less effective your code review is1.

There's a few reasons for this:

In short, it's hard to focus your attention on what's important when there's a lot going on. Or as one developer succinctly put it:

How do we solve this? Smart ass answer: smaller code reviews! Ok but how?

Fundamentally there's only two ways to get smaller reviews: write less code or split your changes up across multiple reviews. I'm not going to focus on writing less code, a great much about has been written about this topic. Instead I'm going to focus on how to split up your changes.

I want to start with a meta point: don't try to prematurely split up your patch. If in the process of writing your feature or bugfix you notice something that belongs in its own patch, it makes sense to split it out into its own pull request immediately, but it's not worth obsessing about. There's nothing wrong with submitting a large PR because you weren't sure how to best split it up and working through that with your reviewers.

With that out of the way, here's a few tactics for splitting up a large patch:

Dependencies: If your patch upgrades a dependency, split that out. Often you'll bump a dependency and then go to resolve a deprecation warning or add a new feature using it. Do the bump in a separate PR. This is particularly important if you vendor your dependencies, where changes to the vendored files will "pollute" review space, making it easy to miss the changes to your software.

Utilities: If your patch adds stand-alone utility functions, split them out. These are often the easiest thing to break out, particularly pure functions which can be easily unit tested.

Functionality: If your patch adds functionality that spans several layers of your application, say a new feature for a webapp, which contains a new database table and accompanying business logic, controller methods, forms, and HTML, split the different pieces apart. First land the new database tables and migrations, then the business logic and accompany tests, then the forms, controllers, and HTML. For patches like this it's particularly important to post the "complete" patch before breaking it apart so reviewers have a bit of context what you're driving towards. If the scope of your feature is particularly large, you'll likely want to explore feature flags as a mechanism for allowing pieces to be landed individually without needing to be launch-ready.

Refactors: A common pattern is to start writing a feature (or a bugfix) and realize the code is a bit messy and to end up with both some cleanups or refactors and the feature you originally set out to write. Split that refactor out.

In the process of splitting a patch out, you may have to write small amounts of code that are specific to the intermediate state (for example, a test that a particularly code path isn't implemented yet). Each pull request that's merged should stand alone, tests that assert that the intermediate state is correct are just as important as ones that assert about the final state.

One final note: to do this in practice, it's important you know your VCS well. git in particular has many features which make this type of workflow pleasant if you know them well.

[1]https://smartbear.com/SmartBear/media/pdfs/11_Best_Practices_for_Peer_Code_Review.pdf

Hi, I'm Alex. I'm a software engineer at Mozilla, working on Firefox security. Before that I was a software engineer with the U.S. Digital Service. I'm an avid open source contributor and live in Washington, DC.