For many people, the value of code review is rather unclear. Since the value is not clear code review is often not done, skipped, or forgotten entirely in favor of more obviously important work items.
In fact, the time cost of doing code review - lost time for the reviewer(s) plus the delay in time for the code under review to production - can make code review seem to be a poor use of time. This is particularly true if time is tightly constrained, which is typical in most software engineering organizations. Compounding this, if code isn't written with code review in mind, code review can be excessively difficult and time consuming, as the reviewing engineer attempts to make sense of too many lines of unclear, confusing code.
This post aims to share the true value proposition of introducing code review into your software development process. You will hopefully become convinced that the short term time cost of reviewing code is well worth the productivity and quality gained in the near, ongoing, and infinite future.
What should code review look like to the coder?
For producing code, a given engineer should do roughly the following in a loop:
- Define as small an idea as possible that needs to be implemented, or pop the latest one off the "stuff to do" stack.
- Implement the idea.
- Send out a diff of changes to the code that implement the idea for review by a peer(s), manager(s), or pertinent owner(s). This diff should include the changes and text explaining the changes and how they were tested.
- If necessary, iterate on feedback from the review and repeat the previous step.
- Ship the code.
A given engineer might have a few of these workflows going, ostensibly one for each distinct idea they are working on. A given engineer should have enough going on and get feedback from code reviews quickly enough that they always feel busy. A given engineer should also be putting out some sort of code for review with regularity and as close to daily or faster frequency as possible.
A more concrete example with the Phabricator tool stack on a git repository might look like
- git checkout -b idea_name - this creates a branch named idea_name. (I hope your idea has a better name.)
- Implement the idea.
- git commit -a -m "stash" - this creates a local git commit in the branch.
- arc diff - this starts an Arcanist workflow which ultimately sends a diff of changes with explanatory text to the Differential application on the Phabricator instance configured for your git repository. This creates a Differential revision where reviewers, the given engineer, and others can view the changes and leave feedback.
- Do other things while waiting for Phabricator notifications of diff feedback or acceptance from your co-workers.
- If necessary, iterate and re-run arc diff, which updates the Differential revision.
- Once accepted, run arc land idea_name - this pushes the change to the upstream.
What should code review look like to the reviewer?
For reviewing code, a given engineer should consider this their top time-wasting priority at any given moment. This is to say, when not actively producing and looking to take a break - say when one might check email or social networking sites before getting back to it - engineers should check their code review queue and seek to clear it. Further, in general, engineers should strive to give feedback within 24 hours, if not sooner as local work convention dictates.
Beyond this, a reviewer should have to do as little as possible thanks to solid tools. The code review tools should:
- Run lint and automagically correct errors to reduce human time involved.
- Run tests and report results as necessary or otherwise alter the workflow to reduce human time involved.
- Make it easy to browse the larger codebase for insight during review.
- Make it easy to leave feedback on the code in question.
Strong tools coupled with well-written diffs help reviewers feel like they either got to accept some super sweet code or they were able to provide some valuable feedback to their teammate.
With Phabricator and some configuration, when users run arc diff a lint process is run and suggested lint changes can easily be included as part of the workflow by the user. Similarly, unit tests can be incorporated into the arc diff workflow. Finally, the Differential application makes understanding the code and leaving feedback easy and intuitive.
The important detail in the sample process - small, manageable changes for everyone!
In the sample process the important detail is that a given engineer submits the fewest amount of changes necessary to implement a single idea. This is a trick to make "change management" much easier. In short, the smaller the set of changes the easier it is to manage it.
The immediate benefit for code review is (surprise!) code review is much easier if there is less code to review and its all intended to make a single idea work. The reviewer can more easily understand the complete set of changes. Note that this benefit extends to future engineers who can also more easily look at a series of easily-understandable changes than giant, hairy sets of changes. Further, this ease of change management can pay even more future dividends - consider reverting changes from production - which becomes easier if the changes are smaller and map to singular ideas. Consider launching a new product that integrates with your search service in one giant checkin. Wouldn't it be nice to be able to just roll back the particular commit that landed the search integration rather than the whole product? Even more ideally (perhaps), wouldn't it be nice to roll back the commit that turned on the integration to the search service? In reality, your organization is likely launching multiple updates in a given release, and such flexibility in reverting parts of the release could quickly prove invaluable to maintaining overall forward momentum.
Designing for easy change management has an even more awesome side effect however. Recall that good software design is hard, particularly breaking down complex systems into smaller, more manageable parts. If you have code review as part of your software engineering process then you have already placed significant pressure on engineers to decompose complexity. I think that all code can greatly benefit from this consistent, subtle pressure to make it singular in purpose and easily understandable to another engineer.
Ever make a mistake? Catch it before its too late.
I'm not really talking about bugs. Bugs do go out in code that's been peer reviewed but peers also catch many, many issues. The catch rate becomes higher as an organization has more distinct individuals working across a larger codebase. There will be many experts and mentors who will likely be reviewing changes for those who are less experienced in the given code. These reviewers catch valuable issues before they hit production in the code review process. But again, I'm not really talking about bugs.
What I am talking about is poor or otherwise sub-optimal design. Turns out that once you commit a change you are far less likely to fix it given feedback than if addressing the feedback is necessary to commit the code in the first place. Ergo, with more subtle issues like sub-optimal design, the simple inertia of committing the code can be enough to stop both potential feedback givers and code fixers, who will subsequently live forever with the sub-optimal design. What's truly fascinating is that long-term each individual will have their efficacy dictated by how efficiently they can work with the existing codebase. Code review helps make sure everyone can work with the codebase effectively and efficiently by helping keep the design quality high.
Even if you are an excellent engineer consistently, there is an increasing chance as the codebase grows that there is a better way to accomplish your given idea. Consider new abstractions, frameworks, or even programming language features that may obviate the need to write the code you've written altogether.
Code review helps catch these mistakes before they happen. Before they get into your codebase and become something no one really wants to fix even though they kind of make the codebase "dirty". Before they are a subtle tax on your organization that everyone knows are terrible but no one can justify taking the time to fix since it "works." Trust me, these things happen and it they happen a lot more often without code review in place.
Code review also helps in those instances where peer reviewed code makes it out to production. Now, when it comes time to fix the issue, multiple engineers have knowledge of the underlying code. Fixes should come faster, and if you communicate well around fixes, then everyone involved with the code can learn.
"Free" process for every single commit.
Now that you have everyone doing code review, you have "free" process for every commit...! Some of that process can really be free (automated and require no human time) too:
- lint tests - wow, no more debates about spacing conventions.
- unit tests - no more breaking the build...!
- your automated testing dream here!
Beyond this automation, a strong tool chain will also allow you to hook code review into other parts of your larger development process, such as task management. For example, Phabricator makes it easy to link Differential revisions to Maniphest tasks by including a reference to the Maniphest task in the original text given to arc diff. These links between revisions and tasks can also updated via the Differential and Maniphest user interfaces. More generally, arbitrary code can be written to listen to events from Phabricator.
All that said, there is one bit of process that is highly recommended every engineer doing code review consider. As part of submitting code for review, the engineer should provide a test plan which describes how they tested their change.
The idea here is that writing this down forces the engineer to double check that they actually adequately tested the feature. It also gives the reviewer a chance to augment the test plan, further reducing the chance of bad code making it to production. Finally, it ends up describing what the change actually does in an interesting way that may lead to valuable feedback and code changes as part of the feedback. (See this document for more details on how to think about and write good test plans.)
For example, consider a change that introduces a new class with a half a dozen methods, but the provided test plan only covers two of the methods. A reasonable discussion could next occur, where potentially the coder could realize with the reviewers help they had an insufficient test plan, or the other four methods would be removed, or perhaps be properly documented as unfinished.
What is it that you do again? Be better.
I think this is the most important point and also the most overlooked. As software engineers, we basically do one thing at the end of the day - we write code.
The best way to get better at something is to get timely feedback in detail about your actual performance and how it could be better. That is to say, at a certain point to advance your skill you benefit more from timely coaching based on actual performance than anything else. In this regard, software engineers are exceedingly lucky as its rare for a workflow to have such a natural stopping point where a peer can evaluate the work and provide advice and feedback. (This is also one of the big value propositions of pair programming, but that's a topic for another post. (maybe))
Whatever your organization is, its success is dictated most by the people within that organization. Make sure the software engineers in your organization have the right frameworks - in this case, code review as part of their larger software development progress - to help them achieve their long-term potential.
Adopting a code review process seems like a losing proposition at first since the value is subtle and the time cost is very obvious. However, code review actually makes your organization much stronger:
- The decomposition of code for review makes code easier to work with.
- Easier to review... and understand.
- Easier to revert more granularly if issues arise.
- Implicitly better design from continually trying to break code down into singular ideas with minimal code reach.
- Issues are caught before they hit production.
- Bugs are often caught by veterans, mentors, and owners in bigger organizations.
- Bad code is stopped before its committed, saving incalculable long-term organizational taxes by helping keep the code quality high.
- Additional automation and process can easily be added to the code review "choke point."
- Individual software engineers benefit immensely from the coaching that code review provides.
For these reasons, I cannot highly recommend enough that you consider adding code review to your software development process, and that if you need tools to do so you consider using Phabricator.