Life With Gerrit Code Review

Gerrit is a code review tool, based on Git, originally written for the Android project. It's the center of our workflow on the OpenStack project. In every other programming job I've had, every programmer in the company has had commit rights to the central repository, and code review has been an occasional ad hoc practice, mostly ignored in the name of expediency. On OpenStack, only Gerrit has commit rights to the main branches of the central repository, and if your code isn't reviewed, it doesn't get in. This is a very different way to work, and wanting to try it was part of the reason I decided to work on OpenStack.

On OpenStack Nova, we have Gerrit configured to require a +2 vote from each of two core reviewers, plus a +1 vote from our Jenkins automated test system. So, basically, you have to impress two senior programmers and one robot. Non-core reviewers can and do review code as well, but they can only vote +1, not +2, and their approval isn't enough to get your code in.

On a nuts-and-bolts level, using Gerrit like this requires a ton of work by a talented infrastructure team, who configure the central Git/Gerrit server and the Jenkins server with all its automated test jobs. I'm not on that team, though, so that's all magic to me. The thing that every single programmer on the team has to deal with is a particular Git workflow reflecting that changes go to Gerrit rather than being pushed directly to the central repository's master branch, plus using the Gerrit web interface to review code and to view feedback from others.

On the Git side, we use the git-review tool to simplify submitting local Git branches to Gerrit. Basically, you have to follow a few steps, nicely documented in a wiki page, to set up your local repository to communicate with Gerrit. Some of these are things you'd already do with the average centralized Git setup, like configuring your username, email address, and ssh keys and cloning the central repository to your local computer. Luckily, most of the extra setup is a one-time thing. The things that are different day-to-day are that we always develop in a local branch, never on master; that instead of using "git push" to send our local changes up to the server, we use "git review" to send them to Gerrit; and that most branches are a single commit rather than a series of commits, so if you need to make changes to appease a reviewer you run "git commit --amend" to modify the last commit, rather than a normal "git commit" to make a new one.
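
To make that concrete, here's roughly what the loop looks like in practice, assuming git-review's standard setup option; the branch name and the editing step are just placeholders, and the wiki page has the authoritative details:

    # one time, after cloning the repo and configuring your name, email, and ssh keys:
    git review -s                        # lets git-review add the "gerrit" remote and commit hook

    # day to day: always work on a topic branch, never on master
    git checkout -b fix-scheduler-typo   # branch name is made up
    # ...edit code, run tests locally...
    git commit -a                        # one commit per change
    git review                           # sends the change to Gerrit instead of "git push"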

When you run "git review", it sticks an ugly line like "Change-Id: Ie4a59eeb7e3895f5d35471377c3bea462c690210" at the end of your commit message. This is Gerrit's change id, and it's separate from the Git commit id because Git's id will change if you rebase or amend your commit (which we do often), while Gerrit's id needs to remain constant. A few seconds after you run "git review", your commit shows up in the Gerrit web interface, in our case at review.openstack.org (If it's a modification of an existing commit with the same Change-Id, then the new change shows up as a new Patch Set under the existing change.)

After code is submitted for review, you're at the mercy of two classes of reviewers. The human ones are busy with other reviews, writing their own code, attending meetings, playing Foosball, whatever. We have a large enough team on some OpenStack projects that you'll usually get your first human review within a few hours, but if it's a weekend or holiday then you might have to wait. And of course you don't just need one review, you need two +2 reviews from core reviewers, so if there's some back-and-forth about the quality of your change, it can take a few days to get everything through the system.

And then there are the bots. Bots are tireless and ready to review code 24/7/365, but we have lots of tests to run and finite hardware to run them on, so in practice it can sometimes take about an hour for your commit to get to the front of the queue and have the tests run. (This depends on how many other commits are going in; you get faster test runs during off-peak hours.) Our tests consist of a bunch of core OpenStack functional tests like starting virtual machines, plus running the project's entire Python unit test suite, plus a customized version of the pep8 tool to enforce the project's coding standards. (That last check is, in my opinion, the best thing ever because it means I never need to look at 250-column-wide OpenStack code that doesn't fit in my xterm, and because it frees up human reviewers from needing to deal with style so they can focus on substance.)
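
You can save yourself a round trip by running the same kinds of checks locally before submitting. The exact commands vary from project to project (and the path below is just an example), but it's along these lines:

    # catch style problems locally before the bot does
    pep8 nova/

    # many projects also wrap their checks in tox environments, something like:
    tox -e pep8      # the style check
    tox -e py27      # the unit test suite under Python 2.7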

If all the automated tests pass and two core reviewers like your code enough to vote +2, then you're done, and your commit eventually gets into the project's master branch. That sometimes happens on the first try, for very simple changes. Usually, it takes a few revisions. You go over 80 columns in one place and get dinged by pep8, some unit test fails, or some reviewer thinks you have a bug in your logic or a typo in your commit message or need a comment to describe some hairy logic. In cases where it's clear what the problem is, the solution is pretty simple: you modify the code, run "git commit --amend" to modify your commit, and run "git review" to push the new patch set to Gerrit and hope they like it better this time. Sometimes spurious test failures happen, and you have to tell Gerrit to "recheck" or "reverify" your existing commit. Eventually, everyone is happy and your code gets in, and then you can delete your local Git branch and go work on something new.
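
In shell terms, the revise-and-resubmit loop looks roughly like this; the rebase step is only needed if master has moved on and your change no longer merges cleanly:

    # fix whatever the reviewer or bot complained about, then fold the fix
    # into the same commit so Gerrit sees it as a new patch set
    git commit -a --amend     # the Change-Id footer stays put, so Gerrit groups it with the old patch sets
    git review                # upload the revised change

    # if needed, rebase onto the current master first
    git fetch origin
    git rebase origin/master
    git review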

How does this change your day-to-day routine as a programmer? First, it takes longer to get small changes in. Something small that might take 5 minutes to code and then 1 minute to commit and push now takes 10 minutes to code (because you're being more careful to avoid being dinged by a reviewer) and a couple of hours to get reviewed. Second, as with any project that has centralized test infrastructure, you spend a lot of time saying "but that test passed on my machine!" and then digging through logs, figuring out why it fails for the Jenkins bot. Third, you spend a lot of time reviewing other people's changes.

What do you get for all that effort? First, all the code in the project's master branch has been reviewed by at least two senior people besides its author, and has made it through a pretty comprehensive test suite. In short, obvious junk with syntax errors doesn't get in at all. Buggy code can of course still get in (you can't test for everything), but the bugs have to be more subtle. Second, because of the way we continually amend a single commit during the review process, each change in the master repository tends to be a "perfect patch", with mostly correct code and a mostly useful commit message, adding one feature or fixing one bug. When later reviewing project history, you don't usually need to look at the one commit that adds most of the feature, then two more later that day that fix bugs in it. (That, in turn, makes it easier to backport fixes to other branches, since they're more likely to be self-contained.) OpenStack's code isn't perfect, but it's a lot better than on any other project of similar size that I've worked on. Third, because you spend a lot of time reviewing other people's code, you accidentally learn about parts of the project that you haven't directly worked on yet.

Would I choose to use Gerrit on a new project? Yes, with a few reservations. First, you need a solid enough infrastructure team to set it up and maintain it well. If all commits are gated by Gerrit and Gerrit breaks, you need to fix it right now or integration of new changes will stop. Second, you need a big enough team that some code reviewers are just about always available (at least during the team's core business hours), to keep work from grinding to a halt while waiting for reviews. Third, you need people who are willing to constructively criticize other people's code, and people who are willing to accept constructive criticism. If everyone just says "+1" or "+2" all the time without really looking hard, then you're just wasting your time. Fourth, you need an out-of-band way to communicate when authors and reviewers have a major disagreement and they need to have a ten-minute interactive conversation rather than just slowly throwing one-liners back and forth in the code review tool. If you're in the same office, that's easy. OpenStack is a global multi-company team, so we use IRC, which works fine.