Google Went Monorepo And Never Looked Back


Google has way too much code in its codebase… and it chose to put it all in one repository. Any Google employee has access to all of Google's code, from Pixel to YouTube to Google Cloud.

Google made it work because working with a single repository is simpler. There is less dependency management to deal with, and there is a single source of truth.

Let's take it back to 2015 and talk about how Google stored 2 billion lines of code in 9 million source files, all of it in a single repository.

That’s right, all of Google’s 40+ million commits live in a single code repository. It scales, it works, and it has advantages.

They do it all with trunk-based development, where everyone commits to HEAD, the most recent version of the code.

To do all this at scale, Google does 2 things:

  • Trunk-based development

  • Coding Workflow

    Google writes all its code using trunk-based development. This just means everyone works on HEAD, the latest code in the codebase, and makes changes on top of that.

    Everywhere else? Make a new branch, add changes to it, and merge that branch into the main branch. But Google doesn’t do that here.

    Making a code change? Google has internal systems to rebuild everything that code change touched. If the build breaks, it’s not going through.
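To make "rebuild everything that code change touched" concrete, here is a minimal sketch. It is not Google's internal system: the `DEPS` graph, the target names, and the `build` callback are all made up for illustration. The idea is to walk the dependency graph backwards from the changed targets and block the change if anything affected fails to build.

```python
from collections import defaultdict, deque

# Hypothetical build graph: each target lists the targets it depends on.
DEPS = {
    "//youtube/player": ["//common/video", "//common/logging"],
    "//cloud/storage": ["//common/logging"],
    "//common/video": ["//common/logging"],
    "//common/logging": [],
    "//pixel/camera": ["//common/video"],
}


def reverse_graph(deps):
    """Map each target to the targets that depend on it directly."""
    rdeps = defaultdict(set)
    for target, needs in deps.items():
        for dep in needs:
            rdeps[dep].add(target)
    return rdeps


def affected_targets(deps, changed):
    """Return the changed targets plus everything that transitively depends on them."""
    rdeps = reverse_graph(deps)
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in rdeps[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def presubmit(deps, changed, build):
    """Allow the change only if every affected target still builds.

    `build` is a hypothetical callback: target name -> True if it builds.
    """
    return all(build(target) for target in sorted(affected_targets(deps, changed)))
```

In this toy graph, touching `//common/logging` affects every target, while touching `//common/video` only affects the player and the camera. That fan-out is why a change to shared code gets checked against everything that uses it.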

    Before you even get to that point, another engineer needs to review, comment, and approve that code. Code changes don’t go anywhere without approval.

    And who needs to approve? Someone who actually owns that area of code. Google has programmatic ownership checks on code directories to make sure random Google Cloud engineers aren’t approving changes to YouTube algorithms.
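Here is a rough sketch of what a directory-based ownership check could look like. The `OWNERS` table, the usernames, and the rule that owners of a parent directory also own its subdirectories are assumptions for illustration, not a description of Google's actual tooling.

```python
from pathlib import PurePosixPath

# Hypothetical ownership table: directory -> usernames allowed to approve.
OWNERS = {
    "youtube": {"yt_lead"},
    "youtube/ranking": {"ranking_dev"},
    "cloud": {"cloud_dev"},
}


def owners_for(path, owners=OWNERS):
    """Collect owners from the file's directory and every parent directory."""
    allowed = set()
    for parent in PurePosixPath(path).parents:
        allowed |= owners.get(str(parent), set())
    return allowed


def can_approve(reviewer, author, changed_files, owners=OWNERS):
    """A reviewer may approve only if they own every changed file and are not the author."""
    if reviewer == author:
        return False
    return all(reviewer in owners_for(f, owners) for f in changed_files)
```

With this table, `cloud_dev` cannot approve a change to `youtube/ranking/model.py`, but `yt_lead` and `ranking_dev` can.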

    So what does Google get out of all this? Here are the advantages.

    We’re all reading the same book and things don’t go stale. If other Google teams want to reuse shared code, they don’t have to worry about which versions they’re using. It’s all here and it’s all available.

    Everyone is using the same version of a dependency. If there needs to be an upgrade, you’ll know immediately if the upgrade broke something. It’s just easier to upgrade everything all at once to use the same versions.

    Changes aren’t split between different areas of code. If we need to rename something everywhere, we can do it in a single code change instead of spreading it across multiple repositories.

    Here’s what doesn’t work with a monorepo.

    It’s too easy to handle dependencies, which also means it’s too easy to add dependencies you don’t need. Code cleanup gets harder, and binaries and builds take longer. It’s a tragedy of the commons.

    With billions of lines of code, using grep probably isn’t going to cut it for finding code. Google has to invest in building its own tools to search code at scale.
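To see why a dedicated index beats scanning every file, here is a toy sketch of a trigram index, one common technique for code search. This is an illustration and not Google's tool; the function names are made up. The index maps every 3-character substring to the files that contain it, so a query only has to check the few files that share all of its trigrams.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(files):
    """Map each trigram to the set of file paths containing it."""
    index = defaultdict(set)
    for path, content in files.items():
        for gram in trigrams(content):
            index[gram].add(path)
    return index


def search(index, files, query):
    """Narrow candidates by trigram, then confirm with a real substring check."""
    grams = trigrams(query)
    if not grams:
        # Query shorter than 3 characters: fall back to scanning every file.
        candidates = set(files)
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    return sorted(path for path in candidates if query in files[path])
```

The index is built once and updated as files change, so each search touches a handful of candidate files instead of the whole repository.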

    To this day, Google still uses a monorepo and actively builds tools on top of it. Getting engineers to work on this is a lot of upfront investment, but the overall benefits have been worth it compared with splitting the codebase into multiple repositories.

    Google has always been monorepo and has never gone back.

