Go ahead and take a wild guess as to how many lines of code make up the entirety of Google’s services. We’re talking everything here — search, Gmail, YouTube, Google Docs etc. Now you might want to aim high because there’s a good chance that even your best guess will come in far below the actual figure — a whopping 2 billion lines of code.
Now how about taking a stab at how large Google’s entire codebase is? How about trying 86 terabytes on for size?
These interesting tidbits come courtesy of Google engineering manager Rachel Potvin, who disclosed them at the @Scale engineering conference earlier this week. Not surprisingly, Potvin’s talk centered on the benefits and challenges associated with storing and managing a gargantuan codebase. To ensure that everything remains safe, Google’s mammoth code repository is stored and updated at 10 Google data centers located across the globe.
Interestingly, Potvin highlighted that many of the details on the slide below have never been shared outside of Google before.
Talk about some incredible figures.
All the more fascinating is that all of Google’s codebase, from search and maps to YouTube and Google Docs, resides in a monolithic source code repository available to and used by 95% of Google engineers, or about 25,000 users.
“Without being able to prove it,” Potvin said, “I’d guess that this is probably the largest single repository in use anywhere in the world.”
To put the 2 billion lines of code figure into perspective, Potvin added that the Linux kernel comprises 15 million lines of code across 40,000 files.
Also interesting is that 15 million lines of code across 250,000 files are changed every single week. On an average workday, Potvin says, there are about 45,000 commits.
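For a rough sense of scale, the figures quoted above can be run through some quick arithmetic. The short Python sketch below is only a back-of-the-envelope illustration: the constants are the numbers attributed to Potvin in this article, while the five-day work week and the function names are assumptions made here, not anything from the talk.

```python
# Figures as quoted from Potvin's talk in this article.
GOOGLE_LINES = 2_000_000_000        # total lines of code in Google's repository
LINUX_KERNEL_LINES = 15_000_000     # lines of code in the Linux kernel
LINES_CHANGED_PER_WEEK = 15_000_000 # lines changed in the repository each week
COMMITS_PER_WORKDAY = 45_000        # commits on an average workday

# Assumption (not from the talk): five workdays per week, weekend commits ignored.
WORKDAYS_PER_WEEK = 5


def kernel_multiples() -> float:
    """How many Linux kernels' worth of code the repository holds."""
    return GOOGLE_LINES / LINUX_KERNEL_LINES


def weekly_churn_percent() -> float:
    """Share of the codebase touched in one week, as a percentage."""
    return 100 * LINES_CHANGED_PER_WEEK / GOOGLE_LINES


def lines_per_commit() -> float:
    """Rough average lines changed per commit under the five-day assumption."""
    return LINES_CHANGED_PER_WEEK / (COMMITS_PER_WORKDAY * WORKDAYS_PER_WEEK)


if __name__ == "__main__":
    print(f"Linux kernels' worth of code: {kernel_multiples():.0f}")
    print(f"Codebase changed per week: {weekly_churn_percent():.2f}%")
    print(f"Average lines per commit: {lines_per_commit():.1f}")
```

By this estimate the repository holds roughly 133 Linux kernels' worth of code, and an amount of code equal to the entire kernel is changed every week. The per-commit average is the shakiest of the three numbers, since it depends entirely on the assumed work week.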
“Not only is the size of the repository increasing,” Potvin explained, “but the rate of change is also increasing. This is an exponential curve.”
All in all, Google’s gargantuan codebase is quite remarkable, and perhaps surprising to some who might solely associate Google with a sparse homepage which, as it turns out, belies the complexity which lurks beneath.
Now one of the benefits that comes with giving Google engineers access to the full breadth of Google’s source code is that it allows them to combine code from disparate sources.
It’s not just that all 2 billion lines of code sit inside a single system available to just about every engineer inside the company. It’s that this system gives Google engineers an unusual freedom to use and combine code from across myriad projects. “When you start a new project,” Potvin tells WIRED, “you have a wealth of libraries already available to you. Almost everything has already been done.” What’s more, engineers can make a single code change and instantly deploy it across all Google services. In updating one thing, they can update everything.
Video of Potvin’s full talk can be seen below.