Google: Docker Does Containers Right

Two leaders for Google's cloud computing services talk about how they use containers in the data center, and why virtual machines also remain critical to their strategy.

Charles Babcock, Editor at Large, Cloud

February 19, 2015


Google has a decade of experience in using containers to improve how its data center runs search, Gmail, and other operations. But the company has rarely talked much about its experience, even though it's launching more than 3,000 containers per second. That's about to change as the company starts to evangelize container use and an open-source project it supports.

Craig McLuckie, senior product lead for Google Compute Engine, is slated to speak today at the Linux Foundation Collaboration Summit in Santa Rosa, Calif., where he will describe how Google generates, manages, and tears down 2 billion containers a week. In an interview with InformationWeek on Wednesday, he previewed the lessons he will discuss at the summit.
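The two figures quoted so far, 2 billion containers a week and more than 3,000 per second, can be checked against each other with simple arithmetic. The sketch below assumes launches are spread evenly across the week, which is a simplification rather than anything Google has stated; the function name is made up for illustration.

```python
# Rough consistency check of the two figures quoted in the article.
# Assumption: launches are spread perfectly evenly over the week.
CONTAINERS_PER_WEEK = 2_000_000_000      # "2 billion containers a week"
SECONDS_PER_WEEK = 7 * 24 * 60 * 60      # 604,800 seconds


def average_launch_rate(containers_per_week: int) -> float:
    """Average container launches per second over one week."""
    return containers_per_week / SECONDS_PER_WEEK


rate = average_launch_rate(CONTAINERS_PER_WEEK)
print(f"{rate:,.0f} containers per second")  # prints "3,307 containers per second"
```

On that assumption, the weekly total is consistent with the "more than 3,000 per second" figure.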

Over the last two years, McLuckie has been an advocate inside the company for Google to nurture a broader community involved in container management. Google had already launched a container management system known internally as "the Borg," then outgrew its capabilities and replaced it with something it called "Let Me Contain That For You."

"We don't have a lot of skill at naming things. Can you tell?" he said during an interview in the courtyard of the Vineyard Creek Hyatt, where the summit is being held.


Let Me Contain That For You was replaced about two years ago by Kubernetes, a container cluster management system capable of supervising container life cycles on the large scale that Google requires. McLuckie and Joe Beda, senior staff engineer, became founding members of the Kubernetes open source code project when Google launched it last July. (Kubernetes is a Greek word meaning roughly helmsman.)

Google engineers had come up with the concept of Linux Control Groups, one of the basic building blocks that Docker uses in constructing a container system, and donated them to the Linux kernel team. But as McLuckie continued to work on Google's container management, he realized Google was venturing further and further out on a limb by itself. "It was really lonely in the container space," he recalls.

Docker Gets Google's Attention

Just as Google was launching the elements of a Kubernetes system internally, a company then known as DotCloud produced something it called Docker. Docker "captured lightning in a bottle," McLuckie recalls. It recognized that the Linux Syscall, the interface that an application uses to converse with the Linux kernel, was a stable and standard element of each Linux distribution. By relying on Syscall, a container could package up an application with just the elements of Linux that were needed by the app, eliminating most of the bulk of an operating system.
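To make the idea of the system call interface concrete, here is a minimal sketch in Python. It is not Docker code. It only illustrates that an ordinary application reaches the kernel through a small set of calls; the functions in Python's standard `os` module used here are thin wrappers around the corresponding system calls (pipe, write, read, close). The helper name `echo_through_kernel` is invented for this example.

```python
import os


def echo_through_kernel(message: bytes) -> bytes:
    """Send bytes into a kernel-managed pipe and read them back out."""
    read_fd, write_fd = os.pipe()                # pipe(2): the kernel creates the pipe
    try:
        os.write(write_fd, message)              # write(2): hand the bytes to the kernel
        return os.read(read_fd, len(message))    # read(2): get the bytes back
    finally:
        os.close(read_fd)                        # close(2): release both descriptors
        os.close(write_fd)


print(echo_through_kernel(b"hello from user space"))
```

The application never touches the rest of the operating system here, only the kernel interface, which is why a container can leave most of a distribution behind.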

Docker built the package as a set of layers, something like striations of sediment in geologic formations. Changes or additions could be made to the contents of a container, but each would constitute its own layer, so that troubleshooting or bug-fixing could also be done one layer at a time. Then Docker built a toolset that enforced the practice of packaging a given container the same way each time.
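The layering idea can be pictured with a toy model. The sketch below is an illustration only, not how Docker stores images: each layer is a plain dictionary mapping a file path to its contents, the layer names and file contents are invented, and a lookup returns the version from the highest layer that defines the path.

```python
from collections import ChainMap

# Toy layers, bottom to top. Paths and contents are made up for illustration.
base_layer = {"/bin/sh": "shell binary", "/etc/os-release": "distro v1"}
runtime_layer = {"/usr/bin/python": "interpreter"}
app_layer = {"/app/main.py": "print('hi')", "/etc/os-release": "distro v1, patched"}


def build_image(*layers):
    """Stack layers bottom to top; the topmost layer holding a path wins."""
    # ChainMap searches its first mapping first, so the top layer goes first.
    return ChainMap(*reversed(layers))


def layer_providing(path, *layers):
    """Return the index (0 = bottom) of the highest layer defining path, or None."""
    for index in range(len(layers) - 1, -1, -1):
        if path in layers[index]:
            return index
    return None


image = build_image(base_layer, runtime_layer, app_layer)
print(image["/etc/os-release"])   # prints "distro v1, patched"
print(layer_providing("/etc/os-release", base_layer, runtime_layer, app_layer))  # prints 2
```

Because the change to `/etc/os-release` lives in its own layer, a problem with it can be traced to that layer without touching the ones beneath.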

It doesn't sound like much, but for developers, it sidestepped a lot of tedious and repetitive work needed to get applications to run.

Beda and McLuckie became Docker enthusiasts, and they steered Kubernetes as an open source project toward handling Docker containers -- although it is by no means limited to just Docker. Kubernetes in effect generates a server cluster for containers that gives the application the hardware to run on, even if a given server fails beneath it. The combination of containers and container management lifted huge burdens from the backs of developers.
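The notion of keeping an application running "even if a given server fails beneath it" can be sketched in a few lines. This is a deliberately simplified toy, not the Kubernetes scheduler or its API: the function names, node names, and the "fewest containers wins" placement rule are all assumptions made for illustration.

```python
def schedule(containers, nodes):
    """Place each container on the node currently holding the fewest containers."""
    placement = {node: [] for node in nodes}
    for container in containers:
        target = min(placement, key=lambda node: len(placement[node]))
        placement[target].append(container)
    return placement


def handle_node_failure(placement, failed_node):
    """Move the failed node's containers onto the surviving nodes."""
    survivors = {node: list(items) for node, items in placement.items()
                 if node != failed_node}
    if not survivors:
        raise RuntimeError("no healthy nodes left to run the containers")
    for container in placement.get(failed_node, []):
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(container)
    return survivors


cluster = schedule(["web-1", "web-2", "web-3", "db-1"], ["node-a", "node-b", "node-c"])
after_failure = handle_node_failure(cluster, "node-a")
print(after_failure)
```

After `node-a` fails, every container is still assigned to a healthy node, which is the property the developer cares about.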

"Docker got the combination right," McLuckie says. "They got the tuning right. It's the intersection of a tool chain that fosters reuse, and the simplicity of the tuning experience [for deployment] that allowed Docker to flourish."

Containers are going mainstream, as Web, mobile, and enterprise developers by the thousands are learning to use containers and make them a part of their rapid software development arsenal.

Containers vs. Virtual Machines

McLuckie believes containers will complement rather than replace virtual machines. The Google Cloud Platform -- App Engine and Compute Engine -- puts a customer's workload in a container, then puts the container inside a virtual machine. Google would like to see virtual machines with lower overhead and faster initiation times, but a virtual machine still provides harder boundaries and a smaller attack surface than containers. The Linux Syscall interface used by containers has about 130 commands, while a virtual machine's hypervisor has about 30, so it's easier to inspect the hypervisor's commands and ensure none of them has been infiltrated or changed. Syscall was designed to do an extremely wide range of things for an application through the Linux kernel. "The problem is that the Syscall interface is so broad," says McLuckie.

From time to time, some previously unseen code shows that it can break out of a container's isolation and intrude on neighboring container operations. When such a flaw is discovered, the Linux kernel team finds a way to eliminate it. "But we see one about every six months. … We wouldn't feel good about scheduling one customer's containers on a server where other customers' containers are running," McLuckie says.

At the same time, Google is willing, for its own use, to run many containers natively on a server -- on bare metal, not virtual machines -- or together in a single virtual machine on a server. That's because it has lots of experience with its own operational code and knows which things are compatible at close quarters. Containers, of course, share not only their host's operating system kernel but also the system CPU, memory, and storage.

Google Will "Double-Bag" Customer Apps

It comes down to each container user deciding how to use containers: isolated only by the operating system's ability to keep memory and CPU assignments straight, or further isolated inside a virtual machine, McLuckie says.

He says an enterprise data center may soon do things like run 50 applications together in containers on a single server when it trusts the code and has found the apps' operation to be compatible.

But when it comes to cloud operations, "we see the VM as the only truly safe isolation. … Until we see foolproof security for containers, we will always double-bag our customers' workloads," McLuckie says.

Google now offers Google Container Engine, announced in November, which is Kubernetes as a Google-managed service for running customers' containers. It was followed by Google Container Registry in January.



