How eBay Manages Its Data Centers

The company is a third of the way through a major three-year grid computing initiative and is hoping to move toward automatic service level management.
EBay uses commercial tools when it can. It plans to adopt those types of distributed resource management tools over the medium term. "After all, our business is about running a virtual economy, not designing and building systems and enterprise management tools," Strong said.

For the time being, new semantic and modeling technologies help eBay describe its systems, understand how they relate to one another, and even discover systems it didn't previously know were there. For example, eBay is beginning to use the Resource Description Framework and the Web Ontology Language, two Semantic Web technologies, to "store and query relationships" between and among the software and hardware in its networks.
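To make the idea of storing and querying relationships concrete, here is a minimal sketch of the subject-predicate-object triple model that RDF is built on. It is not eBay's implementation: the class `TripleStore`, the predicates `runsOn` and `dependsOn`, and every host, application, and database name are hypothetical, and a real deployment would use IRIs and an RDF query language rather than plain strings.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Toy in-memory store for subject-predicate-object statements."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in sorted(self._triples):
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)

    def reachable(self, start: str, predicate: str) -> Set[str]:
        """Follow one predicate transitively from a starting node."""
        seen: Set[str] = set()
        frontier = [start]
        while frontier:
            node = frontier.pop()
            for _, _, o in self.match(subject=node, predicate=predicate):
                if o not in seen:
                    seen.add(o)
                    frontier.append(o)
        return seen


# Hypothetical inventory: all names below are invented for illustration.
store = TripleStore()
store.add("app:search", "runsOn", "host:web01")
store.add("app:checkout", "runsOn", "host:web01")
store.add("app:checkout", "dependsOn", "svc:payments")
store.add("svc:payments", "dependsOn", "db:orders")
store.add("db:orders", "runsOn", "host:db07")

# Which applications run on a given host?
apps_on_web01 = [s for s, _, _ in store.match(predicate="runsOn", obj="host:web01")]

# What does checkout depend on, directly or indirectly?
checkout_deps = store.reachable("app:checkout", "dependsOn")
```

Because every fact has the same three-part shape, a newly discovered system can be added as a few more statements without changing any schema.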

EBay also is working with others on modeling, that is, on creating standard ways to describe how software and devices in a network relate to one another. The company monitors many of the emerging modeling standards groups, but it chairs the Open Grid Forum's Reference Model working group because, according to Strong, the OGF is "the only place specifically focused on large distributed systems." Strong also acts as chair of the OGF itself.

The Reference Model group aims to develop a common modeling language that unifies other standards, such as the Information Technology Infrastructure Library (ITIL) and those of the Distributed Management Task Force (DMTF). "No one tool can manage the modern data center, so interoperability is absolutely critical," Strong said. The work eBay has done with the OGF has informed its own ontology, which could provide a starting point for implementing future technologies that take advantage of these emerging standards to simplify distributed management.

With these and other technologies in place, eBay is already able to automatically provision and monitor its systems. The company reprovisions its entire auction platform, including more than 16,000 application instances on more than 8,000 systems, every two weeks. Eventually, however, the company would like to be able to dynamically shift system resources to meet real business requirements.

"In an ideal world," Strong said, "the goal would be for eBay's auction site to be able to say, it should take users this long to do this part of the workflow and therefore we should apply some algorithms and automatically apportion systems."
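The article does not say which algorithms eBay has in mind. Purely as an illustration, the sketch below divides a fixed pool of servers among workflow steps in proportion to how far each step's observed response time sits from its target, using largest-remainder rounding. The function `apportion`, the step names, and all the numbers are assumptions made up for this example.

```python
from typing import Dict


def apportion(
    total_servers: int,
    target_ms: Dict[str, float],
    observed_ms: Dict[str, float],
) -> Dict[str, int]:
    """Split servers across steps in proportion to observed/target latency."""
    # A ratio above 1.0 means the step is slower than its target.
    pressure = {step: observed_ms[step] / target_ms[step] for step in target_ms}
    total_pressure = sum(pressure.values())

    # Ideal fractional share of the pool for each step.
    exact = {step: total_servers * p / total_pressure for step, p in pressure.items()}

    # Largest-remainder rounding: floor every share, then hand the leftover
    # servers to the steps with the biggest fractional parts.
    allocation = {step: int(share) for step, share in exact.items()}
    leftover = total_servers - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - allocation[s], reverse=True)
    for step in by_remainder[:leftover]:
        allocation[step] += 1
    return allocation


# Hypothetical targets and measurements, in milliseconds.
targets = {"search": 200.0, "bid": 100.0, "checkout": 400.0}
observed = {"search": 400.0, "bid": 100.0, "checkout": 400.0}

plan = apportion(8, targets, observed)
```

In this made-up case, search is running at twice its target while the other two steps are on target, so search receives half of the eight servers and the remaining steps split the rest evenly.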

Partly to make the systems built on these management technologies more powerful, eBay is also adopting SOA-style software development, breaking applications into their building-block components. "The real art form is not just understanding the infrastructure, but understanding the application that runs on it," Strong said.

EBay already has taken plenty of details into consideration to keep performance up. For example, its six data centers are all located in the western United States. One reason is that, with so much data, all of them have to be active, and locating them far from one another would lead to unacceptable latency, even though eBay also tries to minimize the number of "cross calls," or messages sent between databases.