With private cloud, Zynga found it could do the same work it had been doing on Amazon EC2, but with one-third the number of servers.

Charles Babcock, Editor at Large, Cloud

February 17, 2012


Zynga found it could do the same amount of work in its private cloud as it had been doing on Amazon EC2--but with only one-third the number of servers. It's a startling statistic, and how they did it bears explanation.

The comparison is oranges to oranges, when you consider that Zynga runs one virtual machine per physical server, whether the server is running on Amazon's EC2 or in Zynga's zCloud. Unlike Amazon, however, Zynga engineered its zCloud servers so they were optimized for different roles within its gaming software infrastructure--database access, Web server, or game logic execution. Of necessity, Amazon EC2 servers are general purpose machines, designed to run a wide variety of workloads, not do one job supremely well.
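The one-third figure can be read as simple capacity arithmetic. The sketch below is illustrative only: the function name, the load units, and the per-server capacities are hypothetical, and the threefold per-server capacity is just what the one-third server count implies if total load stays the same.

```python
import math


def servers_needed(total_load: float, per_server_capacity: float) -> int:
    """Return how many servers are needed to carry total_load.

    Both arguments use the same arbitrary, hypothetical unit of work.
    """
    if per_server_capacity <= 0:
        raise ValueError("per_server_capacity must be positive")
    return math.ceil(total_load / per_server_capacity)


if __name__ == "__main__":
    load = 900.0  # hypothetical total workload, not a Zynga figure

    # Assumed: a general-purpose server carries 1 unit of this workload,
    # and a role-optimized server carries 3 units of the same workload.
    general_purpose = servers_needed(load, per_server_capacity=1.0)
    role_optimized = servers_needed(load, per_server_capacity=3.0)

    print(general_purpose, role_optimized)
```

Under these assumed numbers the role-optimized fleet is one-third the size of the general-purpose one, which is the relationship the article reports.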

As recently as a year ago, Zynga, the producer of such popular online games as Farmville, Mafia Wars, and most recently CastleVille, was heavily dependent on Amazon Web Services for servers to host its players' activity. Eighty percent of player activity still took place on Amazon servers in January 2011. By January 2012, the figures had flipped, with 80% of game activity taking place in-house and 20% on Amazon.


"In mid-2010 we realized we were renting what we could own," recounted Allan Leinwand, Zynga's infrastructure CTO, in his keynote Wednesday at Cloud Connect, a UBM TechWeb event in Santa Clara, Calif. Up until then, Zynga had found it difficult to project what its data center needs would be, given the rapid launch of successive games. The launch of Farmville added 25 million active users to the Zynga roster in five months. Rather than build data centers ahead of demand and have them sit idle until the demand materialized, it shifted more and more of its operations onto Amazon through 2009 and 2010.

Zynga architects conceived of zCloud as an Amazon-like infrastructure in Zynga-owned or -leased data centers, governed by one management interface. It took just six months from conception to execution to get zCloud up and running, Leinwand told a nearly full auditorium at the Santa Clara Convention Center.

CastleVille, launched Nov. 14, was the first game in several years to be launched inside Zynga; launching games had become primarily an Amazon-hosted event. CastleVille, where you adopt a role in building a castle fantasyland, proved to be another success. "CastleVille was launched solely in the zCloud, and it reached five million users in six days," recalled Leinwand in an interview before his address.

Leinwand was one of several CTOs tapped by Facebook to serve on its Open Compute Project to establish specifications for energy-efficient servers. He said zCloud servers follow the recommendations of the project, which included server cooling innovations, without precisely matching its design. Zynga doesn't build its own servers the way Facebook and Google do. It buys them through OEMs, who produce the exact types of servers it wants.

With only one VM per physical server, Zynga can adapt CPU, memory, and I/O to the type of task the server will undertake, then combine the various optimized sets of servers in its zCloud. Zynga is unusual in being able to do this because its games have many elements in common and, in some instances, different games use the same underlying application logic, even though their features vary. In one sense, it's an enterprise with one application, and its data center has been geared to run that application.

Leinwand said Zynga had carefully studied its existing operations and measured server performance to find where constraints lay before undertaking zCloud. "Our efficiencies weren't quite there as we started doing these game rollouts," he admitted. His team developed tools to measure what was going on in different elements of the game stack--its PHP execution, its memory mappings, CPU usage by game function, storage I/O rates, and network packets delivered per second. What they found was eye-opening.
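One way to picture this kind of measurement is to compare each metric against the limit of the hardware serving it and see which is closest to saturation. The sketch below is an assumption about how such a comparison could be made, not a description of Zynga's tools; the metric names, readings, and limits are all hypothetical.

```python
def most_constrained(
    measured: dict[str, float], limits: dict[str, float]
) -> tuple[str, float]:
    """Return the metric with the highest utilization (measured / limit)."""
    utilization = {}
    for name, value in measured.items():
        limit = limits[name]
        if limit <= 0:
            raise ValueError(f"limit for {name!r} must be positive")
        utilization[name] = value / limit
    worst = max(utilization, key=utilization.get)
    return worst, utilization[worst]


if __name__ == "__main__":
    # Hypothetical readings for one server role; not real Zynga data.
    measured = {
        "cpu_percent": 45.0,
        "storage_iops": 9000.0,
        "packets_per_sec": 300000.0,
    }
    # Hypothetical hardware limits for the same metrics.
    limits = {
        "cpu_percent": 100.0,
        "storage_iops": 10000.0,
        "packets_per_sec": 1000000.0,
    }
    print(most_constrained(measured, limits))
```

With these made-up numbers, storage I/O is the constraint even though CPU looks comfortable, the sort of finding that would argue for a storage-heavy server profile for that role.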

"We thought the main flow of traffic through the data center was from east to west; it turned out to be from north to south. We found lots of areas where we could improve," Leinwand told the crowd. ZCloud builders redesigned how game logic servers processed data from all the activities taking place in the games. If it was temporary data or needed quickly as play continued, a game logic server stored it on a nearby device on its own rack. If it was data that needed to be persisted for future reference, it was moved off the game logic server and streamed away to longer term storage, freeing up the host for greater availability to players.
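The storage rule described above amounts to a two-way routing decision. The following is a minimal sketch of that decision, assuming each piece of game data carries two flags; the `Tier` names, the flags, and the sample records are hypothetical and do not come from Zynga's code.

```python
from enum import Enum


class Tier(Enum):
    RACK_LOCAL = "rack-local device"   # near the game logic server
    LONG_TERM = "long-term storage"    # streamed off the game logic server


def choose_tier(temporary: bool, needed_quickly: bool) -> Tier:
    """Apply the rule from the article in the order it is stated.

    Temporary data, or data needed quickly as play continues, stays on
    the rack; everything else is treated as data to persist.
    """
    if temporary or needed_quickly:
        return Tier.RACK_LOCAL
    return Tier.LONG_TERM


if __name__ == "__main__":
    # Hypothetical records: (description, temporary, needed_quickly)
    records = [
        ("in-progress move", True, True),
        ("current session state", False, True),
        ("completed purchase history", False, False),
    ]
    for description, temporary, needed_quickly in records:
        print(description, "->", choose_tier(temporary, needed_quickly).value)
```

How data that is both persistent and needed quickly would be handled is not stated in the article; this sketch keeps it rack-local purely as an assumption.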

"Know thy game," intoned Leinwand, and it turned out that the Zynga staff hadn't known very well the pathways of the complicated interactions.

Other bottlenecks were found in the networks to storage systems, Internet traffic moving through Web servers, firewalls' ability to process the streams of traffic, and load balancers' ability to keep up with constantly shifting demand.

Zynga went a step further. Since some of its game activities require connections to Facebook applications or take place in Facebook applications, it sought locations for its zCloud data centers that were in the same region as Facebook's. Likewise, with its continued use of Amazon, it sought to identify "super regions" where the data centers of all could be in close proximity, reducing delays caused by geographic separation.

"We think of all of our operations in the super region as a really big data center," said Leinwand. A Zynga game data center is directly connected to an Amazon EC2 data center by fiber optic cable, allowing Zynga to shift workloads over a high-speed interconnect.

With both Amazon and Facebook, "latency within a super region is in the single digit milliseconds. They are very tightly coupled pieces of infrastructure," he said.

Zynga uses Citrix Systems CloudStack as its virtual machine management interface superimposed on all zCloud VMs, regardless of whether they're in the public cloud or private cloud.

Asked when he would complete his move out of the public cloud into zCloud, Leinwand paused, then said, "I don't know if I've thought about it that way. I want both in our toolbox. I want to launch games in zCloud and have the flexibility of bursting to Amazon."

In the interview, he was more explicit. "We own the base, rent the spike. We want a hybrid operation. We love knowing that shock absorber is there."
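"Own the base, rent the spike" can be expressed as a simple placement rule: fill owned capacity first and send only the overflow to the public cloud. The sketch below illustrates that rule under stated assumptions; `Placement`, `place_demand`, and the capacity figures are hypothetical and are not drawn from Zynga's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Demand, in server-equivalents, assigned to each environment."""
    owned: int   # served from the private cloud (the "base")
    rented: int  # overflow served from the public cloud (the "spike")


def place_demand(demand: int, owned_capacity: int) -> Placement:
    """Fill owned capacity first; rent only what does not fit."""
    if demand < 0 or owned_capacity < 0:
        raise ValueError("demand and owned_capacity must be non-negative")
    owned = min(demand, owned_capacity)
    return Placement(owned=owned, rented=demand - owned)


if __name__ == "__main__":
    # Hypothetical demand levels against 100 owned server-equivalents.
    for demand in (60, 100, 140):
        print(demand, place_demand(demand, owned_capacity=100))
```

In this sketch rented capacity stays at zero until demand exceeds what is owned, which is the "shock absorber" role Leinwand describes for Amazon.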

Charles Babcock is an editor-at-large for InformationWeek.


About the Author(s)

Charles Babcock

Editor at Large, Cloud

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
