Netflix chief cloud architect Adrian Cockcroft shares five money-saving maneuvers for big Amazon Web Services users.

Charles Babcock, Editor at Large, Cloud

April 8, 2013

Instead of trying to build out its own data centers for its rapidly expanding film and video distribution business, Netflix finds the better strategy is to use Amazon Web Services' cloud resources. At Cloud Connect 2013, the architect of that strategy disclosed some of his secrets for optimizing use of the Amazon cloud.

Adrian Cockcroft is a leading proponent of Amazon Web Services, so much so that he is sometimes criticized for channeling Netflix computing onto Amazon's EC2. A recent InformationWeek column, "How Netflix Is Ruining Cloud Computing," drew 39 comments, including several by Cockcroft defending himself. John Engates, CTO of Rackspace Cloud, an Amazon competitor, added a follow-up commentary, "What Netflix Could Do For Cloud Computing."

An April 4 session at Cloud Connect 2013, a UBM Tech event in Santa Clara, Calif., featured Cockcroft and Amazon Web Services technology evangelist Jinesh Varia. It drew a crowd to hear their take on Netflix's five tips to maximize the cost effectiveness of AWS.

While Netflix is commonly viewed as dependent on AWS for vital services, Varia said at the start that AWS relies on Netflix to educate it about meeting a large, demanding customer's needs. "Adrian and his team challenge Amazon Web Services in every way. They help us to make AWS better," he said.

Cockcroft, in turn, said the switch to Amazon allowed Netflix to try using large data center resources, fail without paying a heavy penalty in unused gear because the resources were only rented by the hour, and then try again. The ability to test ideas rapidly and iteratively "[gives] us an ability to try things out, even more than our own data centers would," he said.

Cockcroft offered these five tips for using AWS.

1. Weigh Costs Vs. Business Goals.

As an example, he said Netflix had no staff or servers in South or Central America when it opened operations in South America on Sept. 5, 2011. It thought it would improve customer service throughout the region if it added servers in the AWS center in Sao Paulo, Brazil. But it found that requests to its virtual servers in Brazil from other countries were almost always routed over the Internet through Miami. That's because Miami is a massive hub for network carriers, and an Internet user in Ecuador wanting to talk to one in Brazil will almost always be routed through Miami. Netflix found there was no performance advantage to using servers in Brazil, which were farther from Miami than AWS's East Coast servers. It reverted to serving South and Central America through its hundreds of servers in Northern Virginia because "it made sense to serve them out of U.S. East," said Cockcroft.

European customers, on the other hand, could be more efficiently served out of AWS's Dublin, Ireland, data center. AWS services in Dublin are slightly more expensive than those in U.S. East, but reducing latencies for customers was worth the increase, he said. Netflix launched a 1,000 virtual machine footprint in Dublin, using the same procedures and same APIs to which it was already accustomed. "Everything just worked," he said.

Weighing cost and latency penalties, when they exist, against your business goals is one of the fundamental challenges of cloud computing. Cockcroft framed the trade-off as a rough equation: "How many dollars should you spend to reduce customer latencies by 50% if that increases your conversion rate by 10%?"
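Cockcroft gave no figures for that equation, but the arithmetic behind it can be sketched in a few lines of Python. Every number below (visitor count, baseline conversion rate, revenue per signup) is a hypothetical placeholder, not a Netflix figure, and the 10% lift is treated as a relative increase in the conversion rate.

```python
def max_worthwhile_spend(monthly_visitors, baseline_conversion,
                         conversion_lift, revenue_per_signup):
    """Upper bound on extra monthly spend justified by a conversion lift.

    conversion_lift is relative: 0.10 means the conversion rate
    rises by 10% of its baseline value.
    """
    extra_signups = monthly_visitors * baseline_conversion * conversion_lift
    return extra_signups * revenue_per_signup


# Hypothetical inputs, chosen only to illustrate the calculation.
budget = max_worthwhile_spend(
    monthly_visitors=1_000_000,
    baseline_conversion=0.02,   # 2% of visitors sign up today
    conversion_lift=0.10,       # the 10% improvement in the quote
    revenue_per_signup=8.0,     # assumed monthly revenue per new customer
)
print(f"Latency work pays off if it costs less than ${budget:,.0f} a month")
```

If the extra cost of the lower-latency region stays under that ceiling, the move is worth making under these assumptions.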

2. Plan Ahead For Disaster.

Netflix is rare among Amazon users in implementing operations in multiple regions. Amazon urges customers to achieve high availability by implementing the same application and data in more than one availability zone. Multiple zones exist in the same region, but Netflix wants its U.S. East operations in Northern Virginia able to fail over to one of Amazon's West Coast facilities. That allows a region-wide disaster, such as Hurricane Sandy, to strike Northern Virginia but still leave Netflix with a guarantee of continued operations. (Sandy created temporary problems with some services but did not knock AWS off the air.)

Netflix wants "50% of its operations in one region and 50% in another," with one region serving as the fail-over site for the other, said Cockcroft. That way, neither a hurricane in the East nor an earthquake in the West would close down its business.

It plans to have a reservoir of reserved instance capacity, the type of virtual server for which the customer pays upfront and gets guaranteed access when it's needed, in each region so that it can invoke that capacity and continue full operation. Netflix may not use the reserved instances much of the time, but if disaster strikes, that capacity "will be yanked out from under someone else" and given to Netflix. That's possible because AWS sells spot instances that go for substantially under-market prices until a pre-paid customer like Netflix needs them.

3. Use Reserved Instances Appropriately.

Cockcroft advised other major AWS users to determine which workloads tend to operate at a steady state and then identify what that state is. By identifying steady-state workloads, customers know how much reserved instance capacity they should buy and can move workloads onto the lower-cost server type to gain significant savings. "If you're going to run your job less than 3.5 months a year, then stick with an on-demand instance," he advised. Because reserved instance users make an upfront payment as well as pay an hourly rate for a minimum one-year contract, it doesn't pay to seek the lower rate unless the job runs for more than three and a half months.
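The break-even point can be worked out with a simple cost model: a reserved instance costs an upfront fee plus a reduced hourly rate, while an on-demand instance costs only a higher hourly rate. The Python sketch below assumes that model; the prices are hypothetical and were picked only so the result lands near the 3.5 months Cockcroft quoted, so check current AWS pricing before relying on it.

```python
HOURS_PER_MONTH = 8760 / 12  # 730 hours in an average month


def break_even_months(upfront_fee, reserved_hourly, on_demand_hourly):
    """Months of runtime per year at which reserved and on-demand costs match."""
    hourly_saving = on_demand_hourly - reserved_hourly
    if hourly_saving <= 0:
        raise ValueError("reserved rate must be below the on-demand rate")
    return upfront_fee / hourly_saving / HOURS_PER_MONTH


def cheaper_option(months_running, upfront_fee, reserved_hourly, on_demand_hourly):
    """Pick the cheaper instance type for a job that runs months_running per year."""
    threshold = break_even_months(upfront_fee, reserved_hourly, on_demand_hourly)
    return "reserved" if months_running > threshold else "on-demand"


# Hypothetical one-year prices, not actual AWS rates.
UPFRONT, RESERVED_RATE, ON_DEMAND_RATE = 153.30, 0.04, 0.10

print(round(break_even_months(UPFRONT, RESERVED_RATE, ON_DEMAND_RATE), 2))
print(cheaper_option(2, UPFRONT, RESERVED_RATE, ON_DEMAND_RATE))
print(cheaper_option(9, UPFRONT, RESERVED_RATE, ON_DEMAND_RATE))
```

With these assumed prices, a job that runs two months a year is cheaper on demand, and one that runs nine months is cheaper on a reserved instance.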

4. Borrow Idle Capacity For Dev And Test.

Cockcroft said Netflix is trying to get away from having a separate set of AWS on-demand accounts for development and testing and will try to use idle reserved instance capacity for dev and test. That way, Netflix's surplus reserved instance capacity gets some use while still being available for peaks in production workloads or disaster recovery. Such a move keeps the capacity Netflix needs available in the cloud, while putting it to a secondary use when it's not actually employed in everyday processing. That is, reserved instances can serve both a primary and secondary purpose, cutting your cloud bill. "I'm not sure other customers have figured this out," Cockcroft noted.
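The idea reduces to a small placement rule: production gets first claim on reserved capacity, dev and test fill whatever is idle, and only the remainder spills over to on-demand instances. The Python sketch below illustrates that rule; the function name and instance counts are hypothetical, and it does not describe Netflix's actual tooling.

```python
def place_dev_test(reserved, production_in_use, dev_test_wanted):
    """Split a dev/test request between idle reserved capacity and on-demand.

    Returns (instances_on_idle_reserved, instances_on_demand).
    """
    # Production always has priority, so idle capacity is what it leaves over.
    idle = max(reserved - production_in_use, 0)
    on_reserved = min(idle, dev_test_wanted)
    on_demand = dev_test_wanted - on_reserved
    return on_reserved, on_demand


# Hypothetical instance counts for a quiet hour and a busy hour.
print(place_dev_test(reserved=100, production_in_use=70, dev_test_wanted=20))
print(place_dev_test(reserved=100, production_in_use=90, dev_test_wanted=20))
```

In the quiet hour all 20 dev and test instances fit on capacity that is already paid for; in the busy hour half of them have to fall back to on-demand.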

5. Consolidate Accounts To Gain Discounts.

Cockcroft urged Cloud Connect attendees to consolidate their AWS bill under their company's name, instead of having many individuals and departments run their own accounts. No percentages were cited, but Amazon discounts services for larger users, and it won't apply those discounts when the usage is spread across many independent accounts. "Every business has more than one application and more than one cloud account," pointed out Varia. A customer qualifies for a lower-priced billing tier when the accounts are consolidated.
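Since no discount percentages were cited at the session, the Python sketch below uses an invented tier schedule purely to show why consolidation matters: three accounts billed separately mostly miss the volume thresholds that their combined spend would clear. The thresholds, percentages and account totals are all hypothetical.

```python
# Hypothetical volume tiers: (minimum monthly spend in dollars, discount percent).
TIERS = [(50_000, 10), (10_000, 5), (0, 0)]


def discounted_bill(monthly_spend):
    """Apply the discount of the highest tier the spend qualifies for."""
    for minimum, percent in TIERS:
        if monthly_spend >= minimum:
            return monthly_spend * (100 - percent) // 100
    return monthly_spend


def separate_vs_consolidated(account_spends):
    """Return (total billed separately, total billed as one account)."""
    separate = sum(discounted_bill(spend) for spend in account_spends)
    consolidated = discounted_bill(sum(account_spends))
    return separate, consolidated


# Three departmental accounts with hypothetical whole-dollar monthly spend.
separate, consolidated = separate_vs_consolidated([6_000, 7_000, 40_000])
print(separate, consolidated)  # 51000 47700
```

Under this made-up schedule the same usage costs $3,300 less a month once it is billed as a single account.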

About the Author(s)

Charles Babcock

Editor at Large, Cloud

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
