4/8/2013
10:05 PM

Netflix's 5 Secrets For Maximizing Amazon Cloud Value

Netflix chief cloud architect Adrian Cockcroft shares five money-saving maneuvers for big Amazon Web Services users.



Instead of trying to build out its own data centers for its rapidly expanding film and video distribution business, Netflix finds the better strategy is to use Amazon Web Services' cloud resources. At Cloud Connect 2013, the architect of that strategy disclosed some of his secrets for optimizing use of the Amazon cloud.

Adrian Cockcroft is a leading proponent of Amazon Web Services, so much so that he is sometimes criticized for channeling Netflix computing onto Amazon's EC2. A recent InformationWeek column, "How Netflix Is Ruining Cloud Computing," drew 39 comments, including several by Cockcroft defending himself. John Engates, CTO of Rackspace Cloud, an Amazon competitor, added a follow-up commentary, "What Netflix Could Do For Cloud Computing."

An April 4 session at Cloud Connect 2013, a UBM Tech event in Santa Clara, Calif., featured Cockcroft and Amazon Web Services technology evangelist Jinesh Varia. It drew a crowd to hear their take on Netflix's five tips to maximize the cost effectiveness of AWS.

And while Netflix is commonly viewed as dependent on AWS for vital services, Varia said at the start that AWS relies on Netflix to teach it how to meet a large, demanding customer's needs. "Adrian and his team challenge Amazon Web Services in every way. They help us to make AWS better," he said.


Cockcroft, in turn, said the switch to Amazon allowed Netflix to try using large data center resources, fail at it without paying a heavy penalty in unused gear because the gear was only rented by the hour, then try again. The ability to execute rapid, iterative testing of ideas "[gives] us an ability to try things out, even more than our own data centers would," he said.

Cockcroft offered these five tips for using AWS.

1. Weigh Costs Vs. Business Goals.

As an example, he said Netflix had no staff or servers in South or Central America when it opened operations in South America Sept. 5, 2011. It thought it would improve customer service throughout the region if it added servers in the AWS center in Sao Paulo, Brazil. But it found requests to its virtual servers in Brazil from other countries were almost always routed over the Internet through Miami. That's because Miami is a massive hub for network carriers, and an Internet user in Ecuador wanting to talk to one in Brazil will almost always be routed through Miami. Netflix found there was no performance advantage to using servers in Brazil, which were farther from Miami than AWS's East Coast servers. It reverted to serving South and Central America through its hundreds of servers in Northern Virginia because "it made sense to serve them out of U.S. East," said Cockcroft.

European customers, on the other hand, could be more efficiently served out of AWS's Dublin, Ireland, data center. AWS services in Dublin are slightly more expensive than those in U.S. East, but reducing latencies for customers was worth the increase, he said. Netflix launched a 1,000 virtual machine footprint in Dublin, using the same procedures and same APIs to which it was already accustomed. "Everything just worked," he said.

Weighing cost and latency penalties, when they exist, against your business goals is one of the fundamental challenges of cloud computing. Cockcroft framed the trade-off as a rough question: "How many dollars should you spend to reduce customer latencies by 50% if that increases your conversion rate by 10%?"
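
Cockcroft's question can be turned into a rough break-even calculation. The Python sketch below is illustrative only: the function name, the visitor count, the baseline conversion rate and the revenue per conversion are all hypothetical inputs, not figures from the session. It treats the 10% as a relative lift in conversion rate and returns the extra monthly revenue, which is the most one could spend on lower latency before the change loses money.

```python
def max_latency_spend(monthly_visitors, baseline_conversion,
                      revenue_per_conversion, conversion_lift=0.10):
    """Upper bound on monthly spend justified by a relative conversion lift.

    All inputs are hypothetical; conversion_lift=0.10 mirrors the 10%
    increase in Cockcroft's question.
    """
    baseline_revenue = (monthly_visitors * baseline_conversion
                        * revenue_per_conversion)
    # Extra revenue from the lift is the break-even budget.
    return baseline_revenue * conversion_lift


if __name__ == "__main__":
    # Made-up example: 1M visitors, 2% convert, $8 revenue per conversion.
    budget = max_latency_spend(1_000_000, 0.02, 8.0)
    print(f"Break-even monthly spend: ${budget:,.2f}")
```

Any spend below that figure pays for itself under these assumptions; anything above it needs another justification, such as customer retention.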

2. Plan Ahead For Disaster.

Netflix is rare among Amazon users in implementing operations in multiple regions. Amazon urges customers to achieve high availability by implementing the same application and data in more than one availability zone. Multiple zones exist in the same region, but Netflix wants its U.S. East operations in Northern Virginia able to fail over to one of Amazon's West Coast facilities. That way, a region-wide disaster such as Hurricane Sandy could strike Northern Virginia and still leave Netflix with a guarantee of continued operations. (Sandy created temporary problems with some services but did not knock AWS off the air.)

Netflix wants "50% of its operations in one region and 50% in another," with one region serving as the fail-over site for the other, said Cockcroft. That way, neither a hurricane in the East nor an earthquake in the West would close down its business.

Netflix plans to hold a reservoir of reserved instance capacity in each region so that it can invoke that capacity and continue full operation. A reserved instance is the type of virtual server for which the customer pays upfront and gets guaranteed access when it's needed. Netflix may not use the reserved instances much of the time, but if disaster strikes, that capacity "will be yanked out from under someone else" and given to Netflix. That's possible because AWS sells the capacity as spot instances, at prices substantially below the standard rate, until a prepaid customer like Netflix needs it.
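
The sizing implied by a 50/50 split can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not Netflix's capacity model: the function name and instance counts are hypothetical, and it assumes load is spread evenly and that the surviving regions must absorb the full peak.

```python
import math


def reserved_per_region(peak_instances, regions=2):
    """Return (normal_share, failover_share) of instances per region.

    normal_share: what each region runs when all regions are healthy.
    failover_share: what each surviving region must run if one region fails.
    """
    if regions < 2:
        raise ValueError("failover needs at least two regions")
    normal_share = math.ceil(peak_instances / regions)
    # After losing one region, the survivors split the full load.
    failover_share = math.ceil(peak_instances / (regions - 1))
    return normal_share, failover_share


if __name__ == "__main__":
    normal, failover = reserved_per_region(1000)
    print(f"Normal: {normal} per region, failover: {failover} per region")
```

With two regions, each one needs a guarantee for the whole peak while normally running half of it, which is why the idle half becomes a cost question.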

3. Use Reserved Instances Appropriately.

Cockcroft advised other major AWS users to determine which workloads tend to operate at a steady state and then identify what that state is. By identifying steady-state workloads, customers know how much reserved instance capacity they should buy and can move workloads onto the lower-cost server type to gain significant savings. "If you're going to run your job less than 3.5 months a year, then stick with an on-demand instance," he advised. Because reserved instance users make an upfront payment as well as pay an hourly rate for a minimum one-year contract, it doesn't pay to seek the lower rate unless the job runs for more than three and a half months.
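
The break-even reasoning behind the 3.5-month rule can be sketched as follows. The prices in the example are made up for illustration and are not AWS list prices, and the model assumes the reserved hourly rate is charged only for hours actually used. Under those assumptions, the break-even point is the upfront fee divided by the hourly saving.

```python
HOURS_PER_MONTH = 730  # roughly 8,760 hours per year divided by 12


def break_even_months(upfront, reserved_hourly, on_demand_hourly):
    """Months of continuous use at which a reserved instance starts to pay off."""
    if on_demand_hourly <= reserved_hourly:
        raise ValueError("reserved rate must be below the on-demand rate")
    hours = upfront / (on_demand_hourly - reserved_hourly)
    return hours / HOURS_PER_MONTH


def cheaper_option(months_of_use, upfront, reserved_hourly, on_demand_hourly):
    """Compare one-year cost of both options for a given usage level."""
    hours = months_of_use * HOURS_PER_MONTH
    on_demand_cost = hours * on_demand_hourly
    reserved_cost = upfront + hours * reserved_hourly
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


if __name__ == "__main__":
    # Hypothetical prices: $150 upfront, $0.04/hour reserved, $0.10/hour on demand.
    months = break_even_months(150, 0.04, 0.10)
    print(f"Break-even after about {months:.1f} months of use")
    print(cheaper_option(2, 150, 0.04, 0.10))
    print(cheaper_option(6, 150, 0.04, 0.10))
```

With these illustrative prices the break-even lands near three and a half months; real figures depend on instance type, region and reservation terms.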

4. Borrow Idle Capacity For Dev And Test.

Cockcroft said Netflix is trying to get away from having a separate set of AWS on-demand accounts for development and testing and will try to use idle reserved instance capacity for dev and test. That way, Netflix's surplus reserved instance capacity is getting some use while still being available for peaks in production workloads or disaster recovery. Such a move keeps the capacity Netflix needs available in the cloud, while putting it to a secondary use when it's not actually employed in everyday processing. That is, reserved instances can serve both a primary and secondary purpose, cutting your cloud bill. "I'm not sure other customers have figured this out," Cockcroft noted.
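
One way to check whether dev and test work would fit into idle reservations is to compare hourly production usage with the reservation count. The sketch below is a simplified illustration with hypothetical names and numbers; it ignores instance types and availability zones, which matter in practice.

```python
def idle_reserved_by_hour(reserved_count, production_by_hour):
    """Reserved instances left unused by production in each hour."""
    return [max(reserved_count - used, 0) for used in production_by_hour]


def dev_test_fits(reserved_count, production_by_hour, dev_test_by_hour):
    """True if dev/test demand never exceeds idle reserved capacity."""
    idle = idle_reserved_by_hour(reserved_count, production_by_hour)
    return all(need <= free for need, free in zip(dev_test_by_hour, idle))


if __name__ == "__main__":
    # Hypothetical three-hour window: peak, overnight trough, spike.
    production = [100, 60, 120]
    print(idle_reserved_by_hour(100, production))
    print(dev_test_fits(100, production, [0, 30, 0]))
```

Hours in which production exceeds the reservation count show zero idle capacity, so any dev or test work scheduled then would fall back to on-demand pricing.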

5. Consolidate Accounts To Gain Discounts.

Cockcroft urged Cloud Connect attendees to consolidate their AWS bill under their company's name, instead of having many individuals and departments run their own accounts. No percentages were cited, but Amazon discounts services for larger users and won't apply those discounts when the usage is spread across many independent accounts. "Every business has more than one application and more than one cloud account," pointed out Varia. A customer qualifies for a lower-priced billing tier when the accounts are consolidated.
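
Since no percentages were cited, the effect can only be illustrated with invented numbers. The sketch below assumes a hypothetical flat-discount tier table (the thresholds and percentages are made up and are not Amazon's pricing) and compares the total bill for separate accounts with the bill for the same spend consolidated.

```python
# Hypothetical volume tiers: (minimum monthly spend in dollars, discount percent).
# These values are invented for illustration; they are not AWS pricing.
TIERS = [(0, 0), (10_000, 5), (50_000, 10)]


def discount_percent(spend):
    """Discount for the highest tier whose minimum the spend reaches."""
    pct = 0
    for minimum, percent in TIERS:
        if spend >= minimum:
            pct = percent
    return pct


def bill(spend):
    """Amount due after applying the tier discount to the whole spend."""
    return spend * (100 - discount_percent(spend)) / 100


def separate_vs_consolidated(account_spends):
    """Return (total if billed separately, total if billed as one account)."""
    separate = sum(bill(s) for s in account_spends)
    consolidated = bill(sum(account_spends))
    return separate, consolidated


if __name__ == "__main__":
    print(separate_vs_consolidated([4_000, 6_000, 8_000, 40_000]))
```

Under this toy tier table, small accounts that earn no discount on their own push the combined spend into a higher tier once consolidated.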

Comments
adrianco,
User Rank: Apprentice
4/11/2013 | 12:12:21 AM
re: Netflix's 5 Secrets For Maximizing Amazon Cloud Value
The full slide deck is available here:
http://www.slideshare.net/Amaz...

Most of the discussion and slides were actually by Jinesh, so many of the quotes in this article are of things that Jinesh said, rather than what I said.

The discussion of Brazil overstates what we did. We ran a small experiment in AWS Brazil for a week or two earlier this year; it wasn't a large-scale deployment. The point was that we could easily try out deploying systems anywhere in the world.

Points 4 and 5 above don't quite have it right. With consolidated billing, reservations apply across accounts. It makes sense to have excess reservations in production accounts so that you have a capacity guarantee for handling production peaks. The excess is mopped up by other accounts at the end of the month, so that there is no cost penalty for the extra headroom. The other optimization is to autoscale down the production web services instances during the night, and use the same reserved instances to create short-lived Hadoop clusters to do the daily ETL processing for business intelligence metrics.
Laurianne,
User Rank: Author
4/9/2013 | 6:26:05 PM
re: Netflix's 5 Secrets For Maximizing Amazon Cloud Value
The consolidation comment brings up an interesting point. Many IT chiefs have rogue Amazon instances out there, set up by developers or even folks on the business side, that they don't know about. You have to track them down before you can consolidate.

Laurianne McLaughlin
InformationWeek