Commentary
3/25/2013
02:00 PM

How Netflix Is Ruining Cloud Computing

A laser focus on Amazon Web Services and seeming disregard for next-gen best practices could spell lock-in, and derail real IaaS competition.

On March 13, Netflix announced $100,000 in prize money for the developers who do the most to improve its open source tools for controlling and managing application deployments in the cloud. Before spearheading this contest, Netflix's cloud architect, Adrian Cockcroft, released many internal Netflix tools as open source. Currently, eight cloud-architecture-specific tools are available from Netflix, and Cockcroft has been very open in sharing his and Netflix's knowledge in public forums.

In theory, all of this should be wonderful. In reality, however, it's likely to leave cloud computing with an enormous hangover of subpar practices and architectures for years to come. Netflix is the poster child for "Cloud Computing v1.0" and demonstrates both its enormous benefits and its troubling problems. Cloud Computing v1.0 is strictly an Amazon Web Services affair -- it was first, and no other provider had the core features necessary to build comparable applications (think multiple availability zones and EBS with snapshots and quick restores). So it makes sense that Netflix embraced AWS; it saw huge benefits in being able to deploy and scale its service using the interfaces and architectures that were possible when AWS launched.

But Netflix has also suffered repeatedly at the hands of Cloud Computing v1.0 with four outages in 2012 alone, which certainly points to the possibility for some improvement in the high availability of its service. Of note, the Christmas Eve outage is perhaps most troubling from a "v1.0" perspective, as it was solely the result of Netflix's reliance on a less-necessary AWS service for load balancing, which could have been handled in any number of other ways to increase server availability.


The reason the Netflix contest is likely to leave organizations worse off is that it thoroughly embraces this "Cloud Computing v1.0" mindset, both from an "AWS-is-the-only-vendor" standpoint and from an architectural standpoint. While it's arguable that there still isn't (quite yet) another infrastructure-as-a-service (IaaS) vendor that has a thoroughly tested core feature set, unless you just walked out of the tattoo parlor with "#AWS" on your shoulder, you know it won't be long. And all companies running on AWS should be looking forward to the rise of additional IaaS vendors, like those in our IaaS buyer's guide, for two reasons: higher availability and price competition.

Every cloud architect should know that it's only a matter of time before organizations have applications deployed across the world on many different IaaS providers in many different data centers, based on request volume and location in combination with a market for computing resources that changes price constantly. Locking yourself down to AWS today, for greenfield cloud architectures, would be the equivalent of deciding to develop an iPhone-only application when you know you'll have to support iPads, Android and others in the future.
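
To make the multi-provider idea concrete, here is a minimal Python sketch. It is not anything Netflix or any vendor ships. It assumes a hypothetical provider-neutral interface (CloudProvider), an in-memory stand-in (FakeProvider), made-up provider names and prices, and a hypothetical helper (place) that picks the cheapest provider serving a region. A real adapter would wrap each vendor's own API behind the same interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Sequence


class CloudProvider(Protocol):
    """Hypothetical provider-neutral interface; not a real library API."""

    name: str

    def serves(self, region: str) -> bool: ...

    def hourly_price(self, region: str) -> float: ...

    def launch(self, region: str, count: int) -> List[str]: ...


@dataclass
class FakeProvider:
    """In-memory stand-in for a real IaaS adapter (illustration only)."""

    name: str
    prices: Dict[str, float]  # region -> made-up hourly price per instance

    def serves(self, region: str) -> bool:
        return region in self.prices

    def hourly_price(self, region: str) -> float:
        return self.prices[region]

    def launch(self, region: str, count: int) -> List[str]:
        # A real adapter would call the vendor API here; we just invent IDs.
        return [f"{self.name}-{region}-{i}" for i in range(count)]


def place(providers: Sequence[CloudProvider], region: str, count: int) -> List[str]:
    """Launch `count` instances on the cheapest provider that serves `region`."""
    candidates = [p for p in providers if p.serves(region)]
    if not candidates:
        raise ValueError(f"no provider serves region {region!r}")
    best = min(candidates, key=lambda p: p.hourly_price(region))
    return best.launch(region, count)


if __name__ == "__main__":
    providers = [
        FakeProvider("alpha", {"us-east": 0.12, "eu-west": 0.14}),
        FakeProvider("beta", {"us-east": 0.10}),
    ]
    print(place(providers, "us-east", 2))
```

The design choice is that application code talks only to place and the interface, so adding or dropping a vendor means writing or removing one adapter rather than touching every deployment script.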

In addition to the annoying AWS-centrism of the Netflix contest, there's a deeper problem: Some of Netflix's tools embrace a cloud architecture that was fine in the days of Cloud Computing v1.0 but that will look increasingly suspect as time goes on. I know that it's hard to throw out code and systems that are working fine, especially when they still look pretty good -- and often, squeaking out a bit more time is the right internal decision for an individual company. But instead of just wringing out the last bits of value, Netflix is throwing significant money at the rest of the world, asking everyone to embrace and extend tools and code that do not reflect particularly good practices for future cloud architectures.

Perhaps the best example of a bad-practice Netflix tool is Aminator. Aminator helps you build Amazon Machine Images (AMIs) easily, based on a "base" AMI and a package of code. "I must have produced about 25,000 Ubuntu AMIs," raved one excited early user. There's just one problem: It's hard to understand when this would ever be a good idea. Several years ago, spawning tons of images would have been a somewhat acceptable way to roll out a revised version of an application (due to application code, operating system, and/or server software). But today we have widespread adoption of configuration management tools like Chef and Puppet that make massive AMI creation a subpar practice at best. Amazon Web Services itself recently rolled out a service called OpsWorks, which would be a significantly better way to handle deploying applications -- it uses Chef.

There are other less-bad tools, but many bear the mark of having to architect around a number of issues that have since been more or less resolved; it's a bit like an open source project that relies heavily on SOAP instead of being RESTful. For example, Edda, which figures out what cloud resources you're using at AWS, just seems like something that had to be built because no one properly set up how resources should be requested and deployed. And Asgard, a very cool tool from 2010 for managing a variety of different applications across AWS, would be a hard sell as a best-of-breed tool today compared with other open source options, notably Scalr and Chef.

This is not to say that all of Netflix's open source cloud tools fit into this mold. Denominator is a great DNS manager (because it's multi-cloud), and Simian Army is a fabulous, ground-breaking idea for testing cloud architectures (it is, unfortunately, AWS-only).

There's a possibility that the Netflix contest will help lead the world toward Cloud Computing v2.0 and beyond by embracing multi-cloud architectures that use orchestration and configuration management in optimal ways. However, I am skeptical on both fronts. Cockcroft's public comments suggest little interest in using other cloud vendors. A good chunk of the prize money is in AWS credits, and Amazon's CTO is a judge; all this points to a very AWS-centric mindset. Moreover, the fact that Netflix just released Aminator last week indicates to me that Netflix is happy to roll out whatever tools they've built, regardless of whether they fit in with a best-practices modern cloud architecture.

But please, Netflix, prove me wrong. Embrace a less proprietary, more highly available, more standardized cloud -- and put Google's Urs Hölzle on the panel while you're at it. #UrsForNetflixJudge


Comments
Deirdre Blake
User Rank: Apprentice
3/26/2013 | 3:23:03 PM
re: How Netflix Is Ruining Cloud Computing
So the point is....? Netflix shouldn't hold any developer competitions until a "real" contender to AWS emerges?
treehousetim
User Rank: Apprentice
3/26/2013 | 4:26:39 PM
re: How Netflix Is Ruining Cloud Computing
OpenStack offers flexibility and choice through a highly engaged community of over 6,000 individuals and over 190 companies including Rackspace®, Dell, HP, IBM, and Red Hat®.

http://www.rackspace.com/cloud...
tw426
User Rank: Apprentice
3/26/2013 | 5:09:28 PM
re: How Netflix Is Ruining Cloud Computing
Really? So just because AWS doesn't meet your classification of a modern, standard way of doing things, you abandon it? Replacing a working system for those reasons is a funny mindset... and one of the benefits of opening a 'contest' is to look at new ideas with an open attitude. Who is to say that something even better won't come out of this experiment?

It sounds to me like you may just want to bash Netflix (as if they need any help from you) - they have demonstrated that they are perfectly able to do that themselves!
D. Henschen
User Rank: Author
3/26/2013 | 5:13:05 PM
re: How Netflix Is Ruining Cloud Computing
Sounds more like the point is big users of capacity should promote a cloud free market in which there's a real choice. Amazon has been great about lowering prices, but perhaps it would lower prices even more if it had real competition. Trouble is, it's tough for anybody except those operating at Google or, perhaps, Microsoft scale to achieve or even approach the economies of scale already established by Amazon.
adrianco
User Rank: Apprentice
3/26/2013 | 6:20:30 PM
re: How Netflix Is Ruining Cloud Computing
There should be a techblog.netflix.com post in the next day or so that will give more context to the Cloud Prize and clarify most of the points above. However I will address some of the specific issues here.

Cloud 1.0 vs. 2.0?
I would argue that the way most people are doing cloud today is to forklift part of their existing architecture into a cloud and run a hybrid setup. That's what I would call Cloud 1.0. What Netflix has done is show how to build much more agile green field native cloud applications, which might justify being called Cloud 2.0. The specific IaaS provider used underneath, and whether you do this with public or private clouds is irrelevant to the architectural constructs we've explained.

Outages
The outages that have been mentioned were regional; they didn't apply to Netflix operations in Europe, for example. Our current work is to build tooling for multi-regional support on AWS (East Coast/West Coast), including the DNS management that was mentioned. This removes the failure mode with the least effort and disruption to our existing operations.

Portability
Other cloud vendors have a feature set and scale comparable to AWS in 2008-2009. We're still waiting for them to catch up. There are many promises but nothing usable for Netflix itself. However there is demand to use NetflixOSS for other smaller and simpler applications, in both public and private clouds, and Eucalyptus have demonstrated Asgard, Edda and Chaos Monkey running, and will ship soon in Eucalyptus 3.3. There are signs of interest from people to add the missing features to OpenStack, CloudStack and Google Compute so that NetflixOSS can also run on them.

Edda
You've completely missed the point of Edda. It does three important things. 1) If you run at large scale, your automation will overload the cloud API endpoint; Edda buffers this information and provides a query capability for efficient lookups. 2) Edda stores a history of your config; it's a CMDB that can be used to query for what changed. 3) Edda cross-integrates multiple data sources (the cloud API, our own service registry Eureka, and AppDynamics call flow information) and can be extended to include other data sources.

AMInator
If you want to spin up 500 identical instances, having them each run Chef or Puppet after they boot creates a failure mode dependency on the Chef/Puppet service, wastes startup time, and if anything can go wrong with the install you end up with an inconsistent set of instances. By using AMInator to run Chef once at build time, there is less to go wrong at run time. It also makes red/black pushes and roll-backs trivial and reliable.

Cloud Prize
The prize includes a portability category. It's a broad category and might be won by someone who adds new language support to NetflixOSS (Erlang, Go, Ruby?) or someone who makes parts of NetflixOSS run on a broader range of IaaS options. The reality is that AWS is actually dominating cloud deployments today, so contributions that run on AWS will have the greatest utility by the largest number of people. The alternatives to AWS are being hyped by everyone else, and are showing some promise, but have some way to go.

We hope that NetflixOSS provides a useful driver for higher baseline functionality that more IaaS APIs can converge on, and move from 2008-era EC2 functionality to 2010-era EC2 functionality across more vendors. Meanwhile Netflix itself will be enjoying the benefits of 2013 AWS functionality like RedShift.
jemison288
User Rank: Moderator
3/26/2013 | 7:06:34 PM
re: How Netflix Is Ruining Cloud Computing
The point is that Netflix is holding a developer competition which--as it is designed--will likely produce tools and encourage practices more consistent with "cloud computing v1.0". It's fairly clear how tools should be built to work with multiple clouds--where you should build abstraction layers, and how you should embrace more open standards and choices instead of choosing proprietary ones. Most of Netflix's tools don't do that today (consider, in particular, Asgard's place at the center of things and its reliance upon so many proprietary AWS services and API calls), and both how the tools are currently built and how the contest is designed point in a direction that goes away from a multi-cloud, standardized-configuration-management design.

Look, if this were Microsoft hosting a contest to improve its cloud tools, what I wrote would be a non-story--of course we should all be skeptical about whether what Microsoft is offering would be best-of-breed and useful. The problem is that very few people in the cloud world (at least those to whom I've spoken) seem to be viewing this Netflix/AWS contest (let's not leave AWS out, since it's very, very clearly a joint contest, with Werner and the AWS prizes) with a similar skeptical eye.
jemison288
User Rank: Moderator
3/26/2013 | 7:13:09 PM
re: How Netflix Is Ruining Cloud Computing
Wow--nothing in my piece says anything about AWS not being modern or me abandoning it. AWS still gets the vast majority of my cloud spend today, and I think it's a wonderful service.

The point is that a *cloud architecture* that *only works with one cloud* is not a good cloud architecture going forward. Good cloud architectures will work with multiple clouds. If I am hiring you as a cloud architect today, and you tell me that you are going to create a greenfield cloud architecture that is going to hook directly into AWS APIs and that you are fundamentally uninterested in testing or supporting any other clouds in the future, then I would fire you on the spot.

I am also a big fan of Netflix's services, of their corporate culture, and of the fact that they're willing to be so open about so many things, including releasing so much source code. (I think I've been a Netflix member for more than 10 years now). The problem is that I am concerned about the vast number of people and organizations who are going to take Netflix's cloud deployment as a reference architecture, which will help Netflix, but will fundamentally set cloud architecture practices back to 2010.

All that said, if the result of this contest is to bring Netflix's tools into the future and make them multi-cloud, and use standardized configuration management, then that will be excellent (as I say at the end of my piece). However, for the many reasons I outlined in my piece, I am deeply skeptical of whether that will happen.
jemison288
User Rank: Moderator
3/26/2013 | 7:16:43 PM
re: How Netflix Is Ruining Cloud Computing
Yes, I absolutely agree with this. Big users of capacity need to have architectures that can take advantage of multiple IaaS vendors (and I agree that we're only really looking at Google and Microsoft, at least today), and need to not look like Netflix's architecture, which requires so many proprietary AWS services (e.g., DynamoDB) as it stands today.
lgarey@techweb.com
User Rank: Apprentice
3/26/2013 | 7:26:25 PM
re: How Netflix Is Ruining Cloud Computing
Agreed, Doug. It's to Netflix's own benefit to nurture a vibrant set of IaaS providers. Lorna Garey, IW Reports
jemison288
User Rank: Moderator
3/26/2013 | 9:26:54 PM
re: How Netflix Is Ruining Cloud Computing
Adrian--

I read this as a "tree" response to a "forest" issue, but I'll respond with respect to both forest and trees.

The forest is this: Netflix's cloud architecture--as seen through public talks and open source code--is fundamentally (a) so intertwined with AWS as to be essentially inseparable, and (b) significantly behind the best *general* open options for configuration management and orchestration. It also is far from "the Unix way" of having encapsulated/abstracted tools that can be interchanged with others to build a best-in-breed architecture.

Your answer doesn't really do anything to address this "forest" argument: you defend the complete reliance that Netflix and (most of) its tools have on AWS based upon an analytical database that is really beside the point as far as cloud architectures go. (Don't get me wrong--I think RedShift is *awesome*, but its presence is completely irrelevant to a generalized reference cloud architecture, which is the power of NetflixOSS that's so concerning).

Your defense of AMInator and Edda (I wish you'd defend Asgard also!) is ultimately a defense of why those solutions work for Netflix and its application and current architecture--but that's not the point. Obviously you're a smart and capable architect and you have reasons for using them at Netflix. The point is that--as they stand today--they're not promoting good *generalized* application architectures. You should be promoting Chef before you promote a tool that essentially encourages people to make horrible design decisions (in lieu of using Chef at all). You should be defending Netflix tools based upon standardized, reference deployments, not based on launching 500 VMs of the same exact machine, which *is not exactly a common use case for the cloud*.

Look--it's possible to write awesome and fabulous PHP code, but most PHP developers don't. One of the reasons why Netflix is now choosing Python is because the generalized Python developer writes consistent and good code. (We chose Python for the same reasons you did). But to someone who has no idea what a good cloud deployment looks like, the way AMInator sits out there--you're going to see a lot more people like the guy on Twitter who was super-psyched to have built 25,000 AMIs.

The overall point of the piece is this: Netflix has a lot of power and clout in the cloud architecture world, and there are a whole lot of people looking to Netflix for guidance on how to deploy on the cloud. Netflix has made some choices (the "forest" above) that are flat-out bad choices if you take anything like a long-term approach to your cloud architecture. There is no historical precedent that you can cite as being a good example for being so intertwined with a single IT vendor. And it's way more important for people deploying on the cloud to know and understand configuration management than it is for them to have a tool that--as far as its public users go--is being used to bypass CM entirely.
