Cloud // Platform as a Service
Commentary
3/25/2013
02:00 PM

How Netflix Is Ruining Cloud Computing

A laser focus on Amazon Web Services and seeming disregard for next-gen best practices could spell lock-in, and derail real IaaS competition.

On March 13, Netflix announced $100,000 in prize money for the developers who do the most to improve its open source tools for controlling and managing application deployments in the cloud. Before spearheading this contest, Netflix's cloud architect, Adrian Cockcroft, released many internal Netflix tools as open source. Currently, eight cloud-architecture-specific tools are available from Netflix, and Cockcroft has been very open in sharing his and Netflix's knowledge in public forums.

In theory, all of this should be wonderful. In reality, however, it's likely to leave cloud computing with an enormous hangover of subpar practices and architectures for years to come. Netflix is the poster child for "Cloud Computing v1.0" and demonstrates both its enormous benefits and its troubling problems. Cloud Computing v1.0 is strictly an Amazon Web Services affair -- it was first, and no other provider had the core features necessary to build comparable applications (think multiple availability zones and EBS with snapshots and quick restores). So it makes sense that Netflix embraced AWS; it saw huge benefits in being able to deploy and scale its service using the interfaces and architectures that were possible when AWS launched.

But Netflix has also suffered repeatedly at the hands of Cloud Computing v1.0 with four outages in 2012 alone, which certainly points to the possibility for some improvement in the high availability of its service. Of note, the Christmas Eve outage is perhaps most troubling from a "v1.0" perspective, as it was solely the result of Netflix's reliance on a less-necessary AWS service for load balancing, which could have been handled in any number of other ways to increase server availability.

The reason the Netflix contest is likely to leave organizations worse off is that it thoroughly embraces this "Cloud Computing v1.0" mindset, both from an "AWS-is-the-only-vendor" standpoint and from an architectural standpoint. While it's arguable that there still isn't (quite yet) another infrastructure-as-a-service (IaaS) vendor that has a thoroughly tested core feature set, unless you just walked out of the tattoo parlor with "#AWS" on your shoulder, you know it won't be long. And all companies running on AWS should be looking forward to the rise of additional IaaS vendors, like those in our IaaS buyer's guide, for two reasons: higher availability and price competition.

Every cloud architect should know that it's only a matter of time before organizations have applications deployed across the world on many different IaaS providers in many different data centers, based on request volume and location in combination with a market for computing resources that changes price constantly. Locking yourself down to AWS today, for greenfield cloud architectures, would be the equivalent of deciding to develop an iPhone-only application when you know you'll have to support iPads, Android and others in the future.
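To make the idea of placing workloads by location and price concrete, here is a minimal sketch of a greedy placement step. Everything in it is an assumption for illustration: the `Offer` record, the `place` function, and the provider names and prices are hypothetical, and a real scheduler would also have to weigh data transfer costs, latency and failure domains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One provider's current price and spare capacity in one region (hypothetical)."""
    provider: str
    region: str
    price_per_hour: float
    free_capacity: int


def place(demand, offers):
    """Greedy placement: fill each region's demand from its cheapest offers first.

    demand maps region -> number of servers needed.
    Returns a dict mapping (provider, region) -> servers to launch there.
    """
    plan = {}
    for region, needed in demand.items():
        local = sorted(
            (o for o in offers if o.region == region),
            key=lambda o: o.price_per_hour,
        )
        for offer in local:
            if needed == 0:
                break
            take = min(needed, offer.free_capacity)
            if take:
                plan[(offer.provider, region)] = take
                needed -= take
        if needed:
            raise ValueError(f"not enough capacity in {region}: {needed} servers short")
    return plan


offers = [
    Offer("provider_a", "eu", 0.10, 2),
    Offer("provider_b", "eu", 0.08, 3),
    Offer("provider_a", "us", 0.06, 10),
]
plan = place({"eu": 4, "us": 1}, offers)
```

Rerunning `place` whenever prices or request volumes change is the part that only works if the application is not tied to a single vendor's API.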

In addition to the annoying AWS-centrism of the Netflix contest, there's a deeper problem: Some of Netflix's tools embrace a cloud architecture that was fine in the days of Cloud Computing v1.0 but that will look increasingly suspect as time goes on. I know that it's hard to throw out code and systems that are working fine, especially when they still look pretty good -- and often, squeaking out a bit more time is the right internal decision for an individual company. But instead of just wringing out the last bits of value, Netflix is throwing significant money at the rest of the world, asking everyone to embrace and extend tools and code that do not represent particularly good practices for future cloud architectures.

Perhaps the best example of a bad-practice Netflix tool is Aminator. Aminator helps you build Amazon Machine Images (AMIs) easily, based on a "base" AMI and a package of code. "I must have produced about 25,000 Ubuntu AMIs," raved one excited early user. There's just one problem: It's hard to understand when this would ever be a good idea. Several years ago, spawning tons of images would have been a somewhat acceptable way to roll out a revised version of an application (due to application code, operating system, and/or server software). But today we have widespread adoption of configuration management tools like Chef and Puppet that make massive AMI creation a subpar practice at best. Amazon Web Services itself recently rolled out a service called OpsWorks, which would be a significantly better way to handle deploying applications -- it uses Chef.
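The contrast with configuration management can be illustrated with a toy convergence step: rather than baking a new image per release, a stock image is brought to a declared state at boot. This is only a sketch of the general idea, not how Chef, Puppet or OpsWorks are implemented; the `converge` function and the package names are hypothetical.

```python
def converge(current, desired):
    """Bring a node's installed packages in line with the declared state.

    current: dict of package -> installed version (updated in place).
    desired: dict of package -> declared version.
    Returns the list of actions taken; an empty list means nothing to do.
    """
    actions = []
    for pkg, version in sorted(desired.items()):
        have = current.get(pkg)
        if have is None:
            actions.append(f"install {pkg}={version}")
        elif have != version:
            actions.append(f"upgrade {pkg} {have}->{version}")
        current[pkg] = version
    return actions


# A freshly booted node from a stock image, and the declared release.
node = {"nginx": "1.2"}
release = {"nginx": "1.4", "app": "42"}

first_run = converge(node, release)
second_run = converge(node, release)  # idempotent: already converged
```

Shipping a new release then means changing the declared state, not producing another machine image.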

There are other less-bad tools, but many bear the mark of having to architect around a number of issues that have since been more or less resolved; it's a bit like an open source project that relies heavily on SOAP instead of being RESTful. For example, Edda, which figures out what cloud resources you're using at AWS, just seems like something that had to be built because no one properly set up how resources should be requested and deployed. And Asgard, a very cool tool from 2010 for managing a variety of different applications across AWS, would be a hard sell as a best-of-breed tool today compared with other open source options, notably Scalr and Chef.

This is not to say that all of Netflix's open source cloud tools fit into this mold. Denominator is a great DNS manager (because it's multi-cloud), and Simian Army is a fabulous, ground-breaking idea for testing cloud architectures (it is, unfortunately, AWS-only).

There's a possibility that the Netflix contest will help lead the world toward Cloud Computing v2.0 and beyond by embracing multi-cloud architectures that use orchestration and configuration management in optimal ways. However, I am skeptical on both fronts. Cockcroft's public comments suggest little interest in using other cloud vendors. A good chunk of the prize money is in AWS credits, and Amazon's CTO is a judge; all this points to a very AWS-centric mindset. Moreover, the fact that Netflix just released Aminator last week indicates to me that Netflix is happy to roll out whatever tools they've built, regardless of whether they fit in with a best-practices modern cloud architecture.

But please, Netflix, prove me wrong. Embrace a less proprietary, more highly available, more standardized cloud -- and put Google's Urs Hölzle on the panel while you're at it. #UrsForNetflixJudge

Comments
Decision Consultants
User Rank: Apprentice
4/30/2013 | 3:54:40 PM
re: How Netflix Is Ruining Cloud Computing
Great interesting article. Interested to know who will win the contest.
mtimc
User Rank: Apprentice
4/25/2013 | 9:19:57 AM
re: How Netflix Is Ruining Cloud Computing
Joe

Do you mean that Adrian's changed his position from where it was on slide 20 of: http://www.slideshare.net/adri... (March last year)?

Compared to other approaches to portability that I've encountered, NF seems to be quite grown up and pragmatic and actively worried about lock-in.

Can you point me at a relevant citation?
mtimc
User Rank: Apprentice
4/24/2013 | 2:03:01 PM
re: How Netflix Is Ruining Cloud Computing
Joe
Do you just have a minimal shim abstraction, then?

It seems to me that the AWS API *is* a good (enough) abstraction for now - it's so much richer than the competition, and the experience of using it is still too limited to establish a more abstract version. Provided you're following a decent CD process, you'll have the tests in place to allow you to evolve the abstraction as you learn.

In my experience, abstraction shims can be useful, but they can also cost more than they're worth unless the problem domain is very well understood.

Discuss...
jason833
User Rank: Apprentice
4/2/2013 | 11:00:00 PM
re: How Netflix Is Ruining Cloud Computing
That's an interesting assumption. As a matter of fact, I do, which is why I chose AWS as my IaaS provider.

AWS has 3 independent Regions in the US (risk management), and it's cheaper to deploy across multiple regions using the same toolset that works for all of them than to spend time and effort building your toolset to work with other IaaS providers (cost management). In addition, combining both, you get stability and ease of management by reducing the complexity of your infrastructure as a whole.

With that said, I'm eagerly waiting for Google Compute Engine to grow into a "tree", as I've read favorable review on its limited preview. Google is probably the only powerhouse out there who can compete with Amazon head-on in terms of scalability and technologies. Thoughts?
David Berlind
User Rank: Apprentice
4/2/2013 | 10:49:41 PM
re: How Netflix Is Ruining Cloud Computing
Just a heads up to everyone on this thread that Rackspace CTO John Engates has chimed into the Netflix debate with an op-ed piece on InformationWeek.com. You can find it here: http://www.informationweek.com...
jemison288
User Rank: Moderator
4/2/2013 | 11:01:08 AM
re: How Netflix Is Ruining Cloud Computing
I assume you do not work in a risk management or cost management role, then?
jemison288
User Rank: Moderator
4/2/2013 | 11:00:16 AM
re: How Netflix Is Ruining Cloud Computing
I would put it this way (and I plan on writing in more detail on this): I think from an IaaS provider, you need the equivalent of EC2 (at least a few machine types), S3, EBS (snapshot/restore/detach/attach), and Security Groups from AWS. Then you should interact with that IaaS through an abstraction layer, not directly with their API, so you can switch to other, similar providers. (Netflix does argue that since they're at least--in some cases--using boto/similar libraries, it will be easier to add support for other clouds, but I am skeptical).

So, on that abstraction layer. Ultimately, from a software architecture standpoint, you can either start from a "I'm going to build a library that interfaces with X API", or you can start from, "I'm going to build a generalized set of interfaces to a certain type of service, and I'm going to translate from my generalized interfaces to specific implementations". I believe--and I think the evidence shows--if you start with the former, it's very hard to connect to other APIs that do similar things, just not in the exact same way. However, if you do the latter, and you are explicit from the beginning that you'll be translating from your outward-facing interfaces to other APIs, you'll have set the project up properly. These are really quite different software engineering projects, and I think that's why you hear the Netflix engineers really focus on the idea that other platforms will have to adopt the AWS API for things to work properly with their tools.
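The second approach described above, a generalized interface translated to specific implementations, might look like the following minimal sketch. All names here (`ComputeProvider`, `InMemoryProvider`, `scale_out`, the size labels) are hypothetical, and the in-memory adapter only stands in for a real one that would call a vendor's API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from itertools import count


@dataclass
class Server:
    """Provider-neutral description of a running machine."""
    server_id: str
    size: str
    provider: str


class ComputeProvider(ABC):
    """Generalized interface; application code depends only on this."""

    @abstractmethod
    def launch(self, size: str) -> Server: ...

    @abstractmethod
    def terminate(self, server_id: str) -> None: ...


class InMemoryProvider(ComputeProvider):
    """Stand-in adapter. A real adapter would translate each call into
    one vendor's API instead of updating a dict."""

    def __init__(self, name, size_map):
        self.name = name
        self.size_map = size_map  # neutral size -> vendor-specific type (made up)
        self.running = {}
        self._ids = count(1)

    def launch(self, size: str) -> Server:
        vendor_type = self.size_map[size]  # the translation step
        server = Server(f"{self.name}-{next(self._ids)}", vendor_type, self.name)
        self.running[server.server_id] = server
        return server

    def terminate(self, server_id: str) -> None:
        self.running.pop(server_id, None)


def scale_out(provider: ComputeProvider, n: int, size: str = "small"):
    """Application code: knows the generalized interface, not the vendor."""
    return [provider.launch(size) for _ in range(n)]


cloud_a = InMemoryProvider("cloud_a", {"small": "type-a-small"})
cloud_b = InMemoryProvider("cloud_b", {"small": "type-b-1"})
servers = scale_out(cloud_a, 2) + scale_out(cloud_b, 1)
```

Because `scale_out` never sees a vendor-specific type name, adding a provider means writing one more adapter rather than touching application code.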

I view the multi-cloud issue as (a) critical, and (b) very hard, for the reasons you mentioned above. So much software uses RDBMS, and multi-master replication is generally speaking not the most reliable thing. Netflix has a good article about how they treat different types of data to handle their multi-region deployment, and I think we're going to have to move to architectures like that to take full advantage of the cloud.

In the end, I think the debate in these comments has more to do with us talking past each other than anything else (as Charles Babcock points out here somewhere). I think we all agree that Netflix has solved their particular use case in a great way, and that open sourcing their tools and giving speeches on their architecture is very useful. Netflix appears to view their AWS API lock-in as a necessary thing to keep for the future (and I don't see why they shouldn't work toward abstracting all of it) and Netflix appears to think that everyone approaching their tools will be capable of making an informed decision about the potential issues of using them (and I think that's a really bad assumption).
mtimc
User Rank: Apprentice
4/1/2013 | 1:06:53 PM
re: How Netflix Is Ruining Cloud Computing
Joe
Have you got any specific best practices in mind, beyond the obvious one of 'assume cheap, ephemeral hardware'? Which has significant consequences for enterprise software. Followed by 'Continuous Delivery' (or deployment for some), so that you've got mechanisms for testing and replacing the components at low risk?

It seems to me that Netflix has done some sterling work in these dimensions, but that it's very unobvious to most enterprises that they are necessary underpinnings of moving to pay as you go infrastructure.

Can you be more specific in the abstractions that you have in mind?

I'd also add that Amazon does seem to be reacting to competition as price drops feel like they are accelerating, even if they are still nowhere near the cost advantage that AWS has according to James Hamilton.
jason833
User Rank: Apprentice
3/30/2013 | 12:52:36 AM
re: How Netflix Is Ruining Cloud Computing
AWS got an early head start in IaaS offerings, and built a massive ecosystem with all the building blocks that no one else can match even to this day. I just don't see why anyone would want to invest the time and money in cross-cloud mobility and multi-cloud tools at this stage.
jason833
User Rank: Apprentice
3/30/2013 | 12:24:18 AM
re: How Netflix Is Ruining Cloud Computing
From where I stand, Joe's taking the name of the contest way too literally; hence, calling out Netflix for implying that the Cloud = AWS.

But are there other trees in the forest, or is it just a 1-tree forest with a bunch of shrubs? Google Compute Engine (GCE) is looking extremely promising, so I'm eagerly watching it grow into a tree.

As for bake v. no bake, it depends on many variables. How big is the infrastructure per service, how large is the change, how many iterations performed per... It's a moot point.