Comments
How Netflix Is Ruining Cloud Computing
adrianco
User Rank: Apprentice
3/26/2013 | 11:49:08 PM
re: How Netflix Is Ruining Cloud Computing
The Netflix cloud platform *is* the abstraction we built to isolate our streaming apps from dealing directly with AWS. The rest of it is the things we had to build to get things done. The structure of NetflixOSS is based on lots of separate services that can be combined in many ways or used individually just like the 'Unix way'.

You're advocating a lowest common denominator for cloud architecture; I'm pointing out that this roots you in 2008 and ignores everything that AWS has developed in response to the needs of their customers in the following five years. To effectively use tools like Asgard, cloud APIs need to implement ideas from 2010, and it's obviously possible, since Eucalyptus have done just that, and other cloud API vendors are looking at it.

We really do push code many times every day using tens or hundreds of instances of a baked AMI. We have several hundred different services running. If you don't want to run at scale, you don't need to do this, but if scale, high availability and fast startup matter, AMInator solves the problem. The developer you mentioned having built 25,000 AMIs is our engineer who has been building and testing AMInator for the last few months.

There are many historical precedents for using a single IT vendor. You pick the ones that work at scale. There's also a philosophical difference between the concerns of Operations people who want multiple vendors to save money, and Developers who want a single vendor to get things done more quickly. For example, developers don't write SQL code that transparently works on Oracle, SQL Server and MySQL; they pick one of them to start with and port if they have to.

You mention Zynga, which I regard as a case study of why it's a bad idea to spend hundreds of millions of dollars building datacenters just before your business model collapses. Most of the people who built Z-cloud are gone now.
jemison288
User Rank: Moderator
3/26/2013 | 10:55:03 PM
re: How Netflix Is Ruining Cloud Computing
Adrian--

Here's the tree response.

Cloud 1.0 vs. 2.0:
You say, "The specific IaaS provider used underneath, and whether you do this with public or private clouds is irrelevant to the architectural constructs we've explained." But of course, the Netflix contest has to do with your tools, not "architectural constructs" per se. And your tools are absolutely tied to a specific IaaS provider. And from an architectural perspective, while Netflix has done some awesome things, I think Zynga's architecture (including what they pay for what they get) is much more likely to be what best-practice enterprise cloud architectures look like in 5+ years. I don't know many cloud architects who are aware of the difference between Zynga and Netflix who would pick Netflix's implementation over Zynga's--again, largely because of the multi-cloud capabilities.

Outages:
My point in bringing up the outages was not to imply that they were international or fatal; it was to point out that Netflix's cloud architecture is not perfect (not that any cloud architecture is, but always good to point this out), and that one can tie at least one outage to Netflix's specific architectural decision to embrace a proprietary service (ELB) when other, non-proprietary, more resilient options were available. I'm happy that Netflix won't repeat an outage due to that specific proprietary service, but the overall philosophy of choosing AWS services over open options that are more flexible and more resilient remains.

Portability:
As long as you continue to force Netflix to use new and expanded Amazon-provided services over other options, you're creating a moving target that no other vendor will hit. The better path to take is: how should organizations design their architectures so that they can maintain portability and interoperability across multiple vendors, wherever possible? If, today, we wait for Google Compute Engine to be more widely available and tested, then shouldn't we be moving to abstraction layers for API communication, instead of doubling down and adopting more and more proprietary APIs? Perhaps AWS will release their API to the world and allow all businesses to use it openly, but they haven't yet, and so it's a very risky move to bet an architecture on AWS and any vendors (e.g., Eucalyptus) that AWS will bless. Perhaps what you're saying is that you'd be happy to see abstraction layers in the Netflix tools for working with other clouds, but *you're not actually saying that*. Please say that.
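
To illustrate the kind of abstraction layer for API communication being argued for here, a minimal Python sketch follows. Everything in it is hypothetical: the `ComputeProvider` interface, the in-memory stand-in providers, and the `deploy` helper are invented for illustration and do not correspond to any NetflixOSS tool or any vendor's real SDK. The only point is that code written against the interface stays the same when the provider underneath changes.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Instance:
    provider: str
    instance_id: str
    image: str


class ComputeProvider(ABC):
    """Hypothetical provider-neutral interface; not a real SDK."""

    name: str

    @abstractmethod
    def launch(self, image: str) -> Instance: ...

    @abstractmethod
    def terminate(self, instance_id: str) -> None: ...

    @abstractmethod
    def running(self) -> list[Instance]: ...


class InMemoryProvider(ComputeProvider):
    """Stand-in adapter. A real adapter would translate these calls into
    one vendor's API; this one only tracks state in a dict."""

    def __init__(self, name: str, id_prefix: str) -> None:
        self.name = name
        self._prefix = id_prefix
        self._ids = count(1)
        self._instances: dict[str, Instance] = {}

    def launch(self, image: str) -> Instance:
        instance_id = f"{self._prefix}-{next(self._ids)}"
        instance = Instance(self.name, instance_id, image)
        self._instances[instance_id] = instance
        return instance

    def terminate(self, instance_id: str) -> None:
        self._instances.pop(instance_id, None)

    def running(self) -> list[Instance]:
        return list(self._instances.values())


def deploy(provider: ComputeProvider, image: str, n: int) -> list[Instance]:
    """Application code depends only on the interface, never on a vendor."""
    return [provider.launch(image) for _ in range(n)]


# The same deploy() call works against either (hypothetical) cloud.
cloud_a = InMemoryProvider("cloud-a", "a")
cloud_b = InMemoryProvider("cloud-b", "b")
fleet = deploy(cloud_a, "web-v1", 2) + deploy(cloud_b, "web-v1", 1)
```

The cost of this design is the one raised elsewhere in this thread: the interface can only expose what every provider supports, so vendor-specific features need either an optional extension or a deliberate decision to leave them out.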

Edda:
Thanks for the details on Edda; my knowledge was from reading what was easily available on github: "Why did we create Edda? ... if we see a host with an EC2 hostname that is causing problems on one of our API servers then we need to find out what that host is and what team is responsible, Edda allows us to do this." Edda sounds like a great tool to take multi-cloud--would you consider suggesting that as a theoretically good project for the contest (no guarantees on prizes)?

AMInator:
Your explanation of using Chef with AMInator makes a lot of sense in the "500 simultaneous instances" use case. Which is--you would admit--not a common circumstance amongst the people who use/will be using your cloud tools. And unfortunately, your first happy user of AMInator (on Twitter, at least) made over 25,000 Ubuntu AMIs with it--can you tell me why that would ever be a good architectural decision? AMInator strikes me as a tool like PHP or a GOTO statement--there are places where you should probably use them, but it's hard to argue that they should be part of any kind of "best practices" decision.

Cloud Prize:
The fact that only one out of ten prizes involves portability, and the fact that you take such an expansive view of portability to include adding language support to an existing tool (which has NOTHING to do with cloud portability!), shows that you really think that cloud portability is unimportant to Netflix. If Netflix wants to make that business decision, then fine. But I would argue that Netflix is a role model in the world, and has a lot of ears, and that it's just irresponsible for Netflix to lead the rest of the world on the same path.

To the extent that Netflix is trying to exploit open source in the same way many companies do--to share code in exchange for getting additional development for free--I have no issues. Go for it. But I have a problem with the way that Netflix's tools and architectural decisions are taken as THE reference architecture. I write here, and I wrote the piece, not to try to convince you, Adrian, to change the way Netflix does things. I would like you to run the contest in such a way that it promotes portability and interoperability, and to make the judging panel less AWS-centric. But beyond that, I'm really writing these for those people out there considering whether Netflix's cloud architecture is something they should copy verbatim. (Don't!)

Heidi Roizen, an entrepreneur-turned-VC, put it this way: "I don't ask 'what happens if?' ... I ask 'what happens when?'"--meaning that there are certain things that we know will happen, and if we aren't thinking about them and planning for them, we're not thinking strategically enough. (One of her examples: "what happens when we have self-driving cars?") It is a certainty that we will have viable IaaS competitors to AWS. But the attitude that is embedded in the Netflix cloud tools--and from what I see of the contest today--is one that essentially says, "we will look nowhere but AWS." And there is no thing that has been said in response to my piece that says otherwise. In fact, you don't even address my quoting of you in different places where you have excluded other options by fiat, regardless of price or functionality.

And that is the problem.
jemison288
User Rank: Moderator
3/26/2013 | 9:26:54 PM
re: How Netflix Is Ruining Cloud Computing
Adrian--

I read this as a "tree" response to a "forest" issue, but I'll respond with respect to both forest and trees.

The forest is this: Netflix's cloud architecture--as seen through public talks and open source code--is fundamentally (a) so intertwined with AWS as to be essentially inseparable, and (b) significantly behind the best *general* open options for configuration management and orchestration. It also is far from "the Unix way" of having encapsulated/abstracted tools that can be interchanged with others to build a best-in-breed architecture.

Your answer doesn't really do anything to address this "forest" argument: you defend the complete reliance that Netflix and (most of) its tools have on AWS based upon an analytical database that is really beside the point as far as cloud architectures go. (Don't get me wrong--I think RedShift is *awesome*, but its presence is completely irrelevant to a generalized reference cloud architecture, which is the power of NetflixOSS that's so concerning.)

Your defense of AMInator and Edda (I wish you'd defend Asgard also!) is ultimately a defense of why those solutions work for Netflix and its application and current architecture--but that's not the point. Obviously you're a smart and capable architect and you have reasons for using them at Netflix. The point is that--as they stand today--they're not promoting good *generalized* application architectures. You should be promoting Chef before you promote a tool that essentially encourages people to make horrible design decisions (in lieu of using Chef at all). You should be defending Netflix tools based upon standardized, reference deployments, not based on launching 500 VMs of the same exact machine, which *is not exactly a common use case for the cloud*.

Look--it's possible to write awesome and fabulous PHP code, but most PHP developers don't. One of the reasons why Netflix is now choosing Python is because the generalized Python developer writes consistent and good code. (We chose Python for the same reasons you did). But to someone who has no idea what a good cloud deployment looks like, the way AMInator sits out there--you're going to see a lot more people like the guy super-psyched to have built 25,000 AMIs over Twitter.

The overall point of the piece is this: Netflix has a lot of power and clout in the cloud architecture world, and there are a whole lot of people looking to Netflix for guidance on how to deploy on the cloud. Netflix has made some choices (the "forest" above) that are flat-out bad choices if you take anything like a long-term approach to your cloud architecture. There is no historical precedent that you can cite as being a good example for being so intertwined with a single IT vendor. And it's way more important for people deploying on the cloud to know and understand configuration management than it is for them to have a tool that--as far as its public users go--is being used to bypass CM entirely.

lgarey@techweb.com
User Rank: Apprentice
3/26/2013 | 7:26:25 PM
re: How Netflix Is Ruining Cloud Computing
Agreed, Doug. It's to Netflix's own benefit to nurture a vibrant set of IaaS providers. Lorna Garey, IW Reports
jemison288
User Rank: Moderator
3/26/2013 | 7:16:43 PM
re: How Netflix Is Ruining Cloud Computing
Yes, I absolutely agree with this. Big users of capacity need to have architectures that can take advantage of multiple IaaS vendors (and I agree that we're only really looking at Google and Microsoft, at least today), and need to not look like Netflix's architecture, which requires so many proprietary AWS services (e.g., DynamoDB) as it stands today.
jemison288
User Rank: Moderator
3/26/2013 | 7:13:09 PM
re: How Netflix Is Ruining Cloud Computing
Wow--nothing in my piece says anything about AWS not being modern or me abandoning it. AWS still gets the vast majority of my cloud spend today, and I think it's a wonderful service.

The point is that a *cloud architecture* that *only works with one cloud* is not a good cloud architecture going forward. Good cloud architectures will work with multiple clouds. If I am hiring you as a cloud architect today, and you tell me that you are going to create a greenfield cloud architecture that is going to hook directly into AWS APIs and that you are fundamentally uninterested in testing or supporting any other clouds in the future, then I would fire you on the spot.

I am also a big fan of Netflix's services, of their corporate culture, and of the fact that they're willing to be so open about so many things, including releasing so much source code. (I think I've been a Netflix member for more than 10 years now). The problem is that I am concerned about the vast number of people and organizations who are going to take Netflix's cloud deployment as a reference architecture, which will help Netflix, but will fundamentally set cloud architecture practices back to 2010.

All that said, if the result of this contest is to bring Netflix's tools into the future and make them multi-cloud, and use standardized configuration management, then that will be excellent (as I say at the end of my piece). However, for the many reasons I outlined in my piece, I am deeply skeptical of whether that will happen.
jemison288
User Rank: Moderator
3/26/2013 | 7:06:34 PM
re: How Netflix Is Ruining Cloud Computing
The point is that Netflix is holding a developer competition which--as it is designed--will likely produce tools and encourage practices more consistent with "cloud computing v1.0". It's fairly clear how tools should be built to work with multiple clouds--where you should build abstraction layers, and how you should embrace more open standards and choices instead of choosing proprietary ones. Most of Netflix's tools don't do that today (in particular Asgard, given its place at the center of things and its reliance upon so many proprietary AWS services and API calls), and how the tools are currently built and how the contest is designed both point in a direction away from a multi-cloud, standardized-configuration-management design.

Look, if this were Microsoft hosting a contest to improve its cloud tools, what I wrote would be a non-story--of course we should all be skeptical about whether what Microsoft is offering would be best-of-breed and useful. The problem is that very few people in the cloud world (at least those to whom I've spoken) seem to be viewing this Netflix/AWS contest (let's not leave AWS out, since it's very, very clearly a joint contest, with Werner and the AWS prizes) with a similar skeptical eye.
adrianco
User Rank: Apprentice
3/26/2013 | 6:20:30 PM
re: How Netflix Is Ruining Cloud Computing
There should be a techblog.netflix.com post in the next day or so that will give more context to the Cloud Prize and clarify most of the points above. However I will address some of the specific issues here.

Cloud 1.0 vs. 2.0?
I would argue that the way most people are doing cloud today is to forklift part of their existing architecture into a cloud and run a hybrid setup. That's what I would call Cloud 1.0. What Netflix has done is show how to build much more agile green field native cloud applications, which might justify being called Cloud 2.0. The specific IaaS provider used underneath, and whether you do this with public or private clouds is irrelevant to the architectural constructs we've explained.

Outages
The outages that have been mentioned were regional; they didn't apply to Netflix operations in Europe, for example. Our current work is to build tooling for multi-regional support on AWS (East coast/West coast), including the DNS management that was mentioned. This removes the failure mode with the least effort and disruption to our existing operations.

Portability
Other cloud vendors have a feature set and scale comparable to AWS in 2008-2009. We're still waiting for them to catch up. There are many promises but nothing usable for Netflix itself. However there is demand to use NetflixOSS for other smaller and simpler applications, in both public and private clouds, and Eucalyptus have demonstrated Asgard, Edda and Chaos Monkey running, and will ship soon in Eucalyptus 3.3. There are signs of interest from people to add the missing features to OpenStack, CloudStack and Google Compute so that NetflixOSS can also run on them.

Edda
You've completely missed the point of Edda. It does three important things. 1) If you run at large scale, your automation will overload the cloud API endpoint; Edda buffers this information and provides a query capability for efficient lookups. 2) Edda stores a history of your config; it's a CMDB that can be used to query for what changed. 3) Edda cross-integrates multiple data sources (the cloud API, our own service registry Eureka, and AppDynamics call flow information) and can be extended to include other data sources.
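
Points 1 and 2 can be sketched in a few lines of Python. This is not Edda's code or API; `InventoryCache`, `fetch_all`, the fake `describe` function, and the record fields are hypothetical names chosen for illustration. The sketch calls a (fake) cloud API once per crawl, answers lookups from the local copy, and keeps every snapshot so it can report what changed between crawls.

```python
import time
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Snapshot = Dict[str, Record]


class InventoryCache:
    """Hypothetical polling cache with history; not Edda's implementation."""

    def __init__(self, fetch_all: Callable[[], List[Record]]) -> None:
        self._fetch_all = fetch_all
        self._history: List[Tuple[float, Snapshot]] = []

    def crawl(self) -> None:
        # One upstream API call per crawl, however many lookups follow.
        snapshot = {r["id"]: dict(r) for r in self._fetch_all()}
        self._history.append((time.time(), snapshot))

    def lookup(self, **criteria: str) -> List[Record]:
        # Served from the latest local snapshot, not from the cloud API.
        if not self._history:
            return []
        _, latest = self._history[-1]
        return [r for r in latest.values()
                if all(r.get(k) == v for k, v in criteria.items())]

    def changes(self) -> Dict[str, List[str]]:
        # Diff the two most recent snapshots by resource id.
        if len(self._history) < 2:
            return {"added": [], "removed": [], "modified": []}
        (_, old), (_, new) = self._history[-2], self._history[-1]
        return {
            "added": sorted(new.keys() - old.keys()),
            "removed": sorted(old.keys() - new.keys()),
            "modified": sorted(k for k in new.keys() & old.keys()
                               if new[k] != old[k]),
        }


# A fake cloud API that counts how often it is hit.
calls = {"n": 0}
resources = [
    {"id": "i-1", "hostname": "host-1", "owner": "api-team"},
    {"id": "i-2", "hostname": "host-2", "owner": "billing-team"},
]


def describe() -> List[Record]:
    calls["n"] += 1
    return [dict(r) for r in resources]


cache = InventoryCache(describe)
cache.crawl()
# A thousand "who owns this host?" lookups cost zero extra API calls.
owners = [cache.lookup(hostname="host-2")[0]["owner"] for _ in range(1000)]

# The environment changes, then the next crawl records a new snapshot.
resources[0]["owner"] = "search-team"
resources.append({"id": "i-3", "hostname": "host-3", "owner": "api-team"})
del resources[1]
cache.crawl()
diff = cache.changes()
```

Point 3, joining in other data sources, would amount to more `fetch_all`-style collectors feeding the same snapshot store.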

AMInator
If you want to spin up 500 identical instances, having them each run Chef or Puppet after they boot creates a failure mode dependency on the Chef/Puppet service and wastes startup time, and if anything goes wrong with the install you end up with an inconsistent set of instances. By using AMInator to run Chef once at build time, there is less to go wrong at run time. It also makes red/black pushes and roll-backs trivial and reliable.
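
The consistency argument can be made concrete with some rough arithmetic. The Python sketch below assumes, purely for illustration, that each boot-time configuration run fails independently with the same probability; the 1% figure is invented, not a measurement from Netflix or anyone else. Under that assumption the chance that every instance in a launch converges to the same state shrinks quickly with fleet size, whereas a baked image runs the configuration step once at build time, where a failure blocks the build instead of producing a mixed fleet.

```python
def p_all_consistent(per_instance_failure: float, instances: int) -> float:
    """Probability that every instance configures itself successfully at
    boot, assuming independent failures with identical probability."""
    return (1.0 - per_instance_failure) ** instances


# Hypothetical per-instance failure rate for a boot-time Chef/Puppet run.
ASSUMED_FAILURE_RATE = 0.01

for n in (1, 10, 100, 500):
    chance = p_all_consistent(ASSUMED_FAILURE_RATE, n)
    print(f"{n:>3} instances: {chance:.4f} chance of a fully consistent fleet")
```

Real failures are rarely independent (a configuration server outage hits every instance at once), so this is a simplification; correlated failures are the dependency on the Chef/Puppet service described above.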

Cloud Prize
The prize includes a portability category. It's a broad category and might be won by someone who adds new language support to NetflixOSS (Erlang, Go, Ruby?) or someone who makes parts of NetflixOSS run on a broader range of IaaS options. The reality is that AWS is actually dominating cloud deployments today, so contributions that run on AWS will have the greatest utility by the largest number of people. The alternatives to AWS are being hyped by everyone else, and are showing some promise, but have some way to go.

We hope that NetflixOSS provides a useful driver for higher baseline functionality that more IaaS APIs can converge on, and move from 2008-era EC2 functionality to 2010-era EC2 functionality across more vendors. Meanwhile Netflix itself will be enjoying the benefits of 2013 AWS functionality like RedShift.
D. Henschen
User Rank: Author
3/26/2013 | 5:13:05 PM
re: How Netflix Is Ruining Cloud Computing
Sounds more like the point is big users of capacity should promote a cloud free market in which there's a real choice. Amazon has been great about lowering prices, but perhaps it would lower prices even more if it had real competition. Trouble is, it's tough for anybody except those operating at Google or, perhaps, Microsoft scale to achieve or even approach the economies of scale already established by Amazon.
tw426
User Rank: Apprentice
3/26/2013 | 5:09:28 PM
re: How Netflix Is Ruining Cloud Computing
Really? So just because AWS doesn't meet your classification of a modern, standard way of doing things, you abandon it? Replacing a working system for those reasons is a funny mindset... and one of the benefits of opening a 'contest' is to look at new ideas with an open attitude. Who is to say that something even better won't come out of this experiment?

It sounds to me like you may just want to bash Netflix (as if they need any help from you) - they have demonstrated that they are perfectly able to do that themselves!