Q&A: Amazon CTO Werner Vogels
InformationWeek's 2008 Chief of the Year discusses Web services architecture, cloud computing, and his changing role as CTO.
Amazon CTO Werner Vogels spends more than half of his time on the road, helping IT departments with the transition to cloud computing. Vogels joined Amazon in 2004 from Cornell University, where he worked for 10 years in computer science research, focusing on large-scale, distributed enterprise systems. At Amazon, he's put that background into practice in helping the e-retailer scale its IT infrastructure and open it to other companies in the form of Amazon Web Services.
In an interview with InformationWeek's John Foley at Amazon's Seattle headquarters, Vogels discussed the architectural design and philosophy behind AWS, his role as a customer-facing technologist, and his vision of what's next in cloud computing.
InformationWeek: Let's start with the nuts and bolts. What's the IT architecture behind Amazon Web Services?
Vogels: Around the time I joined, we already had established this large-scale service-oriented architecture. In the phase before that, Amazon was mainly databases and application servers. That had come to sort of an end of life as an architecture around 2000, 2001. We moved to this service-oriented architecture by taking individual pieces of business logic that sat in the application servers, looking for the data that they operated on, bringing those together, and putting an API on them; that's what we call a service. It allowed us an evolutionary path to a service-oriented architecture. Each of those pieces that make up the e-commerce platform is actually a separate service. Whether it's Sales Rank, or Listmania, or Recommendations, all of those are separate services. If you hit one of Amazon's pages, it goes out to between 250 and 300 services to build that page.
It's not just an architectural model, it's also organizational. Each service has a team associated with it that takes on the reliability of that service and is responsible for the innovation of that service. So if you're the team that's responsible for that Listmania widget, then it's your task to innovate and make that one better. Around the time that I joined, we also did a deep dive on what these teams were actually spending their time on, and when we did an analysis on that, we found that a lot of those teams were spending their time on the same kind of things. In essence, they were all spending time on managing infrastructure, and that was a byproduct of the organization that we had chosen, which was very decentralized.
So in a traditional enterprise architecture step, we decided to go to a shared-services platform and that became the infrastructure services platform that we now know in the outside world as AWS. We first had to develop it for ourselves in a way that those teams could focus on the innovation side and not become super app administrators and super operators, because there's no glory in that, although at Amazon scale, all engineers need to be aware of scale and reliability, and be able to fail over their services from one data center to another.
So, we have an e-commerce platform that consists of all of these services. On top of that, we run a number of very large e-commerce operations -- Amazon.com, our international sites, and sites like Marks & Spencer and Target. It's a multitenant, large service-oriented architecture. Then we drop one step down and you get to the infrastructure services that power the e-commerce platform; most of those reaching the outside world we know as AWS, but they were targeted initially at internal customers. Below that sit our hardware services, the teams that construct data centers and build them out and do networking.
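The page-assembly pattern Vogels describes, in which one page request fans out to many independent services, can be illustrated with a minimal sketch. This is not Amazon's implementation. The three service functions, their return values, the `SERVICES` registry, and `build_page` are all hypothetical stand-ins; a real page would call hundreds of networked services rather than three local functions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independent services. Each one owns its own
# data and exposes it only through a call (its "API").
def sales_rank(product_id: str) -> dict:
    return {"rank": 42}

def listmania(product_id: str) -> dict:
    return {"lists": ["Distributed systems reading"]}

def recommendations(product_id: str) -> dict:
    return {"items": ["B000002", "B000003"]}

# Assumed registry mapping a page section to the service that fills it.
SERVICES = {
    "sales_rank": sales_rank,
    "listmania": listmania,
    "recommendations": recommendations,
}

def build_page(product_id: str) -> dict:
    """Call every registered service concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = {
            name: pool.submit(service, product_id)
            for name, service in SERVICES.items()
        }
        # result() blocks until that service has answered.
        return {name: future.result() for name, future in futures.items()}

if __name__ == "__main__":
    print(build_page("B000001"))
```

Because each section of the page comes from a separate callable, a team can change its own service without touching the others, which mirrors the team-per-service organization described in the interview.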
InformationWeek: Amazon is known as an open source shop. Is that still true?
Vogels: Where in the past we could say this was a pure Linux shop, now in terms of the large pieces of the e-commerce platform, we're a pure Amazon EC2 shop. There's an easier choice of different operating systems. Linux is still very popular, but, for example, Windows Server is often a requirement, especially if you need to transcode video and things that have to be delivered through Windows DRM [digital rights management], so there is a variety of operating systems available for internal developers. We have a long history of not trying to restrict our developers, allowing them to pick the tools that they feel work best. If they want to do prototyping, these days engineers will often flock to Ruby. There are some experiments going on with highly concurrent services, where they may be using Erlang. Other groups are using Java and C++ and things like that.
That also drove a very important principle in how we constructed the infrastructure services, namely that if you were using Amazon EC2, you weren't necessarily required to use S3 or SimpleDB or Elastic Block Store. Of course, you hope that all of these services are seductive enough for our engineers to use them, but if they felt there were better tools available for that particular task, then they should be free to use those. So it was required in the way that we constructed these services that there was no internal lock-in. That's the same advantage we give to customers in the outside world; none of these things are locked in.
Most of our software is best characterized as being homebuilt or homegrown. There's hardly any third-party software left, and that's not because we don't think third-party software is great. If I could buy things, I would, but in the past we've seen that to get to the reliability that Amazon requires, and to have the control over both cost and performance, we need to have much better control of the software that runs our services, and for that vendors just haven't reached the point yet where they can reliably deliver software that can operate at Amazon scale. We build our software, and if we do buy third-party software, we use it in a form that every other customer is using.
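The no-lock-in principle Vogels describes for the infrastructure services, where using the compute service does not force a particular storage service, can be sketched as application code that depends only on a small interface. Everything here is an assumption made for illustration: `BlobStore`, `InMemoryStore`, `DirectoryStore`, and `archive` are hypothetical names and do not correspond to any AWS API.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol

class BlobStore(Protocol):
    """Hypothetical minimal storage interface the application codes against."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...

class InMemoryStore:
    """One interchangeable backend: keeps blobs in a dictionary."""
    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

class DirectoryStore:
    """Another backend: keeps each blob as a file under a root directory."""
    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def put(self, key: str, data: bytes) -> None:
        (self._root / key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()

def archive(store: BlobStore, key: str, data: bytes) -> bytes:
    """Application logic: it never names a concrete backend."""
    store.put(key, data)
    return store.get(key)

if __name__ == "__main__":
    # The same job runs unchanged against either backend.
    print(archive(InMemoryStore(), "report.txt", b"hello"))
    print(archive(DirectoryStore(tempfile.mkdtemp()), "report.txt", b"hello"))
```

Swapping the backend means changing only the object passed to `archive`, which is the kind of freedom to pick a better tool for a task that the interview describes.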