Commentary
Cloud Takes A Hit: Amazon Must Fix EC2
Amazon's "availability zones" were a key protective concept for the cloud, but they failed to protect access to data when EC2 went down.
It seems to me the outage of Amazon’s cloud computing service yesterday was a signal event. IT advocates of cloud computing face severe internal skepticism that the cloud is a reliable, distributed environment. In the past, they’ve responded that skilled service providers, such as Amazon, architect against failure with availability zones, independently running sections in one data center. If you run your application in one and keep a mirror image in another, you’re protected. Some enterprises found out yesterday the architecture doesn’t work. Their critics had a field day.
Amazon’s outage in Northern Virginia yesterday impeded customer access to data beyond one availability zone in that center. Amazon has a West Coast data center as well as one in Northern Virginia, but something that wasn’t clear before became clear yesterday. Amazon zones don’t extend to different data centers in different geographic locations. This fact is reverberating today among users of cloud computing. The different availability zones are supposed to keep services running, even if part of the data center fails. They didn’t function as advertised.
Amazon Web Services has been posting its usual terse explanations to its Service Watch Dashboard, but for the anxious IT manager they don't say much. They don't say, for example, when the cause of the trouble can be expected to be alleviated. Service troubles started at 5 minutes before 1 a.m. Pacific time on Thursday. At 11:09 a.m., the dashboard acknowledged many customers were asking when service would be back: "We deeply understand why this is important and promise to share this information as soon as we have an estimate that we believe is close to accurate." Their best guess: "in a few hours."
Let's be clear on what did and did not happen. Amazon's EC2 infrastructure as a service, the compute servers, stayed up and running in Northern Virginia, but some of them lost the ability to access data, launch a customer's stored instances, and save results of running instances. That means those customer servers or “instances” that were running time-sensitive applications or customer-facing apps were rendered useless.
On the other hand, some customers may not have been affected at all. CloudSleuth, an EC2 monitoring service from Compuware that's meant to illustrate the capabilities of its Gomez monitoring service, had two test applications running in Northern Virginia Thursday, and they responded to pings, indicating that they had stayed up and running through the outage. Neither of the test apps was making use of Relational Database Service or Elastic Block Store, the key affected services. If they had needed them, they would have stalled.
A disruption to the RDS appears to have led to interruptions of the EBS storage service that Amazon offers customers to capture data and record the application instance. The failure of these services in a zone of what's known as US-East 1, an Amazon data center in Northern Virginia, was bad enough, but their failure in turn triggered RDS and EBS service disruptions in additional availability zones.
Most enterprise applications in EC2 would be making use of EBS and some would use RDS as well. Their inability to access data would render them useless in many cases for the length of the service disruption. Until Amazon can demonstrate that it knows what caused the problem and how to fix it, this disruption puts a stake in the heart of the argument that Amazon zones are adequate protection against failure.
That's because Amazon itself presents the zones as the chief protection against your application failing. "By launching instances in separate Availability Zones, you can protect your applications from failure of a single location," states the guidance for users of Amazon Machine Images.
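That guidance only holds if zones fail independently of one another. As a rough illustration, and not a model of Amazon's actual infrastructure, the sketch below compares the chance that a mirrored application loses both zones at once when the zones are truly independent with the chance when they share a dependency that can take both down. The function name `mirrored_downtime` and all of the probabilities are hypothetical values chosen only for the arithmetic.

```python
def mirrored_downtime(p_zone: float, p_shared: float = 0.0) -> float:
    """Probability that both zones of a mirrored app are down at once.

    p_zone:   chance a single zone is down from its own, independent faults
              (hypothetical figure, not an Amazon statistic)
    p_shared: chance that a dependency common to both zones is down,
              which takes out both zones together (also hypothetical)
    """
    both_fail_independently = p_zone * p_zone
    # Either the shared dependency fails, or it holds and both zones
    # happen to fail on their own at the same time.
    return p_shared + (1.0 - p_shared) * both_fail_independently


if __name__ == "__main__":
    isolated = mirrored_downtime(p_zone=0.01)
    coupled = mirrored_downtime(p_zone=0.01, p_shared=0.005)
    print(f"isolated zones: {isolated:.6f}")
    print(f"coupled zones:  {coupled:.6f}")
    print(f"coupled is roughly {coupled / isolated:.0f}x worse")
```

Under these assumed numbers, even a small shared failure mode dominates the result: the benefit of mirroring across zones comes almost entirely from the zones being insulated from each other.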
What is a zone? Only Amazon knows for sure. I know the new New York Stock Exchange data center in Mahwah, N.J., designed for high availability, was built on the border of two utility companies, giving it two sources of power. To me, a cloud data center has at least two zones with distinct electricity sources. One can fail, and the rest of the facility keeps running. Likewise, with telecommunication carriers, two or more are necessary. Zones within the data center tap into different services; they're architected against both failing at the same time. Yesterday's outage, on the contrary, says zones are not insulated from one another and a service failure of one can spill over into another. This is a body blow to cloud computing.



