Latest Amazon Web Services outage prompts complaints from critics, customers. But Amazon supporters say incident simply shows need to use multiple availability zones.
An Amazon Web Services outage at the company's primary East Coast data center during the evening of June 14 produced negative comments in the press and on social media. At the same time, some members of the cloud computing community rose to Amazon's defense.
Heroku, Quora, Parse, and Pinterest were among the sites affected by the outage, along with many small companies that rely on Amazon as their source of compute power instead of a traditional data center. The latter prospect produced a wry comment: "With AWS having an outage, thousands of startups with literally dozens of customers are affected," tweeted Laurie Voss, technical lead at social media analytics firm awe.sm in San Francisco.
Amazon itself kept to a few cryptic comments on its Service Health Dashboard. It cited a power outage, but didn't say whether its cause lay with an electricity supplier or a failed component inside its facilities. It did say the outage affected part of one availability zone. Some U.S. East customers have discussed four availability zones being available at the Northern Virginia site. An availability zone has separate power and communications facilities, so that an outage in one doesn't spread to others.
Amazon recommends that customers who wish to avoid an outage run applications in two availability zones as a high-availability best practice. That would have protected customers in the June 14 outage. But it proved less than a bullet-proof guarantee in Amazon's bigger, Easter weekend outage in April 2011. In that incident, later termed "a remirroring storm," service freeze-ups in one zone affected the availability of those services in other zones, according to Donnie Flood, VP of engineering at Bizo, a business information site caught in the service collapse.
Nevertheless, keeping an active backup copy in a second availability zone worked this time for Control Group, a New York custom application building firm that hosts its customers' apps on AWS EC2. "There's a little bit of overhead to that," conceded Dave Rocamora, VP of DevOps (development/operations) at the firm.
But Control Group embeds the automatic deployment of an active backup system in its new applications. Upon deployment to EC2, the backup will be established in a different availability zone unless the customer turns it off.
Rocamora said his firm is building production systems for 20 different customers, including e-commerce transaction, video distribution, and HIPAA-compliant health care applications. He estimated "90% of them are active/active," using two availability zones, and this practice saw all his customers through the June 14 outage.
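As a rough illustration of the practice Control Group describes, the sketch below places a backup in a different availability zone from the primary unless the backup is switched off. The zone names, the `plan_placement` and `surviving` functions, and the `backup_enabled` flag are hypothetical, chosen for illustration; this is not Control Group's tooling or an AWS API.

```python
from typing import List, Sequence

# Hypothetical zone names, for illustration only.
ZONES = ["us-east-1a", "us-east-1b", "us-east-1c", "us-east-1d"]


def plan_placement(primary_zone: str,
                   zones: Sequence[str],
                   backup_enabled: bool = True) -> List[str]:
    """Return the zones to deploy into: the primary, plus one backup in a
    different zone unless the backup has been turned off."""
    if primary_zone not in zones:
        raise ValueError(f"unknown zone: {primary_zone}")
    placement = [primary_zone]
    if backup_enabled:
        # Any zone other than the primary qualifies; take the first one.
        others = [z for z in zones if z != primary_zone]
        if not others:
            raise ValueError("no second zone available for the backup")
        placement.append(others[0])
    return placement


def surviving(placement: Sequence[str], failed_zone: str) -> List[str]:
    """Return the copies still running after a single zone fails."""
    return [z for z in placement if z != failed_zone]
```

Under these assumptions, an application placed in two zones keeps one running copy when either zone loses power, while a single-zone deployment keeps none if its zone is the one that fails. The sketch does not model the cross-zone effects seen in the April 2011 incident.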
The Amazon incident prompted a seller of software for on-premises private clouds, Piston Cloud, to engage in a bit of one-upmanship: "These very public 'glitches' underscore the fact that private cloud is best--in terms of cost, security, scalability and innovation--every time," said Joshua McKenty, CEO of Piston, which produces on-premises software based on OpenStack open source code.
But that and a blog post on the Piston site June 15 prompted some rejoinders. Jeff Sussna, principal at IT service consultancy Ingineering.IT, tweeted: "Amazon rivals should not throw stones. FUD hurts the public cloud industry, not just the vendor."
Netflix chief cloud architect, Adrian Cockcroft, whose company is one of Amazon's largest users, tweeted June 15: "What part of 'Availability Zone' do people not understand? Exactly the failure mode we expect and plan for..."