Amazon IDs Cause Of Data Center Outage - InformationWeek
08:07 PM

The failure of two power components at a Virginia data center affected some EC2 operations on December 9th, Amazon Web Services says.

Apparent Networks set up the monitoring service because it wanted to illustrate what its PathView Cloud could do for companies making use of cloud computing. It said it maintains 20 accounts in the data center that experienced the outage, and six of them went down. Apparent Networks spokesmen were careful to say they have no way of knowing if their experience applied to the data center as a whole.

By monitoring the network path to the data center, Apparent Networks can see something that Hyperic's systems management service, CloudStatus, cannot. It tracked its own ping and command traffic to a router in Northern Virginia, where the traffic stopped short of the virtual server that Apparent was running there. Amazon is known to operate a data center near McLean, Va., but company officials don't name specific locations in communications. Likewise, the Amazon Service Health Dashboard avoids naming locations beyond a region, which may contain several data centers. In this case it referred only to the US-East-1 region.
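The idea of following traffic along a path and noting where it stops can be sketched in a few lines of Python. This is only an illustration, not Apparent Networks' method: the `Hop` type, the function names, and the hop labels are all hypothetical, and the probe results are hard-coded rather than measured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    name: str        # hypothetical label for a point on the path
    responded: bool  # whether this hop answered the probe


def last_responding_hop(path: List[Hop]) -> Optional[Hop]:
    """Return the last hop that answered before the first silent hop."""
    last = None
    for hop in path:
        if not hop.responded:
            break
        last = hop
    return last


def target_reached(path: List[Hop]) -> bool:
    """True only if every hop, including the final target, answered."""
    return bool(path) and all(hop.responded for hop in path)


# Made-up probe results: the path is healthy up to a regional router,
# then stops short of the virtual server at the end of the path.
example_path = [
    Hop("monitoring-host", True),
    Hop("transit-router", True),
    Hop("regional-router", True),
    Hop("virtual-server", False),
]

stopped_at = last_responding_hop(example_path)
```

In this made-up trace, `stopped_at` is the regional router and `target_reached(example_path)` is false, which is the kind of evidence an outside observer would have: the path works up to a point and no further.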

If a user of Apparent Networks' PathView Cloud found evidence of a service outage, that user could match that information against Amazon's own CloudWatch service or Hyperic's CloudStatus to see how his individual virtual machines were performing and learn more, noted Javier Soltero, CTO of management products at SpringSource, a unit of VMware.

"On the whole, Amazon is extremely consistent," said Soltero. That consistency lies not simply in operating data centers but in its willingness to report incidents to customers through the service dashboard. In this instance, however, "we saw a gap between the actual outage" and when the service notices started to appear. If Apparent Networks' outage times are right, the gap was 34 minutes long, which is either a short time or an unbearably long one. For an EC2 customer in the affected data center, the view depends on whether the workloads running there were time-sensitive.
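The arithmetic behind such a gap is simple, and a short Python sketch shows how a customer might weigh it against a workload's tolerance. Everything here is an assumption for illustration: the function names are hypothetical, the timestamps are placeholders chosen only so that they differ by 34 minutes, and the tolerance values are invented.

```python
from datetime import datetime, timedelta


def notification_gap(outage_observed: datetime, first_notice: datetime) -> timedelta:
    """Time between an externally observed outage and the first service notice."""
    return first_notice - outage_observed


def is_tolerable(gap: timedelta, max_delay: timedelta) -> bool:
    """Whether a notification gap fits within a workload's tolerance."""
    return gap <= max_delay


# Placeholder timestamps, not the actual times of the incident.
observed = datetime(2000, 1, 1, 12, 0)
notice = observed + timedelta(minutes=34)

gap = notification_gap(observed, notice)

# Invented tolerances: a batch job that can wait an hour,
# and a time-sensitive service that can wait five minutes.
batch_ok = is_tolerable(gap, timedelta(hours=1))
realtime_ok = is_tolerable(gap, timedelta(minutes=5))
```

With these invented tolerances, the same 34-minute gap is acceptable for the batch job and unacceptable for the time-sensitive service.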

Amazon's incident notices are also non-specific about location. Customers can't tell from the notices whether they have a virtual machine running where the incident is taking place. To find out, they must subscribe either to Amazon's CloudWatch or to a third-party service, such as PathView Cloud or CloudStatus, that looks at the cloud from the outside.
