9/13/2013
03:48 PM

Amazon Web Services Hit By Slowdown

AWS' HSMCloud encryption key storage service produced network errors in part of Zone B, causing system delays and failovers.

An encryption key storage service inside Amazon Web Services, the Hardware Security Module appliance, was affected by connectivity problems over a period of an hour and 18 minutes in one availability zone Friday morning.

The slowdown affected only one availability zone, according to the Amazon Service Health Dashboard, but it was severe enough to trigger a failover to alternative systems for latency-sensitive customers. One customer, Wuaki.tv, an API management firm in Spain, found its production systems failing over to other availability zones in US-East and US-West, as they are designed to do when operational problems put the business at risk.

The HSMCloud service is designed to allow customers to store encryption keys in a secure manner inside AWS, keeping the keys close to the customer's virtual machines running in an Amazon Virtual Private Cloud. Without the HSM, the keys need to be kept on-premises and retrieved each time an encryption function is performed, slowing operations. US-East is Amazon's most heavily used data center complex with five availability zones, located in Ashburn, Va.

AWS Service Health Dashboard reported that HSMCloud service returned to normal at 9:22 a.m. Pacific. As usual, the company offered no explanation beyond the cryptic notices posted to the dashboard.

In addition to HSMCloud, other services in the Northern Virginia facility were affected by connectivity problems at about the same time. They included EC2 compute, Simple Email Service, Elastic Load Balancing, Relational Database Service and the RedShift data warehousing service. In each case, a single availability zone was affected.

Affected customers said the connectivity problems appeared to arise in the B availability zone. Each availability zone appears to be made up of several discrete data center sections.

Rhommel Lamas, a systems operations engineer for Wuaki.tv, said its production system in one section of the B availability zone suffered major latencies as it tried to connect to the D and E availability zones. It suffered the same latency as it attempted to connect to another B zone section, according to a map drawn by Wuaki.tv's monitoring service, Boundary. The latencies amounted to 1,260 ms, or 1.26 seconds, a crippling delay for the latency-sensitive business of managing customers' APIs. Such a latency would back up thousands of API requests on highly trafficked websites, putting the customers of Wuaki.tv at risk of losing site visitors and business.

Lamas in a telephone interview said he spotted the delays building in his Boundary monitoring system a few minutes before Amazon reported them on its dashboard. Wuaki.tv production systems automatically fail over to backup systems when latencies reach a certain threshold.
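
The article does not describe how the failover trigger is built, so the following is only a minimal sketch of a threshold-based trigger of the kind Lamas describes. The class name, the 500 ms threshold and the three-sample rule are hypothetical choices for illustration; only the 1,260 ms figure comes from the reported measurements.

```python
from collections import deque


class LatencyFailoverMonitor:
    """Signal a failover once latency stays above a threshold
    for a number of consecutive samples (all values hypothetical)."""

    def __init__(self, threshold_ms, consecutive_required=3):
        self.threshold_ms = threshold_ms
        # Keep only the most recent N samples.
        self.recent = deque(maxlen=consecutive_required)

    def record(self, latency_ms):
        """Record one measurement; return True if failover should trigger."""
        self.recent.append(latency_ms)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(s > self.threshold_ms for s in self.recent)


# Hypothetical sample stream: normal latencies, then a climb to the
# 1,260 ms level reported in the article.
monitor = LatencyFailoverMonitor(threshold_ms=500, consecutive_required=3)
samples = [40, 45, 900, 1260, 1260]
decisions = [monitor.record(s) for s in samples]
print(decisions)
```

Requiring several consecutive slow samples, rather than reacting to a single one, is one common way to avoid failing over on a brief spike; a real system would tune both numbers to its own traffic.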

"We have a complex architecture and this is just one tiny part of it," Lamas said in an email message preceding the telephone interview. "We saw how all of our Region B on US-East was failing with increasing latency issues and errors between machines in different zones," he wrote. Wuaki.tv lost messaging packets as well in attempted communications between the zones.

Lamas said his firm didn't suffer any loss of business because automated failovers worked as planned. "No, it didn't affect our ability to serve our customers," he said. It wasn't immediately clear whether Amazon's SLAs, which offer replacement time for any time lost in a service outage, would offset Wuaki.tv's increased instance cost.

Other Amazon users noticed the problem. AWS said it began at 7:32 a.m. Pacific. At 8:34 a.m., Joshua Frattarola (@jfrattarola) tweeted: "Friday 13th starts off with major network issues affecting all of AWS US-EAST-1 Region. Probably somebody typing Google into Google."

Comments
RhommelL296,
User Rank: Apprentice
9/16/2013 | 10:02:27 AM
re: Amazon Web Services Hit By Slowdown
There is a mistake on the post. I work for 3scale.net not for wuaki.tv.
jemison288,
User Rank: Moderator
9/16/2013 | 5:01:36 PM
re: Amazon Web Services Hit By Slowdown
Note that everyone has different "availability zone" labeling in AWS--and a zone can be one or more data centers. So my US-East-1b may consist of data centers that are in your US-East-1a and US-East-1d.
Patrickmoore,
User Rank: Apprentice
9/17/2013 | 9:56:25 AM
re: Amazon Web Services Hit By Slowdown
"Note that everyone has different "availability zone" labeling in AWS--and a zone can be one or more data centers. So my US-East-1b may consist of data centers that are in your US-East-1a and US-East-1d." I totally agree with jemison