Cloud Customers Must Understand Weak Points
In discussing Amazon's three-day outage, cloud computing advocates said customers must do more homework before selecting a vendor.
In the hallways, meeting rooms, and watering holes of the Mandalay Bay convention facilities, the Amazon outage in particular was a leading topic of discussion. In many cases, one attendee asked another whether it meant cloud computing had suffered a setback.
David Berlind, chief content officer for UBM Techweb, moderated a panel that included representatives of three cloud suppliers: Terremark, Rackspace, and virtualization vendor Citrix Systems.
"It's like an airplane crash. It's very bad because a lot of people are hit hard. But you're actually safer in an airplane than in a car. Broadly speaking, you are better off in the cloud," said Simon Crosby, CTO for data center and cloud at Citrix Systems.
"Our phone started ringing right away. Amazon didn't tell customers right away what was the problem," said Andy Schroepfer, VP of enterprise strategy at Rackspace, a San Antonio supplier of infrastructure as a service.
"There's a problem in the cloud. A lot of people think it's magic. At the end of the day, it's standard IT infrastructure. But people think, 'They wouldn't sell me a service that wasn't backed up,'" he added.
Instead of assuming that, the cloud customer needs to understand the details of the cloud service supplier's architecture, identify weak points, and figure out what is an acceptable strategy for his needs, if his section of the cloud data center should fail.
"There is some magic, but you can't just throw out IT best practices," Schroepfer said.
Berlind noted that Amazon's SLA didn't cover the loss of its Elastic Block Store and Relational Database Service, even though those failures hindered the operation of customers' virtual machines. The SLA covers only the running instances, not related services. Amazon made up for the outage, which lasted three days for some customers, by offering 10 days of free service in EC2.
"For Web sites such as Groupon and Reddit.com, is that really enough?" challenged Berlind.
Schroepfer said the damage caused by the outage to an ecommerce site has to be balanced against the benefits it gains in the cloud. Most ecommerce sites do five times their normal business at the year-end holidays. By being located in the cloud, they don't need to overprovision their data centers with five times the amount of computing needed in the other 11 months.
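Schroepfer's capacity argument can be checked with rough arithmetic. The sketch below assumes, for illustration only, that the holiday peak lasts one month, that capacity is counted in abstract "unit-months," and that owned and rented capacity cost the same per unit. None of those simplifications come from the panel, and the function names are hypothetical.

```python
# Illustrative comparison of peak-sized (fixed) capacity versus elastic capacity.
# Assumptions (not from the panel): one peak month per year, normal load = 1 unit,
# and identical per-unit cost for owned and rented capacity.

MONTHS_PER_YEAR = 12
PEAK_MULTIPLIER = 5   # "five times their normal business"
PEAK_MONTHS = 1       # "the other 11 months" implies a one-month peak


def fixed_capacity_unit_months(peak_multiplier: int) -> int:
    """Capacity held all year when the data center is sized for the peak."""
    return peak_multiplier * MONTHS_PER_YEAR


def elastic_capacity_unit_months(peak_multiplier: int, peak_months: int) -> int:
    """Capacity consumed when the site scales up only during the peak."""
    normal_months = MONTHS_PER_YEAR - peak_months
    return normal_months * 1 + peak_months * peak_multiplier


fixed = fixed_capacity_unit_months(PEAK_MULTIPLIER)
elastic = elastic_capacity_unit_months(PEAK_MULTIPLIER, PEAK_MONTHS)
print(f"fixed: {fixed} unit-months, elastic: {elastic} unit-months")
print(f"ratio: {fixed / elastic:.2f}")
```

Under these assumptions, a site sized for its peak holds 60 unit-months of capacity a year while consuming only 16, or 3.75 times what it uses.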
Crosby offered a more pointed response: "Groupon and Reddit.com wouldn't exist if there weren't a cloud." They are built on low infrastructure budgets and find the computing power they need through infrastructure-as-a-service suppliers such as AWS, he said.
Randy Rowland, senior VP of product development at Terremark, now part of Verizon Business, said cloud customers need to negotiate the SLA they need with their cloud supplier. "Some of Amazon's customers didn't know what their SLAs were," he claimed.
"We define the SLA upfront to meet a company's goals," he said. Terremark was acquired by Verizon Business, the telecom supplier, in January for $1.4 billion. Verizon has data centers worldwide and wanted Terremark's expertise in dispensing reliable services from them to government agencies and other customers. "Every customer gets a named account manager," Rowland added.
Terremark and suppliers such as Savvis and Verizon Business are not known as low-price suppliers, as Amazon Web Services is. Their named account managers, guarantees of uptime higher than the 99.95% that Amazon offers, and special recovery services all come at a higher price.
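To put the 99.95% figure in perspective, the following sketch converts an uptime percentage into a downtime allowance. It assumes the percentage is measured over a 365-day year, which is an illustrative choice rather than the measurement window of any particular SLA, and the function name is hypothetical.

```python
# Convert an uptime guarantee into the downtime it permits.
# Assumption (not from the article): the guarantee is measured over a 365-day year.

HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted by an uptime guarantee over the period."""
    return (1 - uptime_percent / 100) * period_hours


allowed = allowed_downtime_hours(99.95)
outage_hours = 3 * 24  # the three-day outage some customers experienced

print(f"allowed per year at 99.95%: {allowed:.2f} hours")
print(f"three-day outage: {outage_hours} hours")
```

On that assumption, 99.95% uptime permits a little under four and a half hours of downtime a year, against the 72 hours some customers lost. The comparison is rough, since the SLA discussed on the panel covered only running instances.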
Customers need to know that their exposure to failure will depend on the type of cloud supplier they choose. And after they choose, there's still work to be done.
"You've got to develop the relationship. If there's an outage and you don't know who to call, there is no relationship," said Schroepfer.