Companies sometimes choose software-as-a-service over on-premises software because it is inexpensive and flexible, yet recent outages at Google and Workday are reminders that SaaS isn't perfect. SaaS lets CIOs pay only for what they need, but how do CIOs make sure they get the performance they've paid for?
Most SaaS vendors guarantee 99.9% availability or better, depending on the app, rather than 100%, because unplanned outages can and do occur in any IT infrastructure. But a good SaaS vendor, in addition to presenting a solid fail-over plan, is forthright about system performance and communicates the details of unplanned outages when they occur.
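To put an availability percentage in concrete terms, it can be converted into a downtime budget. The following is a minimal Python sketch, assuming availability is measured over a 365-day year or a 30-day month; the function name and the measurement periods are illustrative assumptions, not terms from any vendor's contract.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime an availability percentage permits."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        yearly = allowed_downtime_hours(pct)
        monthly = allowed_downtime_hours(pct, period_hours=30 * 24)
        print(f"{pct}% -> {yearly:.2f} h/year, "
              f"{monthly * 60:.1f} min per 30-day month")
```

Under these assumptions, a 99.9% guarantee leaves room for about 8.76 hours of downtime a year, or roughly 43 minutes in a 30-day month. How an actual agreement defines the measurement window and counts planned maintenance varies by vendor.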
Ray Wang, an analyst with Altimeter Group, published a best-practices document for SaaS earlier this month, titled "Customer Bill Of Rights: Software-as-a-Service." On performance metrics, Wang advises that SaaS customers should expect their vendors to provide a "trust site" for monitoring service levels (such as Trust.salesforce.com or status.netsuite.com), although not all vendors provide such sites. Vendors also should provide a monthly report on key availability and continuity metrics, he adds.
Workday, which offers human resources, financial apps, and payroll-as-a-service, is among those that don't offer a trust site. But it's still a small company, with only 100 customers, so it was easy enough to call or e-mail customers with status updates during a 15-hour outage on Sept. 24.
A network-attached storage device that stored operating system files for Workday's production servers self-diagnosed a corrupted node. But rather than simply logging the fault, the device took itself offline. When Workday couldn't resolve the problem within a few hours, it began the 12-hour process of moving its systems over to a back-up data center.
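For scale, the 15-hour outage can be set against the 99.9% figure cited above. The short Python sketch below is an illustration only: it assumes the outage is the sole downtime in a 365-day year, and the function name is hypothetical. It does not describe Workday's actual service terms.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def achieved_availability_pct(downtime_hours: float,
                              period_hours: float = HOURS_PER_YEAR) -> float:
    """Return availability as a percentage, given total downtime in a period."""
    if downtime_hours < 0 or downtime_hours > period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return 100 * (1 - downtime_hours / period_hours)


if __name__ == "__main__":
    outage_hours = 15  # outage length reported in the article
    print(f"{achieved_availability_pct(outage_hours):.2f}% over one year")
```

On those assumptions, a single 15-hour outage works out to roughly 99.83% availability for the year, short of a 99.9% target.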
These details were provided to customers during the outage. "The communication from Workday was frequent and fairly complete," said Andy Schlei, VP and divisional CIO at Sony Pictures Entertainment. Sony, a new Workday customer for human resources SaaS, is still running the service in a test environment, so its HR operations weren't affected by the outage. Still, news of it was "disappointing," said Schlei. "It's not good to have an outage like this. But, there was no data loss, there was a reasonable amount of time for recovery, and they clearly communicated the root cause."
Indeed, transparency is key to successful SaaS relationships, said Wang. At a Workday user conference on Oct. 19 in San Francisco, Wang said he talked to several Workday customers about the outage. "For the most part, customers who got those phone calls from Workday throughout the night were very understanding, because Workday was pretty up front about it," he said.
"It's like waiting in line at the airport when your flight's been cancelled. If the agent's not giving you enough information to make a decision on what you should do next, you become frustrated. It's how you handle crisis management -- do you hide information from customers, or tell them everything you have so that they can make decisions?"
Google, meanwhile, had a nearly two-hour outage on Sept. 1 that affected all of its enterprise customers, including Genentech, Hamilton Beach, and Johnson Diversey, and another later in the month that affected a small group of customers.
Google, which offers an ongoing Apps Status Dashboard on availability, notified all users in a blog post, and large enterprise customers were offered one-on-one post-mortem calls with Google executives.
Still, in the roughly 10 years that hosted software services have been available, the average uptime for any given SaaS vendor has likely been higher than what typical IT shops experience with comparable applications installed on-premises, Wang said. But when apps are in the cloud, "these providers are going to have to be responsible for a lot more," he said.
There can be varying levels of vendor responsibility, depending on what arrangements have been made. For example, companies such as Oracle offer to host software in a customer's own data center. If the customer's hardware running the software causes an outage, then the SaaS vendor might not be responsible for it.
Not all SaaS vendors are willing to offer service-level agreements, particularly for low-cost services, notes Todd McClelland, a Washington, D.C.-based attorney who represents both cloud computing customers and vendors on deals. "If you're not paying a lot, then the vendor is not going to take a whole lot of responsibility," McClelland said. "We're only seeing discussion of SLAs in very large deals."
Still, as interest in SaaS and other types of cloud computing picks up, there's more discussion of how to formalize vendor expectations around both performance and security, McClelland said.
For example, some customers are asking for vendors to show SAS 70 certification, McClelland said. That stands for Statement on Auditing Standards No. 70, Service Organizations, and was established by the American Institute of Certified Public Accountants as a way to assess how well a firm handles sensitive data. But in recent years it's been used by non-accounting firms to audit the quality of security systems, processes and controls. Google, for example, has pointed to its SAS 70 certification as proof that its cloud computing services are secure.
Companies need to "consider whether vendors should be holding and hosting certain types of customer data," said McClelland. "The Amazons and IBMs of the world will implement security measures that are greater than customers would have on their own systems," he said. "But even though large vendors' systems are more protected, there [are] more people who will want to hack into those systems because there's so much more data."