Disaster recovery is a topic we tend to avoid, as it will never be a strategic, revenue-generating project. Few CIOs get out of bed jazzed about implementing a DR plan. Yet it's difficult for those same CIOs to go to bed knowing that a critical system is one click away from disaster.
Years ago, it wasn't out of the ordinary for a company to spend a million dollars to protect itself against an outage that might cost $100,000. Since then, the cost of building a sound DR plan has come down to earth, and we can thank public cloud providers for that.
Private clouds are relatively easy to build today, depending on your needs. The cost of storage is coming down. Large Internet pipes are much cheaper than they were even three years ago. Co-location facilities are readily available in urban areas and are fairly priced. And, most importantly, public cloud providers are adding options almost daily and are competing for your business against 100 other public cloud players. Market leaders such as Riverbed and F5 have brought to market products that make leveraging public cloud services easier than ever.
For example, Riverbed has beefed up its Whitewater cloud storage line to include improved capabilities for accessing extremely low-cost public cloud offerings such as Amazon Glacier. F5 continues to bolster its BIG-IP line to support public cloudbursting and public cloud load balancing. In the public cloud, you pay for what you use, and orchestration features like those can help customers spin up and spin down resources.
Having lots of options is obviously a good problem to have. But your decision to rent or buy your DR capacity isn't based on cost alone. As an IT department, you will be judged on how well your DR plan works, not on how cheap it is. And if using a public cloud for DR doesn't meet your recovery service-level agreements, then you might have no choice but to buy. However, if you're paying millions of dollars per year for redundant systems, and you can creatively leverage the public cloud and still meet your SLAs, then the potential savings of renting are too alluring to ignore.
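The ordering of that reasoning (recovery SLA first, cost second) can be sketched in a few lines of Python. This is only an illustration, not a sizing tool: the DrOption class, the pick_dr_option function and every dollar figure below are hypothetical and do not come from any company discussed in this article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DrOption:
    name: str
    annual_cost: float        # hypothetical figure, in dollars per year
    meets_recovery_sla: bool  # result of your own DR testing, not a vendor claim


def pick_dr_option(options: List[DrOption]) -> Optional[DrOption]:
    """Return the cheapest option that meets the recovery SLA, or None."""
    # Step 1: anything that misses the recovery SLA is out, however cheap.
    viable = [o for o in options if o.meets_recovery_sla]
    if not viable:
        return None
    # Step 2: only among the survivors does cost decide.
    return min(viable, key=lambda o: o.annual_cost)


# Illustrative numbers only.
options = [
    DrOption("buy: redundant datacenter", 2_000_000, True),
    DrOption("rent: public cloud DR", 400_000, True),
    DrOption("rent: cold archive only", 50_000, False),
]
choice = pick_dr_option(options)
print(choice.name if choice else "no option meets the SLA")
```

With these made-up inputs the cheapest option is rejected because it misses the SLA, and renting wins over buying only because it passes the same test.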
DR planning is an evolving animal. As new business-critical software is deployed or moved between datacenters, IT must always be thinking about how to keep the lights on in the event of an outage. One of the great characteristics of DR planning is that you can modify your decisions quickly without losing a lot of time and money. Think about it: If things go sideways 12-24 months into a large SAP or other software deployment, there's not a whole lot you can do to shift gears quickly.
But on the infrastructure and DR side of the house, your investments in a private cloud aren't throwaway costs, because they can be used to serve other business needs if you decide to leverage the public cloud.
In this article, we'll explore some of the major factors involved in a rent-versus-buy decision for DR and business continuity at an actual midsized company, which we'll call Company X. We asked an IT director at Company X to discuss the most important variables in mixing public cloud resources into its DR strategy. His answers show just how complex that rent-versus-buy decision can be.
Disaster recovery plan scoping
The first thing Company X did five years ago was inventory the systems and applications that were critical to the business.
Company X defined a system as "critical" if an unplanned outage would have a significant, companywide impact on the organization's ability to operate. All the executives at Company X agreed that IT should design a DR strategy for critical systems that included a 99.9% uptime SLA. That translates into roughly eight hours and 46 minutes of unplanned downtime per year -- an achievable goal. Some of the IT assets that fell into the critical category were Internet access, core network access, the corporate messaging environment, access to shared file systems, perimeter and remote access gateways, and certain business applications.
Everything else, including access to department-specific applications that weren't generating revenue directly, fell into a Tier 2 category, where 99% uptime was satisfactory. Five years ago, Company X wasn't prepared to use the public cloud. However, public cloud options have matured to the point where Company X is now looking at them.
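The two uptime targets above convert into annual downtime budgets with simple arithmetic. Here is a minimal Python sketch of that conversion; the function name downtime_budget and the assumption of a 365-day year are ours, not Company X's.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year (8,760 hours)


def downtime_budget(uptime_percent, hours_per_year=HOURS_PER_YEAR):
    """Return the allowed downtime per year as (hours, minutes).

    The result is rounded to the nearest whole minute.
    """
    downtime_fraction = (100 - uptime_percent) / 100
    total_minutes = round(hours_per_year * 60 * downtime_fraction)
    return divmod(total_minutes, 60)


for tier, target in (("Critical", 99.9), ("Tier 2", 99)):
    hours, minutes = downtime_budget(target)
    print(f"{tier}: {target}% uptime allows {hours} h {minutes} min of downtime per year")
```

Under that assumption, 99.9% works out to 8 hours and 46 minutes per year, matching the figure quoted for critical systems, while the 99% Tier 2 target allows 87 hours and 36 minutes, or about 3.65 days.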