Replicating data locally or across the WAN to a disaster recovery site in real time isn't new--companies with the big bucks to implement it have been doing this for years. What has changed in the last few years is that this higher level of protection is increasingly affordable and accessible to more companies; it's now well within the reach of small businesses.
Lower initial costs for hardware, software, and services are making continuous data protection more affordable, as are technologies and services that significantly lower ongoing monthly charges. These include everything that makes it more efficient to send data across the wire, from compression and deduplication to WAN optimization.
Technologies such as compression and deduplication can decrease data traffic and significantly reduce bandwidth requirements and overall costs. While deduplication and compression allow more efficient use of available bandwidth, bandwidth itself is becoming cheaper--yet another driver in increasing the affordability of these options. Better and cheaper WAN acceleration products and technologies are also playing a role by allowing much more efficient use of available bandwidth. That enables replication across a smaller and lower-cost link. It also may allow faster recovery from a problem.
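To make the bandwidth savings concrete, here is a minimal Python sketch of deduplication followed by compression. It's an illustration under stated assumptions, not how any particular product works: the `bytes_to_send` function, the fixed 4-KB blocks, and the sample workload are all hypothetical, and real products use more sophisticated chunking and indexing.

```python
import hashlib
import zlib


def bytes_to_send(chunks, seen_hashes):
    """Estimate how many bytes cross the link for a list of data chunks.

    A chunk whose SHA-256 digest is already in seen_hashes is replaced by
    its 32-byte digest (deduplication); a new chunk is sent as its digest
    plus its zlib-compressed contents (compression).
    """
    total = 0
    for chunk in chunks:
        digest = hashlib.sha256(chunk).digest()
        if digest in seen_hashes:
            total += len(digest)  # duplicate: send only a reference
        else:
            seen_hashes.add(digest)
            total += len(digest) + len(zlib.compress(chunk))
    return total


# Hypothetical workload: 100 blocks of 4 KB with only 3 distinct patterns.
blocks = [bytes([i % 3]) * 4096 for i in range(100)]
raw_size = sum(len(b) for b in blocks)
sent = bytes_to_send(blocks, set())
print(f"raw: {raw_size} bytes, sent: {sent} bytes")
```

The made-up workload is deliberately repetitive, so the savings here are far larger than most real data would show. The point is only the mechanism: repeated blocks travel as short references, and new blocks travel compressed.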
But replication has a dark side as well. If you have corruption in data at your primary site--due to hardware or software problems, user error, or other problems--you might replicate that corruption to the secondary site as well. That's not what you're looking for from a disaster recovery system. You need something that protects against that corruption, and that's where continuous data protection, or CDP, comes in.
What is CDP? There's some disagreement as to what qualifies as true CDP and what may be more appropriately called "near CDP." The CDP group within the Storage Networking Industry Association defines it as continuously capturing or tracking "data modifications and stores changes independently of the primary data, enabling recovery points from any point in the past." CDP systems may be block-, file-, or application-based.
A true CDP approach should capture all data writes, thus continuously backing up data and eliminating backup windows. CDP uses a process known as journaling to capture and time-stamp data and provide rollback of data to a previous state such as a specific point in time or before a specific application event. If data gets corrupted, you can roll back to before the corruption happened.
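The journaling idea can be sketched in a few lines of Python. This is a toy model, not a real CDP implementation: the `Journal` class, its method names, the integer time stamps, and the block names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Append-only log of (timestamp, block_id, data) writes."""

    entries: list = field(default_factory=list)

    def record(self, timestamp, block_id, data):
        # Assumes writes are recorded in chronological order.
        self.entries.append((timestamp, block_id, data))

    def state_at(self, point_in_time):
        """Rebuild the volume by replaying every write up to point_in_time."""
        volume = {}
        for timestamp, block_id, data in self.entries:
            if timestamp <= point_in_time:
                volume[block_id] = data
        return volume


journal = Journal()
journal.record(100, "block-7", b"good data")
journal.record(200, "block-8", b"more good data")
journal.record(300, "block-7", b"CORRUPTED")

# Roll back to just before the corrupting write at time 300.
restored = journal.state_at(299)
```

Because every write is kept with its time stamp, any moment in the journal's history can serve as a recovery point. That's the distinction from snapshots, which only offer the moments at which a snapshot happened to be taken.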
CDP is the gold standard--the most comprehensive and advanced data protection. But "near CDP" technologies can deliver enough protection for many companies with less complexity and cost. For example, snapshots can provide a reasonable near-CDP level of protection for file shares, letting users directly access point-in-time copies of the data on the file share, taken at regular intervals--say, every half hour or 15 minutes. That's certainly a higher level of protection than tape-based or disk-based nightly backups and may be all you need.
The bottom line for companies with more modest budgets is that, all along this spectrum from continuous to snapshots, there are disaster recovery and business continuity technologies and services considered exotic and prohibitively expensive just a few years ago that are now within reach. With proper expectations and the right disaster recovery plan, most companies can implement something far more effective than conventional tape-based backups.
4 Tips For Doing CDP Right
We talked to vendors working with companies migrating away from tape-based backups and toward replication, snapshots, and CDP systems. They're nearly unanimous in their advice on avoiding cost overruns and project failures. Here are four tips:
1. Don't take the same approach for all data. Just because you can replicate data or implement CDP doesn't mean you should do so across the board. Classify data and applications to buy and implement only the appropriate level of disaster recovery protection. You might classify Exchange e-mail and Oracle transactional data as highly critical and implement real-time replication to a disaster recovery site but take a cheaper approach with file and print data.
2. Set the right recovery goals. Costs can skyrocket when recovery goals aren't properly aligned with business requirements. This relates to tip No. 1: Does everything need a one-hour recovery time? It's like the deductible on your insurance--you have to strike the right balance between risk and cost.
3. Use multiple technologies. For example, you might use Exchange 2010's built-in replication and disaster recovery features for e-mail, while relying on storage-based replication for other data. One caveat: Evaluate those point solutions together, as part of a complete business continuity strategy, to make sure the combination is manageable in case of a major failure. Does it create pockets of data and require a team of app experts to get fully running again?
4. Don't forget means of access. Replication and CDP don't help if employees can't reach the recovered data and applications. Will they work from a secondary location or connect remotely via VPNs? Access takes planning, training, and communication.
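Tips 1 and 2 amount to a simple classification exercise, which can be sketched in Python. Everything specific here is an assumption for illustration: the `choose_protection` function, the thresholds, and the tolerated data-loss figures assigned to each workload are hypothetical, not recommendations.

```python
def choose_protection(max_data_loss_minutes):
    """Pick the cheapest approach that meets the tolerated data loss.

    Thresholds are illustrative only; each business sets its own.
    """
    if max_data_loss_minutes < 15:
        return "real-time replication or CDP"
    if max_data_loss_minutes < 24 * 60:
        return "periodic snapshots"
    return "nightly backup"


# Hypothetical tolerances, in minutes of data the business can afford to lose.
workloads = {
    "Exchange e-mail": 0,
    "Oracle transactional data": 0,
    "file and print": 30,
}

plan = {name: choose_protection(minutes) for name, minutes in workloads.items()}
for name, approach in plan.items():
    print(f"{name}: {approach}")
```

The value of writing it down, even this crudely, is that it forces a tolerated-loss number for each class of data before any money is spent, rather than defaulting everything to the most expensive tier.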
Just 36% of companies have implemented and regularly test disaster recovery and business continuity plans, InformationWeek's 2010 State of Enterprise Storage report finds, though that's up from 28% the prior year. History shows that the more we rely on something, the more likely it is to fail at the most inopportune time. Well, at least my personal history does. With the critical importance of IT operations to most businesses, having disaster recovery and business continuity plans that cost-effectively tap the best of today's technology must be high on the priority list of well-performing IT organizations.
Behzad Behtash is an IT consultant who previously was CIO of Tetra Tech EM and VP of systems for AIG Financial Products.