Inside A Bank's Cloud-Based Disaster Recovery Plan
Seattle Bank uses HotLink technology to spin up applications in AWS in case of disaster in the bank's own data center.
In June 2013, Scott McGillivray joined Seattle Bank as VP, Director of Information Technology. The regional community bank, which has five branches in the Seattle area and offers financial services for individuals and businesses, had recently closed down a secondary data center at a colocation facility. The previous CIO had wanted to consolidate operations in the company’s headquarters.
“It’s convenient to reboot a server,” said McGillivray of having one data center in the HQ, “but it doesn’t do much for disaster recovery.” The bank is required by law to have geographical redundancy to maintain business-critical systems, so one of his first tasks was to figure out how to bolster the bank’s DR capabilities.
“It’s foolish to build a [second] data center at this point,” said McGillivray. “Unless you’re doing enormous scale, there’s no reason to deal with cooling, redundant power, and so on.” His instinct was to look at a provider to build a DR site in the cloud.
There was no shortage of choices, including Amazon, Microsoft and Rackspace. However, he found that while he could copy data to the cloud, it would be difficult to get an application up and running if his primary site failed.
“Nobody had a good competent offering to build a secondary DR site,” said McGillivray. “They can run stuff in the cloud, but no one was talking about shifting workloads.”
After some searching, McGillivray came across the startup HotLink. HotLink’s technology lets IT manage multiple hypervisors inside VMware’s vCenter management software. HotLink’s transformation engine maps the capabilities of Hyper-V, Xen, KVM and Amazon’s AMI so that these platforms appear inside vCenter. Third-party VMs can be spun up, spun down, monitored and moved just as if they were vSphere VMs.
In addition, HotLink has a product called DR Express that integrates Amazon Web Services with vCenter, letting customers replicate and restore virtual machines from a premises data center into Amazon. Because his IT staff already used VMware hypervisors and vCenter, he liked being able to leverage that infrastructure for the DR project.
McGillivray decided to try HotLink and AWS to provide disaster recovery for some of the bank’s internal applications, including file and database services that support the bank’s accounting and mortgage systems.
VMware hadn’t announced its own hybrid cloud initiative at the time McGillivray started this project. “That might be a better solution in the long run,” he said. “Right now, the scope that Amazon has and the size of the organization, makes it a safer bet from a risk management standpoint.”
At the time of our interview, the company was in the middle of using DR Express to replicate data into Amazon. McGillivray said the process was straightforward, though not without issues. “You deploy the tool in your VMware environment. You pick your RPO and replication starts, you snapshot it, and send it over [to the cloud]. That part works smoothly.”
Once all the data was replicated from his premises to the cloud, it was fairly easy to bring up a new server to run the application off premises. “We can be up and running on a server [in the cloud] in 20 minutes, often quicker,” he said.
However, the size of the VMs in his own data center was a challenge; he couldn’t move an entire VM, so he had to break it into smaller pieces. “Instead of having everything in one big container, we’ve created separate VMDKs and cut them down to 250 or 300Gbytes that can be snapshotted,” he said.
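The disk-splitting approach McGillivray describes can be illustrated with a little arithmetic. The sketch below is only an illustration, not HotLink's or the bank's actual tooling: the function name `plan_vmdks` and the 1,000-Gbyte example volume are hypothetical, and the 300-Gbyte cap is taken from the upper end of the range he quotes.

```python
import math
from typing import List


def plan_vmdks(total_gb: float, max_vmdk_gb: float = 300.0) -> List[float]:
    """Split a volume into the fewest equal-size disks that each fit under the cap.

    Hypothetical helper for illustration only; sizes are in Gbytes.
    """
    if total_gb <= 0 or max_vmdk_gb <= 0:
        raise ValueError("sizes must be positive")
    # Fewest disks that keep every disk at or below the cap.
    count = math.ceil(total_gb / max_vmdk_gb)
    # Spread the data evenly so no single disk dominates snapshot time.
    size = total_gb / count
    return [size] * count


if __name__ == "__main__":
    # Assumed example: a 1,000-Gbyte volume with a 300-Gbyte cap per VMDK.
    plan = plan_vmdks(1000, 300)
    print(len(plan), "disks of", plan[0], "Gbytes each")
```

Under those assumed numbers, the volume splits into four disks of 250 Gbytes, which lands inside the range he mentions.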
He also noted that it’s easier to get his workloads into AWS than it is to get them back out. “It’s not two-way replication,” he said. “In order to shift a workload back to my host here, it requires shutting it down and exporting that VM back to here before I can spin it back up. It would be really nice if there was a method for bi-directional replication where if I sent a workload to AWS, HotLink would say ‘Oh you’re running over there, let’s reverse the snapshotting.’”
For the future, he said he’d like to see HotLink integrate with Amazon’s storage gateway. “Instead of creating VMDK containers, it would be great to mount an iSCSI target on S3 and pick up and run.”
In terms of cost, McGillivray is watching his Amazon usage closely. “Storing the data out there is not an expensive process--it’s running the VMs. The big cost with AWS is when you turn on a machine. It’s another reason why this isn’t an active-active solution, it’s for DR for when I need it.”
As for HotLink, he noted “I’m not getting a screaming good deal, but when I look at what I’m paying for HotLink, plus storage, plus running a server now and then, it’s almost an order of magnitude cheaper to use this than to maintain a secondary co-lo somewhere.”
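The shape of that cost argument (cheap storage, expensive running VMs, plus a fixed license) can be sketched as a simple model. Every number below is a made-up placeholder, not a figure from Seattle Bank, AWS, or HotLink, and the function name `monthly_dr_cost` is hypothetical; the point is only to show why a standby DR site that runs VMs for a few hours a month costs far less than keeping them on around the clock.

```python
def monthly_dr_cost(stored_gb, storage_rate_per_gb, vm_hours,
                    vm_rate_per_hour, license_fee):
    """Return a breakdown of one month's DR spend.

    All rates are caller-supplied assumptions; nothing here reflects real pricing.
    """
    storage = stored_gb * storage_rate_per_gb   # paid whether or not VMs run
    compute = vm_hours * vm_rate_per_hour       # paid only while VMs are on
    total = storage + compute + license_fee
    return {"storage": storage, "compute": compute,
            "license": license_fee, "total": total}


if __name__ == "__main__":
    # Placeholder inputs chosen only to make the comparison visible.
    common = dict(stored_gb=2000, storage_rate_per_gb=0.05,
                  vm_rate_per_hour=0.50, license_fee=500)
    standby = monthly_dr_cost(vm_hours=4, **common)      # occasional DR test
    always_on = monthly_dr_cost(vm_hours=730, **common)  # VM on all month
    print("standby:", standby)
    print("always on:", always_on)
```

With any plausible inputs the storage and license lines stay the same in both cases, and only the compute line grows with hours of use, which is the distinction McGillivray draws between DR on demand and an active-active setup.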
“The financial industry is a little weird in that regulations are still playing catchup to what’s available and possible in the cloud,” said McGillivray. “I’m trying to take a cautious, prudent approach. I could have to change everything next year because a bank got hacked after putting stuff in Amazon.” At the same time, he also has to get things done.
“What I come back to is that I’m not running two sites. I’m running a DR site that I can send work to if the primary goes down. In a bank, you can’t have a theoretical DR plan.”
Drew is formerly editor of Network Computing and currently director of content and community for Interop.