How Columbia Sportswear Will Survive Next Tsunami: Cloud


The international sportswear maker is pursuing a hybrid cloud disaster recovery plan to avoid another data center shutdown like the one it suffered in the 2011 Japan tsunami.




Columbia Sportswear, the $1.7 billion-revenue outdoor clothing retailer, learned a hard lesson in March 2011 when a tsunami struck the Fukushima region of Japan. Its data center in Tokyo remained intact, but soon stopped operating due to frequent electrical power interruptions.

"A clothing manufacturer is not high on the list of those getting emergency diesel power," noted Michael Leeper, director of global technology for Columbia. IT staff in Tokyo had access via the Internet to facilities in other parts of the country and the world. It would have been able to reach them, if it had duplicate systems located elsewhere. But "the country was going through turmoil as we still tried to conduct business. We couldn't sustain power. We had to shut off the equipment for days at a time," he recalled in an interview.

As it was, even if the team had found a site in Japan with steady power, it would have had to retrieve data off tapes. If a three-petabyte data transfer took several days, the Tokyo data center was likely to be disrupted somewhere in the middle of it. The lesson sank in irrevocably. "We didn't realistically have a disaster recovery plan ... we'd have to pray that the data came back off the tape," he said.
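A quick back-of-envelope calculation shows why a restore at that scale stretches into days or weeks. The link speeds below are illustrative assumptions, not figures from Columbia:

```python
# Rough restore-time estimate for a 3 PB data set.
# Link speeds are illustrative assumptions, not Columbia's actual figures.

DATA_BITS = 3 * 10**15 * 8  # 3 petabytes expressed in bits

for label, gbps in [("1 Gbps", 1), ("10 Gbps", 10), ("100 Gbps", 100)]:
    days = DATA_BITS / (gbps * 10**9) / 86_400
    print(f"{label:>8} link: {days:6.1f} days")

# ->   1 Gbps link:  277.8 days
#     10 Gbps link:   27.8 days
#    100 Gbps link:    2.8 days
```

Even on an implausibly fast dedicated link, the transfer takes days, which is exactly the window in which an unstable site is likely to fail again.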

[ Want to learn more about how VMware competes with Microsoft to become the supplier of on-premises cloud operations? See Microsoft Vs. VMware: Who'll Be Private Cloud King? ]

Leeper, however, is part of the generation of managers who have enthusiastically embraced virtualization. In the U.S., Columbia has two data centers, one in its headquarters city of Portland, Ore., and another in Denver, with one facility serving as the recovery center for the other. Leeper has pushed VMware virtualization deep into the heart of the U.S. data centers, moving from about 15% virtualized to 96% in about nine months. It's routine to move virtual machines from one rack to another via live migration, known as vMotion. But in Japan, he had no way for his systems to escape the aftermath of the tsunami.

"We use vMotion to move servers 30 feet, 50 feet, or 100 feet in the data center. Why not use it to move them hundreds of miles?" he asked. In effect, he was looking for a hybrid cloud style of operation where the cloud could take up the slack for the 3-4 weeks needed to get a company facility reliably operational again. That would require a smooth transfer of operational responsibility from one site to another. It would also require an up-to-date data stream that could be shut down at one site and resurrected at another.

Leeper talked to large cloud providers, including Terremark (a unit of Verizon) and Savvis (a unit of CenturyLink), about using them as disaster recovery sites in the U.S. "They wanted us to bring them our workloads to run in their data centers. Then we'd set up disaster recovery. They didn't understand what we were talking about," he recalled.

Columbia is a VMware ESX Server and vSphere user. Leeper found a smaller, VMware-compatible regional infrastructure-as-a-service provider in Tier 3, a VMware partner based in nearby Seattle. Columbia experimented with the company and found that, using VMware vCloud Suite with vSphere, vCloud Director, and vCenter Site Recovery Manager, Tier 3 functioned well as a hosting service at which Columbia systems could be quickly brought up.

But for Tier 3 to function as a temporary disaster recovery site, Columbia needed a way to replicate business data to it continuously but cheaply. Without that, it might not succeed in pulling everything needed off of tapes. Even if all the taped data materialized, there would inevitably be a gap: data missing from the day of the disaster, or possibly from the several days since the last tape was made, delaying the business's ability to reopen all operations.

Columbia likes the concept of a cloud facility providing a backup and recovery center, but Leeper's staff is studying how to replicate data to it at a price that Columbia considers acceptable. "How do we seed the data to this site before we need the disaster recovery?" he asked.
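One common approach to that seeding problem is block-level delta replication: ship the full data set once, then hash each block and send only the blocks whose hashes have changed since the last sync. The sketch below is a generic illustration of the idea, not Columbia's or Tier 3's actual tooling:

```python
# Generic block-delta replication sketch -- illustrates the "seed once,
# then ship only changes" idea; not Columbia's or Tier 3's actual tooling.

import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB blocks (an arbitrary choice)

def block_hashes(path):
    """Hash a file block by block."""
    hashes = []
    with open(path, "rb") as f:
        while chunk := f.read(BLOCK_SIZE):
            hashes.append(hashlib.sha256(chunk).digest())
    return hashes

def changed_blocks(path, previous_hashes):
    """Return the indexes of blocks that differ from the last sync."""
    current = block_hashes(path)
    return [i for i, h in enumerate(current)
            if i >= len(previous_hashes) or h != previous_hashes[i]]

# After the initial seeding (the one-time, expensive full copy), each sync
# transfers only the changed blocks -- typically a tiny fraction of the total.
```

Rsync and many storage-replication products use variations of this scheme; the open question for Columbia is doing it across petabytes at a price it can live with.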

For that matter, even if Columbia created such a facility at Tier 3 tomorrow, "they don't quite have the global reach we need," Leeper added, thinking of his Japan operation. That is, the latency imposed by the distance between Seattle and Tokyo would mean greater processing delays than the company wants to put up with. If Columbia is going to move disaster recovery to the cloud, the provider needs facilities in the parts of the world that are key to Columbia. Leeper also wants assurance that Columbia's backup and recovery systems can be moved from one site to another, since no one can be sure where disaster will strike next. And he doesn't want to be forced to establish a full disaster recovery site in each part of the world where Columbia operates.
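Distance alone puts a hard floor under that latency, since light in fiber travels at roughly two-thirds of its speed in a vacuum. The figures below are rough estimates, not measurements of any provider's network:

```python
# Propagation-delay floor for a Seattle-Tokyo link.
# Distance and fiber speed are approximations, not measured values.

GREAT_CIRCLE_KM = 7_700      # approximate Seattle-to-Tokyo distance
FIBER_KM_PER_SEC = 200_000   # light in fiber: roughly 2/3 of c

one_way_ms = GREAT_CIRCLE_KM / FIBER_KM_PER_SEC * 1000
print(f"one-way: {one_way_ms:.0f} ms, round trip: {2 * one_way_ms:.0f} ms")
# -> one-way: 38 ms, round trip: 77 ms -- and that's before routing,
#    queuing, and protocol overhead push real-world figures well past 100 ms.
```

No engineering can buy that delay back, which is why Leeper wants a provider with facilities near each region Columbia serves rather than one distant hub.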

Columbia continues to use Tier 3--"they're a great partner," said Leeper--but he is trying to find a less expensive way to replicate data to the cloud than those available today, one that can move data between his data centers or between the cloud data centers that his firm has chosen. He knows hybrid cloud will work for his company, because Columbia's staff has done enough experimentation at Tier 3 in moving recovery systems around to provide proof of concept.

If a disaster occurred in Tokyo, he'd like the option of being able to move his systems at a moment's notice to Shanghai, where replicated data sits waiting to run. He knows that's a goal that may not be far off, but it's still something that he can't do today.

Columbia has 4,000 employees and 50 locations around the world, along with additional data centers in Hong Kong and Strasbourg, France. The concept of hybrid cloud would give it the ability to recover from any type of disaster in any location, instead of setting up duplicate sites near each one.

Leeper wants to be able to move workloads to a temporary site in the cloud, recover--or even move--an existing data center to a new location, and bring it back up again "without end users knowing there's been an outage, except for a 15-20 second pause in their systems." He knows it's possible as soon as he and the staff solve the frequent data replication piece.

Charles Babcock is an editor-at-large for InformationWeek.





