InformationWeek: The Business Value of Technology

Disaster Recovery

DISASTER STRIKES! ARE YOU READY?

Years of work can be lost in seconds. The difference between sinking and surviving in business depends partly on how well you're prepared for the unexpected. If disaster struck today, how would your company do?


By Barbara DePompa
Issue date: May 15, 1995

From her desk at Penn Mutual headquarters in Horsham, Pa., Patricia Bennett spends all day trying to make sure the insurer doesn't repeat the mistakes of 1989. Back then, a nine-alarm fire destroyed the company's downtown Philadelphia data center.

Bennett, Penn Mutual's director of disaster-recovery planning, remembers all too well when firefighters sent 4 million gallons of water surging through its seventh-floor computer room.

Disaster, in the form of tornado, flood, earthquake, or wanton violence such as the recent bombing in Oklahoma City, strikes haphazardly and without warning.

But there can be nothing random about planning for business recovery. Central control of the disaster-recovery process is essential to keeping businesses operating and to providing at least a minimal level of service to their customers.

This is true especially for companies such as Penn Mutual. The insurer manages $7.1 billion in assets with both distributed client-server technologies and legacy systems. In these companies, the central information systems department must drive the development of a viable recovery plan for the entire organization. The trouble is that disaster recovery in client-server environments is far more complicated than it is for a mainframe data center.

And because companies use technologies produced by several vendors in a distributed computing environment, they create multiple points of failure. That can magnify the scope and severity of the problem.

Make The Commitment
Disaster-recovery vendors, including Comdisco Disaster Recovery Services, a division of Comdisco Inc. in Rosemont, Ill., and SunGard Recovery Services Inc. in Atlanta, create recovery plans that include a central "hot" or "cold" disaster recovery center. But they, along with recovery-systems providers such as IBM, Hewlett-Packard, and Wang Laboratories, admit it's difficult to sell client-server restoration services to businesses. The reason: Duplicating distributed computing environments is expensive.

Ultimately, the internal IS organization must formulate plans to recover technology and business processes that occur outside the data center. "Responsibility for disaster-recovery planning still falls to IS 99% of the time," says Claude Brazell, U.S. program manager for business-recovery services at HP in Santa Clara, Calif.

But central disaster recovery is a difficult prospect in an era when central IS organizations have little control, or only shared control, over their companies' distributed technology resources. "We [at HP] end up being able to solve the server problem, but client systems are very difficult to bring into the equation," says Brazell.

There's even resistance in some central IS quarters to assuming responsibility for client-server disaster recovery. When IBM consulting instructor Glenn Anderson told IBM's Share user group meeting in Los Angeles last March that handling disaster recovery outside the data center wasn't the responsibility of the central IS organization, he was lustily applauded by an audience of 50 executives who ran their companies' mainframes.

The IS organization "can't control the world," Anderson told the audience. "While it must maintain the integrity of corporate operational data, MVS systems can't be responsible for the backup of ThinkPads."

In spite of these pockets of resistance, most central IS organizations recognize that they need to come to grips with the problem. "We are no longer able to avoid technology," says Penn Mutual's disaster-recovery director Bennett. "After a disaster strikes, it's often impossible to do manually what is now being done by computer."

Forming Teams
To prepare for disasters, business and technology planners are teaming up to examine where critical information is stored. They are turning to software tools and consultants to help them construct ways to protect valuable data.

Many are pulling servers back into the data center to better manage and protect information and regain control over security, data integrity, and asset tracking. But users and analysts say the market for tools to back up and restore multiplatform environments to a single, synchronized point in time is too immature. "The responsibility for backup and recovery falls almost entirely on each organization's shoulders," says Jeff Marinstein, president of Contingency Planning Research Inc., a disaster-recovery consultancy in Jericho, N.Y.

Indeed, disaster recovery in client-server environments is a business problem that requires a technology solution. Many organizations are either drafting or converting a key technology manager to the role of disaster-recovery coordinator.

That's what happened to Floyd Rowe, the disaster-recovery manager at Frank Russell Co., a financial-consulting and investment firm that manages more than $500 billion in assets and employs more than 1,000 people. In 1990, the company decided to include client-server systems in its disaster-recovery planning.

But Rowe, a senior technical analyst at the time, feared the company's mainframe hot-site provider couldn't provide adequate backup and recovery of Frank Russell's distributed systems.

Centralizing Systems
At its 12-story headquarters in Tacoma, Wash., Frank Russell runs a TCP/IP fiber-optic backbone network with copper wiring into each floor. Currently, 900 users access a Novell NetWare 3.1 network that is being upgraded to NetWare 4.1. The traditional disaster-recovery approach worked for the company's IBM S/390 Model 9121 mainframe, used for number crunching and front-end communications to data feeds from the East Coast.

But what about its five Wang VS minicomputers or seven IBM RS/6000 servers running the Sybase relational database management system? The Wang systems run several critical applications, including a customer information system. Rowe had already centralized all servers at company headquarters. And the data that resides on LANs is backed up by computer operators as part of the company's normal daily procedure.

Because it relies so heavily on Wang computers, Frank Russell contracted for the services of a Wang mobile-recovery truck with two VS systems, an RS/6000 server, a large Novell file server, and 20 PCs.

The unit also contains a router and multiple modems for dial-up links. The financial-services firm signed its client-server recovery contract with Wang in February 1994 and another with Comdisco for mainframe hot-site recovery in October.

But before Rowe could spend the money on client-server contingency planning, he had to convince management that the company needed better disaster recovery. He set up visits to other companies with better recovery plans. "This enabled our management to see a gap between [other companies'] plans and ours," says Rowe. "That led to the newly revised plan and test rehearsals now under way."

Rowe found out that management support for disaster recovery is easier to win if technology planners first identify critical information and then document the impact of a loss of that information over time.

Once management sees exactly what is at stake, designing plans that will protect all forms of data in an organization becomes easier. Rowe, who is now the disaster-recovery manager at Frank Russell, has a full-time staff member dedicated exclusively to business-continuation planning.
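
One way to document the impact of a loss over time is a simple table of estimated cost per day of outage for each system. The following is a minimal sketch in Python, not Frank Russell's actual method. The system labels echo the platforms named in the article, but every dollar figure and both helper functions (`cumulative_loss`, `rank_by_impact`) are hypothetical placeholders, and the flat per-day rate is a simplifying assumption.

```python
# Hypothetical estimated loss per day of outage, in dollars.
# These figures are illustrative assumptions, not data from Frank Russell.
DAILY_LOSS = {
    "mainframe": 50_000,
    "wang_vs": 30_000,
    "rs6000_sybase": 20_000,
    "netware_lan": 5_000,
}


def cumulative_loss(daily_loss, days):
    """Return the total estimated loss per system after `days` of outage."""
    return {system: rate * days for system, rate in daily_loss.items()}


def rank_by_impact(daily_loss, days):
    """List (system, loss) pairs from most to least costly for `days` days."""
    totals = cumulative_loss(daily_loss, days)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for days in (1, 3, 7):
        print(f"After {days} day(s) of outage:")
        for system, loss in rank_by_impact(DAILY_LOSS, days):
            print(f"  {system:<15} ${loss:>10,}")
```

In practice the loss rarely grows in a straight line, so a planner would likely replace the flat daily rate with estimates gathered from each department.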

IS planners are beginning to understand that technology is just a means to an end. That end is recovering critical business operations. "It's like getting a flat tire on the way to your wedding," notes Penn Mutual's Bennett. "It may take too much time to change the tire and drive the car to the service. It may be a lot easier to hop in another car and go back to fix the tire later."

Size Doesn't Matter

Analysts and disaster-recovery services providers note that, regardless of how large a system is, planners should focus on how to implement the necessary protection. They also must determine who needs to be trained to protect various systems.

On one level, all workers must be made aware of the dangers of losing critical information. Notes consultant Marinstein: "The big joke is if you can get them to back up their PCs, they will leave the diskettes on top of the system, rendering the backup useless if they can't get back into the building after a disaster occurs."

A good disaster-recovery plan takes into account every department and how each communicates with the other. Understanding workflow is essential. At Penn Mutual, there are now disaster-recovery plans within each department--more than 100--covering the operations of some 10,000 networked PCs.

Whether a company operates 10,000 PCs or only 10, there's no simple guide to client-server disaster recovery. "What's good for me in services or software may not do the job at all in any other organization," says Bennett.

Still, there are some aids. The Systems Audit Group Inc., a consulting firm in Newton, Mass., publishes a guide called the Disaster Recovery Yellow Pages. This three-ring binder, now in its fourth edition, provides a list of consulting services, hot sites, mobile vans, and emergency equipment sources. It also lists software that businesses can use to help them plan business and data recovery.

These applications, marketed by SunGard and other disaster-recovery providers, help businesses choose what's most essential to recover. The software has proven key to the growth of SunGard's client-server recovery business. More than half of all new business comes from client-server rather than from traditional data center recovery contracts, says William Beaumont, a SunGard senior VP.

All disaster-recovery vendors say prevention is better than any cure. In disaster-recovery circles, that translates to reducing risk. Critical data and applications must be moved to where they can be protected. Traditional mainframe environments consolidated the mechanics of dealing with crises. But in distributed client-server computing, there are many ways--and many points--of failure. A disaster-recovery coordinator or other IS professional often doesn't control the data.

First Aid First
Paul Kirvan, director of telecommunications at the Mt. Sinai Medical Center, a 1,100-bed teaching hospital in New York, has been at his job for less than six months. But he's already in the hot seat, trying to create an adequate recovery plan for an organization that operates a three-tiered client-server system for a campus that's spread over three miles.

Other than basic precautions of backing up the hospital's Amdahl mainframe and storing tapes off site, little had been done to protect the organization's minicomputers and PC LANs. Mt. Sinai Medical runs clinical, administrative, and financial applications on Digital Equipment VAX servers, Stratus fault-tolerant systems, IBM AS/400 minicomputers, and various 486 PCs. In addition, the hospital operates close to 100 LANs over token-ring or Ethernet protocols.

Kirvan is tying together those networks and building fault tolerance into the campus's IBM Systems Network Architecture and TCP/IP fiber-optic backbone. The idea, explains Kirvan, is that if any part of a network fails, "we have a way to recover and reroute information."

For that he's using an IBM NetView/6000 network management tool on an RS/6000 server. The tool helps Kirvan monitor network operations and track problems.

Management will be further simplified as more hospital employees are linked over the campuswide wide area network. Of the organization's 12,000 employees, close to half will be joined together by year's end.

By that time, the hospital hopes to define what Kirvan calls a "master disaster" recovery plan to protect all minicomputer and client-server networks. By mid-1996, Kirvan expects that each part of the hospital will have in place its own departmental recovery plan. "We've got to get to 100% uninterrupted service uptime," he says. "That's critical to a major medical center."

Practice And More Practice
A key component of any disaster-recovery plan is practice. Rehearsals do more to educate staff, sell the process, and stress-test the plan than documentation or cajoling by an IS professional. In the most effective client-server disaster rehearsals, organizations work with live data and real situations.

Penn Mutual, for instance, tests backup procedures for its 800 customer service line twice a year. "It's not cheap," admits Bennett. "But it's not as expensive as the cost of recovery without a well-rehearsed plan."

Frank Russell was due to conduct the first integrated test of both its client-server and mainframe recovery plans in May. The effort will cost as much as $10,000, including the price of declaring a disaster to both Wang and Comdisco. Other costs include shipping the tapes to disaster-recovery facilities and housing and feeding the personnel involved in the test. The company's entire annual disaster-recovery budget is $120,000.
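
To put those figures in proportion, the arithmetic can be sketched in a few lines of Python. The two dollar amounts come from the paragraph above; the `budget_share` helper is just an illustrative name.

```python
# Figures quoted in the article.
TEST_COST_MAX = 10_000      # upper bound on the integrated test cost, in dollars
ANNUAL_DR_BUDGET = 120_000  # entire annual disaster-recovery budget, in dollars


def budget_share(cost, budget):
    """Return `cost` as a percentage of `budget`."""
    return 100 * cost / budget


share = budget_share(TEST_COST_MAX, ANNUAL_DR_BUDGET)
print(f"One integrated test: up to {share:.1f}% of the annual budget")
```

At the $10,000 upper bound, a single integrated rehearsal comes to roughly one-twelfth of the year's disaster-recovery budget.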

The Comdisco mainframe hot site, in San Ramon, Calif., is close to 1,000 miles from Frank Russell's offices. But the mobile unit can be set up anywhere within a two-mile radius of the hot site. The team can connect the mainframe, Wang VS, and RS/6000 systems over wireless links.

Training occurs at many levels. Frank Russell senior IS executives will update and maintain disaster-recovery plans. Department managers will be asked to incorporate business-continuity planning as a part of their jobs. And users will be encouraged to store all critical information at daily intervals on the LAN servers.

Difficult, But Worth It
The challenge to protect client-server environments is likely to get more complicated before it gets any easier. One of the saddest statistics that underline the need for better business-recovery planning came from Survive! magazine, published by Survive Inc., a Morristown, N.J., association of disaster-recovery coordinators.

The magazine reported that of 350 businesses operating in the World Trade Center before the February 1993 bombing there, 150 were out of business a year later. Most did not operate large legacy systems, working instead on minicomputer- or PC-based networks that were tied to off-site data centers. Many were unable to restore their businesses simply because they couldn't reenter the building for several days following the bombing.

As the tragic events in Oklahoma demonstrate, many disasters can't be anticipated, let alone prevented. But businesses can prevent the interruptions that so frequently follow calamity. IS managers who support businesses that operate client-server systems must realize that no matter what's at stake, leading the recovery effort is their responsibility.
