Global CIO: IBM's Bank Outage: Anatomy Of A Disaster - InformationWeek


Commentary
8/4/2010
07:45 PM
Bob Evans

Global CIO: IBM's Bank Outage: Anatomy Of A Disaster

IBM personnel inadvertently triggered a 7-hour outage at Singapore's largest banking network last month by using unapproved procedures. Here's a detailed look at what went wrong.

At 2:58 a.m. on July 5, disaster struck Singapore's largest banking network and a small IBM support team that was troubleshooting a communications problem between one of the bank's storage devices and its mainframe. Here's how the bank, DBS Group, and its IT services provider, IBM, described that harrowing moment, which occurred after 40 hours of on-and-off attempts to fix the problem:

"[The on-site IBM engineer] replaced the cable using the same procedures as before. This caused errors that threatened data integrity. As a result, the storage system automatically stopped communicating with the mainframe computer, to protect the data. At this point, DBS banking services were disrupted."

Disrupted, indeed: for the following seven hours, until 10:00 a.m., DBS's customers were unable to access banking services through branches, ATMs, online banking, or mobile banking. One media report said DBS has about 1,000 ATMs and, early in 2008, had almost 1,000,000 online-banking customers.

And while bank CEO Piyush Gupta issued a long and deeply apologetic letter to DBS Group's customers eight days later, taking full responsibility for the outage and the resulting customer inconvenience and loss of trust, IBM has also been embroiled in the controversy over what happened, why it happened, and how it can be prevented in the future.

What we do know at this point is that the outage will cost DBS Group a great deal more than the negative impact on customers: the Monetary Authority of Singapore, the government agency that oversees the banking industry, has ordered DBS to set aside $230 million in regulatory capital as a result of the outage.

What is not known, at least publicly, is whether IBM will have to compensate DBS for the cost of the outage and/or related costs. Two separate media reports on a joint press conference held by DBS and IBM both said the companies would not answer questions about that possibility.


According to ChannelNewsAsia.com's coverage of the press conference, IBM regional general manager Cordelia Chung said that "the personnel directly involved with this incident have been removed from direct customer support activity and disciplined" and that "IBM has taken steps to enhance the training of all related personnel on the most current procedures."

And the BusinessTimes.com.sg article quoted Chung as saying, "We have also taken steps to review installations of the same storage system at other financial institutions in Singapore for whom we provide maintenance services."

(If I may inject an opinion here: if I were DBS CEO Gupta, I'd have very mixed feelings about that preventive-maintenance approach, which IBM has taken not only on behalf of DBS's competitors but also on the back of a very troublesome incident for DBS. That's why I would guess that, while neither company would comment on whether IBM would be compensating DBS for its role in the crash, IBM is going to be paying DBS very generously in either cash or extended and comprehensive additional services. On top of that, the removal of the involved employees from future dealings with customers, plus disciplinary action against them, plus Chung's very public apologies to both DBS and its customers, all seem to add up to a very uncomplicated admission by IBM of some level of culpability in the outage.)

The article goes on to quote Chung as describing IBM's top priority once they realized a crash had occurred:
