Report Blames Northrop Grumman For Virginia Outages - InformationWeek

A review of the week-long statewide network outage said it was caused by a combination of human error, faulty hardware, and a failure to follow best practices.

Faulty hardware and Northrop Grumman's failure to follow best practices were responsible for a statewide IT system failure in Virginia last summer that affected online services and network operations for a week, according to a report on the incident released by Virginia Gov. Robert McDonnell.

The independent review -- prepared by Agilysys, an IT services firm -- found that the combination of the failure of a data storage system and then human error during an attempt to replace one of the failed memory boards caused the unprecedented outage, which affected more than 20 government agencies.

The report also faulted Northrop Grumman, which has a $2.3 billion contract to work with the Virginia Information Technologies Agency (VITA) to look after communications and computer services for the state, for not adhering to industry best practices following the incident. VITA was created in 2003 to maintain and modernize the state's IT operations.

The commonwealth's trouble began Aug. 25 when two memory boards that were meant to back each other up both failed. Analysis by EMC, the manufacturer of the boards, said a so-called "electrical over stress condition at the component level" caused the dual failure, which resulted in a loss of data.

Following that, "human error during the memory board replacement process resulted in the incurred extended outage," according to the report.

The outage also was exacerbated by a gap in the Information Technology Service Continuity Management (ITSCM) processes, which resulted in the spread of corrupt data. Lack of a continuity procedure also was one of the reasons it took 18 hours to get the system back up and running, according to the report. Full service to all affected operations and agencies did not return until about a week later.

Specifically, parties responsible for responding to the incident did not suspend what's called Symmetrix Remote Data Facility (SRDF) before the memory board replacement process, which "negatively impacted the data recovery procedures" and allowed corrupt data to be replicated.

SRDF is a process used to replicate data from a local storage array to a remote storage array. The report cites Northrop Grumman as the responsible party for managing risk during the SRDF process.
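
The reason suspending replication matters can be shown with a small Python sketch. This is a toy model written for illustration only, not EMC's SRDF interface: the `Array` and `ReplicatedPair` classes, the `suspend` method, and the `simulate` function are all hypothetical names, and the "corrupt write" simply stands in for whatever bad data the faulty board replacement produced.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """Toy storage array: maps a block id to its contents (hypothetical)."""
    blocks: dict = field(default_factory=dict)


@dataclass
class ReplicatedPair:
    """A local array mirrored to a remote array over a replication link."""
    local: Array
    remote: Array
    link_active: bool = True

    def suspend(self) -> None:
        # Stop forwarding writes to the remote array.
        self.link_active = False

    def write(self, block_id: int, data: str) -> None:
        # Every local write is copied to the remote side while the link is up.
        self.local.blocks[block_id] = data
        if self.link_active:
            self.remote.blocks[block_id] = data


def simulate(suspend_first: bool) -> str:
    """Return the remote copy of block 0 after a faulty maintenance write."""
    pair = ReplicatedPair(Array(), Array())
    pair.write(0, "good")        # normal operation: both sides hold good data
    if suspend_first:
        pair.suspend()           # the step the report says was skipped
    pair.write(0, "CORRUPT")     # stand-in for corruption during the repair
    return pair.remote.blocks[0]


if __name__ == "__main__":
    print("replication suspended first:", simulate(True))
    print("replication left running:   ", simulate(False))
```

In this simplified model, the remote array keeps a clean copy only when replication is suspended before the repair. When the link is left running, the bad data is copied across, which mirrors the report's finding that corrupt data was replicated.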

Northrop Grumman spokeswoman Christy Whitman said the company has been "working hard" since the outage to "make the appropriate improvements to help avoid or mitigate similar disruptions."

The company also is ready to talk with Virginia officials about how best to implement report recommendations, she added.

It's still not known how much the outage will cost the commonwealth, or whether and how Northrop’s relationship with VITA will be affected. State officials long have criticized the partnership, which has had its troubles over the years.
