About 25,000 state employees were unable to access the state's IT network for several hours one day, and problems during a disaster recovery test then prevented thousands of people from completing motor vehicle transactions in state branch offices for nearly two days.
Last week was a trying one for Michigan IT officials and staff as the state experienced two IT outages--the second during a disaster recovery test--in the space of two days.
The first of the two occurred Monday, when nearly 25,000 employees were unable to access the state's IT network for about three and a half hours, Kurt Weiss, public information officer for the Michigan Department of Technology, Management, and Budget (DTMB), said in a phone interview.
While that one definitely "impacted state business," the second one--which occurred late morning on Wednesday and lasted until late Thursday afternoon--had a significant "citizen impact," preventing thousands of people from successfully completing driver's license and other motor vehicle transactions in state branch offices, he said.
The latter problem is reminiscent of an outage that affected the commonwealth of Virginia last summer, which had a major impact on DMV operations there for about a week. Northrop Grumman is paying nearly $5 million to the commonwealth for that incident.
The DTMB IT department is currently doing a root cause analysis of both incidents and plans to publish a "lessons learned" review of them once that is complete, Weiss said. No data was lost in either incident, although some data files were corrupted during the second and had to be restored through tape backup, he said.
The first problem became apparent Monday morning when state employees showed up for work and couldn't access the network. It took about two and a half hours for IT staff to realize that an upgrade over the weekend to patch security holes had gone wrong somewhere, Weiss said. Access to the network was restored by 10:30 a.m.
The second outage was more serious and occurred during a disaster recovery test Wednesday morning at about 11:30 a.m., he said. Human error severed a link between the test environment and the production environment, taking down a mainframe computer that serves the state's branch office system (BOS).
"We went from testing a disaster recovery event to having a real one," Weiss said, adding that BOS provides IT connectivity and services for 131 Michigan secretary of state branch offices.
IT officials are re-evaluating how to perform such tests in the future in light of the incident, and another test will not be performed until this study is complete, he said.
Data stored on the mainframe that was affected included the bulk of information about driver's license and motor vehicle registration in the state--what other states' DMVs would typically store and manage, he said. The Michigan secretary of state handles these types of services for the state.
Weiss said that IT staff members were able to get the mainframe up and running by Thursday morning. However, because file corruption during the outage made data-recovery operations necessary, the files on the mainframe remained inaccessible until the end of the business day Thursday, he said.
As a result of the outage, people who appeared at state branch offices to take care of driver's license and motor vehicle transactions were turned away. Weiss said the state does about 80,000 transactions per day through those offices. "We had angry citizens who traveled to their branch office to be told they weren't able to get service," he said.
To make up for the time lost, branch offices gave people unable to be served during the outage passes to go to the front of the line when they returned. Offices also remained open for two extra hours Friday and will do so Monday and Saturday as well, Weiss said. Any late fees people might incur also are being waived during this time.
Other state processes affected during the second outage included police officers' lookups of driver's license information and automobile dealerships' transfers of license plates for vehicles they sold, Weiss said.
IT outages aren't the only upheavals Michigan has seen in recent months. In October, state CIO Ken Theis resigned amid an ongoing IT consolidation effort that includes combining systems and data centers, streamlining processes, and merging departments. The state was without a new CIO until March, when--after an administration change in the governor's office--David Behen took over the job.
The mainframe that went down last week also is part of an old system that is in need of modernization, Weiss said, but Michigan's budget woes have so far prevented the state from doing the upgrades it needs. "We do need to modernize all of those applications for the secretary of state," he said.