Data Center Blackout In San Francisco Caused By A Bug
Backup generators at 365 Main failed to complete their start sequence because of a memory problem in the engine monitoring and control component.
When the lights went out in San Francisco last week, data center operator 365 Main's backup power generators also failed. Now the company has identified the cause of the problem: an engine monitoring and control component known as a Detroit Diesel Electronic Controller, or DDEC.
In a statement released Wednesday, 365 Main said that following the power outage last week, three of its 10 Hitec backup generators failed to complete their start sequence because of a memory problem in their DDECs.
"The team discovered a setting in the DDEC that was not allowing the component to correctly reset its memory," the company said. "Erroneous data left in the DDEC's memory subsequently caused misfiring or engine start failures when the generators were called on to start during the power outage on July 24."
In other words, the generators failed to start because of a bug.
A Detroit Diesel MTU spokesperson was not immediately available for comment.
Detroit Diesel describes its DDEC as a tool to optimize engine performance and to simplify troubleshooting, electronic diagnostics, and data extraction.
Officials with 365 Main said the company has fixed the problem by "altering the timing of a command to the DDEC component, allowing more time between the engine shutdown command and the DDEC reset command."
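The kind of timing problem 365 Main describes can be illustrated with a toy model. The sketch below is not the DDEC's actual logic, which the company did not publish. It assumes a hypothetical controller (`ToyController`) that ignores a reset command arriving too soon after shutdown, leaving stale data that blocks the next start. The class, the `settle_seconds` value, and the helper `cycle_then_start` are all invented for illustration, and time is simulated rather than measured.

```python
class ToyController:
    """Hypothetical stand-in for an engine controller; NOT the real DDEC."""

    def __init__(self, settle_seconds):
        # Assumed time the controller needs after shutdown before a reset works.
        self.settle_seconds = settle_seconds
        self.memory = []
        self.shutdown_at = None

    def run(self, now):
        # Running the engine leaves data in the controller's memory.
        self.memory.append(("run_data", now))
        self.shutdown_at = None

    def shutdown(self, now):
        self.shutdown_at = now

    def reset(self, now):
        """Clear memory only if the assumed settle time has elapsed."""
        if self.shutdown_at is None:
            return False
        if now - self.shutdown_at < self.settle_seconds:
            return False  # reset arrived too early; stale data remains
        self.memory.clear()
        return True

    def start(self):
        """In this toy model, a start succeeds only when memory is clean."""
        return not self.memory


def cycle_then_start(reset_delay, settle_seconds=5.0):
    """Run, shut down, reset after reset_delay seconds, then try to start."""
    ctrl = ToyController(settle_seconds)
    ctrl.run(now=0.0)
    ctrl.shutdown(now=10.0)
    ctrl.reset(now=10.0 + reset_delay)
    return ctrl.start()


if __name__ == "__main__":
    print("short delay starts:", cycle_then_start(reset_delay=1.0))
    print("longer delay starts:", cycle_then_start(reset_delay=6.0))
```

In this model, lengthening the gap between the shutdown and reset commands is enough to make the next start succeed, which mirrors the shape of the fix the company described, though not necessarily its details.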
Miles Kelly, VP of marketing for 365 Main, said that Hitec generators at his company's El Segundo, Calif., facility have the same DDEC. "Once we were able to diagnose the problem and test the fix, we deployed it here in San Francisco and Los Angeles," he said. "What Hitec is doing, having basically made this joint discovery with us, is they are going to be rolling out the fix to all other generators that are exposed to the bug."
Kelly said his company has been contacting other companies that use generators with this component.
Several prominent Web sites were knocked offline or experienced limited availability following the outage and the failure of 365 Main's backup power system, including AdBrite.com, CurrentTV.com, Craigslist.org, RedEnvelope.com, SecondLife.com, Six Apart's blog sites (LiveJournal.com, TypePad.com, Vox.com), Technorati.com, and Yelp.com.