10/29/2013
Federal Agencies Advised To Brace For Big Data

Big data challenge isn't how to handle current volume but what's to come, warn federal CIOs.

Federal CIOs are only starting to come to grips with the massive amounts of data headed their way, a group of current and former government CIOs said Monday.

That's presenting new challenges for budget-constrained agencies that already process large volumes of data for predictive analysis, scientific visualization, and forecasting. The concern over managing an exponential rise in data is compounded by the Obama administration's push, under its Open Data Policy, to release more data to the public -- a pressure felt especially by agencies working in the scientific and health fields.

"The challenge is not so much processing what data we have, but what's coming," said Roger Baker, chief strategy officer for Agilex and a former CIO for the Department of Veterans Affairs. Baker spoke during a panel discussion Monday at the annual American Council for Technology -- Industry Advisory Council Executive Leadership Conference.

The National Oceanic and Atmospheric Administration (NOAA) offers a glimpse of the volume of data some agencies already handle. Each day, NOAA collects more than 2 billion observations from 17 satellites and another 1.5 billion observations from sensors around the world, according to Joe Klimavicz, CIO for NOAA, who spoke at the conference. The agency relies on supercomputers capable of 2 million billion calculations per second (2 petaflops) to analyze that data and produce the 15 million weather and related reports NOAA issues each day, Klimavicz added.

[ "Lowest price technically acceptable" contracts sacrifice long-term value for short-term savings. Read LPTA Contracts Stifle Government Innovation. ]

"Our data associated with this is growing 30 petabytes a year," he said, noting that the agency's dependency on big data tools and its dependency on the infrastructure to carry, store and process that data continues to grow along with that data.

Klimavicz outlined a number of additional concerns his office is confronting as data volumes keep growing. One is having controls in place to validate the origin and accuracy of the information streaming into NOAA. "Metadata quality is incredibly important to us," he said. So is ensuring data integrity. "If enough bits randomly flip from zero to one, that can begin to impact our climate models."
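Bit-level corruption of the kind Klimavicz describes is typically caught by recording a checksum when data is ingested and re-verifying it before the data feeds a model. A minimal sketch of that approach (the file paths and manifest structure are hypothetical, not NOAA's actual pipeline):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MB chunks so large datasets never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def verify(manifest: dict) -> list:
    """Return paths whose current hash no longer matches the recorded one."""
    return [path for path, recorded in manifest.items()
            if sha256_of(path) != recorded]

# Record hashes at ingest time, then re-check before each model run:
# manifest = {"obs/2013-10-29.dat": sha256_of("obs/2013-10-29.dat")}
# corrupted = verify(manifest)  # any hit means bits flipped in storage/transit
```

Any mismatch flags a file for re-transmission or restoration from an archive copy before it can skew a model run.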

The sheer volume and diversity of data sources NOAA deals with introduce another layer of concern for CIOs like Klimavicz. He explained how a program begun with the FAA more than two decades ago to collect in-flight weather data from US airlines generates more than 100,000 automated reports a day -- yet took nearly two years to fully assimilate into NOAA's weather models, to ensure the data dovetailed correctly with other data sources.
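Much of that assimilation work comes down to mapping each source's units, fields, and conventions onto a common observation model. A toy illustration of one such mapping (the record fields and units here are invented for the example, not the actual FAA feed format):

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """A common observation model shared across data sources."""
    source: str
    time_utc: str
    lat: float
    lon: float
    temp_c: float
    wind_mps: float

KNOTS_TO_MPS = 0.514444

def from_aircraft_report(raw: dict) -> Observation:
    """Convert a hypothetical airline report (degF, knots) to the common model."""
    return Observation(
        source="aircraft",
        time_utc=raw["obs_time"],
        lat=raw["lat"],
        lon=raw["lon"],
        temp_c=(raw["temp_f"] - 32) * 5 / 9,     # Fahrenheit -> Celsius
        wind_mps=raw["wind_kt"] * KNOTS_TO_MPS,  # knots -> meters/second
    )
```

Writing one such adapter is easy; validating that its output dovetails with every other source feeding the same model is what consumes the months.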

Getting agreement within communities of interest on how to define data, and on how that data is eventually used, remains yet another challenge.

Department of Energy CTO Robert Bectel said developing the semantic architecture and a system for federating data can get costly. "The real win for me is if I can get a community of scientists to make sure the data coming in is accurate and consistent."
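One lightweight way a community can enforce that kind of agreement is to publish a shared schema and reject incoming records that fall outside it. A minimal sketch of the idea (the field names, types, and ranges below are invented for illustration, not an actual DOE schema):

```python
# A community-agreed schema: field -> (expected type, valid range).
SCHEMA = {
    "lat":    (float, (-90.0, 90.0)),
    "lon":    (float, (-180.0, 180.0)),
    "temp_c": (float, (-90.0, 60.0)),   # plausible surface temperatures
}

def validate(record: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, (ftype, (lo, hi)) in SCHEMA.items():
        if field not in record:
            problems.append(f"missing {field}")
        elif not isinstance(record[field], ftype):
            problems.append(f"{field} has wrong type")
        elif not lo <= record[field] <= hi:
            problems.append(f"{field}={record[field]} out of range")
    return problems
```

The schema itself is the expensive part Bectel is describing: the code is trivial once the community has agreed on the fields and their meanings.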

Creating data standards proved to be a major undertaking in the health field, according to Baker, a challenge addressed in part by a health data dictionary agreed to by the VA and the Defense Department. Data standards play a critical role as data gets aggregated, Baker said. "We found 240 ways to represent the way penicillin was prescribed," he said, explaining the level of work often involved in making seemingly identical data make sense in aggregate form.
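A data dictionary resolves that kind of divergence by mapping every observed variant onto a single canonical term before records are aggregated. A toy sketch (the variant strings and canonical code below are invented; the actual VA/DoD dictionary is far larger):

```python
# Hypothetical fragment of a data dictionary: observed variant -> canonical term.
DRUG_DICTIONARY = {
    "penicillin v 500mg tab": "PENICILLIN_V_POTASSIUM_500MG",
    "pen vk 500 mg tablet":   "PENICILLIN_V_POTASSIUM_500MG",
    "pcn-vk 500mg po":        "PENICILLIN_V_POTASSIUM_500MG",
    # ...the article reports 240 distinct representations in VA/DoD records
}

def canonicalize(prescription: str) -> str:
    """Normalize a free-text prescription so records aggregate correctly."""
    key = " ".join(prescription.lower().split())  # collapse case and whitespace
    return DRUG_DICTIONARY.get(key, "UNMAPPED")   # flag unknown variants

assert canonicalize("Pen VK 500 mg  Tablet") == "PENICILLIN_V_POTASSIUM_500MG"
```

Flagging unmapped variants rather than guessing is the point: each "UNMAPPED" hit is a new entry for the two departments to adjudicate into the shared dictionary.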

Perhaps just as important as defining data is defining how customers use it. "Real-time depends on what you're talking about," Baker pointed out. "We thought getting information [to users] in 24 hours was okay." But Defense Department doctors expected to see data minutes after a patient's labs were completed.

"Health data is moving same way as science data," he added, explaining that the Centers for Disease Control and Prevention is using big data to model the movement of diseases the same way NOAA models climates.

Comments
WKash (Author) | 11/15/2013 | 12:44:18 PM
Big Data in Public Sector
The White House announced a number of new big data projects, developed in collaboration with the private sector, at an event on Nov. 12. Agencies announcing major new big data initiatives recently include NASA, DARPA, the Energy Dept., USGS, NIH, and others. Read more on the White House blog:

http://www.whitehouse.gov/sites/default/files/microsites/ostp/Data2Action%20Agency%20Progress.pdf

http://www.whitehouse.gov//sites/default/files/microsites/ostp/Data2Action%20Announcements.pdf