Petabytes of information are accumulating across government: military veterans' genomics data, climate records dating to the 16th century, years' worth of stock trades and even the results of particle physics experiments.
The era of big data has arrived in government just as it has in business. Digital documents, transactions, intelligence, photos, video, Web content and electronic correspondence are filling storage systems to the brim. At the same time, IT budgets are flat, agencies are being pressed to consolidate data centers, and IT teams don't have the skills they need to manage, analyze and apply all of that data.
Government CIOs and their staffs must quickly work their way up the big data learning curve. Terabyte databases are growing into petabyte databases, pushing the processing and storage limits of the IT systems in place and testing the know-how of even the most experienced database managers.
A big data workshop held by the National Institute of Standards and Technology in January drew more than 800 attendees from federal agencies and the technology companies that work with them. IT leaders from the Department of Defense, the Department of Energy, NASA, the National Oceanic and Atmospheric Administration (NOAA), Veterans Affairs and the White House were among those who came to discuss the convergence of big data and cloud computing, big data life cycle management and big data analytics.
The Obama administration pushed big data up the federal IT priority list last March when it unveiled a formal research and development initiative aimed at developing new technologies for big data management and analysis. The goal is to spur breakthroughs in science and engineering, transform education and strengthen national security. In a blog post titled "Big Data Is A Big Deal," Tom Kalil, deputy director for policy at the Office of Science and Technology Policy, called for an "all hands on deck" effort among government, businesses, universities and nonprofits.
To kick-start the effort, six federal agencies -- the Defense Advanced Research Projects Agency (DARPA), the departments of Defense and Energy, the National Institutes of Health (NIH), the National Science Foundation (NSF), and the U.S. Geological Survey (USGS) -- announced plans to invest $200 million collectively in big data R&D. A new interagency steering group is crafting a national R&D strategy, the components of which include foundational research, development of IT infrastructure that's "big data ready," education and workforce development, and collaboration.
Some agencies have begun to develop their own plans for big data research and management. The Pentagon will spend $250 million annually on big data ($60 million of which is included in the $200 million in new federal research). One area of investment is a DARPA program called XDATA to develop "computational techniques and software tools for sifting through large structured and unstructured data sets," according to a White House document on the federal initiative.