Big Data. Big Decisions
InformationWeek
Special Coverage Series

Commentary

Doug Henschen

Executive Editor, InformationWeek

Big Data Initiative Or Big Government Boondoggle?

A White House plan to step up research on big data analytics sounds promising, but agencies could save big bucks through consolidation, collaboration, and cost sharing.

The Obama Administration last week unveiled a "Big Data Research and Development Initiative" that will see at least six government agencies making $200 million in additional investments to "greatly improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data."

The big data initiative sounds good in theory, and I'm all for promoting U.S. competitiveness in math and science. But after sitting through nearly two hours of presentations on the feds' big data initiative, I fear those investments will be spread too thinly among too many agencies that aren't collaborating.

It's encouraging that the White House is at least aware of all the agencies involved in data- and compute-intensive research. The administration released a fact sheet that listed at least 80 projects and initiatives across a dozen federal agencies, including the Department of Defense, Department of Homeland Security, Department of Energy, Health and Human Services, and Food and Drug Administration.

Who knew the government was funding so much data-driven research? The White House issued this fact sheet as if to say, "Look how much we're doing already!" But when you start reading about all the separate initiatives and all of the high-performance computing labs and research facilities already in place, it makes your head spin. As a taxpayer, it pains me to see so many examples of apparently duplicative research, staff, and infrastructure.

The big data initiative was prompted in part by a December 2010 report by the President's Council of Advisors on Science and Technology (PCAST) on "Designing a Digital Future," which found the U.S. is investing too little in networking and IT research. Part of the reason we're not spending "enough" is that we're spreading investments among agencies conducting R&D for their respective fields rather than on networking and IT that could benefit everyone.

It was a good sign that last week's presentation kicked off with the announcement of an initiative between the National Science Foundation and the National Institutes of Health to fund 15 to 20 research projects to the tune of $25 million. The idea behind this Big Data Solicitation is to seed and provide direction for initiatives that will speed data-driven scientific discoveries related to health and disease. What's more, it's an invitation to academia, non-governmental organizations, and the private sector to participate. This is exactly the kind of collaborative effort I think we need.

But after a promising start, the four speakers who followed--from the U.S. Geological Survey, the Department of Defense, the Defense Advanced Research Projects Agency, and the Department of Energy--seemed more intent on talking about their unique initiatives and less focused on how they could collaborate with other agencies. Amid the din of acronyms and price-tag-unknown projects, the same terms kept coming up: data volume, data variety, modeling and algorithms, data visualization, making information actionable, and so on.

It all reminded me of a conversation I had with Don Burke a couple of years ago on the topic of the lack of cooperation, collaboration, and consolidation among government agencies involved in national security. "Every agency says, 'I have unique needs.' Then their IT providers say, 'I will give you the 100% solution for that need, but you have to give us all this money to create a unique solution,'" explained Burke, "doyen" of Intellipedia, an intelligence-community-wide wiki started in 2006 by the Office of the Director of National Intelligence.

Intellipedia aims to help the intelligence community connect the dots on threats by collapsing the walls between data silos. Reading through all the big data projects and initiatives the government already has on the table, I think there's an opportunity to do more shared big-data research and create shared big-data platforms.

Yes, the U.S. Geological Survey, NASA, the Department of Defense, and the National Institutes of Health are doing very different types of data-driven research and analyses, but they're all grappling with the use of unstructured data and large-scale machine data, they're all pushing the envelope on data mining, and they're all looking for better data visualization and reporting techniques.

Johns Hopkins, for one, believes in big data collaboration across disciplines. Dr. Peter Greene, Johns Hopkins' chief medical information officer, tells me that the institution's oncology researchers are collaborating with the university's Department of Astronomy. The cancer researchers face the big data challenge of studying the human genome, which consists of 3 billion base pairs of DNA. Johns Hopkins' Department of Astronomy, meanwhile, has a data center with rack upon rack of compute power applied to large-scale computational astronomy calculations. Why build a separate data center when one can handle both astronomy and healthcare calculations?

The government's hugely important data center consolidation plan didn't come up at all during last week's announcements. So what about assessments of compute-power requirements and staffing needs? Are our current labs anywhere near maximum utilization? It strikes me that consolidating high-performance computing centers and relying on cloud delivery of services to multiple agencies could go a long way toward cutting the big cost of big-data analysis.

If we're to avoid the problem identified in the original PCAST report--spreading budgets too thinly across too many agencies studying parochial requirements--these departments and agencies must recognize that there's a huge opportunity for their research dollars to go further. If they will only give up a bit of control and a bit of their "unique" agendas and a bit of their precious budgets, we could be creating big data research and systems for the common good.






