Obama Administration Defends Its Data Quality

Inaccuracies in stimulus-tracking data on Recovery.gov do exist, but overall the transparency effort has been a success, says a special advisor to the President.

J. Nicholas Hoover, Senior Editor, InformationWeek Government

November 18, 2009

The data on the federal government's stimulus-tracking Web site has been criticized for inaccuracies, including inflated job creation numbers and Congressional districts that don't exist. But the White House is refusing to back down, championing the effort as a "huge success" and noting that the data will get better.

"When you consider the sheer number of reports that had to be filed, processed, and posted; the fact that this had never been done before; and the very short time to check reports and make sure they were right -- the data collected and posted is very impressive," said Ed DeSeve, special advisor to the president and to the OMB Director for implementation of the Recovery Act, in a blog post on the White House's Web site.

Since stimulus-spending reports began flooding Recovery.gov in October, media reports and government transparency advocates have pointed out discrepancies and errors in the data. For example, according to Recovery.gov, dozens of jobs have been created in the nonexistent 69th and 99th Congressional districts of the Northern Mariana Islands, and the data showed that the purchase of a single lawn mower in Arkansas helped save or create 50 jobs.

According to the White House, such problems are anomalies. Most of the data, DeSeve wrote, is correct. "Even if as many as 5% to 10% of the reports or 5% to 10% of the totals are wrong (and we don't think it is that high), that still means the Recovery Act saved or created between 600,000 and 700,000 direct jobs in its first seven months, more than most experts predicted when it passed," DeSeve wrote.

Many of the mistakes "don't undermine information at the heart of the data or the fact that real jobs have been created," DeSeve maintained. For example, although one project purports to be in Arizona's nonexistent 15th Congressional district, the full data shows the proper address where the funds were received. It's unclear why Recovery.gov's data intake mechanism (FederalReporting.gov) accepted the entry of the 15th Congressional district in that data field.
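
One way an intake form could reject such entries is a simple range check on the district number. The following is a minimal sketch, not FederalReporting.gov's actual logic: the `MAX_DISTRICT` table, the `is_valid_district` function, and the convention of using 0 for an at-large seat are all assumptions made for illustration.

```python
# Illustrative seat counts only (assumption): a real check would load
# the complete official apportionment table for every state and territory.
MAX_DISTRICT = {
    "AZ": 8,  # Arizona held 8 House seats in 2009
    "MP": 0,  # Northern Mariana Islands: one at-large delegate, coded here as 0
}


def is_valid_district(state: str, district: int) -> bool:
    """Return True if the district number can exist for the given state code."""
    if state not in MAX_DISTRICT:
        return False  # unknown state code: reject rather than guess
    seats = MAX_DISTRICT[state]
    if seats == 0:
        return district == 0  # at-large seat: only the 0 code is accepted
    return 1 <= district <= seats


if __name__ == "__main__":
    # Entries like those described in the article would be flagged at intake.
    for state, district in [("AZ", 15), ("MP", 69), ("MP", 99), ("AZ", 8)]:
        status = "ok" if is_valid_district(state, district) else "rejected"
        print(state, district, status)
```

A check of this kind catches only impossible values; it would not detect a plausible but wrong district, which is why the address data DeSeve mentions still matters.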

DeSeve admitted that transparency can be "messy." Federal agencies use reviewers to sort through the data, but he said the project is on an unprecedented scale, and some numbers were bound to be incorrect.

Others are questioning whether the data quality problem on Recovery.gov is symptomatic of a wider issue. LeAntha Sumpter, deputy director for program development and implementation for defense procurement and acquisition policy at the Department of Defense, noted at a conference last week that federal agencies are getting daily calls from reporters questioning stimulus data.

"When you're looking at transparency from the federal level, we're getting killed at the operational level," Sumpter said. "Unless you want to go down to the data level and look at integrating data at core levels, the biggest problem we have federal government-wide is definitions of data, the contexts in which they're used, and the processes around them."
