Take the Department of Defense, for example. It has between 15 and 20 ERP systems, which are only patched together as an afterthought, if at all. "Each one of them is a starship unto itself, but where is the enterprise?" LeAntha Sumpter, deputy director for program development and implementation for defense procurement and acquisition policy at the DoD, asked at a conference in D.C. on Thursday. "Where does it come together?" Likewise, the Department of Transportation, one of the biggest recipients of stimulus money, has about a dozen ERP systems.
Just within the Department of Defense, said Sumpter, who's also the agency lead for posting data on usaspending.gov, there are somewhere between six and 10 different standards for country codes. That means the same two letters, KS, can stand for Kansas in one system and South Korea in another. And to think, there you were, wondering why we sent military aid to Topeka.
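To make the collision concrete, here is a minimal Python sketch of one common remedy: never store a bare code, and always pair it with the name of the code list it came from. The scheme labels, the `QualifiedCode` type and the `resolve` function are hypothetical names chosen for illustration, not anything the DoD is described as using. The sample entries assume the U.S. postal abbreviation list, where KS is Kansas, and the FIPS 10-4 country list, where KS is South Korea.

```python
from typing import NamedTuple


class QualifiedCode(NamedTuple):
    """A code plus the code list it belongs to (hypothetical type)."""
    scheme: str
    code: str


# Illustrative entries only; a real system would load complete code lists.
LOOKUP = {
    QualifiedCode("USPS", "KS"): "Kansas",
    QualifiedCode("FIPS-10-4", "KS"): "South Korea",
    QualifiedCode("ISO-3166-1", "KR"): "South Korea",
}


def resolve(scheme: str, code: str) -> str:
    """Return the place name for a code within a specific scheme."""
    key = QualifiedCode(scheme, code.upper())
    try:
        return LOOKUP[key]
    except KeyError:
        # Refuse to guess: an unknown pair is an error, not a best match.
        raise ValueError(f"unknown code {code!r} in scheme {scheme!r}") from None


print(resolve("USPS", "KS"))       # Kansas
print(resolve("FIPS-10-4", "KS"))  # South Korea
```

The point of the design is that "KS" alone is never a valid key, so two systems can only exchange the value if they also state which list they mean.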
That problem is exacerbated by a lack of training, education and oversight about data quality, said Marv Langston, former Navy CIO and deputy DoD CIO, citing an odd problem where a federal acquisition system at one time showed the Department of Defense as a huge buyer of soybeans. It turned out that the product code for soybeans used to be nothing but a series of ones, i.e. 11111. "People are lazy, it's just as simple as that," he said.
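A data-entry check of the kind that might have caught the soybean problem can be sketched in a few lines of Python. This is an assumption-laden illustration, not a description of any federal system: the function names and the `product_code` field are hypothetical, and because 11111 was a legitimate code, the check only flags records for human review rather than rejecting them.

```python
def looks_like_placeholder(code: str) -> bool:
    """True when a code is one character repeated, e.g. '11111' or '0000'."""
    stripped = code.strip()
    return len(stripped) > 1 and len(set(stripped)) == 1


def flag_suspect_rows(rows):
    """Return the rows whose product code deserves a second look.

    Each row is assumed to be a dict with a 'product_code' string
    (a hypothetical field name used only for this sketch).
    """
    return [row for row in rows if looks_like_placeholder(row["product_code"])]


sample = [
    {"id": 1, "product_code": "11111"},
    {"id": 2, "product_code": "48213"},
]
for row in flag_suspect_rows(sample):
    print("review record", row["id"])
```

A check like this does not fix lazy data entry, but it surfaces a suspicious pattern before the numbers are published instead of after a reporter calls.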
Combine those problems and mix them with a White House-led data transparency initiative, and you're bound for trouble. Sumpter noted that federal agencies are getting daily calls from reporters questioning stimulus data. "When you're looking at transparency from the federal level, we're getting killed at the operational level," she said.
Posting incorrect data online may cause a public relations headache, but it also exposes the flawed processes underneath, rather than hiding the problems behind a thorough scrubbing that takes months. It goes to show that the government needs to keep pushing for data standards such as the National Information Exchange Model (NIEM) and to work on reporting accurate data from the start, rather than doing the incredibly hard work of scrubbing data for months before release. It also shows that citizen-reported data might require extra care.
"We're not inept, we just haven't done the due diligence," Sumpter said. "Unless you want to go down to the data level and look at integrating data at core levels, the biggest problem we have federal government-wide is definitions of data, the contexts in which they're used, and the processes around them."