The government is looking to make significant improvements to its open data repository, adding features for developers and consumers and prodding agencies to participate.
In documents circulating among federal agencies and released to the public on Tuesday, the Office of Management and Budget has laid out plans to move Data.gov out of "beta" phase and into "government-wide execution," as federal CIO Vivek Kundra put it in an interview last week.
Released at the same time as the Obama administration's wider Open Government Directive, a memo and draft concept-of-operations document encourage agencies to post more data on Data.gov, with an eye toward ensuring data is machine-readable, high-quality, and useful while also protecting privacy and security interests.
"Our key principles focus on making sure that we democratize as much data as possible and that that data is targeted towards high-value datasets," Kundra said.
The White House has repeatedly held out the government data portal as a hallmark of its open government strategy. Until now, though, while broadly written policy missives from President Obama and the Office of Management and Budget have encouraged federal agencies to be more open, there's been little formal guidance on exactly how federal agencies should use Data.gov as a vehicle for that transparency.
In many ways, it shows. Though Data.gov now houses more than 110,000 individual datasets, almost all of those are geodata on administrative and political boundaries. Of the non-geodata raw data feeds, 411 of 728 are Toxics Release Inventory datasets. To be fair, Data.gov also houses 353 data tools, many of which contain numerous datasets themselves, but much of that data is locked up inside those tools in non-machine-readable formats. Meanwhile, many federal agencies that presumably hold a wealth of useful data have posted very few datasets on Data.gov.
Now, however, OMB is setting a formalized policy, and has begun asking the public for its input via a non-government Web site powered by crowdsourcing platform IdeaScale.
The new formal policy on Data.gov isn't just some high-level guidance without any teeth, either. OMB plans to actually rate federal agencies on their participation, keeping track of qualitative and quantitative metrics on everything from the number of datasets published by each agency, to citizen ratings of that data, to how well agencies attach metadata to their datasets. OMB will rate itself, too, by tracking usage metrics and measuring feedback.
Data.gov has already gone from seven staffers working largely at the personal direction of Kundra to more than 200 points of contact across the federal government. The concept-of-operations document further formalizes those roles, and instructs or encourages agencies to set up training, participate in Data.gov working groups to create best practices, and establish Data.gov "data stewards' advisory groups." Structured data from a number of other government data Web sites, such as USAspending.gov and FBO.gov, will be integrated into Data.gov.