Data.gov Gets Updated: A Closer Look

Version 2.0 of the government's data portal relies heavily on open-source platforms, but finding usable data can still be like looking for buried treasure.

The White House's release this week of a newly designed version of its government data portal, Data.gov, was greeted with predictable fanfare and generally underwhelming reviews. The site's introduction highlights the administration's latest steps to make government data more readily available to the public, but it also marks another step forward for the government's use of open-source software.

U.S. Deputy CTO Nick Sinai and Senior Advisor Ryan Panchadsaram, in announcing the new design, were upfront in declaring that "[the site] is far from complete (think of it as a very early beta)."

On the surface, the new portal is in fact a fresh start aimed at making it easier for data enthusiasts and application developers to find, visualize and reuse government data. Under the hood, however, it also embraces the government's expanding reliance on open-source software.


The new site, for instance, uses Apache Solr for search, CKAN (Comprehensive Knowledge Archive Network) as its data management platform and WordPress as its content management system. It even makes use of open-source fonts.
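For readers curious what a CKAN-backed catalog looks like to a developer, here is a minimal sketch in Python. It builds a request URL for CKAN's `package_search` action and pulls dataset titles and file formats out of a response. The base URL `https://catalog.example.gov` is a placeholder, not the portal's real endpoint, and the sample response is hand-written for illustration; the helper names are hypothetical as well.

```python
import json
from urllib.parse import urlencode

# Assumption: placeholder base URL; the article does not name the catalog endpoint.
CATALOG_BASE = "https://catalog.example.gov"


def build_search_url(query, rows=5, base=CATALOG_BASE):
    """Build a URL for CKAN's package_search action (Action API v3)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/api/3/action/package_search?{params}"


def summarize_results(payload):
    """Return (title, sorted resource formats) for each dataset in a response."""
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    summaries = []
    for dataset in payload["result"]["results"]:
        formats = sorted(
            {
                resource["format"].upper()
                for resource in dataset.get("resources", [])
                if resource.get("format")
            }
        )
        summaries.append((dataset.get("title", ""), formats))
    return summaries


# Hand-written sample shaped like a package_search response (illustrative only).
SAMPLE_RESPONSE = json.loads(
    """
    {
      "success": true,
      "result": {
        "count": 1,
        "results": [
          {
            "title": "Sample Hospital Charges",
            "resources": [
              {"format": "csv", "url": "https://example.gov/charges.csv"},
              {"format": "JSON", "url": "https://example.gov/charges.json"}
            ]
          }
        ]
      }
    }
    """
)

search_url = build_search_url("hospital charges", rows=2)
summaries = summarize_results(SAMPLE_RESPONSE)
```

Listing the formats attached to each dataset is one quick way to check whether an entry is actually machine readable or just a link to a document.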

The preview of the new site is a clear response to the President's technology-driven management agenda announced earlier this month and a White House executive order issued in May for agencies to make their data more accessible, including being machine readable by default.

But in many respects, the new site is also likely to disappoint die-hard data users as being not much more than a shiny new showroom attached to the same old government data warehouse, a warehouse still in need of operating improvements and accessible data.

The new edition of Data.gov does offer some improvements over the original site, which began in 2009 as a basic clearinghouse for federal agency datasets, many of which were not designed for the general population. Although Data.gov has been routinely criticized for this lack of accessibility, it deserves credit for spawning more than a dozen special interest communities around health, education, energy, safety and other data. It has also led, thanks to the vision of federal CTO Todd Park, to a collection of cottage industries that are putting government data to work for private enterprise.

The new site design clearly reflects an injection of new thinking from various sources. Among them are the Office of Science and Technology Policy, the General Services Administration, and private sector experts working through the Presidential Innovation Fellows program.

For instance, the site has fused the usual social media tools into its pages to capture streams of comments that highlight how private enterprise and the public at large are taking advantage of government data. The design brings the public's ideas about government data front and center.

The result is slicker promotion of some of the government's best data harvesting successes, such as the release of a database showing the significant variations in what hospitals and healthcare providers charge across the nation for the 100 most common inpatient services and the 30 most common outpatient services.

The new site also uses D3.js, a JavaScript library that helps manipulate and visualize data in documents. This makes it possible to look beyond the card catalogue view that Data.gov generally provided, and actually view the data in the site's repository dynamically.

The new version also has winnowed down the portal's offering, pulling together what would seem to be the most usable subset of its total inventory of datasets and application programming interfaces. The new site features 75,713 datasets and 100 APIs, compared to the 184,259 datasets and 295 government APIs previously listed on Data.gov.
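To put the scale of that winnowing in perspective, the short Python sketch below works out what share of the earlier inventory the new site carries. The counts come from the figures above; the variable and function names are just for illustration.

```python
# Counts as reported for the new site and the earlier listing.
NEW_DATASETS, OLD_DATASETS = 75_713, 184_259
NEW_APIS, OLD_APIS = 100, 295


def share_retained(new_count, old_count):
    """Percentage of the earlier inventory that appears on the new site."""
    return new_count / old_count * 100


dataset_share = round(share_retained(NEW_DATASETS, OLD_DATASETS), 1)
api_share = round(share_retained(NEW_APIS, OLD_APIS), 1)

print(f"Datasets retained: {dataset_share}%")  # about 41.1%
print(f"APIs retained: {api_share}%")          # about 33.9%
```

In other words, the new site lists roughly two-fifths of the earlier dataset count and about a third of the APIs.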

But the real test of the new design is whether users can find and make ready use of the government's vast data resources. And the early results suggest a lot more needs to be done to reduce the number of steps it takes to find and actually extract what in many cases remains buried data treasure.
