Commentary
8/8/2014
09:06 AM
David F Carr

Federal Open Source Is Messy – And That's OK

Open source projects like the National Library of Medicine's Pillbox show the potential of open innovation -- including competition with projects started elsewhere.


On the scale of the federal government, it should come as no surprise that open source innovation would be a messy affair. Open source doesn't lend itself to central planning. If the consumers of code have complete freedom, including the freedom to fork -- or to abandon an open source project and try to create something better on their own -- then they will tend to do so.

Consider the case of Pillbox, an initiative of the National Library of Medicine, which is part of the National Institutes of Health and ultimately part of the US Department of Health and Human Services. Pillbox is a search engine for medicines, a tool for identifying loose pills by shape, size, color, and the text imprinted on the outside of the capsule. Working with the Veterans Administration's huge pharmacy system, the Pillbox team captures pill images and matches them with products from the drug label database maintained by the US Food and Drug Administration. The first version of the tool came out in 2010, and since then it has been opened up further with API documentation and source code releases on GitHub.
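To make the idea of identifying a pill by shape, size, color, and imprint concrete, here is a minimal sketch in Python. It is not the Pillbox code or its API: the `PillRecord` fields, the `find_pills` function, the size tolerance, and the sample records are all hypothetical placeholders invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PillRecord:
    # Hypothetical fields for illustration; not the real Pillbox schema.
    name: str
    shape: str
    color: str
    size_mm: float
    imprint: str


def _norm(text: str) -> str:
    # Drop case, spaces, and punctuation so "ab-12" matches "AB 12".
    return "".join(ch for ch in text.upper() if ch.isalnum())


def find_pills(
    records: Iterable[PillRecord],
    shape: Optional[str] = None,
    color: Optional[str] = None,
    imprint: Optional[str] = None,
    size_mm: Optional[float] = None,
    size_tolerance_mm: float = 1.0,  # assumed tolerance, chosen arbitrarily
) -> List[PillRecord]:
    """Return records matching every criterion the caller supplied."""
    matches = []
    for rec in records:
        if shape is not None and _norm(rec.shape) != _norm(shape):
            continue
        if color is not None and _norm(rec.color) != _norm(color):
            continue
        # Imprints are often only partly legible, so allow a substring match.
        if imprint is not None and _norm(imprint) not in _norm(rec.imprint):
            continue
        if size_mm is not None and abs(rec.size_mm - size_mm) > size_tolerance_mm:
            continue
        matches.append(rec)
    return matches


# Made-up placeholder records, not real drug data.
SAMPLE = [
    PillRecord("EXAMPLE DRUG A", "round", "white", 8.0, "AB 12"),
    PillRecord("EXAMPLE DRUG B", "oval", "white", 12.0, "XY 7"),
    PillRecord("EXAMPLE DRUG C", "round", "blue", 8.5, "AB 34"),
]

if __name__ == "__main__":
    for rec in find_pills(SAMPLE, shape="Round", imprint="ab"):
        print(rec.name)
```

Even a toy version like this has assumptions baked in, such as how imprint text is normalized and how much size variation counts as a match. A real tool would also have to cope with missing fields and inconsistent labeling in the source data.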


At some point, the Pillbox project was probably doing more to make FDA data organized and publicly accessible than the FDA itself. The FDA announced its OpenFDA initiative in June, after catching up on a paperwork backlog. The FDA's first open data API is for adverse drug event reports, but it's meant to be the first in a series of open data initiatives from the agency.

When I met Pillbox project manager David Hale at the Federal Big Data Summit earlier this summer, he was hopeful that the two projects would prove complementary. He also complimented the FDA on the way it had used the Amazon cloud for data processing, storage, and elastic search. "That is absolutely groundbreaking, not just within government but especially within the FDA -- showing that it's okay to use something like the cloud and that there are immediate benefits to doing so," he said.

At the same time, although he was circumspect discussing the politics of interagency cooperation and competition, I could tell he also had some concerns about whether the FDA would wind up duplicating his efforts unnecessarily. The Pillbox project has been politically fraught all along, sometimes running afoul of concerns within his own agency that it was "not scientific" in the mode of most NLM research and might open the agency up to liability. The Pillbox site was taken offline a couple of times as a result, and the version that's live now is plastered with disclaimer messages that the data shouldn't necessarily be counted on for life-and-death decisions. Yet emergency room doctors have used the tool to identify pills that a patient used in an attempted overdose. Through the API, Pillbox data also has been incorporated into the pill identification feature on the Drugs.com website and into mobile apps for use by doctors and first responders.

Some federal open data initiatives make the mistake of believing that just making the data available or just providing an API is enough. But what those in government can really contribute is an understanding of how the data is structured and what it means. By making the Python code behind Pillbox available as open source, Hale believes he is conveying more of that information -- and allowing outsiders to see the assumptions baked into how the software does what it does.

Along the way, Hale believes he has learned a lot about how to scrub the data he pulls in from the FDA and other sources to make it more useful (more explanation in the video included below). He doesn't want to see that work go to waste.

Yet almost by definition, open source projects inside or outside of government invite developers who think they have a better idea to blaze their own path. You could argue that it's a waste of effort for the developers of WordPress and Drupal to put so much work into solving a lot of the same problems in Web content management -- but each platform also solves problems that the other does not.

Damon Davis, the director of the Health Data Initiative and an official of the HHS CTO's office, acknowledged that tension while participating in a panel discussion along with Hale at the Federal Big Data Summit. "It does happen sometimes that there are two very similar projects happening at the same time," he said. "A lot of times the leaders of those efforts are very open to merging their projects together." However, forcing the issue doesn't make sense -- sometimes two projects that superficially look redundant are actually pursuing distinct goals, he said.

"We have to understand, too, what the differences are," Davis said. "It's a major, major ball of yarn that can be tough to unwind."

A light touch is probably the right touch, encouraging cooperation where it makes sense while understanding the role of healthy competition.


David F. Carr oversees InformationWeek's coverage of government and healthcare IT. He previously led coverage of social business and education technologies and continues to contribute in those areas. He is the editor of Social Collaboration for Dummies (Wiley, Oct. 2013) and ...
Comments
ZibdyH648, User Rank: Apprentice
8/15/2014 | 1:18:42 PM
Re: Pillbox sounds great but...

Hi Alison,

We completely agree with your comment and we would love to work with them to make it better. David Carr has written about our application, ZibdyHealth, and you can try it for yourself. We have built a database based on barcodes, and it summarizes relevant information about the medication. Drug monographs are great, but in 3 years I have come across only two people who claim to read them diligently. We have about 600,000 products in it, but this is an ongoing effort for us, so we constantly add more drugs to our database.

David, thanks for adding the link to the video in your article. It is very telling, but I was a bit disappointed after watching it. They have only 2,000 drug images, and only 100 were provided by the manufacturers. To identify drugs based on images, there are three different criteria: color, shape, and imprint. Around the 7-minute mark, the first two, color and shape, were discounted completely, and it was suggested that they do not have a way to search by imprint yet. How many seniors can even read this imprint?

IMHO, plenty of solutions exist for healthcare professionals; we need to make it simple for the average person.

Alison_Diana, User Rank: Author
8/11/2014 | 12:05:26 PM
Pillbox sounds great but...
I wonder how Pillbox's timeline compares with other, free medication-search databases available via web or app? It's a great idea and tool -- but I'd love to know whether they considered using a third-party database or, at least, partnering with a company that had already done all the legwork on this tool? That's why I wonder about the timing: This agency could well have been first and, if so, wouldn't it have been great if they sold the rights and poured the resulting money back into the agency or IT?
danielcawrey, User Rank: Ninja
8/9/2014 | 6:12:54 PM
Good thing
It is a good thing that the government is leveraging open source.

But in the interest of reducing federal waste, I wonder if there should be some better guidelines to make sure that each project serves a particular mission or goal. There should be some standards, at least. This will be necessary at some point.