6/5/2014
03:35 PM

OpenFDA Backstory: Breaking The Paperwork Backlog

The startup Captricity uses a combination of crowdsourcing and OCR to digitize mountains of paper records, particularly for government agencies and healthcare.

and read in data that OCR can't handle at all. To stay in compliance with HIPAA in healthcare, and with other requirements covering private or otherwise sensitive data, the service "shreds" the images so that crowdsourced human workers see only isolated fields from a form rather than the whole thing -- just the first name, just the last name, or just the middle two digits of a Social Security number, for example.

Shreddr breaks form images into individual fields for processing. (Source: Captricity)
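
The shredding idea can be illustrated with a short Python sketch. It is not Captricity's implementation: the `Field` layout, the `crop` and `shred` helpers, and the toy text "image" (a list of strings standing in for pixel rows) are all hypothetical, chosen only to show how a worker could be handed one isolated field with no label tying it back to the form.

```python
import random
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical description of where one field sits on the form."""
    name: str
    top: int
    left: int
    height: int
    width: int


def crop(image, field):
    """Return only the rows and columns of `image` covered by `field`."""
    rows = image[field.top:field.top + field.height]
    return [row[field.left:field.left + field.width] for row in rows]


def shred(image, fields, rng=None):
    """Split one form into isolated snippets with anonymous task IDs.

    Returns (tasks, key). `tasks` is a shuffled list of (task_id, snippet)
    pairs that could be shown to workers. `key` maps task_id back to the
    field name and is meant to stay private to the service.
    """
    rng = rng or random.Random()
    tasks = []
    key = {}
    for field in fields:
        task_id = uuid.uuid4().hex  # random ID, reveals nothing about the field
        key[task_id] = field.name
        tasks.append((task_id, crop(image, field)))
    rng.shuffle(tasks)  # do not hand fields out in form order
    return tasks, key


# Toy "image": each string is one row, each character one pixel column.
FORM = [
    "JANE DOE   ",
    "123-45-6789",
]

# Assumed layout; a real form would need measured bounding boxes.
FIELDS = [
    Field("first_name", top=0, left=0, height=1, width=4),
    Field("last_name", top=0, left=5, height=1, width=3),
    Field("ssn_middle", top=1, left=4, height=1, width=2),
]
```

Calling `shred(FORM, FIELDS)` yields three snippets in random order. Only the holder of `key` can reassemble the worker transcriptions into a single record.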

Another tactic is to use multiple OCR engines, including one of Captricity's own design, and "vote them together" to find the best translation from image to data. The Shreddr API allows application developers to automate the submission of images and get back data in XML or other structured formats. Captricity also works with third parties who will collect boxes of paper for scanning. For the Georgia campaign finance project, the reports were submitted to the cloud service through an e-fax gateway. The FDA already had an internal team to scan documents, but instead of storing images alone, that team began submitting them to the Captricity service.
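
The article does not say how the engines are "voted together," so the following Python sketch assumes the simplest scheme: a plain majority vote over whole-field readings, with ties going to the engine listed first. The function names and the review threshold are hypothetical, and sending low-agreement fields to human workers is an assumption about how such a vote might be combined with the crowdsourcing step.

```python
from collections import Counter


def vote(readings):
    """Pick the reading that the most OCR engines agree on.

    `readings` holds one string per engine for the same field.
    Returns (best_reading, agreement), where agreement is the share of
    engines that produced the winning reading. Ties go to the engine
    that appears first in the list.
    """
    if not readings:
        raise ValueError("need at least one reading")
    normalized = [reading.strip() for reading in readings]
    counts = Counter(normalized)
    top = max(counts.values())
    best = next(r for r in normalized if counts[r] == top)
    return best, top / len(normalized)


def needs_human_review(readings, threshold=0.5):
    """Flag a field when no reading wins more than `threshold` of the votes."""
    _, agreement = vote(readings)
    return agreement <= threshold
```

With three engines returning `"Smith"`, `"Smith "`, and `"5mith"`, `vote` settles on `"Smith"` with two of three engines agreeing, so the field would not be flagged for review under the default threshold.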

The technology is not limited to applications in healthcare, Chen said, but "that's where our heart is." The company spun off from a series of academic research projects Chen completed on his way to a PhD in computer science from UC Berkeley. One of his projects was a video titled "Data in the First Mile," which investigated how community health workers in Africa automated the collection of health data even though they used paper forms, rather than a direct interface to an online system.

Another of Chen's goals was to improve reporting on a public health project in the slums of Kenya to replace open-air latrines with more hygienic portable toilets. Providing computers or laptops to the workers responsible for checking that the toilets had been cleaned and serviced would have been cost-prohibitive, so the workers recorded their reports on paper and used a camera phone to submit them.

"When I lifted my head out of my dissertation and the creation of the [software] engine, I realized this problem was everywhere," he said, particularly in government. "There are paper backlogs in many agencies. In many cases, it's not being talked about, because they're handling it, but there are also a lot of instances where they're barely hanging on."

Yet the standard way of handling such problems is still to hire an army of temporary workers to type in the information. When agency heads see the technology, they tend to find it immediately applicable to a range of problems they address, but getting on their radar is the big challenge. "They have no idea this is possible."


David F. Carr oversees InformationWeek's coverage of government and healthcare IT. He previously led coverage of social business and education technologies and continues to contribute in those areas. He is the editor of Social Collaboration for Dummies (Wiley, Oct. 2013) and ...

Comments
David F. Carr,
User Rank: Author
6/6/2014 | 1:33:15 PM
Do you have a paperwork backlog?
Curious how many of you have a paperwork backlog you need to conquer. How are you addressing it? Have you found effective alternatives to the approach described here?