Google Revisits Crowdsourcing With reCAPTCHA Acquisition - InformationWeek
Commentary
9/16/2009
04:38 PM
Thomas Claburn

In its second acquisition this year, Google has bought reCAPTCHA, a company that provides CAPTCHA images as a barrier to online fraud. The term CAPTCHA is an acronym for the phrase "Completely Automated Public Turing test to tell Computers and Humans Apart."

Luis von Ahn, co-founder of reCAPTCHA, was among the computer scientists from Carnegie Mellon and IBM who coined the term back in 2000.

What makes reCAPTCHA interesting for Google is that it kills two birds with one stone. Not only are CAPTCHAs necessary for online security -- to prevent spammers from using scripts to automatically register thousands of Gmail accounts, for example -- but reCAPTCHA also draws the words it presents to users from scanned books, so each answer a user types doubles as a human reading of printed text.

Because of this, Google will be able to improve the accuracy of the optical character recognition (OCR) applied to book scans through what amounts to "crowdsourced" copy editing. In so doing, Google is again finding value in aggregated intelligence: The highly relevant search results that made Google's name owe a lot to the PageRank algorithm developed by company co-founders Sergey Brin and Larry Page. PageRank weighs links between Web pages as if they were votes for relevance, thereby leveraging the judgment of the crowd to determine which Web sites matter.
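The "links as votes" idea can be illustrated with a few lines of Python. This is a minimal sketch of the textbook PageRank iteration, not Google's production ranking code; the function name, the damping value of 0.85, and the fixed iteration count are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by treating each outbound link as a weighted vote.

    links maps a page name to the list of pages it links to.
    Returns a dict mapping each page to a score; scores sum to about 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

In a toy web where pages "a", "b" and "d" all link to "c", and "c" links back to "a", page "c" collects the most votes and ends up with the highest score.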

"reCAPTCHA's unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition," explain von Ahn, co-founder of reCAPTCHA, and Will Cathcart, a Google product manager, in a blog post. "This technology also powers large scale text scanning projects like Google Books and Google News Archive Search. Having the text version of documents is important because plain text can be searched, easily rendered on mobile devices and displayed to visually impaired users. So we'll be applying the technology within Google not only to increase fraud and spam protection for Google products but also to improve our books and newspaper scanning process."
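One way to picture this crowdsourced correction is as a simple vote among the people who typed the same scanned word. The sketch below is an assumption-laden illustration, not reCAPTCHA's actual implementation: the function name `resolve_word` and the `min_votes` and `min_agreement` thresholds are hypothetical.

```python
from collections import Counter


def resolve_word(ocr_guess, human_answers, min_votes=3, min_agreement=0.6):
    """Pick a transcription for one scanned word.

    ocr_guess is the OCR engine's reading; human_answers are the strings
    typed by users who were shown the same word image. The thresholds are
    hypothetical values chosen for illustration.
    """
    # Normalize case and whitespace so "Modern " and "modern" count together.
    normalized = [answer.strip().lower() for answer in human_answers if answer.strip()]

    # With too few human readings, keep the machine's guess.
    if len(normalized) < min_votes:
        return ocr_guess

    # Accept the most common human reading only if enough people agree on it.
    word, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_agreement:
        return word
    return ocr_guess
```

For example, if the OCR engine reads a word as "rnodern" and three of four users type "modern", the sketch returns "modern"; with a single human answer, or with no clear majority, it falls back to the OCR output.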

Google has no shortage of errors to correct. One of the company's Book Search engineers recently acknowledged that there are millions of errors in the metadata used to describe the books scanned for Google Book Search. No doubt the company's OCR output isn't perfect either.

But such problems look a lot less daunting when one can leverage CAPTCHA input to correct errors.
