MIT researchers are working on a database system called Qurk that can identify tasks ill-suited to computers and automate the assignment and presentation of those tasks--human intelligence tasks, or HITs--to human workers.
Amazon Mechanical Turk, which launched as a beta service in late 2005, is perhaps the best-known crowdsourcing service for labor. It has become a platform for businesses like transcription service CastingWords and has spawned services like CrowdFlower and Samasource that complement or compete with it.
Initial work on the Qurk system was described in two academic papers last year. A follow-up paper, "Human Powered Sorts and Joins," will be presented soon at the 38th International Conference on Very Large Data Bases, which takes place in Istanbul, Turkey, from August 27 to 31.
The paper describes how crowdsourced labor tasks can be integrated into Qurk, "a declarative workflow engine," with a particular focus on "sorts" and "joins," two common database operations. Qurk helps people build crowd-powered data processing workflows using a language similar to Apache Pig.
It turns out that while humans may be better than computers at certain tasks, like judging the age of a person in a photograph, they're not very efficient at generating HITs--descriptions that identify the work to be done. Their performance can be improved by a better user interface and by reusing workflow implementations.
"Human workers periodically introduce mistakes, require compensation or incentives, and take longer than traditional silicon-based operators," the paper notes. "Currently, workﬂow designers perform ad-hoc parameter tuning when deciding how many assignments of each HIT to post in order to increase answer conﬁdence, how much to pay per task, and how to combine several human-powered operators (e.g., multiple ﬁlters) together into one HIT. These parameters are amenable to cost-based optimization, and introduce an exciting new landscape for query optimization and execution research."
In other words, human intelligence tasks can be identified and presented more effectively by software than by flawed, self-interested, carbon-based operators who linger over lunch.
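To make the trade-off in the paper's quote concrete, here is a minimal Python sketch of two of the parameters it mentions: how many assignments of each HIT to post, and how much to pay per task. The function names, the majority-vote rule, and the numbers are assumptions chosen for illustration; they are not taken from Qurk or from the paper.

```python
from collections import Counter


def hit_cost_cents(num_hits, assignments_per_hit, price_cents):
    """Total payout in cents (hypothetical model).

    Each HIT is answered by several workers, and each answer is paid once.
    """
    return num_hits * assignments_per_hit * price_cents


def majority_vote(answers):
    """Combine redundant worker answers for one HIT.

    Returns the most common answer and the fraction of workers who gave it.
    Ties are broken by whichever answer appeared first in the list.
    """
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


# Hypothetical example: 100 HITs, 5 workers each, 1 cent per answer.
total = hit_cost_cents(100, 5, 1)
label, agreement = majority_vote(["yes", "yes", "no", "yes", "no"])
```

Posting more assignments per HIT raises agreement among answers but multiplies the cost, which is the kind of tuning the paper says is currently done by hand.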
As proof of concept, the researchers ran an experiment on Amazon Mechanical Turk that involved joining two sets of images, with workers deciding whether two photos showed the same person and should be grouped together or kept distinct. Done the traditional way, this HIT cost $67. With Qurk, the cost fell to $3.
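The article does not say how Qurk achieved that saving, but a rough Python sketch shows why a crowd-powered join gets expensive and how one plausible tactic, batching several comparisons into a single HIT, would shrink the bill. The set sizes, the batch size, and the batching approach itself are assumptions for illustration only.

```python
def naive_join_hits(left_size, right_size):
    """One HIT per candidate pair: every left image against every right image."""
    return left_size * right_size


def batched_join_hits(left_size, right_size, pairs_per_hit):
    """HITs needed if each HIT shows several pairs at once (assumed tactic)."""
    total_pairs = left_size * right_size
    # Integer ceiling division: a final, partly filled HIT still counts.
    return (total_pairs + pairs_per_hit - 1) // pairs_per_hit


# Hypothetical example: two sets of 20 photos, 10 pairs shown per HIT.
naive = naive_join_hits(20, 20)
batched = batched_join_hits(20, 20, 10)
```

Because the number of pairs grows with the product of the two set sizes, any reduction in the number of HITs posted translates directly into lower cost.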