Wikipedia is implementing an artificial intelligence (AI) effort to help it rein in online vandals who make malicious edits that must be vetted and deleted, and to scale its limited editing resources without discouraging the participation of new contributors and editors.
Data researchers and specialists at the online, community-sourced and community-edited encyclopedia have assembled a package of machine learning tools to create a Web service called ORES, an acronym that stands for Objective Revision Evaluation Service.
The goal is to equip machines to do what they do well and allow human editors to do what human editors do well, thus increasing the number of revisions that can be vetted and articles that can be updated.
Aaron Halfaker, senior research scientist at the Wikimedia Foundation, told InformationWeek in an email that he estimates Wikipedia's volunteer communities put about 12 million labor hours per year into updating the online encyclopedia.
"Wikipedia operates at a massive scale," Halfaker said. "We get about 500,000 edits per day. The communities that work on these wikis put a massive amount of time and attention into both building and curating the encyclopedia's content…Automation like the type these AIs provide are immensely valuable for efficiently applying the time and attention to the problem of writing a high quality encyclopedia."
The Wikimedia Foundation announced the ORES machine learning plans in a blog post this week.
And while the project is just launching now on Wikipedia, its roots go back to 2007. Up until then the online encyclopedia had increased the ranks of its volunteer community editors at a rapid clip. Founded in 2001, Wikipedia grew from hundreds of editors to thousands of editors in 2004 and peaked in 2007 at 56,400 active editors. But then those numbers began to decline.
That's because Wikipedia implemented bots to fix the vandalism problem, in which vandals change Wikipedia entries with the intent to inflict harm. And while the new bots were good at fixing vandalism quickly, they also had an unintended side effect. Newer editors were more likely than their predecessors to have their first contributions rejected.
Halfaker discovered the problem through a research project that he began as a student and that ultimately led to the publication of a 2009 paper titled "The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline."
"Several changes the Wikipedia community made to manage quality and consistency in the face of a massive growth in participation have ironically crippled the very growth they were designed to manage," Halfaker and his co-authors wrote in their 2009 research paper. "Specifically, the restrictiveness of the encyclopedia’s primary quality control mechanism and the algorithmic tools used to reject contributions are implicated as key causes of decreased newcomer retention."
Enter machine learning. The ORES service runs a "precacher" that scores edits as they happen and caches the results so the service can respond to requests more quickly, Halfaker told InformationWeek. The service will score any edit in Wikipedia, recent or historic.
It currently supports 14 languages, and the organization is "aggressively adding support for new languages as fast as we can recruit translators," Halfaker said.
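The precaching behavior Halfaker describes can be illustrated with a minimal sketch. This is not ORES code: the class name `PrecachingScorer`, the `on_new_edit` hook, and the `toy_score` stand-in model are all hypothetical, chosen only to show the general idea of scoring an edit once when it arrives and answering later requests from a cache.

```python
from typing import Callable, Dict


class PrecachingScorer:
    """Hypothetical sketch: score each revision once, then serve it from a cache."""

    def __init__(self, score_fn: Callable[[int], float]) -> None:
        self._score_fn = score_fn            # assumed stand-in for a trained model
        self._cache: Dict[int, float] = {}   # revision ID -> cached score
        self.model_calls = 0                 # how many times the model actually ran

    def on_new_edit(self, rev_id: int) -> None:
        """Called as an edit happens, so its score is ready before anyone asks."""
        self.score(rev_id)

    def score(self, rev_id: int) -> float:
        """Return a revision's score, running the model only on a cache miss."""
        if rev_id not in self._cache:
            self.model_calls += 1
            self._cache[rev_id] = self._score_fn(rev_id)
        return self._cache[rev_id]


def toy_score(rev_id: int) -> float:
    """Made-up scoring rule for illustration only; returns a value in [0, 1)."""
    return (rev_id % 100) / 100.0


scorer = PrecachingScorer(toy_score)
scorer.on_new_edit(1234)      # scored when the edit arrives
cached = scorer.score(1234)   # later request is answered from the cache
historic = scorer.score(42)   # an older revision is scored on demand
```

In this sketch a recent edit costs one model run no matter how many times it is requested, while a historic revision that was never precached is scored on first request and cached from then on.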
The service is intended to offer additional benefits, too. Editors do a lot of curation work such as categorization, linking, and quality assessments, and Halfaker said that's the kind of work that machines do very well: work that involves sorting, shuffling, and organizing.
"With the AIs that ORES supports, we can help editors direct their attention to articles that need specific types of work, edits that need review and newcomers that need support," he said. "Honestly, I think this is just the tip of the iceberg and that by opening up these services, we're letting editors think creatively about what they would like to do with the signals they provide."
Jessica Davis is a Senior Editor at InformationWeek. She covers enterprise IT leadership, careers, artificial intelligence, data and analytics, and enterprise software. She has spent a career covering the intersection of business and technology.