Wikia Search Gets Distributed Web Crawler - InformationWeek



Jimmy Wales buys LookSmart's Grub distributed search software and places it under an open source license ahead of a public release later this year.

Wikia, Inc., a provider of community Web sites that users can edit, said on Friday that it had acquired distributed search software called Grub to enhance the company's forthcoming wiki-inspired search engine.

At the O'Reilly Open Source Convention (OSCON), Wikia co-founder Jimmy Wales announced the acquisition of Grub from search engine LookSmart and the release of the software under an open source license. Financial terms of the deal were not disclosed.

In much the same way that Wikipedia relies on the distributed brain power of the Internet community, Wikia Search aims to make use of the distributed processing power of Internet-connected computers.

"That's a very loose analogy but the idea is that you have a lot of spare bandwidth that you're not using a lot of the time, and if you want to use it to do something, this would be something you could do with it," said Wales. "This tool, it's not really a tool where people will be making editorial judgments, so it's different."

As a distributed program, Grub benefits incrementally from each user who installs and runs the software. The Grub client will make local bandwidth, processor time, and storage space available so that Wikia Search, once it launches, can crawl and index Web pages.
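
For readers who want a concrete picture of that division of labor, here is a minimal sketch of how a distributed crawl might be coordinated. It is not Grub's actual protocol or code: the `Coordinator` class, `run_client`, `fake_fetch`, and the `example.com` URLs are all hypothetical names chosen for illustration, and the "fetch" is simulated so the example runs without network access.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical central server that hands URLs to volunteer clients."""
    pending: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)

    def add_urls(self, urls):
        # Queue each URL once; skip anything already crawled or queued.
        for url in urls:
            if url not in self.results and url not in self.pending:
                self.pending.append(url)

    def lease(self, batch_size):
        # Give one client up to batch_size URLs to crawl.
        batch = []
        while self.pending and len(batch) < batch_size:
            batch.append(self.pending.popleft())
        return batch

    def report(self, client_id, crawled):
        # Record which client fetched each URL and what it sent back.
        for url, summary in crawled.items():
            self.results[url] = (client_id, summary)


def fake_fetch(url):
    # Stand-in for a real HTTP fetch (assumption: no network in this sketch).
    return f"fetched {url}"


def run_client(client_id, coordinator, fetch, batch_size=2):
    # One unit of volunteer work: lease a batch, fetch it, report back.
    batch = coordinator.lease(batch_size)
    crawled = {url: fetch(url) for url in batch}
    coordinator.report(client_id, crawled)
    return len(crawled)


def simulate():
    coordinator = Coordinator()
    coordinator.add_urls([f"http://example.com/page{i}" for i in range(5)])
    clients = ["client-a", "client-b", "client-c"]
    # Each added client takes a share of the queue on every round.
    while coordinator.pending:
        for client_id in clients:
            run_client(client_id, coordinator, fake_fetch)
    return coordinator


if __name__ == "__main__":
    for url, (client_id, _) in simulate().results.items():
        print(client_id, url)
```

The point of the sketch is only that each additional client drains part of the shared queue, which is the incremental benefit described above. A real system would also need to handle clients that lease work and never report back.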

"Of the various pieces of the puzzle that we need to create the full search engine, this is one of them," said Wales. "We're planning to have first public Web site available by the end of this year."

Wikia Search will rely on Lucene, a Java-based open source indexing and search library that powers search services at sites like Digg and Joost, and will probably use Nutch, an open source search engine built atop Lucene.

Though the components of Wikia Search are still being decided, people will play a major role. "We're definitely intending to have human input into the search results, through the social Web site that we're designing right now," said Wales.

Despite the potential problems of involving people in the search process, Wales believes that search engine spammers can be kept in check by the community. "If people are abusing the system, then they should be kicked out," he added.
