7/27/2007
05:52 PM

Wikia Search Gets Distributed Web Crawler

Jimmy Wales buys LookSmart's Grub search software and releases it under an open source license, ahead of a public launch of Wikia Search later this year.

Wikia, Inc., a provider of community Web sites that users can edit, said on Friday that it had acquired distributed search software called Grub to enhance the company's forthcoming wiki-inspired search engine.

At the O'Reilly Open Source Convention (OSCON), Wikia co-founder Jimmy Wales announced the acquisition of Grub from search engine LookSmart and the release of the software under an open source license. Financial terms of the deal were not disclosed.

In much the same way that Wikipedia relies on the distributed brain power of the Internet community, Wikia Search aims to make use of the distributed processing power of Internet-connected computers.

"That's a very loose analogy but the idea is that you have a lot of spare bandwidth that you're not using a lot of the time, and if you want to use it to do something, this would be something you could do with it," said Wales. "This tool, it's not really a tool where people will be making editorial judgments, so it's different."

As a distributed program, Grub benefits incrementally from each user who installs and runs the software. The Grub client will make local bandwidth, processor time, and storage space available so that Wikia Search, once it launches, can crawl and index Web pages.
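
For readers who want a concrete picture of how a volunteer-driven crawl might divide up work, here is a minimal sketch in Python. It is not Grub's actual protocol or code: the `Coordinator` class, the `simulate` function, and the `fake_fetch` stand-in are all hypothetical names invented for illustration, and the "fetch" step is faked so the example runs offline.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical central work queue; an assumption, not Grub's real design."""
    pending: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)

    def add_urls(self, urls):
        """Queue URLs that still need to be crawled."""
        self.pending.extend(urls)

    def lease(self, batch_size):
        """Hand the next batch of URLs (possibly empty) to a volunteer client."""
        batch = []
        while self.pending and len(batch) < batch_size:
            batch.append(self.pending.popleft())
        return batch

    def report(self, client_id, fetched):
        """Record what a client fetched as url -> (client_id, summary)."""
        for url, summary in fetched.items():
            self.results[url] = (client_id, summary)


def fake_fetch(url):
    """Stand-in for an HTTP request; returns a dummy summary (the URL length)."""
    return len(url)


def simulate(client_ids, coordinator, fetch, batch_size=2):
    """Let clients take turns leasing batches until the queue is empty.

    Returns how many URLs each client handled.
    """
    counts = {cid: 0 for cid in client_ids}
    while coordinator.pending:
        for cid in client_ids:
            batch = coordinator.lease(batch_size)
            if not batch:
                break
            # Each client spends its own bandwidth and CPU on its batch.
            coordinator.report(cid, {url: fetch(url) for url in batch})
            counts[cid] += len(batch)
    return counts


if __name__ == "__main__":
    coordinator = Coordinator()
    coordinator.add_urls([f"http://example.com/page{i}" for i in range(5)])
    print(simulate(["a", "b"], coordinator, fake_fetch))
```

In this sketch, adding more client IDs spreads the same queue across more machines, which is the sense in which each additional participant adds incremental capacity.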

"Of the various pieces of the puzzle that we need to create the full search engine, this is one of them," said Wales. "We're planning to have first public Web site available by the end of this year."

Wikia Search will rely on Lucene, a Java-based open source indexing and search library that powers search services at sites like Digg and Joost, and will probably use Nutch, an open source search engine built atop Lucene.

Though the components of Wikia Search are still being decided on, people will play a major role. "We're definitely intending to have human input into the search results, through the social Web site that we're designing right now," said Wales.

Despite the potential problems of involving people in the search process, Wales believes that search engine spammers can be kept in check by the community. "If people are abusing the system, then they should be kicked out," he added.
