Commentary

Stephen Wellman
 

Will Google Be Destroyed By Open Source Search Engines?

Could open source kill the golden egg that laid Google? If Wikia has its way, it just might.

The Wikia project, started by Wikipedia co-founder Jimmy Wales, seeks to turn the process of building a search engine from a multimillion-dollar project into one that could cost just hundreds or thousands of bucks. That's a game changer.

Here is a look at what Wikia hopes to accomplish:



The project, which was started by Wikipedia co-founder Jimmy Wales, consists of four components: indexing the Web, developing a search engine application, devising an algorithm, and using people to help filter sites and rank results.

One of the most expensive components of a search engine is the effort needed to index the Web. Companies have to buy servers and software to crawl the Web looking at what's on every page, to create a comprehensive list of what's on the Web.

Well, how will Wikia be able to provide all this search engine technology and service -- especially crawling the Web -- for free? Open user participation, of course:

The cost of indexing the Web is one of the main hurdles to starting a search engine, and for-profit companies have raised the bar year after year by indexing the Web more and more often. It used to be catalogued once a week, or once a day. Now it's once an hour, or even more often. The high cost of running these crawls has become a competitive weapon.

Wikia believes its crawl of the Web will cost nearly nothing, because it's asking Internet users to help out by downloading Web-crawling software from Grub, which will use their computers during idle time to crawl the Web and send results back to Wikia for the index. So far, a thousand people have downloaded the application, and Wikia CEO Gil Penchina is hoping for 100,000 or more. The goal is to post the entire index online, as well as regular updates, so anyone can use them.
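For readers who want to picture the volunteer crawl described above, here is a rough sketch in Python. It is not Grub's or Wikia's actual code; the function names, the hash-based way of handing out URLs, and the tiny in-memory "Web" are all assumptions made up for illustration. The idea is simply that a coordinator splits the list of URLs into batches, each volunteer machine fetches its batch, and the results are folded back into one shared index.

```python
import hashlib
from collections import defaultdict


def assign_urls(urls, num_volunteers):
    """Split URLs into batches, one per volunteer, using a stable hash.

    Hypothetical scheme: the source does not say how work is divided.
    """
    batches = defaultdict(list)
    for url in urls:
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        batches[int(digest, 16) % num_volunteers].append(url)
    return batches


def crawl_batch(batch, fetch):
    """What a volunteer's machine might do in idle time: fetch each page."""
    return {url: fetch(url) for url in batch}


def merge_results(index, results):
    """Coordinator side: fold one volunteer's results into the shared index."""
    for url, text in results.items():
        for word in set(text.lower().split()):
            index[word].add(url)
    return index


# Made-up stand-in for the Web so the sketch runs without a network.
FAKE_WEB = {
    "http://example.com/a": "open source search",
    "http://example.com/b": "search engine index",
    "http://example.org/c": "volunteer crawl",
}


def build_index(num_volunteers=2):
    """Run the whole loop: assign, crawl, merge."""
    index = defaultdict(set)
    for batch in assign_urls(FAKE_WEB, num_volunteers).values():
        merge_results(index, crawl_batch(batch, FAKE_WEB.get))
    return index
```

In this toy version the finished index comes out the same whether one volunteer or many do the crawling. A real deployment would also have to cope with volunteers that drop out, return stale pages, or send back bad data, none of which is modeled here.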

If I have any skepticism about Wikia, it centers on this piece. I know that distributed computing and its white-hot offspring, grid computing, are big IT trends (and that they can work), but crawling the Web is the competitive advantage that Google, Yahoo, and Microsoft use to maintain their market share. Will a mishmash of random crawls from across the Web really be an adequate substitute for a centralized effort? We'll have to wait and see.

As for Wikia itself, I think that even if the distributed Web crawling doesn't work as well as Google's, just having that option available -- along with open and free search engine parts -- will be the catalyst that both vertical search and local search advocates have been looking for.

In May I predicted that Google would die not from competition with a new, direct rival but from the challenges posed by an army of thousands of tiny niche search engines and Web apps.

We've seen Technorati and Blinkx beat Google at its newer search initiatives. How many more little search engines will we see once Wikia goes live?

