Google Connects Search Appliances For Billion Document Indexing
With 34 GSAs linked together, Google says version 6 could let a company host a search index that's as big as the entire Google Web index in the year 2000.
Google on Tuesday plans to turn its enterprise search effort up a notch with the introduction of a more powerful version of its enterprise search hardware, the Google Search Appliance.
Version 6 of the Google Search Appliance includes a new architecture that allows it to handle far more documents than previous iterations. While the indexing capacity of GSA 6.0 units remains unchanged at 30 million documents, the new servers can be linked together, providing a way to index as many as a billion documents.
With 34 GSAs, each indexing 30 million documents for a combined capacity of about 1.02 billion, a company can host a search index as big as the entire Google Web index in the year 2000, in less space than five standard server racks.
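The arithmetic behind the 34-unit figure can be checked with a short sketch. It assumes only the numbers given in the article (30 million documents per unit, a one-billion-document target); the function names are illustrative and are not part of any Google product.

```python
# Per-unit capacity as stated in the article.
DOCS_PER_UNIT = 30_000_000


def units_needed(total_docs: int, docs_per_unit: int = DOCS_PER_UNIT) -> int:
    """Smallest number of linked units whose combined capacity covers total_docs."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_docs // docs_per_unit)


def combined_capacity(units: int, docs_per_unit: int = DOCS_PER_UNIT) -> int:
    """Total documents that `units` linked appliances could index."""
    return units * docs_per_unit


if __name__ == "__main__":
    print(units_needed(1_000_000_000))   # 34
    print(combined_capacity(34))         # 1020000000
```

Thirty-three units would fall just short at 990 million documents, which is why a billion-document index takes 34.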
But most companies don't need to search that many documents, or at least don't think they do.
"When I talk to CIOs, most companies still index less than 10% of their content," said Nitin Mangtani, lead product manager for Google enterprise search products. "If you're not indexing your content, your users cannot find things. We believe that with this release, some of the barriers that were stopping organizations from indexing their content will be taken away."
GSA 6.0 also includes Ranking Framework, Node Biasing, and Collection Biasing, features that allow administrators to make certain documents more or less relevant based on a variety of criteria. And it includes social features like Query Suggestions and User-Added Results to improve the search experience.
Google has also added a hybrid security model that lets administrators choose between the greater speed of "early binding" -- applying access policy at index time, so that documents excluded by policy never enter the index -- and the greater security of "late binding" -- building a complete index and denying access at query time according to the most current policy.
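The trade-off between the two modes, as the article describes them, can be illustrated with a toy in-memory index. This is a sketch under stated assumptions, not the GSA's implementation or API: the `Doc` class, the group-based access check, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    groups: frozenset  # hypothetical ACL: groups allowed to read this document


def allowed(doc: Doc, groups: frozenset) -> bool:
    """True if any of the given groups may read the document."""
    return bool(doc.groups & groups)


def early_binding_index(docs, policy_groups):
    """Apply policy once, at index time: excluded documents never enter the index."""
    return [d for d in docs if allowed(d, policy_groups)]


def early_binding_search(index, term):
    """No per-query access check, which is where the speed advantage comes from."""
    return [d.doc_id for d in index if term in d.text]


def late_binding_search(docs, term, user_groups):
    """Search a complete index, then check each hit against the policy in force now."""
    return [d.doc_id for d in docs if term in d.text and allowed(d, user_groups)]


DOCS = [
    Doc("hr-1", "salary review", frozenset({"hr"})),
    Doc("eng-1", "design review", frozenset({"eng", "hr"})),
]

if __name__ == "__main__":
    index = early_binding_index(DOCS, frozenset({"eng"}))
    print(early_binding_search(index, "review"))
    print(late_binding_search(DOCS, "review", frozenset({"hr"})))
```

In the early-binding path, a policy change has no effect until the index is rebuilt; in the late-binding path every query pays for the access check but always reflects the current policy.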
Going forward, Google's entry-level GSA will be the GB-7007, a 2U rack server, and its high-end model will be the GB-9009, a 2U rack server with 3U of attached storage. Both are based on the Dell PowerEdge R710 server, though the GB-9009 has about three to four times the processing power, according to Mangtani.