Amazon reached agreements with 190 publishers and digitized 120,000 books through optical character recognition (OCR) scanning. It then indexed the full text of its digitized library to create Search Inside the Book on Amazon.com.
To use the service, site visitors need only type a topic or string of words into the regular Amazon search window for books. Instead of returning only a particular title or a set of titles by a given author, a search for the topic returns a list of results drawn from the text of the books, much as a regular search engine does.
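Amazon describes its search technology as proprietary, so the following is only a minimal sketch of the general technique the passage describes: building an index over scanned page text and answering a query with a ranked list of pages. The function names, the sample titles, and the count-based ranking are all assumptions made for illustration, not Amazon's method.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> {page_id: occurrence count}.

    `pages` maps a hypothetical (title, page_number) pair to the text
    that OCR produced for that page.
    """
    index = defaultdict(lambda: defaultdict(int))
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word][page_id] += 1
    return index


def search(index, query):
    """Return pages containing every query word, most occurrences first."""
    words = tokenize(query)
    if not words:
        return []
    # Start with pages containing the first word, then intersect with the rest.
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))
    # Naive relevance score (an assumption): total occurrences of the query words.
    scored = [(sum(index[w][p] for w in words), p) for p in candidates]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page_id for _, page_id in scored]


# Invented sample data; these are not real catalog entries.
sample_pages = {
    ("Book A", 1): "Sailing knots and rope work",
    ("Book A", 2): "Rope care and storage",
    ("Book B", 7): "Climbing rope, rope anchors and knots",
}
sample_index = build_index(sample_pages)
results = search(sample_index, "rope knots")
```

Here `results` lists the page in "Book B" first because the query words occur more often on it. A production system would use a far more elaborate relevance measure.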
"We try to bring back the results with the most relevancy" to the visitor's named topic, says Udi Manber, Amazon's chief algorithm officer. The online retailer isn't using the full text-search capability of a commercial relational database or an established search engine, he says. "It's all our proprietary technology. The secret sauce is how we put it together."
But the new search mechanism has produced "serious concern" among members of the Authors Guild, according to The New York Times. Authors say they've been able to download multiple consecutive pages of their books through clever searches, the paper reported this week, and the new feature may reduce sales because searchers can get the portions of a book that they want.
Manber says the feature is more akin to patrons visiting a bookstore and leafing through a prospective purchase than to a giveaway of content. Amazon left the decision of which books to digitize up to publishers, he says.
To view the pages that contain a search result, site visitors must give their E-mail addresses to Amazon and create a password.
Amazon has created a digital library of "dozens and dozens of terabytes" because it includes not only the text of the books but image information as well.
Search-engine company Google Inc. has said it has begun talks with book publishers to compile its own searchable database of thousands of volumes, according to Publishers Weekly, a book-publishing trade magazine.