Organizing the world's information and making it universally accessible--Google's ambition--has its ramifications. In handling billions of search requests, Google generates terabytes of data about the Web searches of its users, a wealth of information the company mines regularly but guards vigorously. For six months, the U.S. Department of Justice, in an effort to uphold the Child Online Protection Act, has been pressing to get its hands on some of that data. What happens next could reveal as much about Google as Google knows about its users.
The Justice Department subpoenaed Google in August, demanding two months of search queries and all the URLs in its index. Negotiations led to a narrowing of the government's request, to 5,000 queries and 50,000 URLs, but the sides hit an impasse. Unlike AOL, MSN, and Yahoo, which gave the government what it sought, Google contested the order. A federal judge will hear Google's case in San Jose, Calif., this week.
Google argues that the data the feds want isn't relevant to the government's effort and that some of what's requested--the number, length, and type of queries processed--constitutes a trade secret. Google also objects to the work involved in producing the requested information and worries that "being forced to compromise its privacy principles" would erode customer confidence.
But it's not just a matter of principle. Google has always been coy about how much personal information it collects. Now questions over how much it knows and who gets access to that information are becoming more important. Google has become the canary in the data mine.
Google has questioned whether the government's subpoena complies with the Electronic Communications Privacy Act, which limits the circumstances under which electronic data and communications can be disclosed to the government and other entities. The act covers two types of network services: electronic communications and remote computing. The government claims Google provides neither, but attorney Richard Wiebe disagrees. In a brief filed by the Center for Democracy and Technology, Wiebe argues that Google, as an outsourcer of search functions, qualifies as a remote computing service provider.
It's not the first time Google has had to stare down the double barrels of user privacy and government compliance, and given Google's rich archive of user data, it probably won't be the last. Anticipating interest in its data stores from authorities in China, where its legal options are fewer, Google elected not to offer Gmail or Blogger from servers based in that country until "we can do so in a manner that respects our users' interests in the privacy of their personal communications."
In the States, Justice is trying to prove that the Child Online Protection Act is necessary by demonstrating that Internet filtering software doesn't adequately protect minors from viewing sexually explicit material. To make its case, the government aims to use data from search engines to perform a statistical analysis of how effectively Internet filters screen out pornography. The feds aren't looking to include information that can be tied to individual users, but some worry the government's request could become a precedent for just that.
Search keywords and associated URLs aren't exactly trade secrets. Search engine queries are routinely sold, stripped of personally identifiable information that might have been gleaned from the original query. InfoSpace, which owns meta-search engines Dogpile.com and MetaCrawler.com, sells keyword lists to online advertising companies.