Privacy concerns prompted the search company to think about anonymizing selected server logs.
"After talking with leading privacy stakeholders in Europe and the U.S., we're pleased to be taking this important step toward protecting your privacy," said Peter Fleischer, privacy counsel-Europe, and Nicole Wong, deputy general counsel, in a blog post. "By anonymizing our server logs after 18-24 months, we think we're striking the right balance between two goals: continuing to improve Google's services for you, while providing more transparency and certainty about our retention practices."
Google says it will change some of the digits in logged IP addresses and alter cookie file information to anonymize users. The technical details have yet to be worked out.
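Because Google has not published the details, the following is only a minimal sketch of one common approach to this kind of change: zeroing the low-order bits of a logged IPv4 address. The function name `anonymize_ip` and the choice to keep the first 24 bits are assumptions made for illustration, not Google's announced method.

```python
import ipaddress


def anonymize_ip(address: str, keep_bits: int = 24) -> str:
    """Zero the low-order bits of an IPv4 address.

    keep_bits is a hypothetical parameter: how many leading bits survive.
    With the default of 24, the final octet is replaced by 0.
    """
    # strict=False lets ip_network accept an address with host bits set
    # and masks them off for us.
    network = ipaddress.ip_network(f"{address}/{keep_bits}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    # 192.0.2.0/24 is a range reserved for documentation examples.
    print(anonymize_ip("192.0.2.143"))
```

Masking the final octet means up to 256 addresses share one logged value, which makes identification less likely but does not rule it out.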
Don't expect a guarantee of anonymity, however. "It is difficult to guarantee complete anonymization, but we believe these changes will make it very unlikely users could be identified," Google says.
Privacy advocate Lauren Weinstein, in a message posted to David Farber's Interesting People e-mail discussion list, called the move "an immensely positive sea change to Google's attitude toward this data."
Google has already demonstrated noteworthy commitment to privacy through its decision to fight a U.S. Department of Justice subpoena received in 2005 demanding user search data, a battle that AOL, Microsoft, and Yahoo shied away from. The court hearing the case eventually denied the U.S. Department of Justice access to Google search data and allowed it only a small portion of the URLs sought from Google's Web index.
A spokesperson for the DOJ declined to comment on Google's change in policy. Federal officials have said that the U.S. government is opposed to mandatory data destruction requirements and have urged ISPs and Internet companies to retain server log data for at least two years to assist law enforcement efforts.
It remains to be seen how eager Google and other companies will be to keep any kind of data about their users once they see the bill.
David McClure, president and CEO of the U.S. Internet Industry Association, says the Department of Justice needs to be much clearer about what it is asking for and how the Internet industry might be able to accommodate legitimate government interests.
"We don't mind the investment, but we hate throwing money away and making consumers pay more for something we know from the outset won't work," says McClure, who adds that anyone with any level of technical sophistication will be able to avoid being identified by an IP address.
McClure points out that the cost of keeping a gigabyte of data for a year is about $7.60, twice that if the data is stored in a secure data center. According to an IDC study commissioned by storage vendor EMC, 161 exabytes of digital information were created and copied in 2006. One exabyte equals a billion gigabytes. By 2010, IDC expects the volume of data created and copied to rise sixfold to 988 exabytes, reflecting a compound annual growth rate of 57%.
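The figures above can be checked with a few lines of arithmetic. This is a rough sketch that uses only the numbers quoted in the article; the helper names `storage_cost` and `cagr`, and the 1,000-gigabyte log used as an example, are assumptions for illustration.

```python
COST_PER_GB_YEAR = 7.60  # dollars, McClure's figure
SECURE_MULTIPLIER = 2    # "twice that" in a secure data center


def storage_cost(gigabytes: float, years: float, secure: bool = False) -> float:
    """Cost in dollars of keeping `gigabytes` of data for `years`."""
    rate = COST_PER_GB_YEAR * (SECURE_MULTIPLIER if secure else 1)
    return gigabytes * years * rate


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    # Hypothetical 1,000 GB of server logs kept for the two years
    # federal officials have urged.
    print(f"standard: ${storage_cost(1000, 2):,.0f}")
    print(f"secure:   ${storage_cost(1000, 2, secure=True):,.0f}")

    # IDC's 161 exabytes (2006) growing to 988 exabytes (2010).
    print(f"growth multiple: {988 / 161:.1f}x")
    print(f"CAGR: {cagr(161, 988, 4):.0%}")
```

The growth multiple comes out just above six, and the growth rate rounds to the 57% IDC reports.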