1.1% of the Web pages indexed by Google and MSN are sexually explicit, and content filtering software will miss up to 60% of those pages while blocking up to 23.6% of non-explicit pages, according to expert testimony in the federal government's quest to sustain the Child Online Protection Act.

Thomas Claburn, Editor at Large, Enterprise Mobility

November 14, 2006

Of the billions of Web pages indexed by Google and MSN, 1.1% are sexually explicit, and content-filtering software will miss between 8.8% and 60.2% of those pages while blocking between 0.4% and 23.6% of "clean" Web pages.

These figures reflect the testimony of Philip Stark, a professor of statistics at the University of California, Berkeley, who submitted his analysis of Internet content filtering in court earlier this year on behalf of the federal government's effort to sustain the Child Online Protection Act (COPA).

Though COPA was ruled unconstitutional, the Supreme Court directed the Philadelphia court hearing the case to evaluate how technology might affect the constitutionality of the statute. Attorneys for the Department of Justice and the American Civil Liberties Union have been addressing this issue since hearings resumed in late October. Arguments are tentatively scheduled to conclude on Monday. A final ruling could take months.

COPA calls for penalties of up to $50,000 per day and up to six months in prison for making material deemed "harmful to minors" available online, regardless of its value for adults.

The government's argument, supported by Stark's analysis, is that content filtering doesn't work and that COPA, signed into law in 1998 by President Clinton but never enforced, is thus necessary to protect minors online.

In documents released on Monday, Stark estimates the prevalence of sexual content on the Web, based on samples of 50,000 Web sites from Google's search index and 1 million Web sites from MSN's search index that were obtained by government subpoena.

Google, which fought the government's demand for information, won the right to withhold user search queries but was required to provide a sample of its index. AOL, MSN, and Yahoo provided the government with an undisclosed number of search queries from different one-week periods over the summer of 2005.

Stark's findings include: 1.1% of the Google and MSN indexes consist of sexually explicit pages; of these, 44.2% in the Google index and 56.7% in the MSN index are hosted in the U.S.; 6% of Web searches retrieve at least one sexually explicit Web page; and 1.7% of search results are sexually explicit.

Using unconfirmed estimates that Google's index contains about 24 billion documents, Stark's figures suggest that there are at least 264 million sexually explicit Web pages on the Net.
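The arithmetic behind that figure can be checked with a short sketch. It assumes only the two numbers quoted above (a 1.1% explicit share and an unconfirmed index size of about 24 billion documents); the function and constant names are illustrative and do not come from Stark's report.

```python
from fractions import Fraction

# Figures quoted in the article; both are estimates, not measured totals.
EXPLICIT_SHARE = Fraction(11, 1000)  # 1.1% of indexed pages
INDEX_SIZE = 24_000_000_000          # unconfirmed size of Google's index


def estimate_explicit_pages(index_size: int, share: Fraction) -> int:
    """Return the page count implied by applying a share to an index size."""
    # Fraction keeps the multiplication exact, avoiding float rounding.
    return int(index_size * share)


explicit_pages = estimate_explicit_pages(INDEX_SIZE, EXPLICIT_SHARE)
print(f"{explicit_pages:,}")  # prints 264,000,000
```

Because the index size is unconfirmed, the result is only as reliable as that input: a different index estimate scales the page count proportionally.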

Among the 500 most popular search terms listed in the 20 million queries inadvertently released by AOL over the summer, "porn" and "sex" ranked 41st and 43rd respectively, according to a list published at DontDelete.com. (The most popular search term in the AOL data set is "Google.")

While Stark's analysis of filter performance appears to support the government's contention that COPA is necessary to do what technology can't, ACLU attorney Catherine Crump argues that the government hasn't met its burden of showing that "this flawed law is more effective than Internet filtering technology."

The government must prove that COPA serves a compelling government interest, that the law is tailored narrowly enough not to suppress protected speech, and that there are no alternatives that are less restrictive of the right to free speech.

One less restrictive, more effective alternative, suggests Crump, involves keeping computers in a central location in the home, where kids can be monitored. She also says the government could safeguard children through better-crafted laws and public information campaigns to educate parents.

"COPA would chase a tremendous amount of valuable speech off the Net," Crump says, citing as an example the pictures of torture and sexual abuse at Iraq's Abu Ghraib prison, which could easily be deemed "harmful to minors."

Beyond its constitutional failings, Crump says COPA is behind the times, noting that it doesn't apply to Web sites overseas or sexual content that's distributed by some means other than the Web, such as peer-to-peer or IM networks.

