09:39 AM

As Blogs Grow, So Does Spam

Inboxes aren't the only things getting flooded by unwanted content. "Splogs" are taking root on the Web and threatening the existence of some sites by inundating them with fake blog posts.

Spam blogs are a big headache for Taylor Bayouth.

He wants the Web site he founded, tBlog.com, a combination social network and blog publishing platform, to parse the words of its more than 200,000 members to update their profiles every time they post. Bayouth believes the "thought matching" system would be unique. But one of the biggest roadblocks he faces - besides competing against much bigger rivals, like MySpace.com - is the number of spam blogs that hit his site.

"Spam is our No. 1 enemy," he says. "This is what we battle with on a daily basis. Spam could literally just kill this thing."

It may be little consolation to Bayouth, but in fact, there are millions of spam blogs - or "splogs" - with more added every day. And they aren't going away.

"It's not getting any better and it's probably getting worse," says Tim Finin, a computer science professor at University of Maryland, Baltimore County, who helped write a paper about detecting splogs that was presented last month at an American Association for Artificial Intelligence conference.

That may be true, but Natalie Glance, senior research scientist with Nielsen BuzzMetrics, which tracks and analyzes what consumers say online about companies and their brands, says there is some good news: blog search engines are getting better at separating the garbage from their results.

"Identifying spam isn't all that hard," she says. "The thing is it's a game of escalation."

Nielsen BuzzMetrics operates its own search engine, BlogPulse, which reported Wednesday morning that it had identified more than 26 million blogs, nearly 87,000 of them new within the previous 24 hours. The measurement company indexed 828,890 posts in the same time period. By comparison, search engine Technorati says the blogosphere is even bigger and doubling in size every six months. The company tracks more than 35 million blogs and about 1.2 million new posts each day, which works out to about 50,000 per hour. It reports that about 9 percent of new blogs are spam, and that 60 percent of pings - the messages blogs send to a centralized network service to announce newly published posts - are from known spam sources. Technorati says it blocks these spam pings, or spings.
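The arithmetic behind those figures is easy to check. The short Python sketch below uses only the numbers quoted above; the function names are illustrative, and the projection assumes the six-month doubling rate stays constant, which is a simplification rather than anything Technorati has promised.

```python
# Figures quoted in the article.
TECHNORATI_POSTS_PER_DAY = 1_200_000
TECHNORATI_BLOGS = 35_000_000
BLOGPULSE_POSTS_PER_DAY = 828_890
DOUBLING_PERIOD_MONTHS = 6


def posts_per_hour(posts_per_day):
    """Convert a daily post count into an average hourly rate."""
    return posts_per_day / 24


def projected_blogs(current, months, doubling_months=DOUBLING_PERIOD_MONTHS):
    """Project a blog count forward, ASSUMING steady doubling every
    `doubling_months` months (a simplification for illustration)."""
    return current * 2 ** (months / doubling_months)


if __name__ == "__main__":
    print("Technorati posts per hour:", posts_per_hour(TECHNORATI_POSTS_PER_DAY))
    print("BlogPulse posts per hour:", round(posts_per_hour(BLOGPULSE_POSTS_PER_DAY)))
    print("Blogs after 12 months at this pace:", projected_blogs(TECHNORATI_BLOGS, 12))
```

At a steady doubling every six months, 35 million blogs would become 70 million in half a year and 140 million in a year.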

"Spam blogs and their cousins spings continue to present infrastructure providers like Technorati a challenge," founder and CEO David Sifry wrote on the site's blog Monday. "Aside from a few notable spam storms ... the high level of interesting, original content being created greatly outweighs the fake or duplicate content listed on splogs."

A study last December by the eBiquity Research Group at UMBC found that the share of spam pings is even higher, nearly 75 percent. eBiquity also discovered that more than half of the blogs pinging one particular ping server, weblogs.com, are spam.

Finin, who helps run eBiquity at UMBC, says Technorati is as good as any search engine at picking out splogs, but that one out of every five blogs it counts is actually fake.

The people who create splogs - or, more accurately, the people who write the programs that do it for them - rarely intend for anyone to actually read their posts. They're just building a giant clump of links that refer back to some other site - one that, say, promotes gambling or sells something like Viagra - and thus increase the page rank of that site on different search engines.
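Why a clump of links helps can be shown with a toy model. The Python sketch below runs the textbook PageRank power iteration on a made-up three-page web, then adds 20 splogs that all point at one target. It is an illustration only: the page names are hypothetical, and real search engines use far more elaborate ranking and spam defenses than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Pages with no outgoing links spread their rank evenly over all pages.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: two legitimate sites link to each other,
# and nobody links to the site the spammer wants to promote.
HONEST_WEB = {"news": ["shop"], "shop": ["news"], "pills": []}


def with_splogs(links, target, count):
    """Return a copy of `links` plus `count` splogs that each link to `target`."""
    spammed = {page: list(targets) for page, targets in links.items()}
    for i in range(count):
        spammed[f"splog{i}"] = [target]
    return spammed


if __name__ == "__main__":
    before = pagerank(HONEST_WEB)
    after = pagerank(with_splogs(HONEST_WEB, "pills", 20))
    print("before: pills=%.3f shop=%.3f" % (before["pills"], before["shop"]))
    print("after:  pills=%.3f shop=%.3f" % (after["pills"], after["shop"]))
```

In this toy web the promoted site starts out ranked well below the legitimate ones and ends up above them once the splogs are added, even though no person ever reads a splog post.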

Then, on the off chance that anyone might actually read their junk posts, the creators put ads on them that generate a small commission, usually a fraction of a dollar, for every click.
