Spam Blogs Pollute Internet Searches - InformationWeek

5/11/2006
04:15 PM

Spam Blogs Pollute Internet Searches

Spam blogs, known as splogs, are invading the Web by the millions. Blog search engines are trying to stamp them out, but more work needs to be done.

Spam blogs give Taylor Bayouth a big headache.

He wants the social network and blog publishing site he founded, tBlog.com, to parse the words of its 200,000 members every time they post to a blog and use that analysis to update their profiles. Bayouth believes such a "thought matching" system would be unique. But one of the biggest problems he faces--besides competing against much bigger rivals, such as MySpace.com--is the amount of spam disguised as blogs that hits his site.

"This is what we battle with on a daily basis," Bayouth says. "Spam could literally just kill this thing."

It's a battle that likely won't end soon. There are millions of spam blogs, or splogs, with more added every day. "It's not getting any better, and it's probably getting worse," says Tim Finin, a computer science professor at University of Maryland, Baltimore County, who co-wrote a paper about detecting splogs that was presented at an American Association for Artificial Intelligence conference in March.

Search engines designed specifically to sift through blogs, such as BlogPulse and Technorati, claim to be getting better at separating out the garbage. "Identifying spam isn't all that hard," says Natalie Glance, senior research scientist with Nielsen BuzzMetrics, which runs BlogPulse and tracks and analyzes what consumers say online about companies. "It's a game of escalation."

Who's Splogging Whom?

This blog looks legit, but click a link and you'll be looking at ads for golf vacations, hard drive repairs, and divorce lawyers.

The people who create splogs--or, more accurately, the people who write the programs that create splogs--rarely intend for anyone to actually read their posts, which are often poorly written or even strings of nonsensical words. They're just building a giant clump of links that refer back to other sites, perhaps those that promote gambling or sell Viagra. When people click on those links, they increase the page rank of those sites on various search engines. Splog creators also sometimes place ads on their splogs that generate a small commission, usually a fraction of a dollar, for every click.
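
The "giant clump of links" pattern described above lends itself to a simple illustration. The sketch below is not how BlogPulse, Technorati, or tBlog.com detect splogs; the article doesn't say. It is a minimal heuristic, assuming that a post made up mostly of URLs is suspicious. The function names and the 0.3 threshold are hypothetical choices made for the example.

```python
import re

# Assumption: a token counts as a link if it starts with http:// or https://.
LINK_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def link_density(post_text: str) -> float:
    """Return the fraction of whitespace-separated tokens that are URLs."""
    tokens = post_text.split()
    if not tokens:
        return 0.0
    links = sum(1 for token in tokens if LINK_PATTERN.match(token))
    return links / len(tokens)


def looks_like_link_clump(post_text: str, threshold: float = 0.3) -> bool:
    """Flag a post whose link density meets a hypothetical threshold."""
    return link_density(post_text) >= threshold
```

A real filter would likely combine many signals; link density alone would misfire on legitimate link roundups.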

Here's one scenario: You want to test out a new programming language, so you run a blog search on it, hoping to find out about others' experiences with it. You end up at a site that looks like a blog--including a supposed blogger's name, photo, and archive of postings--but click on a posting, and you end up at a site advertising hard drive repair.

In a daily report run last month, BlogPulse identified more than 26 million blogs, with nearly 87,000 new ones within the previous 24 hours. The company indexed 828,890 posts in the same time period. Technorati reports an even bigger blogosphere: It tracks more than 35 million blogs and 1.2 million new posts each day, an average of 50,000 per hour. About 9% of new blogs are spam, reports Technorati, and 60% of pings--the messages blogs send to a centralized network service to notify it of a newly published post--are from known spam sources. Technorati says it blocks these spam pings, known as spings. "Spam blogs and their cousins, spings, continue to present infrastructure providers like Technorati a challenge," founder and CEO David Sifry wrote on the site's blog.
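
Blocking pings from known spam sources, as Technorati says it does, can be pictured as a lookup against a blocklist. The following is a rough sketch under stated assumptions, not Technorati's actual system: the `Ping` type, the `filter_pings` function, and the idea of keying the blocklist on hostnames are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Ping:
    """Hypothetical record of a ping: the URL of the blog announcing a post."""
    blog_url: str


def host_of(url: str) -> str:
    """Extract the lowercase hostname, or an empty string if there is none."""
    return (urlparse(url).hostname or "").lower()


def filter_pings(pings, known_spam_hosts):
    """Split pings into (accepted, blocked) using an assumed host blocklist."""
    accepted, blocked = [], []
    for ping in pings:
        if host_of(ping.blog_url) in known_spam_hosts:
            blocked.append(ping)  # a "sping" from a known spam source
        else:
            accepted.append(ping)
    return accepted, blocked
```

A blocklist only catches sources that are already known, which fits the "game of escalation" Glance describes: new spam hosts pass until they are identified and added.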
