Building a Web 2.0 site is only half the battle. Keeping it free of spam and malware is the other half, and a new online tutorial on the subject serves up some solid tips on how to fight -- and win.
Earlier this month, IBM's DeveloperWorks site published a two-part guide dedicated to battling spam on Web 2.0 sites. Here is how the author, developer and computer engineer Uche Ogbuji, explains the problem in a nutshell:
"Spam on the Web is one of the biggest threats to a modern Web developer. The "bad guys" become more and more sophisticated every year in how to vandalize and proliferate ads over any Web 2.0 page they can grasp. To make matters worse, spam is increasingly used to distribute malware. The arms race is on, and Web developers need to know what basic tools are available to battle spam on their Web sites. This two-part installment provides a thorough guide to anti-spam techniques. This first article explains how to assess whether a visitor is a spammer and how to organize site workflow to discourage spam."
It's an interesting read even if you're not an IT professional; although some of the material is technical, most readers with a basic grounding in IT will find it accessible. (The author also offers some historical background on the problem that will be a real trip down memory lane for anyone old enough to remember when the names Canter & Siegel could whip Usenet denizens into fits of rage.)
The first article in the series focuses on the human side of the spam-control game, including tactics for assessing the behavior and intentions of visitors to your company's Web site. That includes, of course, methods for separating real visitors from robots, such as the now-ubiquitous nonce tests. Part Two tackles topics such as methods for combating linkback spam (an especially deadly threat to sites with blogs or community-contributed content), content analysis techniques, community-based anti-spam initiatives, and a discussion of HTTP proxy blacklists and proxy-blocking strategies.
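For readers who haven't met a nonce test before, here is a rough sketch of the general idea in Python. It is not taken from the tutorial: the function names, the secret key handling, the ten-minute expiry, and the in-memory set of used tokens are all assumptions made for illustration. The server hands out a signed, single-use token with each form, and a submission that arrives without a valid, fresh, unused token is treated as suspect, since it probably never loaded the form at all.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical per-deployment secret; a real site would load this from config.
SECRET_KEY = secrets.token_bytes(32)
# Assumed lifetime for a form token (not a figure from the tutorial).
MAX_AGE_SECONDS = 600

# Tokens already redeemed. An in-memory set is only suitable for a demo;
# a real site would need shared, expiring storage.
_used_tokens = set()


def _sign(payload):
    """Return a hex HMAC-SHA256 signature for the payload string."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_nonce(now=None):
    """Create a token to embed in a hidden form field."""
    issued = int(time.time() if now is None else now)
    payload = f"{issued}:{secrets.token_hex(8)}"
    return f"{payload}:{_sign(payload)}"


def check_nonce(token, now=None):
    """Accept a token only if it is well formed, signed, fresh, and unused."""
    now = time.time() if now is None else now
    try:
        issued_text, random_part, signature = token.split(":")
        issued = int(issued_text)
    except ValueError:
        return False  # wrong number of fields or a non-numeric timestamp
    expected = _sign(f"{issued_text}:{random_part}")
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False  # forged or tampered token
    if not 0 <= now - issued <= MAX_AGE_SECONDS:
        return False  # expired, or stamped in the future
    if token in _used_tokens:
        return False  # replayed submission
    _used_tokens.add(token)
    return True
```

In this sketch, a page handler would call `issue_nonce()` when rendering a comment form and `check_nonce()` when the form comes back. A check like this only raises the cost of blind, scripted posting; it does nothing against a bot that fetches the form first, which is why it would normally sit alongside the other techniques the articles cover.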
Both articles also include links to additional online resources. Some of these, such as a guide to more effective proxy-blocking techniques for Apache Web servers, may qualify as essential reading for IT admins who need to get up to speed on this topic in a hurry.
It is hard to overstate the importance of keeping these virtual vandals off your company's Web site before they get a toehold. Spammers, like the malware developers with whom they have so much in common, are highly motivated, highly predatory, and highly intelligent -- if you don't work constantly to stay ahead of them, rest assured they will very quickly get ahead of you.