Google recently began indexing the executable files it finds on Web sites, said Dan Hubbard, Websense's senior director of security. His San Diego-based researchers have created a toolset that lets them automatically search those files for specific strings, such as identifiers known to be used by popular malware packers.
"Google indexes binary files, and we were able to come up with queries that look for certain strings, some of which are very specific, in Windows executables generated by packers," explained Hubbard.
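The string matching Hubbard describes can be illustrated locally, without any search engine. The sketch below is not Websense's toolset, which has not been published; it only assumes that a packed Windows executable carries telltale byte strings, and it uses a small, illustrative list of commonly cited markers (the `PACKER_MARKERS` table and both function names are hypothetical). Real signature sets are larger and more precise, and a marker match alone does not prove a file is malicious.

```python
from typing import Dict, List

# Illustrative markers only (an assumption for this sketch, not a complete
# or authoritative signature set). Packers often leave section names or
# magic strings like these inside the executables they produce.
PACKER_MARKERS: Dict[str, List[bytes]] = {
    "UPX": [b"UPX0", b"UPX1", b"UPX!"],
    "ASPack": [b".aspack"],
    "PECompact": [b"PEC2"],
}


def looks_like_windows_exe(data: bytes) -> bool:
    """Return True if the data begins with the 'MZ' DOS header magic."""
    return data[:2] == b"MZ"


def find_packer_markers(data: bytes) -> Dict[str, List[str]]:
    """Map each packer name to the marker strings found in the data.

    Returns an empty dict if the data does not look like a Windows
    executable or if no marker is present.
    """
    hits: Dict[str, List[str]] = {}
    if not looks_like_windows_exe(data):
        return hits
    for packer, markers in PACKER_MARKERS.items():
        found = [m.decode("ascii") for m in markers if m in data]
        if found:
            hits[packer] = found
    return hits


if __name__ == "__main__":
    # Synthetic bytes standing in for a downloaded file; not a real binary.
    sample = b"MZ" + b"\x00" * 64 + b"UPX0" + b"\x00" * 8 + b"UPX1"
    print(find_packer_markers(sample))
```

A search-engine query for the same strings narrows the candidate files; a local check along these lines would then be one way to confirm which downloads actually contain the markers.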
By combining the Google search API with some homebrewed tools, Websense researchers automated the discovery process and uncovered more than 2,000 malicious Web sites. About 10 to 15 percent of those, said Hubbard, were legitimate sites that had been hacked and so were hosting predatory .exe files. It's Websense's policy, however, not to publicly disclose compromised sites.
The team also uncovered evidence of heretofore unknown attack code.
"We also found previously undetected malcode," Hubbard said, including new Trojan horses, unknown variants of the Bagle and Mytob worms, and malware-writing toolkits. "It's been a lot of help."
Websense won't publish the toolset it created to search Google for malware, but Hubbard said he plans to "roll up a bunch of the tools and share them with other security researchers."
Although Google's indexing of binary file contents shouldn't be considered a large threat, it is evidence of the trend toward Web sites storing and then distributing hostile code.
"Sure, there's a downside to this," said Hubbard. "Malware writers, if they figured out how to do this, could [search for and] download malcode from multiple sites, maybe even see what kind of strings they should put in, or not put in, their binaries."
Search engines have been harnessed by attackers before. In 2004 and 2005, for example, the creator of several MyDoom variants used Google, Yahoo, and other engines to find victims' e-mail addresses. Two years ago, a MyDoom worm generated enough traffic in e-mail address queries to slow down Google's search site.