Security researchers have a new tool for hunting malicious executables on the Web: the Google SOAP Search API.
Malware hunters at Websense Security Labs have figured out a way to use the freely available Google API to find dangerous .exe files sitting on thousands of Web servers around the world.
The Google API uses the SOAP (Simple Object Access Protocol) and WSDL (Web Services Description Language) standards to give developers an easy way to run search queries outside the browser. Because the search engine indexes the contents of executables, Websense was able to write code that searches for strings associated with malware packers.
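To give a rough sense of what a query outside the browser looks like, the sketch below builds the XML envelope for a SOAP search call. It is an illustration under assumptions, not Websense's code: the operation name doGoogleSearch and its parameter names follow the WSDL Google published for the API as best recalled here, the per-parameter type annotations a real client would send are omitted, and the license key and query string are placeholders. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
GOOGLE_NS = "urn:GoogleSearch"  # namespace assumed from the published WSDL


def build_search_envelope(key: str, query: str, start: int = 0,
                          max_results: int = 10) -> bytes:
    """Return a SOAP envelope for one search request (sketch only)."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{GOOGLE_NS}}}doGoogleSearch")

    # Parameter names are assumptions based on the API's WSDL.
    params = [
        ("key", key),                 # developer license key (placeholder)
        ("q", query),                 # the search string itself
        ("start", str(start)),        # zero-based index of the first result
        ("maxResults", str(max_results)),
        ("filter", "true"),
        ("restrict", ""),
        ("safeSearch", "false"),
        ("lr", ""),
        ("ie", "latin1"),
        ("oe", "latin1"),
    ]
    for name, value in params:
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical key and query, for illustration only.
    print(build_search_envelope("YOUR-LICENSE-KEY", "example query").decode())
```

An automated hunt would presumably loop over many query strings and page through results by advancing the start value, which is what makes an API more practical than typing searches into a browser.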
Dan Hubbard, senior director of security and technology research at the San Diego-based Web filtering software company, said the use of the Google API started as an experiment after bloggers noticed some Google search queries returning .exe files.
Hubbard’s research team found that, when Google indexes an executable file, the search engine parses the PE (Portable Executable) format used by Microsoft Windows executables. This means that queries can be written to match items from the internals of the binary.
Hubbard said Websense created code to query “unique identifiers” within the PE file format that would indicate potentially malicious files.
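Websense has not published which identifiers it queries. As an illustration of the kind of information the PE format exposes, the sketch below reads the section names from a Windows executable's headers and flags names that the UPX packer is known to leave behind. The function names and the watch list are hypothetical, and a packed file is not necessarily malicious.

```python
import struct

# Section names left by the UPX packer; an illustrative watch list only.
PACKER_SECTION_NAMES = {"UPX0", "UPX1", "UPX2"}


def pe_section_names(data: bytes) -> list[str]:
    """Return the section names listed in a PE file's section table."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (optional_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    # The section table starts right after the optional header.
    table = pe_offset + 24 + optional_size
    names = []
    for i in range(num_sections):
        entry = table + i * 40          # each section header is 40 bytes
        raw = data[entry:entry + 8]     # the first 8 bytes hold the name
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names


def packer_hits(data: bytes) -> list[str]:
    """Return section names that appear on the packer watch list."""
    return [n for n in pe_section_names(data) if n in PACKER_SECTION_NAMES]
```

Run against a downloaded sample, packer_hits would return the matching section names, or an empty list if none appear.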
“We’re finding literally thousands of sites with malicious code executables. From hacker forums, newsgroups to mailing list archives—they’re all full of executables that Google is indexing,” Hubbard said.
About 15 percent of the results came back from legitimate Web sites hijacked by malicious hackers and seeded with executables.
“We were able to find a lot of compromised sites distributing malware, most likely without the knowledge of the site owner,” Hubbard said.
The queries also turned up pieces of spyware on popular online gaming sites and variants of the virulent Bagle and Mytob worms.
“While we do not believe that the fact that Google is indexing binary file contents is a large threat, this is further evidence of a rise in Web sites being used as a method of storing and distributing malicious code,” Websense executives said in a research note announcing the experiment.
Hubbard said he plans to publish the full results of the experiment and the actual code used in the API queries on private security mailing lists to help other researchers automate the process of finding malicious Web sites.
“At Websense, we’re mining almost 80 million Web sites every 24 hours to look for threats. The big issue is that you can’t [wait anymore] for people to send you malware samples. You have to go out and proactively look for stuff,” Hubbard said.
Researchers from the anti-malware engineering team at Microsoft are also working on an automated way to classify the malware families and variants that attack Windows computers. Microsoft is proposing the use of distance-measure and machine-learning technologies to automatically classify viruses, Trojans, spyware, rootkits and other malicious software.
Googling for Executables
How it works
* Using the freely available Google SOAP Search API, security researchers run automated queries on billions of Web pages.
What they found
* Thousands of malicious binaries, including spyware on poker and casino sites, variants of the Bagle and Mytob worms, and multiple keylogger Trojans.
Who’s hosting malicious .exe files
* Hacker forums, newsgroups, mailing list archives and legitimate Web sites hijacked by malicious hackers.
Source: Websense Security Labs