Search Spammers Hacking More Websites

The head of Google’s Web-spam-fighting team warns that spammers are increasingly attacking websites.

Kristina Grifantini

July 30, 2009

The head of Google’s Web-spam-fighting team, Matt Cutts, warned last week that spammers are increasingly hacking poorly secured websites in order to “game” search-engine results. At a conference on information retrieval, held in Boston, Cutts also discussed how Google deals with the growing problem of search spam.

Search spammers try to gain unfair prominence for their Web pages in search results, thereby making money from the products that these sites offer or from advertising posted on them. The practice, also known as “spamdexing,” exploits the way search engines’ algorithms figure out how to rank different pages for a particular search query. Google’s PageRank algorithm, for instance, in part gives prominence to pages that many other pages on the Web link to. Spammers can exploit this by adding links to their site on message boards and forums and by creating fake Web pages filled with these links. Garth Bruen, creator of the Knujon software that keeps track of reported search spam, says that some campaigns involve creating up to 10,000 unique domain names.
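To see why planted links pay off, consider a toy version of link-based ranking. The sketch below is the simplified textbook form of PageRank, not Google's production system; the page names, the 20-page link farm, and the damping value of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank (illustrative only).

    links maps each page name to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: three legitimate pages plus one spam page.
honest_web = {"a": ["b"], "b": ["c"], "c": ["a"], "spam": ["a"]}
before = pagerank(honest_web)

# The same web after a spammer adds 20 fake pages that all link to "spam".
farmed_web = dict(honest_web)
for i in range(20):
    farmed_web[f"farm{i}"] = ["spam"]
after = pagerank(farmed_web)

print(f"spam score before link farm: {before['spam']:.4f}")
print(f"spam score after link farm:  {after['spam']:.4f}")
```

In this toy graph, the spam page's score rises once the fake pages point at it, even though no legitimate page links to it. Real search engines combine many more signals, which is part of why detecting such patterns is possible.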

“We’re getting better at spotting spammy pages,” said Cutts after his talk, adding that spammers are increasingly hacking legitimate websites and filling their pages with spam links or redirecting users to other sites.

“As operating systems become more secure and users become savvier in protecting their home machines, I would expect the hacking to shift to poorly secured Web servers,” said Cutts. He expects “that trend to continue until webmasters and website owners take precautions to secure Web-server software as well.”

“I’ve talked to some spammers who have large databases of websites with security holes,” Cutts said. “You definitely see more Web pages getting linked from hacked sites these days. The trend has been going on for at least a year or so, and I do believe we’ll see more of this.”

Bruen agrees. “We’ve seen an increase in spam e-mail and spam domains that not only sell illicit products, but that attempt to download malware and infect the visitor’s PC,” he says. Such malware could use an unknowing victim’s computer to send out e-mail spam.

“It really is an arms race,” says Daniel Tunkelang, one of the conference organizers and the chief scientist at search company Endeca.

To prevent such attacks, Cutts recommended that anyone running her own website regularly patch the Web server and any software running on it. “In the same way that you wouldn’t browse the Web with an unpatched copy of Internet Explorer, you shouldn’t run a website with an unpatched or old version of WordPress, cPanel, Joomla, or Drupal,” said Cutts. He also suggested that users hand over management of Web software. “Using a cloud-based service where the server software is managed by someone else can often be more secure,” he said.

During his talk, Cutts also explained that Google’s efforts to identify dubious websites now include parsing the JavaScript code that underlies pages. Code may contain hidden instructions that record users’ data, for example.

“It wasn’t obvious to me that Google can do this,” says Endeca’s Tunkelang. “And apparently some spammers were saying that Google can’t do that.”

Cutts noted that spammers and hackers are also finding new ways to spam, with the rise of social networking sites like Facebook and Twitter. These sites “bring identity into the equation, but don’t really have checks to verify that a profile or person sending you a message is who you think they are,” said Cutts.

“Authentication [across the Web] would be really nice,” says Tunkelang. “The anonymity of the Internet, as valuable as it is, is also the source of many of these ills.” Having to register an e-mail before you can comment on a blog is a step in this direction, he says, as is Twitter’s recent addition of a “verified” label next to profiles it has authenticated.

Danah Boyd, a Microsoft Research scholar who studies social media, suggests that spammers take advantage of the fact that people don’t always adhere to the rules on social-networking sites; for example, they sometimes provide fake information about themselves. “The variability of average users is precisely what spammers rely on when trying to trick the system,” says Boyd. “All users are repurposing systems to meet their needs, and the game of the spammer keeps changing. That makes the work that Matt does very hard but also very interesting.”
