Google's Great Spam Quest

The search engine wants to weed out sites that create low-quality articles simply as a way of luring people to online ads.

Google is working on ways to rid its search results of “content farms”—sites that create many pages of very cheap content crafted to appear high up in Google’s results. Speaking this week at Farsight 2011, a one-day event in San Francisco on the future of search, the firm’s principal search engineer, Matt Cutts, said that Google is considering tweaks to the algorithms that guide its search results. It’s also considering more radical tactics, such as letting users blacklist certain sites from the results they see.

In recent months, Google has been criticized by tech industry insiders for allowing content farms to occupy high rankings in results for common searches. The operators of such sites create articles containing common search keywords and phrases as a way of luring visitors to their online ads. Much of the content on such sites, including those operated by Demand Media, is created by very low-paid freelancers.

Search engines are currently being bested by those tactics, said Vivek Wadhwa, a visiting researcher in technology and business at Berkeley, Duke, and Harvard universities, at Tuesday’s event. “Over the last 15 years, search has changed very little,” he said, “but the Web has changed and become pretty clogged by spam.” Wadhwa said he realized the extent of the problem after small-scale experiments with his students revealed that shortcomings in Google’s results appeared frequently for common searches.

Cutts announced last week that Google’s algorithms had been altered to penalize sites that copied content from other sites as a way of climbing higher in search rankings. But he acknowledged that it was a challenge to identify and demote low-quality content. “Someone recently found five articles on how to tie shoes on one of these sites,” he said. “We want to find an algorithmic solution to this and are working on it.”

Some question whether an algorithmic approach can work. Startup search company Blekko uses a different approach: yesterday it announced that it had excluded 20 “spam” sites from its index entirely, based on which pages its users had marked as spam when they appeared in search results. The 20 sites include many often described as content farms, including Demand Media’s eHow site. Blekko, which launched last November, uses Wikipedia-like functionality to allow users to mark pages as spam, and to work together on filters (dubbed “slashtags”) that include or exclude sites from searches on particular topics.

“The Web has turned into a swamp,” said Blekko cofounder Rich Skrenta at the event, “because search engines gave URLs economic value.” Methods of ranking search results that rely mainly on which sites have the most links or keywords are no longer robust enough, he said. Instead, a more human touch is required.

Harry Shum, who leads development on Microsoft’s search engine, Bing, also appeared at the event and agreed that search companies need new approaches. “I think this is a big problem,” he said. “Google is overemphasizing the automatic approach. Maybe we need to take into account the authority of the authors of pages or other social information.” Bing has experimented with a feature that draws on information from a person’s Facebook friends to rank results.

Cutts claimed that it’s not Google’s style to make “editorial decisions” to block certain sites; the company would prefer to find purely automatic ways to filter out sites that don’t help users. “Using algorithms can work in German and Japanese as well as it does in English,” he pointed out. However, he also revealed that Google is experimenting, internally for now, with a Blekko-like strategy in which users can take some control over their search results.

“I have a Chrome bar installed on my laptop that will let you block certain sites from results,” said Cutts. “If people want to send us direct feedback, that’s great.” However, he gave no indication of when the feature might be launched publicly.
