Briefing Media

Case Study

Patrolling the Web for Pirated Content

  • January/February 2010
  • By Stephen Cass

Figure: Booming Business. Despite a recent dip in revenue, online ad campaigns are still popular with advertisers. (Tommy McCall)

Piracy of copyrighted work is rampant on the Internet. The plague of "scraper sites" is just one example: they copy content belonging to other sites in hopes of snagging readers--and advertising revenue from automated networks such as Google's AdSense.

In the United States, the Digital Millennium Copyright Act (DMCA) lets copyright holders protect themselves by sending online service providers a "takedown notice" if one of the providers' users uploads content belonging to the rights holder. Removing offending material immunizes providers from being sued as accomplices to intellectual piracy. (Similar laws exist in most of the developed world.) Search engines can also be ordered to remove links to such content from search results.

The problem for copyright holders is that before they can serve notice, they must first spot the unauthorized copies of their content. Attributor, a company based in Redwood City, CA, was founded in 2005 to turn this problem into money. "People give us their content. We then crawl the Internet and find where and how it's being reused," says Jim Pitkow, who cofounded Attributor and serves as its CEO.

While the system can detect partial copies--the initial scan can spot as little as two or three sentences of a client's content embedded in a Web page, according to Pitkow--Attributor is focusing on complete or nearly complete copies that reuse more than 125 words. This avoids the thorny issue of determining what is and isn't covered by the "fair use" exemption in copyright law, which allows people to incorporate portions of a copyrighted work for the purposes of commentary, education, or parody. Once matches are found, Attributor can respond on behalf of its clients.
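The 125-word cutoff described above can be illustrated with a small sketch. This is not Attributor's method, which the article does not describe; it is a minimal example that assumes a plain word-level comparison using Python's standard difflib module. The function names and the MIN_RUN value of 8 words are hypothetical choices made for illustration; only the 125-word threshold comes from the article.

```python
import re
from difflib import SequenceMatcher

MIN_COPIED_WORDS = 125  # threshold quoted in the article
MIN_RUN = 8             # assumption: ignore shared runs shorter than 8 words


def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def copied_word_count(original, page):
    """Count words shared between the two texts in runs of MIN_RUN or more."""
    a, b = words(original), words(page)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= MIN_RUN
    )


def is_near_complete_copy(original, page):
    """Flag a page only when it reuses more than MIN_COPIED_WORDS words."""
    return copied_word_count(original, page) > MIN_COPIED_WORDS
```

Under these assumptions, a page that quotes a few sentences falls below the cutoff and is left alone, while a page that reproduces most of an article is flagged for follow-up.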


Even when complete copies are found, sending a takedown notice tends to be a last resort; the tactic is often perceived as corporate bullying. Instead, a publisher might, for example, ask a blogger reposting a news story to provide appropriate attribution and add a link back to the originating website.

Ultimately, Attributor hopes its tracking data will be used in a system where online advertising networks either agree to give its clients a share of any ad revenue from pages that contain copied content, or else face a torrent of takedown notices under the DMCA. So far the networks have been cool to this idea. In the meantime, media clients, which include CondéNet, Thomson Reuters, and the Associated Press, find Attributor's service valuable for tracking where their content is appearing: a site that frequently uses a publisher's content might be amenable to striking a licensing deal, converting a pirate into a customer.


erbium

340 Comments

  • 750 Days Ago
  • 01/26/2010

So this is one more bot?

I manage a large website, and bots that come in and scrape my content are a major irritation.

I simply 403 them, and then they will never be able to tell whether I have copyrighted content from someone else. I don't, so I have no problem sending them to the bit bucket. They have no inherent right to access my site, and they are usually pretty easy to spot. I have my own custom program to separate out bots from the raw logs.

See WebmasterWorld's bots forum for how stupid and arrogant some of these bots are:
http://www.webmasterworld.com/forum11

or incredibill's rants on blogspot.
http://incredibill.blogspot.com/

(not for the weak of language :) )
http://incredibill.blogspot.com/search/label/Bad%20Bots
