Patrolling the Web for Pirated Content

December 21, 2009

Piracy of copyrighted work is rampant on the Internet. The plague of “scraper sites” is just one example: they copy content belonging to other sites in hopes of snagging readers, and with them advertising revenue from automated networks such as Google’s AdSense.

In the United States, the Digital Millennium Copyright Act (DMCA) lets copyright holders protect themselves by sending online service providers a “takedown notice” if one of the providers’ users uploads content belonging to the rights holder. Removing offending material immunizes providers from being sued as accomplices to intellectual piracy. (Similar laws exist in most of the developed world.) Search engines can also be ordered to remove links to such content from search results.

The problem for copyright holders is that before they can serve notice, they must first spot the unauthorized copies of their content. Attributor, a company based in Redwood City, CA, was founded in 2005 to turn this problem into money. “People give us their content. We then crawl the Internet and find where and how it’s being reused,” says Jim Pitkow, who cofounded Attributor and serves as its CEO.

The system can detect partial copies: according to Pitkow, the initial scan can spot as little as two or three sentences of a client’s content embedded in a Web page. Attributor, however, is focusing on complete or nearly complete copies that reuse more than 125 words. This avoids the thorny issue of determining what is and isn’t covered by the “fair use” exemption in copyright law, which allows people to incorporate portions of a copyrighted work for the purposes of commentary, education, or parody. Once matches are found, Attributor can respond on behalf of its clients.
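The article does not describe how Attributor's matching works internally, so the following is only a minimal sketch of one common approach to this kind of problem: word "shingling," in which runs of consecutive words from the original are looked up in a candidate page. The function names, the 8-word shingle size, and the labels returned are assumptions made for illustration; only the 125-word cutoff comes from the text above.

```python
import re

def words(text):
    # Lowercase and split into word tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9']+", text.lower())

def shingles(tokens, k):
    # Every run of k consecutive words, stored as a set for fast lookup.
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def copied_word_count(original, page, k=8):
    # Count the words on the page that sit inside a k-word run
    # that also appears in the original. k=8 is an assumed value.
    source = shingles(words(original), k)
    page_tokens = words(page)
    covered = [False] * len(page_tokens)
    for i in range(len(page_tokens) - k + 1):
        if tuple(page_tokens[i:i + k]) in source:
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered)

def classify(original, page, threshold=125, k=8):
    # threshold=125 mirrors the "more than 125 words" cutoff in the article;
    # the three labels are hypothetical.
    n = copied_word_count(original, page, k)
    if n == 0:
        return "no match"
    return "substantial copy" if n > threshold else "partial copy"
```

Under these assumptions, a page that reproduces a 200-word story wholesale would be classified as a substantial copy, while a page quoting a 20-word passage would register only as a partial copy and fall below the cutoff. A production crawler would also need to handle paraphrase, markup, and scale, none of which this sketch attempts.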

Even when complete copies are found, sending a takedown notice tends to be a last resort; the tactic is often perceived as corporate bullying. Instead, a publisher might, for example, ask a blogger reposting a news story to provide appropriate attribution and add a link back to the originating website.

Ultimately, Attributor hopes its tracking data will be used in a system where online advertising networks either agree to give its clients a share of any ad revenue from pages that contain copied content, or face a torrent of takedown notices under the DMCA. So far the networks have been cool to this idea. In the meantime, media clients, which include CondéNet, Thomson Reuters, and the Associated Press, find Attributor’s service valuable for tracking where their content is appearing: a site that frequently uses a publisher’s content might be amenable to striking a licensing deal, converting a pirate into a customer.
