Patrolling the Web for Pirated Content
Piracy of copyrighted work is rampant on the Internet. The plague of “scraper sites” is just one example: they copy content belonging to other sites in hopes of snagging readers, and with them advertising revenue from automated networks such as Google’s AdSense.
In the United States, the Digital Millennium Copyright Act (DMCA) lets copyright holders protect themselves by sending online service providers a “takedown notice” if one of the providers’ users uploads content belonging to the rights holder. Removing offending material immunizes providers from being sued as accomplices to intellectual piracy. (Similar laws exist in most of the developed world.) Search engines can also be ordered to remove links to such content from search results.
The problem for copyright holders is that before they can serve notice, they must first spot the unauthorized copies of their content. Attributor, a company based in Redwood City, CA, was founded in 2005 to turn this problem into money. “People give us their content. We then crawl the Internet and find where and how it’s being reused,” says Jim Pitkow, who cofounded Attributor and serves as its CEO.
While the system can detect partial copies–the initial scan can spot as little as two or three sentences of a client’s content embedded in a Web page, according to Pitkow–Attributor is focusing on complete or nearly complete copies that reuse more than 125 words. This avoids the thorny issue of determining what is and isn’t covered by the “fair use” exemption in copyright law, which allows people to incorporate portions of a copyrighted work for the purposes of commentary, education, or parody. Once matches are found, Attributor can respond on behalf of its clients.
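The article does not describe how Attributor's matching works internally. As a rough illustration of the general idea only, the sketch below uses word shingling, a common near-duplicate detection technique swapped in here as an assumption, to estimate the longest run of consecutive words a page shares with an original and to compare it against the 125-word threshold mentioned above. All function names and the shingle size of 8 are hypothetical choices for this example.

```python
import re


def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(tokens, k):
    """Return the set of all k-word sequences (shingles) in tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def longest_shared_run(original, candidate, k=8):
    """Estimate the longest run of consecutive candidate words found in original.

    Illustrative heuristic, not Attributor's method: consecutive matching
    shingles in the candidate are assumed to come from one contiguous
    passage of the original. Runs shorter than k words are reported as 0.
    """
    source = shingles(words(original), k)
    cand = words(candidate)
    best = run = 0
    for i in range(len(cand) - k + 1):
        if tuple(cand[i:i + k]) in source:
            # The first matching shingle covers k words; each further
            # consecutive match extends the run by one word.
            run = k if run == 0 else run + 1
            best = max(best, run)
        else:
            run = 0
    return best


def is_substantial_copy(original, candidate, min_words=125):
    """Flag a page only if it reuses at least min_words consecutive words."""
    return longest_shared_run(original, candidate) >= min_words
```

Under these assumptions, a page that quotes a short passage falls below the threshold and is ignored, while a wholesale repost is flagged. A production system would also need to handle paraphrase, reordering, and pages crawled at Web scale, none of which this sketch attempts.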
Even when complete copies are found, sending a takedown notice tends to be a last resort; the tactic is often perceived as corporate bullying. Instead, a publisher might, for example, ask a blogger reposting a news story to provide appropriate attribution and add a link back to the originating website.
Ultimately, Attributor hopes its tracking data will be used in a system where online advertising networks either agree to give its clients a share of any ad revenue from pages that contain copied content or face a torrent of takedown notices under the DMCA. So far the networks have been cool to this idea. In the meantime, media clients, which include CondéNet, Thomson Reuters, and the Associated Press, find Attributor’s service valuable for tracking where their content is appearing: a site that frequently uses a publisher’s content might be amenable to striking a licensing deal, converting a pirate into a customer.