How to Delete Regrettable Posts from the Internet
It’s possible—though not always foolproof—to get embarrassing things taken down. Voluntary data-labeling standards could make it even easier.
People shouldn’t necessarily be haunted forever by indiscretions or harassment that can warp their reputations.
It might seem that the Internet doesn’t lose track of anything that has been published online. The alleged permanence of tweets, blogs, snapshots, and instant messages worries many privacy activists and policymakers such as Viviane Reding, justice commissioner of the European Union and vice president of the European Commission. She has proposed that Europe adopt a “right to be forgotten”—a proposal that is now working its way through the EU legal process and could be law within two years.
Reding’s proposal would grant EU citizens the right to withdraw their consent from online information services after the fact—allowing people to redact embarrassing things from the global information commons, even after the data had been copied to other websites. It’s a controversial proposal: George Washington University law professor Jeffrey Rosen wrote in the Stanford Law Review that such a right could have deeply negative implications for both free speech and journalism and could ultimately fragment the Internet. Rosen pointed out that companies like Google would need to suppress from European search queries information that had been deemed “forgotten” on the continent, even as such information would still be perfectly allowable in the United States.
The proposal might also be unnecessary. Even without a right to be forgotten, there are still many ways that information can be removed from the Web. Such methods could be made more widespread.
Somewhat surprisingly, the easiest information to remove from the Internet may be data stored in Facebook, and to a lesser extent in other social networks. Facebook’s “Statement of Rights and Responsibilities” says that any information a Facebook user uploads to the social network remains that user’s property—posting, liking, and otherwise interacting with Facebook merely gives the service a revocable license to the data. That license ends when the data are deleted.
Wiping away those embarrassing self-portraits you took and posted when you were drunk won’t delete the copies that your friends have saved on their own hard drives. But who makes copies of photos anymore? Here’s a way that the convenience of cloud-based services works in favor of privacy controls: they give you one-stop shopping for information oblivion, a single place to go and get something deleted.
Yahoo and other websites have similar forms for requesting that information be taken down. They do this even though they generally are not required to by U.S. law. Advertising-funded websites make so little money off any individual piece of data that it’s much easier to take information down than to spend time fighting for the rights of the person who posted the data.
Back in 2005, I met a person who had been the victim of horrible harassment a few years earlier, in high school. Even years later, this colleague of mine was still haunted by a series of harassing websites that her tormentors had put up on free Web-hosting services. My colleague was too traumatized to deal with the issue, so I sent a few e-mails to the Web-hosting companies, and within a few days the offending material had been taken down. Today a search for the person’s name yields only professional results, not those teenage pranks.
Unfortunately, wiping data away from every cranny of the Internet can be challenging. Consider my colleague. If you know where to look, it’s still possible to find those harassing pages. They don’t show up in Google or Bing, but there are copies hidden away at the Internet Archive, a website that seeks to preserve most of the Internet’s content for posterity. There are procedures for removing data from the Internet Archive, but those procedures generally require the active participation of the current holder of the Web domain. Fortunately for my colleague, the Internet Archive’s pages aren’t indexed by Google or Bing, so except for those people who know specifically where to look, the information is invisible.
In fact, it’s hard to imagine a system that could index all of the world’s information thoroughly enough to allow someone exercising the “right to be forgotten” to track down and eradicate every regrettable message or photo. More likely, the mechanisms needed to find those data would cause more privacy violations than they would prevent.
A better solution could be a set of standards for labeling the provenance of information on the Internet. It would be somewhat like the way Facebook requires application developers to keep checking back to see whether personal information is still acceptable to use. It would also take advantage of the privacy-protecting steps that other sites like Twitter and Yahoo sometimes are willing to take for their users.
This could be done using the HTML microdata standard now under development. Although still evolving, the standard will expand the ways that information in Web pages can be represented in their underlying HTML code. For example, the microdata could include tags designed to facilitate privacy tracking and the retraction of privacy-sensitive information. So if you persuaded a website to take down information because it violates the site’s terms of service, that website could automatically notify other sites that have made copies of your information, informing them that the license to use the data has been revoked.
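To make the idea concrete, here is a minimal sketch in Python of how a site holding copies might check provenance labels against a list of revoked licenses. The itemscope and itemprop attributes are real HTML microdata attributes, but the property names "sourceUrl" and "licenseId", the sample page, and the find_revoked helper are all hypothetical, invented for illustration; no existing standard defines them, and a real system would also need some way to deliver the revocation notices.

```python
from html.parser import HTMLParser

# Hypothetical markup: itemscope/itemprop are real microdata attributes,
# but the "sourceUrl" and "licenseId" property names are invented here.
COPIED_PAGE = """
<div itemscope>
  <meta itemprop="sourceUrl" content="https://example.com/photos/1">
  <meta itemprop="licenseId" content="lic-001">
  <img src="party.jpg" alt="party photo">
</div>
<div itemscope>
  <meta itemprop="sourceUrl" content="https://example.com/photos/2">
  <meta itemprop="licenseId" content="lic-002">
  <img src="hike.jpg" alt="hiking photo">
</div>
"""


class ProvenanceParser(HTMLParser):
    """Collect the itemprop values found inside each top-level itemscope div."""

    def __init__(self):
        super().__init__()
        self.items = []       # one dict of properties per labeled block
        self._current = None  # properties of the block being read, if any
        self._depth = 0       # div nesting depth inside the current block

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self._current is not None:
                self._depth += 1          # nested div inside a labeled block
            elif "itemscope" in attrs:
                self._current = {}        # start of a new labeled block
                self._depth = 1
        elif self._current is not None and "itemprop" in attrs:
            self._current[attrs["itemprop"]] = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "div" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:          # labeled block is complete
                self.items.append(self._current)
                self._current = None


def find_revoked(html, revoked_license_ids):
    """Return the source URLs of copied items whose license was revoked."""
    parser = ProvenanceParser()
    parser.feed(html)
    return [
        item.get("sourceUrl")
        for item in parser.items
        if item.get("licenseId") in revoked_license_ids
    ]


# Assume the original site has announced that license "lic-001" is revoked.
print(find_revoked(COPIED_PAGE, {"lic-001"}))
```

In this sketch the copying site does the checking itself, which keeps the scheme voluntary: nothing forces a site to honor a revocation, but the labels make it easy for a cooperative site to find and remove the affected copies.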
Such voluntary technical measures would go a long way toward improving the situation that policymakers hope to fix with a legal right to be forgotten.