Who Benefits from AOL’s Released Search Logs?

Privacy takes a hit. But statisticians are having a field day.
August 15, 2006

Last week, AOL’s PR team cringed as the world learned that the company had publicly posted search terms from 650,000 AOL users. The company posted the search log on a research site, then took it down after a flurry of coverage in the blogosphere. Nonetheless, a number of sites have reposted the log.

While specific names of AOL users weren’t linked directly to searched terms, there was no guarantee of anonymity: AOL had assigned each user a number, and users often searched for their own names and hometowns. The New York Times was able to track down one AOL user who talked with a reporter about her searches for “numb fingers” and “dog that urinates on everything.”

The data dissemination led privacy advocates to trumpet the dangers of search companies storing people’s queries. At the same time, though, other people – Internet researchers, statisticians, sociologists, and political scientists – silently cheered.

Before the AOL release, all major search engines had kept their data from the public eye. This meant that researchers interested in how people use search engines had to either rely on speculative data from open, infrequently used search engines, or make educated guesses. The AOL search log, which contains more than 30 million search terms, could thus provide some missing insight into how people use the Web, says Matt Hindman, a political scientist at Arizona State University in Phoenix. A better understanding of Web dynamics has implications for political campaigns, education, and an entire economy built on advertising through Web searches. “For researchers like me,” Hindman says, “that’s exciting.”

Shortly after AOL’s goof, a site called AOL Stalker was created. Its main draw is that it allows people to search through the AOL database and view user searches as well as other search data. The author of the site has also posted the first in a series of basic data analyses. This initial number-crunching examines how well the rank of search results can predict a page’s click-through rate – in other words, it shows how well results match what people want to find. According to the analysis, in 47 percent of searches, people didn’t click on any of the presented results. While the revelation that nearly half of all AOL searches don’t go anywhere isn’t earth-shattering, further analysis could provide insight into how to make search engines more useful or guide advertisers in their ad placements.

As giddy as this sort of data makes statistics hounds, the creator of AOL Stalker, at least, still seems mindful of the sensitive nature of the information. The site’s creator lets anyone request that certain information be hidden from the site’s search engine if it’s too revealing. As noted in the fine print: “If you find any data that actually makes it possible to identify a user, please let us know using the contact form, and we’ll remove those references.” – By Kate Greene
