Who Benefits from AOL’s Released Search Logs?

Privacy takes a hit. But statisticians are having a field day.
August 15, 2006

Last week, AOL’s PR team cringed as the world learned that the company had publicly posted search terms from 650,000 AOL users. The company posted the search log on a research site, then took it down after a flurry of coverage in the blogosphere. Nonetheless, a number of sites have reposted the log.

While specific names of AOL users weren’t linked directly to search terms, there was no guarantee of anonymity: AOL had assigned each user a number, and users often searched for their own names and hometowns. The New York Times was able to track down one AOL user who talked with a reporter about her searches for “numb fingers” and “dog that urinates on everything.”

The data dissemination led privacy advocates to trumpet the dangers of search companies storing people’s queries. At the same time, though, other people – Internet researchers, statisticians, sociologists, and political scientists – silently cheered.

Before the AOL release, all major search engines had kept their data from the public eye. This meant that researchers interested in how people use search engines had to either rely on speculative data from open, infrequently used search engines, or make educated guesses. The AOL search log, which contains more than 30 million search terms, could thus provide some missing insight into how people use the Web, says Matt Hindman, a political scientist at Arizona State University in Phoenix. A better understanding of Web dynamics has implications for political campaigns, education, and an entire economy built on advertising through Web searches. “For researchers like me,” Hindman says, “that’s exciting.”

Shortly after AOL’s goof, a site called AOL Stalker was created. Its main draw is that it allows people to search through the AOL database and view user searches as well as other search data. The author of the site has also posted the first in a series of basic data analyses. This initial number-crunching examines how well the rank of search results can predict a page’s click-through rate – in other words, it shows how well results match what people want to find. According to the analysis, in 47 percent of searches, people didn’t click on any of the presented results. While the revelation that nearly half of all AOL searches don’t go anywhere isn’t earth-shattering, further analysis could provide insight into how to make search engines more useful or guide advertisers in their ad placements.
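
For readers curious what this kind of number-crunching involves, here is a minimal sketch in Python. It is not AOL Stalker's code, and the layout it reads is an assumption: a tab-separated file with the columns AnonID, Query, QueryTime, ItemRank, and ClickURL, where each row is treated as one search and an empty ItemRank means no result was clicked. The function name and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def click_stats(rows):
    """Return (share of rows with no click, clicks counted per result rank).

    Assumes each row is one search and that an empty 'ItemRank'
    field means the user clicked none of the results.
    """
    total = 0
    no_click = 0
    clicks_by_rank = Counter()
    for row in rows:
        total += 1
        rank = (row.get("ItemRank") or "").strip()
        if rank:
            clicks_by_rank[int(rank)] += 1
        else:
            no_click += 1
    no_click_share = no_click / total if total else 0.0
    return no_click_share, clicks_by_rank


# Invented sample rows in the assumed layout (not real log data).
SAMPLE = (
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\tweather\t2006-03-01 10:00:00\t1\thttp://example.com\n"
    "2\tmaps\t2006-03-01 10:05:00\t\t\n"
    "3\trecipes\t2006-03-01 10:07:00\t3\thttp://example.org\n"
    "4\tnews\t2006-03-01 10:09:00\t\t\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
share, by_rank = click_stats(reader)
print(f"No-click share: {share:.0%}")
for rank in sorted(by_rank):
    print(f"Rank {rank}: {by_rank[rank]} click(s)")
```

Run over a full log, the first figure corresponds to the share of searches that end without a click, and the per-rank counts show how quickly clicks fall off down the results page. A more careful analysis would first group repeated rows that belong to the same search.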

As giddy as this sort of data makes statistics hounds, the creator of AOL Stalker, at least, still seems mindful of the sensitive nature of the information. The site’s creator lets anyone request that certain information be hidden from the site’s search engine if it’s too revealing. As noted in the fine print: “If you find any data that actually makes it possible to identify a user, please let us know using the contact form, and we’ll remove those references.” – By Kate Greene
