Personalization is a key part of Internet search, providing more relevant results and gaining loyal customers in the process. But new research highlights the privacy risks that this kind of personalization can bring. A team of European researchers, working with a researcher from the University of California, Irvine, found that they were able to hijack Google’s personalized search suggestions to reconstruct users’ Web search histories.
Google has plugged most of the holes identified in the research, but the researchers say that other personalized services are likely to have similar vulnerabilities. “The goal of this project was to show that personalized services are very dangerous in terms of privacy because they can leak information,” says Claude Castelluccia, a senior research scientist at the French National Institute for Research in Computer Science and Control, who was involved with the work. The work will be presented this summer at the Privacy Enhancing Technologies Symposium in Berlin, Germany.
The researchers got hold of personal information by taking advantage of the fact that Google uses two different protocols to communicate with its users’ browsers. Google protects sensitive information, such as passwords, by using a protocol called “https” that encrypts the data as it’s communicated. Other times, when dealing with search queries for example, Google uses the ordinary “http” protocol, which sends information back and forth in the clear. The researchers say this mixed design can inadvertently reveal information.
Google offers a variety of Web services, including Gmail, Google Docs, and Google Calendar. A less well-known service is Google Web History, which records searches made by a user while she is signed in to her Google Account. At the time the researchers were investigating it, Web History was also the source of personalized suggestions that Google offered users on its search page.
The researchers were able to get access to users’ Web History by intercepting cookies–files stored on a person’s computer that hold useful bits of information such as authentication credentials or the contents of a shopping cart. For many services, such as Gmail, this information is encrypted before it is sent. At the time, Web History sent its cookies in the clear. By eavesdropping on an unsecured network, such as a public Wi-Fi hotspot, an attacker can intercept Web cookies. The researchers determined that intercepted Web History cookies could provide access to that user’s Web History account.
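To make the cookie exposure concrete, here is a minimal sketch using Python's standard `http.cookies` module. It is not Google's implementation: the cookie name `session`, the token value, and both helper functions are hypothetical. It only illustrates the general rule that a cookie marked `Secure` is withheld from plain http requests, while an unmarked cookie travels in the clear where an eavesdropper can read it.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token: str, secure: bool) -> str:
    """Build a Set-Cookie value; 'session' is a hypothetical cookie name."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["path"] = "/"
    if secure:
        # Secure: browsers send the cookie only over https.
        cookie["session"]["secure"] = True
        # HttpOnly: page scripts cannot read the cookie.
        cookie["session"]["httponly"] = True
    return cookie["session"].OutputString()


def may_send_over(cookie_value: str, scheme: str) -> bool:
    """Simplified model of the browser rule for the Secure attribute."""
    attributes = [part.strip().lower() for part in cookie_value.split(";")[1:]]
    return scheme == "https" or "secure" not in attributes


exposed = build_session_cookie("abc123", secure=False)
protected = build_session_cookie("abc123", secure=True)

print(may_send_over(exposed, "http"))     # unmarked cookie goes out over http
print(may_send_over(protected, "http"))   # Secure cookie is withheld on http
print(may_send_over(protected, "https"))  # and sent only over https
```

The `may_send_over` helper is a deliberately simplified model; real browsers apply further rules, such as domain and path matching, before attaching a cookie to a request.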
The researchers also found a second way to reconstruct users’ search history. Another cookie, the one that authenticates a user to Google’s search service, is also sent in the clear. By capturing this cookie and impersonating the user in communications with the search service, they were able to run algorithms that quickly reconstructed large portions of a user’s Web search history.
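The article does not describe the researchers' algorithms, so the following is only a toy illustration of the general idea, run entirely against a local stand-in. It assumes a hypothetical suggestion service that, for any typed prefix, returns up to a fixed number of entries from the user's past searches. The functions `make_suggester` and `reconstruct`, the page size, and the sample history are all invented for this sketch.

```python
from string import ascii_lowercase


def make_suggester(history, limit=3):
    """Hypothetical stand-in for a personalized suggestion service."""
    def suggest(prefix):
        matches = sorted(q for q in history if q.startswith(prefix))
        return matches[:limit]
    return suggest


def reconstruct(suggest, limit=3, alphabet=ascii_lowercase + " ", max_len=20):
    """Collect suggestions by probing prefixes, refining only where needed."""
    found = set()
    stack = list(alphabet)
    while stack:
        prefix = stack.pop()
        results = suggest(prefix)
        found.update(results)
        # A full page of results may hide more entries, so extend the prefix.
        if len(results) == limit and len(prefix) < max_len:
            stack.extend(prefix + ch for ch in alphabet)
    return found


history = {
    "paris flights",
    "paris hotels",
    "paris metro",
    "paris weather",
    "privacy tools",
}
recovered = reconstruct(make_suggester(history))
print(sorted(recovered))
```

In this toy setting the probe recovers every entry, because it keeps extending a prefix only while the service returns a full page of suggestions. A real service would differ in ranking, rate limits, and matching rules.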
Castelluccia says companies should recognize that they need to use secure channels whenever a user’s personal information is being transmitted. “The main lesson of the attack is that companies should use https as much as possible,” he says, adding, “Of course, https has a cost–it means Google has to use more servers, energy, and all that.”
Google responded to the researchers by changing Web History so that it always uses encrypted communications. The company also temporarily suspended its search suggestion service. Suggestions for Google Maps, which the researchers were also able to access, are now encrypted as well.
Alma Whitten, software engineer for Google’s Security and Privacy arm, said in a statement that Google increased its use of https in response to the researchers. “Google has been and continues to be an industry leader in providing support for encryption in our services, which is designed to address precisely the issues that all major websites face when transmitting information over http to users connecting via an unsecured network channel,” she said.
“Google was very reactive and very responsible,” Castelluccia says. However, he notes that search suggestions are still being provided via mobile phones and are still vulnerable. The researchers are keeping track of which services are vulnerable on a website devoted to the project. (Update May 17, 2010: Google fixed the mobile issue described on April 28.)
Ben Adida, a fellow at Harvard University’s Center for Research on Computation and Society, says that intercepting unencrypted traffic is “trivial” today, and “the consequences can be surprisingly privacy-invasive.” He adds, “This work is nice because it concisely shows how half-measures often provide little protection: there is a growing need to move all sensitive services to [https].”
However, Adida warns that encryption won’t solve all privacy problems. “We are slowly entrusting more of our data to large companies that then risk becoming targets of large-scale attacks,” he says. “It’s important to continuously secure these services, but it’s equally important to realize the inherent risk we run by giving this data to third parties in the first place.”