
Want to Track People? There’s an App for That

A new program analyzes iPhone apps and finds the ones that are grabbing your data.
January 25, 2011

More than half of all iPhone apps collect and share a unique code that could be used to track users without their knowledge, according to a recent study.

Manuel Egele, a doctoral student at the Technical University of Vienna, and three other researchers examined how more than 1,400 iPhone apps handle user data. Only a small number blatantly compromised privacy: 36 accessed the device’s location without first informing the user; another five mined data from the user’s address book without permission. The research will be presented at the Network and Distributed System Security Symposium in early February.

However, more than half of the iPhone applications studied collected the device ID, a 40-character hexadecimal string that uniquely identifies a particular phone. More than 750 of the apps studied used some sort of tracking technology. In about 200 cases, the developer wrote custom code to read the device identifier; the rest got this functionality from an advertising or tracking software library.

“There is a potential for companies who are not too legit to build profiles of their users,” Egele says. “The identifier [code] is not tied to a username, but you could link it to a Facebook account, and that would give you a lot of information on the user, including—most of the time—their real name.”
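The tracking mechanism Egele describes can be illustrated with a short sketch. This is my own toy example, not code from the study: the device IDs, app names, and events below are invented, but they show why a stable identifier shared across unrelated apps collapses into a single per-device profile, and how one app that ties the ID to a real name (here, a hypothetical login event) deanonymizes the rest.

```python
from collections import defaultdict

# Invented events: (device_id, app, action). The 40-character hex IDs
# stand in for the iPhone device identifier discussed in the article.
events = [
    ("a1b2c3d4e5f6a7b8c9d0a1b2c3d4e5f6a7b8c9d0", "NewsReader", "opened_article"),
    ("a1b2c3d4e5f6a7b8c9d0a1b2c3d4e5f6a7b8c9d0", "FitnessLog", "logged_workout"),
    ("ffeeddccbbaa99887766554433221100ffeeddcc", "NewsReader", "opened_article"),
    ("a1b2c3d4e5f6a7b8c9d0a1b2c3d4e5f6a7b8c9d0", "SocialApp", "login:jane.doe"),
]

def build_profiles(events):
    """Group events by device ID. Because the ID never changes,
    activity from unrelated apps merges into one profile per device."""
    profiles = defaultdict(list)
    for device_id, app, action in events:
        profiles[device_id].append((app, action))
    return dict(profiles)

profiles = build_profiles(events)
for device_id, actions in profiles.items():
    print(device_id[:8], actions)
```

The first device ends up with a profile spanning three apps, including a real name leaked by the hypothetical social app, which is exactly the linkage Egele warns about.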

Apple, which recently celebrated its 10 billionth App Store download, vets applications and requires developers to request authorization in order to access users’ data. However, little is known about how the company checks each app.

“You don’t know exactly what these apps do—they don’t come from big developers, they come from regular people,” says Charlie Miller, an iPhone security expert and principal analyst with Independent Security Evaluators. The iPhone automatically limits what programs can do using a so-called “sandbox,” but these restrictions are not very strict, meaning it isn’t difficult to collect personal data, Miller says. “They do run in a sandbox, but it’s a pretty lenient sandbox.”

The four researchers analyzed 825 apps available for free on Apple’s App Store and another 582 apps on the Cydia repository, a service that makes software available to users who have removed Apple’s security measures from their iPhones, a process known as “jailbreaking.”

Egele and his colleagues defined a violation of privacy as an incident where a program reads sensitive data—addresses, phone numbers, e-mail account information—from the device and sends that data over the Internet without asking permission. The researchers had no way to discover if the user was tricked into giving consent.

“The description of privacy that we came up with is that we did not want sensitive data from the mobile device extracted without the user knowing,” Egele says. “But we cannot tell if it is malicious or not.”

The researchers developed software to test each program’s functionality and to determine if it collected and transmitted sensitive information without informing the user. They also had to decipher how certain apps function with only limited information.
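The core check the study describes (sensitive data read, then transmitted, with no permission request in between) can be sketched as a simple pass over an app's recorded API-call trace. This is my own simplified illustration of that idea, not the researchers' actual tool, and the call names below are invented labels rather than real iOS APIs.

```python
# Invented labels standing in for categories of API calls an analysis
# tool might record while exercising an app.
SENSITIVE_READS = {"read_address_book", "read_location", "read_device_id"}
NETWORK_SENDS = {"http_post", "socket_send"}
CONSENT_PROMPTS = {"show_permission_prompt"}

def flags_privacy_leak(trace):
    """Return True if the trace reads sensitive data and later sends it
    over the network without a consent prompt between the two events."""
    unconsented_read = False  # sensitive data read, no prompt shown yet
    for call in trace:
        if call in SENSITIVE_READS:
            unconsented_read = True
        elif call in CONSENT_PROMPTS:
            unconsented_read = False
        elif call in NETWORK_SENDS and unconsented_read:
            return True
    return False

print(flags_privacy_leak(["read_device_id", "http_post"]))  # True
print(flags_privacy_leak(
    ["read_location", "show_permission_prompt", "http_post"]))  # False
```

The real analysis was far harder, since the researchers had to reconstruct app behavior from binaries with limited information, but the flagged pattern is the same: sensitive read, then network send, with no user notification in between.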

One notable insight from the research: apps from Apple's vetted App Store were more likely to access user data surreptitiously than apps from the unpoliced Cydia repository, Egele says.

Miller says Apple should improve its process for vetting applications. “There is not an easy solution to the problem, but having a central clearinghouse (like Apple) is the best way to do it,” he says. “But right now, Apple’s probably not doing it right.”
