Tracking Trick Shows the Web Where You Are

A new technique could be used to target advertising to users’ surroundings without their knowledge.

Tom Simonite

December 15, 2010

Using nothing more than the unique number assigned to every Internet connection, websites could determine whether you’re logging on at home, at work, or a travel location like an airport or hotel, researchers at Microsoft have shown. They say the technique could target advertisements more precisely—or improve the security of Web services by identifying users as legitimate according to their location.

**Moving trends:** Microsoft researchers used Web users’ Internet protocol addresses to track when they moved in 2008 and 2009.

Websites commonly use the numbers known as Internet protocol (IP) addresses to approximate the physical location of visitors (visit this site to see the location guessed from your IP address). The method, which is typically accurate to the level of a city, lets advertisers target people with local deals.

Until now, though, IP addresses have not been used to determine what kind of place the person is connecting from. Researchers at Microsoft Research Silicon Valley used a data set of IP addresses collected from logs of updates to an unnamed widely used software package and from log-ins to an unnamed popular webmail service. Tracking user locations by IP address could help advertisers sidestep suggested features of the “do not track” option that Congress is considering as a way to let people opt out of tracking by advertisers.

The researchers first identified the IP address or addresses where each user most frequently logged in. Then they tagged any addresses more than 250 miles away from those as “travel.” Combining the logs for different users and looking at the timing of log-ins sharpened the labeling of IP addresses as either “home,” or residential connections, “work” locations such as offices, or “travel” locations from which many different users logged in when away from home. To make the results more robust, an IP address was assigned to a particular category only if the records of the majority of people who had logged in there led to the same conclusion.
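The labeling steps described above can be illustrated with a rough sketch. This is not the Microsoft team's code: the function names, the input format, and the use of a single most-frequent address per user are assumptions made for illustration, and the timing analysis that separates “home” from “work” is left out, so both are lumped together here as “usual.” Only the 250-mile threshold and the majority rule come from the article.

```python
from collections import Counter, defaultdict
from math import asin, cos, radians, sin, sqrt

TRAVEL_MILES = 250  # distance threshold quoted in the article


def miles_between(a, b):
    """Great-circle (haversine) distance in miles between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(h))  # 3958.8 = mean Earth radius in miles


def label_ips(logins, ip_location):
    """Label IPs as 'travel' or 'usual' (hypothetical helper).

    logins: iterable of (user, ip) pairs, one per log-in.
    ip_location: dict mapping ip -> (lat, lon), e.g. from a geolocation database.
    """
    # Count how often each user logged in from each IP.
    per_user = defaultdict(Counter)
    for user, ip in logins:
        per_user[user][ip] += 1

    # Each user casts one vote per IP they used, judged against
    # the IP they log in from most often.
    votes = defaultdict(Counter)
    for counts in per_user.values():
        usual_ip, _ = counts.most_common(1)[0]
        for ip in counts:
            far = miles_between(ip_location[usual_ip], ip_location[ip]) > TRAVEL_MILES
            votes[ip]["travel" if far else "usual"] += 1

    # Keep a label only when a strict majority of users agree on it.
    labels = {}
    for ip, tally in votes.items():
        label, n = tally.most_common(1)[0]
        if 2 * n > sum(tally.values()):
            labels[ip] = label
    return labels
```

In this sketch, an address where users' records split evenly gets no label at all, which mirrors the article's point that a category was assigned only when most users' records agreed.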

Verifying that the resulting database identified locations accurately would require following people around, but other clues suggest the method works. The software update data showed that over 90 percent of “travel” IP addresses were associated with laptop computers rather than desktops, while “home” IP addresses were equally likely to be associated with a laptop or a desktop. Another test compared the logged IP addresses with a small public data set of nearly 4,000 known residential broadband IP addresses. Around 24 percent of the “home” IP addresses overlapped with that set, compared with just 0.4 percent of “travel” IP addresses.

“A Web application can benefit from this location context information in many ways,” wrote research team members Yinglian Xie and Martin Abadi in an e-mail. “One example would be targeted advertisements. For example, ads about plumbing services may be more relevant to users when they are at home than when they travel.”

Other evidence suggests that this kind of targeting would work. The researchers compared logs from Web searches made from “home” and “travel” IP addresses to see how often people in the two types of locations clicked on search ads. Some search words primarily led to ad clicks from one type of location. For instance, ads served against the words “movies,” “inn,” and “cars” more often tempted users with travel IPs, while “employment,” “applications,” and “college” led to ad clicks from home.

The IP data could also be used to track people when they move to a new home, making it possible to map relocation patterns for the U.S. (see image).

“I’m not aware of any companies doing anything this sophisticated yet,” says Jules Polonetsky, director of a think tank called the Future of Privacy Forum, “but I wouldn’t be surprised to see smaller firms experiment with it soon.” Firms that track Web users in order to target advertising are increasingly turning to methods that don’t rely on cookies, he explains, because these are unaffected by browser settings or the “private browsing” modes that feature in modern browsers. “They’re looking to avoid any control that the user has,” says Polonetsky.

The best hope a user has for regaining control in the face of IP-address-based profiling is anonymity software, says Jacob Appelbaum, a security researcher and lead developer of the Tor project, which develops open-source anonymity software. Tor masks a user’s IP address by passing his or her connection through a network of relays around the world. “Some people don’t want to help businesses gain better intelligence every time they visit a website,” says Appelbaum. “Tor is a way to opt out.”

The Microsoft team says its method could be made more powerful by combining its data with data that ISPs maintain about their subscribers and with databases of the location of public Wi-Fi access points. But this enhancement is not one they will pursue, say Xie and Abadi, explaining that “the business and privacy questions that it raises are potentially problematic.”
