Study Shows Google’s Dominance of Online Advertising

The mobile Web browsing of three million people reveals how pervasive ad tracking technology is and shows that Google’s is dominant.

Tom Simonite

December 17, 2013

Analysis of the mobile Web browsing habits of over three million people has revealed previously unseen patterns in how the major advertising companies carve up the Internet. Among the findings of a new study is that Google’s advertising tentacles extend to at least 80 percent of online publishers. It also found that if only a small fraction of Web surfers opted out of being shown ads based on their previous online behavior, it would significantly decrease the industry’s profits.

The new study was carried out by academics at Stony Brook and Columbia universities, with researchers from two major telcos: AT&T in the U.S. and Telefonica in Spain. They used records of 1.5 billion mobile Web surfing sessions from 2011, supplied by an unidentified mobile network operator, to see which sites people visited and which online ad companies provided ads on those pages.

The data showed how advertisers use technology embedded in Web pages to track visitors and target them with ads. Phillipa Gill, an assistant professor at Stony Brook who led the study, says it is the most comprehensive overview yet of the online ad industry as a whole: “We had access to all of the websites the population of users accessed, publishers they visit, and all the ad aggregators on those publishers.”

Up to now, attempts to study online ad networks have involved looking at the Web pages of top publishers—defined in the study as content-hosting sites such as nytimes.com—rather than studying the full spectrum of sites people actually visit, Gill says. Ad companies and online publishers don’t release such data themselves. The new approach made it possible to count which ad aggregators are most pervasive online, putting them in the best position to track people’s browsing habits.

Google came out a clear winner, with one or more of its various ad technologies appearing on the majority of publishers visited by people whose data was in the sample. Google-branded ad tracking and targeting technology was found on 80 percent of publishers, and products from its mobile ad network subsidiary AdMob on 19 percent. Facebook has the second most extensive tracking network, mostly thanks to its “Like” button; its technology was present on 23 percent of publishers’ sites.

The figures suggest that Google should be able to effectively track most Web users. The more sites an ad aggregator is present on, the more of any one individual’s online browsing it will be able to monitor, says Gill.
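As a rough illustration of that relationship (a sketch with made-up data, not the researchers' method), the share of one person's browsing that an aggregator can observe is the fraction of that person's page visits that land on publishers carrying the aggregator's technology. The function name, site names, and visit log below are all hypothetical.

```python
def observed_fraction(visits, aggregator_sites):
    """Return the fraction of page visits that fall on publishers
    where the aggregator's tracking technology is present.

    visits: list of publisher names, one entry per page visit (hypothetical log)
    aggregator_sites: set of publishers carrying the aggregator's technology
    """
    if not visits:
        return 0.0
    seen = sum(1 for site in visits if site in aggregator_sites)
    return seen / len(visits)


# Hypothetical example: four page visits across three publishers.
visits = ["news.example", "sports.example", "news.example", "cars.example"]

# An aggregator present on one publisher versus one present on all three.
narrow = observed_fraction(visits, {"sports.example"})
broad = observed_fraction(visits, {"news.example", "sports.example", "cars.example"})
print(narrow, broad)
```

An aggregator present on more of the publishers a person visits sees a larger share of that person's visits, which is the effect Gill describes.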

A simple model the researchers built suggested that half the people whose information was in their data set could have their online interests fully captured by ad aggregators. That was partly due to the broad reach of ad aggregators and also to the fact that many people browse only a small handful of content types—for example, websites devoted to sports or motoring. “When the aggregators are able to see the full range of sites you are visiting, they are able to gauge fairly accurately if you are in the market for a car or cell phone,” says Gill.

The researchers also used their data to inform simple economic models about the value of different types of users and the scale of ad revenues collected by different companies. Ads were assigned values based on what sites they appeared on and how much an aggregator knew about the past browsing habits of the people seeing it. They showed that most revenue accrues to the very largest aggregators—90 percent of ad revenue is collected by just 5 percent of ad companies.

That model was also used to explore the potential effect on revenue of a technology like Do Not Track—a now-stalled Web standard that would allow Web users to opt out of targeted ads (see “High Stakes In Internet Tracking”).

Under Do Not Track, opting out doesn’t hide ads; it only prevents targeted ones from appearing. But a person in the industry told Gill that targeted ads command prices between two and 10 times those of untargeted ads. “The signal of your intent is basically gone,” she says. “We wanted to look at how bad it would be if something like Do Not Track was implemented in a broad way.”

The answer, from an ad industry perspective, is quite bad. The researchers’ figures show that if everyone adopted Do Not Track, ad revenue would fall by about 75 percent. If only the most valuable 5 percent of people in the data set adopted the measure, industrywide revenue would still drop 30 percent or more. “Even if a small fraction deploy Do Not Track, they can have a pretty big impact,” says Gill.
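A back-of-the-envelope calculation, which is not the study's actual model, shows how figures of this size can arise. Assume, purely for illustration, that every targeted ad earns a fixed multiple of an untargeted one and that people who opt out keep seeing the same number of ads. The function name and both parameters below are hypothetical.

```python
def revenue_drop(opted_out_revenue_share, multiplier):
    """Fractional loss in ad revenue under a toy opt-out model.

    opted_out_revenue_share: share of current (targeted) revenue that comes
        from the people who opt out, between 0 and 1 (assumed input)
    multiplier: price of a targeted ad relative to an untargeted one
        (assumed constant across all ads)
    """
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    # Each opted-out ad now earns 1/multiplier of what it earned before.
    return opted_out_revenue_share * (1 - 1 / multiplier)


# Everyone opts out, with targeted ads priced at 4x untargeted ones.
everyone = revenue_drop(1.0, 4)

# A small group that accounts for 40 percent of revenue opts out.
small_group = revenue_drop(0.4, 4)
print(round(everyone, 2), round(small_group, 2))
```

Under these assumptions, a multiplier of 4, which sits inside the two-to-10 range quoted to Gill, gives a 75 percent drop if everyone opts out. A 30 percent drop caused by just 5 percent of people would require that group to account for about 40 percent of revenue. Those inputs were chosen to reproduce the article's figures and are not taken from the study.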

Krishna Gummadi, a researcher at the Max Planck Institute for Software Systems in Germany, says the new research introduces a novel way to study the economics and impacts of online advertising at a large scale. “The paper opens up new directions in exploring an important area,” he says. “Few studies have quantified—or rigorously investigated—the relationship between how much information an aggregator collects about a user and how valuable the information is.”
