But Who’s Counting?

No one really knows how many people visit websites. A San Francisco startup and Google are both working to change that.

Jason Pontin

February 24, 2009

In August 2006, when Roger McNamee invested in Forbes, he did so in part because its Web audience was thought to be huge. McNamee is a founder of Elevation Partners, a Silicon Valley private-equity firm that counts Bono of the rock band U2 as one of its managing partners; it specializes in big, bold investments in media and technology. Onstage at EmTech, Technology Review’s annual conference, he said, “Look: I’m not investing in Forbes for its dead-trees business.”

**He counts:** Konrad Feldman, a cofounder of San Francisco–based startup Quantcast, sees big business in audience measurement.

At the time, Jim Spanfeller, the chief executive of Forbes.com, claimed that more than 15 million readers around the globe had visited his site in February, making Forbes the world’s leading business site. He supported his boast with research from ComScore Media Metrix, one of the two leading suppliers of third-party traffic data for the Web. The numbers seemed safe enough: Forbes.com’s internal server logs showed even greater Web traffic. It was embarrassing, therefore, when ComScore announced that it had changed the methods it used to estimate worldwide audiences, and that little more than seven million people had visited Forbes.com in July. That placed Forbes’s online audience below those of Dow Jones (whose sites include WSJ.com) and CNN Money (whose sites include Fortune). Bitchy press accounts suggested that McNamee had been overcharged–if not actually robbed–for his investment, which was variously reported at between $250 million and $300 million.

More than two years later, McNamee claims he always knew there were broad discrepancies between what the internal server logs of Forbes.com showed and what third parties reported. “To be a headache, it would have to be surprising,” he says. Instead, he suggests, he invested with no very precise idea of Forbes.com’s audience: “I looked at every indicator that was out there. They were all bad. In the end, I had to think about it differently. I invested in Forbes because I thought the market was underserved, and because they had made fewer mistakes than anyone else.” (To this day, McNamee declines to say how much he paid for how large an equity stake.)

People still can’t agree on how many readers visit Forbes.com. “According to ComScore, we have six to seven million visitors [per month]; our own logs say 18 to 20 million,” says Spanfeller. But while the difference between third-party and internal measurements is, for a variety of reasons, particularly striking in the case of Forbes, confusion about the size of online audiences is universal.

No one really knows how many people visit websites. No established third-party supplier of audience measurement data is trusted. Internal Web logs exaggerate audiences. This matters to more people than investors, like McNamee, who worry that they have no way to evaluate new-media businesses. The issues involved are technical, and occluded by ugly jargon, but they concern anyone anxious about the future of media as print and broadcast television and radio shrink in importance.

Happily, a California startup and Google are working to measure Web audiences in new and better ways.

The Price of Journalism
Why care about something as arcane as dodgy audience measurement? Here’s why: where content is free, as it is on most websites, the only thing that will pay for quality journalism–or, really, anything valuable at all–is advertising. For most new-media businesses, “display” or banner advertising is the main source of operating revenues. But the general inability to agree on audience numbers is stunting the growth of display advertising.

Every year, advertisers spend billions of dollars online; eMarketer, a research firm, predicts $25.7 billion in 2009 in the United States alone. Marketers study Web audiences to help them decide which sites to spend money on: they try to divine the number of people who visit a site every month, demographic details about those visitors, the length of time they stay on the site, the number of pages they view, and the relationship, if any, between the ads they see and the way they behave. The people who actually buy ads–media buyers and planners at advertising agencies–use this information to choose appropriate sites for campaigns. Finally, publishers use the data to set advertising rates.

However, the correlation between the size of Web audiences and their value to advertisers is not direct. In print, the relationship between audience size and advertising spending is simple, because the prices of ads derive largely from a publisher’s audited statement of circulation; media planners buy the total audience. Online, it’s more complicated because the currency of display advertising is ad impressions, or the number of times a specific ad is served to a particular part of a website. “Audience numbers don’t affect my buying decisions very much,” explains David L. Smith, the chief executive and founder of Mediasmith, an interactive-media planning and buying agency whose clients include the National Geographic Channel and Sega. “If we were buying the total audience of a site, it would be different. But most of the time we buy packages of impressions.”

Jim Spanfeller, who is a past chair and current board member of the Interactive Advertising Bureau (IAB), the industry association that represents sellers of online advertising, agrees with Smith that unreliable audience measurement doesn’t directly affect ad spending, at least at larger sites: “If you’re an established site like Forbes.com, you’re selling on an ad-impression basis. The problem arises when an agency is thinking about moving money from one medium, like print or television, onto the Web.” Then, Spanfeller says, media planners can’t show their clients whether Web audiences replicate or complement the audiences that advertisers are reaching through traditional media. “We need believable numbers so that we can do cross-media comparisons,” he says. Additionally, bad audience measurement “hurts smaller sites with more targeted audiences that don’t have a lot of impressions”–the class of sites that Spanfeller, like many digerati, says occupies “the long tail.”

Thus, the real consequence of the audience measurement problem is a chilling effect on the transfer of advertising from older media to new. Meanwhile, another form of online advertising is growing quickly–but it’s not the ads publishers sell. The numbers clarify. Spending on “keyword” or search advertising (the sponsored links that appear near search results on Google.com and other search sites) grew 21 percent in 2008, mostly at the expense of print, local television and radio, and Yellow Pages advertising; it now constitutes 45 percent of all online advertising. That’s because the effectiveness of keywords is unambiguous: advertisers pay directly for click-throughs or purchases. There’s no need to appeal to anything so disputed as the size or composition of Web audiences. This growth in keyword advertising has mainly benefited the search firms. By comparison, the display advertising that media companies sell grew only 4 percent the same year.

Four percent growth might sound all right to some, but it occurs at the same time that advertising revenues in print are falling rapidly. For instance, ad spending in newspapers will decline from $50.8 billion in 2007 to $45 billion by 2012, according to Borrell Associates, a research firm. Even Forbes is sweating. As a private company, it does not disclose its revenues, but the number of ad pages in its magazine has been shrinking since 2000. At the same time, the company’s online advertising revenues are reported to be between $55 million and $70 million, a figure Spanfeller did not dispute. That’s not so much for a publication with an audience of 20 (or even seven!) million. In the glory days of print advertising, publications with much smaller audiences earned as much or more: Red Herring, which I once edited, earned more than $50 million in print advertising revenue in 2000, and its circulation was only 350,000 readers, according to Ted Gramkow, the magazine’s former publisher.
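
The gap is easier to see as advertising revenue per reader. The sketch below only divides the figures quoted above; the function name is invented for illustration, and the comparison is rough, since the Forbes.com audience figures are monthly visitors, Red Herring’s is print circulation, and its $50 million is a lower bound.

```python
def revenue_per_reader(annual_ad_revenue, audience):
    """Annual advertising dollars earned per member of the audience."""
    return annual_ad_revenue / audience

# Forbes.com: reported online ad revenue of $55-70 million, against
# an audience of 7 million (ComScore) to 20 million (internal logs).
forbes_worst_case = revenue_per_reader(55e6, 20e6)
forbes_best_case = revenue_per_reader(70e6, 7e6)

# Red Herring in 2000: more than $50 million in print ad revenue,
# with a circulation of 350,000.
red_herring = revenue_per_reader(50e6, 350e3)

if __name__ == "__main__":
    print(f"Forbes.com, low estimate:  ${forbes_worst_case:,.2f} per reader")
    print(f"Forbes.com, high estimate: ${forbes_best_case:,.2f} per reader")
    print(f"Red Herring (2000):        ${red_herring:,.2f} per reader")
```

Even on the most generous reading of the Forbes.com numbers, the print magazine earned more than ten times as much per reader.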

Display advertising was meant to fund the great shift of readers to new media. It’s not happening. For more than 100 years, advertising paid publishers and underwrote their production of great journalism; now, those ad monies are being funneled to search firms that create nothing but code. As Roger McNamee says: “Getting this right is absolutely necessary for publishers to be able to continue to do interesting things.”

**Adding it up:** Although print, radio, and television still account for a large percentage of total advertising revenues, their share is decreasing as more and more ad dollars are spent online. But this isn’t necessarily a boon to new-media publishers who rely on display advertising, because “keyword” or search advertising dominates the online ad market; it now accounts for nearly half of all online ad revenues.

Panel Discussion
What’s wrong with existing methods of measuring Web audiences? Lots.

ComScore and Nielsen Online, a division of the Nielsen Company, are the established leaders in the field of audience measurement and the sale, to advertisers, agencies, and publishers, of the data that audience measurement produces.

These third-party audience measurement firms exist because the internal logs of publishers are notoriously unreliable in quantifying user activity on a given site. “When publishers use their log files, there are many limitations,” David Smith says. He says that the limitations of using these internal logs (a practice sometimes called “census measurement”) include, in ascending order of impact, overcounting individuals with multiple computers or Web browsers; counting “mechanical visits” by Web “bots” and “spiders” (for example, when Google crawls the Web to estimate the popularity of sites) as visits by real people; and overcounting individuals who periodically flush out the “cookies” of code that sites stash on browsers so that returning visitors can be recognized.
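
The overcounting Smith describes can be made concrete with a toy log. The sketch below is illustrative only: the record layout, the crawler marker strings, and the sample visits are all invented for this example and do not reflect any real publisher’s log format.

```python
from collections import namedtuple

# Hypothetical log record: one row per page request.
# `person` is the real individual, known here only because the data is invented.
Hit = namedtuple("Hit", ["cookie_id", "user_agent", "person"])

# Assumed substrings that mark automated crawlers (illustrative, not exhaustive).
BOT_MARKERS = ("bot", "spider", "crawler")

def is_bot(user_agent):
    """Return True if the user agent looks like an automated crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def naive_unique_visitors(hits):
    """Count every distinct cookie as a visitor, as a raw log tally would."""
    return len({h.cookie_id for h in hits})

def human_cookies(hits):
    """Count distinct cookies after discarding crawler traffic."""
    return len({h.cookie_id for h in hits if not is_bot(h.user_agent)})

def true_people(hits):
    """Count distinct real people behind the non-crawler hits."""
    return len({h.person for h in hits if h.person is not None})

# Invented sample: Alice uses two browsers, Bob flushed his cookies once,
# Carol has a single stable cookie, and one crawler visits.
SAMPLE_HITS = [
    Hit("c1", "Mozilla/5.0 Firefox", "alice"),
    Hit("c2", "Mozilla/5.0 Safari", "alice"),
    Hit("c3", "Mozilla/5.0 Firefox", "bob"),
    Hit("c4", "Mozilla/5.0 Firefox", "bob"),
    Hit("c5", "Mozilla/5.0 MSIE", "carol"),
    Hit("c6", "Googlebot/2.1", None),
]

if __name__ == "__main__":
    print("naive log count: ", naive_unique_visitors(SAMPLE_HITS))
    print("after bot filter:", human_cookies(SAMPLE_HITS))
    print("actual people:   ", true_people(SAMPLE_HITS))
```

In this invented sample, the raw tally sees six visitors where only three people exist. Filtering the crawler removes one; the rest of the gap comes from a second browser and a flushed cookie, which a log alone cannot detect.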

To create more-accurate audience numbers, ComScore and Nielsen Online rely on a methodology inherited from television audience research: the panel. Nielsen, for instance, has recruited nearly 30,000 panelists for its flagship product, called Netview. Panelists agree to have their Web browsing monitored through interviews and through “meters,” or spyware, installed on their personal computers.

But what worked with television doesn’t work nearly so well with the Web. “Panels are always problematical,” says Spanfeller, “but on the Web they’re super-problematical. Panels undercount by one-third to one-half.” In short, publishers simply can’t accept that their audiences are as small as panel-based measurements suggest they are.

Among the problems with panel-based audience research, according to both Spanfeller and Smith, is that it tends to undercount people who look at sites at work, because most companies’ information technology managers won’t install strange spyware on their computers. Sometimes, panelists lie to interviewers. Also, both say, there is a straightforward “sampling error” (what statisticians consider the misprisions that derive from sampling too small a portion of a general population): with as few as 30,000 panelists, the audiences of smaller sites are often grossly underestimated or missed entirely.
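
A back-of-the-envelope calculation shows why a 30,000-person panel struggles with small sites. The sketch below assumes simple random sampling from the population, which real panels only approximate through recruitment and weighting. The function names and the two reach figures (1 percent, and 1 in 100,000) are hypothetical, chosen only to contrast a large site with a long-tail one.

```python
import math

def expected_panelists(panel_size, reach):
    """Expected number of panelists who visit a site that reaches
    the fraction `reach` of the whole population."""
    return panel_size * reach

def relative_standard_error(panel_size, reach):
    """Standard error of the estimated reach, as a fraction of the true reach,
    for a binomial sample: sqrt((1 - p) / (n * p))."""
    return math.sqrt((1.0 - reach) / (panel_size * reach))

def probability_missed(panel_size, reach):
    """Chance that not a single panelist visits the site."""
    return (1.0 - reach) ** panel_size

PANEL_SIZE = 30_000       # roughly the Netview panel cited in the article
LARGE_SITE_REACH = 0.01   # assumption: 1 percent of the population
SMALL_SITE_REACH = 1e-5   # assumption: 1 person in 100,000

if __name__ == "__main__":
    for label, reach in (("large site", LARGE_SITE_REACH),
                         ("long-tail site", SMALL_SITE_REACH)):
        print(label)
        print("  expected panelists:     ", expected_panelists(PANEL_SIZE, reach))
        print("  relative standard error:", relative_standard_error(PANEL_SIZE, reach))
        print("  probability of a miss:  ", probability_missed(PANEL_SIZE, reach))
```

Under these assumptions the large site is seen by a few hundred panelists, while the long-tail site is expected to be seen by fewer than one, so it is more likely than not to be missed entirely.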

A final problem with panel-based measurements is that at the moment, neither Nielsen nor ComScore has itself been audited by an independent party. Who knows, both Spanfeller and Smith asked darkly, how valid the firms’ reporting methods really are?

Nielsen defends its panels. “I guarantee you, if our numbers were higher than the publishers’ server data, we wouldn’t be having this argument,” says Manish Bhatia, the president of global services at Nielsen Online. Bhatia notes that Nielsen does sell products, such as SiteCensus, that install software tags on publishers’ websites and measure server logs. “In combination with panels, they’re useful,” he says. “But panels are more reliable, they provide demographic information, and they tell you what people do after they’ve seen an online ad.”

For its part, ComScore also concedes that server logs have their place: they disclose which Web pages a publisher served, and when. But like Nielsen, the company insists that only panels provide an accurate measurement of audiences and their demographic makeup. “Servers don’t measure people,” says Andrew Lipsman, director of industry analysis at ComScore.

Why are Nielsen and ComScore so wedded to panels? According to David Smith, “The incumbents have a huge amount of money invested in their methodologies–and getting them to admit they have a problem isn’t easy.”

Roger McNamee is more blunt. “I understand why Nielsen is so bad,” he says. “But why isn’t there anything better? There’s a huge market opportunity for any venture capitalist who is willing to fund a system that audits actual traffic.”

“What we need is a third-party Omniture,” says Spanfeller, referring to the website analytics software that many publishers (including Technology Review) use to log their own traffic.

Measure for Measure
Recently, I visited Quantcast, a San Francisco-based startup that is hoping to provide just such a service. Founded in 2005 and funded with $26 million, mainly from Polaris Ventures and Founders Fund, the company wants its service, which launched in 2006, to overthrow traditional panel-based Web audience measurement.

Konrad Feldman, the company’s youthful, redheaded, British-born chief executive and cofounder, met me at the company’s headquarters overlooking the Yerba Buena Gardens and the Moscone Center. In a large conference room with a cement floor, decorated according to the precepts of venture-capitalist high minimalism, he asked whether Technology Review was “quantified”–that is, whether its online visitors were tracked by the startup’s software tags. After we confirmed that our site had been quantified for some time, he opened his laptop and searched for our URL at Quantcast.com.

An elegant dashboard of audience information was swiftly served: TechnologyReview.com, it said, had 342,000 “global people” and 205,000 “U.S. people.” These numbers, which measured monthly visitors to our site, were not so low as those reported by traditional third-party audience measurement firms, but they seemed suspicious: throughout 2008, Omniture reported around 650,000 unique visitors to TechnologyReview.com every month. But we also learned that 32 percent of TechnologyReview.com’s readers earned more than $100,000 a year and that 24 percent had postgraduate degrees, which seemed about right. (A peek at Forbes.com, which is not quantified but whose numbers the startup had extrapolated, showed that the business site had 4.9 million “U.S. people,” who were richer than TechnologyReview.com’s readers, although not as highly educated. Because Forbes was not quantified, Quantcast didn’t supply Forbes.com’s total worldwide audience.)

Quantcast’s service, like that of existing audience measurement firms, begins with panels–or, more precisely, panel-like data in the form of “reference samples,” provided to the company by third parties such as market research firms, Internet service providers, and toolbar companies, among other sources. These samples are used to build a basic statistical model of U.S. Web traffic. But when publishers install Quantcast’s tags on their servers, Quantcast gets more details; the startup adjusts for spiders and bots, people with multiple computers, and cookie flushers. The two methodologies are combined using something Quantcast calls its “mass inference algorithm,” created with the aid of two Stanford University mathematicians and refined by the seven mathematically minded PhDs who work at the company. This algorithmic analysis of panel research and server-based measurement is unique in Web audience measurement (although Nielsen more coarsely combines the two methodologies with a service called VideoCensus, which tracks online video viewing). The resulting audience information, says Feldman, is much more reliable than anything offered by ComScore or Nielsen.

“Publishers and advertisers have used panel-based research for nearly 75 years,” says Feldman. “So there’s obviously an established way of doing things. But equally, there’s a pretty clear recognition in the marketplace that something has got to change.”

Because Quantcast’s audience information is free (where ComScore’s and Nielsen’s measurements are not), the company hopes to make money by charging publishers who enroll in Media Planner, a service launched last May that helps media planners spend their clients’ cash. Although Media Planner is wholly free for now, Quantcast wants to expand the service so that it can finely describe demographic subsets within websites’ audiences, a utility for which the company believes the sites themselves will pay. Feldman explains this tricky idea: “You have a sales force at TechnologyReview.com, and they can’t possibly speak to everyone who might value your audience. But if you can expose that audience to buyers, then you can create a way whereby buyers can discover the parts of your audience they find particularly valuable.” Feldman says that Media Planner allows media buyers to find appropriate audiences, “but it’s the publishers that should pay, as they’re the ones getting higher rates for their audience segments.” More ambitiously, Feldman hopes Quantcast’s audience data, in combination with ad impressions, will create a new currency for advertisers, advertising agencies, and publishers that will make display ads more effective and therefore more valuable.

Feldman and his cofounder, Paul Sutter, the company’s president, do not approach the problem of audience measurement as veterans of media. Feldman, a computer scientist, was the cofounder of Searchspace (now Fortent), which developed software to help financial-services firms detect money laundering and the financing of terrorists. Sutter founded the network optimization company Orbital Data (later acquired by Citrix) as an expert on high-performance computer architectures, a background that has proved useful as Quantcast processes the thousands of terabytes of data it has collected.

When the founders first conceived the company, Sutter says, “we just asked the most simple, kindergarten questions, and it soon became clear that the language that media buyers and planners were speaking was nothing like the language of Internet advertising, with its cost-per-clicks and so forth. Media planners liked to talk about audiences, demographics, and lifestyles. So the answer was quantcasting, which means just reaching the people you want to reach.” Today, the company claims that 85,000 broadly defined “publishers” have elected to be directly measured by Quantcast, including the Disney-ABC Television Group, NBC, CBS, MTV Networks, Fox, BusinessWeek, and Time’s SI.com and CNNMoney.com.

Quantcast is not the only company with the bright idea of replacing panel-based audience measurement. Last June, Google announced a new service, Google Ad Planner, which uses the company’s detailed knowledge of Web traffic to provide interested parties with a more accurate understanding of Web audiences. Wayne Lin, Ad Planner’s product manager, demonstrated the service to me when I visited the Googleplex in Mountain View, CA. Because Google owns DoubleClick, one of the two dominant systems for serving ads, Web audience data can be combined with the ad-serving system so that media planners know which sites are best suited for which ads. The combination should be powerfully attractive for media planners and marketers, says Lin.

How do media planners regard the two new audience measurement services? “We use Quantcast now at Mediasmith, but they are not complete enough yet to be a total solution,” says David Smith, who briefly advised the startup during its formation. The difficulty, according to Smith, is that the site’s audience information won’t be really useful–let alone a new currency–until more publishers elect to be quantified. Jim Spanfeller agrees. “They’re to be commended for working hard on the problem,” he says. “But it’s very much a chicken-and-egg thing.”

As for Google’s Ad Planner, Smith says, “the agencies will never stand for it.” Smith, like everyone I spoke to, argued that media planners will resist Google’s audience information because no one wants one company to be so dominant in online advertising: were Ad Planner to be widely adopted, Google would be selling keywords through its search advertising network, AdWords; selling banner advertising through its display advertising network, AdSense; serving those ads through DoubleClick; and advising media planners on where to spend their advertising dollars.

Ad Planner also lacks a number of important features that an advertising agency might expect from an audience measurement service. According to Smith, it offers neither very detailed demographics nor a full explanation of its methodologies. Patrick Viera, TechnologyReview.com’s own digital strategist and West Coast advertising manager, said disdainfully when I asked his opinion: “Yeah, I looked at it. It doesn’t do anything you want. It’s just a tool for selling AdSense.”

Still, says Smith, there’s demand for something new. “Publishers have to use third-party measurements, but third parties [such as ComScore and Nielsen] may underestimate audiences, and the truth is probably somewhere in between. That’s why new companies like Quantcast have a chance.”

Growing Pains
But neither Quantcast nor Google nor improved products from ComScore and Nielsen Online could, by themselves or in combination, fix display advertising and thereby ensure the future health of media.

Whatever audience measurement tools are adopted, they will themselves have to be validated by an independent party. Quantcast, ComScore, and Nielsen Online (but not Google) are all in the process of being audited by the Media Rating Council (MRC), which was established at the urging of the U.S. Congress in the 1960s to audit and accredit the ratings of broadcasters. Accreditation will smooth disputes about the different audience measurement methodologies, according to George Ivie, the chief executive of the MRC: “It will help bring the numbers closer together; and it will explain and make transparent the differences between the census and panel systems.”

In addition to the disagreements about the size of Web audiences, though, online advertising suffers from deep structural problems that must be addressed before media planners and their advertising clients will spend really large sums. These are various and dauntingly technical, but according to David Smith, they all involve, in one way or another, the absence of commonly accepted, automated means to create, sell, serve, and track the performance of online ads.

Fixing all that will take years, as will the adoption of undisputed audience measurement methods. “This industry is only 13 years old,” says Smith. “It grew rapidly with few standards for six years. Then it collapsed, with very little research and development for four years, and has just been getting back to the right kind of R&D and standards in the past three.”

Still, by any estimate, the general confusion about Web audiences is the reason why the online medium has matured in so ungainly a fashion. “It’s an amazing topic,” wrote Roger McNamee in a conversation using the messaging service of the social network Facebook. “You could see it coming a mile away. Unfortunately, the remedy is not yet obvious.”

Jason Pontin is the Editor in Chief and Publisher of Technology Review.
