Turning to Academics for Analytic Insight

Wharton’s Customer Analytics Initiative tries to help companies make better sense of the data they’re sitting on.

William M. Bulkeley

May 20, 2011

Since 2007, the online ticket broker StubHub has been trying to study the buying habits of its customers more closely. Every month, it randomly selects 2,000 first-time buyers and tracks their behavior on its site over time. But analytics experts at the company were already swimming in too much data to make full use of the added information.

**More rigor:** Peter Fader, a professor of marketing at the Wharton School at the University of Pennsylvania, works with companies to figure out the signals their customers are sending.

So last month StubHub provided all that data to academic researchers to see if they could tease out new insights. StubHub wants to know whether its discount offers get dormant buyers to return to the site, whether buyers who are regularly offered discounts stop buying at full price, and which of its e-mail campaigns are successful in retaining customers.

StubHub agreed to work with the University of Pennsylvania’s Wharton Customer Analytics Initiative, a three-year-old organization that aims to make connections between companies with lots of data and academics from multiple universities who want to figure out new ways to analyze it. Originally called the Wharton Interactive Media Initiative, it changed its name this year to reflect its goal of working with more traditional companies rather than just online media. Cofounder Peter Fader, a Wharton marketing professor, hopes it will soon be working with a pharmaceutical company, a financial services firm, and some nonprofit organizations. With growing pools of data, he says, many kinds of companies that want to understand their customers’ behavior need tools more sophisticated than focus groups.

Last year, Wharton researchers worked with ESPN to help the sports network better understand the behavior of World Cup soccer viewers. ESPN wanted to know whether making games available on cell phones and computer monitors hurt viewership on its cable TV channels. The research concluded that this cross-platform availability didn’t cannibalize viewership, because fans watched on the best available screen. That’s logical, but it’s important for ESPN ad salespeople to have the research when selling ad time to sponsors.

Companies pay $150,000 to sponsor the initiative, which helps pay for its eight full-time employees. The companies also provide their data to make it all work. One key rule Fader has when evaluating proposals: the research should seek to establish causality, not just correlation. The organization seeks what he calls “granular, longitudinal data” that reveals what individual consumers do over time.

StubHub, a subsidiary of eBay, has such data from the 2,000 first-time buyers it randomly selects to follow every month. It has records of when it sent them e-mails, when it sent them special offers, when they visited the website without buying tickets, and what and when they purchased. It also creates control groups who don’t receive any special offers, so it can tell whether offers made a difference.
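To see why a control group matters, here is a minimal sketch of the comparison it makes possible: the purchase rate among buyers who received an offer minus the rate among those who did not. The function names and the sample figures are invented for illustration and are not StubHub's data or method; a real analysis would also test whether the difference is statistically significant.

```python
def purchase_rate(bought_flags):
    """Share of customers in a group who made a purchase."""
    if not bought_flags:
        raise ValueError("group must contain at least one customer")
    return sum(bought_flags) / len(bought_flags)


def offer_lift(treated, control):
    """Difference in purchase rate between the offer group and the control group."""
    return purchase_rate(treated) - purchase_rate(control)


# Hypothetical outcomes: True means the customer bought during the test window.
treated = [True, False, True, True, False, False, True, False, False, False]
control = [False, False, True, False, False, False, True, False, False, False]

lift = offer_lift(treated, control)
print(f"Offer group: {purchase_rate(treated):.0%}, control: {purchase_rate(control):.0%}")
print(f"Lift attributable to the offer: {lift:.0%}")
```

Because customers are assigned to the two groups at random, a gap between the rates can be credited to the offer itself rather than to differences between the customers.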

“StubHub is giving us a fantastic data set,” says Elea Feit, Wharton’s research director. None of the data is personally identifiable, she says; StubHub doesn’t keep demographic information beyond zip codes.

Grace Lau, StubHub’s director of relationship marketing, told the researchers that the company has no trouble attracting sellers—it lacks tickets for an event less than 2 percent of the time. The challenge it faces is to attract more ticket buyers and get them to buy more often.

The researchers suggested six ways of analyzing the data. One researcher plans to try to “predict customer purchasing behavior as a function of sport team performance.” Several proposals explore the effectiveness of e-mail marketing.

Fader says he has worked to bring analytic rigor to marketing questions for 25 years. But he also thinks that many companies, encouraged by hardware and software vendors, maintain and try to use far more personal information about their customers than they actually need. Many companies have “been collecting a lot of data they shouldn’t have permission to be collecting,” he says. “It creeps people out. The companies don’t even know what to do with the data.” For instance, he says, the most effective way to predict future buying behavior is to gather data that direct marketers have used since the 1960s: RFM, which stands for recency of purchase, frequency of purchase, and monetary value of purchase. Customers’ patterns differ depending on the product, so it’s the job of researchers, he says, to learn what can keep people buying more, or for longer.
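As a rough illustration of how little data RFM requires, the sketch below computes the three figures from nothing more than a list of dated purchases. The `Purchase` record, the `rfm` function, and the sample purchases are hypothetical, and monetary value is taken here as total spending, which is one common convention among several.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Purchase:
    customer_id: str
    day: date
    amount: float


def rfm(purchases, as_of):
    """Return {customer_id: (recency_days, frequency, monetary_total)}."""
    summary = {}
    for p in purchases:
        # Start from this purchase's date if the customer has not been seen yet.
        last_day, count, total = summary.get(p.customer_id, (p.day, 0, 0.0))
        summary[p.customer_id] = (max(last_day, p.day), count + 1, total + p.amount)
    return {
        cid: ((as_of - last_day).days, count, total)
        for cid, (last_day, count, total) in summary.items()
    }


# Hypothetical purchase history, invented for illustration.
purchases = [
    Purchase("a", date(2011, 1, 10), 80.0),
    Purchase("a", date(2011, 4, 2), 120.0),
    Purchase("b", date(2010, 11, 5), 45.0),
]

scores = rfm(purchases, as_of=date(2011, 5, 20))
for customer, (recency, frequency, monetary) in scores.items():
    print(customer, recency, frequency, monetary)
```

Nothing in this calculation needs a name, an address, or demographic detail, only an anonymous customer identifier and a record of what was bought and when.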
