Statistics Unmask Phony Online Reviews

Computer analysis spots the fingerprints that fraudulent raters leave behind.

Neil Savage

June 18, 2012

Searching for hotels in cities they’ve never visited, people often turn to customer-written reviews on websites such as TripAdvisor. But how do they know those reviews weren’t written by the hotel manager, or by someone paid to post fake opinions online? The U.S. Federal Trade Commission has issued fines when it has uncovered such “opinion spam,” but there’s no easy way to spot it.

Now researchers from the State University of New York, Stony Brook, have come up with a scientific method of detecting whether someone has been posting fake reviews online. Their technique, presented at the International Conference on Weblogs and Social Media in Dublin, Ireland, earlier this month, doesn’t identify individual fraudulent reviews. Instead, it looks at how fake reviews distort the statistical distribution of a hotel’s scores, a sort of forensic analysis that shows something funny is going on.

The technique is “able to pinpoint where the densities of false reviews are for any given hotel,” says Yejin Choi, an assistant professor of computer science at Stony Brook, who carried out the work with colleagues.

If the review scores for any product, including a hotel, are plotted on a graph, they naturally produce a pattern that looks roughly like the letter J. That is, when something is scored from one to five stars, it should have a relatively high number of one-star reviews, fewer twos, threes, and fours, and then a high number of five-star ratings. Paul Pavou, associate professor of information management systems at the Fox School of Business at Temple University, who studies online commerce, explains that this distribution is caused by a tendency of people to buy things they like, and therefore like what they buy. Furthermore, he says, if a purchase generally meets expectations, the buyer is usually less moved to write a review than if the experience was extremely positive or extremely negative.

But phony reviews distort this normal shape. To find the distortion, and thereby show that there were fake reviews in the mix, the Stony Brook team first selected reviewers it believed were more reliable: those who had written at least 10 reviews, posted more than a day or two apart, and whose ratings didn’t stray outrageously from the average for all hotels.
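The selection step described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the `Review` record, the function name, and the exact thresholds (a two-day minimum gap, a one-star maximum deviation from the overall average) are assumptions chosen to mirror the criteria in the paragraph.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class Review:
    """Hypothetical record for a single review."""
    reviewer: str
    hotel: str
    stars: int   # 1 to 5
    day: date    # date the review was posted


def reliable_reviewers(reviews, min_reviews=10, min_gap_days=2, max_deviation=1.0):
    """Return the set of reviewer names that pass all three filters.

    The thresholds are assumed values, not figures from the study.
    """
    overall = mean(r.stars for r in reviews)  # average rating across all hotels

    by_reviewer = {}
    for r in reviews:
        by_reviewer.setdefault(r.reviewer, []).append(r)

    reliable = set()
    for name, own in by_reviewer.items():
        # Filter 1: enough review history.
        if len(own) < min_reviews:
            continue
        # Filter 2: consecutive reviews are spaced out in time.
        days = sorted(r.day for r in own)
        spaced = all((b - a).days >= min_gap_days for a, b in zip(days, days[1:]))
        # Filter 3: the reviewer's average stays close to the overall average.
        typical = abs(mean(r.stars for r in own) - overall) <= max_deviation
        if spaced and typical:
            reliable.add(name)
    return reliable
```

A reviewer who posts ten reviews on the same day fails the spacing filter even with unremarkable scores, which is the kind of behavior the criteria are meant to screen out.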

The researchers then compared ratings from those reviewers to ratings from single-time reviewers to see if that second set had an unusually high number of five-star reviews. Hotels with larger discrepancies between these two sets of reviewers were labeled more suspicious. Choi also compared the ratio of positive to negative reviews among different groups of reviewers. And she looked for sudden bursts of reviewing activity that might be part of a marketing campaign.
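As a rough illustration of that comparison, the sketch below scores each hotel by how much larger the five-star share is among single-time reviewers than among reliable ones. The function names, the input layout, and the use of a simple difference in shares are assumptions made for the example; the article does not spell out the exact measure the team used.

```python
def five_star_share(stars):
    """Fraction of ratings that are five stars (0.0 for an empty list)."""
    if not stars:
        return 0.0
    return sum(1 for s in stars if s == 5) / len(stars)


def suspicion_score(reliable_stars, single_time_stars):
    """Excess five-star share among single-time reviewers.

    A larger positive value means a bigger discrepancy between the groups.
    """
    return five_star_share(single_time_stars) - five_star_share(reliable_stars)


def rank_hotels(ratings):
    """Sort hotel names from most to least suspicious.

    `ratings` maps a hotel name to a pair of star lists:
    (ratings from reliable reviewers, ratings from single-time reviewers).
    """
    scores = {
        hotel: suspicion_score(reliable, single)
        for hotel, (reliable, single) in ratings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up ratings for two hypothetical hotels.
example = {
    "Hotel A": ([5, 4, 3, 4], [5, 5, 5, 5]),
    "Hotel B": ([5, 4, 5, 4], [5, 4, 4, 5]),
}
ranking = rank_hotels(example)
```

Here Hotel A ranks first, because all of its single-time reviewers gave five stars while only a quarter of its reliable reviewers did.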

To validate the findings, Choi and colleagues turned to earlier work she’d done with computer scientist Jeff Hancock of Cornell University. They’d hired people to write phony hotel reviews; a machine-learning algorithm then analyzed the fake reviews and spotted textual clues, like too many superlatives, that made them stand out from real reviews. This time, they had the computer measure the effect that the known fake reviews had on the shape of the distribution. By comparing that with the results from Choi’s other approach, she found fraudulent activity 72 percent of the time.

Using such a technique, a site like TripAdvisor could apply a correction to average hotel ratings. And suspicious results could be paired with other approaches, such as textual analysis, for a more confident finding.

Choi admits that, because it’s so difficult to be sure which reviews are actually phony, the approach is imperfect, but the fact that her results are significantly better than chance means it’s working. “It’s really unlikely some random strategy would achieve 72 percent accuracy,” she says. Pavou, who was not involved in the research, says the approach seems valid.

Choi says fake reviewers “might think that it was a perfect crime, but the truth is, they distorted the shape of the review scores of their own hotels, and that leaves a footprint of the deceptive activity, and the more they do it, the stronger it becomes.”
