
Why and How Baidu Cheated an Artificial Intelligence Test

Machine learning gets its first cheating scandal.

The sport of training software to act intelligently just got its first cheating scandal. Last month Chinese search company Baidu announced that its image recognition software had inched ahead of Google’s on a standardized test of accuracy. On Tuesday the company admitted that it achieved those results by breaking the rules of that test.

The academic experts who maintain that test say that makes Baidu’s claims of beating Google meaningless. Ren Wu, the Baidu researcher who led work on the software in question, has apologized and said the company is reviewing its results. The company has amended a technical paper it released on its software.

We don’t know whether this was the action of one individual or a strategy of the team as a whole. But why a multibillion-dollar corporation might bother to cheat on an obscure test operated by academics on a voluntary basis is actually quite clear.

Baidu, Google, Facebook, and other major computing companies have spent heavily in recent years to build research groups dedicated to deep learning, an approach to building machine learning software that has made great strides in speech and image recognition. These companies have worked hard to hire leading experts in the small field – often from each other (see “Is Google Cornering the Market on Deep Learning”). A handful of standardized tests developed in academia are the currency by which these research groups compare one another’s progress and promote their achievements to the public.

Baidu got an unfair advantage by exploiting the test’s design. To get your software scored against the ImageNet Challenge you first train it with a standardized set of 1.5 million images. Then you submit the code to the ImageNet Challenge server so its accuracy can be tested on a collection of 100,000 “validation” images that the software has never seen before.

The Challenge rules state that you may test your code no more than twice a week, because there’s an element of chance to the results.

Baidu has admitted that it used multiple email accounts to test its code roughly 200 times in just under six months – over four times what the rules allow.
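The arithmetic behind that comparison can be sketched in a few lines. The function names are hypothetical, and the week counts are assumptions: the article says only "just under six months," so 24 weeks is used as a plausible reading and 26 weeks (a full six months) as an upper bound.

```python
def allowed_submissions(weeks: float, per_week: int = 2) -> float:
    """Submissions the rules would permit over a period of `weeks`."""
    return weeks * per_week


def overshoot_ratio(actual: int, weeks: float, per_week: int = 2) -> float:
    """How many times over the limit `actual` submissions would be."""
    return actual / allowed_submissions(weeks, per_week)


if __name__ == "__main__":
    # 24 and 26 weeks are assumed readings of "just under six months".
    for weeks in (24, 26):
        ratio = overshoot_ratio(200, weeks)
        print(f"{weeks} weeks: limit {allowed_submissions(weeks):.0f}, "
              f"ratio {ratio:.2f}x")
```

Under the 24-week assumption the limit is 48 submissions and 200 submissions is a little over four times that; with a full 26 weeks the ratio would fall just under four.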

Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence, likens what Baidu did to buying multiple lottery tickets. “If you get to buy two tickets a week you have a certain chance; if you buy 200 a week you have more of a chance,” he says. On top of that, testing slightly different code over many tests could help a research team optimize its software for peculiarities of the collection of validation images that aren’t reflected in real-world photos.
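The lottery-ticket effect can be illustrated with a small simulation. This is a simplified sketch, not a model of what Baidu actually did: it assumes every submission comes from software with the same underlying error rate, that scores on the test images vary independently by chance (approximated with a normal distribution), and that the team reports only its best score. The function name and all parameter values are hypothetical.

```python
import math
import random


def expected_best_error(true_error: float, n_images: int,
                        n_submissions: int, trials: int = 2000,
                        seed: int = 0) -> float:
    """Average of the lowest measured error across `n_submissions` tries.

    Each measured error is drawn from a normal approximation to the
    binomial noise of scoring `n_images` images (an assumption).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(true_error * (1 - true_error) / n_images)
    total = 0.0
    for _ in range(trials):
        total += min(rng.gauss(true_error, sigma)
                     for _ in range(n_submissions))
    return total / trials


if __name__ == "__main__":
    # 48 is roughly what two submissions a week allows over 24 weeks.
    for k in (1, 48, 200):
        best = expected_best_error(0.048, 100_000, k)
        print(f"{k:>3} submissions: best reported error {best * 100:.3f}%")
```

The more submissions, the lower the best reported error, even though the software has not improved at all. Real submissions would be correlated variants of the same code rather than independent draws, so the size of the effect in this sketch should not be read as an estimate for the actual case.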

Such is the success of deep learning on this particular test that even a small advantage could make a difference. Baidu had reported it achieved an error rate of only 4.58 percent, beating the previous best of 4.82 percent, reported by Google in March. In fact, some experts have noted that the small margins of victory in the race to get better on this particular test make it increasingly meaningless. That Baidu and others continue to trumpet their results all the same, and may even be willing to break the rules, suggests that being the best at machine learning matters to them very much indeed.
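To put the margin in concrete terms, the short sketch below converts the two reported error rates into a count of images and compares the gap with the chance variation expected from a single scoring run. It assumes the 100,000-image collection mentioned earlier and a simple binomial model of scoring noise; the function names are hypothetical.

```python
import math


def gap_in_images(error_a_pct: float, error_b_pct: float,
                  n_images: int) -> int:
    """Number of images separating two error rates given in percent."""
    return round(abs(error_a_pct - error_b_pct) / 100 * n_images)


def standard_error_pct(error_pct: float, n_images: int) -> float:
    """Binomial standard error of a measured error rate, in percentage points."""
    p = error_pct / 100
    return math.sqrt(p * (1 - p) / n_images) * 100


if __name__ == "__main__":
    n = 100_000  # size of the held-out image collection, per the article
    print("gap in images:", gap_in_images(4.82, 4.58, n))
    print(f"standard error: {standard_error_pct(4.58, n):.3f} points")
```

The gap between 4.82 and 4.58 percent works out to 240 images out of 100,000, while chance variation in one scoring run is on the order of a few hundredths of a percentage point under this model. A margin that small is within reach of the selection effect that comes from submitting many times and keeping the best score.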
