Are Face Recognition Systems Accurate? Depends on Your Race.

The available evidence suggests that face matching systems don’t work equally well for different races.

Everything we know about the face recognition systems the FBI and police use suggests the software has a built-in racial bias. That isn’t on purpose; it’s an artifact of how the systems are designed and of the data they are trained on. But it is problematic. Law enforcement agencies are relying more and more on such tools to aid in criminal investigations, increasing the risk that a faulty match could send an investigation in the wrong direction.

Law enforcement agencies haven’t provided many details on how they use facial recognition systems, but in June the Government Accountability Office issued a report saying that the FBI has not properly tested the accuracy of its face matching system, nor that of the massive network of state-level face matching databases it can access.

And while state-of-the-art face matching systems can be nearly 95 percent accurate on mugshot databases, those photos are taken under controlled conditions with generally coöperative subjects. Images taken under less-than-ideal circumstances, like bad lighting, or that capture unusual poses and facial expressions, can lead to errors.

Illustration by Sophia Foster-Dimino

The algorithms can also be biased due to the way they are trained, says Anil Jain, head of the biometrics research group at Michigan State University. To work, face matching software must first learn to recognize faces using training data, a set of images that gives the software information about how faces differ. If a gender, age group, or race is underrepresented in the training data, that will be reflected in the algorithm’s performance, says Jain.

In 2012, Jain and several colleagues used a set of mugshots from the Pinellas County Sheriff’s Office in Florida to examine the performance of several commercially available face recognition systems, including ones from vendors that supply law enforcement agencies. The algorithms were consistently less accurate on women, African-Americans, and younger people. Apparently they were trained on data that was not representative enough of those groups, says Jain.

“If your training set is strongly biased toward a particular race, your algorithm will do better recognizing that race,” says Alice O’Toole, head of the face perception research lab at the University of Texas at Dallas. O’Toole and several colleagues found in 2011 that an algorithm developed in Western countries was better at recognizing Caucasian faces than it was at recognizing East Asian faces. Likewise, East Asian algorithms performed better on East Asian faces than on Caucasian ones.

In the several years since these studies, the accuracy of commercial algorithms has improved significantly in many areas, and Jain says the performance gaps between different genders and races may have narrowed. But so little testing information is available that it is hard to know. Newer approaches to face recognition, such as the deep learning systems Google and Facebook have developed, can make the same sort of mistakes if the training data is imbalanced, he says.

Jonathon Phillips, an electronic engineer at the National Institute of Standards and Technology, conducts performance tests of commercial algorithms. He says that it’s possible to design a test to measure racial bias in face matching systems. In fact, privacy experts have called for making such tests a requirement.
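To give a sense of what such a test might involve, here is a minimal sketch in Python. It is not the procedure NIST or any vendor uses; the function name, the group labels, the threshold, and the toy scores are all hypothetical and made up for illustration. The idea is simply to run the same matcher on labeled image pairs and report, for each demographic group separately, how often it wrongly declares a match and how often it wrongly misses one.

```python
from collections import defaultdict


def error_rates_by_group(trials, threshold):
    """Return false match and false non-match rates for each group.

    trials: iterable of (group, same_person, score) tuples, where
      group       is a hypothetical demographic label for the image pair,
      same_person is True if both images really show the same person,
      score       is the matcher's similarity score (higher = more alike).
    threshold: scores at or above this value count as a declared match.
    """
    counts = defaultdict(
        lambda: {"fm": 0, "impostor": 0, "fnm": 0, "genuine": 0}
    )
    for group, same_person, score in trials:
        declared_match = score >= threshold
        c = counts[group]
        if same_person:
            c["genuine"] += 1
            if not declared_match:
                c["fnm"] += 1  # same person, but the matcher said "no match"
        else:
            c["impostor"] += 1
            if declared_match:
                c["fm"] += 1  # different people, but the matcher said "match"

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "false_match_rate": (
                c["fm"] / c["impostor"] if c["impostor"] else None
            ),
            "false_non_match_rate": (
                c["fnm"] / c["genuine"] if c["genuine"] else None
            ),
        }
    return rates


# Invented toy data, for illustration only (not real test results).
toy_trials = [
    ("group_a", True, 0.91), ("group_a", True, 0.85),
    ("group_a", False, 0.20), ("group_a", False, 0.35),
    ("group_b", True, 0.88), ("group_b", True, 0.55),
    ("group_b", False, 0.72), ("group_b", False, 0.30),
]
rates = error_rates_by_group(toy_trials, threshold=0.6)
for group, r in sorted(rates.items()):
    print(group, r)
```

A large gap between groups in either rate, measured at the same threshold and on enough image pairs to be statistically meaningful, would be the kind of evidence of bias the researchers describe.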

The FBI and MorphoTrust, the vendor that supplies the bureau’s face recognition software, did not answer e-mailed questions from MIT Technology Review regarding whether they test their algorithms’ performance by race, gender, or age.

The arrangements between vendors and the many state law enforcement agencies using face recognition are also not clear. But Pete Langenfeld, manager of digital analysis and identification for the Michigan State Police, says his organization does not test for group-specific accuracy. He says he does not know whether the vendor that supplied the technology performs such tests either, and adds that the information is proprietary and the company isn’t required to release it.
