Photos of people’s faces are routinely taken from websites, without the subjects’ consent, to help develop face recognition algorithms, a report by NBC reveals.

The latest example: In January IBM released a data set of almost a million photos that had been scraped from the photo-sharing website Flickr and then annotated with details such as skin tone. The company pitched this as part of efforts to reduce the (very real) problem of bias within face recognition. However, IBM didn’t get consent from anyone to do this, and it’s almost impossible to get the photos removed.

Dirty secret: IBM is far from alone. As companies scramble to improve their face recognition technology, they need access to vast numbers of images to feed their algorithms. Just taking images that have already been uploaded to the internet is a very fast—but ethically questionable—way to do that.

Rapid progress: Face recognition might be convenient for unlocking your phone, but it could be a powerful surveillance tool as well. Its use is expanding rapidly with virtually no oversight, leading to growing calls for the technology to be regulated.

A tension: Face recognition algorithms have a poor accuracy record when it comes to identifying non-white faces, and systems can misgender people on the basis of, for example, the length of their hair. One way to combat this is to add more images of, say, black women, or men with long hair, to the training data. But doing so without obtaining people’s consent before using their photos will leave many feeling deeply uncomfortable.
