An AI saw a cropped photo of AOC. It autocompleted her wearing a bikini.

Image-generation algorithms are regurgitating the same sexist, racist ideas that exist on the internet.
January 29, 2021

Language-generation algorithms are known to embed racist and sexist ideas. They’re trained on the language of the internet, including the dark corners of Reddit and Twitter that may include hate speech and disinformation. Whatever harmful ideas are present in those forums get normalized as part of their learning.

Researchers have now demonstrated that the same can be true for image-generation algorithms. Feed one a photo of a man cropped right below his neck, and 43% of the time, it will autocomplete him wearing a suit. Feed the same one a cropped photo of a woman, even a famous woman like US Representative Alexandria Ocasio-Cortez, and 53% of the time, it will autocomplete her wearing a low-cut top or bikini. This has implications not just for image generation, but for all computer-vision applications, including video-based candidate assessment algorithms, facial recognition, and surveillance.

Ryan Steed, a PhD student at Carnegie Mellon University, and Aylin Caliskan, an assistant professor at George Washington University, looked at two algorithms: OpenAI’s iGPT (a version of GPT-2 that is trained on pixels instead of words) and Google’s SimCLR. While each algorithm approaches learning images differently, they share an important characteristic—they both use completely unsupervised learning, meaning they do not need humans to label the images.

This is a relatively new innovation as of 2020. Previous computer-vision algorithms mainly used supervised learning, which involves feeding them manually labeled images: cat photos with the tag “cat” and baby photos with the tag “baby.” But in 2019, researcher Kate Crawford and artist Trevor Paglen found that these human-created labels in ImageNet, the most foundational image data set for training computer-vision models, sometimes contain disturbing language, like “slut” for women and racial slurs for minorities.

The latest paper demonstrates an even deeper source of toxicity. Even without these human labels, the images themselves encode unwanted patterns. The issue parallels what the natural-language processing (NLP) community has already discovered. The enormous datasets compiled to feed these data-hungry algorithms capture everything on the internet. And the internet has an overrepresentation of scantily clad women and other often harmful stereotypes.

To conduct their study, Steed and Caliskan cleverly adapted a technique that Caliskan previously used to examine bias in unsupervised NLP models. These models learn to manipulate and generate language using word embeddings, a mathematical representation of language that clusters words commonly used together and separates words commonly found apart. In a 2017 paper published in Science, Caliskan measured the distances between the different word pairings that psychologists were using to measure human biases in the Implicit Association Test (IAT). She found that those distances almost perfectly recreated the IAT’s results. Stereotypical word pairings like man and career or woman and family were close together, while opposite pairings like man and family or woman and career were far apart.
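
The distance comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the two-dimensional vectors are invented toy values rather than real embeddings, the function names are hypothetical, and using the sample standard deviation to normalize the score is an assumption.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def association(w, attrs_a, attrs_b):
    """How much closer w sits to attribute set A than to attribute set B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)


def effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Normalized difference in association between two target sets."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    # Assumption: normalize by the sample standard deviation over all targets.
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Toy, hand-made vectors (NOT real embeddings), chosen only for illustration.
man_like = [(1.0, 0.1), (0.9, 0.2)]
woman_like = [(0.1, 1.0), (0.2, 0.9)]
career_like = [(1.0, 0.0), (0.8, 0.1)]
family_like = [(0.0, 1.0), (0.1, 0.8)]

score = effect_size(man_like, woman_like, career_like, family_like)
print(f"effect size: {score:.2f}")
```

A positive score means the first target group leans toward the first attribute group more than the second target group does; swapping the two target groups flips the sign.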

iGPT is also based on embeddings: it clusters or separates pixels based on how often they co-occur within its training images. Those pixel embeddings can then be used to compare how close or far two images are in mathematical space.

In their study, Steed and Caliskan once again found that those distances mirror the results of the IAT. Photos of men appear close to photos of ties and suits, while photos of women appear farther from them. The researchers got the same results with SimCLR, despite it using a different method for deriving embeddings from images.

These results have concerning implications for image generation. Other image-generation algorithms, like generative adversarial networks, have led to an explosion of deepfake pornography that almost exclusively targets women. iGPT in particular adds yet another way for people to generate sexualized photos of women.

But the potential downstream effects are much bigger. In the field of NLP, unsupervised models have become the backbone for all kinds of applications. Researchers begin with an existing unsupervised model like BERT or GPT-2 and use a tailored dataset to “fine-tune” it for a specific purpose. This semi-supervised approach, a combination of both unsupervised and supervised learning, has become a de facto standard.

The computer-vision field is beginning to see the same trend. Steed and Caliskan worry about what these baked-in biases could mean when the algorithms are used for sensitive applications such as policing or hiring, where models are already analyzing candidates’ video recordings to decide whether they’re a good fit for the job. “These are very dangerous applications that make consequential decisions,” says Caliskan.

Deborah Raji, a Mozilla fellow who co-authored an influential study revealing the biases in facial recognition, says the study should serve as a wake-up call to the computer-vision field. “For a long time, a lot of the critique on bias was about the way we label our images,” she says. Now this paper is saying “the actual composition of the dataset is resulting in these biases. We need accountability on how we curate these data sets and collect this information.”

Steed and Caliskan urge the companies developing these models to be more transparent and to open-source them so that the academic community can continue its investigations. They also encourage fellow researchers to do more testing before deploying a vision model, such as by using the methods they developed for this paper. And finally, they hope the field will develop more responsible ways of compiling and documenting what’s included in training datasets.

Caliskan says the goal is ultimately to gain greater awareness and control when applying computer vision. “We need to be very careful about how we use them,” she says, “but at the same time, now that we have these methods, we can try to use this for social good.”
