How to fight hate online

Data scientist Jennifer Chayes thinks we can use computational tools to root out bad behavior online.

Jennifer Chayes
Christie Hemm Klok

During her time at Microsoft and in academia, Jennifer Chayes has been fighting to use data science and computing to make artificial intelligence more fair and less biased. 

From dropping out of school at the age of 15 to becoming the doyen of data science at the University of California, Berkeley, Chayes has had quite the career path. She joined UCLA in 1987 as a tenured professor of mathematics. Ten years later, Microsoft lured her to cofound its interdisciplinary Research Theory Group. 

It was in her Microsoft lab in New York City that researchers discovered bias in the company’s facial recognition software, showing that the system classified white faces more accurately than it did brown and Black faces. This finding caused the company to turn down a lucrative contract with a police department and start working to remove the bias from such algorithms. The FATE (Fairness, Accountability, Transparency and Ethics in AI) group was created at the lab. 

Anil Ananthaswamy asked Chayes, now associate provost of the Division of Computing, Data Science, and Society and dean of the School of Information at Berkeley, how data science is transforming computing and other fields.

Q: What was it like to transition from academia to industry?

A: That was quite a shock. The VP of research at Microsoft, Dan Ling, called me to try to convince me to go for an interview. I talked to him for about 40 minutes. And I finally said, “Do you really want to know what’s bothering me? Microsoft is a bunch of adolescent boys, and I don’t want to spend my life with a bunch of adolescent boys.”

Q: How did he react to that?

A: He said, “Oh, no, we are not. Come and meet us.” I met some incredible women there when I visited, and I met phenomenally open-minded people who wanted to try things to change the world.

Q: How has data science changed computing?

A: As we’ve gotten more data, computer science has begun to look outward. I think of data science as a marriage of computing, statistics, ethics, and a domain emphasis or a disciplinary emphasis, be it biomedicine and health, climate and sustainability, or human welfare and social justice, and so on. It is transforming computing.

Q: Is there a difference in how data scientists solve problems?

A: With the advent of all of this data, we have the opportunity to learn from the data without having a theory of why something is happening. Especially in this age of machine learning and deep learning, it enables us to draw conclusions and make predictions without an underlying theory.

Q: Can that cause problems?

A: Some consider it a problem in cases in which you have, [for example], biomedical data. The data very accurately predicts what’s going to work and what’s not going to work, without an underlying biological mechanism. 

Q: Any advantages?

A: What the data has allowed us to do now, in many cases, is to run what an economist would call a counterfactual, where you actually see random variation in the data that allows you to draw conclusions without doing the experiments. That’s incredibly useful. 

Do I really want to try out different educations on different populations? Or do I want to see [that] there was random variation at some point that will allow me to draw a really good causal inference, and therefore I can base policy on it?
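As a rough illustration of the idea of learning from random variation, here is a minimal sketch on purely synthetic data. It assumes a hypothetical program whose places were handed out by lottery, so chance alone decided who took part. The function names, the effect size, and the noise levels are all invented for the example and do not come from Chayes's work.

```python
import random
from statistics import mean

def simulate_lottery(n=20_000, true_effect=5.0, seed=1):
    """Generate synthetic (won_lottery, outcome) records.

    Assumption: the lottery is independent of each person's background,
    which is what makes the simple comparison below meaningful.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        background = rng.gauss(50, 10)   # factors the analyst never observes
        won = rng.random() < 0.5         # assignment by chance alone
        outcome = background + (true_effect if won else 0.0) + rng.gauss(0, 5)
        records.append((won, outcome))
    return records

def difference_in_means(records):
    """Estimate the effect as mean(outcome | won) - mean(outcome | lost)."""
    treated = [y for won, y in records if won]
    control = [y for won, y in records if not won]
    return mean(treated) - mean(control)

estimate = difference_in_means(simulate_lottery())
print(f"estimated effect: {estimate:.2f} (simulated true effect: 5.00)")
```

Because assignment is random in this toy setup, the two groups have similar backgrounds on average, and the gap in mean outcomes lands close to the simulated effect. With real data, the hard part is arguing that the variation really was random.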

Q: Do you see a problem in how data is being used, especially by big companies?

A: There are myriad problems. It’s not only being used by tech corporations. It’s being used by insurance companies. It’s being used by government platforms, public health platforms, and educational platforms. If you do not explicitly understand what biases can be creeping in, both in the data sets themselves and in the algorithms, you will likely exacerbate bias.

These biases sneak in [when] there isn’t much data. And it can also get correlated with other factors. I personally worked on interpreting bios and CVs automatically. We are not allowed to use gender or race. Even if I don’t look at [these] protected attributes, there are many things [in the data] that are proxies for gender or race. If you have gone to certain schools, if you grew up in certain neighborhoods, if you played certain sports and you had certain activities, they are correlated [with gender or race]. 

Q: Do algorithms pick up on these proxies? 

A: They exacerbate it. You must explicitly understand this, and you must explicitly prevent it in writing the algorithm. 
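The proxy effect described here can be sketched with a small synthetic example. Everything in it is an assumption made for illustration: `group` stands in for a protected attribute, `neighborhood` is a hypothetical feature that matches it 90% of the time, and `blind_screen` is an invented rule that never reads the protected attribute at all.

```python
import random

def make_applicants(n=10_000, proxy_strength=0.9, seed=0):
    """Synthetic applicants. `neighborhood` equals `group` with
    probability `proxy_strength`, so it acts as a proxy for it."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        group = rng.randint(0, 1)
        neighborhood = group if rng.random() < proxy_strength else 1 - group
        applicants.append({"group": group, "neighborhood": neighborhood})
    return applicants

def blind_screen(applicant):
    """A 'blind' rule: it looks only at the proxy, never at `group`."""
    return applicant["neighborhood"] == 1

def selection_rate(applicants, group):
    """Share of a group's members that the blind rule selects."""
    members = [a for a in applicants if a["group"] == group]
    return sum(blind_screen(a) for a in members) / len(members)

applicants = make_applicants()
rate_0 = selection_rate(applicants, 0)
rate_1 = selection_rate(applicants, 1)
print(f"selection rate, group 0: {rate_0:.2f}")
print(f"selection rate, group 1: {rate_1:.2f}")
```

Even though the rule never sees `group`, the two selection rates come out far apart, because the proxy carries most of the same information. Checking outcomes per group, as `selection_rate` does, is one simple way to notice this; it is a sketch of the problem, not a fix for it.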

Q: How can we address such issues?

A: There is this whole area of FATE: fairness, accountability, transparency, and ethics in AI, which is the design of these algorithms and understanding what they are. But there is so much more that we need to do. 

Q: And data science helps?

A: This is absolutely data science. There is part of the web called the “manosphere,” where a lot of hate is originating. It’s kind of hard to trace. But if you use natural-language processing and other tools, you can see where it’s coming from. You can also try to build interfaces that allow advocacy groups and others to find this and to help root it out. This goes beyond just being fair. This is turning the tables on the way in which these platforms have been usurped to increase bias and hate and saying, “We are going to use the power of computing and data science to identify and mitigate hate.”
