How to fight hate online

Data scientist Jennifer Chayes thinks we can use computational tools to root out bad behavior online.

Anil Ananthaswamy

October 27, 2021

Christie Hemm Klok

During her time at Microsoft and in academia, Jennifer Chayes has been fighting to use data science and computing to make artificial intelligence more fair and less biased.

From dropping out of school at the age of 15 to becoming the doyenne of data science at the University of California, Berkeley, Chayes has had quite the career path. She joined UCLA in 1987 as a tenured professor of mathematics. Ten years later, Microsoft lured her to cofound its interdisciplinary Research Theory Group.

It was in her Microsoft lab in New York City that researchers discovered bias in the company’s facial recognition software, showing that the system classified white faces more accurately than it did brown and Black faces. This finding caused the company to turn down a lucrative contract with a police department and start working to remove the bias from such algorithms. The FATE (Fairness, Accountability, Transparency and Ethics in AI) group was created at the lab.

Anil Ananthaswamy asked Chayes, now associate provost of the Division of Computing, Data Science, and Society and dean of the School of Information at Berkeley, how data science is transforming computing and other fields.

Q: What was it like to transition from academia to industry?

A: That was quite a shock. The VP of research at Microsoft, Dan Ling, called me to try to convince me to go for an interview. I talked to him for about 40 minutes. And I finally said, “Do you really want to know what’s bothering me? Microsoft is a bunch of adolescent boys, and I don’t want to spend my life with a bunch of adolescent boys.”

Q: How did he react to that?

A: He said, “Oh, no, we are not. Come and meet us.” I met some incredible women there when I visited, and I met phenomenally open-minded people who wanted to try things to change the world.

Q: How has data science changed computing?

A: As we’ve gotten more data, computer science has begun to look outward. I think of data science as a marriage of computing, statistics, ethics, and a domain emphasis or a disciplinary emphasis, be it biomedicine and health, climate and sustainability, or human welfare and social justice, and so on. It is transforming computing.

Q: Is there a difference in how data scientists solve problems?

A: With the advent of all of this data, we have the opportunity to learn from the data without having a theory of why something is happening. Especially in this age of machine learning and deep learning, it enables us to draw conclusions and make predictions without an underlying theory.

Q: Can that cause problems?

A: Some consider it a problem in cases in which you have, [for example], biomedical data. The data very accurately predicts what’s going to work and what’s not going to work, without an underlying biological mechanism.

Q: Any advantages?

A: What the data has allowed us to do now, in many cases, is to run what an economist would call a counterfactual, where you actually see random variation in the data that allows you to draw conclusions without doing the experiments. That’s incredibly useful.

Do I really want to try out different educations on different populations? Or do I want to see [that] there was random variation at some point that will allow me to draw a really good causal inference, and therefore I can base policy on it?

Q: Do you see a problem in how data is being used, especially by big companies?

A: There are myriad problems. It’s not only being used by tech corporations. It’s being used by insurance companies. It’s being used by government platforms, public health platforms, and educational platforms. If you do not explicitly understand what biases can be creeping in, both in the data sets themselves and in the algorithms, you will likely exacerbate bias.

These biases sneak in [when] there isn’t much data. And it can also get correlated with other factors. I personally worked on interpreting bios and CVs automatically. We are not allowed to use gender or race. Even if I don’t look at [these] protected attributes, there are many things [in the data] that are proxies for gender or race. If you have gone to certain schools, if you grew up in certain neighborhoods, if you played certain sports and you had certain activities, they are correlated [with gender or race].

Q: Do algorithms pick up on these proxies?

A: They exacerbate it. You must explicitly understand this, and you must explicitly prevent it in writing the algorithm.
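The proxy effect described above can be made concrete with a small check: how well does a single "neutral" feature let you guess a protected attribute that was never given to the model? The sketch below is only an illustration under stated assumptions. The records, the field names (`sport`, `degree`, `group`), and the helper functions are all hypothetical, and the majority-vote measure is a simple stand-in, not the method Chayes or her colleagues used.

```python
from collections import Counter, defaultdict

def baseline_accuracy(rows, protected):
    """Accuracy of always guessing the most common protected value."""
    counts = Counter(row[protected] for row in rows)
    return counts.most_common(1)[0][1] / len(rows)

def proxy_strength(rows, feature, protected):
    """Accuracy of guessing `protected` from `feature` alone,
    using a majority vote within each feature value."""
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[feature]][row[protected]] += 1
    # For each feature value, count the records the majority guess gets right.
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return correct / len(rows)

# Hypothetical toy records; "group" stands in for a protected attribute.
rows = [
    {"sport": "softball", "degree": "BS", "group": "A"},
    {"sport": "softball", "degree": "BA", "group": "A"},
    {"sport": "softball", "degree": "BS", "group": "A"},
    {"sport": "football", "degree": "BA", "group": "B"},
    {"sport": "football", "degree": "BS", "group": "B"},
    {"sport": "football", "degree": "BA", "group": "A"},
    {"sport": "softball", "degree": "BA", "group": "B"},
    {"sport": "football", "degree": "BS", "group": "B"},
]

print(baseline_accuracy(rows, "group"))        # 0.5
print(proxy_strength(rows, "sport", "group"))  # 0.75
print(proxy_strength(rows, "degree", "group")) # 0.5
```

In this toy data, `sport` recovers the group well above the 50 percent baseline, so it behaves like a proxy, while `degree` carries no such signal. A real audit would use held-out data and combinations of features rather than one column at a time.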

Q: How can we address such issues?

A: There is this whole area of FATE: fairness, accountability, transparency, and ethics in AI, which is the design of these algorithms and understanding what they are. But there is so much more that we need to do.

Q: And data science helps?

A: This is absolutely data science. There is part of the web called the “manosphere,” where a lot of hate is originating. It’s kind of hard to trace. But if you use natural-language processing and other tools, you can see where it’s coming from. You can also try to build interfaces that allow advocacy groups and others to find this and to help root it out. This goes beyond just being fair. This is turning the tables on the way in which these platforms have been usurped to increase bias and hate and saying, “We are going to use the power of computing and data science to identify and mitigate hate.”
