It was in her Microsoft lab in New York City that researchers discovered bias in the company’s facial recognition software, showing that the system classified white faces more accurately than it did brown and Black faces. This finding led the company to turn down a lucrative contract with a police department and to start working to remove the bias from such algorithms. The FATE (Fairness, Accountability, Transparency and Ethics in AI) group was created at the lab.

Anil Ananthaswamy asked Chayes, now associate provost of the Division of Computing, Data Science, and Society and dean of the School of Information at Berkeley, how data science is transforming computing and other fields.

Q: What was it like to transition from academia to industry?

A: That was quite a shock. The VP of research at Microsoft, Dan Ling, called me to try to convince me to go for an interview. I talked to him for about 40 minutes. And I finally said, “Do you really want to know what’s bothering me? Microsoft is a bunch of adolescent boys, and I don’t want to spend my life with a bunch of adolescent boys.”

Q: How did he react to that?

A: He said, “Oh, no, we are not. Come and meet us.” I met some incredible women there when I visited, and I met phenomenally open-minded people who wanted to try things to change the world.

Q: How has data science changed computing?

A: As we’ve gotten more data, computer science has begun to look outward. I think of data science as a marriage of computing, statistics, ethics, and a domain emphasis or a disciplinary emphasis, be it biomedicine and health, climate and sustainability, or human welfare and social justice, and so on. It is transforming computing.

Q: Is there a difference in how data scientists solve problems?

A: With the advent of all of this data, we have the opportunity to learn from the data without having a theory of why something is happening. Especially in this age of machine learning and deep learning, it enables us to draw conclusions and make predictions without an underlying theory.

Q: Can that cause problems?

A: Some consider it a problem in cases in which you have, [for example], biomedical data. The data very accurately predicts what’s going to work and what’s not going to work, without an underlying biological mechanism.

Q: Any advantages?

A: What the data has allowed us to do now, in many cases, is to run what an economist would call a counterfactual, where you actually see random variation in the data that allows you to draw conclusions without doing the experiments. That’s incredibly useful.

Do I really want to try out different educations on different populations? Or do I want to see [that] there was random variation at some point that will allow me to draw a really good causal inference, and therefore I can base policy on it?