At Microsoft and in academia, Jennifer Chayes has fought to use data science and computing to make artificial intelligence fairer and less biased.
From dropping out of school at the age of 15 to becoming the doyen of data science at the University of California, Berkeley, Chayes has had quite the career path. She joined UCLA in 1987 as a tenured professor of mathematics. Ten years later, Microsoft lured her to cofound its interdisciplinary Research Theory Group.
It was in her Microsoft lab in New York City that researchers discovered bias in the company’s facial recognition software, showing that the system classified white faces more accurately than it did brown and Black faces. This finding caused the company to turn down a lucrative contract with a police department and start working to remove the bias from such algorithms. The FATE (Fairness, Accountability, Transparency and Ethics in AI) group was created at the lab.
Anil Ananthaswamy asked Chayes, now associate provost of the Division of Computing, Data Science, and Society and dean of the School of Information at Berkeley, how data science is transforming computing and other fields.
Q: What was it like to transition from academia to industry?
A: That was quite a shock. The VP of research at Microsoft, Dan Ling, called me to try to convince me to go for an interview. I talked to him for about 40 minutes. And I finally said, “Do you really want to know what’s bothering me? Microsoft is a bunch of adolescent boys, and I don’t want to spend my life with a bunch of adolescent boys.”
Q: How did he react to that?
A: He said, “Oh, no, we are not. Come and meet us.” I met some incredible women there when I visited, and I met phenomenally open-minded people who wanted to try things to change the world.
Q: How has data science changed computing?
A: As we’ve gotten more data, computer science has begun to look outward. I think of data science as a marriage of computing, statistics, ethics, and a domain emphasis or a disciplinary emphasis, be it biomedicine and health, climate and sustainability, or human welfare and social justice, and so on. It is transforming computing.
Q: Is there a difference in how data scientists solve problems?
A: With the advent of all of this data, we have the opportunity to learn from the data without having a theory of why something is happening. Especially in this age of machine learning and deep learning, it enables us to draw conclusions and make predictions without an underlying theory.
Q: Can that cause problems?
A: Some consider it a problem in cases in which you have, [for example], biomedical data. The data very accurately predicts what’s going to work and what’s not going to work, without an underlying biological mechanism.
Q: Any advantages?
A: What the data has allowed us to do now, in many cases, is to run what an economist would call a counterfactual, where you actually see random variation in the data that allows you to draw conclusions without doing the experiments. That’s incredibly useful.
Do I really want to try out different educations on different populations? Or do I want to see [that] there was random variation at some point that will allow me to draw a really good causal inference, and therefore I can base policy on it?
Q: Do you see a problem in how data is being used, especially by big companies?
A: There are myriad problems. It’s not only being used by tech corporations. It’s being used by insurance companies. It’s being used by government platforms, public health platforms, and educational platforms. If you do not explicitly understand what biases can be creeping in, both in the data sets themselves and in the algorithms, you will likely exacerbate bias.
These biases sneak in [when] there isn’t much data. And it can also get correlated with other factors. I personally worked on interpreting bios and CVs automatically. We are not allowed to use gender or race. Even if I don’t look at [these] protected attributes, there are many things [in the data] that are proxies for gender or race. If you have gone to certain schools, if you grew up in certain neighborhoods, if you played certain sports and you had certain activities, they are correlated [with gender or race].
Q: Do algorithms pick up on these proxies?
A: They exacerbate it. You must explicitly understand this, and you must explicitly prevent it in writing the algorithm.
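To make the idea of a proxy concrete, here is a minimal Python sketch. It is not drawn from Chayes's work: the records, the group labels "A" and "B", the "attended a certain school" feature, and both function names are hypothetical and made up for illustration. It shows that even after a protected attribute is removed from the inputs, a correlated feature can still let a system guess that attribute well above chance.

```python
# Hypothetical toy records: (protected_group, attended_certain_school).
# All values are invented for illustration; they are not real data.
records = [
    ("A", 1), ("A", 1), ("A", 1), ("A", 0),
    ("B", 0), ("B", 0), ("B", 0), ("B", 1),
]


def rate_by_group(rows):
    """Share of rows with feature == 1 within each protected group."""
    totals, hits = {}, {}
    for group, feature in rows:
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + feature
    return {g: hits[g] / totals[g] for g in totals}


def guess_group_from_feature(rows):
    """Accuracy of guessing the group from the feature alone.

    For each feature value, guess the group seen most often with it.
    """
    counts = {}
    for group, feature in rows:
        by_group = counts.setdefault(feature, {})
        by_group[group] = by_group.get(group, 0) + 1
    correct = sum(max(by_group.values()) for by_group in counts.values())
    return correct / len(rows)


rates = rate_by_group(records)
accuracy = guess_group_from_feature(records)
print(rates)     # feature rate differs sharply between the two groups
print(accuracy)  # chance would be 0.5 with two equal-sized groups
```

In this toy data the feature appears for three-quarters of group A and one-quarter of group B, so the feature alone recovers the group three times out of four. A check like this is only a first diagnostic; it flags a possible proxy but does not by itself remove the bias.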
Q: How can we address such issues?
A: There is this whole area of FATE: fairness, accountability, transparency, and ethics in AI, which is the design of these algorithms and understanding what they are. But there is so much more that we need to do.
Q: And data science helps?
A: This is absolutely data science. There is part of the web called the “manosphere,” where a lot of hate is originating. It’s kind of hard to trace. But if you use natural-language processing and other tools, you can see where it’s coming from. You can also try to build interfaces that allow advocacy groups and others to find this and to help root it out. This goes beyond just being fair. This is turning the tables on the way in which these platforms have been usurped to increase bias and hate and saying, “We are going to use the power of computing and data science to identify and mitigate hate.”