Artificial intelligence has a well-known bias problem, particularly when it comes to race and gender. You may have seen some of the headlines: facial recognition systems that fail to recognize black women, or automated recruiting tools that pass over female candidates.
But while researchers have tried hard to address some of the most egregious issues, there’s one group of people they have overlooked: those with disabilities. Take self-driving cars. Their algorithms rely on training data to learn what pedestrians look like so the vehicles won’t run them over. If the training data doesn’t include people in wheelchairs, the technology could put those people in life-threatening danger.
For Shari Trewin, a researcher on IBM’s accessibility leadership team, this is unacceptable. As part of a new initiative, she is now exploring new design processes and technical methods to mitigate machine bias against people with disabilities. She talked to us about some of the challenges—as well as some possible solutions.
The following has been edited for length and clarity.
Why is fairness to people with disabilities a different problem from fairness concerning other protected attributes like race and gender?
Disability status is much more diverse and complex in the ways that it affects people. A lot of systems will model race or gender as a simple variable with a small number of possible values. But when it comes to disability, there are so many different forms and different levels of severity. Some of them are permanent, some are temporary. Any one of us might join or leave this category at any time in our lives. It’s a dynamic thing.
About one in five people in the US currently has a disability of some kind. So it’s really prevalent, but hard to pin down into a simple variable with a small number of possible values. A system might discriminate against blind people but not against deaf people, so testing for fairness becomes much harder.
Disability information is also very sensitive. People are much more reluctant to reveal it than gender or age or race information, and in some situations it’s even illegal to ask for this information. So a lot of times in the data you’re much less likely to know anything about disabilities that a person may or may not have. That also makes it much harder to know if you have a fair system.
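To make the testing point in the answer above concrete, here is a minimal sketch in Python. It assumes you do have disability labels, which the answer notes is often not the case. The records, group names, and numbers are invented for illustration and do not come from any real system. It compares selection rates when disability is pooled into one yes/no flag with rates computed per subgroup.

```python
from collections import defaultdict

def selection_rates(records, group_of):
    """Fraction of records with a positive outcome, per group."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for record in records:
        group = group_of(record)
        totals[group] += 1
        selected[group] += record["selected"]
    return {group: selected[group] / totals[group] for group in totals}

def make(disability, n_selected, n_total):
    """Build hypothetical records: n_selected positives out of n_total."""
    return [
        {"disability": disability, "selected": 1 if i < n_selected else 0}
        for i in range(n_total)
    ]

# Invented numbers for illustration only.
records = make("none", 50, 100) + make("blind", 2, 10) + make("deaf", 5, 10)

# Pooled view: disability treated as a single yes/no variable.
coarse = selection_rates(
    records, lambda r: "none" if r["disability"] == "none" else "any"
)
# Per-subgroup view: each disability category measured separately.
fine = selection_rates(records, lambda r: r["disability"])

print(coarse)
print(fine)
```

With these invented numbers, the pooled view shows a rate of 0.35 for people with any disability against 0.5 for everyone else. The per-subgroup view shows that deaf candidates are selected at the same 0.5 rate, while blind candidates are selected at 0.2. A single pooled variable would have hidden where the problem actually sits.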
I wanted to ask you about that. As humans, we decided the best way to avoid disability discrimination is to not reveal disability status. Why wouldn’t that hold true for machine-learning systems?
Yeah, that’s the first thing people think of: if the system doesn’t know anything about individuals’ disability status, surely it will be fair. But the problem is that the disability often affects other pieces of information that are being fed into the model. For example, say I am a person who uses a screen reader to access the web, and I’m doing an online test for a job application. If that test program isn’t well designed and accessible to my screen reader, it’s going to take me longer to navigate around the page before I can answer the question. If that extra time isn’t taken into consideration in assessing me, then anybody using that same tool with a similar disability is at a systematic disadvantage, even if the system doesn’t know that I’m blind.
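The screen reader example can be sketched in a few lines of Python. The scoring rule, the time limit, and the two candidates are hypothetical and chosen only to show the mechanism. The point is that the scoring function never receives the `screen_reader` field, yet its output still differs.

```python
def score(correct_answers, seconds_taken, time_limit=600):
    """Hypothetical rule: one point per correct answer,
    minus one point per minute spent over the time limit."""
    overtime_minutes = max(0, seconds_taken - time_limit) / 60
    return correct_answers - overtime_minutes

# Two invented candidates with identical accuracy. Candidate B needs more
# time because the test page is hard to navigate with a screen reader.
candidates = [
    {"name": "A", "correct": 18, "seconds": 540, "screen_reader": False},
    {"name": "B", "correct": 18, "seconds": 900, "screen_reader": True},
]

# The "screen_reader" field is never passed to the scoring function.
scores = {c["name"]: score(c["correct"], c["seconds"]) for c in candidates}
print(scores)
```

Both candidates answer 18 questions correctly, but under this assumed rule candidate A scores 18 and candidate B scores 13. Hiding disability status from the system does nothing here, because completion time already carries that information.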
So if there are so many different nuances to disability, is it actually possible to achieve fairness?
I think the more general challenge for the AI community is how to handle outliers, because machine-learning systems—they learn norms, right? They optimize for norms and don’t treat outliers in any special way. But oftentimes people with disabilities don’t fit the norm. The way that machine learning judges people by who it thinks they’re similar to—even when it may never have seen anybody similar to you—is a fundamental limitation in terms of fair treatment for people with disabilities.
What would work a lot better would be a method that combines machine learning with some additional solution, like logical rules that are implemented in a layer above. There are also some situations where more attention to gathering a more diverse data set would definitely help. Some people are experimenting with techniques where you take out the core of the data and try to train for the outliers. Others are experimenting with different learning techniques that might optimize better for outliers rather than the norm.
I think it’s only when you start thinking about disability that you start thinking about the diversity of individuals and the importance of outliers. If you don’t have enough gender diversity in your data set, you can fix that. It’s not so easy to fix disability diversity.
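One way to picture the idea of logical rules layered above a learned model is a check that routes unusual inputs to a person instead of trusting the model's score. The sketch below is an assumption about how such a layer could look, not a description of IBM's approach. The model score, the training times, the threshold of three standard deviations, and the decision labels are all hypothetical.

```python
from statistics import mean, stdev

def is_outlier(value, training_values, z_threshold=3.0):
    """True if value lies more than z_threshold standard deviations
    from the mean of the values the model was trained on."""
    mu = mean(training_values)
    sigma = stdev(training_values)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold

def decide(model_score, completion_seconds, training_seconds, cutoff=0.5):
    """Rule layer: send inputs unlike the training data to human review.
    Otherwise fall back to the (hypothetical) model score."""
    if is_outlier(completion_seconds, training_seconds):
        return "human_review"
    return "accept" if model_score >= cutoff else "reject"

# Invented completion times, in seconds, seen during training.
training_seconds = [500, 510, 520, 530, 540, 550, 560, 570, 580, 600]

print(decide(0.7, 550, training_seconds))  # typical input, model decides
print(decide(0.3, 550, training_seconds))  # typical input, model decides
print(decide(0.3, 900, training_seconds))  # unusual input, rule takes over
```

The design choice is that the rule does not try to guess why an input is unusual. It only recognizes that the model has probably never seen anybody similar and declines to let the model judge that case alone.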
How do you get over the problem of people being private about their disability status?
Yeah, in order to test a system for fairness, you need some data. And people with disabilities providing that data is a social good, but it’s a personal risk. People with disabilities are often easily identified even in anonymous data, just because they’re so unique. So how do we mitigate that? We’re still figuring that out.
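The re-identification risk described above can be illustrated by counting how many records share the same combination of attributes, an idea related to k-anonymity. This is a minimal sketch in Python. The column names and rows are invented, and a real privacy assessment would need far more than this check.

```python
from collections import Counter

def unique_records(rows, quasi_identifiers):
    """Return rows whose combination of quasi-identifier values
    appears exactly once in the data set."""
    keys = [tuple(row[q] for q in quasi_identifiers) for row in rows]
    counts = Counter(keys)
    return [row for row, key in zip(rows, keys) if counts[key] == 1]

# Invented "anonymous" records: no names, only coarse attributes.
rows = [
    {"region": "north", "age_band": "30-39", "assistive_tech": "none"},
    {"region": "north", "age_band": "30-39", "assistive_tech": "none"},
    {"region": "north", "age_band": "30-39", "assistive_tech": "screen reader"},
    {"region": "south", "age_band": "40-49", "assistive_tech": "none"},
    {"region": "south", "age_band": "40-49", "assistive_tech": "none"},
]

without_tech = unique_records(rows, ["region", "age_band"])
with_tech = unique_records(rows, ["region", "age_band", "assistive_tech"])

print(len(without_tech), len(with_tech))
```

In this invented data set nobody is unique on region and age band alone. Once the assistive technology column is included, the one screen reader user stands out as the only record with that combination, which is the kind of exposure the answer describes.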
What are your greatest concerns about this problem?
Oftentimes AI systems are optimizing something that is not the wellbeing of the people who are affected by the decisions. That impact needs to have much more prominence in the design process, so that we’re not just introducing a system that looks at how much money we’re saving or how efficiently we’re processing people. We need new ways of measuring systems that incorporate the aspect of impact on the end users, especially if it’s a disadvantaged group.
How would we do that?
Testing for fairness is one way of measuring that impact. Including the disadvantaged group in the design process and hearing their concerns is another. Even explicitly including some metric for stakeholder satisfaction that you could measure through interviews or surveys—that sort of thing.
What are the things that you’re excited about in this area of research?
AI technologies are already changing the world for people with disabilities by providing them with new capabilities, like applications that tell you what’s in your field of view when you point your phone.
I think that if we do it right, there’s a real opportunity for AI systems to improve on previous human-only systems. There’s a lot of discrimination and bias and misunderstanding of people with disabilities in society today. If we can find a way to produce AI systems that eliminate that kind of bias, then we can start to change the treatment of people with disabilities and reduce discrimination.