There’s an easy way to make lending fairer for women. Trouble is, it’s illegal.

Goldman Sachs defended itself in the Apple Card scandal by saying it did not consider gender when calculating creditworthiness. If it did, that could actually mitigate the problem.

Karen Hao

November 15, 2019

Credit scores overlaid on male and female symbols. Selman Design

Earlier this week, New York’s Department of Financial Services launched an investigation into Goldman Sachs for potential credit discrimination by gender. The probe came after web entrepreneur David Heinemeier Hansson tweeted that the Apple Card, which Goldman manages, had given him a credit limit 20 times that extended to his wife, though the two filed joint tax returns and she had the better credit score.

The @AppleCard is such a fucking sexist program. My wife and I filed joint tax returns, live in a community-property state, and have been married for a long time. Yet Apple’s black box algorithm thinks I deserve 20x the credit limit she does. No appeals work.
— DHH (@dhh) November 7, 2019

In response, Goldman posted a statement saying it did not consider gender when determining creditworthiness. The logic was likely meant to be a defense—how can you discriminate against women when you don’t even know someone is a woman? But in fact, failing to account for gender is precisely the problem. Research in algorithmic fairness has previously shown that considering gender actually helps mitigate gender bias. Ironically, though, doing so in the US is illegal.

Now preliminary results from an ongoing study funded by the UN Foundation and the World Bank are once again challenging the fairness of gender-blind credit lending. The study found that creating entirely separate creditworthiness models for men and women granted the majority of women more credit.

So: should the law be updated?

The sexism of being “gender-blind”

If you don’t want to discriminate by gender, why not simply remove gender from the equation? This was the premise of the Equal Credit Opportunity Act (ECOA), enacted in the US in 1974, during a time when women were regularly denied credit. It made it illegal for any creditor to discriminate on the basis of sex or to consider sex when evaluating creditworthiness. (In 1976, it was updated to forbid discrimination by race, national origin, and other characteristics protected by the federal government.)

But in machine learning, gender blindness can be the problem. Even when gender is not specified, it can easily be deduced from other variables that correlate highly with it. As a result, models trained on historical data stripped of gender still amplify past inequities. The same applies to race and other characteristics. This is likely what happened in the Apple Card case: because women were historically granted less credit, the algorithm learned to perpetuate that pattern.
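To make the proxy problem concrete, here is a minimal sketch using made-up applicant records. The feature called `proxy` is hypothetical and stands in for any variable that happens to correlate with gender; the counts are invented for illustration and do not come from the Apple Card case or any study.

```python
from collections import Counter, defaultdict

# Made-up applicant records: (gender, proxy). The proxy stands in for any
# variable that correlates with gender. These counts are illustrative only.
records = (
    [("F", "A")] * 80 + [("F", "B")] * 20 +
    [("M", "A")] * 15 + [("M", "B")] * 85
)

# Count which genders appear for each proxy value.
by_proxy = defaultdict(Counter)
for gender, proxy in records:
    by_proxy[proxy][gender] += 1

# Guess the most common gender for each proxy value.
guess = {proxy: counts.most_common(1)[0][0] for proxy, counts in by_proxy.items()}

# How often does that guess match the true gender?
correct = sum(guess[proxy] == gender for gender, proxy in records)
accuracy = correct / len(records)

print(guess)
print(f"gender recovered from proxy alone: {accuracy:.1%}")
```

In this toy data, the gender column could be deleted and a model would still be able to infer it correctly for most applicants from the remaining feature.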

A 2018 study by a team of computer scientists and economists found that the best way to mitigate these issues was in fact to reintroduce characteristics like gender and race into the model. Doing so gives developers more control to measure and reverse any biases that appear, resulting in more fairness overall.

Gender-differentiated credit lending

The latest study is testing a new hypothesis: would separate models for men and women reduce gender bias further? At an event hosted by the UN Foundation on Tuesday, Sean Higgins, an assistant professor at Northwestern University and a researcher on the study, presented preliminary findings that suggest they would.

In partnership with a commercial bank in the Dominican Republic, the researchers conducted two separate analyses of 20,000 low-income individuals, half of them women. In the first analysis, the researchers used the individuals’ loan repayment histories and gender to train a single machine-learning model for predicting creditworthiness. In the second analysis, the researchers trained a model with only the loan repayment data from the women. They found that 93% of women got more credit in this model than in the one where men and women were mixed together.

This happens, says Higgins, because women and men have different credit histories and different loan repayment behaviors—whether for historical, cultural, or other reasons. Women, for example, are more likely to pay back their loans, he says. But those differences aren’t accounted for in the combined model, which learns to predict creditworthiness on the basis of averages across women and men. Consequently, such models underpredict the likelihood of women repaying their loans and end up granting them less credit than they deserve.
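The averaging effect Higgins describes can be shown with simple arithmetic. The sketch below uses invented loan counts, not data from the Dominican Republic study, and a bare repayment rate stands in for a full machine-learning model. The names `LOANS` and `repayment_rate` are hypothetical.

```python
# Invented counts for borrowers who look identical on every other feature.
# These numbers are assumptions for illustration, not results from the study.
LOANS = {
    "women": {"repaid": 900, "defaulted": 100},
    "men": {"repaid": 800, "defaulted": 200},
}

def repayment_rate(groups):
    """Share of loans repaid across the given groups."""
    repaid = sum(LOANS[g]["repaid"] for g in groups)
    total = sum(LOANS[g]["repaid"] + LOANS[g]["defaulted"] for g in groups)
    return repaid / total

pooled = repayment_rate(["women", "men"])  # one estimate for everyone
women_only = repayment_rate(["women"])     # estimate fit on women alone

print(f"pooled estimate:     {pooled:.2f}")
print(f"women-only estimate: {women_only:.2f}")
print(f"pooled estimate underpredicts women by {women_only - pooled:.2f}")
```

With these assumed counts, the pooled estimate sits between the two groups' actual rates, so it is too low for the group that repays more often. A lender that sets credit limits from the pooled figure would extend that group less credit than its repayment record supports.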

While Higgins and his collaborators tested this hypothesis specifically for low-income women in the Dominican Republic, the qualitative results should hold true regardless of context. They should also apply to characteristics other than gender, and in domains other than finance.

What to do with the law

The problem is, this kind of single-gender model is illegal. The question is whether policymakers should therefore update the ECOA.

Higgins is in favor. “The recent research on algorithmic fairness has reached a fairly clear conclusion that we should be using things like race and gender in the algorithms,” he says. “If the banks don’t have access to those variables and can’t even build in the safety checks to make sure that their algorithms aren’t biased, the only way that we find out about these biases is when people tweet about disparities that they’re encountering in the wild.”

But Andrew Selbst, an assistant law professor at UCLA who specializes in the intersection of AI and law, cautions against moving too quickly. “Rewriting the law that way opens up avenues for bad actors to start including race variables and gender variables and discriminating wildly in a way that’s very hard to police,” he says. He also worries that this solution wouldn’t account for nonbinary or trans people and could unintentionally cause them harm.
