How to Upgrade Judges with Machine Learning

Software that helps judges decide whether to jail a defendant while they await trial could cut crime and reduce racial disparities amongst prisoners.

Tom Simonite

March 6, 2017

Jon Han

When should a criminal defendant be required to await trial in jail rather than at home? Software could significantly improve judges’ ability to make that call—reducing crime or the number of people stuck waiting in jail.

In a new study from the National Bureau of Economic Research, economists and computer scientists trained an algorithm to predict, from a defendant’s rap sheet and court records, whether that defendant was a flight risk. The training data came from hundreds of thousands of cases in New York City. When tested on more than a hundred thousand additional cases that it hadn’t seen before, the algorithm proved better than judges at predicting what defendants would do after release.

Jon Kleinberg, a computer science professor at Cornell involved in the research, says one goal of the project was to show policymakers the potential benefits to society of using machine learning in the criminal justice system. “This shows how machine learning can help even in contexts where there’s considerable human expertise being brought to bear,” says Kleinberg, who worked on the project with researchers from Stanford, Harvard, and the University of Chicago.

They estimate that for New York City, their algorithm’s advice could cut crime by defendants awaiting trial by as much as 25 percent without changing the number of people waiting in jail. Alternatively, it could be used to reduce the jail population awaiting trial by more than 40 percent, while leaving the crime rate by defendants unchanged. Repeating the experiment on data from 40 large urban counties across the U.S. yielded similar results.
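The first of those two trade-offs can be illustrated with a toy calculation. The sketch below is not the study’s method: it assumes each defendant already has a predicted risk, treats that number as the expected amount of pretrial crime if the person is released, and uses hypothetical function names and made-up figures. It holds the number of people detained fixed at whatever the judges chose, but detains the highest-risk defendants instead.

```python
from typing import List


def expected_crime(risks: List[float], released: List[bool]) -> float:
    """Sum the predicted risk over everyone who is released."""
    return sum(r for r, rel in zip(risks, released) if rel)


def rerank_same_jail_count(risks: List[float],
                           judge_released: List[bool]) -> List[bool]:
    """Detain as many people as the judges did, but pick the riskiest.

    Returns a new list of release decisions (True means released).
    """
    n_detained = judge_released.count(False)
    # Indices sorted from highest to lowest predicted risk.
    by_risk = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)
    detained = set(by_risk[:n_detained])
    return [i not in detained for i in range(len(risks))]


# Made-up example: four defendants, two of whom the judges detained.
risks = [0.9, 0.1, 0.5, 0.2]
judge_released = [True, False, True, False]
algo_released = rerank_same_jail_count(risks, judge_released)

# Same jail population, lower expected crime among those released.
print(expected_crime(risks, judge_released))
print(expected_crime(risks, algo_released))
```

A real evaluation is harder than this, because outcomes are only observed for defendants who were actually released; the sketch simply takes the predicted risks as given.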

As a bonus, gains like those were possible while simultaneously shifting the jail population to include a smaller proportion of African-Americans and Hispanics.

The algorithm assigns defendants a risk score based on data pulled from records for their current case and their rap sheet: for example, the offense they are suspected of, when and where they were arrested, and the number and type of prior convictions. (The only demographic data it uses is age, not race.)
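To make the idea of a risk score concrete, here is a minimal sketch that turns a handful of case features into a number between 0 and 1. Everything in it is an assumption for illustration: the feature names, the logistic formula, and the weights are invented, and the article does not say what kind of model the researchers actually used.

```python
import math
from dataclasses import dataclass


@dataclass
class Defendant:
    age: int
    prior_convictions: int
    prior_failures_to_appear: int
    felony_charge: bool


# Hypothetical weights, chosen by hand for illustration only.
# A real system would learn these from historical case data.
WEIGHTS = {
    "intercept": -1.0,
    "age": -0.03,
    "prior_convictions": 0.15,
    "prior_failures_to_appear": 0.5,
    "felony_charge": 0.4,
}


def risk_score(d: Defendant) -> float:
    """Return a score in (0, 1); higher means higher predicted risk."""
    z = (
        WEIGHTS["intercept"]
        + WEIGHTS["age"] * d.age
        + WEIGHTS["prior_convictions"] * d.prior_convictions
        + WEIGHTS["prior_failures_to_appear"] * d.prior_failures_to_appear
        + WEIGHTS["felony_charge"] * (1 if d.felony_charge else 0)
    )
    # Logistic function squashes the weighted sum into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


older_no_record = Defendant(age=45, prior_convictions=0,
                            prior_failures_to_appear=0, felony_charge=False)
younger_with_record = Defendant(age=22, prior_convictions=4,
                                prior_failures_to_appear=2, felony_charge=True)

# With these made-up weights, the second defendant scores higher.
print(risk_score(older_no_record) < risk_score(younger_with_record))
```

Note that race does not appear among the inputs, mirroring the article’s description; age is the only demographic field.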

Kleinberg suggests that algorithms could be deployed to help judges without major disruption to the way they currently work, in the form of a warning system that flags decisions highly likely to be wrong. Analysis of judges’ performance suggested they occasionally release people who are very likely to fail to appear in court or to commit a crime while awaiting trial. An algorithm could catch many of those cases, says Kleinberg.
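A warning system of that kind could be as simple as a filter over decisions that already have a risk score attached. The sketch below assumes a hypothetical `Decision` record and an arbitrary threshold of 0.8; neither comes from the study, and where to set the threshold would be a policy choice.

```python
from typing import List, NamedTuple


class Decision(NamedTuple):
    case_id: str
    risk: float      # predicted probability of a no-show or re-arrest
    released: bool   # the judge's decision


def flag_risky_releases(decisions: List[Decision],
                        threshold: float = 0.8) -> List[str]:
    """Return case IDs where a judge released a very high-risk defendant.

    Detentions are never flagged here; the sketch only covers the kind of
    error described above (releasing someone very likely to fail).
    """
    return [d.case_id for d in decisions
            if d.released and d.risk >= threshold]


# Made-up docket for illustration.
docket = [
    Decision("A-101", risk=0.92, released=True),   # flagged
    Decision("A-102", risk=0.95, released=False),  # detained, not flagged
    Decision("A-103", risk=0.30, released=True),   # low risk, not flagged
]
print(flag_risky_releases(docket))
```

The judge’s decision stays in place by default; the flag only prompts a second look.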

Richard Berk, a professor of criminology at the University of Pennsylvania, describes the study as “very good work,” and an example of a recent acceleration of interest in applying machine learning to improve criminal justice decisions. The idea has been explored for 20 years, but machine learning has become more powerful, and data to train it more available.

Berk recently tested a system with the Pennsylvania State Parole Board that advises on the risk a person will reoffend, and found evidence it reduced crime. The NBER study is important because it looks at how machine learning can be used pre-sentencing, an area that hasn’t been thoroughly explored, he says.

However, Berk says that more research is needed into how to ensure that criminal justice algorithms don’t lead to unfair outcomes. Last year an investigation by ProPublica found that commercial software developed to help determine which convicts should receive probation was more likely to incorrectly label black people than white people as “high risk.”

Jens Ludwig, director of the University of Chicago Crime Lab, who worked on the new NBER study, says it demonstrates how unfair outcomes are far from inevitable, by showing that a judge-advising algorithm could reduce crime as well as the rate at which blacks and Hispanics are jailed. “These tools can actually improve fairness relative to the status quo,” he says.
