The news: A new study has found that risk-assessment algorithms are sometimes better than people at predicting whether a criminal will be rearrested within two years of leaving jail. But neither is very good.
The research: Researchers at Stanford University and the University of California, Berkeley tried to recreate a 2018 experiment that found humans with no training were just as good at predicting rearrest as a widely used risk-assessment software program called COMPAS. Risk-assessment algorithms are trained on historical defendant data and are supposed to help judges determine whether a defendant should be kept in jail or allowed out while awaiting trial. (We went into some of the problems with them in this interactive piece last year.)
Test the humans: The team used a dataset of COMPAS risk assessments covering about 7,000 real defendants to create a profile for each one. These profiles were shown to 400 laypeople recruited through Amazon Mechanical Turk, who were asked to decide whether they thought the person would commit another crime. The 2018 study found COMPAS was accurate about 65% of the time, and humans were closer to 67%. This new study managed to closely replicate these results.
New tweaks: But this time the original experiment was changed and extended. For example, the team tested whether revealing more information about defendants, giving or withholding feedback after each round, and looking just at violent crimes made a difference. The results showed that if the humans didn't get feedback on their prediction accuracy, or if they were given lots of extra information about each defendant, the algorithm was more accurate each time. The authors note that in real life, humans rarely get immediate feedback on their decisions regarding defendants, so this might be a more realistic comparison. The research is described in a paper in Science Advances.
Bias abounds: One thing seems to be clear: inaccuracy and bias creep into predictions regardless of whether they’re being made by humans or algorithms. The difference is one of accountability. Whereas people can appeal against judges’ decisions, it’s much harder to contest decisions made by algorithms, which are increasingly being used in areas of official decision-making beyond justice, such as housing and schooling.
The solution?: That accountability gap is part of the reason so much work is underway to make algorithms explainable, although explainability isn’t necessarily the panacea people think it is, and it can actually make things worse by leading us to over-trust statistical models. Campaigners in Europe have managed to rein in “government by unaccountable algorithm” in a few high-profile cases, but without data privacy laws, it’s even harder to push back against these sorts of programs in the US.
Read next: Can you make AI fairer than a judge? Play our courtroom algorithm game.