How AI is changing gymnastics judging

Proponents say the AI-powered Judging Support System will promote fairness and transparency in the sport. But can it deliver?

Ian Babylon

Jessica Taylor Price

January 16, 2024

There was one individual Olympic spot left. According to the intricate set of rules governing who gets slots for the games, it would come down to who placed highest in the high bar final: Croatia’s Tin Srbić or Brazil’s Arthur Nory Mariano.

They were at the 2023 World Championships in Antwerp, Belgium, last October. Mariano went first. He fell during his routine, giving Srbić some wiggle room. He didn’t need it, though: Srbić completed a clean routine, with Tkachev connections and a double-twisting double layout that he stuck cold; at the end of his routine, he pumped his fists in the air in celebration. He’d qualified for the 2024 Paris Olympics.

But when his score came in—a 14.500—Srbić thought the judges had made a mistake, one that could cost him a medal at Worlds. He needed to decide if he wanted to make a challenge.

“When you make an inquiry, you can never be sure about what the judges did in the first place, so you have to make your best guess,” says David Kikuchi, an elite gymnastics coach from Canada. “There is also the risk of your score going down after a review.”

Srbić took the gamble—even though there was a new element of risk in this case. As it turns out, it wasn’t the typical judge that would decide whether he’d in fact landed all his maneuvers. It was AI.

Srbić’s routine, like all routines at the competition, had been captured by a handful of high-definition cameras, which together had built a three-dimensional image of his body as it moved. The footage had then been fed into AI software that was able to analyze each angle and movement to a specificity beyond the capabilities of the human eye.

These championships were the first time the technology, formally known as the Judging Support System, or JSS, had been used on every apparatus in a gymnastics competition—and its first use in a competition that could make or break an athlete’s Olympic dreams. While the AI judging system did not replace human judges—rather, it was available to help judges review routines in case of an inquiry or a “blocked score”—it still marked a watershed moment for the sport that was years in the making. The International Gymnastics Federation (known as FIG, its initials in French) first used JSS to judge pommel horse, rings, and vault back at the 2019 World Championships before adding more events at different competitions each year.

There are obvious upsides to using this kind of technology in competition. Human gymnastics judges must have an eye for quick, tiny movements: the point of a toe, the angle of a split (did she hit 180?), the slightest bend at the hip. AI could help take the guesswork out of the technicalities. And even as AI has been shown to perpetuate bias in other fields, from housing to hiring, JSS proponents believe it can eliminate biases in this case, making the sport both more fair and more transparent for viewers and for the gymnasts themselves.

“There are a lot of 50/50 decisions that judges need to make during competitions,” says FIG’s Steve Butcher, who previously served as sport director for the organization and is now the president of the element-recognition working group for Fujitsu, the company that developed the AI. “No one wants to make the wrong call in the Olympic Games or World Championships or any competition where something major would be on the line.”

At the same time, others fear AI judging will take away something that makes gymnastics special. Gymnastics is a subjective sport, like diving or dressage, where factors like nationality, body type, the location of a judge’s chair, and vague concepts like “artistry” and “performance” affect scores. Technology could eliminate the judges’ role in crafting a narrative. Consider that Nadia Comaneci’s first “perfect” 10 at the 1976 Olympics wasn’t perfect; she shuffled her feet on her landing. But the routine went down in gymnastics history thanks to creative judging, a reward for a certain je ne sais quoi Comaneci brought to the mat.

a multiple exposure image of Nadia Comaneci on the balance beam, next to an image of her with both arms raised. — In Nadia Comaneci's famed 1976 Olympic performance, the scoreboard showed a score of 1.0 because it was only made to display two digits instead of the three needed to show her 10.0 final score.

“You have to have a little bit of subjectivity in the sport,” says Kim Tanskanen, an elite gymnastics coach from Finland. “To take that all away, for me, takes away the fun of the sport and the excitement of it.”

For better or worse, AI has officially infiltrated the world of gymnastics. The question now is whether it really makes the sport fairer.

“The time to bring innovation to gymnastics has come”

The Judging Support System started with a joke.

Back in 2015, Morinari Watanabe was leading the Japanese Gymnastics Association and was on the verge of being elected the FIG’s ninth president and the first from Asia. He was having a conversation with Hidenori Fujiwara, the head of the sports business development division at Tokyo-based technology company Fujitsu, and Watanabe quipped that pretty soon, robots would be judging gymnastics competitions.

But Fujiwara took it as an assignment. “So we started the project,” he says. “We developed a prototype system and showed Watanabe.”

Watanabe was surprised—but also impressed. He soon became a proponent of an AI judging system, saying in his October 2016 speech accepting the leadership of the FIG, “The time to bring innovation to gymnastics has come.”

For more than a decade, the sport’s judges had used video review to address scoring inquiries. But there was still a need for a system that could catch errors the human eye could not. Human judges can sometimes miss the tiny measurements that can make or break a score—if a split is one or two degrees short of the minimum required, or if a dismount is off axis by just three or four degrees. In the wake of sanctions disciplining judges for scoring irregularities around the same time Watanabe became president, he pledged to “add Japanese technology so I can improve the fairness and justice of gymnastics.”

It became official in 2017, when the FIG formally announced its collaboration with Fujitsu. Developing the system took thousands of hours of research and work. “We started the project even though we didn’t have the technology to realize it,” says Fujiwara, who is now the JSS project manager at Fujitsu.

Initially deploying the same laser sensors that are used to operate autonomous vehicles, the team started gathering three-dimensional skeletal data of gymnasts in competitions. Then, drawing from video footage of 8,000 routines, the AI model was trained on the entire Code of Points, the definitive guide to every element, or skill, a gymnast might perform. The system had to be taught the difference between an element and an interval between elements, as well as how much, or how little, movement constitutes “stopping.” It was taught what kinds of variation in a skill (like a split leap at less than 135 degrees) call for which deduction.

Today’s JSS no longer relies on lasers but uses four to eight high-definition cameras positioned at each apparatus to capture a three-dimensional view of a gymnast’s performance. It analyzes the positions of the joints and compares them with the standards for each element in the Code of Points, almost in real time.
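To make the idea of comparing joint positions against a standard concrete, here is a minimal sketch in Python of one such check: measuring a split angle from three 3D points and reporting how far it falls short of a required angle. It is an illustration only, not Fujitsu's method. The function names, the choice of hip and ankle as the joints, the sample coordinates, and the 180-degree default are all assumptions made for this example.

```python
import math


def split_angle(hip, front_ankle, back_ankle):
    """Return the angle in degrees between the two legs, measured at the hip.

    Each argument is a hypothetical (x, y, z) joint position.
    """
    front = [a - h for a, h in zip(front_ankle, hip)]
    back = [a - h for a, h in zip(back_ankle, hip)]
    dot = sum(f * b for f, b in zip(front, back))
    norm = math.hypot(*front) * math.hypot(*back)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def shortfall(angle, required=180.0):
    """Return how many degrees the measured angle falls short of the requirement."""
    return max(0.0, required - angle)


# Made-up coordinates: the back leg is raised slightly less than the front.
angle = split_angle((0.0, 0.0, 1.0), (0.8, 0.0, 1.0), (-0.8, 0.0, 0.97))
print(f"split: {angle:.1f} degrees, short by {shortfall(angle):.1f}")
```

A real system would have to repeat a check like this for many joints on every frame, and then map the measurements to the deductions in the Code of Points, which this sketch does not attempt.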

Barring any physical obstructions—like a coach blocking the camera’s view of the athlete, for instance—JSS can judge single skills and entire routines, just as human judges do. It can recognize about 2,000 elements, according to Fujitsu spokesperson Hidetoshi Tomisaka, with about 90% accuracy when compared to a human making the same identification.

But while the use of the technology has expanded since its initial deployment at the 2019 World Championships, it’s still limited. It’s up to the Superior Jury—a panel of judges who supervise competitions and settle disputes—to decide when JSS will be used. Even then, it only comes into play to resolve inquiries, in which athletes challenge their difficulty score, or when there is a significant disagreement between a judge and a supervisor, which is known as a blocked score. JSS is not currently used to make determinations about artistry, and it still isn’t considered ready to use on certain complex routine components, like beam connections and dance elements on floor; the tech just isn’t there yet, says Johanna Gratt, a member of the FIG technical committee and liaison between the FIG and Fujitsu.

Promises of fair play

The FIG promised in a 2021 news release that JSS would be “one of the greatest technical advancements in the sport in decades,” in no small part because it can do things that are beyond human capabilities.

Initially, some judges worried that the system would replace them, says Butcher, who has been closely involved in the deployment of JSS. But, he explains, “I think we’ve proven over time that this was supposed to be an aid for the judges.”

Sunisa Lee in the air with her front leg pointed and straight, her chest arched and head thrown back with her rear leg pointed and bent slightly upward — Sunisa Lee performs a switch ring leap during the final day of the 2019 US Gymnastics Championships.

Consider the switch ring leap, an iconic skill on floor and beam in which a gymnast leaps into a split position with the back leg bent and the head thrown back. It’s notorious for being downgraded in the difficulty score. That’s because judges are especially strict with it—according to the Code of Points, for the move to get full credit, the upper back must be in an arch and the head released. The legs must reach a 180-degree split. The front leg must be horizontal and the back leg bent, with the back foot reaching the crown of the head or higher. All this happens, and is judged, in under a second. Human errors are inevitable.

At the 2023 World Championships, JSS was able to correct just these kinds of errors. Australia’s Clay Mason Stephens filed an inquiry into his pommel horse score, and after it was reviewed using JSS, the score was raised by more than three points. Not all the elements of his routine had been counted by the human judges, which had resulted in a “short exercise” deduction.

JSS supporters also hope AI in gymnastics will have the power to eliminate the inevitable biases that humans bring to the judging booth. As it currently stands, judging is not particularly transparent; coaches and gymnasts agree it’s difficult to tell what goes on behind the scenes in scoring or inquiries.

This is further complicated by the fact that factors like nationality and body type can add a conscious or unconscious bias that influences scores. At the 2023 World Championships, for instance, gymnast Kaia Tanskanen knew that as a member of the Finnish team, she was at a disadvantage—what fans call a “leotard bias.” While judges may deny it, fans sometimes perceive “built-in deductions” for countries that don’t have the most competitive and elite programs—countries, in other words, that aren’t the US or Russia.

Kaia Tanskanen in her blue leotard competes on uneven bars during the women's qualifications of the 2023 World Championships in Antwerp, Belgium.

That’s one reason Kaia (whose coach is Kim Tanskanen, her mother) is hopeful about how JSS could change competition: “I feel like the scoring would be more even,” she says.

“Especially the smaller countries that compete internationally—I think the judges just have this assumption of what’s going to happen before they even start the routine, and they kind of judge based off that,” says Emma Spence, an elite Canadian gymnast who competed at the 2022 World Championships. “If we can eliminate that, I think it will make it a little more of a fair chance for everybody.”

While Butcher insists that judges “hopefully are leaving their biases behind them,” he too believes the JSS could help eliminate these factors and do more to create an even playing field.

Yet a lack of transparency around how and when JSS is used in competition may undermine this ideal. Score sheets at FIG events don’t currently include inquiries, so there are no recorded details about how routines were reviewed in competition, including whether JSS was used. Score sheets don’t include itemized deductions, either. In order to determine when JSS was used at the 2023 World Championships, I had to contact individual judges who are high up in the FIG; even they couldn’t tell me exactly how many times the JSS was used. This information simply isn’t recorded.

I was only able to confirm it was used in the case of Srbić after connecting with the men’s technical president; Srbić said via email that even he didn’t know if JSS was used to decide his inquiry.

Butcher told me that following the 2023 World Championships, athletes should have been sent a link to a website to see how their routines were judged by JSS, to help them make improvements. But when I contacted Kaia and Kim Tanskanen after the competition, they said Kaia hadn’t received any information about AI judging either during or after the competition. (Butcher says this is likely a communication issue with the Finnish federation, though Satu Murtonen, the technical director of Finland’s Women’s Artistic Gymnastics, tells me, “Unfortunately, I don’t remember receiving any information about the robot judging.”)

When asked more broadly about transparency, Butcher points out that a lack of information about scoring isn’t dissimilar from the situation in other sports in which “athletes and coaches do not get specific information regarding the deliberations” of judges or referees. He also says the JSS project “will continue to evolve in offering greater fairness and transparency.”

Looking ahead, Fujitsu is focused on commercializing the technology so that it can be sold to gymnastics federations to use in practice. “Training is really where we need this,” Butcher says. “We need the federations to be able to purchase the Fujitsu system … and through that use, the gymnasts improve.”

Kim Tanskanen worries this may deepen the divide between rich programs and poorer ones that might not be able to afford the technology, again threatening the notion of an even playing field.

Butcher admits this is simply the reality: “Well-funded national federations always have an advantage in every sport. It is an unfortunate disparity with few solutions.”

The je ne sais quoi

As Kaia Tanskanen moved across the floor during her qualifications routine in Antwerp, she danced to the beat of her music and, at times, acted out the sound effects: a dramatic breath in, the tapping of a cowbell. After her final tumbling pass, a double tuck, she looked directly at the judges with a grin.

She was bringing something to her gymnastics that she wasn’t able to on vault or bars: self-expression. Her routine on floor is a true performance, and one where she strives to form a relationship with the judges, she says, by making eye contact to “pull them in with my expressions.”

It is this part of the sport that, before the 2023 World Championships began, Kim Tanskanen worried could be lost with the emergence of AI judging.

“You can’t look at a robot’s face and have it look back at you,” she says.

The performance component of a gymnast’s routine is something that has, more recently, been encouraged as a way to preserve the “artistic” side of the sport, even as the skills get ever more difficult.

“Artistry is an important part of our discipline, and we want to see both,” Gratt said in an email. “The power/energy of the difficulties on one hand, but also the grace, femininity, and elegance shown through the choreography.”

While the emphasis on artistry makes the sport more fun to watch, it can also make scoring more subjective. How exactly does one judge what the Code of Points calls “confidence of performance”?

“That’s probably the most difficult task for this Fujitsu system to be able to judge,” Butcher says. “It’s a feeling more than something that’s identifiable, like bent knees and flexed feet.” Fujitsu itself echoes this sentiment: “The notion that AI could be used to judge what is considered beautiful to people in place of humans … there are some parts for which this would be technically feasible and some where it simply isn’t,” says Fujitsu’s Tomisaka.

Though JSS still isn’t used to judge artistry today, Butcher admits he can’t say how far the system will go in the future: “The long-term goal is still being discussed,” he says. But he will say that, at least as of now, it is still considered supplemental to what the human judges are doing.

Gratt agrees. “Technology is always good if we can use it as an additional support to be better, to have more accuracy, to have feedback if you are in doubt,” she says. “But I think technology alone is also not working, because I think it should be a combination of both [AI and human judging] that makes the sport more fair.”

Tin Srbić pumping his arms, which has produced a cloud of chalk dust in the air in front of him — Tin Srbić cheers after his routine on high bar at the 2023 World Championships.

All these questions will come to a head soon enough. The Paris Olympics are just six months away, though the FIG declined to comment on whether JSS will be used then. The body in charge of gymnastics scoring at the Olympics, OMEGA, also declined to comment.

Will AI ever be able to point out a Nadia Comaneci? Nearly fifty years ago, something about her caught the judges’ eyes and put her on the top of the podium. It was something you can’t train a gymnast to do, and something you can’t train a person to recognize. Perhaps one day, an algorithm could tell us what that “something” was.

But in the meantime, it has a different role to play. When Srbić filed his inquiry, his score didn’t go down. After a review using JSS, his score jumped by 0.2 points, enough to get him a silver medal at Worlds and that all-important place at the upcoming Olympics.

Jessica Taylor Price is a freelance journalist originally from Chicago. Her work has appeared in Bleacher Report, Teen Vogue, and National Geographic. She lives in Durham, UK, with her husband.
