
# Stealing Baseball

Mathematicians break baseball down to its simplest component – cold, hard stats – in hopes of finding the best players.
November 16, 2005

Baseball is near and dear to my heart. I love the game. My first memory is of holding a bat (which explains why I carry a bat around the TR offices). As I grew older, I grew to love the statistical analysis inherent in the history of the game. I can go back through 100-plus years of games, seasons, and careers – comparing statistics from modern players with all-time greats. I can, I’d imagine, quantifiably prove that there was such a thing as the Dead Ball Era (there was) and the Steroid Era (there is).

Despite my love of statistics, I have never been one to rest my heart and soul on simple math when it comes to choosing great players. Numbers never tell a complete story. Life is too complex for 2+2 to always equal 4. So my heart skipped a beat yesterday when I read a USA Today article about a husband-and-wife team that developed a computer simulation that predicted the winners of Major League Baseball’s Cy Young Awards, arguably the most coveted pitching honor in the game. The system crunched a variety of numbers, giving weight to each one, and then associated a final number with each pitcher who played last year.
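For the curious, the idea of weighting several stats and boiling them down to one number per pitcher can be sketched in a few lines of Python. This is only an illustration of a generic weighted sum: the article does not say which statistics the couple's model used or how it weighted them, so the stat names, the weights, and the two pitchers below are all made up.

```python
from typing import Dict, List

# Hypothetical weights (assumption): the real model's inputs and weights
# are not given in the article. A negative weight means lower is better.
WEIGHTS: Dict[str, float] = {
    "wins": 1.0,
    "strikeouts": 0.05,
    "era": -5.0,
}


def score(stats: Dict[str, float]) -> float:
    """Collapse a pitcher's stat line into one number via a weighted sum."""
    return sum(weight * stats[name] for name, weight in WEIGHTS.items())


def rank(pitchers: Dict[str, Dict[str, float]]) -> List[str]:
    """Return pitcher names ordered from highest score to lowest."""
    return sorted(pitchers, key=lambda name: score(pitchers[name]), reverse=True)


# Invented stat lines for two fictional pitchers, purely for illustration.
pitchers = {
    "Pitcher A": {"wins": 21, "strikeouts": 200, "era": 2.5},
    "Pitcher B": {"wins": 16, "strikeouts": 160, "era": 3.5},
}

for name in rank(pitchers):
    print(name, round(score(pitchers[name]), 2))
```

Whoever ends up with the highest number is the pick. Everything interesting about a real model lives in the choice of stats and weights, which is exactly the part this sketch has to invent.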

No way could this application be right. For a game built on subtlety, it struck me as odd that a silly computer, relying on cold, hard numbers, could tell a complete story. Formulas and numbers only get you so far. Eventually, you need to make a leap of faith that can’t be explained in 2+2=4. And yet, 2+2 always equals 4. And it’s hard to get away from that.

But there is something about cold, hard numbers that invites distrust, a fact the mathematicians faced as they prepared to announce their decision:

> But being human, and perhaps more importantly, baseball fans, the mathematicians made their own mistake the week before the award announcements. Overriding the model, they instead predicted that the New York Yankees’ Mariano Rivera would win the American League Cy Young. They argued that voters would see it as “lifetime achievement” award in a year of weak American League contenders for the prize.

They were, of course, completely wrong. The computer model correctly picked Chris Carpenter and Bartolo Colon, the two recipients of the award this year.
