September 23, 2016
That's The Way I Like It And I'll Never Get Bored
Baseball Prospectus’ playoff odds have traditionally been based on 50,000 simulations of the season, but this year we wanted to say we did one million, so we did one million. Out of those one million simulations, we pretty much had the whole world of possibilities covered: We had a simulation where the A’s won 107 games, and one where the Cubs won 58. We had a simulation where the Phillies, Reds, Rockies, Padres, and Nationals are repping the NL in the playoffs. We had simulations where your favorite team won 100 games, where they lost 100 games, where they won the division by one, where they lost the division by one, where they’re playing a Game 163 to break a tie, where the manager got fired in mid-May, where the manager is the Manager of the Year, where they got the first pick in the draft and where they got to ride through your favorite team’s city’s downtown wearing t-shirts that refer to whatever obnoxious inside-meme carried them through a magical October run. We had seasons where that obnoxious meme was the nonsensical slogan Call The Cows Home, and where that obnoxious meme was a viral Vine of a breakdancing rabbit, and where that obnoxious meme was the team’s shared affection for The Great British Bake Off, and where that obnoxious meme was a fat suit that the catcher would wear during post-game interviews. We had seasons where they didn’t wear meme-displaying t-shirts in the parade, but meme-displaying coveralls. We had seasons where the parade was interrupted by a plague of locusts, and seasons where the parade was interrupted by a plague of cicadas. We had a lot of seasons.
What we didn’t have were any seasons that actually happened.
“So,” Bryan Cole asked at the time, “are you guys going to come back in October and tell us which of those million simulations were closest?”
We are, Bryan.
We went through those million sims this week looking for two things: Which season got the standings closest, and which season got the actual records closest, using RMSE. For the uninitiated, RMSE is root-mean-square error, and basically involves taking the difference between predicted value and actual value, squaring it, averaging all those squares, and then taking the square root of the average. It’s a way of saying, basically, that the average projection was off by X. I think.
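For anyone who wants to see that arithmetic spelled out, here is a minimal Python sketch of RMSE and of picking the closest sim from a pile of them. It is not the code BP actually ran. The function names (`rmse`, `closest_sim`), the sim labels, and the three-team win totals are all made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of win totals."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    # Difference between predicted and actual, squared, for every team.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    # Average the squares, then take the square root of that average.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def closest_sim(sims, actual):
    """Return (sim_id, error) for the simulation with the smallest RMSE."""
    scored = ((sim_id, rmse(wins, actual)) for sim_id, wins in sims.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical three-team league; these numbers are not real sim output.
actual_wins = [93, 81, 70]
toy_sims = {
    "sim_a": [90, 85, 70],   # misses by 3, 4, and 0 wins
    "sim_b": [100, 75, 62],  # misses by 7, 6, and 8 wins
}

best_id, best_error = closest_sim(toy_sims, actual_wins)
print(best_id, round(best_error, 2))  # sim_a 2.89
```

Squaring before averaging means one big miss hurts more than several small ones, which is why RMSE usually comes out a bit larger than the plain average miss.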
The first measure—the standings—is a test, and that test is: If you were given a million tries, could you guess what the standings are going to look like in a given year. There are 120 ways for a division to order itself, which means there are about three trillion (that’s billion, with a tr) ways for the whole league to. But we’re semi-educated here, so we can mostly rule out the uncredible seasons where the NL Central goes CINMILPITSLNCHN. I’d have bet we got one exactly right, but… nope.
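The three-trillion figure is easy to check. A quick sketch, assuming six divisions of five teams each and ignoring ties (the constant names are mine, not anything from the playoff odds code):

```python
import math

TEAMS_PER_DIVISION = 5  # assumption: five teams in every division
DIVISIONS = 6           # assumption: three divisions in each of two leagues

# Ways to order one division from first to last, with no ties: 5! = 120.
orderings_per_division = math.factorial(TEAMS_PER_DIVISION)

# Each division orders itself independently, so the counts multiply.
league_orderings = orderings_per_division ** DIVISIONS

print(orderings_per_division)    # 120
print(f"{league_orderings:,}")   # 2,985,984,000,000
```

That is just under three trillion possible sets of standings, against a million tries.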
The best we got is Sim #601747, with the errors in italics:
(This is through Monday’s action, by the way. We’re stuck, for reasons of timing, using incomplete information, but the orientation of these results is pretty secure.)
This was just about the best year possible to be predicting a season, by the way. All three divisions in the NL were so heavily imbalanced between good and bad that you could pretty much narrow an entire half of baseball to about 100 possible scenarios. When our staff did its preseason predictions, there were loads of NL forecasts that were exactly the same. Better, still, for the predictor: Practically nothing surprised us in the NL this year! The split went off almost exactly as we all imagined it, and the correlation between our staff’s NL projections and the actual records is something like .85. Incredible.
Yet even in this year, the year baseball made it easy and went how it was supposed to, there were still too many ways to be wrong.
The second measure—the records—is an experiment, and that experiment is: If you have close to God knowledge, and you know more or less how the season is going to go exactly, how much is still unknowable? I’m not saying that one million simulations are exactly the same as omniscience, but the smartest of those simulations is getting pretty close to as smart as anybody could possibly be.
Sim #99094 provided the closest season by records, and it still had an RMSE of 4.7 wins. (Note that we’re using PECOTA’s projected final records, as seen on the current Playoff Odds, as our “Actual” here.) Look:
When I was in high school I worked at a two-screen movie theater. The computer in the office had a screensaver that was a little ricocheting ball, like in the game Pong but smaller, crisscrossing the computer screen. Whatever path it covered, it would paint over; once it had painted over the entire screen, we figured it would reset. One day we decided to bet on how long it would take to reset the screen; we’d never seen it happen, because we would always jostle the mouse before the ball completed its work.
I think I had the highest estimate, at something like 80 minutes. Within five minutes, a third of the screen had been painted over. Within 15 minutes, more than half had, and I was conceding that I was going to lose this bet. That’s how I imagine the simulations do. Within one simulation, you probably capture about half the baseball intelligence that you need to understand a season; you’ve got all 30 teams, for instance! There are no Cleveland Spiders or Sonoma Stompers in that league, nosiree. The basic shape of the standings probably points to something real.
Within 10 simulations, you’ve probably got at least one simulation that has 60 or 70 percent of the season cracked. Within 500 simulations, your best simulation reaches, say, 92 percent, and within 10,000 your best sim is at 98 percent, and within a million you’re at 99 percent. That best sim gets baseball. It gets the season. It knows a lot. And it still misses a lot, and that last little bit, those last little pixels, get really, really, really hard to hit. Our screensaver didn’t finish the job in two days. We let it run overnight, doing the receipts by hand, and we finally gave up.
I wrote my first piece at BP in 2011, and this is my last, as I'll be starting a new gig writing about baseball for ESPN. I was terrified to write that first BP piece, and the first few dozen, because I felt like BP writers were supposed to know baseball all the way to the last pixel. I knew the players and the teams, I knew the stats, and I knew how it felt to watch baseball, but I also had a gnawing worry that if I couldn’t predict how Mike Fiers’ next start was going to go I had no business writing about him. It was intimidating to write alongside writers, and for readers, who seemed so much smarter than me.
But writing those first articles, and getting to know the smarter-than-me writers, and getting the feedback from the smarter-than-me readers, the fear faded away. Nobody gets the final pixels, and none of us at BP expects to. We go into a season knowing that anything can happen, and even when we’re as right as we can plausibly be—even in our most accurate simulation—there are little miracles of the unexpected. Sim #99094 only missed the Yankees by two games, but those two games were Gary Sanchez! It only missed the Dodgers by two, but those two included Rich Hill! It’s in that space, where baseball is simultaneously projectable and unimaginable, that Baseball Prospectus thrives. I couldn’t be prouder of the writers I got to edit over the past four years, who absolutely owned that space.
This is where I would thank them all—I love this staff—but there are too many, and too many other people to thank, so I’ll do those in private. But I do want to single out Rob McQuown, who is mostly behind the scenes here but far more important to producing the site, the annual, everything BP does than I ever was. Sometimes we remember to put a little note at the bottom of our articles that says “Thanks to Rob McQuown for research assistance.” That’s secret BP code for “I wouldn’t be able to get out of bed in the morning if Rob didn’t exist.” He’s the nicest, smartest, hardest-working guy, too. So thanks, Rob, for research assistance.
Aaron Gleeman will take over as Editor in Chief on Monday. He’s going to be so good at this job. Some things really actually are easy to predict, and that’s one of them.