Baseball Prospectus’ playoff odds have traditionally been based on 50,000 simulations of the season, but this year we wanted to say we did one million, so we did one million. Out of those one million simulations, we pretty much had the whole world of possibilities covered: We had a simulation where the A’s won 107 games, and one where the Cubs won 58. We had a simulation where the Phillies, Reds, Rockies, Padres, and Nationals are repping the NL in the playoffs. We had simulations where your favorite team won 100 games, where they lost 100 games, where they won the division by one, where they lost the division by one, where they’re playing a Game 163 to break a tie, where the manager got fired in mid-May, where the manager is the Manager of the Year, where they got the first pick in the draft and where they got to ride through your favorite team’s city’s downtown wearing t-shirts that refer to whatever obnoxious inside-meme carried them through a magical October run. We had seasons where that obnoxious meme was the nonsensical slogan Call The Cows Home, and where that obnoxious meme was a viral Vine of a breakdancing rabbit, and where that obnoxious meme was the team’s shared affection for The Great British Bake Off, and where that obnoxious meme was a fat suit that the catcher would wear during post-game interviews. We had seasons where they didn’t wear meme-displaying t-shirts in the parade, but meme-displaying coveralls. We had seasons where the parade was interrupted by a plague of locusts, and seasons where the parade was interrupted by a plague of cicadas. We had a lot of seasons.

What we didn’t have were any seasons that actually happened.

“So,” Bryan Cole asked at the time, “are you guys going to come back in October and tell us which of those million simulations were closest?”

We are, Bryan.

We went through those million sims this week looking for two things: Which season got the standings closest, and which season got the actual records closest, using RMSE. For the uninitiated, RMSE is root-mean-square error, and basically involves taking the difference between predicted value and actual value, squaring it, averaging all those squares, and then taking the square root of the average. It’s a way of saying, basically, that the average projection was off by X. I think.
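As a minimal sketch of the calculation just described, the Python below computes RMSE for a handful of made-up win totals. The function name `rmse` and the numbers are illustrative assumptions, not the code or data behind the playoff odds.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of numbers."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    # Difference between predicted and actual, squared, for each team.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    # Average the squares, then take the square root of that average.
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical projected and actual win totals for three teams.
projected_wins = [90, 81, 70]
actual_wins = [93, 77, 70]

# Errors are -3, 4 and 0; squares are 9, 16 and 0; the mean square is 25/3.
print(round(rmse(projected_wins, actual_wins), 2))  # 2.89
```

Because the errors are squared before they are averaged, one big miss counts for more than several small ones, so RMSE is never smaller than the plain average miss.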

The first measure, the standings, is a test, and that test is: If you were given a million tries, could you guess what the standings are going to look like in a given year? There are 120 ways for a division of five teams to order itself, which means there are about three trillion (that’s billion, with a tr) ways for the whole six-division league to. But we’re semi-educated here, so we can mostly rule out the uncredible seasons where the NL Central goes CINMILPITSLNCHN. I’d have bet we got one exactly right, but… nope.
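The counting above can be checked in a few lines of Python. This is a sketch that assumes five teams per division, six divisions, and no ties; the constant names are made up for illustration.

```python
import math

TEAMS_PER_DIVISION = 5  # assumption: every division has five teams
DIVISIONS = 6           # assumption: three divisions in each league

# Ways to order one division: 5! = 5 * 4 * 3 * 2 * 1.
orderings_per_division = math.factorial(TEAMS_PER_DIVISION)

# Divisions order themselves independently, so the counts multiply.
league_orderings = orderings_per_division ** DIVISIONS

print(orderings_per_division)    # 120
print(f"{league_orderings:,}")   # 2,985,984,000,000
```

A million guesses, if every ordering were equally likely, would cover only about one in three million of those possibilities. The simulations do far better than that because the orderings are nowhere near equally likely.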

The best we got is Sim #601747, with the errors in italics:

| AL East | AL Central | AL West |
|---|---|---|
| Boston | Cleveland | Seattle |
| Toronto | Detroit | Houston |
| Baltimore | Kansas City | Texas |
| New York | Chicago | Oakland |
| Tampa Bay | Minnesota | Los Angeles |

| NL East | NL Central | NL West |
|---|---|---|
| Washington | Chicago | Los Angeles |
| New York | St. Louis | San Francisco |
| Miami | Pittsburgh | San Diego |
| Philadelphia | Milwaukee | Arizona |
| Atlanta | Cincinnati | Colorado |

(This is through Monday’s action, by the way. We’re stuck, for reasons of timing, using incomplete information, but the orientation of these results is pretty secure.)

This was just about the best year possible to be predicting a season, by the way. All three divisions in the NL were so heavily imbalanced between good and bad that you could pretty much narrow an entire half of baseball to about 100 possible scenarios. When our staff did its preseason predictions, there were loads of NL forecasts that were exactly the same. Better, still, for the predictor: Practically nothing surprised us in the NL this year! The split went off almost exactly as we all imagined it, and the correlation between our staff’s NL projections and the actual records is something like .85. Incredible.

Yet even in this year, the year baseball made it easy and went how it was supposed to, there were still too many ways to be wrong.

The second measure—the records—is an experiment, and that experiment is: If you have close to God knowledge, and you know more or less how the season is going to go exactly, how much is still unknowable? I’m not saying that one million simulations are exactly the same as omniscience, but the smartest of those simulations is getting pretty close to as smart as anybody could possibly be.

Sim #99094 provided the closest season by records, and it still had an RMSE of 4.7 wins. (Note that we’re using PECOTA’s projected final records, as seen on the current Playoff Odds, as our “Actual” here.) Look:

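| Team | Sim 99094 | Actual |
|---|---|---|
| Angels | 71 | 71 |
| A's | 76 | 71 |
| Astros | 84 | 87 |
| Mariners | 89 | 85 |
| Rangers | 90 | 95 |
| Indians | 97 | 94 |
| Royals | 73 | 82 |
| Tigers | 83 | 86 |
| Twins | 64 | 70 |
| White Sox | 81 | 77 |
| Blue Jays | 84 | 88 |
| Orioles | 78 | 87 |
| Rays | 75 | 69 |
| Red Sox | 93 | 94 |
| Yankees | 82 | 84 |
| Diamondbacks | 73 | 68 |
| Dodgers | 90 | 92 |
| Giants | 90 | 85 |
| Padres | 77 | 68 |
| Rockies | 75 | 78 |
| Brewers | 78 | 73 |
| Cardinals | 77 | 85 |
| Cubs | 93 | 104 |
| Pirates | 83 | 81 |
| Reds | 73 | 67 |
| Braves | 67 | 66 |
| Marlins | 77 | 81 |
| Mets | 90 | 86 |
| Nationals | 99 | 95 |
| Phillies | 68 | 73 |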

When I was in high school I worked at a two-screen movie theater. The computer in the office had a screensaver that was a little ricocheting ball, like in the game Pong but smaller, crisscrossing the computer screen. Whatever path it covered, it would paint over; once it had painted over the entire screen, we figured it would reset. One day we decided to bet on how long it would take to reset the screen; we’d never seen it happen, because we would always jostle the mouse before the ball completed its work.

I think I had the highest estimate, at something like 80 minutes. Within five minutes, a third of the screen had been painted over. Within 15 minutes, more than half had, and I was conceding that I was going to lose this bet. That’s how I imagine the simulations do. Within one simulation, you probably capture about half the baseball intelligence that you need to understand a season; you’ve got all 30 teams, for instance! There are no Cleveland Spiders or Sonoma Stompers in that league, nosiree. The basic shape of the standings probably points to something real.

Within 10 simulations, you’ve probably got at least one simulation that has 60 or 70 percent of the season cracked. Within 500 simulations, your best simulation reaches, say, 92 percent, and within 10,000 your best sim is at 98 percent, and within a million you’re at 99 percent. That best sim gets baseball. It gets the season. It knows a lot. And it still misses a lot, and that last little bit, those last little pixels, get really, really, really hard to hit. Our screensaver didn’t finish the job in two days. We let it run overnight, doing the receipts by hand, and we finally gave up.
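The last-pixels idea can be sketched with a toy model. Everything below is an assumption made for illustration: the 30 made-up win totals, the independent bell-curve noise with a spread of six wins, and the function names. It is not how PECOTA or the playoff odds simulate a season; it only tracks how the best RMSE found so far changes as more toy sims are run.

```python
import math
import random

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def best_rmse_by_checkpoint(actual, checkpoints, noise_sd=6.0, seed=0):
    """Return the lowest RMSE seen so far after each checkpoint number of toy sims."""
    rng = random.Random(seed)
    wanted = set(checkpoints)
    best = math.inf
    results = {}
    for sim_number in range(1, max(wanted) + 1):
        # Toy "simulation": each team's real total plus independent Gaussian noise.
        simulated = [wins + rng.gauss(0, noise_sd) for wins in actual]
        best = min(best, rmse(simulated, actual))
        if sim_number in wanted:
            results[sim_number] = best
    return results

# Hypothetical 30-team league; these win totals are invented for the example.
actual_wins = [66 + i for i in range(30)]

for n, best in best_rmse_by_checkpoint(actual_wins, [1, 10, 100, 1000, 10000]).items():
    print(f"best of {n:>6} sims: RMSE {best:.2f}")
```

In this toy model the best RMSE can only fall or stay flat as sims are added. How quickly it flattens out depends entirely on the assumed `noise_sd`, so the printed numbers should not be read as estimates of anything real.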

I wrote my first piece at BP in 2011, and this is my last, as I'll be starting a new gig writing about baseball for ESPN. I was terrified to write that first BP piece, and the first few dozen, because I felt like BP writers were supposed to know baseball all the way to the last pixel. I knew the players and the teams, I knew the stats, and I knew how it felt to watch baseball, but I also had a gnawing worry that if I couldn’t predict how Mike Fiers’ next start was going to go I had no business writing about him. It was intimidating to write alongside writers, and for readers, who seemed so much smarter than me.

But writing those first articles, and getting to know the smarter-than-me writers, and getting the feedback from the smarter-than-me readers, the fear faded away. Nobody gets the final pixels, and none of us at BP expects to. We go into a season knowing that anything can happen, and even when we’re as right as we can plausibly be—even in our most accurate simulation—there are little miracles of the unexpected. Sim #99094 only missed the Yankees by two games, but those two games were Gary Sanchez! It only missed the Dodgers by two, but those two included Rich Hill! It’s in that space, where baseball is simultaneously projectable and unimaginable, that Baseball Prospectus thrives. I couldn’t be prouder of the writers I got to edit over the past four years, who absolutely owned that space.

This is where I would thank them all (I love this staff), but there are too many, and too many other people to thank, so I’ll do those in private. But I do want to single out Rob McQuown, who is mostly behind the scenes here but far more important to producing the site, the annual, everything BP does than I ever was. Sometimes we remember to put a little note at the bottom of our articles that says “Thanks to Rob McQuown for research assistance.” That’s secret BP code for “I wouldn’t be able to get out of bed in the morning if Rob didn’t exist.” He’s the nicest, smartest, hardest-working guy, too. So thanks, Rob, for research assistance.

Aaron Gleeman will take over as Editor in Chief on Monday. He’s going to be so good at this job. Some things really actually are easy to predict, and that’s one of them.

gilpdawg
9/23
RIP Sam. But seriously, look forward to reading you at the "worldwide leader."
bpars3
9/23
I have loved reading your work, and listening, too, over the years. You will be missed at BP. Best of luck at ESPN...
bobbygrace
9/23
Thank you, Sam, for your great work at BP over the years. I'll miss seeing your articles here but will look for them at ESPN. I hope "Effectively Wild" will continue forever.

As a Twins fan, I've been reading Aaron Gleeman's work for years, too. He writes beautifully and has stood out as an unflinching critical thinker in a baseball community that lacked such writers for far too long. He's a good addition to BP. Congratulations, Aaron.
heterodude
9/23
I've very much enjoyed your work, Mr. Miller. I will obviously continue to read your work, even as you move onto a different site and medium.

As for Aaron Gleeman, I'm sure he'll do great. Although, I still have a winning record against him in Whatifsports' Hardball Dynasty so there is room for improvement.
bryanherr
9/23
Good luck Sam at your new job! Sim 964264 wasn't too far off for the Twins.
collins
9/27
Actually, it says there that the Twins will win about 70 this season. Closer to 60. About five games off anyway.
Shauncore
9/23
I'm glad we got the simulation number in life that resulted in Sam being the BP editor.
thepete39
9/23
Wow, what a major shift. The Sam Miller Era at BP will certainly be remembered as being super fun and playful while bringing important growth and evolution for the BP annual. The book has gotten better and better the last few years and I credit Sam (and Jason W.) for that. Best of luck and continued success to you, Sam!

The Aaron Gleeman news is huge. I've been following that guy since his earliest blogging endeavors and I'm thrilled for him.

Looking forward to watching this all unfold.
walrus0909
9/23
Thanks for answering my question, and good luck at the new gig.

As a wise man once said, "We could all use a little change."
NJTomatoes
9/23
Sam, One of the hidden downsides of your departure: I won't get to see you on MLB TV during the spring speculation season causing Bryan Kenny's forehead to furrow when you offer an opinion that has zero varnish applied (I don't think I've ever seen an ESPN writer on their panels). Good luck with your new gig. I hope they continue to let you write smart.
NJTomatoes
9/23
...i guess it's actually his brow that furrows.
nickgieschen
9/23
tl;dr
jfranco77
9/23
If you run your "baseball prospectus 2011" sim one million times, do any of them end up with Sarge in Houston, the Professor in Chicago and Sam at ESPN?

#RIPSam
russell
9/23
Best of luck at ESPN, Sam. I will miss your work at BP, but will try to visit you often at your new place.

And way to bury the lede. Congrats to Aaron on his new role.
MateoM
9/23
I have always enjoyed your articles, Sam. I especially liked the one about the man in black during Kershaw's no-hitter. What a unique way to write about a no-hitter, much like the million sims was a unique way to write a good-bye column. Hillary Clinton's Between the Two Ferns appearance yesterday has already received an "immortal" rating over at Funny or Die. This article and the Kershaw article receive an "immortal" rating from me, for whatever that is worth (other than distracting us from contemplating our own mortality).

Good luck. Looking forward to reading your stuff at ESPN, though I remain a loyal subscriber to BP. I hope you and Ben continue your partnership, also, with EW, but I realize it can't last forever.
jfmoguls
9/24
On never being bored: I remember when I got the first edition of the MacMillan Baseball Encyclopedia and I said that, I'll never be bored again. Still have it, still look it (despite Baseball Reference), still am not bored.

Thanks Sam for many hours of enjoyment. Don't let ESPN take your edge off.

Congrats to Aaron.
brownsugar
9/26
Fare thee well, Sam. You have a unique skill that combines keeping it informative with keeping it fun, and I've appreciated it all along the way. You have single-handedly made me riveted by every Mark Reynolds at-bat that I've seen in the last 4 years.
redspid
9/26
Thanks Sam, really enjoyed reading your articles.