
September 23, 2016

Pebble Hunting

That's The Way I Like It And I'll Never Get Bored

by Sam Miller

Baseball Prospectus’ playoff odds have traditionally been based on 50,000 simulations of the season, but this year we wanted to say we did one million, so we did one million. Out of those one million simulations, we pretty much had the whole world of possibilities covered: We had a simulation where the A’s won 107 games, and one where the Cubs won 58. We had a simulation where the Phillies, Reds, Rockies, Padres, and Nationals are repping the NL in the playoffs. We had simulations where your favorite team won 100 games, where they lost 100 games, where they won the division by one, where they lost the division by one, where they’re playing a Game 163 to break a tie, where the manager got fired in mid-May, where the manager is the Manager of the Year, where they got the first pick in the draft and where they got to ride through your favorite team’s city’s downtown wearing t-shirts that refer to whatever obnoxious inside-meme carried them through a magical October run. We had seasons where that obnoxious meme was the nonsensical slogan Call The Cows Home, and where that obnoxious meme was a viral Vine of a breakdancing rabbit, and where that obnoxious meme was the team’s shared affection for The Great British Bake Off, and where that obnoxious meme was a fat suit that the catcher would wear during post-game interviews. We had seasons where they didn’t wear meme-displaying t-shirts in the parade, but meme-displaying coveralls. We had seasons where the parade was interrupted by a plague of locusts, and seasons where the parade was interrupted by a plague of cicadas. We had a lot of seasons.

What we didn’t have were any seasons that actually happened.

“So,” Bryan Cole asked at the time, “are you guys going to come back in October and tell us which of those million simulations were closest?”

We are, Bryan.

We went through those million sims this week looking for two things: Which season got the standings closest, and which season got the actual records closest, using RMSE. For the uninitiated, RMSE is root-mean-square error, and basically involves taking the difference between predicted value and actual value, squaring it, averaging all those squares, and then taking the square root of the average. It’s a way of saying, basically, that the average projection was off by X. I think.
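The recipe above can be sketched in a few lines of Python. This is a minimal illustration, not the code BP actually ran; the function name `rmse` and the three win totals are made up for the example. One caveat on "off by X": because the misses are squared before averaging, RMSE weights big misses more heavily than a plain average of the misses would.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of win totals."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    # Difference between predicted and actual, squared, for each team.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    # Average the squares, then take the square root of that average.
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical three-team example (illustrative numbers, not from any sim).
predicted_wins = [90, 81, 70]
actual_wins = [94, 81, 67]
print(round(rmse(predicted_wins, actual_wins), 2))  # 2.89
```

The misses here are 4, 0, and 3 wins; their plain average is about 2.3, while the RMSE comes out closer to 2.9 because the four-win miss counts for more.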

The first measure—the standings—is a test, and that test is: If you were given a million tries, could you guess what the standings are going to look like in a given year? There are 120 ways for a five-team division to order itself, which means there are about three trillion (that’s billion, with a tr) ways for the whole league to. But we’re semi-educated here, so we can mostly rule out the uncredible seasons where the NL Central goes CINMILPITSLNCHN. I’d have bet we got one exactly right, but… nope.
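The counting behind those two numbers is quick to check. A small sketch, assuming six divisions of five teams each and no ties in the final order (the constant names are just for illustration):

```python
import math

TEAMS_PER_DIVISION = 5
DIVISIONS = 6

# Ways to order one five-team division: 5! = 120.
orderings_per_division = math.factorial(TEAMS_PER_DIVISION)

# Each division orders itself independently, so multiply across all six.
league_orderings = orderings_per_division ** DIVISIONS

print(orderings_per_division)   # 120
print(f"{league_orderings:,}")  # 2,985,984,000,000
```

That is just shy of three trillion, which is roughly three million possible leagues for every one of the million simulations.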

The best we got is Sim #601747, with the errors in italics:

AL East        AL Central     AL West
Boston         Cleveland      Seattle
Toronto        Detroit        Houston
Baltimore      Kansas City    Texas
New York       Chicago        Oakland
Tampa Bay      Minnesota      Los Angeles

NL East        NL Central     NL West
Washington     Chicago        Los Angeles
New York       St. Louis      San Francisco
Miami          Pittsburgh     San Diego
Philadelphia   Milwaukee      Arizona
Atlanta        Cincinnati     Colorado

(This is through Monday’s action, by the way. We’re stuck, for reasons of timing, using incomplete information, but the orientation of these results is pretty secure.)

This was just about the best year possible to be predicting a season, by the way. All three divisions in the NL were so heavily imbalanced between good and bad that you could pretty much narrow an entire half of baseball to about 100 possible scenarios. When our staff did its preseason predictions, there were loads of NL forecasts that were exactly the same. Better, still, for the predictor: Practically nothing surprised us in the NL this year! The split went off almost exactly as we all imagined it, and the correlation between our staff’s NL projections and the actual records is something like .85. Incredible.

Yet even in this year, the year baseball made it easy and went how it was supposed to, there were still too many ways to be wrong.

The second measure—the records—is an experiment, and that experiment is: If you have close to God knowledge, and you know more or less how the season is going to go exactly, how much is still unknowable? I’m not saying that one million simulations are exactly the same as omniscience, but the smartest of those simulations is getting pretty close to as smart as anybody could possibly be.

Sim #99094 provided the closest season by records, and it still had an RMSE of 4.7 wins. (Note that we’re using PECOTA’s projected final records, as seen on the current Playoff Odds, as our “Actual” here.) Look:

Team           Sim 99094   Actual
Angels         71          71
A's            76          71
Astros         84          87
Mariners       89          85
Rangers        90          95
Indians        97          94
Royals         73          82
Tigers         83          86
Twins          64          70
White Sox      81          77
Blue Jays      84          88
Orioles        78          87
Rays           75          69
Red Sox        93          94
Yankees        82          84
Diamondbacks   73          68
Dodgers        90          92
Giants         90          85
Padres         77          68
Rockies        75          78
Brewers        78          73
Cardinals      77          85
Cubs           93          104
Pirates        83          81
Reds           73          67
Braves         67          66
Marlins        77          81
Mets           90          86
Nationals      99          95
Phillies       68          73

When I was in high school I worked at a two-screen movie theater. The computer in the office had a screensaver that was a little ricocheting ball, like in the game Pong but smaller, crisscrossing the computer screen. Whatever path it covered, it would paint over; once it had painted over the entire screen, we figured it would reset. One day we decided to bet on how long it would take to reset the screen; we’d never seen it happen, because we would always jostle the mouse before the ball completed its work.

I think I had the highest estimate, at something like 80 minutes. Within five minutes, a third of the screen had been painted over. Within 15 minutes, more than half had, and I was conceding that I was going to lose this bet. That’s how I imagine the simulations do. Within one simulation, you probably capture about half the baseball intelligence that you need to understand a season; you’ve got all 30 teams, for instance! There are no Cleveland Spiders or Sonoma Stompers in that league, nosiree. The basic shape of the standings probably points to something real.

Within 10 simulations, you’ve probably got at least one simulation that has 60 or 70 percent of the season cracked. Within 500 simulations, your best simulation reaches, say, 92 percent, and within 10,000 your best sim is at 98 percent, and within a million you’re at 99 percent. That best sim gets baseball. It gets the season. It knows a lot. And it still misses a lot, and that last little bit, those last little pixels, get really, really, really hard to hit. Our screensaver didn’t finish the job in two days. We let it run overnight, doing the receipts by hand, and we finally gave up.

I wrote my first piece at BP in 2011, and this is my last, as I'll be starting a new gig writing about baseball for ESPN. I was terrified to write that first BP piece, and the first few dozen, because I felt like BP writers were supposed to know baseball all the way to the last pixel. I knew the players and the teams, I knew the stats, and I knew how it felt to watch baseball, but I also had a gnawing worry that if I couldn’t predict how Mike Fiers’ next start was going to go I had no business writing about him. It was intimidating to write alongside writers, and for readers, who seemed so much smarter than me.

But writing those first articles, and getting to know the smarter-than-me writers, and getting the feedback from the smarter-than-me readers, the fear faded away. Nobody gets the final pixels, and none of us at BP expects to. We go into a season knowing that anything can happen, and even when we’re as right as we can plausibly be—even in our most accurate simulation—there are little miracles of the unexpected. Sim #99094 only missed the Yankees by two games, but those two games were Gary Sanchez! It only missed the Dodgers by two, but those two included Rich Hill! It’s in that space, where baseball is simultaneously projectable and unimaginable, that Baseball Prospectus thrives. I couldn’t be prouder of the writers I got to edit over the past four years, who absolutely owned that space.

This is where I would thank them all—I love this staff—but there are too many, and too many other people to thank, so I’ll do those in private. But I do want to single out Rob McQuown, who is mostly behind the scenes here but far more important to producing the site, the annual, everything BP does than I ever was. Sometimes we remember to put a little note at the bottom of our articles that says “Thanks to Rob McQuown for research assistance.” That’s secret BP code for “I wouldn’t be able to get out of bed in the morning if Rob didn’t exist.” He’s the nicest, smartest, hardest-working guy, too. So thanks, Rob, for research assistance.

Aaron Gleeman will take over as Editor in Chief on Monday. He’s going to be so good at this job. Some things really actually are easy to predict, and that’s one of them.

Sam Miller is an author of Baseball Prospectus. 