I know that we haven’t actually reached the end of the 2016 season and the game(s) that are left are ummm … kinda important, but the reality is that most teams are already in “next season” mode and it’s time to start thinking about what the future is going to hold. By this time next week, we’ll be talking about the Hot Stove and any other major news stories that might be going on at the time.

As Niels Bohr reminds us: “It’s tough to make predictions, especially about the future.” Last offseason’s genius move, say the signing of Jason Heyward to an eight-year mega-contract, can now look silly after Heyward … what the heck happened to Heyward? We're about to enter the season in which teams start making predictions about the year(s) to come. And assigning dollar figures to those predictions.

For players, it’s time to say goodbye to the people with whom you went to summer camp and to look ahead to next summer, when you’ll make some new friends. A few of the old ones will be back at camp, but some of them will have changed. A lot can happen in a year. Some of those changes will be good: a pitcher who learns a new pitch, or a hitter who puts on some muscle in the offseason. Some will be not so good. Right now, there are plenty of meetings going on in which baseball executives are weighing which players they think are going which way.

Picking the right guys could make all the difference in the world to your team’s chances next year, and since by the end of the week there will be no actual baseball for a few months, everyone can wishcast that their favorite team’s GM is about to pull all the right strings on the free agent market. Even though the evidence suggests that there’s a lot of randomness at work and that “good GMs” might actually just be “lucky GMs.”

Every team preview ever written in the month of March is guaranteed to contain two sentences. One is that “injuries will be a key factor in Team X’s success.” The other is the line about how “if a few things break in favor of Team X, they could put together a playoff run.” How likely is that? In the coming months, 30 GMs will shape 30 rosters for 2017. Can we get some idea of how much of what they are about to do is skill and how much is luck?

Warning! Gory Mathematical Details Ahead!

The thing about free agent signings is that you have to pay the player before you receive the merchandise. That’s the problem with the future in general. I actually found myself asking a question that I’d never thought to answer before. How much–using WAR as our base metric–does the average player change from year to year?

The dirty secret of most projection systems is that they basically take the past few years of stats and assume that the player will do that again, plus or minus some aging curves and randomness. I examined all of the hitter seasons from 1999-2016 following a year in which the hitter had amassed at least 400 plate appearances. I used a simple average of his last three years of WAR as a mark of his “true” expected talent and then looked at what he did in the next year.

It also makes sense that in trying to get a quick thumbnail on a player, we might look back over his last couple of years before declaring him a “two-win player, give or take.” The following table shows the average delta between a hitter’s next-year WAR and that three-year average, along with the standard deviation of that delta.

| Group | Delta (WAR this year – avg. WAR past 3 years) | Standard Deviation |
|---|---|---|
| All Players | -0.47 | 1.94 |
| Avg. WAR past 3 years: < 0 | 0.98 | 1.41 |
| Avg. WAR past 3 years: 0-1 | 0.21 | 1.61 |
| Avg. WAR past 3 years: 1-2 | -0.17 | 1.72 |
| Avg. WAR past 3 years: 2-3 | -0.60 | 1.93 |
| Avg. WAR past 3 years: 3-4 | -0.89 | 1.98 |
| Avg. WAR past 3 years: 4-5 | -1.01 | 1.98 |
| Avg. WAR past 3 years: > 5 | -1.39 | 2.25 |
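
As a rough illustration of the bookkeeping behind a table like this one, the sketch below computes the delta for a handful of made-up player histories and sorts them into the same WAR bands. The WAR values, the `delta` and `bucket` helper names, and the handling of band boundaries are all assumptions for illustration, not the article's data or code.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical WAR histories: (three years ago, two years ago, last year, this year).
# These values are invented for illustration only.
seasons = [
    (2.1, 2.8, 3.0, 1.9),
    (0.4, 0.9, 0.5, 1.2),
    (5.5, 6.1, 5.2, 4.0),
    (-0.3, 0.1, -0.4, 0.6),
    (1.5, 1.0, 2.0, 1.1),
    (3.2, 3.9, 3.6, 3.8),
]

def delta(history):
    """This year's WAR minus the simple average of the prior three years."""
    *past, current = history
    return current - mean(past)

def bucket(avg_war):
    """Label the three-year average with a band like those in the table.

    Boundary handling (which band gets exactly 1.0, 2.0, ...) is an assumption.
    """
    if avg_war < 0:
        return "< 0"
    if avg_war >= 5:
        return "> 5"
    low = int(avg_war)  # floor, since avg_war is non-negative here
    return f"{low}-{low + 1}"

deltas = [delta(h) for h in seasons]
print(f"All players: mean delta {mean(deltas):+.2f}, SD {stdev(deltas):.2f}")

# Group the deltas by band; real data would have many players per band.
by_band = defaultdict(list)
for h in seasons:
    by_band[bucket(mean(h[:3]))].append(delta(h))

for band, values in sorted(by_band.items()):
    print(f"{band}: n={len(values)}, mean delta {mean(values):+.2f}")
```

With real data, each band would hold enough players to compute a standard deviation per band as well.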

The average below-replacement-level player who comes back gains about a win of value in his next season. It’s worth pointing out that the ones who come back are a very selective sample. These are the ones whom a major-league staff has decided have some chance to actually be worth something above replacement level. The bad ones are simply sent out to pasture. However, at the top end of the scale, we see that the typical All-Star-level player loses about 1.4 wins in the next year. What goes up (and gets older) must come down.

(For those wondering how it is that all players get worse every year, some of that is just aging, but some of that is positive contributors retiring and leaving the data set each year, while new rookies come into the league. If they then contribute anything positive, that’s all “new money” to the system.)

But look at those standard deviations. For those who slept through that day in stats class, that’s a measure of how wide the distribution is. For example, to round for the purposes of simplification, we expect an “all player” to lose half a win (.47) of value next year, with a standard deviation of two (OK, really 1.94). One standard deviation in either direction marks off about 68 percent of the sample (for the initiated, skew statistics on these distributions were minimal). That means that we expect 68 percent of our “all players” to be somewhere between -2.5 wins and +1.5 wins compared to the previous three years.
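
For anyone who wants to check that arithmetic without the rounding, here is a minimal sketch using Python's standard library. It assumes, as the text does, that the deltas are close enough to normal for the 68 percent rule to apply; the variable names are my own.

```python
from statistics import NormalDist

# "All Players" row for hitters, unrounded.
mean_delta, sd_delta = -0.47, 1.94

# One standard deviation either side of the mean.
low, high = mean_delta - sd_delta, mean_delta + sd_delta

# Share of a normal distribution that falls inside that band.
dist = NormalDist(mean_delta, sd_delta)
share_within = dist.cdf(high) - dist.cdf(low)

print(f"one-SD band: {low:+.2f} to {high:+.2f} wins")
print(f"share inside the band under a normal curve: {share_within:.1%}")
```

The unrounded band runs from about -2.41 to +1.47 wins, which the rounded version in the text widens slightly to -2.5 and +1.5.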

That’s a lot of variation. The entire mission of a baseball operations department is to explain and influence as much of that variation as possible. Teams are trying to figure out who might be primed for a breakout and working with players already in their system trying to induce one. But let’s see if specific front offices show any specific talent for getting it right.

Using data from 2007-2016, I looked at the reliability in changes in WAR (paneled by team) using Cronbach’s alpha method. It turns out that the deltas showed shockingly low levels of reliability, to the point that there was no evidence that teams had any skill in employing players who would be better than last year. I went back and accounted for the shrinkage factor we see above (bad players tend to get better, good ones tend to regress), although that didn’t help matters any.

Even after re-coding the numbers simply as whether each player got better or worse (and switching to the KR-21 formula), reliability remained low. There is some methodological fuzz in there, because over 10 years, GMs and various front office folk would have come and gone, but the message remained: teams, at least in small samples, are really bad at figuring out who is going to be good next year.
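
For readers who have not met these reliability statistics, here is a minimal sketch of the textbook formulas for Cronbach's alpha and KR-21. The article does not spell out how the panel was laid out, so the layout here (rows as hypothetical teams, columns as hypothetical seasons), the invented numbers, and the use of population variance are all assumptions for illustration.

```python
from statistics import mean, pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha for a matrix of rows (subjects) by columns (items)."""
    k = len(rows[0])
    item_vars = [pvariance(col) for col in zip(*rows)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def kr21(rows):
    """KR-21 for a matrix of 0/1 items; needs only the mean and variance of row totals."""
    k = len(rows[0])
    totals = [sum(r) for r in rows]
    m = mean(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * pvariance(totals)))

# Hypothetical average WAR deltas: one row per team, one column per season.
team_deltas = [
    [0.3, -0.8, 0.1, -0.4],
    [-0.6, 0.2, -0.9, 0.5],
    [-0.2, -0.5, 0.4, -0.1],
    [0.1, 0.3, -0.7, -0.3],
]

# Re-code as 1 if the team's players improved on average that season, else 0.
improved = [[1 if d > 0 else 0 for d in row] for row in team_deltas]

print(f"Cronbach's alpha on raw deltas: {cronbach_alpha(team_deltas):.2f}")
print(f"KR-21 on better/worse coding:   {kr21(improved):.2f}")
```

Values near 1 would mean a team's results in one season tell you a lot about its results in the others; values near zero (or below) mean they tell you next to nothing.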

Maybe I’m being too harsh on them for living through the decline years on a few big free agent contracts that they knew were going to be rough, but the reliability numbers were so low that it wouldn’t make much difference even if we gave them credit for “yeah, we knew he was gonna decline that year, and we knew we’d have to live with that” or even “yeah, we knew that he was gonna decline a little bit, but he was a six-win player last year and a five-win guy ain’t so bad.”

The implications of this one are interesting. Because the standard deviation for hitters is in the neighborhood of two wins in how much they will improve or deteriorate, and teams appear to be largely at the mercy of randomness in which way their players will fall on the Plinko board, it means that all of the scheming a team does in the offseason might be offset–by a lot–by the effects of dumb luck.

So far, I’ve only looked at hitters, but the numbers for pitchers are similar. These are starters (more than 100 innings last year and more than half of their appearances as starts), again from 1999-2016.

| Group | Delta (WAR this year – avg. WAR past 3 years) | Standard Deviation |
|---|---|---|
| All Players | -0.38 | 2.16 |
| Avg. WAR past 3 years: < 0 | 1.19 | 1.48 |
| Avg. WAR past 3 years: 0-1 | 0.36 | 1.81 |
| Avg. WAR past 3 years: 1-2 | -0.14 | 1.98 |
| Avg. WAR past 3 years: 2-3 | -0.52 | 2.12 |
| Avg. WAR past 3 years: 3-4 | -1.02 | 2.21 |
| Avg. WAR past 3 years: 4-5 | -1.16 | 2.25 |
| Avg. WAR past 3 years: > 5 | -1.75 | 2.46 |

Starters show the same pattern as hitters, except they are even more variable. Now, relievers (at least 50 innings, with more than half of their appearances in relief).

| Group | Delta (WAR this year – avg. WAR past 3 years) | Standard Deviation |
|---|---|---|
| All Players | -0.30 | 1.16 |
| Avg. WAR past 3 years: < 0 | 0.56 | 0.97 |
| Avg. WAR past 3 years: 0-1 | -0.10 | 0.98 |
| Avg. WAR past 3 years: 1-2 | -0.68 | 1.13 |
| Avg. WAR past 3 years: 2-3 | -1.15 | 1.37 |
| Avg. WAR past 3 years: 3-4 | -1.29 | 1.46 |

Smaller standard deviations, mostly because relievers live in a WAR band that is much smaller than that of starters. Running the same sort of reliability analysis as above for pitchers, I continued to find perilously low levels of reliability by team. Teams do not show a specific talent for picking guys who will improve (or avoiding those who will slink backward).

The thing about all that variability is that it leaves us with the following question: What’s the use of a projection that has a single standard deviation error band that is four wins wide? It stands to reason that, across the 20-something players who will contribute significant playing time to a team, some will improve, some will decline, and perhaps it all cancels out. But what if it doesn’t?

Let’s assume a team will have seven returning position players contributing full time (and maybe a couple rookies?), and as above, we expect them to perform an average of 0.47 wins below their previous three-year average, but with a standard deviation of 1.94. We know that the distributions show little skew, so we can simply sample from a normal distribution. I instructed my handy dandy spreadsheet to generate seven random values sampled from a normal distribution with those parameters. This represents how much growth or deterioration we can expect over what the players have put up in the past.

I did the same for four starters and four relievers, using the pitcher numbers above. Because these are all random draws, we should be able to get a decent distribution of how much improvement we might expect from a team’s returning players over what they had put up in the past. I ran one million simulations and got an average “loss” of 6.00 wins from our returning veterans. This makes sense: seven position players with an expected loss of 0.47 wins each, four starting pitchers at 0.38 wins each, and four relievers at 0.30 wins each add up to an expected loss of 6.01 wins overall.
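
The same experiment is easy to sketch outside a spreadsheet. The version below is a rough reconstruction, not the original workbook: it draws independent normal deltas for seven hitters, four starters, and four relievers from the “All Players” rows, and it runs 100,000 simulated teams rather than one million to keep it quick. The seed and function names are arbitrary choices of mine.

```python
import random
from statistics import mean, stdev

# (number of players, mean delta, SD of delta) from the "All Players" rows.
GROUPS = [
    (7, -0.47, 1.94),  # position players
    (4, -0.38, 2.16),  # starting pitchers
    (4, -0.30, 1.16),  # relievers
]

def simulate_team(rng):
    """Sum of one random delta per returning player, assuming independent normal draws."""
    return sum(rng.gauss(mu, sd) for n, mu, sd in GROUPS for _ in range(n))

def run(n_sims=100_000, seed=2016):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [simulate_team(rng) for _ in range(n_sims)]

totals = run()
expected = sum(n * mu for n, mu, _ in GROUPS)  # straight arithmetic, no simulation

print(f"expected total delta:  {expected:+.2f} wins")
print(f"simulated mean delta:  {mean(totals):+.2f} wins")
print(f"simulated SD of total: {stdev(totals):.2f} wins")
```

The simulated mean should land close to the arithmetic expectation, with the small gap shrinking as the number of simulated teams grows.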

That means it’s not all that hard to see a group of players whom everyone would generally agree should provide about X wins based on their previous performance instead provide X-plus-four or X-minus-four wins of value. In fact, we could expect at least a four-win increase more than a quarter of the time (28.69 percent of the time to be exact) in this model. And there’s an equal chance of dropping four wins.
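
Because a sum of independent normal draws is itself normal, that probability can also be checked without simulation. The sketch below is my own cross-check, not the article's method. It assumes independence between players and measures the four-win swing against the expected total, which is an inference from the arithmetic rather than something stated in the text; on that reading it gives roughly 28.7 percent, in line with the simulated figure.

```python
from math import sqrt
from statistics import NormalDist

# (number of players, mean delta, SD of delta) from the "All Players" rows.
GROUPS = [
    (7, -0.47, 1.94),  # position players
    (4, -0.38, 2.16),  # starting pitchers
    (4, -0.30, 1.16),  # relievers
]

# Means add; for independent draws, variances add too.
total_mean = sum(n * mu for n, mu, _ in GROUPS)
total_sd = sqrt(sum(n * sd ** 2 for n, _, sd in GROUPS))

team = NormalDist(total_mean, total_sd)

# Chance of finishing at least four wins above or below the expected total.
p_up4 = 1 - team.cdf(total_mean + 4)
p_down4 = team.cdf(total_mean - 4)

print(f"team-level SD: {total_sd:.2f} wins")
print(f"P(at least +4 vs. expectation): {p_up4:.2%}")
print(f"P(at least -4 vs. expectation): {p_down4:.2%}")
```

The two tails match because the normal curve is symmetric around its mean, which is why the chance of dropping four wins equals the chance of gaining four.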

Imagine for a moment that a team works all offseason and makes some shrewd moves that improve its talent level, based on reasonable projections, by four wins. All that hard work has a one-quarter chance of being wiped out by random chance. And a one-quarter chance that they will look like geniuses when the results show an eight-win jump.

Random, Random, Random

In United States culture, we have the unfortunate tendency of blaming (or crediting) people for things beyond their control. In baseball, we call GMs geniuses when they have a suddenly and surprisingly good year, and idiots when they have a bad one. The evidence here suggests that we should do neither. There is no evidence that teams have much control over predicting what the natural ebb and flow of their talent pool will be. You Can't Predict Baseball and all that. Sometimes you just get unlucky.

It’s not that teams have no control overall, but that the control they do exert can be easily overwhelmed by random chance. And maybe deep down, it’s the fault of teams that they don’t have as much control as we might hope they would. Teams are in the business of being able to figure out who’s good and who isn’t, and if they’re not able to do that, maybe they need to study harder for the test. But let’s also note that they are trying to predict human behavior, and as a psychologist, I can tell you that it’s damn near impossible to predict humans.

Thank you for reading

This is a free article. If you enjoyed it, consider subscribing to Baseball Prospectus. Subscriptions support ongoing public baseball research and analysis in an increasingly proprietary environment.
