
If you missed it, I filled a guest spot on the Fantasy Baseball Roundtable Radio show last night alongside industry friends and show regulars Patrick DiCaprio, Mike Podhorzer, and Greg Marta.  We talked mostly about buy-high and sell-low players—you can listen to the whole show here, if you’re interested—but there was one more theoretical topic that came up in passing which I found interesting but didn’t have a chance to jump into: trend analysis.

When discussing “buy high” third basemen, Pat mentioned how he was questioning the utility of trend analysis, specifically in regard to Hanley Ramirez.  Despite a four-year trend of declining power, Hanley has bounced back this year to a level that falls between his 2008 and 2009 seasons:

YEAR   AGE   PA    HR   HR/PA
2008   25    693   33   4.8%
2009   26    652   24   3.7%
2010   27    619   21   3.4%
2011   28    385   10   2.6%
2012   29    270   11   4.1%

If you were using trend analysis, you would have seen an aging Hanley Ramirez losing power each year and expected him to continue losing power (or at least level off) in 2012.  You would have been wrong, however.  And not only in results, mind you, but in process.  Trend analysis has no place in projecting a player, as much as some may find solace in using it, seeing a clear pattern and an easy solution.  But easy isn’t always best.
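To put a number on what a straight-line trend would have predicted, here is a minimal sketch that fits an ordinary least-squares line through Hanley's 2008-2011 HR/PA and extends it to 2012. The PA and HR figures come from the table above; the choice of a linear fit and the helper name `fit_line` are my own assumptions, since "trend analysis" is not one precisely defined method.

```python
# Hanley Ramirez's seasons from the table above: (year, PA, HR).
SEASONS = [(2008, 693, 33), (2009, 652, 24), (2010, 619, 21), (2011, 385, 10)]
ACTUAL_2012 = (270, 11)  # PA and HR from the 2012 row of the table


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept) of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


years = [year for year, _, _ in SEASONS]
rates = [hr / pa for _, pa, hr in SEASONS]

# Assumption: "trend analysis" here means extending the fitted line one more year.
slope, intercept = fit_line(years, rates)
trend_2012 = slope * 2012 + intercept
actual_2012 = ACTUAL_2012[1] / ACTUAL_2012[0]

print(f"Trend-line HR/PA for 2012: {trend_2012:.1%}")
print(f"Actual HR/PA in 2012:      {actual_2012:.1%}")
```

Under that assumption the line keeps falling below his 2011 rate, while his actual 2012 rate sits well above it.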

The problem with trend analysis is that it usually runs contradictory to one of the most fundamental concepts in projecting a player: regression to the mean.  The most illustrative way to explain regression to the mean that I’ve found is Bill James’ Whirlpool Principle:

All teams are drawn forcefully toward the center. Most of the teams which had winning records in [2010] will decline in [2011]; most of the teams which had losing records in [2010] will improve in [2011].

Naturally, you can make the leap from teams to players and understand that, basically, all players, whether they were above average or below average in one year, will perform closer to league average in the next year (absent other information).  We know this to be unequivocally, 100 percent true, and all projection systems—from the simplest Marcel to the complex PECOTA—incorporate regression to the mean in one way or another.
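For readers who want to see the mechanics, one common way to express regression to the mean is to blend a player's observed rate with a fixed number of league-average plate appearances. The sketch below is illustrative only: `LEAGUE_HR_PER_PA` and `REGRESSION_PA` are placeholder values I picked for the example, not figures from this article or from any particular projection system, and real systems choose and apply them differently.

```python
def regress_to_mean(hr, pa, league_rate, regression_pa):
    """Blend observed HR/PA with the league rate, as if the player had an extra
    `regression_pa` plate appearances of exactly league-average production."""
    return (hr + league_rate * regression_pa) / (pa + regression_pa)


LEAGUE_HR_PER_PA = 0.027  # assumed placeholder, not a figure from the article
REGRESSION_PA = 300       # assumed placeholder for the amount of regression

# Hanley's 2008 line (33 HR in 693 PA) sits above the placeholder league rate,
# so the estimate is pulled down toward it.
from_above = regress_to_mean(33, 693, LEAGUE_HR_PER_PA, REGRESSION_PA)

# His 2011 line (10 HR in 385 PA) sits just below the placeholder league rate,
# so the estimate is pulled up toward it.
from_below = regress_to_mean(10, 385, LEAGUE_HR_PER_PA, REGRESSION_PA)

print(f"2008 observed {33 / 693:.1%} -> regressed {from_above:.1%}")
print(f"2011 observed {10 / 385:.1%} -> regressed {from_below:.1%}")
```

Either way, the regressed estimate lands between the observed rate and the league rate, and the smaller the sample, the harder it gets pulled.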

I’m sure you can see how this runs contrary to trend analysis.  While regression to the mean tells us that players will post numbers closer to league average next season, trend analysis tells us that they will continue further along the path they’re currently on, traveling further toward one extreme or another.  While the latter might seem to make some sort of sense—after all, it’s easy to visualize a line going straight down through Hanley’s numbers—it can wreak havoc on fantasy players who choose to use it, resulting in incorrect decisions more often than not.

To illustrate, from 1990 through 2010, 410 players (with at least 300 PA in each season) posted three-year HR/PA declines.  Check out how these players fared in the fourth year, on average:

YEAR 1   YEAR 2   YEAR 3   YEAR 4
4.0%     3.3%     2.6%     3.1%

Using trend analysis, we’d think that these players would continue their power slide, but just the opposite happened: they bounced back.  Yes, players who are on the decline should actually be expected to bounce back the following year.  While bells may go off in our heads when we see a neat little pattern in a player’s numbers, we need to fight the impulse to expect that pattern to continue.  Chances are, it won’t.
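A toy simulation shows why selecting on a three-year decline produces a bounce even when nothing real is changing. This is a sketch under heavy assumptions, not a reproduction of the study above: every simulated player has a constant true HR/PA talent, each season is that talent plus random noise (a normal approximation to sampling error over `pa` plate appearances), and the talent distribution is invented for illustration. Aging and real skill changes are ignored entirely.

```python
import random


def simulate_decliners(n_players=20000, pa=500, seed=0):
    """Average HR/PA in years 1-4 for simulated players whose observed rate
    fell in both year 2 and year 3, despite constant underlying talent."""
    rng = random.Random(seed)
    decliners = []
    for _ in range(n_players):
        # Assumed talent distribution (illustrative only), floored at 0.5%.
        talent = max(0.005, rng.gauss(0.03, 0.01))
        # Normal approximation to the sampling noise in a rate over `pa` trials.
        noise_sd = (talent * (1 - talent) / pa) ** 0.5
        seasons = [max(0.0, talent + rng.gauss(0, noise_sd)) for _ in range(4)]
        # Keep only players who look like they are "trending down".
        if seasons[0] > seasons[1] > seasons[2]:
            decliners.append(seasons)
    count = len(decliners)
    return [sum(s[i] for s in decliners) / count for i in range(4)]


y1, y2, y3, y4 = simulate_decliners()
print(f"Y1 {y1:.1%}  Y2 {y2:.1%}  Y3 {y3:.1%}  Y4 {y4:.1%}")
```

Because talent never changes in this setup, the whole "decline" is noise, and the Year 4 average climbs back above Year 3. Real players do age and change, so actual data will mix a genuine decline with this selection effect.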

beitvash
6/14
I'm a little confused by this: "basically, all players, whether they were above average or below average in one year, will perform closer to league average in the next year." Using your Hanley example, should he have been expected to regress toward the league average this year, or toward his career average? Or does this refer to his peripheral numbers?
derekcarty
6/14
Yeah, the two concepts are a little incongruent, but I wasn't sure if it was worth going into in the article. Basically, when we're projecting a player, there are two concepts at work that apply here: regression to the mean (which is back to league average) and the use of multiple years' worth of data. So for Hanley, it's not really correct to say that we're expecting him to regress back to his previous norm (writers will say this a lot, but that's not really what's happening) but rather that we're not ignoring his past performance simply because he's been trending downward. He still posted those numbers in 2009 and 2010, and they need to be considered, even if we weight them less than 2011 because they happened longer ago. Then we deal with the concept of sample size and regression to the mean. Once we weight and combine his past data, we'd regress that combined rate toward league average (or whatever player-type average we're using for Hanley), with the amount of regression based on our weighted sample size. My point in the article was just that trend analysis doesn't work. In the second table in the article, those players bounce back for two reasons: 1) Their Y1 and Y2 HR/PA tell us that they're better players than their Y3 HR/PA indicates, and 2) Even without that history, the guys with poor HR/PA in Y3 would improve anyway because they'd regress back toward the mean. Make sense?
beitvash
6/14
That does make sense. Thanks for the explanation!
jimcal
6/14
Hi Derek, one thing to point out -- given that access to information may not be a level playing field -- is that changes in a player's mechanics and playing style *may* have a bigger impact on a projection than regression to the mean, which is why we love BP authors analyzing SO/LD rates and pitching mechanics and drawing on scouting knowledge to find sustainable success. I like how you said "easy isn't always best." For a competitive fantasy baseball manager, understanding the players actually helps your game more than just looking at stat sheets, granted that the latter is what matters in fantasy. One reflection I've developed myself is that in reality baseball amazes us like a fantasy, yet in fantasy baseball we are thirsty for information about reality.
derekcarty
6/14
Yes, absolutely. We're talking in broad strokes here, but there are certainly a lot of things that need to be considered when projecting a player.
bhalpern
6/14
Is the data any different if you split those 410 up by age groups? Maybe try through age 30 in year 3, 31-35 in year 3, and 36+ in year 3.
derekcarty
6/14
Yeah, the numbers would change a bit, but the pattern would still be the same.