February 15, 2010
That Peak Age Thing, Part 1
My grandfather used to say that in heaven, everyone was 25. He figured that was the perfect age in life. You're old enough that you're not a kid any more, but young enough to enjoy everything. Grandpa lived to age 93, and more than six years later, I still miss the guy. This one's for you, Grandpa.
So what's the perfect age to be if you're a baseball player? For a while now, there's been a small brouhaha going between those who say that the peak age is 27 and those who hold out for age 29. Now that I'm past both of those landmarks myself, it doesn't seem like that big a difference, but in a profession where a player might play six years if he's good, knowing which of those six years will be his best is vital to a team.
The problem with doing this sort of work is that baseball is a logistical nightmare in terms of doing well-controlled research. Players are not selected at random (like I used to teach my research methodology students) and there is a severe bias in who gets to play and who doesn't. Indeed, we have an entire genre of radio which exists for people to call in and complain when a manager plays the not-so-good guys. Still, the joy of doing research … and yes, there is joy in doing research … is being able to crack some of these issues, despite the fact that they drive you nuts.
As someone who has dealt with more children than I care to mention (and that was before I became a dad), this question of "peak age" struck me as a development question, the same way that I'm often asked questions about whether Junior (no, not Ken Griffey Jr.) is on target with his developmental milestones. But, I wasn't comfortable with one of the hidden assumptions, one built very deeply into how our culture perceives development, which people tend to make in this line of research. We assume that players develop in a gradual and relatively uniform manner consistent with their age. It's a one-size-fits-all approach that's reflected in the other major developmental measurement that's a common feature in our society, schools.
Kids who are 12 belong in sixth grade and are all roughly at the same point in life, right? Maybe not. Kids develop in different ways and at different rates. Go to any sixth grade classroom, and you'll see that the idea of uniform development is preposterous. Kids hit puberty at different ages; girls hit puberty before boys, and it's all on display for you right there in your average sixth grade homeroom. Sure, in the aggregate, kids at 12 years old are "middle school" material. But what about this individual kid?
In education and in child development more generally, if kids aren't learning or developing as quickly as we'd like them to, it's not legal to just remove them from the population. But in baseball, that's exactly what happens. Players who develop quickly are politely invited to be part of the team. Players who don't develop so well are simply sent packing.
So I propose that we first look at this question of peak age from the other direction. When do players generally become just good enough to become regular players in MLB, and when do they stop being good enough? I took all players who started their careers after 1980 and ended their careers before 2009. Only seasons in which the player had at least 200 plate appearances counted. It left a sample just shy of 1,000 batters (997, to be exact). It's not a surprise that most players debut some time between their age-23 and age-27 seasons (using April 1 age). What surprised me was the distribution for when players left the game. Take a look:
There's a spike at age 27 for players leaving the game, but then after that, the rate of attrition falls for a few years and then spikes again at age 31. Odd.
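For anyone who wants to rebuild a distribution like that one, here is a minimal Python sketch of the sample filter and the exit-age count. The record layout (player, year, age, plate appearances), the toy rows, and the reading of "after 1980" and "before 2009" as a 1981-2008 window applied to every season on record are all assumptions made for illustration, not the actual dataset or code behind the study.

```python
from collections import Counter, defaultdict

MIN_PA = 200  # threshold for a "regular" season, taken from the text


def qualifying_careers(seasons, first_year=1981, last_year=2008):
    """Return {player: sorted [(year, age), ...]} of 200+ PA seasons.

    `seasons` is an iterable of (player, year, age, pa) tuples, a
    hypothetical layout. A player is kept only if all of his seasons
    fall inside the window (an assumed reading of the sample rule).
    """
    all_years = defaultdict(list)
    regular = defaultdict(list)
    for player, year, age, pa in seasons:
        all_years[player].append(year)
        if pa >= MIN_PA:
            regular[player].append((year, age))

    kept = {}
    for player, rows in regular.items():
        years = all_years[player]
        if min(years) >= first_year and max(years) <= last_year:
            kept[player] = sorted(rows)
    return kept


def exit_age_counts(careers):
    """Count players by the age of their last qualifying season."""
    return Counter(rows[-1][1] for rows in careers.values())


# Made-up rows, purely to exercise the functions.
toy_seasons = [
    ("A", 1985, 24, 450), ("A", 1986, 25, 600), ("A", 1987, 26, 150),
    ("B", 1979, 23, 500), ("B", 1981, 25, 520),  # career starts too early
    ("C", 2005, 26, 180),                         # never reaches 200 PA
    ("D", 1990, 26, 410), ("D", 1991, 27, 388),
]
careers = qualifying_careers(toy_seasons)
exits = exit_age_counts(careers)
```

With real data, plotting `exits` by age would give the kind of attrition curve described above.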
I took a look at when players left the game as a function of the age at which they debuted. I wanted to make sure that these twin spikes were not some sort of selection artifact based on debut age. Looking individually at every debut age group, there was a similar pattern. Generally, there were attrition spikes around age 27 or 28, and then again around 31 or 32. And then there was one other spike I noted. The most dangerous year for attrition for a batter is his first year. For example, more than 30 percent of players who debut (i.e., have their first 200-plate appearance season) at the age of 26 don't have another season in which they get regular reps. So, there seem to be three major winnowing periods in baseball for batters: the first year, age 27, and age 31.
Not shockingly, players who made their debut younger tended to be the guys who stuck around longest and were most likely to clear those three hurdles. They also tended to be better players. Indeed, a quick stroll through the "survival rates" for each of the groups in the study is enlightening. You can read the chart below as "of all of the players who debuted at age 24, X percent of them survived (had another season of 200-plus plate appearances) past their first year, Y percent survived past age 27, and Z percent survived past age 31."
                     Survived Past
             ------------------------------
Debut Age    First Year    Age 27    Age 31
0-22            93.8%       72.5%     47.8%
23              88.1%       64.9%     39.3%
24              81.7%       64.0%     35.5%
25              82.9%       61.7%     33.6%
26              69.1%       55.5%     29.2%
27              72.0%       72.0%     33.3%
28              67.3%         --      33.7%
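A table like this can be computed with a few lines of Python. What follows is only a sketch under stated assumptions: each career is represented as the sorted list of ages at which the player had a 200-plus plate appearance season (a hypothetical layout), "survived past age N" is taken to mean a qualifying season at an age greater than N, and the four toy careers are invented.

```python
from collections import defaultdict


def survival_table(careers):
    """Percent of each debut-age group clearing each hurdle.

    `careers` maps a player to the sorted ages of his 200+ PA seasons
    (hypothetical layout). The age-27 figure is not meaningful for
    groups that debut after 27, so callers should ignore it there.
    """
    groups = defaultdict(list)
    for ages in careers.values():
        debut = ages[0]
        groups["0-22" if debut <= 22 else str(debut)].append(ages)

    table = {}
    for group, members in groups.items():
        n = len(members)
        table[group] = {
            # Another qualifying season after the debut season
            "first_year": 100.0 * sum(len(a) > 1 for a in members) / n,
            # A qualifying season after age 27 / after age 31
            "age_27": 100.0 * sum(a[-1] > 27 for a in members) / n,
            "age_31": 100.0 * sum(a[-1] > 31 for a in members) / n,
        }
    return table


# Invented careers, all debuting at 24.
toy_careers = {
    "A": [24, 25, 26, 27, 28],
    "B": [24],
    "C": [24, 25, 26, 27, 28, 29, 30, 31, 32],
    "D": [24, 25],
}
table = survival_table(toy_careers)
```

In this toy group, three of four players clear the first-year hurdle, two clear age 27, and one clears age 31.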
The fact that the players survived past these hurdles says something about their relative quality. Teams do not hang on to 30-year-olds with no skills. But, do members of these groups peak at different times? To find that out, I went into the statistical toolbox and pulled out one of my favorites. Remember, if you don't like statistical gore, just say "and then a miracle happened" and skip to "the results."
Warning! Gory Methodological Details!
I used a mixed linear model, with one fixed factor: age. I also used an AR(1) covariance matrix (auto-regressive, first order). This type of covariance matrix comes in very handy in this type of research, because it specifically corrects for the fact that we have several repeated observations for the same player. This is important because there are some players who are in the sample at age 27, but not at age 28 (because they "retired"). The covariance matrix sniffs out the fact that the group still present at 28 was better at age 27 than the retirees and corrects for it when spitting out the relevant output.
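To make the AR(1) idea concrete: under that structure, the correlation between two of a player's seasons is assumed to shrink geometrically with the number of seasons separating them, so adjacent seasons correlate at rho, seasons two apart at rho squared, and so on. The short Python sketch below builds such a correlation matrix. The value of rho is arbitrary and chosen only for illustration; it is not an estimate from this study, and the full model would scale the matrix by a residual variance.

```python
def ar1_correlation(n_seasons, rho):
    """First-order autoregressive correlation matrix.

    Entry (i, j) is rho ** |i - j|: 1 on the diagonal, rho for
    adjacent seasons, rho squared for seasons two apart, and so on.
    """
    return [
        [rho ** abs(i - j) for j in range(n_seasons)]
        for i in range(n_seasons)
    ]


# Four consecutive seasons with an illustrative rho of 0.5.
matrix = ar1_correlation(4, 0.5)
for row in matrix:
    print(row)
```

The appeal of the structure is that a single parameter describes how quickly a player's seasons stop resembling one another.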
My dependent variable was OPS. (Yes, I know I didn't use your favorite No.1 measure for a player; fire when ready.) The output that comes out the other end can be read "if you took an average player from the sample, and only told me his age, I would expect that his OPS for the year would be X." Of course, we'd know more than just a player's age, but the point is to come to some sort of aggregate conclusion.
I split the players in my dataset up, again by debut age and by the last "hurdle" (first year, age 27, age 31) that they cleared. So, we may have a player who debuted at 24, and made it past 27, but not to 31. If he didn't clear the "first year" hurdle, then he only played one year, which is, by definition, his best (and worst) year. For each group, I found the age at which the model's predicted OPS was highest. The numbers here are peak ages.
                  Last Hurdle Cleared
             ------------------------------
Debut Age    First Year    Age 27    Age 31
0-22             24          26        31
23               25          26        30
24               26          27        31
25               25          27        28
26               26          29        29
27               --          27        29
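Each cell in that table comes from reading off the age with the highest predicted OPS for one group. As a minimal Python sketch of that last step, with predicted values that are invented for illustration and are not output from the model:

```python
def peak_age(predicted_ops_by_age):
    """Return the age whose predicted OPS is highest.

    If two ages tie, the one that appears first in the mapping wins.
    """
    return max(predicted_ops_by_age, key=predicted_ops_by_age.get)


# Invented predictions for a single debut-age/hurdle group.
predicted = {24: 0.712, 25: 0.731, 26: 0.748, 27: 0.755, 28: 0.751, 29: 0.740}
peak = peak_age(predicted)
```

Note how flat the made-up curve is near the top. When neighboring ages differ by only a few points of OPS, the reported peak can shift by a year with small changes in sample or method.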
Players who stay in the league longer have later peaks, roughly around the age of 29 or 30, which is what J.C. Bradbury found using a sample that included players with longer careers (minimum 5,000 plate appearances, which is roughly eight years at 600-ish PA per year). Those players who play only into their early 30s and who comprise the plurality of players in MLB have peaks around age 27. Those who espouse the age-27 model (Mitchel Lichtman being only the most recent) generally use models that are variations on "what's the most common age to hit the high point?" No wonder they get 27.
Another surprising finding is that, among good-but-not-great players (those who made it past age 27, but not past 31), those who debut later tend to peak later. There's no such thing as one magical age where all forward motion stops. Some guys are later bloomers, and have the same sort of arc as others … they just do it later in life. Those who stick around for a long time, however, show the opposite pattern. In that group, those who debut earlier have later peaks. What to make of this two-trajectory model?
It's tempting to say that players who come up early on the phenom track are a riskier lot. If they have a long career, they're likely to have a longer arc of improvement. But, if they have a short to mid-range career, they'll peak quicker. However, we've seen that earlier debut generally heralds a greater chance of a longer career. When they do flame out, it's generally a bigger fireball, but the chances of a fireball are actually lower. It's a tradeoff.
The phenom track can be compared to the Brook Jacoby track. (For those who didn't spend the '80s in Cleveland, Jacoby was the good, serviceable corner infielder for the Indians who actually made a couple of All-Star teams.) In general, it seems that some players come up in their mid-20s, have a two- to three-year period where they improve, and then fall back to earth (and out of baseball) by the time they hit age 30. The two- to three-year period appears to be constant. It's just a matter of when they bloom. Of course, the problem is that when a player is coming up, we have no way to know which track he will fall on. His debut age does give us some idea, but it's not a guarantee.
When Bill James originally took up this question, he suggested that players generally peak earlier than is generally thought, and decline more rapidly than is generally thought. He might have inadvertently been picking up on a wrinkle in how people think about the game. The good players do peak around 29, and those are the players about whom we first think. The great unwashed mass of players peak earlier.
The obvious take-home from this study is that method and sample will affect the answer to the question "at what age does a player peak?" I'd argue that this very fact means that the discussion of the one age for player peaks is actually kinda silly. Even beyond the usual cries that "You have to treat everyone as an individual!", assigning one number to "peak age" vastly oversimplifies the situation. Sure, if we're playing a probability game of "given no other information than his age, when can we expect this guy's peak?", then 27 is the best guess.
But to a team making a multi-million dollar bet on a free agent, it's also the type of number that has the illusion of being a lot more informative than it really is. There are some concepts that can be reduced to a simple rule of thumb, and while the rule obscures the details, it's easier to employ than having to sort through the mess of data. I don't think this is one of those cases. Player development works in a much more complicated way than is generally thought.