July 8, 2014
Survival of the Fittest: Position Players
In last week’s article, I extended my approach of survival modelling to examine what the early career success of a pitcher can tell us about his long-term survival in MLB. Despite the inherent randomness of the pitching profession in the age of Tommy John surgery, I discovered that by far the best predictor of a long career was the age at which a player debuted in MLB. Besides debut age, the abilities to rack up strikeouts and avoid walks meant the most for a pitcher’s long-term career outlook.
I turn the same method now to position players. Position players have different risks from pitchers, and a different set of career arcs. Position players are less likely to be hurt, and more able to continue their careers in the face of injury by moving down the defensive spectrum. What’s more, whereas a pitcher contributes the vast majority of his value from his pitching, a position player might be great in several different ways: by hitting, by fielding, or even (to a lesser extent) by baserunning.
Even with all of that noted, there is a certain obvious commonality to the career success of both hitters and pitchers. In general, the earlier a position player debuts, the longer he’ll last.
As was the case last week, these Hazard Ratio numbers represent the relative risk that a player’s career ends at any given point. Numbers above one imply greater risk with greater levels of the corresponding variable, and numbers below one imply less risk. A player who debuts at 24 years old instead of 23 is therefore 1.26 times as likely to end his career at some (arbitrary) time in the future, all else being equal.
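To see how a per-year hazard ratio scales over larger age gaps, here is a minimal sketch. It assumes the proportional-hazards form common to survival models, in which the ratio compounds multiplicatively with each additional unit of the variable; the exact model is not spelled out here, and the function name `relative_hazard` is hypothetical.

```python
def relative_hazard(hazard_ratio_per_unit: float, units: float) -> float:
    """Relative risk of a career ending for a player `units` higher on a variable.

    Assumption (not confirmed by the article): a proportional-hazards model,
    where the hazard ratio compounds multiplicatively per unit.
    """
    return hazard_ratio_per_unit ** units


# 1.26 is the per-year debut-age hazard ratio quoted in the text.
one_year_later = relative_hazard(1.26, 1)
three_years_later = relative_hazard(1.26, 3)

print(f"Debut 1 year later:  {one_year_later:.2f}x the risk")
print(f"Debut 3 years later: {three_years_later:.2f}x the risk")
```

Under that assumption, a three-year gap in debut age roughly doubles the risk of a career ending at any given point.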
As I learned with pitchers, the best predictors are by far the age at debut of a young hitter and how much playing time he gets in his first three years in the league. A prospect who debuts early and plays often is probably pretty good, and likely to last for a long time. Debut age is so significant for predicting career length that even integrating the month in which a player was born provides a small but significant coefficient. In other words, even as tiny a difference in age as a month between players tells us something about their relative skill levels.
Debut age doesn’t just predict career length. It has a great deal to do with a player’s long-term success in the league. Indeed, regressing a player’s seasonal average VORP on the age at which he debuted shows that each additional year of debut age costs, on average, about 0.5 runs per season. A player debuting at age 25 projects by this to be worth 2.5 runs less per year than one who debuts at 20, for instance.
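The arithmetic behind that projection is a straight line in debut age. The sketch below uses the approximate half-run-per-year slope quoted above; the function name is hypothetical, and the slope is only the rounded figure from the text, not a fitted coefficient.

```python
def projected_vorp_gap(later_debut_age: int, earlier_debut_age: int,
                       runs_per_year: float = 0.5) -> float:
    """Projected seasonal VORP deficit (in runs) of the later-debuting player.

    runs_per_year is the approximate slope quoted in the text.
    """
    return runs_per_year * (later_debut_age - earlier_debut_age)


gap = projected_vorp_gap(25, 20)
print(f"Debut at 25 vs. 20: {gap} runs per season")
```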
Prospect scholars have known for some time that being young for your level is an important trait, and the early premieres of players like Mike Trout and Bryce Harper were noted for their historic significance. So I won’t claim that the idea of the importance of debut age is especially novel. But that age at debut is so strongly associated with career VORP once again underscores its weight.
However, it’s true that there is the occasional highly touted teenage prospect who makes his entrance into MLB in less-than-successful fashion. As with pitchers, we might reasonably believe that the skills a player shows in his first few* seasons could portend the length and eminence of his career. Roughly speaking, we could divide hitting into two interrelated skills: power (measured by ISO) and plate discipline (measured by strikeout-to-walk ratio), and test how those things predict career length. Additionally, I throw in batting average, as it has historically been an important tool for evaluating players.
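For concreteness, here is a rough sketch of how those three inputs could be computed from counting stats and then averaged over a player’s first three seasons, weighted by plate appearances. The `Season` fields, the helper names, and the sample numbers are all made up for illustration; only the ISO and SO/BB definitions are standard, and the exact weighting used for the model may differ.

```python
from dataclasses import dataclass


@dataclass
class Season:
    """Counting stats for one season (hypothetical field names)."""
    pa: int
    ab: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    strikeouts: int


def batting_average(s: Season) -> float:
    return s.hits / s.ab


def iso(s: Season) -> float:
    # Isolated power = SLG - AVG, i.e. extra bases per at-bat.
    return (s.doubles + 2 * s.triples + 3 * s.home_runs) / s.ab


def so_bb_ratio(s: Season) -> float:
    # Undefined for a season with zero walks; not handled in this sketch.
    return s.strikeouts / s.walks


def pa_weighted(seasons, stat) -> float:
    """Average a per-season stat, weighting each season by its plate appearances."""
    total_pa = sum(s.pa for s in seasons)
    return sum(stat(s) * s.pa for s in seasons) / total_pa


# Made-up numbers for illustration only; not a real player.
first_three = [
    Season(pa=200, ab=180, hits=45, doubles=9, triples=1, home_runs=4,
           walks=15, strikeouts=45),
    Season(pa=500, ab=450, hits=126, doubles=25, triples=3, home_runs=15,
           walks=40, strikeouts=80),
    Season(pa=600, ab=540, hits=162, doubles=30, triples=4, home_runs=22,
           walks=50, strikeouts=90),
]

for name, stat in [("AVG", batting_average), ("ISO", iso), ("SO/BB", so_bb_ratio)]:
    print(f"{name}: {pa_weighted(first_three, stat):.3f}")
```

Weighting by plate appearances keeps a brief September call-up from counting as much as a full season.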
Batting average is bizarrely well-associated with career length. I threw it in expecting little relevance, because as well-informed baseball fans know, batting average is quite volatile and generally a poor indicator of offensive proficiency. But even though we know all that now, the data I’m working with is from a time before DIPS and BABIP, when batting average meant something more substantial. I saw a similar phenomenon with ERA and pitchers, in which ERA seemed unreasonably valued relative to strikeouts and walks.
Elsewhere, power (ISO) seems more consistently valuable than plate discipline (as measured by SO/BB ratio). Perhaps that’s because power is more of an ability than a skill; whereas plate discipline can be taught, power is something that’s largely innate (with obvious exceptions). Players who start out without power aren’t likely to suddenly develop it, and their plate discipline may suffer because pitchers aren’t afraid of the thunder in their bats. Alternatively, maybe it’s because front offices are unreasonably enamored of the potential of a powerful player, and willing to give such a player a few more chances than an equivalent player with good plate discipline.
Position players also contribute significant value to their teams as fielders and baserunners, respectively saving and stealing runs. I test the correlation of these skills with career length by looking at the early career FRAA and BRR of each player.
Both baserunning and fielding affect career length in a positive way. If there’s anything to the idea that speedsters age quickly or somehow have shorter careers, I can’t detect it here; baserunning is good for career length, as far as the model is concerned (although only weakly so). Fielding is significantly discounted (in terms of its career-prolonging ability) relative to hitting. However, both fielding and baserunning are worth roughly the same, which is comforting, since both are measured on a run-based scale.
Put all of the above factors together, and you arrive at the following regression.
In the same model, all of the factors combined explain about 35 percent of the variation in career lengths between position players. Noting here that I’m using only a player’s stats from his first three years in the league, it’s remarkable that so much can be learned about a player from only his initial exposure in the league.
Another point to be made here is that the R² value is much higher than the equivalent one for pitchers (.177). That finding shows how much less reliable pitchers are generally. A pitcher is far more likely to be good early in his career and then flame out (Dwight Gooden), or conversely emerge from nowhere (Cliff Lee), than a position player.
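As a reminder of what “explaining variation” means, the sketch below computes the ordinary least-squares version of R² from observed and predicted values. This is an assumption for illustration: the figure reported for a survival model may be computed differently, and the career lengths here are invented.

```python
def r_squared(observed, predicted) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_resid / ss_total


# Invented career lengths in years, for illustration only.
career_years = [2, 5, 9, 14, 20]

# A model that predicts every career exactly explains all the variation...
perfect_model = list(career_years)
# ...while one that always guesses the average explains none of it.
average = sum(career_years) / len(career_years)
mean_only_model = [average] * len(career_years)

print(r_squared(career_years, perfect_model))
print(r_squared(career_years, mean_only_model))
```

A value of about .35 sits between those two extremes: the early-career inputs capture roughly a third of the spread in career lengths, leaving the rest unexplained.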
However, the combined regression involves inputs from all sorts of different component statistics. It’s ugly and complicated. Maybe there’s a shortcut of sorts. If only there were some handy, all-encompassing metric which would put together all of (well, most of) the different ways a player could be valuable into one number, I could just use that. Enter VORP.
By merely subbing the first three years of a player’s VORP, I can attain nearly the same R² value as with five statistics describing various means by which players contribute value. Even though VORP clearly wasn’t used in the era I’m examining (1950-1990), it still does a good job of predicting the lengths of player careers from that time.
Often, though, the reality when prospects actually reach the majors is a lot more complicated. They are suddenly up against the very best athletes, and even the most celebrated can struggle (check out Mike Trout’s first season). Suddenly, what looked like a Hall of Famer in Triple-A ball can seem sub-replacement level in the majors. As those top 50 prospects begin to trickle into the Show for their cups of coffee and eventual, full-time promotions, it becomes paramount for an organization (and fun for a fan) to predict whether those rookies are destined for lengthy careers or brief flameouts.
For all the differences between pitchers and position players, the strongest influences on career length remain largely the same: the age of debut and the early career playing time are far and away the best predictors. As I note above, the importance of debut age is both well known and still underappreciated. While we do often consider the age of a prospect, it can be easy to forget how young a player is or how early he made his debut (Bryce Harper: shockingly still just 21). After all, a player at age 20 looks much like a player at age 23 or 24. Age is not a stat reported on scoreboards, but this and other research suggest that it is crucial nonetheless.
Beyond age, however, the reality of a position player’s career is well-described by VORP. No matter how he contributes value, whether it be by stolen base, a hefty bat, or wizardry with the glove, that value** translates into a longer and more productive career. Still, if your chosen rookie struggles, rest easy with the knowledge that there’s still a lot of unpredictability left in these regressions. Some of the ultimate fate of an emerging prospect is probably tied not to his tools or his ranking, but his mental aptitude and ability to adjust to a novel competitive environment—a set of skills that batting average and VORP are both poor at measuring.
Thanks to Rob McQuown for research assistance.
*I use the statistics of each player’s first three seasons, weighted by their number of PAs in these three seasons.
**Though there are some marginal effects—power might be a little more important than early career plate discipline, for instance.