Last week, I wrote a piece on the social development of young baseball players (and humans in general). In the piece, I suggested that one reason that teams might employ older players who are well past their prime, to the point where they are barely replacement level, is that there might be something to the "clubhouse guy" effect, particularly on young players. Players in their early 20s are going through a seldom recognized and only recently understood period of neurological development, and in addition to being baseball players are also trying to figure out how to be adults. There might be some value to having a guy around who is… well, already an adult. Someone who could take a young player under his wing.

When I wrote that, I was thinking mostly in terms of the minor leagues, particularly for the age 18-21 set. During those early years, a player might need guidance not only on how to hit a curveball, but also on how to be a fully-grown man. An older player who has been there might be a good person to approach. Teams often talk about older veterans in terms of their possible contributions to the team off the field, even at the major-league level. Those guys are a calming influence in the clubhouse. They help keep the younger players focused.

Well, if I'm going to propose a theory like that, I should be willing to test it against the evidence. In the past, I have found some evidence that older catchers may help young pitchers in a meaningful, if minor, way. Does it work for hitters?

(As always, if you think math is witchcraft, please skip the next section and go to "The Results.")

Warning! Gory Mathematical Details Ahead!
I just realized that I haven't done a gory details piece in a while. Sorry.

Let's define some terms. I looked at all players under the age of 25 from 1993-2011 who had at least 250 PA in a season and also in the season before. I calculated their on-base percentage for both years, determining the change between years using the standardized method that I have described before. This method controls for differences in the reliability of a statistic given different sampling frames (i.e., OBP, like any stat, is more reliable after 500 PA than after 250 PA) and produces a z-score to tell how much a player has changed over time. Only players who remained with one team all year were eligible (although a player might change teams between years).
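The standardized method itself is described elsewhere and is not reproduced here, so what follows is only a rough sketch of the general idea, not the actual calculation. It assumes that each PA is an independent on-base trial, so that a season's OBP carries binomial sampling error. The function name and the sample numbers are hypothetical.

```python
import math

def obp_change_z(obp_prev, pa_prev, obp_curr, pa_curr):
    """Z-score for a year-to-year OBP change under a binomial assumption.

    Assumption (not necessarily the article's method): each PA is an
    independent trial, so a season's OBP has variance p * (1 - p) / PA.
    """
    var_prev = obp_prev * (1 - obp_prev) / pa_prev
    var_curr = obp_curr * (1 - obp_curr) / pa_curr
    return (obp_curr - obp_prev) / math.sqrt(var_prev + var_curr)

# The same 20-point OBP gain counts for more when both samples are larger.
z_small = obp_change_z(0.320, 250, 0.340, 250)
z_large = obp_change_z(0.320, 500, 0.340, 500)
print(round(z_small, 2), round(z_large, 2))
```

The point of dividing by the combined standard error is that a jump observed over 250 PA is treated as weaker evidence of real change than the same jump observed over 500 PA.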

I calculated how many hitters on our greenhorn's team were 32 or over that season (and who got 250 PA for the team during the year… no fading guys who stopped by for a quick cup of coffee), and what percentage of the hitters on the team this represented. I also coded (yes/no) whether any such veteran was on the team at all. I counted only hitters; since hitters and pitchers each tend to socialize within their own group, this seemed appropriate.

Now, there are certain players who are playing into their late 30s not because they are good at helping out younger players, but because they are still excellent at hitting baseballs. So, I slimmed back the pool of possible "mentors" to those who are clearly subpar hitters. I limited the pool to those who had an OBP of less than .310 in the season in question. I calculated how many of those were around (and whether any were around).
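The team-level predictors described above could be coded roughly as follows. This is a sketch with a made-up roster; the `Hitter` class, the field names, and the choice to use hitters with 250+ PA as the denominator for the percentage are all assumptions, since the text does not spell out the denominator.

```python
from dataclasses import dataclass

@dataclass
class Hitter:
    name: str
    age: int
    pa: int      # plate appearances for this team this season
    obp: float

def veteran_predictors(hitters, min_age=32, min_pa=250, subpar_obp=0.310):
    """Summarize the veteran presence among one team's hitters."""
    # Assumption: only hitters with enough PA count toward the denominator.
    regulars = [h for h in hitters if h.pa >= min_pa]
    vets = [h for h in regulars if h.age >= min_age]
    subpar = [h for h in vets if h.obp < subpar_obp]
    return {
        "n_vets": len(vets),
        "pct_vets": len(vets) / len(regulars) if regulars else 0.0,
        "any_vet": bool(vets),
        "n_subpar_vets": len(subpar),
        "any_subpar_vet": bool(subpar),
    }

# Hypothetical roster, not real players.
roster = [
    Hitter("Young A", 23, 520, 0.335),
    Hitter("Vet B", 34, 410, 0.298),
    Hitter("Vet C", 36, 480, 0.365),
    Hitter("Cup of Coffee D", 38, 60, 0.250),  # too few PA to count
    Hitter("Prime E", 28, 600, 0.350),
]
summary = veteran_predictors(roster)
print(summary)
```

Changing `min_age` to 35 gives the stricter definition of a veteran without touching the rest of the code.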

I ran a few different regressions. For all of them, I controlled for the previous season's OBP. A really good way to have a sudden upswing in your year-to-year OBP is to have a really down season the year before. In one regression, my dependent variable was the z-score generated above. In this way, we have an idea of how far the player has progressed in standardized terms. For all guys under 25, I looked at whether the number of hitters over 32 on their team predicted change in OBP. I did the same for the percentage of hitters over 32, the presence or absence of such a teammate, and then the number of hitters over 32 who were subpar in their own performance.
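To make the "controlling for the previous season's OBP" step concrete, here is a small least-squares sketch in plain Python. The `ols` helper and all of the numbers are hypothetical; the toy data are generated from a known line with no noise, so the fit simply recovers the made-up coefficients and says nothing about the real results. A real analysis would use a statistics package to get standard errors and significance tests.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one sequence of predictor values per observation
          (an intercept column is added here).
    Returns the coefficients [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Made-up toy data built from a known line (not real results):
# z_change = 3.3 - 10 * prev_obp + 0.1 * n_vets
prev_obp = [0.300, 0.320, 0.340, 0.360, 0.310, 0.350]
n_vets = [0, 2, 1, 3, 3, 0]
z_change = [3.3 - 10 * p + 0.1 * v for p, v in zip(prev_obp, n_vets)]

intercept, b_prev, b_vets = ols(list(zip(prev_obp, n_vets)), z_change)
print(round(intercept, 3), round(b_prev, 3), round(b_vets, 3))
```

The negative coefficient on the previous season's OBP in the toy data mimics the pattern described above: a down year tends to be followed by an upswing, and the veteran coefficient is read only after that has been accounted for.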

In addition to looking at the influence on change over time, I looked at whether any of these factors might, even if they didn't lead to amazing development, keep a young hitter from a collapse. I coded whether a player had fallen in his difference score by more than one standard deviation and ran logistic regressions using the same predictors.
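The collapse coding and the logistic regression can be sketched the same way. Everything here is hypothetical: the function name, the gradient-ascent fitting routine (a statistics package would normally do this), and the data. The toy z-scores are deliberately built so that exactly one of four players collapses at every veteran count, which means the fitted slope should sit near zero by construction; that is a property of the made-up numbers, not a reproduction of the actual analysis.

```python
import math

def fit_logistic(x, y, lr=0.1, steps=5000):
    """One-predictor logistic regression fit by plain gradient ascent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p            # gradient for the intercept
            g1 += (yi - p) * xi     # gradient for the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Made-up z-score changes, four young hitters per veteran count.
z_by_vets = {
    0: [-1.4, 0.3, 0.8, -0.2],
    1: [0.2, -1.1, -0.5, 1.0],
    2: [0.1, 0.6, -1.3, -0.2],
    3: [0.4, -0.9, 0.0, -1.2],
}
n_vets, collapsed = [], []
for vets, zs in z_by_vets.items():
    for z in zs:
        n_vets.append(vets)
        # "Collapse" = a drop of more than one standard deviation.
        collapsed.append(1 if z < -1 else 0)

b0, b1 = fit_logistic(n_vets, collapsed)
print(round(b0, 3), round(b1, 3))
```

A slope well below zero would mean that each additional veteran lowers the odds of a collapse; a slope near zero means the veteran count tells us nothing about who collapses.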

The Results
Nothing was significant. I even played around with the age at which someone was considered a veteran (raised it to 35), and how young a player had to be (moved it to 27). Nothing worked. So there's nothing to the "good guy in the clubhouse" theory, at least as far as it actually impacting player performance. Right?

There's probably someone who read that nothing was significant and thought "Ha! Another myth bites the dust!" Not so fast. This is a mistake that I see a lot in sabermetrics and have probably made myself.

There probably are cases in which an older player takes a younger player under his wing. It's hard to believe that he would do so for everyone on the team (or that everyone would need it). Also, it's probably not the case that all older players are good at that sort of thing. But it's harder to believe that it never happens, that there aren't certain guys who are good at it, and that somewhere along the line, someone benefited from that sort of mentorship. Maybe we don't know fully how to identify the good ones, but it's a bigger stretch of the imagination to deny that this effect is out there.

This exercise neatly illustrates the problem with large-N database research. It assumes that all players will respond to some set of circumstances in the same way: young player plus veteran player equals some measurable bump in performance. There are times when everyone does react the same way, and it teaches us something interesting about the way that the game works. But that sort of one-size-fits-all effect is the only type of effect that a large-N database query is capable of finding.

We can say from the analyses that I just did that the mere fact that there are seasoned vets on the team does not mean that young players will all show a positive effect. This should silence the people who indiscriminately praise the signing of any over-the-hill player as a "good clubhouse move" and immediately dream of his "veteran leadership" blanketing the kids on the team in a warm glow of OBP. It doesn't work like that. But that doesn't mean that on the micro level, we wouldn't see some sort of major (and real) effect of mentorship. We just don't have the data set that would allow us to see it.

I suppose that there are a number of other objections that one could make to the methodology (I looked only at OBP, the 24-year-olds in MLB are a very selected group, the 34-year-olds in MLB are a very selected group, the problem might be that a team that plays a bunch of older vets might have done so only because their blue chippers had a bad few months and were sent down to the minors for more experience, etc.). Those problems might be getting in the way, but I think there’s more to the results than that.

In the chat that I hosted here at BP a couple weeks ago, I was asked where I thought the next big advance in sabermetrics would come from. I answered that it would come from moving away from the large-N model and understanding each player as his own unique data set. It's an approach that has been oddly missing from the sabermetric worldview, and I think that we're the worse for it.

Excellent piece, Russell. Especially the caution toward the end. It is, indeed, one of the more common errors in sabermetrics to fail to find evidence of a relationship in a particular sample and to proclaim the death of another myth. Our mantra should be "absence of evidence is not evidence of absence." And your thoughts about moving away from large-N tests are provocative; kudos!
Good piece. It is hard to quantify the effect of veteran presence, and it may often be over-rated; but it does seem to have real value - especially in individual cases. Case in point: when the Nationals re-signed Adam LaRoche this week, Davey Johnson praised the move, using the term "clubhouse presence." Most of the analysis of the move has projected LaRoche's offensive and defensive production in comparison to that of Michael Morse - and most analysis gives an edge to Morse on that score. But one of the key reasons the Nats wanted LaRoche back was that he was the player that 19-year-old Bryce Harper hangs out with on road trips. According to published reports, the two became very close. While other teammates would go out to a bar, LaRoche and Harper would go somewhere else and talk baseball. Johnson remembers, all too well, how the careers of Dwight Gooden and Darryl Strawberry, two comparable talents, were ruined by off-the-field influences, and I am sure that he was eager to keep LaRoche around as a stabilizing "big-brother" influence for Harper. That value may be very real, but it is psychological and developmental -- hard to objectively verify and very difficult to tease out of the statistics.
I'm pretty sure no one is giving the performance edge to Morse. I don't even think the two are that comparable in that regard
Look closer. Morse was a late bloomer, but when healthy over the last couple of years he has hit at least as well as LaRoche, and often better. LaRoche is a better fielder, but Morse (though a bit of a butcher in LF) was certainly adequate at first base in 2011. Morse does have a history of injuries, but he is younger and cheaper than LaRoche. Knowing that he could substitute Morse for LaRoche with no likely loss of productivity gave Rizzo all the leverage in his contract negotiations with LaRoche. The argument for LaRoche over Morse consistently came down to his fielding and his "veteran presence."
Nice piece. I would think that the right veteran (mix of personality, credibility, playing knowledge, and people skills that are best for a certain young player) can probably help a younger player that might need help. I'm sure making it to the major leagues can be an adjustment for some players, as can learning a new role there. Charismatic teammates can probably influence some young players, for good or ill. Some younger players probably already have the right mindset and likely need little help, others are likely mostly immune to peer influence. But there are probably some players that a good veteran, through setting a good example, being friends, or advice, might help.

It also might vary by team. If the team has several bad influences, it might be very important to have some good ones, too. If the team lacks bad influences, it might be less important.

Since this outcome is going to depend on the young player, the veteran, and the overall team environment, my guess is that statistics offer little help with this, as there are too many variables that we lack information on and that are hard to quantify.
Related to this, I've often wondered if the age distribution of a roster has any detectable influence on team performance- e.g., would a team with a relatively even distribution of a handful of rookies, a handful of young post-rookies, and some early and late veterans perform better than a team made up predominantly of veterans or young regulars?

I can't imagine an easy way to statistically test this, what with the thousands of other variables involved, but one could start by dividing teams into age-distribution categories and comparing records.
...the idea being that players at different ages bring different perspectives/advantages, and the more perspectives there are, the better for the team as a whole.
This is the sort of line of work that would have to start with a "direction before precision study." Age and spread of age are easy to figure (mean and SD), and that's a good starting place.
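As a starting point for that kind of "direction before precision" look, the mean and SD of roster ages take only a few lines. The rosters below are made-up ages, and using the population SD (treating the roster as the whole group of interest) is an assumption.

```python
from statistics import mean, pstdev

def age_profile(ages):
    """Return (mean age, population SD of age) for one roster."""
    return mean(ages), pstdev(ages)

# Hypothetical rosters: one spread across ages, one veteran-heavy.
balanced = [22, 24, 26, 28, 30, 32, 34, 36]
veteran_heavy = [31, 32, 33, 33, 34, 34, 35, 36]

balanced_mean, balanced_sd = age_profile(balanced)
vet_mean, vet_sd = age_profile(veteran_heavy)
print(balanced_mean, round(balanced_sd, 2))
print(vet_mean, round(vet_sd, 2))
```

Teams could then be grouped by these two numbers and their records compared, as suggested above.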
Good article. This question, like many questions that people try to tackle in sabermetrics, is probably fundamentally not addressable with OBP (or WAR, or VORP or whatever).

Veterans, club house chemistry, and so on and so forth, these and umpteen other factors, they affect different players in different ways. Perhaps a veteran leader shows a kid a couple of tricks to deal with the daily grind, extending his longevity. Perhaps being in a series of good clubhouses encourages a player to keep his love of baseball and hence good habits... Perhaps... Perhaps... Perhaps...

I love digging into the fog of this game we all love, but as a wise sabermetrician advised, it must be respected. As highly trained, professional, and mechanically perfect as baseball players may be, they're still people, with emotions, preferences, tastes, quirks, and flaws.
Methodological question-

When you controlled for the first season's OBP, did you regress the change in OBP (as a dependent variable) against the first year (explanatory variable)? This will give a biased result. The correct way is to use the sum (or average) of the OBP for both years as the explanatory variable.

You probably know better, but I see this error all the time.
This was something of a strategic methodological choice on my part. The biggest danger that I saw was random extreme variations (i.e., career years) distorting year-to-year improvement stats, so I only controlled for year-1 raw OBP. Your approach is perfectly defensible. I just went a slightly different direction.
I gather you are evaluating the proposition that the improvements to the young players happen in the season the veteran is kept around. What about in later seasons? If the veteran spending a season with the younger player is a "treatment" for immaturity, might not the effects come out later?
A reasonable hypothesis.
Some supposed "veteran" influences turn out to do the reverse - see the Omar Vizquel incidents in Toronto last season. The oldest man in baseball ended up throwing management under the bus for a lack of direction on the team, failed to point out Yunel Escobar's faux pas when he should have known better, etc.

On the flip side, look at the recent stuff emerging about 21 year old hockey star, Tyler Seguin -
You're only capturing about 50% of Mark Kotsay's last 4 years in the bigs with your sample. Any methodology should start with him and work backwards to define the set.
I totally agree with your last paragraph. People are so eager to claim that outliers will not repeat that they miss the very real possibility that a certain human being might behave abnormally in certain situations due to a very real and repeatable cause. Josh Hamilton cannot possibly have persistent day/night splits because most players do not. Jered Weaver is apparently no more likely than the average pitcher to outperform at home next year. Justin Upton's ridiculous career home/road splits: sheer noise. Luke Hochevar: chronically unlucky, mostly because Hellickson stole all his luck. The list goes on and on.

One reason I am confident that people have gone off the deep end in applying generalizations to tail cases is that I have made a living betting on baseball for the past 10 years, and it's been in no small part due to taking the other side of these inaccurate predictions of mean reversion (provided the sample is large enough to refute it).

I hope someone here or elsewhere can devote an entire series to examining serious career outliers (in FIP vs. ERA, handedness splits, etc.) apart from the rest of the population and determine if such players should be lumped in with the rest of the player universe, or rather if we should be applying different predictions of regression.

Well said, Evo, and in the case of Hellickson, Jason Collette wrote an awesome article on his particular tail-wagging, to which I chimed in with a couple of additional details (with GIF's!).

Also, watch Ricky Nolasco pitch if you want to see an outlier example of the difference between ERA and FIP - his (ERA-FIP) for the past four seasons have been +0.56, +1.17, +0.62, and +1.75. The guy doesn't miss his targets by much, but he misses them all the time, and missing your spots within the strike zone will get you hammered in the show. The result is few walks but a ton of hard-hit baseballs, a factor that slips beneath the radar of any metric that is rooted in box-score stats.

Speaking of which, I think the greatest limitation in today's statistical playground is the error inherent in our input variables - how can we properly tease out the roles of pitching and defense on balls-in-play when we are using a data set that treats a bunt single the same as a screaming liner over the second-base bag?
I also believe that certain pitchers have proven to have outlier BABIP's that are sustainable (Kershaw/Cain), and that it is explainable due to the high incidence of weak contact. There are other pitchers that do this to a lesser extreme, but BABIP is limited by the imperfect inputs problem and thus only catches the outliers - one needs HITf/x to quantify the quality of contact in order to establish significant thresholds.

Luckily, Mike Fast covered that topic for us before bolting for more humid pastures.

Excellent work, as per usual. I meant to ping you when I saw the chat comment, and I couldn't agree more about the large-N model.

I remember learning something in stats class about applying the results of large-N studies towards similarly robust samples - if 500 players tend to regress toward a mean in one season, then one might expect a similar sample to follow the same trend. But the model breaks down on a case-by-case basis.

Besides, individualized analysis opens up so many fun research questions, many of which don't enter the analytic framework of a large-N model.
To break that into stat terms, we over-estimate the variance that our models explain. There are many factors that we don't (and in some cases, can't) assess.
Really interesting point, Russell — I was intrigued by your quick comment about large-N research in the chat, and this explanation adds some clarity to it.

But do you see people moving away from these studies toward more case-by-case research? It's hard for me to imagine how to even structure such studies. Longitudinal data focused on individual performance, perhaps?
Interesting article, Russell, but I wonder if the 250 AB minimum for the geezers isn't a fatal problem. Why assume that the mentors are the geezers who actually play a lot? In my mind, a prototypical mentor is the old pinch hitter type who gets ~100 ABs a year. Jason Giambi is a current example.