September 5, 2012
How Much Team Age Matters
Alas! for this gray shadow, once a man—
There is an uneasy overlap between sabermetric analysis and forecasting things to come. To be sure, not all prognostication (not even most of it, I would say) comes from sabermetricians, people who would call themselves sabermetricians, or even people who are well versed in the work of sabermetricians. At the same time, the sort of skillset and temperament required to do sabermetrics frequently leads one to the conclusion that predicting baseball is hard and that the sum of what we don’t know about the future often exceeds the sum of what we do know.
But predicting the future is sometimes very useful and often very interesting; while it is not the only application of sabermetrics (something I feel we should emphasize more often), it is certainly a valid field of inquiry. And one thing sabermetrics does very well is examine previously held conceptions about the game objectively and quantifiably.
What I want to examine is how a team’s age should affect our projection of its future performance. The common assumption is that, all else being equal, being young is better than being old. A young team has promise yet unfulfilled, while an old team is on the decline and needs to consider rebuilding. But is this really true? And how much does it matter?
I took all teams from 1950 on and figured their winning percentage and the average age of their batters (weighted by plate appearances, omitting pitchers hitting) and pitchers (weighted by innings pitched). Then I found the record of the franchise over the next five seasons (counting a team’s results even if it changed its name or city in the interim, figuring that roster composition is more vital than geography or nomenclature for our purposes here). Then I ran an ordinary least squares regression to see how these factors worked together to predict future wins. (The results were similar when I limited the scope of the study to the next three seasons.)
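For readers who want to try something similar, here is a minimal sketch of that setup in Python with NumPy. It is an illustration under stated assumptions, not the code behind this study: the function names `weighted_age` and `fit_ols` are hypothetical, and the sample numbers are made up for the example.

```python
import numpy as np


def weighted_age(ages, weights):
    """Playing-time-weighted average age.

    For batters the weights would be plate appearances; for pitchers,
    innings pitched.
    """
    ages = np.asarray(ages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(ages * weights) / np.sum(weights))


def fit_ols(win_pct, bat_age, pit_age, future_win_pct):
    """Ordinary least squares with an intercept.

    Returns the coefficients in the order
    [intercept, win_pct, bat_age, pit_age].
    """
    # Design matrix: a column of ones for the intercept, then the predictors.
    X = np.column_stack([np.ones(len(win_pct)), win_pct, bat_age, pit_age])
    y = np.asarray(future_win_pct, dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


# Made-up two-batter roster: a 25-year-old with 300 PA and a 35-year-old
# with 100 PA. The younger player gets three times the weight.
print(weighted_age([25, 35], [300, 100]))  # 27.5
```

Each row passed to `fit_ols` would be one team-season: its winning percentage, its two weighted ages, and the franchise's winning percentage over the following five seasons.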
I should speak briefly as to what a regression like this does and how to interpret the results. A linear regression is so named because it outputs a linear formula that takes the following form: