September 5, 2012

Manufactured Runs

How Much Team Age Matters

by Colin Wyers

Alas! for this gray shadow, once a man—
So glorious in his beauty and thy choice,
Who madest him thy chosen, that he seem'd
To his great heart none other than a God!
I ask'd thee, "Give me immortality."
Then didst thou grant mine asking with a smile,
Like wealthy men who care not how they give.
But thy strong Hours indignant work'd their wills,
And beat me down and marr'd and wasted me,
And tho' they could not end me left me maim'd
To dwell in presence of immortal youth,
Immortal age beside immortal youth,
And all I was in ashes.
- “Tithonus,” by Alfred Tennyson

There is an uneasy overlap between sabermetric analysis and forecasting things to come. To be sure, not all prognostication (not even most of it, I would say) comes from sabermetricians, people who would call themselves sabermetricians, or even people who are well versed in the work of sabermetricians. At the same time, the sort of skillset and temperament required to do sabermetrics frequently leads one to the conclusion that predicting baseball is hard and that the sum of what we don’t know about the future often exceeds the sum of what we do know.

But predicting the future is sometimes very useful and often very interesting; while it is not the only application of sabermetrics (something I feel we should emphasize more often), it is certainly a valid field of inquiry. And one thing sabermetrics does very well is examine previously held conceptions about the game objectively and quantifiably.

What I want to examine is how age affects how we project a team’s performance going forward. The common assumption is that, all else being equal, being young is better than being old. A young team has promise yet unfulfilled, while an old team is on the decline and needs to consider rebuilding. But is this really true? And how much does it matter?

I took all teams from 1950 on and figured their winning percentage and the age of their batters (weighted by plate appearances, omitting pitchers hitting) and pitchers (weighted by innings pitched). Then I found the record of the franchise over the next five seasons (counting a team’s results even if it changed names or city in the interim, figuring that roster constitution is more vital than geography or nomenclature for our purposes here). Then I ran an ordinary least squares regression to see how these factors worked together to predict future wins. (The results were similar when I limited the scope of the study to the next three seasons.)
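To make the weighting concrete, here is a minimal Python sketch of a playing-time-weighted average age. The function name and the three players are hypothetical, invented for illustration; this is not the code or the data behind the study.

```python
def weighted_mean_age(players):
    """Average age weighted by playing time.

    players: list of (age, weight) pairs, where weight is plate
    appearances for batters or innings pitched for pitchers.
    """
    total_weight = sum(weight for _, weight in players)
    if total_weight == 0:
        raise ValueError("no playing time to weight by")
    return sum(age * weight for age, weight in players) / total_weight


# Hypothetical batters (age, plate appearances); not real data.
batters = [(24, 600), (30, 300), (36, 100)]
team_age = weighted_mean_age(batters)
print(team_age)
```

The 24-year-old regular pulls the team figure well below the simple average of the three ages (30), which is the point of weighting by playing time.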

I should speak briefly as to what a regression like this does and how to interpret the results. A linear regression is so named because it outputs a linear formula that takes the following form:

y = m*x + b

Where y is what we want to predict, x is what we’re using to predict y, m is the coefficient (also called the slope, when we have only one predictor variable) and b is the constant (which can also be called the intercept).

In this case, with three predictors, you have three terms, each one a predictor multiplied by its own coefficient. Each of those coefficients comes with a p-value, a rough measure of how confident we can be that the effect is not just noise. As a shorthand, we can say that p-values below 0.05 typically indicate statistical significance (this is a useful shorthand, but it should not be taken as a dogmatic rule).
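Written out for the three predictors used here, the formula would look something like the following. The subscripted symbols are labels chosen for illustration, not notation taken from the regression software; y is the franchise's winning percentage over the following five seasons.

```latex
y = b + m_1 x_1 + m_2 x_2 + m_3 x_3,
\qquad
x_1 = \text{W\_PCT},\quad
x_2 = \text{AGE\_BAT},\quad
x_3 = \text{AGE\_PIT}
```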

We also have a measure of the overall effectiveness of our regression, the adjusted R-squared. R-squared measures the percentage of the variance in y that is explained by our model. Why does it need to be adjusted? Because of the way OLS is figured, adding another explanatory variable can never lower R-squared and will almost always raise it, even if the new variable does not actually increase our understanding. Adjusted R-squared controls for the number of variables, so that it only increases when a new variable increases R-squared more than we would expect if the new variable were random (i.e., lacked any additional explanatory value).
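The usual textbook form of the adjustment is 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. A minimal Python sketch follows, using made-up numbers rather than anything from this study; that the software's weighted output applies exactly this form is an assumption.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Textbook adjusted R-squared: penalizes each added predictor."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    penalty = (n_obs - 1) / (n_obs - n_predictors - 1)
    return 1 - (1 - r_squared) * penalty


# Toy illustration (hypothetical values): same raw R-squared, one extra predictor.
two_vars = adjusted_r_squared(0.5, n_obs=11, n_predictors=2)
three_vars = adjusted_r_squared(0.5, n_obs=11, n_predictors=3)
print(two_vars, three_vars)
```

With the raw R-squared held fixed, the model with the extra predictor gets the lower adjusted figure, so a new variable has to earn its place.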

So now let’s take a look at the output of our regression (I used Gretl, although other software packages would probably give a similar output):

             coefficient    std. error    t-ratio    p-value
  -----------------------------------------------------------
  const       0.432562      0.0250882     17.24     6.56e-061 ***
  W_PCT       0.357024      0.0165703     21.55     6.53e-090 ***
  AGE_BAT    -0.00442296    0.000899343   -4.918    9.71e-07  ***
  AGE_PIT     0.000608661   0.000813805    0.7479   0.4546

Statistics based on the weighted data:

Sum squared resid    2286.318   S.E. of regression   1.235414
R-squared            0.245282   Adjusted R-squared   0.243770

What can we make of that? The p-value on AGE_PIT is far above our rule of thumb of .05, so we treat it as not significant. That doesn’t necessarily mean the age of pitchers is irrelevant, just that it isn’t adding any understanding of what’s going on here beyond what the other variables already provide. (I should note that AGE_BAT and AGE_PIT are correlated, even after omitting pitchers hitting from the calculation.)

What does omitting the AGE_PIT variable do to our regression?

             coefficient   std. error    t-ratio    p-value
  ----------------------------------------------------------
  const       0.440695     0.0226061     19.49     1.28e-075 ***
  W_PCT       0.359260     0.0162960     22.05     1.61e-093 ***
  AGE_BAT    -0.00414516   0.000818937   -5.062    4.67e-07  ***

Statistics based on the weighted data:

Sum squared resid    2287.172   S.E. of regression   1.235232
R-squared            0.245000   Adjusted R-squared   0.243992

Our adjusted R-squared increases, ever so slightly. In practice, the difference between the two models will be slight, for the most part. But there’s no reason for us to prefer a more complicated model that offers less explanatory power, because while most cases will be practically unaffected, there is a chance that there will be some rare cases that are negatively affected by a meaningful amount.
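For readers who want to plug numbers into the simpler model, here is a short Python sketch. The coefficients are copied from the two-variable output above; the function and constant names are made up for illustration, and the .600 team is a hypothetical example.

```python
# Coefficients from the two-variable regression output.
CONST = 0.440695
W_PCT_COEF = 0.359260
AGE_BAT_COEF = -0.00414516


def projected_win_pct(w_pct, age_bat):
    """Expected winning percentage over the next five seasons."""
    return CONST + W_PCT_COEF * w_pct + AGE_BAT_COEF * age_bat


# Two hypothetical .600 teams that differ only in batter age.
older = projected_win_pct(0.600, 30.0)
younger = projected_win_pct(0.600, 26.0)
print(round(older, 3), round(younger, 3))
```

Both teams are pulled well back toward .500, and the four-year age gap is worth under two hundredths of a point of winning percentage.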

(I do want to caution as well that you cannot directly compare coefficients: different variables have different scales and standard deviations, and comparing the coefficient on a variable that averages .500 with the coefficient on one that averages 28 can cause problems. Even if two variables have the same average, different standard deviations can put their coefficients on scales that are difficult to compare directly. In this case, however, winning percentage is truly a more important predictor than the average age of position players.)

So what does this mean, exactly? Teams in our sample ranged from an average age of 24 to 34. That means that the most extreme possible difference in ages leads to a difference in expected win percentage over the next five seasons of 0.041, or roughly seven games per season.
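The arithmetic behind that figure can be checked in a few lines of Python. The coefficient comes from the two-variable regression; the 162-game season used to convert winning percentage into games is an assumption, as are the variable names.

```python
AGE_BAT_COEF = -0.00414516   # from the two-variable regression
GAMES_PER_SEASON = 162       # assumption: modern full-length schedule

youngest, oldest = 24, 34    # range of team batter ages in the sample

# Gap in expected winning percentage between the two extremes.
win_pct_gap = abs(AGE_BAT_COEF) * (oldest - youngest)
games_gap = win_pct_gap * GAMES_PER_SEASON
print(round(win_pct_gap, 3), round(games_gap, 1))
```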

But that’s the most extreme case. How does it apply to real teams—say, 2012 teams? Here you go:

TEAM                     AGE_BAT   AGE_W_PCT
Kansas City Royals          26.2       0.009
Houston Astros              26.7       0.007
Washington Nationals        27.1       0.005
Seattle Mariners            27.2       0.005
New York Mets               27.4       0.004
Pittsburgh Pirates          27.4       0.004
San Francisco Giants        27.5       0.004
San Diego Padres            27.4       0.004
Oakland Athletics           27.7       0.003
Detroit Tigers              28.0       0.002
Toronto Blue Jays           27.9       0.002
Chicago Cubs                28.0       0.002
Baltimore Orioles           28.1       0.001
Cleveland Indians           28.1       0.001
Atlanta Braves              28.4       0.000
Miami Marlins               28.3       0.000
Arizona Diamondbacks        28.7      -0.001
Los Angeles Angels          28.7      -0.001
Colorado Rockies            28.7      -0.001
Cincinnati Reds             29.0      -0.002
Minnesota Twins             29.1      -0.003
St. Louis Cardinals         29.7      -0.005
Boston Red Sox              29.7      -0.005
Milwaukee Brewers           29.5      -0.005
Chicago White Sox           29.9      -0.006
Texas Rangers               29.9      -0.006
Tampa Bay Rays              29.8      -0.006
Los Angeles Dodgers         30.6      -0.009
Philadelphia Phillies       31.6      -0.013
New York Yankees            33.0      -0.019

The last column is the difference in expected win percentage between a team of that age and the average team in the regression. Looking at age alone, a team like the Yankees would expect to win three fewer games a season over the next five years than a team with the same won-loss record but of average age. At the other end of the spectrum, a team like the Astros would expect to win one game per season more than a team with the same record but of average age. Now, given the sizable differential in other measurable attributes, you might still want to be the Yankees (even though it would mean passing up a chance to hang out with Mike Fast and Kevin Goldstein in the break room). But the age of a roster has a real impact on our estimates of how a team will perform going forward.
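The AGE_W_PCT column can be approximated with a short Python sketch. The article does not state the average batter age in the regression sample; the 28.4 used here is an assumption inferred from the table (the Braves, at 28.4, show an adjustment of zero). The 162-game season and all names are likewise assumptions for illustration.

```python
AGE_BAT_COEF = -0.00414516   # from the two-variable regression
ASSUMED_MEAN_AGE = 28.4      # assumption: inferred from the table, not stated
GAMES_PER_SEASON = 162       # assumption: modern full-length schedule


def age_adjustment(age_bat, mean_age=ASSUMED_MEAN_AGE):
    """Change in expected win pct versus an average-aged team."""
    return AGE_BAT_COEF * (age_bat - mean_age)


for team, age in [("New York Yankees", 33.0), ("Houston Astros", 26.7)]:
    delta = age_adjustment(age)
    games = delta * GAMES_PER_SEASON
    print(f"{team}: {delta:+.3f} win pct, {games:+.1f} games per season")
```

Under those assumptions the sketch lands on the table's values for both teams, about three games a season against the Yankees and about one in the Astros' favor.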

Colin Wyers is an author of Baseball Prospectus. 