April 8, 2010

Manufactured Runs

April is the Cruelest Month

by Colin Wyers

April really is a great time to be a baseball fan. Even in the worst case (say, being a Cubs fan and watching Carlos Zambrano get lit up like a Christmas tree on Opening Day), having baseball is better than not having baseball. And April is truly a time when all baseball fans can have hope. Nobody’s been eliminated yet. Nobody’s even out of the race yet. Now, of course, some things are more likely than others, but that’s not what hope is about, is it?

So it’s a great time to be a fan. But it’s a horrible time to be a baseball analyst. (That's still a net win, really, as baseball analysts are generally also baseball fans.) Why do analysts suffer? Because there’s an expectation that since there’s baseball going on, and since one is a baseball analyst, one must, well, analyze that baseball. The thing is—there’s not really a whole lot one can say about a month’s worth of ballgames, at least in the way of useful analysis. There’s little we can know in April that we didn’t already know in March.

Small Sample Size

Consider it the sabermetrician’s catechism: a small sample is not very indicative of future performance.

Here’s what I mean. I took a look at batters from ’04 to ’09. I broke each batter’s season down into two parts: April, and everything after. I also took a look at how that batter performed in the prior season.

What I found was that, in comparing April stats to the rest of the season, the root mean square error (RMSE) was worth about .060 (60 points) of True Average (TAv). A hitter with a TAv of .260 in April would have a rest-of-season TAv between .195 and .314 about 68% of the time, treating the RMSE as one standard deviation of roughly normal errors. When comparing the previous season’s stats instead, I found an RMSE worth about .037 (37 points) of True Average, or a TAv between .224 and .298.

What this tells us is that one month’s worth of batting results is much less predictive than the previous season’s batting results. (And in fact a single season isn’t terribly predictive, either.)

Selective Sampling

The reason we have a small sample is that we’re really engaging in a bit of selective sampling. We rarely have but one month’s batting results for any player. While it’s true that only this year’s batting results count toward this year’s games, it’s a wholly arbitrary distinction when it comes to predicting future performance.

And even if this is a batter’s first year in the majors, we still have other information. We have a wealth of minor-league batting stats. We have a sizable body of research on how those stats translate into major-league stats. We have scouting profiles, the player’s age—we have a wealth of information.

Looking simply at April stats isn’t just a case of not having a lot of information; it’s a case of leaving out the perfectly good information we do have. That act of exclusion makes our analysis less, not more, accurate.

Confirmation Bias

But what if we already have a theory about how good (or bad) a player is? And his stats in April confirm what we already suspected, that he’s on the decline or that he’s ready to have a resurgence? Surely, in those cases, a small sample can tell us something, can’t it?

That’s really the most dangerous case of all. It’s the cognitive error known as "confirmation bias": viewing the data in a way that confirms what we already think.

We already know that one month of stats is less predictive than a whole season’s (and both are less predictive still than a good projection system, such as PECOTA). It’s dangerous to ignore that in the face of data that supports our point of view. It leads to us dismissing most findings (rightly) due to low predictive value, but cherry-picking the ones that support our predetermined conclusions.

Just Enjoy It

So what’s an analyst to do? Watch some baseball. Enjoy it. But don’t read too much into it. It’s really hard to sit there and constantly say, "We don’t have enough data to say anything" every time someone asks what a player’s performance in April means. But it’s the right thing to say.

Technical Notes

A note about those RMSE figures. What I did was figure a hitter’s runs per out for each of the sample periods. Root mean square error is really exactly what it says on the tin—find the error for each term in the sample, square it, find the average squared error, and take the square root.

In this case, I used a weighted average. For the weight, I took the harmonic mean of outs in each of the three sample periods. (To find the harmonic mean, take the reciprocal of each variable—in other words, divide one by each of the terms. Then find the average. Then take the reciprocal again.)
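
For readers who want to see the arithmetic, here is a minimal Python sketch of a weighted RMSE that uses the harmonic mean of outs as each hitter’s weight. The hitter records, the field names, and the choice to compare April against the rest of the season are hypothetical illustrations, not the data or code behind the figures above.

```python
import math

def harmonic_mean(values):
    """Reciprocal of the average of the reciprocals (all values must be > 0)."""
    return len(values) / sum(1.0 / v for v in values)

def weighted_rmse(observed, predicted, weights):
    """Square root of the weighted average squared error."""
    squared_error = sum(
        w * (o - p) ** 2 for o, p, w in zip(observed, predicted, weights)
    )
    return math.sqrt(squared_error / sum(weights))

# Hypothetical hitters (made-up numbers): runs per out in April and after
# April, plus outs made in the prior season, in April, and after April.
hitters = [
    {"april_rpo": 0.21, "rest_rpo": 0.18, "outs": (410, 65, 350)},
    {"april_rpo": 0.12, "rest_rpo": 0.16, "outs": (380, 70, 330)},
    {"april_rpo": 0.25, "rest_rpo": 0.19, "outs": (150, 40, 290)},
]

# One weight per hitter: the harmonic mean of outs across the three periods.
weights = [harmonic_mean(h["outs"]) for h in hitters]

# Treat April as the "prediction" and the rest of the season as the outcome.
rmse = weighted_rmse(
    [h["rest_rpo"] for h in hitters],
    [h["april_rpo"] for h in hitters],
    weights,
)
print(f"weighted RMSE in runs per out: {rmse:.3f}")
```

The harmonic mean is pulled toward the smallest of the three out totals, so a hitter with a thin sample in any one period counts for less in the overall error.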

I then converted these runs per out into True Average using the following formula:

( (LeagueRunsPerOut ± RMSE) / 5 ) ^ (.4)
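
As a rough illustration, the conversion above can be written as a short Python function. The function names and the sample inputs (a league rate of 0.17 runs per out and an RMSE of 0.05) are assumptions chosen for the example, not figures from the study.

```python
def true_average(runs_per_out):
    """Convert runs per out to True Average using (runs_per_out / 5) ** 0.4."""
    if runs_per_out < 0:
        raise ValueError("runs per out must be non-negative")
    return (runs_per_out / 5) ** 0.4

def tav_band(league_runs_per_out, rmse):
    """Return (low, center, high) TAv for the league rate minus/plus the RMSE."""
    return (
        true_average(league_runs_per_out - rmse),
        true_average(league_runs_per_out),
        true_average(league_runs_per_out + rmse),
    )

# Hypothetical inputs, for illustration only.
low, center, high = tav_band(league_runs_per_out=0.17, rmse=0.05)
print(f"{low:.3f} to {high:.3f}, centered on {center:.3f}")
```

Because the conversion raises its input to a power below one, equal steps in runs per out produce a larger drop below the center than the rise above it, which is why a band built this way is not symmetric around the central TAv.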

Colin Wyers is an author of Baseball Prospectus. 