Why predicting player breakouts is more important than minimizing error.
Last week, the sabermetric community had—well, not an argument, because the participants were generally professional and cordial to one another, but a debate about what we might expect over the rest of the season from a player who is currently enjoying a hot (or cold) streak. It all started with researcher Mitchel Lichtman (better known by his initials, MGL) posting two articles, one on hitters and one on pitchers, that made the case that we should trust the projection systems rather than expect a player’s recent performance to continue. Remember Charlie Blackmon, who was the best player in baseball for three weeks and was smart enough to make those weeks the first three weeks of the 2014 season? He’s a good example. He had never been anything special, nor was he projected for greatness this year. And in retrospect, his hot streak to start the season looks a lot like a small-sample fluke.
After we released the PECOTA Top 100 prospects list last week, a few commenters remarked on PECOTA’s apparent catcher leanings. Eleven catchers appeared on the list, some ranked higher than nationally beloved prospects. How dare PECOTA! In comparison, Jason Parks’ top 101 featured eight catchers, suggesting a small discrepancy in the position distribution of PECOTA’s rankings.
Have we been underrating big-market, high-payroll teams?
A couple of weeks ago, I wrote about the distribution of team wins, and the discovery that the distribution may in fact be bimodal, not normal as one might expect.
One of the predictions that came from this theory was that teams right at .500 would, counterintuitively, tend to regress away from the mean. So one thing we can do is check whether the real world behaves the way we expect it to. I took all teams from 1969 on with even numbers of games and split their seasons into “halves,” each containing an even number of games. I use scare quotes for halves because, in order to boost the sample size, I split in increments of two and kept any pair where both “halves” were within 20 games of each other. Then I looked at teams that were exactly .500 in the “before” sample (716 teams total) and saw what they did afterward:
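For readers who want to try the splitting procedure themselves, here is a minimal Python sketch of one way to read it. It is not the code used for the article. The function names, the 1-for-win/0-for-loss input format, and the reading of "within 20 games of each other" as a cap on the difference in length between the two parts are all assumptions made for illustration, and ties are ignored.

```python
from typing import List, Tuple


def split_points(n_games: int, max_gap: int = 20, step: int = 2) -> List[Tuple[int, int]]:
    """Return (before, after) game counts for every split taken in
    increments of `step` whose two parts differ in length by at most
    `max_gap` games. (Assumed reading of the article's rule.)"""
    pairs = []
    for before in range(step, n_games, step):
        after = n_games - before
        if abs(before - after) <= max_gap:
            pairs.append((before, after))
    return pairs


def even_record_splits(results: List[int], max_gap: int = 20) -> List[Tuple[int, int, int]]:
    """`results` is a hypothetical game log: 1 for a win, 0 for a loss,
    in schedule order. For each kept split where the "before" record is
    exactly .500, return (before_games, after_wins, after_games)."""
    out = []
    for before, after in split_points(len(results), max_gap):
        wins_before = sum(results[:before])
        if 2 * wins_before == before:  # exactly .500 in the "before" sample
            out.append((before, sum(results[before:]), after))
    return out


if __name__ == "__main__":
    # A 162-game season yields splits with 72, 74, ..., 90 games "before".
    print(split_points(162))
```

Stepping by two guarantees that every "before" sample has an even number of games, which is what makes an exactly .500 record possible in the first place.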
The teams that have outhit and outpitched their projections, or fallen the farthest short.
We’re approaching the halfway point of the season, though we’re still over a month away from the nominal start of the second half. And that means we’re also approaching the point at which we stop thinking about how we thought the season would play out (except for our probably accidentally accurate predictions, which we treasure forever). According to Colin Wyers, in-season team records become more reliable than pre-season projections around Game 103. Most of us don’t have a particular point of the season at which we entirely abandon pre-season projections—nor should we—but every day we trust what we’ve seen so far a little more and what we expected to see a little less. And eventually, we look back and wonder why we didn’t see certain things coming.
PECOTA has had plenty of successes. The projected team TAvs for the Rangers and Brewers, for example, have been exactly on target, and the projected team ERAs for the Mets and Diamondbacks have been less than 0.02 runs off. But while PECOTA deserves a pat on the back for its accurate predictions, there’s much more to say about the surprises. This article is about the lineups and pitching staffs that have defied our expectations so far.
If everyone on the Astros played to their 90th-percentile projections, and everyone on the Angels played to their 10th-percentile projections, which would win more games?
Last year around this time I had plans to compare the Astros’ teamwide PECOTA projections to those of a variety of lower-level squads: the best Triple-A roster, the best Double-A roster, an All-Star High-A team, etc. I didn’t get to it, and then the season started, and I still didn’t get to it, because the Astros started off hot and it would have been weird to have run that piece about a team that was 22-23 in mid-May. I was sort of glad I didn’t run it, because the longer I lived with the idea the more it started to feel mean.
So this year, I have a similar idea, and I’m rushing it out before the guilt kicks in. Again I’m going to be exploring just how bad the worst team in baseball is. Or just how good the worst team in baseball is. That’s the point of it, after all. It’s not to prove that the Astros are as bad as, say, a team of High-A All-Stars. It’s to see if the Astros are as bad as a team of High-A All-Stars, and if they’re significantly better (as I suspect they would have been), then we’ve learned a little something about baseball.
Asking questions about PECOTA's projections, and explaining what the system thinks is in store for Bryce Harper.
When the PECOTA spreadsheet appears, one of the first things people do is pick out the players projected to make the greatest gains or suffer the largest declines. Then the questions start: Why does PECOTA like/dislike so-and-so so much? Is there a problem with the projections? Or is the system just picking up on something I’m not seeing?
Behind the scenes, the BP staff goes through the same thought process. Before we publish the projections, we approach PECOTA’s output with a skeptical eye, on the lookout for anything that could be a bug. But even after we’re satisfied with the spreadsheet and release it to our subscribers, PECOTA retains the capacity to surprise.
PECOTA's projected award winners, bounceback candidates, and bêtes noires.
If your holiday was anything like most of mine, you’ll want a couple of Tylenol and some Gatorade this morning because you’re feeling the effects of PECOTA Day. Now that we’ve slept it off, it’s time to take a look at some of the highlights of the data as they project the 2013 season.
Team win totals can be found here if you want to use the projection system to forecast the playoff races eight months before the Division Series. But individual performances are easier to assess because they’re not compounding (or more accurately, just adding together) error with the projections.
BP begins to roll out its projections and fantasy tools for the 2013 season.
Welcome to the initial launch of this year’s PECOTA forecasts. We hope you find them enlightening, useful, and predictive.
Let’s start with the business aspects of things. In order to access the PECOTA forecasts, you need to be a subscriber to Baseball Prospectus. Monthly subscribers will have access to certain PECOTA features but will not have access to downloads like the PECOTA spreadsheets. The best value we offer is a yearly subscription, which not only gives you access to the full PECOTA product offering, but also unrestricted access to our extensive prospect coverage, R.J. Anderson’s Transaction Analysis, in-depth analysis from the likes of Ben Lindbergh, Sam Miller, and more, and the latest in baseball research from the likes of Russell Carleton and myself. If you feel you can pass on that, we offer our lower-priced Fantasy subscription, which gives you full access to the PECOTA products and all fantasy-focused articles on the site.