We missed on the Royals; can we excuse ourselves, or should we have seen this coming?
Last week, I wrote an article about the fact that no Baseball Prospectus staffer picked the Rangers to be serious contenders this year. (Neither did anyone else I could find.) Because the Rangers lead the race for the second Wild Card spot, the question was whether we, as a sabermetric community, missed something that ought to have allowed us to forecast Texas' success, or whether they were a true fluke. I concluded, in the end, that we underestimated the value of the team's organizational depth, suffered from recency bias that unfairly diminished our expectations of once-great players, and undersold the degree of variance possible in a league with many teams on the edges of contention.
The Rangers were a good first case study in this area, but the best and most important team on which to challenge our views is the Royals.
PECOTA has been right on some guys, and way off others.
One of the more fun things to do right before the season starts is to take a deep dive into PECOTA and see what it foresees for the upcoming summer. Whether it's discussing players who changed PECOTA's mind with one big year, exploring how a uniquely built team projects, or looking at some out-of-the-box projections, PECOTA can provide hours of entertainment. Of course, by the end of the season, going back to see just how accurate some of PECOTA's projections turned out to be is interesting as well.
But since I'm getting a little impatient, I'm not going to wait for the end of the season. I want to know how PECOTA is doing now.
Who has the biggest gaps between their preseason and rest-of-season projections?
Most seasons, it seems, the Giants are a tough team to read, a tough team to cover, a tough team to talk about. They're three-time champions in the last five years, but there has yet to be any season in which it felt like they were some infallible juggernaut. To the contrary, they seem to be forever cobbling together a winner, often on the strength of some unexpected boost from a theretofore unknown entity. Last season, especially late in the season, that player was Joe Panik. This year, though the Giants don't look like a playoff team at the moment, they've gotten a similar (and similarly surprising) jolt from Matt Duffy.
Panik and Duffy don't have a great deal in common. The former was the team's first-round pick in 2011; Duffy was an 18th-rounder in 2012. Duffy moved through the minors considerably more quickly, earning a mid-season promotion from the South Atlantic to the California League in 2013. Panik got a full season in the Cal League, and then a full season with Double-A Richmond. It wasn't until he mashed Triple-A pitching in 2014 that he earned a mid-season bump, but that bump saw him move into an important role on the eventual World Series winner. Duffy put up better numbers than Panik for the majority of his minor league career, but never saw Triple-A, and graduated (permanently, it appears) to The Show after more than 500 fewer plate appearances than Panik had.
In 2000, when Alex Rodriguez was a free agent, Scott Boras did something amazing that we just don't appreciate enough.
On Sunday, Darren Rovell tweeted a handful of pages from the free agent binder that Scott Boras put together for a 25-year-old Alex Rodriguez. As most of us around here tend to be projections junkies, surely you’ll find this page particularly interesting:
Can the uncertainty in a player's projections be projected?
There are two important aspects of prediction. The first concerns the accuracy of the prediction, that is, how close a prediction is to the actual, observed result. The second is uncertainty, which is how sure a forecaster is about his or her projection. These issues are fundamental forecasting concepts, and apply equally to predictions of the weather, the stock market, or the outcome of tomorrow’s ballgame. At present, only one of these facets of a prediction gets much attention in the world of baseball projections, and that is accuracy. Accuracy is measured by the mean absolute error, which describes how close, on average, a forecast is to the actual, observed result. Projectionists struggle primarily to minimize this number.
The under-examined facet of prediction that we will address in this article is the uncertainty. Whereas we know that predictions tend to be accurate to within a hundred or so points of OPS, we would also like to know whether we are more or less likely to be wrong on certain players. The uncertainty is often treated as a second-order concern because it is usually more difficult to estimate. However, as we show, it is possible to predict ahead of time which players’ forecasts are more uncertain than others. This concept is important because certain teams may prefer high- versus low-risk players: a team with high win expectations (90+ wins) might prefer to reduce risk, whereas a middle-of-the-road team (80-85 wins) would presumably seek risk in order to “get lucky” and reach the postseason.
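To make the distinction between accuracy and uncertainty concrete, here is a minimal Python sketch. It is not the method used in the article: the rows are invented numbers that describe no real players, and the "rookie"/"veteran" grouping is a hypothetical stand-in for whatever trait might mark a forecast as riskier. It computes the average absolute miss for everyone, then the same figure within each group.

```python
from statistics import mean

# Hypothetical rows of (group, projected OPS, actual OPS).
# The numbers are invented for illustration and describe no real players.
SAMPLE = [
    ("rookie", 0.700, 0.820),
    ("rookie", 0.720, 0.610),
    ("rookie", 0.690, 0.700),
    ("veteran", 0.780, 0.800),
    ("veteran", 0.750, 0.730),
    ("veteran", 0.810, 0.790),
]


def mean_absolute_error(rows):
    """Average size of the miss between projection and result, ignoring direction."""
    return mean(abs(actual - projected) for _, projected, actual in rows)


def error_by_group(rows):
    """Mean absolute error computed separately for each group label."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row)
    return {label: mean_absolute_error(members) for label, members in groups.items()}


if __name__ == "__main__":
    print("overall:", round(mean_absolute_error(SAMPLE), 3))
    for label, err in error_by_group(SAMPLE).items():
        print(label, round(err, 3))
```

In this toy sample the overall figure hides a gap: the misses on the "rookie" rows are several times larger than those on the "veteran" rows. That gap is the kind of difference in uncertainty that a single accuracy number cannot show.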
A look at how to avoid allowing biases to influence your projections.
As soon as the baseball season comes to its inevitable and saddening end, baseball, as it does each year, will enter the offseason. For the fantasy baseball community, this means we will be entering ranking and projection season. After following “our players” and players of interest all season, we are now asked to take an all-encompassing look at the league’s baseball players. The result of doing projections periodically, as opposed to continuously, is that we are likely to invite certain biases into our processes, which can negatively impact our results. We will take a look at why we do periodic projections, the biases that come with such a process, how these biases manifest themselves, and some ways to hopefully de-bias our process.
The devil’s advocate in me asks, “if periodic projections cause certain problems, why not do continuous projections?” The short answer is that doing continuous projections is not feasible or desirable for most of us. A computer program could certainly perform continuous projections, but we, as mere people (note: people are awesome), do not have the ability to continuously adjust our valuations on such a large scale. Sure, each time we watch, read about, or hear about a player, our impression of said player will be altered or reinforced consciously or subconsciously, but that is not what I am getting at. Rather, what I mean is that we cannot watch all players play every one of their plays, and we cannot fully analyze all of what we see or all of the available data. The result of all this humanness is that we can really only fully update our projections on a league-wide basis come decision times: the offseason, for auctions and drafts, and, to some extent, the trade deadline. While we constantly update our valuations for the players we follow, my assumption is that very few people follow every player, and those who do probably do not do so diligently enough to keep each player’s projection properly and continuously updated.
Why predicting player breakouts is more important than minimizing error.
Last week, the sabermetric community had—well, not an argument, because the participants were generally professional and cordial to one another, but a debate about what we might expect over the rest of the season from a player who is currently enjoying a hot (or cold) streak. It all started with researcher Mitchel Lichtman (better known by his initials, MGL) posting two articles, one on hitters and one on pitchers, that made the case that we should trust the projection systems rather than expect a player’s recent performance to continue. Remember Charlie Blackmon, who was the best player in baseball for three weeks and was smart enough to make those weeks the first three weeks of the 2014 season? He’s a good example. He had never been anything special, nor was he projected for greatness this year. And in retrospect, his hot streak to start the season looks a lot like a small-sample fluke.
After we released the PECOTA Top 100 prospects list last week, a few commenters remarked on PECOTA’s apparent catcher leanings. Eleven of them appeared on the list, some higher than nationally beloved prospects. How dare PECOTA! In comparison, Jason Parks’ top 101 featured eight catchers, suggesting a small discrepancy in the position distribution of PECOTA’s rankings.