The 2020 MLB regular season, with luck, will max out at 60 games for most teams. A 60-game season, coincidentally, also amounts to a cut of roughly 60 percent (102 games, or about 63 percent) from the 162 games that ordinarily make up the regular season. Baseball is a notoriously high-variance sport, and the 162-game season has traditionally helped provide consistency to summaries of player performance. But not this year.
How well can a statistic compensate for the loss of more than 100 games? The proposed 2020 season is the equivalent of ending a normal baseball season in early June. So, we reviewed various leading batting and pitching statistics, all of which we think are reasonable to use, and measured how well they performed at various points early on in 2019 relative to their end-of-season values.
Batter Statistic Convergence
We’ll start with batting statistics. Our participants will be OPS, wOBA (MLB / Savant version), wRC+ (FanGraphs), xwOBA (MLB), and BP’s DRC+. Non-Deserved-stat data was sourced from Baseball Savant and FanGraphs.
In 2019, the season enjoyed a soft launch around March 20 and then started in earnest a few days later. The 2020 season, as of mid-August, has reached roughly the number of games ordinarily played by mid-April, at least for teams that have played most of their originally scheduled games this summer.
We provide line graphs comparing how the various batting statistics correlated to their 2019 end-of-season values for those same batters over the course of the season. The Pearson correlations were weighted by plate appearances; the standard errors are essentially zero, and are thus omitted. As usual, we measured correlations, rather than raw error, because these statistics are not on the same scale and cannot otherwise be meaningfully compared.
The best possible value is 1, and all metrics of course achieve that value, as compared to themselves, by the end of the season. The question is how well they do up until then.
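To make the weighted Pearson correlation described above concrete, here is a minimal sketch in plain Python. The function name, the sample values, and the choice of which plate-appearance totals serve as weights are all assumptions made for illustration; the article does not specify the exact implementation used.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation between x and y using non-negative weights w."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y, and w must have the same length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Weighted means of each series.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total

    # Weighted covariance and variances. The shared 1/total factor cancels
    # in the final ratio, so it is left out.
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustration only: invented values, not real player data.
early_values = [0.310, 0.420, 0.280, 0.365]  # a metric as of an early date
final_values = [0.330, 0.390, 0.300, 0.350]  # the same metric at season's end
plate_appearances = [40, 55, 12, 60]         # assumed weighting choice

r = weighted_pearson(early_values, final_values, plate_appearances)
print(f"weighted correlation: {r:.3f}")
```

Because the result is a unitless number between -1 and 1, the same calculation can be applied to metrics on very different scales, which is the reason given above for preferring correlations to raw error.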
There are arguably two useful measures of convergence here, relative to the shortened 2020 season. The first, of course, is the actual value on June 1, 2019 (when most teams have typically completed about 60 games) relative to the end-of-season value on September 27, 2019. The second measure is how rapidly the statistic approached that June 1, 2019 value.
For batters, DRC+ stands out in both respects. First, by early June, DRC+ is already closing in on its end-of-season values (weighted corr. = .85). xwOBA has consistently been the next best option in our stat testing comparisons for batters, and it occupies second place here also with a weighted correlation to its year-end values of .78. That leaves wOBA (MLB variant) and wRC+ with .72, and OPS lagging a bit at .68.
Second, DRC+ stands out by how rapidly it starts finding signal in batter performances. By April 1, 2019, by which point the average team had played fewer than five games, DRC+ is already at a correlation of .51 to end-of-season values. For the 2020 season, this means that DRC+ had probably achieved this threshold before the end of July. At the same point in 2019, xwOBA checks in at .33 and the remaining statistics are mired around .24.
By the equivalent of April 15 (close to where the 2020 season is now, relatively speaking), DRC+’s correlation to its final season value is over .7. This demonstrates that DRC+ is, on average at least, effectively cutting through what it sees as random variation almost immediately.
All these stats have plenty more to learn after one week, but one of them has already learned much more than the others and has largely learned what there is for it to know after 60 games.
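The two convergence measures discussed in this section (the value at the June 1 checkpoint, and how quickly a statistic approaches it) can be sketched with the three DRC+ figures quoted above. The helper names and the idea of expressing each date as a share of the checkpoint value are assumptions for illustration, not the method used to produce the original graphs. The April 15 figure is reported only as "over .7", so 0.70 is used here as a floor.

```python
from datetime import date

# DRC+ correlations to end-of-season values, as quoted in the text.
# April 15 is given only as "over .7"; 0.70 is a lower bound, not a measurement.
drc_plus_curve = {
    date(2019, 4, 1): 0.51,
    date(2019, 4, 15): 0.70,
    date(2019, 6, 1): 0.85,
}


def share_of_checkpoint(curve, checkpoint):
    """Express each correlation up to the checkpoint as a share of the checkpoint value."""
    target = curve[checkpoint]
    return {d: curve[d] / target for d in sorted(curve) if d <= checkpoint}


def first_date_reaching(curve, checkpoint, fraction):
    """Earliest date whose correlation is at least `fraction` of the checkpoint value."""
    threshold = curve[checkpoint] * fraction
    for d in sorted(curve):
        if d <= checkpoint and curve[d] >= threshold:
            return d
    return None


june_1 = date(2019, 6, 1)
for day, share in share_of_checkpoint(drc_plus_curve, june_1).items():
    print(day.isoformat(), f"{share:.0%} of the June 1 correlation")
print("first date at 80% of checkpoint:", first_date_reaching(drc_plus_curve, june_1, 0.8))
```

With daily snapshots in place of these three points, the same two functions would describe the whole curve for any of the statistics compared here.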
Pitcher Statistic Convergence
Let’s move on to pitchers, who are harder to measure because they have less control over baseball outcomes. Again, we consider several frequently used statistics: Earned Run Average (ERA), Fielding Independent Pitching (FIP, from FanGraphs), xwOBA[1], and Deserved Run Average (DRA), and compare them over the same time periods.
Here, the statistics are more evenly matched. DRA and xwOBA both jump out to the lead early, and then DRA lags for a while before catching up. We traditionally don’t publish DRA for the first few weeks of the season for just this reason, but fair is fair, and other metrics show more strength in the early going, especially xwOBA.
By mid-May, two interesting things have occurred. First, the metrics have begun to clump together. Second, ERA has started to show surprising “strength”: FIP and ERA perform basically the same at first, but over the next month, and as we move into June, ERA is closer to its final seasonal value than FIP or xwOBA. Curious! By early June, ERA remains in the lead, trailed just slightly by the remaining statistics.
What does this mean? Has ERA been unfairly maligned as a subpar evaluation of pitcher skill? Probably not. What this reminds us is that reliability is not the same thing as validity: an umpire who consistently calls high strikes does not become a good umpire merely by being wrong in the same way, night after night. Likewise, ERA tends to disproportionately reflect other factors that we think are not fairly credited to pitchers and which are not going to change much over a season: things like park effects and quality of defense, not to mention the difficulty of moving beyond a particularly strong or disastrous start of the season. This also underscores why measuring statistics by their inherent “stability” can be a dubious endeavor, and why even the tests portrayed in this article are just one way (albeit often a useful way) to evaluate a statistic’s explanatory power.
The relative stickiness of even ERA, though, does underscore why measuring pitcher performance can be so challenging.
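The distinction between reliability and validity drawn above can be illustrated with a toy simulation. Everything in it is invented for the sake of the example: the pitcher count, the noise levels, and the assumption that a persistent "context" effect (standing in for things like park and defense) is as variable as true skill. It is not a model of ERA, FIP, or any real metric.

```python
import math
import random


def pearson(x, y):
    """Plain (unweighted) Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_pitchers=5000, seed=0):
    """Compare a context-laden metric with a skill-only metric (toy parameters)."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n_pitchers)]    # what we want to measure
    context = [rng.gauss(0, 1) for _ in range(n_pitchers)]  # persistent, not the pitcher's doing

    def observe(signal, noise_sd):
        # A measurement is the underlying signal plus independent noise.
        return [s + rng.gauss(0, noise_sd) for s in signal]

    skill_plus_context = [s + c for s, c in zip(skill, context)]

    # Early-season readings are noisier than full-season readings.
    early_ctx, final_ctx = observe(skill_plus_context, 1.5), observe(skill_plus_context, 0.5)
    early_skill, final_skill = observe(skill, 1.5), observe(skill, 0.5)

    return {
        "reliability_with_context": pearson(early_ctx, final_ctx),
        "reliability_skill_only": pearson(early_skill, final_skill),
        "validity_with_context": pearson(final_ctx, skill),
        "validity_skill_only": pearson(final_skill, skill),
    }


results = simulate()
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed parameters, the context-laden metric should tend to agree with its own final value more closely early on, while tracking true skill less well at season's end. That is the same pattern as the umpire who is consistent without being right.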
Special thanks to Shawn Brody for benchmarking assistance.
[1] MLB introduced xERA somewhat recently, but daily splits for it are not publicly available to our knowledge, and it appears to track xwOBA as a practical matter.