Last Tuesday, the Phillies’ official Twitter account sent a number of tweets, including this specific pair:
Nola has a 2.94 xFIP (expected fielding independent pitching), 4th in MLB behind Kershaw, Fernandez & Syndergaard. pic.twitter.com/EMbemga9VX
— Phillies (@Phillies) July 19, 2016
What's that? You've never heard of xFIP? It's pretty simple, actually. pic.twitter.com/ADzjVvkHLW
— Phillies (@Phillies) July 19, 2016
Among people who both saw this and cared, there were a few reactions. There were a few jokes about the Phillies (because they’re bad and used to have a reputation for being dumb, but are citing advanced stats, get it?), and a fair number of people expressing confusion/unfamiliarity/hostility with or toward xFIP and advanced stats more broadly. There was also a good deal of excitement that a team Twitter, which engages with the “average” fan on a regular basis, was prominently displaying advanced stats and trying to explain them. Finally, several people expressed displeasure with the Phillies’ choice of stat.
Within that final group, there were two basic complaints. Some people noted that there are more accurate measures of a pitcher’s ability, which is true, and felt the account should have used one of those alternate measures instead. Others noted that xFIP is complicated and somewhat intimidating, which is also true, and felt the account should have used something more easily understood by a layperson fan.
This is when we stop talking about the actual Phillies tweet, because the reason they used xFIP (it makes Nola look as good as possible) is both simple and boring. The conversation it sparked, however, is genuinely interesting, and raises some questions, such as: what’s the point of stats? What are we trying to do here, and why?
The simplest answer, and also the best answer, is to better understand what’s going on. What someone does with that understanding varies, but that’s the whole goal of all this, basically. One way progress can be made on that front, and the main way progress has been made in the last couple decades, is by expanding the depth of possible understanding. This is what the big discoveries and developments do, what DIPS theory did to our understanding of pitching back in the late 90s/early 00s, what wOBA and TAv and that kind of stat did to our understanding of offense in the late 00s, and what DRA is doing to our understanding of pitching right this very minute. They provide increasingly accurate descriptions of what actually happened on the field, and increasingly accurate predictions of what will happen in the future, to the people who both want to know and have the ability to use them.
That doesn’t describe every baseball fan, however. Depth isn’t the only dimension along which understanding can progress: it can also go wide, so to speak, and improve the understanding of large numbers of people who aren’t as engaged with the sabermetric movement. In the same way that the increase in depth of understanding over the last 20 years has been stunning, the stagnation of the width of understanding has been stunning. Roger Clemens’s player comment from the 1996 BP Annual cited his wins and ERA, among other things; those stats didn’t appear in the 2015 player comments for Clayton Kershaw, David Price, or Madison Bumgarner (for some randomly picked examples), as we’ve essentially moved past them when it comes to evaluating players. Box scores, on the other hand, look almost exactly the same as they did in 1996, and low-engagement fans probably cite the same stats as they did then, too. If 20 years ago, baseball understanding was a puddle, it’s now a mineshaft, not a pool.
There’s no doubt that some of this lack of progress is due to this group being reluctant to move away from what they’re comfortable with. Some of it, however, is due to high-engagement fans giving them no reason to make that move. Proselytization has been approached all too often with smugness, condescension, and a seeming desire to beat people about the head with their own ignorance rather than educate them. As Sean O’Rourke has put it, this is trying to supplant the old dogma with a new dogma. It doesn’t matter how much smarter or better the new dogma is; it’s not going to get through.
It’s more than that, though, since even the well-intentioned and condescension-free attempts to sell an improvement often don’t work. The articles describing the recent improvements to DRA feature something called a Spearman correlation, which seems to pretty clearly demonstrate that DRA is the single best tool to evaluate pitchers with. That is the perfect way to accomplish an increase in depth, and nearly useless for trying to increase width. The person using pitcher wins or ERA or batting average isn’t waiting for proof that they’re missing out on accuracy by doing so; indeed, that proof likely will entrench them even further, thanks to the backfire effect. When deeply held beliefs encounter contradictory evidence (like, for example, a detailed description of how and why DRA is better than ERA), rather than correcting themselves, they get deeper.
As a result, I think the way to reach these fans is not by proving them wrong, but by consistent exposure to concepts slightly more accurate, and slightly more complicated, than what they’re comfortable with. In this context, the question to be asked of any stat is not whether it’s the most accurate thing available, but whether it’s the most accurate thing available at a given level of complexity. DRA achieves accuracy that nothing else can touch, but that doesn’t mean FIP is useless or obsolete, since FIP is also much, much simpler. I explained FIP to my mom the other week, when she was grousing about David Price and his surprisingly large ERA. I could’ve told her that his DRA is stellar, too, but I wouldn’t have been able to explain what that meant in a short conversation, and she probably wouldn’t have asked about Drew Pomeranz’s DRA if she didn’t know what made it better than ERA. It takes incremental, constant exposure, not a sudden QED.
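To give a sense of just how simple FIP is, here is a rough sketch of the commonly published formula in Python. The function name, the argument names, and the stat line are all made up for illustration, and the constant is a league- and season-specific value (usually somewhere around 3.1) that is passed in here as an assumption rather than looked up.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """Fielding Independent Pitching, per the commonly published formula.

    hr, bb, hbp, k: home runs, walks, hit batters, and strikeouts allowed.
    ip: innings pitched as a true decimal (110 1/3 innings is 110.333, not 110.1).
    constant: league/season scaling constant; supplied by the caller (assumption).
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    # Only outcomes the pitcher mostly controls go in: homers and free
    # passes count against him, strikeouts count in his favor.
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line, not any real pitcher's numbers.
example = fip(hr=10, bb=30, hbp=5, k=120, ip=110.0, constant=3.10)
print(round(example, 2))
```

That is the whole stat: four counting numbers, a division, and a constant that puts it on the same scale as ERA. It fits in a short conversation in a way that DRA does not.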
If I’m right, a team account tweeting about something like xFIP is a sign that things are moving in the right direction, as it’s a way for fans to be introduced to the concepts that lead all the way to things like DRA. That doesn’t mean the complaints about using xFIP in the above tweets aren’t valid; maybe it’s not accurate enough to justify its complexity (or, alternately, not simple enough to justify its (relative) inaccuracy). Those are the reasons it might not be the right thing to tweet, however, not that it happens to be neither the simplest nor the most accurate stat available.
If things are progressing outward, rather than just downward, it raises the question: why now? What is it about this moment in time that is (hopefully) yielding advancement in width as well as depth? If I had to guess, I’d say the credit lies with Statcast, for two reasons. First, it’s the coolest way advanced stats have been presented to the broader baseball audience, ever. Exit velocity conveys a lot of the same information as basic DIPS theory—there are things players can control, and things they can’t (or at least can control less), and generally we should pay more attention to the former—but in a much more viscerally satisfying way. There’s a reason almost every broadcast features at least one mention of a ball hit at or above 100 mph: it sounds awesome, in a way that run values and linear weights don’t.
Second, and probably more importantly, Statcast is controlled by MLB. That’s the other reason it’s on every broadcast, which has made it so that the data it provides, and the general principles of sabermetrics that underlie it, are in the public eye in a way that something like WARP has never been. I can imagine an award in the next decade that goes to the player with the highest exit velocity over the course of a season, or the highest speed in the outfield. I have a lot more trouble imagining an award for the highest WARP. MLB having a profit incentive to promote process-based measurements might turn out to be the single biggest thing for increasing the width of understanding of baseball.
I might be wrong about this part; it’s very possible things aren’t actually getting better at this moment in time. I’m nearly certain I’m not wrong that things need to get better, that the growth in depth of understanding has outpaced the growth in width by many, many times. Sabermetrics has been a niche thing inside the baseball community for nearly its entire lifetime, so much so that we take that fact for granted, but it doesn’t have to be, and hopefully won’t be forever.