Image credit: © Brett Davis-USA TODAY Sports

Today we announce an exciting development in the way we measure baseball defense here at Baseball Prospectus. The range portion of our fielding metric, Fielder Runs Above Average (FRAA), which comprises the vast majority of (non-catcher) value we assign to defenders, is being retired for leagues in which we have access to Statcast data. We have decided to name the updated metric Range Defense Added (RDA), and it is distinguished by relying entirely upon publicly available data sources.

RDA in turn is part of Deserved Runs Prevented (DRP). DRP is essentially RDA plus FRAA’s other traditional components, but range is such a large portion of defensive value that it made sense to establish a clear break from the past, and assign a new overall name.  For the first time in a long time, our defensive ratings are also now being posted to BP’s leaderboards. Split by player-season, player-team, or just team, you can see how we rate the various players and the teams on which they play.

The current fielder measurement landscape is dominated by Sports Info Solutions’ Defensive Runs Saved, which relies on in-house video review of each play, and MLB’s Outs Above Average, which uses both Statcast ball trajectory data and fielder positioning coordinates not available to the general public. Range Defense Added seeks to provide comparable (and possibly greater) accuracy in measuring fielder success, without the use of proprietary sources. RDA will also offer novel information to readers, such as Attempt Range, a measure currently unique to shortstops, and a Range Out Score for all positions that provides an easy summary of fielder quality on a rate basis.

Because it relies on Statcast data, RDA will primarily focus on major-league fielders, and will only be available for seasons beginning in 2015. For other seasons and levels, the existing FRAA formula will continue to be used to calculate defensive range, at least publicly. As Statcast data becomes publicly available at other levels, RDA will cover them, too. Other aspects of fielder defense, such as outfield assists and baserunning prevention, will continue to use our traditional formulas at all levels. Those formulas seem to work fine, and they cover a much smaller portion of overall fielder value. The combination of run values from RDA and these other components will give you a DRP for the player or their team.

This article will address the philosophy behind RDA, and how it performs against other leading defensive metrics. If you only have time for a short introduction to RDA, feel free to check out our Frequently Asked Questions.

The Basics

Analysts have been trying to estimate the quality of fielder defense for decades. Jeff Zimmerman and Dan Basco discussed much of that history here. Tom Tango offered an additional perspective on competing methods in his primer to Outs Above Average (OAA), MLB’s current range defensive metric. Sports Info Solutions has published its own defensive metric, Defensive Runs Saved (DRS), for two decades. Their landing page describes their philosophy and some of the components of their metric.

Fielding is much more difficult to assess than hitting or pitching, because it is dependent upon other people’s hitting and pitching. As a result, it is more difficult to decide when a fielder should have been successful, and sometimes whether a fielder was even the proper fielder. Fielders with similar numbers of games played can vary widely in the number of opportunities they receive, the difficulty of those opportunities, and the stadiums in which they are required to execute those opportunities.

Like Outs Above Average and the Plus/Minus metric for DRS, RDA is a range metric that measures the extent to which a given fielder converts a fielding opportunity into an out. One fielder, typically the person who picks the ball up first, is credited with the fielded play, and to the extent the play was converted into an out, the fielder is credited (or penalized) the difference between the outcome of the play (out/no out) and the average expected success of a fielder in making that out. RDA is calculated for all fielders with at least one fielded ball and is currently calculated for all MLB players from 2015 through 2022. Additional contributions like the receiving/scooping of attempted outs and multiple-out relays are in line for future study.
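The per-play crediting described above can be sketched in a few lines of Python. This is only an illustration of the "outcome minus expected success" idea. The function name, the tuple layout, and the sample probabilities are all made up for the example, and the expected out probabilities are taken as given rather than produced by any actual RDA model.

```python
from collections import defaultdict


def range_outs_above_expected(plays):
    """Sum (outcome - expected out probability) for each credited fielder.

    `plays` is an iterable of (fielder, out_made, expected_out_prob) tuples.
    This layout is an assumption for illustration, not a published schema.
    """
    totals = defaultdict(float)
    for fielder, out_made, expected_out_prob in plays:
        outcome = 1.0 if out_made else 0.0
        totals[fielder] += outcome - expected_out_prob
    return dict(totals)


# Hypothetical plays: fielder, whether an out was made, league-average out chance.
sample_plays = [
    ("A", True, 0.90),   # routine play converted: small credit (+0.10)
    ("A", True, 0.40),   # hard play converted: large credit (+0.60)
    ("A", False, 0.70),  # makeable play missed: penalty (-0.70)
    ("B", False, 0.95),  # routine play missed: large penalty (-0.95)
    ("B", True, 0.80),   # fairly routine play converted (+0.20)
]

credits = range_outs_above_expected(sample_plays)
```

Here fielder A nets out to roughly zero and fielder B to about -0.75, showing how a missed routine play costs far more than a converted one earns.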

RDA introduces new concepts that we think readers will enjoy.  First, we are introducing the Range Out Score, a new rate stat to evaluate fielding. Quality rate metrics are hard to find for fielding.  DRS does not seem to offer one; OAA features its usual percentile system for Savant statistics, but unfortunately there are nowhere near 100 meaningful levels of defense at any position, which presents a challenge. OAA also calculates a “success rate added,” but I don’t see it talked about much.

The Range Out Score is the percent above- or below-average that the player successfully converts a charged play to an out, relative to others at their same position. Different positions have wider ranges. Pitchers, catchers, and first basemen are unlikely to exceed 1-2% either way, infielders typically range between plus or minus 3-4%, and outfielders can be plus or minus 5% or more, given their unique opportunity to combine fielding acumen with running speed—two distinct skills that can be additive in their benefits. Across the board, an average player at each position receives a 0, an above- or below-average fielder will usually be a point or two in either direction, and a great or terrible fielder will go a few points beyond that. Think of Range Out Score as a more intelligent version of Fielding Percentage, but on a plus/minus basis. As such, we can easily speak of a “-2 infielder” or a “+4 outfielder” to summarize a defender’s fielding skill. Just be sure to compare players only to other players at the same position.
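One plausible reading of the Range Out Score, offered here only as a sketch, is the fielder's outs above expectation expressed per 100 charged plays. The exact formula is not given in this article, so the function below and its inputs are assumptions for illustration.

```python
def range_out_score(outs_above_expected, charged_plays):
    """Outs above (or below) expectation per 100 charged plays.

    Assumed definition for illustration; the published score may be
    computed differently (for example, with additional adjustments).
    """
    if charged_plays <= 0:
        raise ValueError("charged_plays must be positive")
    return 100.0 * outs_above_expected / charged_plays


# Hypothetical fielders, not real players.
plus_two_infielder = range_out_score(8.0, 400)     # 2.0
minus_one_infielder = range_out_score(-3.0, 300)   # -1.0
```

Because the score is a rate, a part-time player and a regular can share the same score while having very different counting totals.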

Second, for shortstops (although we may expand this to other infielders later), we introduce the concept of Attempt Range. Attempt Range is a counting stat addressing the fact that, unlike outfielders who always field the ball, one way or another, infielders can control their own denominator by the extent to which they are able to reach balls on the edge of their position’s typical range. One major flaw of Fielding Percentage is that players can improve their rating by fielding fewer balls and then executing at a higher level on less difficult plays. Attempt Range tries to lift the curtain on this behavior: you can check not only a player’s success rate on plays fielded (Range Outs Made/Range Out Score) but also the extent to which they are fielding more or fewer balls than others at their position (currently only shortstop). Consider this table of RDA’s top shortstops last year:

Table 1: Best Shortstops by RDA, 2022 season

Name | Range Outs Made | Range Out Score | Attempt Range
Willy Adames | 17.2 | 4.4 | 16
Dansby Swanson | 15 | 3.2 | 0
Isiah Kiner-Falefa | 11.3 | 3 | 15
Ha-Seong Kim | 11.1 | 3.1 | 12
Geraldo Perdomo | 8.9 | 2.3 | 8
Nico Hoerner | 8.4 | 2.1 | 3
Trea Turner | 8.1 | 2 | 6
Taylor Walls | 6.8 | 2.3 | 0
Andrew Velazquez | 5.6 | 1.9 | 10
Nick Allen | 4.1 | 2 | 7

Although we also calculate a runs equivalent, we agree with OAA that presenting plays made in terms of “outs” is more straightforward. So, interpreting Table 1, RDA’s top shortstop by Range Outs Made is Willy Adames, with a terrific Range Out Score of over 4, and over 17 Range Outs Made on the positive side of the ledger. As noted above, players beyond plus or minus 2 are starting to demonstrate the extremes of any position, and those under 2 either way are basically average, except at positions where there is little variety to be had. 

The other names in Table 1 hopefully are not a surprise, but the issue here isn’t just the Range Out Score assigned, but the additional context provided by Attempt Range. None of these shortstops has negative range, per se, but some of them feature terrific execution with only average range (Dansby Swanson) while Willy Adames delivers the complete package, creating more opportunities than anyone else, and also executing at a high level across the board. 

Table 2: Worst Shortstops by RDA, 2022 season

Name | Range Outs Made | Range Out Score | Attempt Range
Paul DeJong | -3.4 | -1.7 | -4
Tim Anderson | -4.2 | -1.7 | 7
Francisco Lindor | -4.3 | -1 | -8
Didi Gregorius | -6 | -3.3 | -8
Bryson Stott | -7.1 | -3.1 | -6
Oneil Cruz | -8.1 | -3 | -5
Bo Bichette | -9 | -2.1 | -1
Xander Bogaerts | -11.3 | -2.5 | -15
Luis García | -11.3 | -5.9 | -5
Bobby Witt Jr. | -20.1 | -6.1 | -12

RDA’s least favorite shortstops of 2022 span a range of below-average fielding, but again RDA offers subtleties, not just criticism. Tim Anderson features above-average Attempt Range, but below-average execution. Bo Bichette has average Attempt Range but below-average execution. Luis García and Bobby Witt Jr. have some of the worst Range Out Scores you will see for a shortstop in any year. Francisco Lindor may be a surprise appearance on this table for some, but RDA joins DRS in seeing his performance last year as slightly below average, perhaps anticipating inevitable decline.

The Need for Change

FRAA has been our defensive metric at BP for quite some time. It is also the last of BP’s Big Three measurements—pitching, hitting, and fielding—to be overhauled to methods we consider to be current best practice. The time for that modernization has come.

Why do we need to update the range portion of FRAA? As of two years ago, in a test of same-season descriptive power, it did well. The answer is that we are also interested in repeatable player skill, not just our ability to calibrate current events. By this alternative measure, FRAA is not performing well enough, in part because resources like Statcast did not exist when FRAA’s range calculation methods were last updated.

How do you know if a metric is good? First and foremost, the values need to make sense. There is no substitute for expert judgment, and in our collective view at BP, RDA’s estimates make sense across the board. Unfortunately, many methods that aren’t that good can look promising if they get the top and bottom guys somewhat right. So, we needed more objective criteria. 

We decided that the benchmark most likely to serve as a severe test of range metric quality is the (1) year-to-year reliability (aka “stickiness”) of (2) individual ratings for (3) players who changed teams, preferably (4) with good year-to-year calibration. For us, this follows from the concept of fielder skill, and the necessity of separating the contributions of individual players from their teams. Fielder ratings should not be getting polluted, even inadvertently, from the quality of their team’s positioning decisions or neighboring fielders. The cleanest way to take these confounders out of the system is to rip the band-aid off, and evaluate your metric on its ability to correctly rank, year after year, the best and worst fielders who have been shipped off to other teams.

Between 2016 and 2022, we identified over 1,000 fielders who fit this category, excluding pitchers and catchers, providing a robust sample. Using Spearman correlations to compare the consistency of their fielder rankings, weighted by number of fielded plays we credit to each fielder, here is how the existing range portion of FRAA stacked up to OAA and the range portions of DRS and UZR, for fielders ranked by all of these systems. Note that these and other numbers in our tables have been updated since our Annual essay, to incorporate the benefits of the imputation scheme we have since developed for plays missing Statcast batted ball data[1]:

Table 3: Year-to-Year Spearman Reliability, Team-Changers by Position
2016–2022 (higher is better)

Position OAA DRS UZR FRAA Players
1B 0.23 0.07 0.15 0.17 193
2B 0.28 0.24 0.08 0.14 233
3B 0.15 0.25 0.11 0.07 230
SS 0.24 0.27 0.14 0.24 147
LF 0.44 0.32 0.23 0.13 282
CF 0.36 0.20 0.12 0.27 195
RF 0.40 0.24 0.19 -0.06 249

By this measuring stick, FRAA’s performance is not terrible—it actually holds its own at first base and shortstop—but it is not great either. A difference of a few points doesn’t matter, as these measurements all have a standard deviation of about .1 over bootstrap resampling. (Hence, we also keep the shading of our leaders light.) But the gap between FRAA and OAA/DRS is consistent. On average, FRAA certainly performs worse:

Table 4: Overall Year-to-Year Reliability, Team-Changers, 2016–2022 (higher is better)

OAA DRS UZR FRAA
0.31 0.23 0.15 0.12

There is no shame in this: the competing metrics (even UZR) have access to resources that FRAA does not, so it is not surprising that they perform better. But there is no need to be satisfied with this state of affairs either. And we are not.
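For readers who want to try this kind of benchmark themselves, here is a minimal sketch of a play-weighted Spearman correlation with a bootstrap estimate of its spread. It assumes the weighted Spearman is computed as a weighted Pearson correlation on average ranks, which is one common construction and not necessarily the exact procedure behind the tables above. All names and sample values are hypothetical.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def weighted_pearson(x, y, w):
    """Pearson correlation with observation weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


def weighted_spearman(x, y, w):
    """Weighted Pearson correlation applied to the ranks of x and y."""
    return weighted_pearson(average_ranks(x), average_ranks(y), w)


def bootstrap_sd(x, y, w, n_boot=200, seed=0):
    """Standard deviation of the statistic over resampled players."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys, ws = [x[i] for i in idx], [y[i] for i in idx], [w[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # correlation is undefined when one side is constant
        draws.append(weighted_spearman(xs, ys, ws))
    mean = sum(draws) / len(draws)
    return math.sqrt(sum((d - mean) ** 2 for d in draws) / len(draws))


# Hypothetical team-changers: rating in year 1, rating in year 2, fielded plays.
year1 = [5.0, 3.0, -1.0, 0.5, -4.0, 2.0]
year2 = [4.0, 1.0, 0.0, 1.5, -3.0, -2.0]
plays = [300, 250, 120, 400, 200, 90]

rho = weighted_spearman(year1, year2, plays)
spread = bootstrap_sd(year1, year2, plays)
```

With only a handful of players the bootstrap spread is large, which is the same reason small gaps between systems in the tables should not be over-read.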

The Improved Performance

RDA incorporates advanced resources like Statcast batted ball data. It also incorporates the foundations of our Deserved metrics, especially the concept of principled skepticism. This means that we dole out only partial credit for each play, granting more credit only when a fielder shows a consistent pattern within a single season, positive or negative. But RDA also benefits from original analysis, experimentation, and our willingness to think broadly about the driving causes of good defense. The extra work is unfortunately required, because we still have a significant asymmetry of information: We do not have access to MLB’s fielder coordinates, nor do we have a staff of video analysts who study every play. And our goal is to create the best metric we can while relying entirely on data in the public domain.
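The idea of doling out partial credit can be illustrated with a simple shrinkage rule: a raw total is scaled toward zero, and the scaling relaxes as the fielder accumulates more plays. This is a generic sketch, not the RDA model, whose internals are not described here. The constant `prior_plays` is a hypothetical tuning value chosen only for the example.

```python
def partial_credit(raw_outs_above_expected, plays, prior_plays=200.0):
    """Shrink a raw total toward zero based on sample size.

    `prior_plays` is a made-up constant: with that many plays a fielder
    keeps half of the raw credit, and more plays retain a larger share.
    """
    if plays < 0 or prior_plays <= 0:
        raise ValueError("plays must be >= 0 and prior_plays must be > 0")
    weight = plays / (plays + prior_plays)
    return weight * raw_outs_above_expected


# The same raw +10 is trusted more when it comes from a larger sample.
small_sample = partial_credit(10.0, 200)   # 5.0
large_sample = partial_credit(10.0, 600)   # 7.5
```

The design choice is that extreme raw totals built on few plays are treated as mostly noise, while a pattern sustained over many plays earns most of its face value.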

Fortunately, we seem to have succeeded. Let’s show the two previous tables again, and this time we will add RDA into the comparison for those same players:

Table 5: Year-to-Year Spearman Reliability, Team-Changers by Position
2016–2022 (higher is better)

Position OAA DRS UZR FRAA RDA Players
1B 0.23 0.07 0.15 0.17 0.30 192
2B 0.28 0.24 0.08 0.14 0.28 230
3B 0.15 0.25 0.11 0.07 0.25 229
SS 0.24 0.27 0.14 0.24 0.32 146
LF 0.44 0.32 0.23 0.13 0.59 282
CF 0.36 0.20 0.12 0.27 0.65 195
RF 0.40 0.24 0.19 -0.06 0.55 248

As you can see, RDA matches or beats all other systems across the infield, and runs away with it at the outfield positions. The overall average correlations across all of these positions tell a similar story:

Table 6: Overall Year-to-Year Spearman Reliability, Team-Changers, 2016–2022 (higher is better)

OAA DRS UZR FRAA RDA
0.31 0.23 0.15 0.12 0.43

RDA is particularly adept at resisting “house effects” of particular teams, which might be driven by better positioning or the effects of neighboring fielders. Consider Table 7 (below), which takes the team-changer values from Table 6 and subtracts the corresponding values when we look at all fielders, not just those who changed teams. For this comparison, we want the net penalty to be either 0 or somewhat positive:

Table 7: Overall Year-to-Year Reliability Penalty, By Player Status, 2016–2022 (closer to zero is better)

Cohort OAA DRS RDA
Team Changers +0.31 +0.23 +0.43
Everyone +0.33 +0.29 +0.39
Net Penalty -0.02 -0.06 +0.04
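The Net Penalty row is simply the team-changer correlation minus the everyone correlation for each metric. A short check of that arithmetic, using the values from the table above:

```python
# Correlations copied from the table: team-changers and all fielders.
team_changers = {"OAA": 0.31, "DRS": 0.23, "RDA": 0.43}
everyone = {"OAA": 0.33, "DRS": 0.29, "RDA": 0.39}

# Net penalty = team-changers minus everyone, rounded to two decimals.
net_penalty = {
    metric: round(team_changers[metric] - everyone[metric], 2)
    for metric in team_changers
}
```

A negative value means the metric is more consistent for players who stayed put, which is what a house effect would look like.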

OAA arguably has some house effects influencing its metric, but not much. By a slightly larger margin, RDA positively resists the effect of a player’s team and neighboring fielders. DRS demonstrates the largest negative differential, showing much more consistency in rating players who remain with the same teams. The overall hierarchy between metrics is maintained when all players are considered, which supports the validity of using the team-changer subset. In defense of DRS, there is something to be said for recording how well a player performs on a particular team at a particular time. But from our standpoint, the team-independent skill demonstrated by the fielder is of greater interest and better reflects their likely contribution.

We also mentioned a fourth criterion above: year-to-year calibration. One regular criticism of defensive metrics is that, even when they are able to consistently grade defenders as positive or negative, the values too often show extreme variance from year to year, even though it is unlikely a typical player’s skill is changing much. One way to measure a metric’s ability to resist this yo-yo effect is the Pearson correlation of its values for fielders from year to year. The Spearman correlation, which we used above, compares the consistency of a metric’s rankings, making it a bit more robust. The Pearson, on the other hand, evaluates the consistency of the actual values, and is analogous to the normalized mean squared error. So, let’s take Table 6 and switch from Spearman to Pearson correlations:

Table 8: Overall Year-to-Year Pearson Reliability, Team-Changers, 2016–2022 (higher is better)

OAA DRS UZR FRAA RDA
0.13 0.06 0.06 0.04 0.43
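To see why a metric can hold up on Spearman but sag on Pearson, consider a toy case in which the ordering of fielders is identical in both years but one value swings wildly. The numbers below are invented for illustration and the helper functions are a plain-Python sketch (no ties are handled in the ranking, which is fine for this example).

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks 1..n; assumes no tied values (true for this example)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Same ordering in both years, but the top fielder's value balloons in year 2.
year1 = [1.0, 2.0, 3.0, 4.0, 5.0]
year2 = [1.0, 2.0, 3.0, 4.0, 50.0]

rank_consistency = spearman(year1, year2)   # ordering is perfectly preserved
value_consistency = pearson(year1, year2)   # the values are much less stable
```

In this toy case the rank correlation is perfect while the value correlation drops to roughly 0.74, the same direction of gap the table shows for the competing metrics.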

All of the competing metrics, particularly the state-of-the-art ones, take massive hits in their year-to-year Pearson correlations relative to their Spearman estimates. This could mean that a fielder’s value is necessarily non-linear and/or inconsistent. But that hypothesis is belied by the fact that RDA’s stickiness suffers no penalty whatsoever when we shift from looking at consistency of fielder ranks to looking at the actual values assigned to those fielders. As such, RDA not only consistently ranks fielders from year to year, but it goes further and assigns them much more consistent values. RDA accomplishes this even though each (full) season is modeled with no knowledge of what happened to any player or batted ball in any other season. In fairness, RDA values can be more compressed than other metrics, but that appears to be more of a feature than a bug, and the range of any metric’s values makes no difference to the rank correlations discussed above.

RDA also seems to perform fine in evaluating pitcher and catcher range. RDA sees the value of both as fairly minuscule, perhaps because it questions whether some pitcher plays would have been more routine for other infielders to make. OAA does not appear to evaluate pitchers or catchers, but DRS seems to rate them reliably:

Table 9: Overall Year-to-Year Spearman Reliability, Team-Changers, 2016–2022 (higher is better)

Position DRS FRAA RDA
C 0.26 0 0.33
P 0.21 0.20 0.12

Last but not least, RDA seems poised to function well going forward. New restrictions on fielder positioning can only make RDA’s job easier. Moreover, while RDA seems to work fine with both Trackman and Hawk-Eye-based systems, its performance in the first (full) seasons of the Hawk-Eye system is eye-opening. Again, the weighted Spearman correlations with team-changers:

Table 10: Overall Year-to-Year Reliability, Team-Changers, 2021–2022 (higher is better)

OAA DRS UZR FRAA RDA
0.22 0.23 0.16 0.11 0.57

Perhaps 2021 and 2022 were just randomly good years for RDA, but if Hawk-Eye is truly adding more accuracy to batted ball measurement, it makes sense that RDA would benefit disproportionately from it. In fact, the benefits of Hawk-Eye seem to be so remarkable that any defensive system, public or private, that designed its fielding model before Hawk-Eye may want to revisit it for possible improvements.

With all this said, please remember that evaluating defensive accuracy remains tricky. Although reliability is arguably the most important benchmark by which to grade a metric, at least when you don’t know the “right” answer, the fact that you are measuring something consistently does not automatically mean you are measuring it correctly, a point we have made previously. Unlike with batters or pitchers, we cannot simply take next year’s OPS or RA9 and see how well it has been “predicted,” because fielders do not have an obvious equivalent metric. Without a consensus ground truth, it’s difficult to agree upon what is “right” and what is not. With that said, we can’t think of a single good metric that doesn’t start by being incredibly reliable, nor can we think of a baseball metric that is highly reliable and doesn’t also provide useful information. 

Why Does RDA Work?

How are we able to provide a credible alternative to DRS and OAA without access to their additional resources? 

It helps that the outcome of many balls in play is predetermined by the nature of the batted ball itself. On average, they will be an out or hit of some kind regardless of who fields the position and where their team orders them to field it.

That leaves, however, the plays where additional factors do matter. DRS and OAA respond by focusing on the fielder’s perceived location during each play. Our approach is the opposite: to zoom out, and consider how fielder positioning is part of a larger process.

We begin by noting that every fieldable ball has an outcome driven by (at least) three factors relevant to this discussion: (1) batted ball characteristics, (2) fielder skill and (3) fielder positioning. We know the outcome of the play, and we know most of the relevant batted ball characteristics: they include launch speed, launch angle, and the estimated bearing[2]. That leaves fielder skill and fielder positioning, and in order to solve for the former we need to have some sense of the latter. If fielder positioning is consistent across baseball in its out-generating effects—at least on average—then we can treat individual deviations from it as random and we can solve for fielder skill, as it is the only remaining unknown (at least in this discussion). But if fielder positioning is dynamic and truly unknown from play to play, we could have issues.
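As a rough illustration of the reasoning above, suppose positioning effects average out across the league. Then a league-wide out rate for similar batted balls can stand in for the expected outcome, and what is left over per fielder is an estimate of skill. The sketch below bins plays by launch speed, launch angle, and bearing. The bin width, field names, and sample plays are all assumptions made for the example, and this is far simpler than whatever model RDA actually uses.

```python
from collections import defaultdict


def bin_key(speed, angle, bearing, width=10.0):
    """Coarse bin for a batted ball; the 10-unit width is arbitrary."""
    return (int(speed // width), int(angle // width), int(bearing // width))


def league_out_rates(plays):
    """League-wide out rate for each batted-ball bin."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [outs, plays]
    for p in plays:
        key = bin_key(p["speed"], p["angle"], p["bearing"])
        counts[key][0] += p["out"]
        counts[key][1] += 1
    return {key: outs / n for key, (outs, n) in counts.items()}


def fielder_residuals(plays):
    """Outs minus league-expected outs, summed per fielder."""
    rates = league_out_rates(plays)
    residuals = defaultdict(float)
    for p in plays:
        key = bin_key(p["speed"], p["angle"], p["bearing"])
        residuals[p["fielder"]] += p["out"] - rates[key]
    return dict(residuals)


# Hypothetical plays: four similar ground balls split between two fielders.
sample = [
    {"fielder": "A", "speed": 92.0, "angle": -5.0, "bearing": 12.0, "out": 1},
    {"fielder": "A", "speed": 95.0, "angle": -3.0, "bearing": 15.0, "out": 1},
    {"fielder": "B", "speed": 93.0, "angle": -8.0, "bearing": 11.0, "out": 0},
    {"fielder": "B", "speed": 97.0, "angle": -2.0, "bearing": 18.0, "out": 0},
]

skill_estimates = fielder_residuals(sample)
```

All four balls land in the same bin with a league out rate of 0.5, so fielder A ends at +1.0 and fielder B at -1.0. If positioning were not roughly consistent across teams, part of that residual would reflect where the fielder was told to stand and not how well the fielder played.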

Thus, ideally we find some surrogate for fielder positioning on each play. To do this, we consider fielder positioning as a part of an overall process rather than a mere set of coordinates for each play. To be more specific, we consider team positioning of fielders as driven by (at least) two sub-factors: (1) shared team goals and (2) batter outcomes by fielding position. 

Sub-factor one recognizes that opposing teams share the goal of getting the batter out. If all opposing teams are trying to get the batter out, and they all have access to enhanced fielder data and can watch each other’s strategies to see what works best, it is reasonable to assume they will gravitate toward an optimal fielding strategy or strategies for each batter—whether it be a similar set of fielding positions or some other approach that provides the most comparable result. If so, whatever the particular fielding alignment turns out to be for a batter, we can assume that it will have similar overall results on average, conditional on the skills of each team’s fielders. In other words, if teams are consistently optimizing for the best outcome, then we can assume that differences between teams in fielder positioning are minimal, random, or a bit of both. The benchmarks cited above suggest that these assumptions do in fact hold, at least on average.

The second sub-factor complements the first: Different batters tend to have different results when their balls in play are fielded by different positions, because few hitters hit with the same power and authority to all locations. Thus, teams know not only what strategies are considered best practice for a given batter (the first component, above), but they (and we as analysts) know which fielded positions typically bode well or poorly for each batter’s balls in play. Bound up in this factor is the team’s ability to position fielders optimally, and the ability of a batter to generate spin and other factors that affect the success of a struck ball beyond those currently disclosed by MLB. And, this is a factor we can directly control for.

Because fielder positioning is a function of (at least) these two other factors, it can be recharacterized as a mediator of a larger process, driven by shared team goals and typical batted ball results to each fielding position. If so, actual fielder positioning becomes ignorable in the larger picture for the same reasons that the pathways above it are themselves ignorable or accounted for in our system. If our stated assumptions hold on average about the drivers of fielder positioning in general, then the precise location of each fielder on each play is no longer necessary to know, at least with respect to our ability to paint an overall picture of fielding skill. (It would be nice to have the additional information, but that is not the same thing as it being necessary.) Without that additional information, we cannot grade each individual play with the precision we would like, and the values could be a smidge more volatile, but we also are not trying to grade individual plays. Rather, we are trying to objectively measure the displayed fielding skill over a sample of plays, and ultimately a full season. Plus, from Savant data we at least have the baseline knowledge of whether fielders are in a shifted or standard configuration.

As a result, RDA arguably is trying to answer a slightly different question than DRS and OAA do. DRS and OAA seem to be asking whether a particular play was above- or below-average for that fielder under each system’s assumptions about the challenges of that particular play. RDA, on the other hand, asks whether a fielder’s overall play was consistently above- or below-average in light of the extent to which outs should have been made by somebody on those plays. 

The OAA/DRS approach has the potential to give you more detail, sometimes, about what a particular fielder actually did. But the RDA approach arguably tells you more about how a fielder would be expected to perform under typical circumstances, and at the very least establishes an informed prior for how well a fielder is performing. Answers to both questions offer value, and the more answers readers have, the better.

Conclusion

DRP and RDA values are now available for players and teams on our leaderboards. As we roll out the new metric, we would appreciate hearing feedback from you on the numbers you are seeing, particularly if you notice any possible issues or have suggestions on how to expand our list of Frequently Asked Questions. Our goal is to get this right, or at least as right as reasonably possible, and our eagle-eyed readers are some of our best evaluators.


[1]All DRS, UZR, and OAA values were provided courtesy of our friends at FanGraphs.

[2]The actual bearing of each ball off the bat, despite being measured, is not made publicly available by MLB. It can be partially estimated from the stringer estimates of where the ball was actually fielded. We would appreciate the actual bearing measurement being published in the public domain.

Jon Crate
2/08
This is, as the kids say, very boffo.
jamiewonders
2/08
Was super psyched when I saw this in the Annual - FRAA was one of the few flaws I saw in using WARP, and it's awesome to see it upgraded.

I also love the philosophy of using publicly available data, and it's a shame that MLB still hasn't released horizontal launch angles for batted balls (among other withheld data).
Richard Cramer
2/08
Impressive in magnitude. Convincing in results.
Michael Pendergast
2/08
Like it.
Matthew Gold
2/08
I can't comment on the calculations, but it is surprising that IKF, who was benched by the Yankees during the postseason, largely (as far as we could tell) on the basis that his fielding was subpar, would be rated as one of the best fielding shortstops in all of baseball.
Harry Pavlidis
2/08
On the sole basis of RDA, which covers range, he does rate very well. But overall, which is reflected by his DRP (which includes throwing and base-advancement related information) he's a bit below average.
Harry Pavlidis
2/08
*otherwise a bit below average, still positive overall at least in 2022
John Mayne
2/14
What do you make of the critique of the stickiness numbers here?

https://hareeb.com/2023/02/13/oaa-and-the-new-baseball-prospectus-defensive-metric-range-defense-added-infield-edition/