Lies, Damned Lies: Making RBIs Useful

There isn’t a whole hell of a lot to do in Lansing, Michigan. There aren’t any mountains, and there isn’t any seacoast. The nearest amusement park is 400 miles away. There’s a minor league ball team there now, but there wasn’t when I grew up. There’s a college there–a big, state university–with lots of college parties, and lots of college girls, and a lot of kids from Lansing start behaving like college students long before they really should. But even those with precocious synapses manage to sneak in a few years of relative innocence before learning what sororities and beer bongs are, and my synapses were late to the party. There’s a big city not too far away, but to paraphrase W.C. Fields, the prevailing sense that one has when one is in Detroit is that, all things considered, one would rather be in Lansing.

So what you do a lot is drive. You drive past the cow farms and the meadows and rolling hills or whatever the hell they’re called in the TripTik and the dilapidated country town with the antique store that your mother likes so much. You drive with your dad in an American-made sedan and you listen to Ernie Harwell and the Tigers. You drive at 62 m.p.h. past a shuttered-up farmhouse with peeling gray paint and a half-working windmill, and Steve Balboni stands there like a house by the side of the road and watches Frank Tanana’s fastball go by, or that’s what Ernie tells you. You drive and you listen and you daydream and you talk about baseball.

I can distinctly remember, on one of those lazy summertime drives with my dad, talking to him about the relative merits of batting average and RBIs. Much better, we were agreed, to have a run producer like Kirk Gibson in your lineup than a batting average specialist like Wade Boggs. Baseball, after all, is won and lost with runs; an RBI, by definition, produces a run, while a base hit doesn’t. Boggs had to rely on Jim Rice or Dwight Evans to drive him in, but Gibby got things done all by himself. We felt this to be a sophisticated and almost inscrutable line of argument. A few years later, we’d use the same reasoning to argue with anyone who would listen that Cecil Fielder had been robbed by Rickey Henderson in the MVP voting.

Nowadays, of course, my father and I have developed a more nuanced view of baseball and its statistics. We could tell you, for example, that some of the impetus behind our enthusiasm for the RBI was the result of the peculiar park effects of Tiger Stadium, which tended to increase run scoring and power categories, while depressing batting average. We could also tell you that our skepticism about batting average was correct: it is a relatively weak predictor of run scoring.

Surely, the RBI has its problems, and we could tell you that too. The usual modernist argument against the RBI, a case that Dayn Perry makes eloquently, is that it’s extremely dependent upon a player’s context. All else being equal, a hitter who has Derek Jeter and Gary Sheffield hitting immediately in front of him is going to have more RBIs than a hitter who is batting behind Jack Wilson and Rob Mackowiak, simply because the former pairing will be on base so much more frequently.

But what if we could remove these sorts of considerations from the equation? Would the RBI then become useful? I asked Keith Woolner to provide me with the following information for each major league hitter in 2003:

The percentage of the time that a runner on third base scored when he was batting;
The percentage of the time that a runner on second base scored when he was batting;
The percentage of the time that a runner on first base scored when he was batting;
The percentage of the time that the hitter himself scored, by hitting a home run.
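Each of these four figures is a plain ratio of runs scored to opportunities. As a minimal sketch in Python (the function name `scoring_rate` is a hypothetical helper, not anything from Woolner's data set), here is the calculation applied to two lines from the tables below: Gary Sheffield with a runner on third, and Albert Pujols's home runs per plate appearance.

```python
def scoring_rate(runs_scored: int, opportunities: int) -> float:
    """Fraction of opportunities in which the runner (or the batter) scored."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return runs_scored / opportunities


# Gary Sheffield, 2003: 48 of 84 runners on third base scored (Table 1).
sheffield_from_third = scoring_rate(48, 84)

# Albert Pujols, 2003: 43 home runs in 685 plate appearances (Table 4).
pujols_hr_rate = scoring_rate(43, 685)

print(f"Sheffield from 3B: {sheffield_from_third:.1%}")
print(f"Pujols HR/PA:      {pujols_hr_rate:.1%}")
```

The minimum-opportunity cutoffs used in the tables would be applied before ranking; they are left out here to keep the sketch short.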

Let’s take a look at the leaders in the first category. Which hitters were most effective at driving in runners from third base?

Table 1: Percentage of Runners Scored from 3B (minimum 50 opportunities)


Player           OPP  R   RATE
Sheffield_Gary    84 48  57.1%
Stynes_Chris      50 28  56.0%
Stewart_Shannon   64 34  53.1%
Cintron_Alex      51 27  52.9%
Sweeney_Mike      68 36  52.9%
Helton_Todd       72 38  52.8%
Anderson_Marlon   63 33  52.4%
Randa_Joe         51 26  51.0%
Walker_Todd       63 32  50.8%
Lee_Carlos        77 39  50.7%
LEAGUE AVERAGE           38.6%

The list is dominated by what we might call good contact hitters. Power is not so important when it comes to scoring a runner on third; except in very rare instances, any base hit will score the runner. Striking out is bad, since it eliminates the possibility that the runner will score while the batter makes an out, and most of these hitters don’t strike out very much. This is also one spot where having a high walk rate isn’t particularly helpful; with nobody on base and nobody out, a walk is worth exactly as much as a base hit; with two outs and a runner on third, the base hit is worth substantially more. The list, of course, also includes its share of sample size anomalies; I think you’d be hard-pressed to find a manager who would rather have Chris Stynes or Marlon Anderson batting with a runner on third than, say, Vladimir Guerrero.

Moving clockwise across the diamond…

Table 2: Percentage of Runners Scored from 2B (minimum 100 opportunities)


Player           OPP  R   RATE
Stynes_Chris     100 26  26.0%
Casey_Sean       117 30  25.6%
Delgado_Carlos   174 44  25.3%
Anderson_Garret  147 37  25.2%
Abreu_Bobby      161 40  24.8%
Everett_Carl     121 29  24.0%
Sheffield_Gary   143 34  23.8%
Crawford_Carl    122 29  23.8%
Wilson_Preston   176 41  23.3%
Matsui_Hideki    153 35  22.9%
LEAGUE AVERAGE           16.6%

One thing that should jump out immediately is the sharp decline in the league average scoring rate: in any given plate appearance, a runner on third base is going to score almost two-and-a-half times as often as a runner on second. It seems apparent that some of the conventional wisdom about the importance of having runners in ‘scoring position’ is misguided. A base hit scores a runner from second base only about 63 percent of the time, and an out will virtually never score a runner from second unless it’s combined with an error.

The character of hitters on this list is not entirely different from the one that we examined before. It includes a lot of doubles hitters, folks like Hideki Matsui and Bobby Abreu. Our friend Chris Stynes shows up again; if only Jayson Stark had known!

Table 3: Percentage of Runners Scored from 1B (minimum 150 opportunities)


Player           OPP  R   RATE
Pujols_Albert    215 28  13.0%
Thomas_Frank     202 24  11.9%
Edmonds_Jim      172 20  11.6%
Chavez_Eric      221 24  10.9%
Thome_Jim        240 26  10.8%
Guerrero_Vlad    161 17  10.6%
Rodriguez_Alex   231 24  10.4%
Giles_Brian      213 22  10.3%
Guillen_Jose     166 17  10.2%
Jenkins_Geoff    176 18  10.2%
LEAGUE AVERAGE            5.6%

This list, in contrast, is a testament to the importance of isolated power. A team will have a runner on first base far more often than it will have a runner on second or a runner on third–but it requires an extra-base hit to score that runner (a double, incidentally, will score a runner from first base a little bit more than 40 percent of the time). Of course, a team can try and play station-to-station ball, advancing a runner one base at a time, but it will often find that it runs out of outs before it runs out of bases.

In the interest of completeness, we’ll also provide the list of the leaders in home runs per plate appearance (it is true that a batter can sometimes score himself without hitting a home run, such as when he hits a triple and the defense makes an error, but those instances are unusual enough that it should be safe to ignore them).

Table 4: Leaders in HR/PA (minimum 400 PA)


Player         HR  PA  RATE
Lopez_Javy     43 495  8.7%
Bonds_Barry    45 550  8.2%
Edmonds_Jim    39 531  7.3%
Sosa_Sammy     40 589  6.8%
Thome_Jim      47 698  6.7%
Rodriguez_Alex 47 715  6.6%
Thomas_Frank   42 662  6.3%
Pujols_Albert  43 685  6.3%
Sexson_Richie  45 718  6.3%
Sanders_Reggie 31 498  6.2%
LEAGUE AVERAGE         2.8%

I hope that you can see what I’m trying to do here, which is to break the RBI down into its component parts. The percentage of the time that, say, a given hitter knocks in a runner from second base is, in fact, largely a function of his ability. Given the sorts of sample sizes that we’re dealing with, it is also heavily influenced by luck. But save for some negligible differences due to baserunning ability, it does not have very much to do with the abilities of his teammates. It is context-independent. Put another way, the foremost problem with the RBI is that different hitters will be faced with different baserunning states with different frequencies. David Ortiz conducted about 27 percent of his plate appearances last season with a runner on second; Bobby Higginson, just 17 percent.
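To put a rough number on that gap, here is a small Python sketch. It assumes a hypothetical 600 plate appearances for each hitter, assigns both the 2003 league-average rate of scoring a runner from second (16.6 percent), and treats every plate appearance with a runner on second as a single opportunity. Those are simplifying assumptions for illustration, not either player's actual line.

```python
def expected_runs_from_second(plate_appearances: int,
                              share_with_runner_on_second: float,
                              scoring_rate: float) -> float:
    """Expected runners driven in from second base over a season."""
    opportunities = plate_appearances * share_with_runner_on_second
    return opportunities * scoring_rate


PA = 600                   # hypothetical season length, same for both hitters
LEAGUE_RATE_FROM_2B = 0.166  # league average from Table 2

# Identical skill, different opportunity shares (27% vs. 17% of PA).
ortiz_like = expected_runs_from_second(PA, 0.27, LEAGUE_RATE_FROM_2B)
higginson_like = expected_runs_from_second(PA, 0.17, LEAGUE_RATE_FROM_2B)

print(f"27% of PA with a runner on 2B: {ortiz_like:.1f}")
print(f"17% of PA with a runner on 2B: {higginson_like:.1f}")
print(f"Gap from context alone:        {ortiz_like - higginson_like:.1f}")
```

Under these assumptions, two hitters of identical ability end up roughly ten RBIs apart from this one base state alone.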

By removing these sorts of variances, and replacing them with league average figures, we can go a long way toward making the RBI context-neutral. These were the relevant, league average rates in 2003:

Lg3PA = League percentage of Plate Appearances with a runner on 3B = 10.7%
Lg2PA = League percentage of Plate Appearances with a runner on 2B = 20.7%
Lg1PA = League percentage of Plate Appearances with a runner on 1B = 30.9%

With these averages in place, we’re now ready to introduce the CIRBI–the Context-Independent RBI. (Any resemblance to the former Twins slugger is purely intentional):


CIRBI = (R3H * Lg3PA + R2H * Lg2PA + R1H * Lg1PA) * PA + HR

R3H, R2H and R1H represent the percentage of runners that a given hitter knocks in from third base, second base, and first base respectively, as we’ve calculated in the tables above. All of the components of the CIRBI are context-neutral, or at least pretty close to it. You’d probably want to build in an adjustment for park effects, but we can skip that step for now.
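The formula translates directly into code. The sketch below is in Python; the function and constant names are hypothetical, and the sample hitter is simply the 2003 league-average rates from the tables above spread over an assumed 600 plate appearances, since no single player appears in all four tables.

```python
# 2003 league shares of plate appearances with a runner on each base.
LG3PA = 0.107
LG2PA = 0.207
LG1PA = 0.309


def cirbi(r3h: float, r2h: float, r1h: float,
          plate_appearances: float, home_runs: float) -> float:
    """Context-Independent RBI.

    r3h, r2h, r1h are the hitter's rates of scoring runners from third,
    second and first base; the league opportunity shares stand in for
    the hitter's actual opportunities.
    """
    runners_driven_in = (r3h * LG3PA + r2h * LG2PA + r1h * LG1PA) * plate_appearances
    return runners_driven_in + home_runs


# Hypothetical league-average hitter: league rates, 600 PA (assumed),
# and home runs at the league HR/PA rate of 2.8%.
pa = 600
league_average = cirbi(0.386, 0.166, 0.056, pa, 0.028 * pa)
print(f"League-average CIRBI over {pa} PA: {league_average:.1f}")
```

By this sketch a league-average hitter lands a little under 73 CIRBI in 600 plate appearances. Doubling both the plate appearances and the home runs doubles the total.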

Here were the major league leaders in CIRBI in 2003, presented along with their actual RBI totals:

Table 5: 2003 CIRBI Leaderboard


Player          CIRBI    RBI
Delgado_Carlos    138    145
Pujols_Albert     131    124
Sheffield_Gary    131    132
Rodriguez_Alex    128    118
Helton_Todd       124    117
Thome_Jim         124    131
Sexson_Richie     122    124
Wilson_Preston    120    141
Wells_Vernon      120    117
Lee_Carlos        117    113
Anderson_Garret   117    116

One of the nice properties of the CIRBI is that it operates on exactly the same scale as the regular ol’ RBI. A CIRBI total above 100 represents a good season, and a total above 120, an outstanding one. In most cases, indeed, the differences between CIRBI and RBI totals are relatively slight.

Nevertheless, it does seem apparent that the CIRBI is somewhat better correlated with more advanced metrics of productivity. Players like Alex Rodriguez and Albert Pujols rank higher in CIRBIs than they do in RBIs, while Preston Wilson ranks lower, even before accounting for park effects. At the same time, the CIRBI, like the RBI, is able to recognize the strengths of a hitter like Garret Anderson for what they are. While Garret gets a lot of flak around here for his poor plate discipline, he makes contact well, stays healthy, produces a lot of extra-base hits, and is a pretty good guy to have at the plate when you’ve got runners on base.

It should be noted that the CIRBI behaves like a counting stat, rather than a rate stat; all else equal, it will increase linearly with playing time. It should further be noted that the CIRBI, like the RBI, is subject to the vagaries of “clutch” hitting. If a hitter hits uncharacteristically well with runners on base, that will reflect positively on his CIRBI total, even though it is unlikely that it represents any sort of repeatable ability. I wouldn’t expect any Prospectus authors to start using CIRBI in lieu of more robust metrics like slugging percentage and isolated power.

Nevertheless, the CIRBI does have its charms. It could provide potentially powerful ammunition in an MVP debate–we could have used it to point out, for example, that A-Rod was an exceptionally productive hitter with runners on base, even though he might not have had as many opportunities as his counterparts in Boston and New York. It is scaled in such a way that makes it accessible to the more traditional sorts of folks in the baseball community; I don’t think that I could persuade Jim Hendry about the merits of MLVr, but maybe I could sell him on the merits of the CIRBI. The CIRBI does a pretty good job of answering, in the most literal sense of the term, the question of which hitters were most effective at driving runners home. It does a better job of that than the original, context-dependent version of RBI.

That said, I don’t know that you’re going to be seeing CIRBI next to VORP and SNWL on the BP stats page any time soon. The real problem with RBI–and the real problem with CIRBI–is not so much that it is context-dependent, but that it conflates the question of driving in runs with the more important matter of producing runs. Barry Bonds doesn’t do so well in terms of CIRBI, but he’s the best hitter on the planet because he’s going to get on base something like 60 percent of the time this year, creating unprecedented opportunities for the hitters hitting behind him (that the Giants can’t do better than A.J. Pierzynski and Marquis Grissom is another matter entirely). Wade Boggs and Rickey Henderson were better hitters than Kirk Gibson and Cecil Fielder for the very same reason. Driving in runs is only half the battle, and because of the importance of avoiding outs, it’s the less important half.

As Chris Kahrl argues in the case against OPS, a little bit of simplicity can be a dangerous thing. I think my dad, who has never been much of an Ockham’s Razor guy, would agree. Probably not Ernie Harwell, though.

Nate Silver
