In the performance analysis community, it has long been shouted that
on-base percentage (OBP) is the single most important statistic in the cadre of official baseball
metrics. It’s so important, in fact, that one of Baseball Prospectus’
original stated goals was to have ESPN post OBP in their broadcasts.
Having achieved that modest target, it’s time to suggest some changes. Certainly, suggesting any changes to OBP is likely to draw shouts of protest from across the baseball community, but I’ll disregard the growing din and take it on anyway.
OBP is designed to measure the percentage of times that a player reaches
base. Of course, it’s not really that simple because no statistic is truly
great unless you can base it on someone’s opinions. OBP excludes fielder’s
choices (despite some colorful arguments to the contrary) because the official scorer has ruled that
the batter would have been out if the defense hadn’t chosen to retire a
different runner. There are all sorts of holes in this line of thinking,
but it’s not the issue I’m looking to tackle here.
The other big piece missing from OBP is reached on error
(ROE), which is also excluded. (For BP’s prior work on ROE, see Keith Woolner’s articles here and here.) If you watch enough baseball, thoughts start
to creep into your head, wondering whether certain players can “generate”
errors to get on base. The poster boy for this line of thinking is
Ichiro Suzuki (or Ichiro! if you live within 100 miles of
Derek Zumsteg). Ichiro!’s speed and batting style certainly appear to make
defenses rush, maybe bobbling a few more balls and leaving him standing on
first after a routine ground ball for anyone else. Others may argue that
there’s a case for players who hit the ball harder than others. Perhaps
they too generate errors, but instead of speed making fielders rush, it’s
the velocity of the ball forcing the error. Thus, since those ROE are the
result of some talent of the batters and not necessarily the fault of the
defense, those plate appearances, rather than being counted against OBP,
should be counted for OBP.
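To make the proposed adjustment concrete, here is a minimal sketch of both calculations. It assumes the official OBP formula, (H + BB + HBP) / (AB + BB + HBP + SF), and treats a reached-on-error as what it already is in the scorebook: an at-bat without a hit. Under that assumption, only the numerator changes. The `BattingLine` type, its field names, and the sample numbers are all made up for illustration; the denominator used for the tables in this article may differ.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one batter (hypothetical container, illustrative names)."""
    ab: int   # at-bats (already includes plate appearances that ended in an ROE)
    h: int    # hits
    bb: int   # walks
    hbp: int  # hit by pitch
    sf: int   # sacrifice flies
    roe: int  # times reached on error


def obp(line: BattingLine) -> float:
    """Official OBP: an ROE sits in the denominator via AB and counts as a failure."""
    denom = line.ab + line.bb + line.hbp + line.sf
    return (line.h + line.bb + line.hbp) / denom if denom else 0.0


def obp_with_roe(line: BattingLine) -> float:
    """Adjusted OBP: same denominator, but ROE is credited as reaching base."""
    denom = line.ab + line.bb + line.hbp + line.sf
    return (line.h + line.bb + line.hbp + line.roe) / denom if denom else 0.0


# Invented season line, not a real player.
example = BattingLine(ab=500, h=140, bb=50, hbp=5, sf=5, roe=10)
old, new = obp(example), obp_with_roe(example)
print(f"OldOBP {old:.3f}  NewOBP {new:.3f}  Diff {new - old:.3f}")
```

Because the denominator is untouched, the difference between the two figures is simply ROE divided by that denominator, which is why a handful of extra errors can move a part-time player's number noticeably.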
There are several problems with this line of thinking. First and
foremost, there’s still the inherent problem of the official scorer and his
tendencies to rule various identical events as hits or errors, depending on
other factors not relevant to the play at hand. Players who play in front
of “hometown” official scorers will have more of their borderline calls
ruled as hits than players whose scorers hold the defense to a higher
standard.
Second, there may be a difference between infield and outfield
ROE. While there’s certainly an argument that players can generate ROE in
the outfield by hitting a plethora of nearly fieldable line drives, most of
the influence we’re searching for empirically comes from infielders and
their rush to throw out a speedy runner.
Regardless, it’s still educational to peruse the stats and see who would
benefit the most from an adjustment of OBP thinking. If the “pressure on
the defense” idea is true, we should see plenty of fast players at the top
of the list, while the glacier racers should be down at the bottom. Since
2000, here are the 25 players with at least 500 PA whose OBP would be helped
the most by including ROE:
Batter                PA  OldOBP  NewOBP  Diff
Pat Meares           798    .281    .306  .025
Devon White          600    .333    .355  .022
Calvin Murray        692    .311    .332  .021
Olmedo Saenz         813    .341    .362  .021
Ty Wigginton         944    .322    .341  .019
Darren Bragg         657    .317    .336  .019
Mike Lansing         930    .290    .309  .019
Damon Buford         645    .307    .326  .019
Bobby Estalella      798    .321    .340  .019
Rey Ordonez         1317    .288    .307  .019
Aaron Boone         2100    .328    .347  .019
Ken Harvey           759    .335    .354  .019
Jose Macias         1487    .295    .313  .018
Wendell Magee        795    .294    .312  .018
Tsuyoshi Shinjo      960    .296    .314  .018
Dee Brown            632    .283    .301  .018
Bill Haselman        546    .315    .333  .018
John McDonald        589    .267    .285  .018
Rondell White       2036    .343    .361  .018
Keith Ginter         725    .342    .360  .018
Jack Wilson         1892    .294    .311  .017
Raul Casanova        573    .309    .326  .017
Jeff Cirillo        2140    .342    .359  .017
Shawon Dunston       577    .272    .289  .017
Greg Vaughn         1450    .335    .352  .017
Well, that didn’t exactly go as planned. Olmedo Saenz? Ken Harvey?
Pat Meares? Greg Vaughn? These are not exactly the kind of guys who make
fielders rush throws. Our case study, Ichiro!, comes in at number 67 out of 464.
Here’s a look at the bottom 25:
Batter                PA  OldOBP  NewOBP  Diff
Doug Mientkiewicz   2011    .375    .380  .005
Luis Lopez           621    .291    .296  .005
Dave Hansen          677    .383    .388  .005
Shawn Wooten         704    .317    .322  .005
David Ortiz         2076    .350    .355  .005
Jay Gibbons         1646    .318    .323  .005
Chad Kreuter         634    .372    .377  .005
Daryle Ward         1255    .308    .313  .005
Mo Vaughn           1366    .356    .361  .005
Brady Anderson      1220    .343    .348  .005
Steve Cox           1380    .341    .346  .005
Barry Bonds         2660    .526    .530  .004
Dave Martinez        772    .346    .350  .004
Austin Kearns        877    .381    .385  .004
Troy O'Leary        1447    .318    .322  .004
Greg Norton          966    .318    .322  .004
Jason Giambi        2905    .445    .449  .004
Todd Pratt           749    .375    .379  .004
Carlos Pena         1257    .322    .325  .003
Joe Crede           1079    .300    .303  .003
Brian Schneider      965    .310    .313  .003
Russ Branyan        1221    .320    .322  .002
Nick Johnson         990    .376    .378  .002
Armando Rios         966    .322    .323  .001
Travis Hafner        633    .362    .363  .001
To check whether there is any correlation between speed and
the likelihood of ROE, I’ll attach a metric not terribly unlike the old Bill
James Speed Score. This one, which I’ll call Speed Factor for now, will be
simpler. Speed Factor will take the total stolen base attempts multiplied
by the stolen base success rate and again multiplied by the percentage of
doubles and triples that are triples. I’ve excluded events like runs scored
per time on base because they’re entirely too team-dependent. And while
triples rate is highly park dependent, it’s a small adjustment that doesn’t
dramatically alter the scores, so I’ll defer to Occam’s Razor for now. For
the math inclined:
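Read literally, the verbal definition above can be sketched as follows. This is an assumption-laden reading rather than the exact formula behind the tables: the function name, the argument names, the handling of zero denominators, and the absence of any scaling are all guesses, so the output should not be expected to reproduce the published scores.

```python
def speed_factor(sb: int, cs: int, doubles: int, triples: int) -> float:
    """Literal reading of the text: attempts * success rate * triples share.

    sb, cs           -- stolen bases and times caught stealing
    doubles, triples -- extra-base hits used for the "triples share"
    Returns 0.0 when a rate is undefined (no attempts, or no doubles/triples);
    that fallback is an assumption, not something the article states.
    """
    attempts = sb + cs
    two_or_three = doubles + triples
    if attempts == 0 or two_or_three == 0:
        return 0.0
    success_rate = sb / attempts           # stolen base success rate
    triple_share = triples / two_or_three  # share of doubles+triples that are triples
    return attempts * success_rate * triple_share


# Invented line: 40 SB, 10 CS, 20 doubles, 5 triples.
print(speed_factor(sb=40, cs=10, doubles=20, triples=5))
```

Under this reading, attempts times success rate collapses to raw stolen bases, so a batter with no steals or no triples scores exactly zero regardless of anything else he does.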
A quick sort on Speed Factor shows a top 10 of:
Batter            Speed Factor
Juan Pierre             45.097
Dave Roberts            43.951
Luis Castillo           43.922
Alex Sanchez            34.517
Tom Goodwin             32.000
Roger Cedeno            31.927
Carl Crawford           30.823
Carlos Beltran          29.840
Tony Womack             29.647
Cristian Guzman         29.537
And the bottom:
Batter            Speed Factor
Eddie Perez               .000
Todd Hundley              .000
Tony Clark                .000
Matt LeCroy               .000
Jason Phillips            .000
Albert Belle              .000
Cal Ripken Jr.            .000
Greg Colbrunn             .000
Greg Myers                .000
Tom Wilson                .000
In general, the metric fits well with what we see on the field. But how
does it fit with the increase in OBP by adding ROE? Not at all. The
relationship between OBP with ROE and Speed Factor borders on complete
randomness. (Again, for the numerophiles: R-squared = 0.0106). This lack
of relationship may be due in some part to the very limited range of
increase from ROE (a maximum of .025 in this study), but even increasing the
precision of OBP to further decimals reveals minimal improvement. In this
case, it appears that speed in general is not related to the propensity to
reach base on an error.
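For readers who want to run the same kind of check on their own data, here is a minimal sketch of R-squared for a simple two-variable comparison, computed as the square of the Pearson correlation. The function name and the sample lists are illustrative assumptions; the article does not say which tool produced its figure.

```python
def r_squared(xs: list[float], ys: list[float]) -> float:
    """Squared Pearson correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Invented numbers: Speed Factor on one side, OBP gain from ROE on the other.
speed = [45.1, 30.8, 12.0, 3.5, 0.0]
gain = [0.012, 0.009, 0.014, 0.008, 0.011]
print(round(r_squared(speed, gain), 4))
```

An R-squared of 0.0106 means Speed Factor accounts for roughly 1% of the variation in the other variable, which is why the relationship reads as noise.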
This relational failure holds true for the other proposed theories as
well. Returning to the question of players who hit the ball harder than
others, I tried mapping a variety of power statistics and came up with
nothing. Additionally, combining Speed Factor with any kind of power
statistics yielded virtually no correlation. Even looking at batters who
put a larger percentage of balls in play yielded virtually no correlation.
(One final possible justification for including ROE in OBP is that OBP
with ROE may map better to overall team run scoring. This test also fails.
From 2000 to 2003, OBP maps to team run scoring marginally better than OBP
with ROE does.)
As such, using arguments about various players “generating errors” by
putting pressure on the defense to justify including ROE in OBP appears
unfounded. The players who most often reach on an error have no
characteristics that distinguish them from those who do not. This all
boils down to the fact that some players are simply lucky and some are not.
Of course, it could also mean there should be better regulation of official
scorers and their decisions, but that’s another article altogether.