June 17, 2004
Another Look at OBP
Do Speedy Batters Force More Errors?
In the performance analysis community, it has long been shouted that OBP is the single most important statistic of the cadre of official baseball metrics. It's so important, in fact, that one of Baseball Prospectus' original stated goals was to have ESPN post OBP in their broadcasts.
Having achieved that modest target, it's time to suggest some changes. Certainly, suggesting any changes to OBP is likely to draw shouts of protest from across the baseball community, but I'll disregard the growing din and take it on anyway.
OBP is designed to measure the percentage of times that a player reaches base. Of course, it's not really that simple because no statistic is truly great unless you can base it on someone's opinions. OBP excludes fielder's choices--despite some colorful arguments to the contrary--because the official scorer has ruled that the batter would have been out if the defense hadn't chosen to retire a different runner. There are all sorts of holes in this line of thinking, but it's not the issue I'm looking to tackle here.
The other big piece missing from OBP is reached on error (ROE), which is also excluded. (For BP's prior work on ROE, see Keith Woolner's articles here and here.) If you watch enough baseball, thoughts start to creep into your head, wondering whether certain players can "generate" errors to get on base. The poster boy for this line of thinking is Ichiro Suzuki (or Ichiro! if you live within 100 miles of Derek Zumsteg). Ichiro!'s speed and batting style certainly appear to make defenses rush, maybe bobbling a few more balls and leaving him standing on first after what would be a routine ground ball for anyone else. Others may argue that there's a similar case for players who hit the ball harder than others. Perhaps they too generate errors, but instead of speed making fielders rush, it's the velocity of the ball forcing the error. Thus, since those ROE are the result of some talent of the batters and not necessarily the fault of the defense, those plate appearances, rather than being counted against OBP, should be counted for OBP.
There are several problems with this line of thinking. First and foremost, there's still the inherent problem of the official scorer and his tendency to rule identical events as hits or errors, depending on other factors not relevant to the play at hand. Players who play in front of "hometown" official scorers will have more of their borderline calls ruled as hits than players whose scorers hold the defense to a higher standard.
Second, there may be a difference between infield and outfield ROE. While there's certainly an argument that players can generate ROE in the outfield by hitting a plethora of nearly fieldable line drives, most of the influence we're searching for empirically comes from infielders and their rush to throw out a speedy runner.
Regardless, it's still educational to peruse the stats and see who would benefit the most from an adjustment of OBP thinking. If the "pressure on the defense" idea is true, we should see plenty of fast players at the top of the list, while the glacier racers should be down at the bottom. Since 2000, here are the 25 players with at least 500 PA whose OBP would be helped the most by including ROE:
Batter             PA    OldOBP  NewOBP  Diff
Pat Meares         798   .281    .306    .025
Devon White        600   .333    .355    .022
Calvin Murray      692   .311    .332    .021
Olmedo Saenz       813   .341    .362    .021
Ty Wigginton       944   .322    .341    .019
Darren Bragg       657   .317    .336    .019
Mike Lansing       930   .290    .309    .019
Damon Buford       645   .307    .326    .019
Bobby Estalella    798   .321    .340    .019
Rey Ordonez        1317  .288    .307    .019
Aaron Boone        2100  .328    .347    .019
Ken Harvey         759   .335    .354    .019
Jose Macias        1487  .295    .313    .018
Wendell Magee      795   .294    .312    .018
Tsuyoshi Shinjo    960   .296    .314    .018
Dee Brown          632   .283    .301    .018
Bill Haselman      546   .315    .333    .018
John McDonald      589   .267    .285    .018
Rondell White      2036  .343    .361    .018
Keith Ginter       725   .342    .360    .018
Jack Wilson        1892  .294    .311    .017
Raul Casanova      573   .309    .326    .017
Jeff Cirillo       2140  .342    .359    .017
Shawon Dunston     577   .272    .289    .017
Greg Vaughn        1450  .335    .352    .017
Well, that didn't exactly go as planned. Olmedo Saenz? Ken Harvey? Pat Meares? Greg Vaughn? These are not exactly the kind of guys who make fielders rush throws. Our case study, Ichiro!, comes in at number 67 out of 464. Here's a look at the bottom 25:
Batter             PA    OldOBP  NewOBP  Diff
Doug Mientkiewicz  2011  .375    .380    .005
Luis Lopez         621   .291    .296    .005
Dave Hansen        677   .383    .388    .005
Shawn Wooten       704   .317    .322    .005
David Ortiz        2076  .350    .355    .005
Jay Gibbons        1646  .318    .323    .005
Chad Kreuter       634   .372    .377    .005
Daryle Ward        1255  .308    .313    .005
Mo Vaughn          1366  .356    .361    .005
Brady Anderson     1220  .343    .348    .005
Steve Cox          1380  .341    .346    .005
Barry Bonds        2660  .526    .530    .004
Dave Martinez      772   .346    .350    .004
Austin Kearns      877   .381    .385    .004
Troy O'Leary       1447  .318    .322    .004
Greg Norton        966   .318    .322    .004
Jason Giambi       2905  .445    .449    .004
Todd Pratt         749   .375    .379    .004
Carlos Pena        1257  .322    .325    .003
Joe Crede          1079  .300    .303    .003
Brian Schneider    965   .310    .313    .003
Russ Branyan       1221  .320    .322    .002
Nick Johnson       990   .376    .378    .002
Armando Rios       966   .322    .323    .001
Travis Hafner      633   .362    .363    .001
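For anyone who wants to reproduce the OldOBP and NewOBP columns above, here is a minimal sketch. It assumes the standard official OBP definition, (H + BB + HBP) / (AB + BB + HBP + SF), and simply moves ROE into the numerator, since a reached-on-error plate appearance is already charged as an at-bat. The function names and the sample stat line are made up for illustration and do not belong to any real player.

```python
def obp(h, bb, hbp, ab, sf):
    """Official on-base percentage: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denom = ab + bb + hbp + sf
    return (h + bb + hbp) / denom if denom else 0.0


def obp_with_roe(h, bb, hbp, ab, sf, roe):
    """Same denominator as OBP, but reaching on an error counts as reaching base."""
    denom = ab + bb + hbp + sf
    return (h + bb + hbp + roe) / denom if denom else 0.0


# Hypothetical stat line (assumption, not a real player's totals).
line = dict(h=150, bb=50, hbp=5, ab=550, sf=5)
old = obp(**line)
new = obp_with_roe(**line, roe=10)
diff = new - old

# Print in the same three-decimal style as the tables.
print(f"OldOBP {old:.3f}  NewOBP {new:.3f}  Diff {diff:.3f}")
```

Because the denominator does not change, the Diff column is just ROE divided by the OBP denominator, which is why the spread between the top and bottom of the list is so narrow.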
To check whether there is any correlation between speed and the likelihood of ROE, I'll attach a metric not terribly unlike the old Bill James Speed Score. This one, which I'll call Speed Factor for now, will be simpler. Speed Factor will take the total stolen base attempts multiplied by the stolen base success rate and again multiplied by the percentage of doubles and triples that are triples. I've excluded events like runs scored per time on base because they're entirely too team-dependent. And while triples rate is highly park-dependent, it's a small adjustment that doesn't dramatically alter the scores, so I'll defer to Occam's Razor for now. For the math-inclined:
Speed Factor = (SB + CS) x (SB / (SB + CS)) x (3B / (2B + 3B))
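Taken literally, the description above can be sketched in a few lines of code. The function and argument names are mine, and so is the choice to score a player with no steal attempts, or no doubles and triples, as zero; the sample numbers are hypothetical rather than any real player's totals.

```python
def speed_factor(sb, cs, doubles, triples):
    """Stolen base attempts x success rate x share of doubles+triples that are triples."""
    attempts = sb + cs
    extra_base_hits = doubles + triples
    # Assumption: no attempts or no doubles/triples means a score of zero.
    if attempts == 0 or extra_base_hits == 0:
        return 0.0
    success_rate = sb / attempts
    triple_share = triples / extra_base_hits
    return attempts * success_rate * triple_share


# Hypothetical player: 40 SB, 10 CS, 30 doubles, 10 triples.
example = speed_factor(sb=40, cs=10, doubles=30, triples=10)
print(f"{example:.3f}")
```

Note that attempts multiplied by success rate is just successful steals, so the score reduces to stolen bases scaled by the triples share. Anyone with no steals or no triples lands at exactly zero.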
A quick sort on Speed Factor shows a top 10 of:
Batter             Speed Factor
Juan Pierre        45.097
Dave Roberts       43.951
Luis Castillo      43.922
Alex Sanchez       34.517
Tom Goodwin        32.000
Roger Cedeno       31.927
Carl Crawford      30.823
Carlos Beltran     29.840
Tony Womack        29.647
Cristian Guzman    29.537
And the bottom:
Batter             Speed Factor
Eddie Perez        .000
Todd Hundley       .000
Tony Clark         .000
Matt LeCroy        .000
Jason Phillips     .000
Albert Belle       .000
Cal Ripken Jr.     .000
Greg Colbrunn      .000
Greg Myers         .000
Tom Wilson         .000
In general, the metric fits well with what we see on the field. But how does it fit with the increase in OBP from adding ROE? Not at all. The relationship between OBP with ROE and Speed Factor borders on complete randomness. (Again, for the numerophiles: R-squared = 0.0106.) This lack of relationship may be due in some part to the very limited range of the increase from ROE (a maximum of .025 in this study), but even carrying OBP out to further decimal places reveals only a minimal increase in the correlation. In this case, it appears that speed in general is not related to the propensity to reach base on an error.
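For readers who want to run this kind of check themselves, here is a rough sketch of an R-squared calculation. It assumes a simple one-variable linear fit, where R-squared is the square of the Pearson correlation; the article does not spell out the exact method used. The data points below are invented purely to show the mechanics and are not the study's data.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R-squared of a simple linear fit."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Assumption: a constant series explains nothing, so report zero.
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


# Invented illustration only: Speed Factor vs. OBP gain from ROE.
speed = [0.0, 5.0, 10.0, 20.0, 40.0]
roe_gain = [0.010, 0.012, 0.008, 0.011, 0.009]
toy_r2 = r_squared(speed, roe_gain)
print(f"R-squared = {toy_r2:.4f}")
```

A value near zero, like the 0.0106 reported above, means Speed Factor explains almost none of the player-to-player variation in the ROE bump.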
This relational failure holds true for the other proposed theories as well. Returning to the question of players who hit the ball harder than others, I tried mapping a variety of power statistics and came up with nothing. Additionally, combining Speed Factor with any kind of power statistic yielded virtually no correlation. Even looking at batters who put a larger percentage of balls in play yielded virtually no correlation.
(One final possible justification for including ROE in OBP is that OBP with ROE may map better to overall team run scoring. This test also fails. From 2000 to 2003, OBP maps to team run scoring marginally better than OBP with ROE.)
As such, arguments that various players "generate errors" by putting pressure on the defense appear to be an unfounded justification for including ROE in OBP. The players who reach on an error most often have no characteristics that distinguish them from those who do not. This all boils down to the fact that some players are simply lucky and some are not. Of course, it could also mean there should be better regulation of official scorers and their decisions, but that's another article altogether.