Prospectus Idol Entry: Why is On Base Percentage King?

May 23, 2009

The lesson that many people take away from Michael Lewis’s best-selling book Moneyball is that On Base Percentage (OBP) is the only way to build a good baseball team. What is often missed is that the book is really a tale of economics, about finding inefficiencies in the market and exploiting them. In the late 1990s and early 2000s, the baseball market was inefficient at judging the value of OBP. Realizing this, the low-budget Oakland A’s were able to build a successful offense on the cheap.

For better or for worse, the mantra “OBP is king” is often the first step an average baseball fan takes towards becoming a regular Baseball Prospectus reader. For those readers of Moneyball who are not convinced that Billy Beane wrote the book to praise his own genius, it should be obvious that it is better to get on base than to make an out. But why is OBP so important?

To explore how OBP affects run scoring, I will conduct a thought experiment. A thought or “gedanken” experiment is an imaginary experiment used to illustrate an idea or test a hypothesis when a real experiment would be difficult or impossible to perform. These were particularly useful for 20th century physicists as they attempted to visualize revolutionary new ideas. Einstein was well known for his numerous thought experiments on relativity, and the most famous example from quantum mechanics is Schrödinger’s Cat, better known to many BP readers as the namesake of Dan Fox’s column.

In our thought experiment we will consider two teams, one made up of nine hitters each of whom has a 0.400 OBP and a second team made up of nine hitters each of whom has a 0.300 OBP. As an homage to two of the better known players from my baseball card collecting days, we’ll call the 0.400 OBP team the “Rickeys” (Rickey Henderson had a lifetime OBP of 0.401) and the 0.300 OBP team the “Shawons” (Shawon Dunston had a 0.296 lifetime OBP).

For simplicity we will think of each plate appearance for the Rickeys as resulting in 0.6 outs and 0.4 times on base. It doesn’t really matter how the player reached base, just that he didn’t make an out. Similarly, each Shawon plate appearance will result in 0.7 outs and 0.3 times on base. Since a nine inning game consists of 27 outs, we can determine how many plate appearances will be necessary for each team to accumulate 27 outs by dividing 27 by the fraction of each plate appearance resulting in an out.

For the Rickeys, 27 outs divided by 0.6 outs per plate appearance yields 45 plate appearances. In 45 plate appearances the Rickeys will make 27 outs, leaving 18 plate appearances in which they reach base by a hit or walk. For the Shawons, 27 outs divided by 0.7 outs per plate appearance yields 38.6 plate appearances. We’ll round this to 39, meaning the Shawons reach base 12 times before accumulating 27 outs.
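For readers who like to check arithmetic with a few lines of code, here is a minimal Python sketch of the calculation above. The function name `per_game_totals` is my own invention for illustration, and the sketch keeps the thought experiment's simplification that every plate appearance is either an out or a time on base.

```python
def per_game_totals(obp, outs_per_game=27):
    """Return (plate_appearances, times_on_base) needed to use up a game's outs.

    Assumes, as the thought experiment does, that every plate appearance
    is either an out or a time on base (no double plays, no outs on the bases).
    """
    out_rate = 1.0 - obp                          # outs per plate appearance
    plate_appearances = outs_per_game / out_rate  # PA needed to make 27 outs
    times_on_base = plate_appearances - outs_per_game
    return plate_appearances, times_on_base


for name, obp in [("Rickeys", 0.400), ("Shawons", 0.300)]:
    pa, on_base = per_game_totals(obp)
    print(f"{name}: {pa:.1f} PA, {on_base:.1f} times on base per 27 outs")
```

Before rounding, the Shawons come out at roughly 38.6 plate appearances and 11.6 times on base, which is where the rounded figures of 39 and 12 come from.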

Clearly it is better to reach base more often, but how can we quantify how much better the Rickeys’ offense is than the Shawons’ offense? The best method would be to calculate the probability of each plate appearance ending in a single, double, triple, home run, walk, or out, use those probabilities to estimate how many runs each offense should score in a given game, and then play the season out with random numbers, repeating the process millions of times.
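To give a flavor of what such a simulation looks like, here is a deliberately crude Python sketch. It is not the full model described above: it assumes every time on base is a walk, so runners advance only when forced and a run scores only on a walk with the bases loaded. The function names, the seed, and the number of games are all arbitrary choices made for illustration.

```python
import random


def simulate_game(obp, rng, innings=9):
    """Simulate one game in which every time on base is a walk.

    Crude assumption: runners advance only when forced, so a run
    scores only when a batter walks with the bases loaded.
    """
    runs = 0
    for _ in range(innings):
        outs = 0
        runners = 0  # number of runners on base (0 to 3)
        while outs < 3:
            if rng.random() < obp:
                if runners == 3:
                    runs += 1     # bases loaded: the walk forces a run home
                else:
                    runners += 1  # the walk fills the next open base
            else:
                outs += 1
    return runs


def average_runs(obp, games=20000, seed=2009):
    """Average runs per game over many simulated games."""
    rng = random.Random(seed)
    return sum(simulate_game(obp, rng) for _ in range(games)) / games


print("Rickeys:", average_runs(0.400))
print("Shawons:", average_runs(0.300))
```

Because there are no extra-base hits and no runners taking an extra base, this toy version scores far fewer runs than real offenses do. What it does show is the shape of the technique: play many games with random numbers and average the results.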

While writing computer code to run Monte Carlo simulations would create a more accurate model, the aim of this column is to introduce beginners to more advanced statistical concepts without utilizing much math. We can get a fairly good answer just by looking at pitching statistics that most readers will be familiar with from a typical fantasy baseball league: ERA and WHIP! Unsurprisingly, there is a correlation between ERA and WHIP; pitchers who give up fewer walks and hits usually allow fewer runs.

The graph below shows WHIP plotted against ERA for all pitcher seasons from 2000-2008 with more than 20 IP (thanks to http://baseball1.com/ for the raw stats). The straight line is a simple, unweighted least-squares fit to the data. Note that for particularly high WHIPs the results become non-linear, with more points above the line for a given WHIP. For our purposes the type of fit doesn’t really matter; the line is intended to guide the eye and show what a typical ERA is for a given WHIP.
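For the curious, a simple unweighted least-squares line can be computed in a few lines of Python. This is a generic sketch, not the code used to make the graph; `fit_line` is a hypothetical helper, and the four points in the usage example are made-up numbers that sit exactly on a line, not baseball data.

```python
def fit_line(xs, ys):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross-deviations in x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic check: points generated from y = 2x + 1 (not real pitcher data)
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

With real data, `xs` would hold each pitcher season's WHIP and `ys` the matching ERA.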

Returning to our two offenses, the Rickeys averaged 18 walks plus hits per 27 outs (9 innings), or a WHIP of 18/9 = 2.00, while the Shawons averaged 12 walks plus hits per 27 outs or a WHIP of 12/9 = 1.33. Without looking at the figure above, your fantasy baseball spidey sense is no doubt tingling at the sight of those WHIPs. Chances are a fantasy team whose pitchers sported a collective WHIP of 1.33 would be doing reasonably well, while most fantasy managers wouldn’t consider using a pitcher with a WHIP of 2.00 for fear of the damage he would do to their team. Although…he would be a fantastic HACKING MASS player assuming he was used sufficiently!
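The conversion from OBP to an equivalent WHIP can be written as a one-line formula. The sketch below assumes, as the thought experiment does, that OBP is simply times on base divided by plate appearances and that every other plate appearance is an out; `whip_equivalent` is a name made up for this example.

```python
def whip_equivalent(obp):
    """Walks plus hits per inning (3 outs) implied by an on-base rate.

    Times on base per out is obp / (1 - obp); an inning has 3 outs.
    """
    return 3 * obp / (1 - obp)


for name, obp in [("Rickeys", 0.400), ("Shawons", 0.300)]:
    print(f"{name}: WHIP equivalent of {whip_equivalent(obp):.2f}")
```

Without rounding the Shawons to 39 plate appearances, their equivalent WHIP is about 1.29 rather than 1.33, a small difference that doesn't change the picture.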

Looking at the figure, we see that from 2000-2008 pitchers with a WHIP of 2.00 typically had an ERA over 7 while pitchers with a WHIP of 1.33 had an ERA of about 4. In other words, a pitcher who allowed all opposing hitters to reach base at the rate of the Rickeys would have been among the worst pitchers in baseball, while a pitcher who allowed hitters to reach base at the rate of the Shawons would have been an above average pitcher (the average pitcher in the sample had an ERA of 4.52). There are very few data points for pitchers with a WHIP near 2.00, since pitchers who allow this many base runners do not last long in the major leagues.

Thinking about ERA from the offense’s point of view, our hypothetical Rickeys would score about 7 runs per game while the Shawons would score a much more modest 4 runs per game. A difference of 3 runs per game is a lot, but how many more games would the Rickeys win than the Shawons over the course of a season?

Ideally we would take into account things like the number of runs each team allowed on defense, fluctuations in scoring from game to game, league-wide scoring levels, etc. However, for our purposes, the sabermetric rule of thumb that an extra 10 runs scored or prevented will result in an extra win over the course of a season will suffice. Over 162 games, the Rickeys would score about 500 more runs than the Shawons (3 runs per game times 162 games is 486). All other things being equal, the Rickeys would be expected to win nearly 50 more games!
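The runs-to-wins arithmetic is easy to reproduce. Here is a short Python sketch using the rough figures from the text (7 and 4 runs per game) and the 10-runs-per-win rule of thumb; the function and constant names are mine.

```python
RUNS_PER_WIN = 10  # sabermetric rule of thumb cited in the text
GAMES = 162        # length of a regular season


def extra_wins(runs_per_game_a, runs_per_game_b,
               games=GAMES, runs_per_win=RUNS_PER_WIN):
    """Return (extra_runs, extra_wins) for team A over team B in a season."""
    extra_runs = (runs_per_game_a - runs_per_game_b) * games
    return extra_runs, extra_runs / runs_per_win


runs, wins = extra_wins(7, 4)
print(f"{runs} extra runs, about {wins:.1f} extra wins")
```

A 3-run gap per game works out to 486 runs and 48.6 wins, the "about 500" and "nearly 50" quoted above. The rule of thumb is tuned to ordinary differences between teams, so stretching it this far is only a rough guide.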

As a sanity check that this method is valid, the figure below plots the OBP and runs per game of every team season from 2000-2008, again with a least-squares fit to the data overlaid. While no team approached a 0.400 OBP, extending the line to an OBP of 0.400 predicts a team would score about 7.0 runs per game. A few did approach a 0.300 OBP: the 2002 Tigers (0.300), 2003 Tigers (0.300), and 2003 Dodgers (0.303). These offensively challenged squads averaged a paltry 3.6 runs per game. Despite reducing the discussion to common fantasy statistical categories, our estimate of the effect of OBP on runs scored proved very good!

This thought experiment is clearly an exaggeration of how offenses are constructed. No team is likely to field a lineup with nine hitters capable of a 0.400 OBP, while only the occasional team is likely to be so bad as to field a team with nine hitters approaching a 0.300 OBP. However, it illustrates well the importance of conserving outs. If you want to score runs, OBP is king.

Postscript: Call me CWebb. As a number of readers have pointed out in the comments to my initial entry, I was sloppy with my usage of BABIP when it should have been BA on contact. I am embarrassed to look so foolish on my first chance in the sabermetric spotlight. I know 1993 was Webber’s second Final Four, but you know what I mean. It is a tremendous honor to be one of the ten BP Idol finalists, and I am grateful to all the readers who are holding me to the same lofty standards that they expect of the regular BP writers. I can’t promise I’ll never make another mistake, but I’ll sure try. Hopefully I can follow this up with 15 more years of elite basketball…er baseball analysis.

Matthew Knight

kgoldstein

5/24

This one actually surprised me. At first I was saying to myself, "oh no, not something else about how great OBP is," but then you took it in a pretty interesting direction, so as they say, way to make it your own.

wcarroll

5/24

Couple things ... I don't like that he stuck with a "title" here. It's a bit presumptuous and precious. (Of course, you don't see that since we evidently shifted the subtitle. Still.) Also, the apology is overwrought and didn't need to be there. He made a mistake, you learn. I don't follow his process here at all. It starts off about OBP and then veers quickly off to WHIP. I had to read through a couple of times to really grasp what he was trying to do and I don't think he did it. This is the first article I've read that took a real step back from what I saw in the original submissions.

ckahrl

5/24

Yes, to clarify what Will has pointed out, which isn't transparent to the public: Matthew stuck with his reference to "Back of the Envelope," which we trimmed as a matter of presentation. I guess my problem with this piece is that it's hard to call this an introductory piece when, unlike his initial, winning entry, the light humor from before is missing, and I guess I suffer from a bit of cognitive dissonance when something that's supposed to be "Basic" gives us a couple of scatter plots--if this is the back of an envelope, that's some envelope. Regardless, while I did find the extended initial throat-clearing a bit tedious, that *did* speak to the exercise's purpose, to deliver something introductory. It's an uneven effort, but one that I liked, and I'd like to see Matthew resume with the light, confident, engaging tone that he introduced himself with.

rbross

5/24

(I feel deficient without a photo.) Anyway, with all due respect to the staff's opinions, I actually like the tone of your piece. I found it light, simple, and instructive but not condescending. I also liked your scatter plots. They were also simple and quite clear. There are little bits and pieces of your writing that could be cleaned up, but that's to be expected for anyone writing on deadline without an editor.

What I don't like is the way you went about trying to answer your initial question: "Why is OBP King?". You don't really answer that. In order to do so, you would have needed to demonstrate how and why OBP is more important than other statistical categories. Instead, all you really prove is that OBP is important because it correlates to run scoring. You don't demonstrate that it does so any more than any other statistic. I think your comparison should have been between OBP and, say, AVE, or SLG, not between a better and worse OBP. The latter doesn't really need an analysis, much less 1500 words. The former does (although some might argue that Michael Lewis has already done this quite convincingly).

Anyway, good luck with the rest of the competition!

Oleoay

5/24

I'm not sure how you can write an article about OBP without defining OBP and an article on WHIP without defining WHIP. It seems you assume the reader knows what OBP and WHIP are and why they are important, but then go through one or two "thought experiments" to show how important they are?... but to lack a formula or definition? Yet you spend a paragraph explaining what a "thought experiment" is? Or you digress into Monte Carlo? You use Rickey Henderson and Shawon Dunston as examples, but show scatter plots from 2000-2008?

There is something here... but the parts are disjointed. Pick one area, master your voice in that area, then you'll have a better idea when to digress. Find a better way to transition from your first paragraph (which I liked) to your fourth paragraph (which I also liked). Scrap the part about WHIP until another article and use this one to show you've firmly established the importance of OBP.

chunkstyle

5/25

If I can keep up, I'm voting for one article a week, based simply on enjoyment.

This one gets mine this time. Thanks!

georgeforeman03

5/25

I liked this idea quite a bit. It's a simple concept and easy to understand. I think this article could be an excellent introduction to why OBP matters with a few changes.

People understand that WHIP and ERA are correlated. Heck, they probably understand the relationship of those two stats better than they understand the idea of "correlation". As such, you could probably just drop some paragraphs and say, "Between 2000 and 2008, pitchers with a WHIP of 1.33 had an average ERA of around 4, while pitchers with a WHIP of 2.00 averaged ERAs closer to 7" and be done with it. If you want to do it simple, do it simple.

To fill this now-vacated second half of your article, you could talk about how having a good OBP guy atop your lineup can help by giving everyone behind them tons of RBI opportunities, probably giving examples.

But yeah. Good idea/instincts, presentation could use some work. :-)

acmcdowell

5/25

I liked this article, but I felt like the section involving WHIP and ERA didn't flow very well. As Will pointed out, it didn't really stick to the title, and I think the author should have gone straight for the OBP and runs/game relationship.

Shkspr

5/25

OBP is not King. OBP is Life.

The major disconnect to me is that the article is about OBP, but much of the meat looks at it from a pitcher's point of view. This forced the article to move us to a stat (WHIP) that features a different scale and inverse relationship than the one we're learning about, which I think would be confusing to people looking for the basics. On the other hand, I guess I do like the idea of writing as if someone interested in learning the analysis side of things is coming from a fantasy background.

jtrichey

5/25

This one has my thumb perched in the middle ground. It's hard to get much more basic than OBP at least for the BP crowd. Not faulting you for that though. You did take it an interesting way. I did have to read the paragraph that morphed into a WHIP discussion, but on just 1 reread it made a lot of sense. Anyway I am reserving a thumbs up until I check out more of the competition first.

SkyKing162

5/25

If my dad asked me for the basics of OBP, I would not refer him to this article. There were some interesting points worked in, but I'm still waiting for a really good introductory article. Writing a Basics article might seem strange when your actual audience is a whole bunch of people who are already familiar with OBP (or whatever), but the trick is to impress us with the path you take from simple to more complex without losing anyone on the way.

hotstatrat

5/25

Sorry, but I think Matthew Knight is still missing the mark. Does even an entry level BP reader need a long explanation as to why OBP is important? Spare us, please. Many of my favorite people are geeks, but I found this to be a crashing bore.

Oleoay

5/25

Two articles that had thought experiments, but also had some problems. The initial entry made a number of assumptions and incorrectly used some terms while this entry assumed things about the reader... they don't know OBP is important but know WHIP is important solely because it's a fantasy stat? I keep wavering on giving this one a thumbs-up... it seems there's potential that in a future article you'll write something that I would be "wowed" by, and I've given my votes to articles that had less of a "wow" factor, but were otherwise solid. Both articles submitted so far by you have been better than mediocre, but not quite great either. Both seemed like they could have been something "more", or lacking that, "more solid". Is it worth a save to the next round? I'm not sure as of yet... but I do know there are other finalists who are writing "wow" articles and are writing "sound" articles.

irablum

5/25

I will say this. I did enjoy the article, but I was left with one overriding thought. Ok, two overriding thoughts. One was, "Why was the concept of power completely ignored in the article?" The second was: "Where is the regression analysis which would prove the year to year stability of OBP?" Now, understanding that you approached the subject from a less rigorous stand, it might be reasonable to ignore a regression analysis, but to ignore power is as bad as the A's ignoring offense when they built their team this year. What's worse is that your two example players, Dunston and Henderson, were PERFECT: Dunston hit .269/.296/.416 where Rickey hit .279/.401/.419.

Basically, the differences in their batting average were minimal and the differences in their power were minimal. The only relevant difference in their game was walk rate (16.4% vs 3.2% in about twice the number of PA's for Rickey over Dunston). We'll ignore the position difference and stolen base and baserunning since they have nothing to do with what happens at the plate which is what he was talking about.

newsense

5/25

Matthew is jumping through a lot of hoops to compare offensive output derived from OBP to offensive output derived from WHIP. Simple math shows them to be essentially the same thing:
OBP = (H+BB)/(H+BB+OUTS)
WHIP = (H+BB)/(OUTS/3)

brianjamesoak

5/25

My take on it was that he was using WHIP and OBP almost synonymously, except that WHIP was from a pitcher's standpoint.

newsense

5/26

Then he's just being redundant by making graphs for both.

metal1341

5/25

I liked this article, but the transition from OBP to WHIP was poorly done. It was not smooth at all and I had to re-read it to get where he was going. Other than that, I enjoyed the article. And I don't think the scatter points were very complicated so that an average person couldn't read it.

llewdor

5/25

I don't object to the use of scatterplots in a beginners' article - perhaps someone should write such an article on how to read scatterplots, because they are so very useful when dealing with baseball data.

However, in attempting to defend OBP (a laudable objective) he relies on ERA and WHIP to do it, suggesting to the beginning reader that either stat is worth a damn. And they're not.

jsnell

5/25

Not a basics article. OBP is never sufficiently defined; it's assumed that the reader knows what it is. OBP is such a fundamental statistic and is the building-block for a whole way of thinking about baseball stats. I would've spent time talking about OBP versus batting average and really explaining why this stat is preferable. But instead we're off into a series of hypotheticals with teams made up of Rickeys and Shawons (cute).

As a BP article, this piece is okay. As a basics article? Sorry, but if I'm being honest, it's an abject failure.

molnar

5/26

I thought the first paragraph was well-done. I hated the explanation of thought experiment, and I'm not sure that the "experiment" itself led to any conclusion. I thought that after mentioning OBP as a specific example of market inefficiency, there were two ways to go: either give me examples to show how that inefficiency has gone away, or explain why OBP is king. Not why it's good, why it's King. While scatterplots do in fact show that getting on base more often is *good* - it's better than not getting on base! - without any information about the slope, that's all they do. There are myriad reasons why Rickey Henderson was better than Shawon Dunston; OBP is only one of them. Bob hit the nail on the head above saying that in order to demonstrate that OBP is King, you have to compare it to the other contenders for the throne.

BurrRutledge

5/26

Reading this article first, and I had no problems with it as a beginner article. If you want the definition of OBP, do a roll-over.

The article is about WHY it's important - because it creates runs. And he showed us two different ways how that is true - 1) WHIP & ERA, and 2) actual team data comparing OBP and Runs.

That's not to say the article couldn't be improved. My advice would be to drop the OBP:WHIP&ERA analysis and simply jump to OBP:Runs. With the remaining space of the article, compare AVG:Runs. Does OBP have a stronger correlation to Runs? If so, then it's the more important stat... QED.

joelefkowitz

5/26

I don't quite understand BP's decision to make the theme "Basics". The readers and voters are all BP readers so a basic article of OBP is absolutely not needed. I much rather enjoyed introductory articles on subjects at least a little more complicated.

Besides which, I'm not even convinced by your article that OBP is king. Why isn't slugging %? If the Shawons all hit home runs and the Rickeys all walk, then the Shawons score a minimum of 12 runs a game, while the Rickeys could score 0. That is a thought experiment. A scatter plot and regression analysis of WHIP isn't. If we are to assume equal SLG, then you should show that the same difference in slugging while assuming equal OBP results in fewer expected runs if you want to convince me OBP is king.

Oleoay

5/26

I think the Basics were picked to give us readers another example of the finalist's writing skills, while keeping the reins on them. If a finalist can't properly explain a basic concept, then their future submissions will be even more shaky.

JKGaucho

5/26

I agree with a lot of what has been said, and will only add that the first three paragraphs were weak. Particularly, I think readers of Moneyball understood the theme of market inefficiencies.

invictus

5/26

I am left distinctly thinking 'eh'. Aside from the writing, which was not the best, the attempted humor fell flat for me, and all your argument really says is that OBP correlates with runs.

Seriously, 'a walk is as good as a hit' has been baseball conventional wisdom for decades. If you had asked a baseball fan on the street in the late '80s, "Which would you rather have, holding all else equal, a player who walks less or a player who walks more?" you would have gotten the obvious response, "The guy who walks more. We'll score more runs that way."

Your conclusion is trivial. If you'd compared to, say, average, that would have been significantly better. It would have been /very/ introductory, but heck, I can deal with that. What I'm actually unhappy about is an article proving a point nobody disagrees with.

jpkand

5/26

I think you want a little too much from the "basics" theme. Admittedly, I have had a thirst for more analysis after reading a few of these, but I think that is not the point.

I don't think the point was simply to show that OBP correlates with runs, but to show the practical difference between, for instance, a .333 OBP and a .383 OBP. I think he did that nicely. I also don't think a lot of people really grasp how important that might be, but maybe that's just me!

josh7798

5/26

I find it interesting that everyone's critique of this article is that he didn't define OBP, and yet as BurrRutledge pointed out, all you have to do is a roll-over and, voila!, there it is. I liked how he just delved right into his point.

My only critique is that his transition into the WHIP angle was missing clarity. I think that trying to put a different spin on OBP is fine, but you need to be more clear about what that spin is.

Here's my main point, though. The link to the Joe Morgan chat is priceless. Matthew, you're getting my vote just for that.

Oleoay

5/26

Most people new to BP won't know to roll over words for additional definitions, unless they are highlighted blue. The slightly off-color background of a word with a definition attached isn't really obvious to me unless I squint at the monitor.

But I do find it interesting that you are basing your vote on a link to what Joe Morgan said. I'll keep that in mind in case I'm a finalist next year :)

greensox

5/26

OBP is obviously an important stat and the writer explained it well and the article was interesting.
I'd like to see some recognition of the limitations of obp in building a baseball team. 3 walks and 3 outs = no runs.

GraigNettles

5/26

Dear Author: I think on a 'basics' level your article does the job. I believe (just my belief - can't prove it) that most baseball followers start down the road of appreciating statistical analysis by being in fantasy/roto leagues - and the WHIP comparison to OBP will assist the process of moving the basic reader from a fantasy/roto world into a deeper appreciation of baseball stats. Below the thinking level of many BP readers, but does the job if aimed at 'newbies'. Look forward to seeing more from you.

DrDave

5/26

Sorry, no. Didn't like it at all. It seems unlikely that anyone who didn't already know OBP is important would be convinced by appeal to WHIP. Worse, the oversimplification of the probability model was sufficient to make it useless for intuition, and the implication that Monte Carlo simulation is the only alternative is both misleading and annoying to people who know some math.

I approve of the topic; I think there *are* things to say about OBP that most people don't get right away. Things like "every out you make robs your teammates of about 1.5 PA's they could be using to score runs". Things about the tradeoff between long-sequence offense and short-sequence offense. Things about how OBP is (perversely) more important in a #5 hitter than in a #3 hitter. But I didn't get any of that.

jpkand

5/26

I must say, I had a poor opinion of the author after the first article, but I was won over by this.

Unfortunately I think the votes will go with the judges because I disagree with Mr. Carroll here. I thought the trick used to draw a figure for run production was very clever (not the least confusing). I liked the scatter plots as well for the "basics" theme - what's easier to interpret than a best fit line? It also helped anchor some of the text surrounding the plots.

I did think there were a lot of superfluous bits at the top but I understand that this is going to be his gimmick for the competition. I also didn't think the multiple "what would be preferred here is..." digressions were helpful. It is nice to acknowledge the limitations of the approach, but perhaps a line at the end would have gotten that across without coming across as a blowhard and distracting me from the article.

darts1

5/26

I agree with many of the others who commented on this that this article had a lot of potential, but was comparing something we already knew. I think the biggest "old school" fans would agree they would rather have a .400 OBP player over a .300 OBP player if everything else is equal. I think the true comparison is between a player like Ichiro who consistently has a great pre-Moneyball stat like Batting Average, to players like Adam Dunn and Nick Swisher who bat about 80 points lower than Ichiro but get on base at a higher clip, and provide more power.
