The lesson that many people take away from Michael Lewis’s best-selling book Moneyball is that On-Base Percentage (OBP) is the only way to build a good baseball team. What is often missed is that the book is really a tale of economics, about finding inefficiencies in the market and exploiting them. In the late 1990s and early 2000s, the baseball market was inefficient at judging the value of OBP. Realizing this, the low-budget Oakland A’s were able to build a successful offense on the cheap.

For better or for worse, the mantra “OBP is king” is often the first step an average baseball fan takes towards becoming a regular Baseball Prospectus reader. For those readers of Moneyball who are not convinced that Billy Beane wrote the book to praise his own genius, it should be obvious that it is better to get on base than to make an out. But why is OBP so important?

To explore how OBP affects run scoring, I will conduct a thought experiment. A thought or “gedanken” experiment is an imaginary experiment used to illustrate an idea or test a hypothesis when a real experiment would be difficult or impossible to perform. These were particularly useful for 20th century physicists as they attempted to visualize revolutionary new ideas. Einstein was well known for his numerous thought experiments on relativity, and the most famous example from quantum mechanics is Schrödinger’s Cat, better known to many BP readers as the namesake of Dan Fox’s column.

In our thought experiment we will consider two teams, one made up of nine hitters each of whom has a 0.400 OBP and a second team made up of nine hitters each of whom has a 0.300 OBP. As an homage to two of the better known players from my baseball card collecting days, we’ll call the 0.400 OBP team the “Rickeys” (Rickey Henderson had a lifetime OBP of 0.401) and the 0.300 OBP team the “Shawons” (Shawon Dunston had a 0.296 lifetime OBP).

For simplicity we will think of each plate appearance for the Rickeys as resulting in 0.6 outs and 0.4 times on base. It doesn’t really matter how the player reached base, just that he didn’t make an out. Similarly, each Shawon plate appearance will result in 0.7 outs and 0.3 times on base. Since a nine inning game consists of 27 outs, we can determine how many plate appearances will be necessary for each team to accumulate 27 outs by dividing 27 by the fraction of each plate appearance resulting in an out.

For the Rickeys, 27 outs divided by 0.6 outs per plate appearance yields 45 plate appearances. In 45 plate appearances the Rickeys will make 27 outs, leaving 18 times in which they reach base by a hit or walk. For the Shawons, 27 outs divided by 0.7 outs per plate appearance yields 38.6 plate appearances. We’ll round this to 39, meaning the Shawons reached base 12 times before accumulating 27 outs.
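The arithmetic above can be written as a short Python sketch. The function names are hypothetical, chosen only for illustration; the only inputs are the OBP values and the 27 outs in a nine-inning game.

```python
def plate_appearances_per_game(obp: float, outs: int = 27) -> float:
    """Expected plate appearances needed to accumulate `outs` outs.

    Each plate appearance is treated as (1 - obp) of an out,
    exactly as in the thought experiment.
    """
    return outs / (1.0 - obp)


def times_on_base_per_game(obp: float, outs: int = 27) -> float:
    """Plate appearances that did not end in an out."""
    return plate_appearances_per_game(obp, outs) - outs


for name, obp in [("Rickeys", 0.400), ("Shawons", 0.300)]:
    pa = plate_appearances_per_game(obp)
    on_base = times_on_base_per_game(obp)
    print(f"{name}: {pa:.1f} PA, {on_base:.1f} times on base per 27 outs")
```

Unrounded, the Shawons come to about 38.6 plate appearances and about 11.6 times on base; rounding the plate appearances up to 39 gives the 12 used in the text.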

Clearly it is better to reach base more often, but how can we quantify how much better the Rickeys’ offense is than the Shawons’ offense? The best method would be to assign each plate appearance a probability of ending in a single, double, triple, home run, walk, or out, use random numbers to play out every game of a season, and repeat the process millions of times to estimate how many runs each offense should score in a given game.

While writing computer code to run Monte Carlo simulations would create a more accurate model, the aim of this column is to introduce beginners to more advanced statistical concepts without utilizing much math. We can get a fairly good answer just by looking at pitching statistics that most readers will be familiar with from a typical fantasy baseball league: ERA and WHIP! Unsurprisingly, there is a correlation between ERA and WHIP; pitchers who give up fewer walks and hits usually allow fewer runs.
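For readers curious what a Monte Carlo simulation might look like, here is a deliberately crude Python sketch. It is not the full model described above. It assumes every time on base is worth exactly one base and that runners only advance when forced, so a run scores only when a batter reaches with the bases loaded. With no extra-base hits, it will understate real scoring; it is meant only to show the mechanics of playing out games with random numbers. The function names and the simplified advancement rule are assumptions, not anything from the original analysis.

```python
import random


def simulate_game(obp: float, rng: random.Random, innings: int = 9) -> int:
    """Play one game for a lineup in which every hitter has the same OBP.

    Simplifying assumption: each time on base advances forced runners
    one base, so a run scores only when the bases are already loaded.
    """
    runs = 0
    for _ in range(innings):
        outs = 0
        runners = 0  # number of runners on base (0 to 3)
        while outs < 3:
            if rng.random() < obp:
                if runners == 3:
                    runs += 1      # bases loaded: runner on third is forced home
                else:
                    runners += 1   # one more runner on base
            else:
                outs += 1
    return runs


def average_runs(obp: float, games: int = 10_000, seed: int = 0) -> float:
    """Average runs per game over many simulated games."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    return sum(simulate_game(obp, rng) for _ in range(games)) / games


for name, obp in [("Rickeys", 0.400), ("Shawons", 0.300)]:
    print(f"{name}: {average_runs(obp):.2f} simulated runs per game")
```

A more realistic version would replace the single on-base outcome with separate probabilities for singles, doubles, triples, home runs, and walks, each with its own baserunner advancement.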

The graph below shows WHIP plotted against ERA for all pitcher seasons from 2000-2008 with more than 20 IP (thanks to http://baseball1.com/ for the raw stats). The straight line is a simple, unweighted least-squares fit to the data. Note that for particularly high WHIPs the results become non-linear, with more points above the line for a given WHIP. For our purposes the type of fit doesn’t really matter; it is intended to guide the eye and show what a typical ERA is for a typical WHIP.

Returning to our two offenses, the Rickeys averaged 18 walks plus hits per 27 outs (9 innings), or a WHIP of 18/9 = 2.00, while the Shawons averaged 12 walks plus hits per 27 outs or a WHIP of 12/9 = 1.33. Without looking at the figure above, your fantasy baseball spidey sense is no doubt tingling at the sight of those WHIPs. Chances are a fantasy team whose pitchers sported a collective WHIP of 1.33 would be doing reasonably well, while most fantasy managers wouldn’t consider using a pitcher with a WHIP of 2.00 for fear of the damage he would do to their team. Although…he would be a fantastic HACKING MASS player assuming he was used sufficiently!
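The conversion from an offense's times on base to a "WHIP allowed" can be sketched in a few lines of Python, using WHIP = (H + BB) / (outs / 3). The function names are hypothetical and used only for illustration.

```python
def whip(walks_plus_hits: float, outs: float = 27) -> float:
    """WHIP = (walks + hits) / innings, where innings = outs / 3."""
    return walks_plus_hits / (outs / 3)


def whip_from_obp(obp: float) -> float:
    """WHIP implied by a lineup-wide OBP, with no rounding of plate appearances.

    Per out, a lineup reaches base obp / (1 - obp) times; an inning has 3 outs.
    """
    return 3 * obp / (1.0 - obp)


print(f"Rickeys: {whip(18):.2f}")   # 18 walks plus hits per 27 outs
print(f"Shawons: {whip(12):.2f}")   # 12 walks plus hits per 27 outs (rounded)
print(f"Shawons, unrounded: {whip_from_obp(0.300):.2f}")
```

Without rounding the Shawons up to 39 plate appearances, their implied WHIP is about 1.29 rather than 1.33, a small difference that does not change the conclusion.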

Looking at the figure, we see that from 2000-2008 pitchers with a WHIP of 2.00 typically had an ERA over 7 while pitchers with a WHIP of 1.33 had an ERA of about 4. In other words, a pitcher who allowed all opposing hitters to reach base at the rate of the Rickeys would have been among the worst pitchers in baseball, while a pitcher who allowed hitters to reach base at the rate of the Shawons would have been an above average pitcher (the average pitcher in the sample had an ERA of 4.52). There are very few data points for pitchers with a WHIP near 2.00, since pitchers who allow this many base runners do not last long in the major leagues.

Thinking about ERA from the offense’s point of view, our hypothetical Rickeys would score about 7 runs per game while the Shawons would score a much more modest 4 runs per game. A difference of 3 runs per game is a lot, but how many more games would the Rickeys win than the Shawons over the course of a season?

Ideally we would take into account things like the number of runs each team allowed on defense, fluctuations in scoring from game to game, league-wide scoring levels, etc. However, for our purposes, the sabermetric rule of thumb that an extra 10 runs scored or prevented will result in an extra win over the course of a season will suffice. Over 162 games, a difference of 3 runs per game means the Rickeys would score nearly 500 more runs than the Shawons. All other things being equal, the Rickeys would be expected to win nearly 50 more games!
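As a quick check on that arithmetic, here is the rule of thumb as a small Python sketch. The function name is hypothetical, and the 10-runs-per-win figure is the rough rule quoted above, not a precise constant.

```python
def extra_wins(run_diff_per_game: float, games: int = 162,
               runs_per_win: float = 10.0) -> float:
    """Estimate extra wins from a per-game run advantage.

    Uses the rule of thumb that roughly 10 extra runs over a season
    are worth one extra win.
    """
    extra_runs = run_diff_per_game * games
    return extra_runs / runs_per_win


rickeys_rpg, shawons_rpg = 7.0, 4.0   # approximate runs per game read off the ERA/WHIP figure
diff = rickeys_rpg - shawons_rpg
print(f"Extra runs over a season: {diff * 162:.0f}")
print(f"Estimated extra wins: {extra_wins(diff):.1f}")
```

Three runs per game over 162 games is 486 runs, or about 48.6 wins under the 10-run rule.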

As a sanity check that this method is valid, the figure below plots the OBP and runs per game of every team season from 2000-2008, again with a least-squares fit to the data overlaid. While no team approached a 0.400 OBP, extending the line to an OBP of 0.400 predicts a team would score about 7.0 runs per game. A few did approach a 0.300 OBP: the 2002 Tigers (0.300), 2003 Tigers (0.300), and 2003 Dodgers (0.303). These offensively challenged squads averaged a paltry 3.6 runs per game. Despite reducing the discussion to common fantasy statistical categories, our estimate of the effect of OBP on runs scored proved very good!

This thought experiment is clearly an exaggeration of how offenses are constructed. No team is likely to field a lineup with nine hitters capable of a 0.400 OBP, while only the occasional team is likely to be so bad as to field a team with nine hitters approaching a 0.300 OBP. However, it illustrates well the importance of conserving outs. If you want to score runs, OBP is king.

Postscript: Call me CWebb. As a number of readers have pointed out in the comments to my initial entry, I was sloppy with my usage of BABIP when it should have been BA on contact. I am embarrassed to look so foolish on my first chance in the sabermetric spotlight. I know 1993 was Webber’s second Final Four, but you know what I mean. It is a tremendous honor to be one of the ten BP Idol finalists, and I am grateful to all the readers who are holding me to the same lofty standards that they expect of the regular BP writers. I can’t promise I’ll never make another mistake, but I’ll sure try. Hopefully I can follow this up with 15 more years of elite basketball…er baseball analysis.

What I don't like is the way you went about trying to answer your initial question: "Why is OBP King?". You don't really answer that. In order to do so, you would have needed to demonstrate how and why OBP is more important than other statistical categories. Instead, all you really prove is that OBP is important because it correlates to run scoring. You don't demonstrate that it does so any more than any other statistic. I think your comparison should have been between OBP and, say, AVE, or SLG, not between a better and worse OBP. The latter doesn't really need an analysis, much less 1500 words. The former does (although some might argue that Michael Lewis has already done this quite convincingly).

Anyway, good luck with the rest of the competition!

There is something here... but the parts are disjointed. Pick one area, master your voice in that area, then you'll have a better idea when to digress. Find a better way to transition from your first paragraph (which I liked) to your fourth paragraph (which I also liked). Scrap the part about WHIP until another article and use this one to show you've firmly established the importance of OBP.

This one gets mine this time. Thanks!

People understand that WHIP and ERA are correlated. Heck, they probably understand the relationship of those two stats better than they understand the idea of "correlation". As such, you could probably just drop some paragraphs and say, "Between 2000 and 2008, pitchers with a WHIP of 1.33 had an average ERA of around 4, while pitchers with a WHIP of 2.00 averaged ERAs closer to 7" and be done with it. If you want to do it simple, do it simple.

To fill this now-vacated second half of your article, you could talk about how having a good OBP guy atop your lineup can help by giving everyone behind them tons of RBI opportunities, probably giving examples.

But yeah. Good idea/instincts, presentation could use some work. :-)

The major disconnect to me is that the article is about OBP, but much of the meat looks at it from a pitcher's point of view. This forced the article to move us to a stat (WHIP) that features a different scale and inverse relationship than the one we're learning about, which I think would be confusing to people looking for the basics. On the other hand, I guess I do like the idea of writing as if someone interested in learning the analysis side of things is coming from a fantasy background.

Basically, the differences in their batting average were minimal and the differences in their power were minimal. The only relevant difference in their game was walk rate (16.4% vs 3.2% in about twice the number of PA's for Rickey over Dunston). We'll ignore the position difference and stolen base and baserunning since they have nothing to do with what happens at the plate which is what he was talking about.

WHIP = (H+BB)/(OUTS/3)

However, in attempting to defend OBP (a laudable objective) he relies on ERA and WHIP to do it, suggesting to the beginning reader that either stat is worth a damn. And they're not.

As a BP article, this piece is okay. As a basics article? Sorry, but if I'm being honest, it's an abject failure.

The article is about WHY it's important - because it creates runs. And he showed us two different ways how that is true - 1) WHIP & ERA, and 2) actual team data comparing OBP and Runs.

That's not to say the article couldn't be improved. My advice would be to drop the OBP:WHIP&ERA analysis and simply jump to OBP:Runs. With the remaining space of the article, compare AVG:Runs. Does OBP have a stronger correlation to Runs? If so, then it's the more important stat... QED.

Besides which, I'm not even convinced by your article that OBP is king. why isn't slugging %? if the Shawons all hit home runs and the Rickeys all walk, then the shawons score a minimum of 12 runs a game, while the rickeys could score 0. that is a thought experiment. a scatter plot and regression analysis of WHIP isn't. if we are to assume equal slg, then you should show that the same difference in slugging while assuming equal obp results in fewer expected runs if you want to convince me obp is king.

Seriously, 'a walk is as good as a hit' has been baseball conventional wisdom for decades. If you had asked a baseball fan on the street in the late '80s, "Which would you rather have, holding all else equal, a player who walks less or a player who walks more?" you would have gotten the obvious response, "The guy who walks more. We'll score more runs that way."

Your conclusion is trivial. If you'd compared to, say, average, that would have been significantly better. It would have been /very/ introductory, but heck, I can deal with that. What I'm actually unhappy about is an article proving a point nobody disagrees with.

I don't think the point was simply to show that OBP correlates with runs, but to show the practical difference between, for instance, a .333 OBP and a .383 OBP. I think he did that nicely. Also don't think a lot of people really grasp how important that might be, but maybe that's just me!

My only critique is that his transition into the WHIP angle was missing clarity. I think that trying to put a different spin on OBP is fine, but you need to be more clear about what that spin is.

Here's my main point, though. The link to the Joe Morgan chat is priceless. Matthew, you're getting my vote just for that.

But I do find it interesting that you are basing your vote on a link to what Joe Morgan said. I'll keep that in mind in case I'm a finalist next year :)

I'd like to see some recognition of the limitations of obp in building a baseball team. 3 walks and 3 outs = no runs.

I approve of the topic; I think there *are* things to say about OBP that most people don't get right away. Things like "every out you make robs your teammates of about 1.5 PA's they could be using to score runs". Things about the tradeoff between long-sequence offense and short-sequence offense. Things about how OBP is (perversely) more important in a #5 hitter than in a #3 hitter. But I didn't get any of that.

Unfortunately I think the votes will go with the judges because I disagree with Mr. Carroll here. I thought the trick used to draw a figure for run production was very clever (not the least confusing). I liked the scatter plots as well for the "basics" theme - what's easier to interpret than a best fit line? It also helped anchor some of the text surrounding the plots.

I did think there were a lot of superfluous bits at the top but I understand that this is going to be his gimmick for the competition. I also didn't think the multiple "what would be preferred here is..." digressions were helpful. It is nice to acknowledge the limitations of the approach, but perhaps a line at the end would have gotten that across without coming across as a blowhard and distracting me from the article.