Ugh!  What is that vile stench? That’s right, it’s Sidney Ponson starts and the Mariners' offense polluting our pristine Run Expectancy and Win Expectancy Matrices. Before we write our congressman to apply for stimulus money for the cleanup, let's ask: How did it get this way?

The main tools that a sabermetrician uses to analyze baseball strategy are the Run Expectancy Matrix and the Win Expectancy Matrix. Readers not familiar with these powerful tools can check out my article from the Basics series from the BP Idol competition last year. Simply put, these matrices are the basis for improved performance measuring (WXRL, etc.) and strategy analysis (stolen bases and sacrifice hits).

Before blindly using these tools, the sabermetrician needs to consider their limitations. To calculate the Run Expectancy Matrix, all play-by-play situations with the same baserunner/out state are aggregated and then the average number of runs additionally scored in that inning from that point on is empirically determined. The base calculations of the Run Expectancy Matrix assume that the run environment is exactly the same regardless of pitcher, offense or ballpark and that the number of runs scored by each side for a game will likely be between 4.5 and 5.0, typically the season average. The problem is that we are grouping together situations where Chris Carpenter is pitching against the Astros (a low run-scoring expectation), along with Fausto Carmona pitching against the Yankees (a high run-scoring expectation).  In the former situation, we hypothesize that it is much more important to manufacture whatever runs one can versus the latter, where giving away outs will be detrimental to winning.
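
To make the aggregation concrete, here is a minimal Python sketch of the empirical calculation described above. The tuple layout, the function name `run_expectancy`, and the sample plays are illustrative assumptions, not the actual data pipeline; runner states use the same "000" through "123" notation as the tables in this article.

```python
from collections import defaultdict


def run_expectancy(plays):
    """Average runs scored in the rest of the inning for each (outs, runners) state.

    `plays` is an iterable of (outs, runners, runs_rest_of_inning) tuples.
    This layout is a hypothetical one chosen for the sketch.
    """
    totals = defaultdict(lambda: [0.0, 0])  # state -> [sum of runs, number of plays]
    for outs, runners, runs_after in plays:
        entry = totals[(outs, runners)]
        entry[0] += runs_after
        entry[1] += 1
    return {state: total / count for state, (total, count) in totals.items()}


# Made-up sample: four plays with the bases empty and no outs,
# two plays with a runner on first and one out.
sample_plays = [
    (0, "000", 0), (0, "000", 1), (0, "000", 0), (0, "000", 2),
    (1, "100", 0), (1, "100", 1),
]
matrix = run_expectancy(sample_plays)
```

Every play is weighted equally here no matter who is pitching or hitting, which is exactly the limitation discussed above.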

So how do we clean up our Run Expectancy Matrix? Let’s bring in the EPA! No, not the Environmental Protection Agency, but the Environment Prediction Algorithm. What we’re going to do is estimate the true run environment, then use it to create Run Expectancy Matrices for the different environments. As an exercise, we will see how our understanding of various strategies changes based on the environment.

The EPA Approach

The EPA considers four main factors in estimating the likely runs scored per game:

  1. Home/Visitor: As Matt Swartz pointed out in a series of articles, the home team has a distinct advantage over the visitor, which typically comes out to a difference of 0.2–0.3 runs per game.
  2. Offense: For each year, I calculated 60 factors, two for each team, one when the offense faces a starting left-hander, and another when they face a starting right-hander.
  3. Starting Pitcher: Combining the defense, the relief corps, and the starting pitcher, we create a factor for each starter and the team he is pitching for (for example, in 2009, Cliff Lee pitching for the Phillies or for the Indians will be two separate factors).
  4. Ballpark: The effect that each of the 30 (and sometimes a few more) ballparks has on the expected runs per game.

In theory, we could create a model that considers whether it is a day/night game, and such cross-effects like when CC Sabathia faces a lineup of four left-handers and five right-handers, but as Dr. Leo Marvin writes: Baby Steps. We don’t want our algorithm to become all tied up like Bob Wiley.

The algorithm I used was a sequential heuristic to create a linear model that estimates the number of runs/game based on the factors above. As an example, here are some of the factors for the 2009 season:

Factor Runs/Game
Home Team 4.73
Visiting Team 4.49


Top 5 Offense Adjustment Bottom 5 Offense Adjustment
Yankees vs. RHP +1.10 Pirates vs. LHP -1.03
Red Sox vs. LHP +0.92 Mariners vs. LHP -0.85
Yankees vs. LHP +0.91 Mets vs. LHP -0.82
Angels vs. LHP +0.90 Cardinals vs. LHP -0.75
Angels vs. RHP +0.81 Reds vs. RHP  -0.73


Top 5 SP* Adjustment Bottom 5 SP* Adjustment
Chris Carpenter -1.95 Manny Parra +1.49
Felix Hernandez -1.74 Luke Hochevar +1.39
Jair Jurrjens -1.46 Jeff Suppan +0.89
Zack Greinke -1.42 Jorge De La Rosa +0.86
Tim Lincecum -1.39 Braden Looper +0.82

* minimum 25 starts

So let’s take a hypothetical example to see how this works. The Yankees are at home against the Royals with Luke Hochevar on the mound for Kansas City. Taking all the factors into account: the Yankees are the home team (4.73 runs) going against a righty (+1.10) who is Hochevar (+1.39), our expectation is that the Yankees would average 7.22 runs per game in this situation, before we take the ballpark into account.
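
The arithmetic in this example can be sketched in a few lines of Python. The dictionary and function names are hypothetical, and treating the ballpark as one more additive term (`park_adj`) is an assumption based on the model being described as linear; only the factor values come from the tables above.

```python
# 2009 factors taken from the tables above (a small subset).
BASE_RUNS = {"home": 4.73, "visitor": 4.49}
OFFENSE_ADJ = {("Yankees", "RHP"): 1.10, ("Mariners", "LHP"): -0.85}
STARTER_ADJ = {"Luke Hochevar": 1.39, "Chris Carpenter": -1.95}


def expected_runs(side, team, starter_hand, starter, park_adj=0.0):
    """Sum the additive factors; park_adj is an assumed additive ballpark term."""
    return (
        BASE_RUNS[side]
        + OFFENSE_ADJ[(team, starter_hand)]
        + STARTER_ADJ[starter]
        + park_adj
    )


# Yankees at home against the right-handed Hochevar, before the ballpark factor.
yankees_vs_hochevar = expected_runs("home", "Yankees", "RHP", "Luke Hochevar")
print(round(yankees_vs_hochevar, 2))  # 7.22
```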

Typically, this methodology predicts a run environment somewhere between two and eight runs per game. For the 2009 season, the following table shows the distribution of the number of games and a sample game from that bucket. Keep in mind that the total number is twice the number of games played as there will be one record for the visiting team’s scoring and the home team’s scoring.

Runs/Game Games Example
< 2.0 23 4/9 PIT at STL Carpenter (1.79 runs)
2.0-3.0 207 7/22 ATL vs. SF Lincecum (2.56 runs)
3.0-4.0 1,005 9/17 DET vs. KC Greinke (3.33 runs)
4.0-5.0 1,813 5/2 LAA at NYA Sabathia (4.59 runs)
5.0-6.0 1,292 6/17 TOR at PHI Moyer (5.31 runs)
6.0-7.0 386 7/12 BOS vs. KC Chen (6.87 runs)
7.0-8.0 102 4/25 LAA vs. SEA Silva (7.38 runs)
> 8.0 32 5/16 PHI at WAS Cabrera (8.63 runs)

Analyzing Some Strategies

I ran the EPA for the last five years (2005-09) and then tagged every game based on the expected runs to be scored by the home team and the visiting team. Focusing on just the first seven innings of each game (since a team is trying to maximize runs in most of these innings rather than settling for one run), I calculated a Run Expectancy Matrix based on four filters:

  1. Include all games, i.e., calculate it as is
  2. Only look at low-scoring run environments (expected runs per game is less than 2.5)
  3. Mid scoring environment (expected runs per game is between 4.0 and 5.5)
  4. High-scoring environment (expected runs per game is greater than 7.5 runs)
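
As a rough sketch, the three environment filters above could be expressed as a small Python function. The name `scoring_environment` is hypothetical, and including both endpoints of the Mid range is an assumption, since the boundaries are not spelled out. Predictions that fall between the ranges belong to none of the three filtered matrices, though they still count toward the overall one.

```python
def scoring_environment(expected_runs):
    """Map an EPA runs/game prediction to the Low, Mid, or High filter.

    Returns None for predictions that fall in the gaps between filters.
    """
    if expected_runs < 2.5:
        return "Low"
    if 4.0 <= expected_runs <= 5.5:  # endpoint handling is an assumption
        return "Mid"
    if expected_runs > 7.5:
        return "High"
    return None


# Predictions taken from the sample-game table earlier in the article.
for runs in (1.79, 3.33, 4.59, 8.63):
    print(runs, scoring_environment(runs))
```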

Here are the different Run Expectancy Matrices:

Situation Scoring Environment
Out/Runners Overall Low Mid High
0/000 0.536 0.260 0.529 1.074
0/003 1.425 1.063 1.437 1.868
0/020 1.169 0.687 1.145 2.031
0/023 2.070 1.846 2.008 2.542
0/100 0.926 0.470 0.904 1.679
0/103 1.794 0.967 1.744 2.321
0/120 1.528 0.853 1.478 2.257
0/123 2.386 1.250 2.259 3.083
1/000 0.288 0.147 0.286 0.568
1/003 0.968 0.646 0.953 1.475
1/020 0.716 0.435 0.699 1.216
1/023 1.433 1.039 1.396 1.969
1/100 0.556 0.253 0.547 1.054
1/103 1.185 0.681 1.154 1.714
1/120 0.943 0.485 0.912 1.536
1/123 1.608 0.485 0.912 1.536
2/000 0.109 0.065 0.110 0.198
2/003 0.381 0.322 0.367 0.553
2/020 0.342 0.187 0.345 0.534
2/023 0.599 0.475 0.591 0.950
2/100 0.236 0.134 0.237 0.418
2/103 0.505 0.278 0.484 0.869
2/120 0.449 0.287 0.441 0.741
2/123 0.774 0.308 0.754 1.116

First, let’s analyze the stolen base when there is just a runner on first base. Typically the rule of thumb is that a steal attempt needs to succeed around 70-72 percent of the time to break even. The table below shows how the break-even success rate changes based on the run-scoring environment. The numbers in parentheses are modified break-even success rates when we consider that, over the last five years, 5.1 percent of the time a stolen base of second is attempted there is an error resulting in the runner advancing to third.

Scoring Environment
Outs Overall Low Mid High
0 72% (71%) n/a 72% (71%) 76% (76%)
1 74% (72%) 51% (50%) 74% (73%) 84% (83%)
2 69% (69%) 72% (69%) 69% (68%) 78% (78%)
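
The unadjusted break-even rates can be reproduced from the Run Expectancy Matrix with a short Python sketch. The break-even rate p solves p * RE(success) + (1 - p) * RE(caught) = RE(stay). The function name is hypothetical, the values are the Overall column from the matrix above, and the error adjustment behind the parenthesized numbers is not modeled here.

```python
# Overall run expectancy for the states a steal of second can touch.
RE_OVERALL = {
    (0, "100"): 0.926, (0, "020"): 1.169,
    (1, "000"): 0.288, (1, "100"): 0.556, (1, "020"): 0.716,
    (2, "000"): 0.109, (2, "100"): 0.236, (2, "020"): 0.342,
}


def steal_break_even(re, outs):
    """Success rate at which attempting the steal neither gains nor loses runs."""
    stay = re[(outs, "100")]      # runner holds at first
    success = re[(outs, "020")]   # runner reaches second, same number of outs
    # Caught stealing: bases empty with one more out; a third out ends the inning.
    caught = re[(outs + 1, "000")] if outs < 2 else 0.0
    return (stay - caught) / (success - caught)


for outs in (0, 1, 2):
    print(outs, f"{steal_break_even(RE_OVERALL, outs):.0%}")
```

Swapping in the Low, Mid, or High column for the same states gives the corresponding first numbers in each cell.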

As we would expect, in a low-scoring environment, manufacturing runs matters so much that the break-even rate drops significantly compared to what we would have considered before. On the flip side, in the high run-scoring environment the bar for success is raised.

The other strategy that is usually hotly debated is the sacrifice bunt. There was a specific instance that got me thinking about this concept of quantifying strategies in low run-scoring environments: the June 25 game between the Mets and Cardinals that I watched for the BP Idol competition. Twice in that game (with Johan Santana pitching for the Mets, a 2.68 runs/game environment based on the EPA prediction), Cardinals manager Tony La Russa called for the sacrifice bunt with a runner on first and no outs. Below is the table of three likely bunting situations with no outs: runner on first, runner on second, and runners on first and second. The values are the change in expected runs from employing the strategy: a negative number means the strategy hurts the offense and a positive number means it helps. The first number in each cell is the change given a successful sacrifice. The numbers in parentheses reflect adjustments based on the following historical trends of the last five years when a sacrifice is attempted: 70 percent successful attempt (batter is out, runners advance one base), 23 percent unsuccessful attempt with either the lead runner thrown out or a strikeout, 3.5 percent throwing error by the defense that allows runners to advance more than one base, 2 percent double play, 1.5 percent fielder’s choice that results in all runners being safe.

Scoring Environment
Runners Overall Low Mid High
1st -.21 (-.20) -.04 (-.03) -.21 (-.19) -.46 (-.46)
2nd -.20 (-.16) -.04 (-.01) -.19 (-.15) -.56 (-.51)
1st and 2nd -.09 (-.17) +.18 (+.08) -.08 (-.15) -.29 (-.35)
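
The first number in each cell is simply the difference between two entries of the Run Expectancy Matrix: the state after a successful sacrifice (one out, runners up a base) minus the state before it (no outs). Here is a minimal Python sketch using the Overall and Low columns from the matrix above; the names are hypothetical. The parenthesized numbers would also need each of the other outcomes mapped to a resulting base/out state, which is not spelled out in enough detail to reproduce, so they are left out.

```python
# Run expectancy values copied from the matrix above (Overall and Low columns).
RE = {
    "Overall": {
        (0, "100"): 0.926, (0, "020"): 1.169, (0, "120"): 1.528,
        (1, "020"): 0.716, (1, "003"): 0.968, (1, "023"): 1.433,
    },
    "Low": {
        (0, "100"): 0.470, (0, "020"): 0.687, (0, "120"): 0.853,
        (1, "020"): 0.435, (1, "003"): 0.646, (1, "023"): 1.039,
    },
}

# Runner state after a successful sacrifice: every runner moves up one base.
AFTER_SACRIFICE = {"100": "020", "020": "003", "120": "023"}


def sacrifice_delta(re, runners):
    """Change in expected runs from a successful no-out sacrifice."""
    return re[(1, AFTER_SACRIFICE[runners])] - re[(0, runners)]


for env, re in RE.items():
    for runners in ("100", "020", "120"):
        print(env, runners, f"{sacrifice_delta(re, runners):+.3f}")
```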

So, in the La Russa situation above, my initial analysis, based on the overall matrix, was that it was a bad decision, likely costing his team 0.2 runs. In reality, given that it was likely going to be a low-scoring game, this decision was essentially break-even. Given that our parameters for errors and unsuccessful attempts are averages over all attempts, if a better than average bunter and/or a poor fielding catcher were in the game, these may flip to a positive benefit. Once again, the last column indicates that in high-scoring environments, attempting to manufacture runs early is a poor strategy. Woe is the manager who sacrifices with Bruce Chen on the mound.

Next Steps

As some of you may realize, there could be a lot more to this EPA process than simply analyzing strategies. If we have an accurate way to predict the likely run-scoring environment for both the visitors and the home team, it’s just a quick stroll in the mathematical woods to determine a win percentage for the home team. Hmmmm…. if only there was a place that one could go to wager on games like this armed with this information. So, next time, we are going to have some fun by seeing how EPA stacked up against the historical closing lines in Las Vegas throughout an entire season.
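
One well-known way to take that stroll, though not necessarily the one that will be used in the follow-up, is Bill James's Pythagorean expectation, which turns expected runs scored and allowed into a win percentage. The sketch below is only an illustration under that assumption; the function name and the default exponent of 2 are choices made for this example.

```python
def pythagorean_win_pct(runs_for, runs_against, exponent=2.0):
    """Pythagorean expectation: RF^x / (RF^x + RA^x)."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# League-average 2009 home and visitor factors from the table above.
home_win_pct = pythagorean_win_pct(4.73, 4.49)
print(f"{home_win_pct:.3f}")
```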

Marge Simpson: "Eepa." What does that mean?
Comic Book Guy: I believe it was the sound Green Lantern made when Sinestro dropped him in a vat of acid. "Eeeeee-paaaa!"
Wow, this is really cool, thanks! One of those things that people tend to give just rough adjustments for because it's too much work to do it more accurately.

When calculating factors for offense platoon splits, starting pitcher and defense, did you use real numbers (like 2009 ERA) or forecasting tools?

How much variation is there in the actual offense each day around the single offense factor you use? I suppose that your factor probably already includes the tendency of managers to play more RHB against a LH starter if you use actual numbers, but it would get tricky if you're using forecasts for the typical lineups.
YAY!!!! Great article. It is nice to see an article here that actually shows "conventional wisdom" actually contains some wisdom.
Tim, great stuff! We talk about run environments affecting strategy decisions, based on RE and WE matrices all the time, but this is a great attempt at actually quantifying it as well as assigning an actual run environment to a particular class of pitchers and teams.

One thing I think you messed up:

"70 percent successful attempt (batter is out, runners advance one base), 23 percent unsuccessful attempt with either the lead runner thrown out or a strikeout, 3.5 percent throwing error by the defense that allows runners to advance more than one base, 2 percent double play, 1.5 percent fielder’s choice that results in all runners being safe."

Where are the hits??

According to The Book, all sac bunt attempts result in singles 12% of the time.

The actual breakdown (again, overall - it much depends on the batter and the inning, where the inning is a proxy for how much the defense is expecting the bunt) is this:

Batter out, runner advances: 48.4%, not 70%
FC, both runners safe: .6%
An out, no runner advance: 26.2%
DP: 4.5%
A hit: 13.4%
ROE: 3.4%
walk/hp: 3.1%

And a few other various and sundry outcomes.

Those numbers include when the batter gets 2 strikes and ends up swinging away. If we just look at actual bunt attempts all the way through the PA, the numbers change slightly.
I was torn on this. I lumped all bunts where the hitter was given a single as more of a "bunt attempt for a hit" instead of a "sacrifice bunt"

How do we deal with the issue of the "bunt attempt for a hit gets scored as a sacrifice because the runner was unsuccessful?"

Issue 2: Is it possible to take those run change values in the table on bunting and apply them to a win probability matrix based on inning and situation to determine if the strategy really does make sense? Could it really be possible that bunting with a man on second and no outs in the first inning of a scoreless game could be a winning strategy if you are the A's playing at the Mariners and Felix Hernandez is on the mound? Does it matter who your starter is in that case?

Maybe we could take this data, and combine it with the PBP data for the last year to come up with a list of the best and worst (with an emphasis on the worst) managerial bunting decisions..... or CS's... (Yadier Molina getting nailed in a game at Colorado on a pick off seems to loom large)
Great stuff.
Great article. In terms of strategy, especially of steals and bunts, I think it pays to also look at who is in the lineup following. RE and WE are calculated assuming average offense, and even your adjustment considers the hitting team as all the same (possibly above or below average) team. Obviously Pujols bunting is different than Navarro bunting in terms of what the true expectation is.
Wonderful concept.