Ugh! What is that vile stench? That’s right, it’s Sidney Ponson starts and the Mariners' offense polluting our pristine Run Expectancy and Win Expectancy Matrices. Before we write our congressman to apply for stimulus money for the cleanup, let's ask: How did it get this way?
The main tools that a sabermetrician uses to analyze baseball strategy are the Run Expectancy Matrix and the Win Expectancy Matrix. Readers not familiar with these powerful tools can check out my article from the Basics series from the BP Idol competition last year. Simply put, these matrices are the basis for improved performance measurement (WXRL, etc.) and strategy analysis (stolen bases and sacrifice hits).
Before blindly using these tools, the sabermetrician needs to consider their limitations. To calculate the Run Expectancy Matrix, all play-by-play situations with the same baserunner/out state are aggregated, and the average number of additional runs scored in that inning from that point on is determined empirically. The base calculations of the Run Expectancy Matrix assume that the run environment is exactly the same regardless of pitcher, offense, or ballpark, and that the number of runs scored by each side in a game will likely be between 4.5 and 5.0, typically the season average. The problem is that we are grouping together situations where Chris Carpenter is pitching against the Astros (a low run-scoring expectation) with situations where Fausto Carmona is pitching against the Yankees (a high run-scoring expectation). In the former situation, we hypothesize that it is much more important to manufacture whatever runs one can than in the latter, where giving away outs will be detrimental to winning.
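The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind any published matrix: the function name, the `(bases, outs, runs_rest_of_inning)` record layout, and the toy plays are all assumptions made for the example.

```python
from collections import defaultdict

def run_expectancy(plays):
    """Average the runs scored from each base/out state to the end of the inning.

    plays: iterable of (bases, outs, runs_rest_of_inning) tuples, where
    bases is a string such as "1--" (runner on first only). This record
    layout is hypothetical, chosen only for illustration.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for bases, outs, runs_rest in plays:
        totals[(bases, outs)] += runs_rest
        counts[(bases, outs)] += 1
    # One cell per observed state: total runs divided by number of occurrences.
    return {state: totals[state] / counts[state] for state in totals}

# Toy data invented for the example, not real play-by-play.
sample_plays = [
    ("1--", 0, 1),
    ("1--", 0, 0),
    ("1--", 0, 2),
    ("-2-", 1, 0),
    ("-2-", 1, 1),
]
matrix = run_expectancy(sample_plays)
print(matrix[("1--", 0)], matrix[("-2-", 1)])
```

Because every play with the same state lands in the same cell, a Carpenter start and a Carmona start contribute to one shared average, which is exactly the limitation discussed above.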
So how do we clean up our Run Expectancy Matrix? Let’s bring in the EPA! No, not the Environmental Protection Agency, but the Environment Prediction Algorithm. What we’re going to do is estimate the true run environment and then use it to create Run Expectancy Matrices for the different environments. As an exercise, we will see how our understanding of various strategies changes based on the environment.
The EPA Approach
The EPA considers four main factors in estimating the likely runs scored per game:
- Home/Visitor: As Matt Swartz pointed out in a series of articles, there is a distinct difference between the home team and the visitor, which typically comes out to 0.2–0.3 runs per game.
- Offense: For each year, I calculated 60 factors, two for each team, one when the offense faces a starting left-hander, and another when they face a starting right-hander.
- Starting Pitcher: Combining the defense, the relief corps, and the starting pitcher, we create a factor for each starter and the team he is pitching for (for example, in 2009, Cliff Lee pitching for the Phillies or for the Indians will be two separate factors).
- Ballpark: The effect that each of the 30 (and sometimes a few more) ballparks has on the expected runs per game.
In theory, we could create a model that considers whether it is a day or night game, as well as cross-effects such as CC Sabathia facing a lineup of four left-handers and five right-handers, but as Dr. Leo Marvin writes: Baby Steps. We don’t want our algorithm to become all tied up like Bob Wiley.
The algorithm I used was a sequential heuristic to create a linear model that estimates the number of runs per game based on the factors above. As an example, here are some of the factors for the 2009 season:
|Top 5 Offense||Adjustment||Bottom 5 Offense||Adjustment|
|Yankees vs. RHP||+1.10||Pirates vs. LHP||-1.03|
|Red Sox vs. LHP||+0.92||Mariners vs. LHP||-0.85|
|Yankees vs. LHP||+0.91||Mets vs. LHP||-0.82|
|Angels vs. LHP||+0.90||Cardinals vs. LHP||-0.75|
|Angels vs. RHP||+0.81||Reds vs. RHP||-0.73|
|Top 5 SP*||Adjustment||Bottom 5 SP*||Adjustment|
|Chris Carpenter||-1.95||Manny Parra||+1.49|
|Felix Hernandez||-1.74||Luke Hochevar||+1.39|
|Jair Jurrjens||-1.46||Jeff Suppan||+0.89|
|Zack Greinke||-1.42||Jorge De La Rosa||+0.86|
|Tim Lincecum||-1.39||Braden Looper||+0.82|
* minimum 25 starts
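The article does not spell out the sequential heuristic. One heuristic of that general flavor is backfitting: cycle through the factor groups, and for each group set every adjustment to the average leftover error of the games it appears in, holding the other groups fixed. The sketch below assumes that approach purely for illustration; it is not the author's algorithm, and the function name, record layout, and toy data are hypothetical.

```python
from collections import defaultdict

def fit_additive(records, factor_names, n_passes=20):
    """Fit runs ~ baseline + sum of per-level adjustments by backfitting.

    records: list of dicts holding one value per factor name plus "runs".
    This is an assumed stand-in for the article's unspecified heuristic.
    """
    baseline = sum(r["runs"] for r in records) / len(records)
    effects = {name: defaultdict(float) for name in factor_names}

    def predict(record, skip=None):
        # Prediction from every factor group except the one being updated.
        return baseline + sum(
            effects[name][record[name]] for name in factor_names if name != skip
        )

    for _ in range(n_passes):
        for name in factor_names:
            sums, counts = defaultdict(float), defaultdict(int)
            for r in records:
                sums[r[name]] += r["runs"] - predict(r, skip=name)
                counts[r[name]] += 1
            for level in sums:
                # New adjustment = mean residual for this level.
                effects[name][level] = sums[level] / counts[level]
    return baseline, effects

# Toy records invented for the example: runs = 5 + offense + starter.
toy = [
    {"offense": "A", "starter": "X", "runs": 5.0},
    {"offense": "A", "starter": "Y", "runs": 7.0},
    {"offense": "B", "starter": "X", "runs": 3.0},
    {"offense": "B", "starter": "Y", "runs": 5.0},
]
baseline, effects = fit_additive(toy, ["offense", "starter"])
print(baseline, dict(effects["offense"]), dict(effects["starter"]))
```

On real, unbalanced schedules the adjustments would depend on the update order and number of passes, so a sketch like this would need convergence checks before its output could be compared with the tables above.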
So let’s take a hypothetical example to see how this works. The Yankees are at home against the Royals with Luke Hochevar on the mound for Kansas City. Taking all the factors into account, the Yankees are the home team (4.73 runs) going against a righty (+1.10) who is Hochevar (+1.39), so our expectation is that the Yankees would average 7.22 runs per game in this situation, before we take the ballpark into account.
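The arithmetic in this example is just a sum of adjustments on top of a baseline. Here is a minimal sketch using only the numbers quoted above; the dictionary layout, the function name, and the `park_adj` parameter (left at zero, since no ballpark factor is given) are assumptions for illustration.

```python
HOME_BASELINE = 4.73  # home-team baseline quoted in the example

# Adjustments taken from the 2009 tables above.
offense_adj = {("Yankees", "RHP"): 1.10}
starter_adj = {"Luke Hochevar": 1.39}

def expected_runs(baseline, offense_key, starter, park_adj=0.0):
    """Additive estimate: baseline plus offense, starter, and park adjustments."""
    return baseline + offense_adj[offense_key] + starter_adj[starter] + park_adj

estimate = expected_runs(HOME_BASELINE, ("Yankees", "RHP"), "Luke Hochevar")
print(round(estimate, 2))  # 7.22, before any ballpark adjustment
```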
Typically, this methodology predicts a run environment somewhere between two and eight runs per game. For the 2009 season, the following table shows the number of games in each bucket along with a sample game from that bucket. Keep in mind that the total is twice the number of games played, as there is one record for the visiting team’s scoring and one for the home team’s scoring.
|Expected Runs||Team-Games||Sample Game|
|< 2.0||23||4/9 PIT at STL Carpenter (1.79 runs)|
|2.0-3.0||207||7/22 ATL vs. SF Lincecum (2.56 runs)|
|3.0-4.0||1,005||9/17 DET vs. KC Greinke (3.33 runs)|
|4.0-5.0||1,813||5/2 LAA at NYA Sabathia (4.59 runs)|
|5.0-6.0||1,292||6/17 TOR at PHI Moyer (5.31 runs)|
|6.0-7.0||386||7/12 BOS vs. KC Chen (6.87 runs)|
|7.0-8.0||102||4/25 LAA vs. SEA Silva (7.38 runs)|
|> 8.0||32||5/16 PHI at WAS Cabrera (8.63 runs)|
Analyzing Some Strategies
I ran the EPA for the last five years (2005-09) and then tagged every game based on the expected runs to be scored by the home team and the visiting team. Focusing on just the first seven innings of each game (since a team is trying to maximize runs in most of these innings rather than settling for one run), I calculated a Run Expectancy Matrix based on four filters:
- Include all games, i.e., calculate it as is
- Low-scoring environment (expected runs per game is less than 2.5)
- Mid-scoring environment (expected runs per game is between 4.0 and 5.5)
- High-scoring environment (expected runs per game is greater than 7.5)
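Note that the three environment filters do not cover every game: a team-game with an expectation of, say, 3.3 runs falls in none of them and contributes only to the all-games matrix. The small sketch below makes that explicit. Whether the 4.0 and 5.5 endpoints are inclusive is an assumption, and the function name is hypothetical.

```python
def classify_environment(expected_runs):
    """Return "low", "mid", "high", or None for a team-game's expected runs.

    Thresholds follow the filters listed above; treating the mid-range
    endpoints as inclusive is an assumption.
    """
    if expected_runs < 2.5:
        return "low"
    if 4.0 <= expected_runs <= 5.5:
        return "mid"
    if expected_runs > 7.5:
        return "high"
    return None  # between buckets: used only in the all-games matrix

# Expected-run values taken from the sample games in the 2009 table.
for runs in (1.79, 3.33, 4.59, 8.63):
    print(runs, classify_environment(runs))
```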
Here are the different Run Expectancy Matrices:
First, let’s analyze the stolen base when there is just a runner on first base. Typically, the rule of thumb is that the break-even success rate is around 70-72 percent. The table below shows how the break-even success rate changes based on the run-scoring environment. The numbers in parentheses are modified break-even success rates that account for the fact that, over the last five years, 5.1 percent of attempted steals of second have produced an error that let the runner advance to third.
|Outs||All Games||Low||Mid||High|
|72% (71%)||76% (76%)|
|1||74% (72%)||51% (50%)||74% (73%)||84% (83%)|
|2||69% (69%)||72% (69%)||69% (68%)||78% (78%)|
As we would expect, in a low-scoring environment, manufacturing runs matters so much that the break-even rate drops significantly compared to the usual rule of thumb. On the flip side, in a high run-scoring environment, the bar for success is raised.
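A break-even rate like those in the table comes from setting the expected runs after an attempt equal to the expected runs of staying put. The sketch below shows that calculation for a runner on first with no outs. The run expectancy values are round numbers invented for illustration, not the article's matrices, and the way the 5.1 percent error rate is folded in (as a share of all attempts that end with the runner on third) is an assumed reading of the adjustment.

```python
def break_even(re_now, re_success, re_caught):
    """Success rate p solving p*re_success + (1 - p)*re_caught = re_now."""
    return (re_now - re_caught) / (re_success - re_caught)

def break_even_with_error(re_now, re_second, re_third, re_caught, error_rate=0.051):
    """Break-even rate when error_rate of all attempts put the runner on third.

    Solves e*re_third + (p - e)*re_second + (1 - p)*re_caught = re_now for p.
    """
    bonus = error_rate * (re_third - re_second)
    return (re_now - re_caught - bonus) / (re_second - re_caught)

# Hypothetical run expectancies, chosen only to make the arithmetic concrete.
RE_FIRST_0_OUT = 0.90   # runner on first, no outs
RE_SECOND_0_OUT = 1.15  # runner on second, no outs (steal succeeds)
RE_THIRD_0_OUT = 1.40   # runner on third, no outs (steal plus error)
RE_EMPTY_1_OUT = 0.28   # bases empty, one out (caught stealing)

plain = break_even(RE_FIRST_0_OUT, RE_SECOND_0_OUT, RE_EMPTY_1_OUT)
adjusted = break_even_with_error(
    RE_FIRST_0_OUT, RE_SECOND_0_OUT, RE_THIRD_0_OUT, RE_EMPTY_1_OUT
)
print(f"{plain:.1%} ({adjusted:.1%})")
```

Swapping in a matrix from a low or high run environment changes all four inputs at once, which is why the break-even rate can move as sharply as the table shows.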
The other strategy that is usually hotly debated is the sacrifice bunt. There was a specific instance that got me thinking about quantifying strategies in low run-scoring environments: the June 25 game between the Mets and Cardinals that I watched for the BP Idol competition. Twice in that game (with Johan Santana pitching for the Mets, a 2.68 runs-per-game environment based on the EPA prediction), Cardinals manager Tony La Russa called for the sacrifice bunt with a runner on first and no outs. Below is the table of three likely bunting situations with no outs: runner on first, runner on second, and runners on first and second. The values are the change in expected runs from employing the strategy: a negative number means the strategy hurts the offense, and a positive number means it helps. The first number in each cell is the change given a successful sacrifice. The numbers in parentheses reflect adjustments based on the following historical trends over the last five years when a sacrifice is attempted: 70 percent successful attempts (batter is out, runners advance one base), 23 percent unsuccessful attempts with either the lead runner thrown out or a strikeout, 3.5 percent throwing errors by the defense that allow runners to advance more than one base, 2 percent double plays, and 1.5 percent fielder’s choices that result in all runners being safe.
|Runners||All Games||Low||Mid||High|
|1st||-.21 (-.20)||-.04 (-.03)||-.21 (-.19)||-.46 (-.46)|
|2nd||-.20 (-.16)||-.04 (-.01)||-.19 (-.15)||-.56 (-.51)|
|1st and 2nd||-.09 (-.17)||+.18 (+.08)||-.08 (-.15)||-.29 (-.35)|
So, in the La Russa situation above, my initial analysis was that the bunt was a bad decision, likely costing his team 0.2 runs. In reality, given that this was likely to be a low-scoring game, the decision was essentially break-even. Since our parameters for errors and unsuccessful attempts are averages over all attempts, a better-than-average bunter and/or a poor-fielding catcher could flip the result to a net benefit. Once again, the last column indicates that in high-scoring environments, attempting to manufacture runs early is a poor strategy. Woe is the manager who sacrifices with Bruce Chen on the mound.
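The parenthetical values in the bunt table are outcome-weighted averages, and the calculation can be sketched as follows for a runner on first with no outs. The outcome probabilities are the five-year rates quoted above. The run expectancy values and the base/out state assigned to each outcome are assumptions made for illustration, not the article's matrices.

```python
# Hypothetical run expectancies keyed by (bases, outs).
RE = {
    ("1--", 0): 0.90,  # starting state: runner on first, no outs
    ("-2-", 1): 0.69,  # successful sacrifice
    ("1--", 1): 0.54,  # failed attempt: an out with a runner still on first
    ("1-3", 0): 1.80,  # throwing error: assumed to leave runners on first and third
    ("---", 2): 0.11,  # double play
    ("12-", 0): 1.50,  # fielder's choice, everyone safe
}

# Outcome rates quoted in the article, paired with the assumed resulting state.
OUTCOMES = [
    (0.700, ("-2-", 1)),
    (0.230, ("1--", 1)),
    (0.035, ("1-3", 0)),
    (0.020, ("---", 2)),
    (0.015, ("12-", 0)),
]

def bunt_change(start_state, outcomes, re_matrix):
    """Expected runs after the attempt minus expected runs before it."""
    after = sum(prob * re_matrix[state] for prob, state in outcomes)
    return after - re_matrix[start_state]

success_only = RE[("-2-", 1)] - RE[("1--", 0)]
weighted = bunt_change(("1--", 0), OUTCOMES, RE)
print(f"{success_only:+.2f} ({weighted:+.2f})")
```

Raising the success rate for a skilled bunter, or the error rate against a weak defense, shifts the weighted figure upward, which is the mechanism behind the point about a better-than-average bunter.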
As some of you may realize, there could be a lot more to this EPA process than simply analyzing strategies. If we have an accurate way to predict the likely run-scoring environment for both the visitors and the home team, it’s just a quick stroll in the mathematical woods to determine a win percentage for the home team. Hmmmm…. if only there were a place where one could go to wager on games armed with this information. So, next time, we are going to have some fun by seeing how EPA stacked up against the historical closing lines in Las Vegas throughout an entire season.