Doesn’t it always seem to happen to your team? Down by two in the 5th, your team gets a lead-off double followed by a walk. First and second, no one out – this is the start of a big inning. Next thing you know, after a lazy fly-ball to shallow left and a strikeout, your only hope is a solid base hit. One weak grounder to short later, you curse your team for another wasted opportunity. The Run Expectancy Matrix helps us determine how much of a wasted opportunity the above scenario truly was.
There are 24 unique states describing the position of runners and the number of outs in an inning. There are 8 unique runner positions (bases empty, men on 1st and 2nd, bases loaded, etc.) and 3 out possibilities.
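As a quick illustration, the 24 states can be enumerated in a few lines of Python. This is only a sketch: the names `RUNNER_CODES` and `STATES` are made up for this example, and the three-character runner codes (such as 120 for men on 1st and 2nd) follow the convention used in the tables in this article.

```python
from itertools import product

# Each base is either occupied (its own digit) or empty ("0"),
# so there are 2 * 2 * 2 = 8 runner layouts.
RUNNER_CODES = ["".join(bases) for bases in product("10", "20", "30")]

# Pair every runner layout with 0, 1, or 2 outs.
STATES = [(runners, outs) for runners in RUNNER_CODES for outs in range(3)]

print(len(RUNNER_CODES), len(STATES))
```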
For each state, we are interested in determining the expected number of runs that are scored in the rest of the inning. If one goes to the Statistics page of Baseball Prospectus and clicks on “Run Expectancy Matrix” a table like the following will pop up:
Runners  Exp_Outs_0  Exp_Outs_1  Exp_Outs_2
000      0.526       0.281       0.108
003      1.520       0.951       0.362
020      1.165       0.708       0.334
023      2.017       1.425       0.600
100      0.908       0.536       0.228
103      1.772       1.566       0.496
120      1.558       0.944       0.461
123      2.349       1.596       0.803
Back to our scenario, this table tells us that, with runners on 1st and 2nd (Runners = 120) and no outs, the expected number of runs is 1.558. That’s fine and good, but where did this 1.558 come from? More importantly, what were the assumptions?
To create the Run Expectancy Matrix, the sabermetric gnomes (you thought real people did this?) look at the play-by-play data for some time period, typically a given year or set of consecutive years. At the beginning of each play, they look at the state and then determine how many runs were scored in this half-inning from the beginning of the play to the end of the inning. Then, they calculate the average of the number of runs that follow for all plays with the same beginning state.
To illustrate the point, let’s walk through an example inning. On April 5, 2008 in the bottom of the fifth inning of the Rangers–Angels game, there were six distinct plays:
- Matthews, Jr. singled to left
- Guerrero singled to left. Matthews, Jr. to 2nd.
- Anderson pops out to second.
- Hunter doubles to left. Matthews, Jr. scores. Guerrero to 3rd.
- Kotchman hit by a pitch.
- Kendrick grounds into a 6-4-3 double play.
The table below shows how the above six plays would be translated into the state and the additional runs to be scored that inning.
Play  Runners  Outs  Additional Runs Scored
1     000      0     1
2     100      0     1
3     120      0     1
4     120      1     1
5     023      1     0
6     123      1     0
At the beginning of the third play, we have the state of 1st and 2nd, no outs. After this state, one run scored during the rest of the inning (on the fourth play). Throughout 2008, there were 2,520 times a play started with 1st and 2nd and no outs. These plays plus all following plays in the inning led to 3,925 runs scored. If we divide 3,925 by 2,520, we get 1.558 expected runs.
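The bookkeeping described above can be sketched in Python. This is a minimal illustration, not the code behind the published numbers: the tuple layout `(runners, outs, runs_on_play)` and the function names are assumptions made for this example, and a real calculation would run over a full season of play-by-play data rather than one inning.

```python
from collections import defaultdict

# The Angels' fifth inning from April 5, 2008, as
# (runners at start of play, outs at start of play, runs scored on the play).
inning = [
    ("000", 0, 0),  # Matthews, Jr. singles
    ("100", 0, 0),  # Guerrero singles
    ("120", 0, 0),  # Anderson pops out
    ("120", 1, 1),  # Hunter doubles, Matthews, Jr. scores
    ("023", 1, 0),  # Kotchman hit by a pitch
    ("123", 1, 0),  # Kendrick grounds into a double play
]


def runs_rest_of_inning(plays):
    """Pair each starting state with the runs scored from that play to the end of the inning."""
    remaining = sum(runs for _, _, runs in plays)
    rows = []
    for runners, outs, runs in plays:
        rows.append((runners, outs, remaining))
        remaining -= runs  # runs on this play no longer count for later plays
    return rows


def run_expectancy(innings):
    """Average the rest-of-inning runs over all plays that begin in the same state."""
    totals = defaultdict(lambda: [0, 0])  # state -> [total runs, number of plays]
    for plays in innings:
        for runners, outs, rest in runs_rest_of_inning(plays):
            totals[(runners, outs)][0] += rest
            totals[(runners, outs)][1] += 1
    return {state: runs / count for state, (runs, count) in totals.items()}


print(runs_rest_of_inning(inning))
print(round(3925 / 2520, 3))  # the season-wide figure for 1st and 2nd, no outs
```

With only one inning in the sample, `run_expectancy([inning])` simply returns the values from the table above; the averages become meaningful once thousands of innings are fed in.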
Some filtering determines which plays are included in the calculations. Typically, a Run Expectancy Matrix focuses on innings when the batting team’s objective is to maximize runs. When the batting team’s objective is to play for a single run (a tied game in the ninth) or the inning is partial (a walk-off hit in the bottom of the ninth), these innings are typically excluded, because they underestimate the true run potential of a given state. For the numbers presented in this article, I used a simple filter: every play in 2008 that occurred in the first through eighth innings.
Uses of the Run Expectancy Matrix
There are two significant uses of the Run Expectancy Matrix:
- Improved performance measurement through advanced statistics
- Strategy analysis.
With regard to improved performance measurement, imagine the following: a reliever enters the game in the middle innings with his team up by one, the bases loaded, and only one out. He gives up a sacrifice fly that ties the game and then strikes out the next batter to end the inning. With traditional statistics, this reliever would be charged with a blown save and with allowing one inherited runner to score. Did he provide value to his team? Actually, he did. Based on the Run Expectancy Matrix, we would expect the opponent to score 1.596 runs in this situation. He allowed only one, so the reliever improved his team’s game position by 0.596 runs, while the traditional statistic of blown saves suggests a poor performance. This type of situational analysis, comparing the actual runs scored to the expected runs scored, is the basis of the statistics INR, WX, and WXRL.
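The arithmetic in the reliever example is simple enough to sketch in code. The function name `runs_saved` is hypothetical, and the sketch assumes the pitcher finishes the inning; it is not the actual formula behind INR, WX, or WXRL, only the expected-minus-actual comparison described above.

```python
# Run expectancy for the state the reliever inherited (bases loaded, one out),
# taken from the 2008 table in this article.
RUN_EXPECTANCY = {("123", 1): 1.596}


def runs_saved(runners, outs, runs_allowed):
    """Expected runs for the inherited state minus the runs actually allowed through the end of the inning."""
    return RUN_EXPECTANCY[(runners, outs)] - runs_allowed


# One run scores on the sacrifice fly, then a strikeout ends the inning.
print(round(runs_saved("123", 1, 1), 3))
```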
The other important use is to analyze the value of certain strategies (stolen bases, sacrifice hits) based on a likely game situation. By comparing the beginning state to a few possible end states, we can determine if a given strategy is likely to create more runs. Joe Sheehan wrote an excellent analysis of the stolen base for the Baseball Prospectus Basics series back in 2004 using the Run Expectancy Matrix for 2003.
Limitation #1: Expected Runs vs. Run Frequency
Relying only on the Run Expectancy Matrix in evaluating strategy, however, can lead to flawed conclusions. We would dismiss the sacrifice bunt entirely, because in all situations a successful sacrifice bunt decreases the expected number of runs. If a team is playing for one run, however, a Run Frequency Matrix (something I first saw on Tom Tango’s website) shows that it can be a sound strategy. The table below shows the probability of scoring at least one run given the situation:
Runners  Exp_Outs_0  Exp_Outs_1  Exp_Outs_2
000      28.2%       16.5%       7.1%
003      86.3%       65.5%       25.3%
020      62.8%       41.0%       21.9%
023      83.9%       69.0%       26.6%
100      42.4%       27.1%       12.7%
103      83.3%       62.5%       26.7%
120      64.5%       42.1%       22.6%
123      86.6%       67.0%       32.4%
With a runner on 2nd and no outs, sacrificing him over to 3rd slightly improves the odds of his scoring from 62.8% to 65.5%, although the overall expected runs go down from 1.165 to 0.951. We are increasing the odds of scoring at least one run by 2.7 percentage points, but at the cost of lowering the probability of scoring more than one run. Interestingly enough, the very common situation of sacrificing with a runner on 1st and no outs is typically not beneficial, as the probability of bringing that runner home goes down from 42.4% to 41.0%. This counter-intuitive result highlights one other key issue of using the Run Expectancy Matrix (or Run Frequency Matrix) blindly. It is important to understand who is bunting, who is on deck, and who is the runner.
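To put the two matrices side by side, here is a small Python sketch of the comparison. The dictionary and function names are invented for illustration, the values are copied from the two tables in this article, and the sketch assumes the bunt always succeeds, which real bunt attempts do not.

```python
# (runners, outs) -> value, copied from the 2008 tables in this article.
EXPECTED_RUNS = {("100", 0): 0.908, ("020", 0): 1.165, ("020", 1): 0.708, ("003", 1): 0.951}
SCORE_PROB = {("100", 0): 0.424, ("020", 0): 0.628, ("020", 1): 0.410, ("003", 1): 0.655}


def bunt_tradeoff(before, after):
    """Return (change in expected runs, change in probability of scoring at least one run)."""
    return (
        round(EXPECTED_RUNS[after] - EXPECTED_RUNS[before], 3),
        round(SCORE_PROB[after] - SCORE_PROB[before], 3),
    )


# Runner on 2nd, no outs -> runner on 3rd, one out
print(bunt_tradeoff(("020", 0), ("003", 1)))
# Runner on 1st, no outs -> runner on 2nd, one out
print(bunt_tradeoff(("100", 0), ("020", 1)))
```

The first bunt trades expected runs for a better chance at a single run; the second gives up ground on both measures.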
Limitation #2: Understanding Lineup Position
The Run Expectancy Matrix essentially assumes that an “average” hitter is coming up to the plate in every given state. However, it’s obvious that the average number of runs scored will be different in a 1st and 3rd, no-out state with Mauer, Morneau, and Kubel due up versus Buscher, Gomez, and Punto.
I looked at the play-by-play data for 2008 and calculated the Run Expectancy Matrix for each lineup position. In the table below, I pulled some examples that will help us determine the required breakeven success rate of a stolen base when a man is on first with no one out. In the first column, we see the run expectancies for the beginning state. The average is 0.908 runs, but this varies from 1.052 (for the #3 hitter) to 0.783 runs (for the #9 hitter).
Lineup Position  Expected Runs (100, 0 out)  Expected Runs (020, 0 out)  Expected Runs (000, 1 out)
Overall          0.908                       1.165                       0.281
1                0.923                       1.192                       0.323
2                1.018                       1.254                       0.325
3                1.052                       1.232                       0.328
4                0.938                       1.201                       0.311
5                0.827                       1.162                       0.270
6                0.813                       1.090                       0.260
7                0.819                       1.022                       0.234
8                0.873                       1.098                       0.217
9                0.783                       1.108                       0.225
If the steal is successful, we have a man on second with no outs (the 2nd column). If our base stealer is caught, we have the bases empty and one out (the 3rd column). The overall numbers (the first row) suggest that the benefit of a successful steal is an additional 0.257 runs (1.165 – 0.908). However, if the runner is caught stealing, it costs the team 0.627 runs (0.908 – 0.281). The breakeven success rate is the point at which the expected gain equals the expected loss, which works out to the cost divided by the sum of the cost and the benefit: 0.627 / (0.627 + 0.257), or a 70.9% success rate. If we do the same math for each lineup position, we see that the breakeven percent varies based on who is up to bat, from a low of 62.4% when the #5 hitter is up to 80.1% when the #3 hitter is up.
Lineup Position  Benefit of Successful SB  Cost of Caught Stealing  Breakeven Percent
Overall          0.257                     0.627                    70.9%
1                0.269                     0.600                    69.0%
2                0.236                     0.693                    74.6%
3                0.180                     0.724                    80.1%
4                0.263                     0.627                    70.4%
5                0.335                     0.557                    62.4%
6                0.277                     0.553                    66.6%
7                0.203                     0.585                    74.2%
8                0.225                     0.656                    74.5%
9                0.325                     0.558                    63.2%
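For readers who want to reproduce the breakeven column, here is a short Python sketch. It assumes the breakeven rate p solves p × benefit = (1 − p) × cost, so p = cost / (benefit + cost); the function name `breakeven_rate` and the `LINEUP` dictionary are made up for this example, and the inputs are the run expectancies listed earlier for three of the rows.

```python
def breakeven_rate(start, success, caught):
    """Success rate at which the expected gain from a steal equals the expected loss.

    start:   run expectancy with a man on 1st, no outs
    success: run expectancy with a man on 2nd, no outs
    caught:  run expectancy with the bases empty, one out
    """
    benefit = success - start
    cost = start - caught
    return cost / (benefit + cost)


# (start, success, caught) from the lineup-position table in this article.
LINEUP = {
    "Overall": (0.908, 1.165, 0.281),
    "3": (1.052, 1.232, 0.328),
    "5": (0.827, 1.162, 0.270),
}

for spot, states in LINEUP.items():
    print(spot, f"{breakeven_rate(*states):.1%}")
```

A manager with his own projected run expectancies for each lineup spot could plug them into the same function in place of the league-wide 2008 values.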
We are, of course, assuming that our current lineup positions match the average historical lineup positions of all teams. A better mousetrap would be for a manager to have these breakeven percentages based on his current lineup and the likely production from each spot. The best mousetrap would also have these percentages adjusted based on the current game state (are we maximizing runs or going for simply a single run), and the likely success rate based on our man at first and the current opposing battery-mates. To create something like this, we would need to use more sophisticated tools from optimization theory like dynamic programming, but that is a different article altogether. Ah, to dream the impossible dream!
There's nothing really inappropriate about the title - it's just bathroom humor. Beats the heck out of watching a bunch of football analysts in ties laughing like hyenas at one another's non-jokes.
It should be (Hope it comes out in the comment section):
Play Runners Outs Runs Scored
1 000 0 1
2 100 0 1
3 120 0 1
4 120 1 1
5 023 1 0
6 123 1 0
Basically the table shows what the simple db table would be.
First column: Play Number (corresponding to the numbered list above)
Second column: Runners at beginning of the play (000 = bases empty, 120 = Runners on 1st and 2nd)
Third Column: Outs - Number of outs at beginning of play
Fourth Column: Additional Runs scored. Number of additional runs scored in the inning starting with the current play to the final play inclusive.
Overall, as a reader of the Book Blog and a sabermetric fan, I loved the piece.
Play Runners Outs Additional Runs Scored 1 000 0 1 2 100 0 1 3 120 0 1 4 120 1 1 5 023 1 0 6 123 1 0
I'm sure did not come out how the author intended. At least I hope not. Still, thumbs up from me. Run expectancy has always fascinated me anyway, and the added bonus of what changes by lineup position makes this a big time winner. Basics yes, and I learned some things. great job!
Maybe this wasn't exactly in the spirit of The Basics theme that they were hoping for, but I didn't think it was so far of a departure. I felt that most of the article truly lined up with the goal.
Again, these mostly weren't in the style I was expecting. I enjoyed most of the advanced information. And just because I wasn't expecting it doesn't mean I get to have the final word.
This stands out as clearly the best entry on this round - and I happened to like Tim's qualifying entry on organizational depth the best, too.
Tim is clearly the Adam of this competition. For those of you who are either musically deadened or never watch American Idol, Adam was by far the best singer this season, although Tim's superiority might not be quite as obvious. I could see Ken Fuchs and Brian Cartwright doing some colossal work and some of the other writers being strong enough to keep Tim from getting overconfident. Heck, most American Idol viewers preferred that nice handsome boy Chris.
The game story, instead of being used to good effect, is immediately abandoned for dry research. The average fan isn't coming in with a sabermetrics or scientific background and will, I think, do what I did: read a paragraph or two... start skimming and never find a way back into the article.
Sure, it might be great original research, but that wasn't the assignment, now was it?
As an article, I love it. As to the assignment, I'm not so sure.
Pure nerd quibble: if you're going to talk about using Run Expectancy or Run Frequency to evaluate whether or not to bunt, you need to mention the possibility that the bunt attempt does not result in a successful sacrifice. Strikeout, pop up, lead runner forced on FC, both runners safe on error... all of these things happen more often than most people think.
I did not find the writing style "dry" in the least, as others have commented. I guess it's all in the eye of the beholder, but for me, Tim writes with an ease and confidence, both here and in his initial entry, that is impressive.
Oh, and the title...umm...YES.
But, really, great job!
My point on this was not so much simulation or dynamic programming, but that as one wants to get more sophisticated, one cannot simply average play-by-play data, because one would run into data sparseness issues, i.e., the whole sample size issue. Therefore, if one desires a tool or methodology that takes into account the current lineup, the opposing battery-mates, the situation, etc., one needs to go down the path of DP or simulation.
Personally, I think simulation is fine and dandy, but the choice depends on the time constraints. It's easy enough to run a simulation of 1,000,000 runs if you have several minutes to wait (depending on the granularity of the simulation); however, one could use a DP approach if, after plugging in a new pitcher or pinch-hitter, one needs a recommendation in just a few seconds.
The article was interesting too.
I felt that the writing was less clear (and less entertaining) in the second half of the article. I also felt that the numbers suffered from a common problem in sabermetrics - too many digits. I'd be astounded to find out that the run expectancy numbers had more than two significant digits. The table looks to be sourced directly from BP, but that sort of nit still needs to be picked.
I thought Tim did an excellent job of crystallizing the two ways in which this data can be useful: performance measurement, and strategy analysis. And as intriguing as the latter application is, the former is even more exciting in its potential for changing how we evaluate players. I've long thought that the fairest way to value a player's offensive contribution to his team over the course of a season is to determine his total expected-runs-added based on the changes to "state" all of his plate appearances (and base-running actions) caused. Along the same lines, perhaps the fairest way to compare the contributions of two different players over a season is to consider their expected-runs-added totals in the context of their actual opportunities (since, e.g., a hitter leading off an inning always has a greater ability to increase the expected runs for that inning than does a hitter batting with two outs).
Alternatively, a modified version of Tango's "Run Frequency Matrix" could help us assign credit more accurately to hitters for runs scored and RBI (by giving more credit to, say, the batter who drives a runner in from first with two outs than to the batter who drives one in from third with no outs). A possible way to assign this credit is to ask: what is the probability that this runner will score from where he put himself on the bases (via his own walk, HBP or hit, plus SBs if any), given the number of outs there are (or more properly, the number of outs there *were* when he got to where he did on his own)? Whatever that percentage is, that's the percentage of the run that he should get credit for (Runs Deserved). The guy driving him in gets the rest of the credit (1 - Runs Deserved) as an RBI Deserved (or, when there are intervening batters between the batter scoring the run and the batter driving him in, they are assigned partial credit for the RBI as well, as their contributions warrant). To make the stats correspond in raw terms to traditional R and RBI, we would simply double the numbers (since nearly every run also results in an RBI). Fielding can even be evaluated with this data more precisely, as the cost of an error can be quantified based on exactly how it changed the state. A bobble by a third baseman fielding a bunt with two outs and nobody on costs his team 0.228 expected runs (the difference between what happened -- guy on first with two out (0.228) -- and what should have happened -- end of inning (0.00)), whereas a throwing error into the OF by that same third baseman on a would-be double play with nobody out and a guy on first might cost his team 1.909 expected runs (or the difference between runners on second and third and no outs (2.017) and nobody on with two outs (0.108)).
Anyway, I digress. Awful title, solid job with the rest; it really did read as a nice opening chapter to a book that I'd love to read about Markov analysis in baseball. Maybe I'm being hypercritical of the others, but this was the only entry this week which I considered to be truly BP worthy, and the only one for which I voted.