March 4, 2000
Analytic Model Creation Contest
Develop your own analytic model and win stuff!

An analytic model of per-inning scoring distributions for studying in-game strategies

* Introduction
* Analyzing the Per Inning Run Scoring Historical Record
* The Runs Per Inning Scoring Formula
* Expected vs. Actual Runs
* The Perfect Closer
* Modeling Game With Analytica
* The Runs Per Inning Game Resolution Model
* Home Team Scoring Module
* Wins and Losses Module
* Displaying Results in Analytica
* The Online Probabilistic Game Predictor
* The Contest
* Acknowledgements

One of the most valuable sabermetric tools for analyzing strategic decisions has been the base-out expected runs matrix. Thorn & Palmer presented such a table in their classic book The Hidden Game Of Baseball. The table shows every baserunner situation (bases empty, runner on first, bases loaded, etc.) and number of outs, and gives the average number of runs teams scored in actual games from that situation onward through the rest of the inning. Using such a tool enables one to calculate average break-even success rates for attempting stolen bases, the average value of the sacrifice hit, the costliness of a double play, and so on.

As useful as such a table is, it has limitations. The data are compiled from the play-by-play accounts of actual games, and as such are difficult for the ordinary fan to compute and independently verify. The results are applicable only to the average or typical team in the data set: as offense levels change, so does the base-out expected runs table, though it's not obvious exactly what the changes are. Such a table also can't help when considering the differences between average hitting teams and great or poor hitting teams.
The Cleveland Indians have a higher expected run total for any base-out situation than do the Minnesota Twins, but you can't assess the magnitude of the difference from the table itself. In addition, expected runs alone do not tell the whole story. Going into the bottom of the ninth inning, down by 2 runs, the key piece of information is not the expected number of runs, but the probability of scoring two (to tie) or more (to win) runs in a single inning. The emphasis on endgame strategies involving closers, pinch hitters, pinch runners, and the like derives from the fact that all runs are not equal when you have specific information about the state of the game.

The next step in the evolution of tools to analyze strategic decisions would be to have an analytic framework from which teams of differing offensive and defensive strengths could be studied. In addition, explicit consideration of the probability distributions of various outcomes would enable more sophisticated treatments of strategy. With these needs in mind, I've attempted to develop new tools to aid in such analyses.

Analyzing the Per Inning Run Scoring Historical Record

My first goal was to develop a formula for estimating the probability of a team scoring a certain number of runs in an inning (the Runs Per Inning, or RPI formula). The formula would take into account the overall offensive strength of the team. First, we needed empirical data to verify a model against. To that end, I collected inning-by-inning scoring data for all teams between 1980 and 1998 (courtesy of Retrosheet and Total Sports/Baseball Workshop). I categorized each team by its season's average runs per game (3.0-3.5, 3.5-4.0, 4.0-4.5, etc.), and determined the number of times the teams in each category scored 1, 2, 3, … runs per inning. In the following, every notation of X runs per game refers to the class of teams scoring between X and (X + 0.5) runs/game. The results are shown in the following two tables:
Having determined the empirical distributions, what we then needed was an RPI formula for predicting the likelihood that a team that averages X runs per game scores Y runs in an inning. Start by graphing the results to get a visual representation of what the distribution looks like:

In all cases, the chance of scoring zero runs constitutes more than half the outcomes. The general shape shows a steep decline to scoring 1 run in an inning and a more gradual decline thereafter. Since the shapes are similar, we should be able to use the same kind of approximation for teams of various strengths, with different parameters to adjust the exact shape. That is, high scoring teams and low scoring teams both have similar looking scoring distributions, with only minor adjustments to account for the different levels of offense between them. Let's look at the specific case of teams averaging 4.5-5.0 runs per game:

Though the run scoring distribution is a discrete distribution (it's impossible to score fractional numbers of runs per inning), there's a strong resemblance between it and an exponential distribution, which takes the form Prob{team scores Y runs in an inning} = c*e^{kY}. For such a distribution, plotting the probabilities on a logarithmic graph should yield a straight line.

The results are encouraging: that's a pretty straight line (other than a slightly more negatively sloped portion between 0 and 1). An exponential approximation to this distribution will likely be a good one. Let's graph the remaining team categories similarly:

We're on to something good. All of the team categories show basically log-linear trends in the likelihood of per-inning scoring, at least in the 1-6 runs/inning range. The zero point is still slightly higher than a pure log-linear model would suggest. And at 7+ runs per inning, there's some noisiness in the results, which is reasonable to expect since these events are so rare. In the 19 years' worth of data, innings with 7 or more runs scored amount to less than 0.16% of the innings played. The overall trends still appear to be linear, though with a slight bit of negative acceleration at higher run levels. A model that accurately describes the 0-6 runs per inning range, and extrapolates reasonably to higher levels of scoring, will still be a useful tool for our purposes.
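The straight-line check described above can be sketched in a few lines of Python. This is only an illustration: the probabilities are synthetic values generated from c*e^{kY} with made-up constants (they are not the 1980-1998 data), and the helper name `fit_log_linear` is hypothetical. It shows that fitting a least-squares line to the logarithm of the probabilities recovers k as the slope and ln(c) as the intercept.

```python
import math

def fit_log_linear(runs, probs):
    """Least-squares line through the points (R, ln p); returns (slope, intercept)."""
    ys = [math.log(p) for p in probs]
    n = len(runs)
    mean_x = sum(runs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in runs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(runs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic, hypothetical values: NOT taken from the historical tables.
c, k = 0.35, -0.8
runs = list(range(1, 7))                      # the 1-6 runs/inning range
probs = [c * math.exp(k * r) for r in runs]   # exactly exponential by construction

slope, intercept = fit_log_linear(runs, probs)
print(f"slope = {slope:.4f}, c = {math.exp(intercept):.4f}")
```

With real per-inning frequencies in place of the synthetic ones, the fit would not be exact, and the size of the residuals (especially at 0 and at 7+ runs) would show where the data depart from a pure exponential.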
The Runs Per Inning Scoring Formula

We've now identified a few characteristics of the RPI formula:
The following exponential formula meets the first two items of our criteria: Probability ~= ^{ } where A is the average runs/game, and R is the specific runs/inning. The values of m, p, n, and b are constant coefficients.

Let's examine the RPI formula in depth. It's an exponential, as we desired, with the exponent dependent on the parameters A and R. The mA + pR portion controls most of the slope of the log-linear line. The n(R/A) is a correction term that slightly increases the rate of decline as R gets large. The values of the coefficients will be determined through a regression analysis that will give us the best possible fit to the actual data. The end result will do well in approximating the run scoring in the >= 1 run portion of the graph.

That still leaves the very significant problem of determining the probability of scoring zero runs, since the formula does not do a good job of computing it if you just plug in zero for the value of R. Again, the reason is that the historical data on the log-linear graph above shows that the probability of zero is higher than we'd expect from a straight line. However, we do have a way out of the dilemma. We can simply define the probability of zero as the remaining probability after we sum the probabilities of all values of R of 1 or more. As we will see, this solution is more than adequate.

The range of R deserves a little attention. While in principle R should be able to range as large as we want, the validity of calculating the probability of a team scoring, say, 20 runs in an inning is questionable. The formula and coefficients were validated against the existing history, and the incidence of scoring 9 or more runs is vanishingly small, typically less than .04% for all values of R greater than 9 combined. As a practical limit, the odds of scoring more than 8 runs per inning are virtually zero, and in any event, the discrepancies between actual and computed values in the lower run scoring range (0, 1, and 2 runs/inning) are larger than the entire effect of the R > 9 cases. Given the difficulties in validating run scoring above this level, the minuscule effect on practical strategic assessments, and the rarity of the events themselves, it makes sense to set a maximum value of R that we will explore, which we will denote as R_{max} in performing the calculations. Setting R_{max} in the range of 8, 9, or 10 will capture the overwhelming and significant portion of the per-inning distribution of team run scoring.

With the incorporation of the probability of zero, and the range of R, our formula now looks like:

Probability that a team that averages A runs per game scores R runs in a given inning = [ if R > 0 then If R = 0 then ]

There's one last step we can take to improve accuracy. We know what the per-game average scoring is supposed to be, as it's one of the parameters to the formula. We can also determine the expected runs/inning the formula predicts as follows:

The coefficients used define the proper shape, but they are also best suited for the range of scoring closest to the median of actual team run scoring (A ~= 4.5 RPG). At the extremes, the approximation gives the proper shape, but loses some accuracy in the expected level of scoring. We can remedy this by scaling the probabilities for positive run scoring totals by the ratio between the actual scoring (A) and the predicted scoring (Expected_RPG). Thus our final formula (including the values for the coefficients) is:

Probability that a team that averages A runs per game scores R runs in a given inning = [ if R > 0 then If R = 0 then Where and m =0.01219 n = 1.813 p = 0.3865 b = 1.042 ]

It took us a while to get there, and it looks pretty hairy, but the real question is how well it works. We'll explore that in the next section.
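Because the typeset formula is hard to read in this copy, here is a minimal Python sketch of the procedure the text describes, under several loudly stated assumptions. The exponent is assumed to be mA + pR + n(R/A) + b, which is one form consistent with the description above. The coefficients are assumed to be negative, since the printed magnitudes carry no visible sign and positive values would give results greater than 1. Expected_RPG is assumed to be nine times the expected runs per inning. The function names are hypothetical.

```python
import math

# Magnitudes as printed in the text; the NEGATIVE signs are an assumption.
M, N, P, B = -0.01219, -1.813, -0.3865, -1.042

def raw_prob(avg_rpg, runs):
    """Unscaled exponential estimate for scoring `runs` (>= 1) in an inning.

    ASSUMED form: exp(m*A + p*R + n*(R/A) + b).
    """
    return math.exp(M * avg_rpg + P * runs + N * (runs / avg_rpg) + B)

def rpi_distribution(avg_rpg, r_max=10):
    """Return [P(0), P(1), ..., P(r_max)] for a team averaging avg_rpg runs/game."""
    raw = [raw_prob(avg_rpg, r) for r in range(1, r_max + 1)]
    # Runs per game implied by the unscaled formula (ASSUMED: 9 innings).
    expected_rpg = 9 * sum(r * q for r, q in enumerate(raw, start=1))
    # Scale the positive-run probabilities by A / Expected_RPG.
    scaled = [q * avg_rpg / expected_rpg for q in raw]
    # P(0) is defined as whatever probability remains.
    return [1.0 - sum(scaled)] + scaled

dist = rpi_distribution(4.5)
for r, prob in enumerate(dist):
    print(f"{r} runs: {prob:.1%}")
print(f"P(2 or more runs) = {sum(dist[2:]):.1%}")
```

The last line is the kind of quantity the introduction asks for: the chance of scoring at least two runs in one inning, rather than the expected number of runs. Under this assumed form, the scaling step cancels any factor that does not depend on R, so the final probabilities depend only on the p and n terms.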
Expected vs. Actual Runs

The real test of an estimation tool is how well it estimates what it purports to measure. In the case of the RPI formula, there are two independent parameters to consider: the team's average run scoring (strength of offense), and the specific number of runs for which we want to know the chance of the team scoring in an inning. The tables below show the results for team strengths between 3.0 and 6.0 runs per game, and for per-inning scoring between zero and eight runs. Note that while only 1 decimal place is shown, none of the values are actually zero, just less than 0.1%:
The RPI formula works very well in predicting the chances of scoring across a variety of teams. Of course, since it was the data set that the model was developed against, one would hope for a reasonably solid fit. An interesting and useful question is whether the distribution for each inning leads to an accurate prediction for the distribution of full game run scoring. Thanks to Clay Davenport's work last summer expanding on Bill James's "Pythagorean theorem" for team winning percentage, we have a framework for estimating the distribution of run scoring given a team's overall average level of offense. In lieu of an official designation for this work, I've taken to calling it the "Pythagenport" formula (if you've got a better name, send it in). We can compare the Pythagenport formula to the RPI formula, using a probabilistic tool like Analytica, from Lumina Decision Systems. Using Analytica, we can quickly simulate a large number of games using the RPI model, and plot the frequency of each game result. Below, we've charted the RPI results against those of the Pythagenport method for various team strengths:
The results are encouraging. The RPI model accurately estimates the overall frequency of per-game run scoring, as verified by an independently created methodology. There are some small discrepancies, mostly due to the fact that RPI models 9 innings of scoring that are independent of one another (no factors beyond the team influence the run scoring distribution), whereas actual games have a strong dependency in the form of the opposing pitcher. If you fail to score in the first inning, you are slightly less likely to score in the rest of the game, because it's slightly more likely that you're facing a Pedro Martinez or Randy Johnson than a Jeff Fassero or Jaime Navarro. Conversely, you're more likely to score in the first against a bad pitcher, which increases your chances of scoring in subsequent innings. The net effect is that the assumption of independence used in the RPI model isn't entirely accurate, but doesn't affect the results too much, as you can see in the charts above.

The Perfect Closer

An interesting use of the RPI model is to investigate specific baseball strategies. One such example is the use of a closer like Dennis Eckersley. How much benefit does a team actually derive from such a closer? And how much of the value depends on the strength of the team's offense? Since relief pitchers (even closers) are used in a variety of ways, some specificity is desirable. Let's describe an idealized, hypothetical "perfect" closer as follows:
This simplified version of a LaRussa-style closer is easy to model, and thus we can simulate games to ascertain exactly how much benefit such a closer would be over the course of a season. We will assume in all the following examples that only the home team employs the perfect closer (the visiting team has no such advantage). First, we need our base case, which is the performance of the home team without the benefit of the closer. The table below shows the chances of the home team winning depending on the strength of their offense and that of the opposition:
(Note that since the games are being simulated, the exact figure of 50.0% for teams of equal strength isn't hit precisely, though 49.9% is sufficiently close for our purposes!) Now, of all the games that are played, only a fraction will meet the criteria for the perfect closer to be used (protecting a 1-3 run lead in the 9th inning). We can have Analytica track the percentages for us:
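As a rough cross-check on both quantities discussed above, they can also be computed exactly instead of by simulation, by convolving per-inning distributions under the same independence assumption. The sketch below is not the Analytica model. It assumes the exponential form exp(mA + pR + n(R/A) + b) with negative coefficients for the per-inning distribution, it resolves extra innings by repeating single innings until one is not tied, and it reads "protecting a 1-3 run lead in the 9th" as the home team leading by one to three runs after eight full innings. All function names are hypothetical.

```python
import math

def inning_pmf(avg_rpg, r_max=10):
    """Per-inning run probabilities [P(0), ..., P(r_max)].

    ASSUMPTION: exponent m*A + p*R + n*(R/A) + b with negative coefficients,
    rescaled so that nine innings average avg_rpg runs.
    """
    m, n, p, b = -0.01219, -1.813, -0.3865, -1.042
    raw = [math.exp(m * avg_rpg + p * r + n * r / avg_rpg + b)
           for r in range(1, r_max + 1)]
    expected_rpg = 9 * sum(r * q for r, q in enumerate(raw, start=1))
    scaled = [q * avg_rpg / expected_rpg for q in raw]
    return [1.0 - sum(scaled)] + scaled

def runs_after(pmf, innings):
    """Distribution of total runs after `innings` independent innings."""
    total = [1.0]
    for _ in range(innings):
        nxt = [0.0] * (len(total) + len(pmf) - 1)
        for i, pi in enumerate(total):
            for j, pj in enumerate(pmf):
                nxt[i + j] += pi * pj
        total = nxt
    return total

def margin_prob(home, away, keep):
    """Sum P(home=h) * P(away=a) over score pairs where keep(h - a) is true."""
    return sum(ph * pa
               for h, ph in enumerate(home)
               for a, pa in enumerate(away)
               if keep(h - a))

def home_win_prob(home_pmf, away_pmf):
    """Chance the home team wins, with ties after nine innings going to extras."""
    # Letting the home team bat in the ninth even when ahead does not change who wins.
    home9, away9 = runs_after(home_pmf, 9), runs_after(away_pmf, 9)
    win = margin_prob(home9, away9, lambda d: d > 0)
    tie = margin_prob(home9, away9, lambda d: d == 0)
    # Extra innings: the first non-tied inning decides the game.
    up = margin_prob(home_pmf, away_pmf, lambda d: d > 0)
    down = margin_prob(home_pmf, away_pmf, lambda d: d < 0)
    return win + tie * up / (up + down)

def save_chance_prob(home_pmf, away_pmf):
    """Chance the home team leads by 1-3 runs after eight full innings."""
    home8, away8 = runs_after(home_pmf, 8), runs_after(away_pmf, 8)
    return margin_prob(home8, away8, lambda d: 1 <= d <= 3)

even = inning_pmf(4.5)
print(f"home win, equal teams: {home_win_prob(even, even):.3f}")
print(f"home lead of 1-3 after eight: {save_chance_prob(even, even):.3f}")
```

Because this calculation is exact, two teams of equal strength come out at 50% by symmetry, which is the figure the simulated table only approximates.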
