February 17, 2005
Playing With the Lineup
On the radio in Boston yesterday, the hosts of the show asked me if I thought that Edgar Renteria would be a good fit for the second spot in the Red Sox lineup. My response was that yes, Renteria would make for a good candidate for the two spot because he has a good OBP; their theory was that he would make a good #2 hitter because he hits for contact. We briefly discussed the idea that the batter in the two hole should be a contact hitter, able to get the leadoff hitter into scoring position for the big bats behind, and the general theory that higher-OBP players should bat higher in the lineup to get them more plate appearances.
The more I think about it, though, I'm not sure that we know how much difference the lineup makes most of the time, or whether some of those conventional theories about lineup structure make sense. There has been work on lineup theory out there, from Keith Woolner's efforts to more advanced research using Markov models.
The lineup is the part of the game over which the manager has the most control. Like the recent trends in bullpen management, however, it's so ruled by convention that managers really can't do anything that contradicts mainstream thinking without drawing the ire and consternation of fans and writers alike. When Tony LaRussa started batting his pitcher eighth instead of ninth to "get more guys on base in front of Mark McGwire," he was lambasted. When the A's put Jeremy Giambi--who was as good at stealing as Tommy Williams--in the leadoff spot, the outcry from the media was even greater. Writers stacked soapboxes on top of soapboxes to scream about the essential skills that a leadoff hitter must have, conclusions drawn from years of experience watching and playing baseball.
A lineup's construction has two ramifications: how many times each player bats, and how those plate appearances interact with each other (for the time being, I'm going to disregard tactical matters such as alternating left-handed and right-handed hitters to mitigate platoon issues). The first effect is quite simple and can easily be estimated by average team performance and lineup position. If Barry Bonds bats first, he's going to get significantly more plate appearances over the course of the season than if he bats lower. Usually, that's a good thing, but the reason that Bonds doesn't bat first is the second ramification. Conventional wisdom teaches that the value of those extra plate appearances will be nullified by the fact that Bonds will frequently come to the plate with no one on base by virtue of batting leadoff or, later in the game, after the weaker hitters at the bottom of the lineup. His ability to advance baserunners with long hits is wasted.
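The first effect can be sketched in a few lines of Python. This is only an illustration: the function name `plate_appearances_by_slot` and the figure of 38 team plate appearances in a game are assumptions, not numbers from the program described in this article.

```python
def plate_appearances_by_slot(team_pa, lineup_size=9):
    """Split one game's team plate appearances across the batting order.

    Every slot bats once per full trip through the order; the leftover
    plate appearances go to the slots at the top of the lineup.
    """
    full_turns, extra = divmod(team_pa, lineup_size)
    return [full_turns + (1 if slot < extra else 0)
            for slot in range(lineup_size)]


# Hypothetical game in which the team sends 38 batters to the plate:
# four full trips through the order, plus two extra plate appearances.
game = plate_appearances_by_slot(38)
print(game)  # [5, 5, 4, 4, 4, 4, 4, 4, 4]
```

Summed over 162 games, each step down the order costs a hitter roughly 162 / 9 = 18 plate appearances, since every slot is the last one to bat in about one game out of nine.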
In order to start to tackle the problem of lineup optimization from a more theoretical and mathematical standpoint, I've written a program that simulates games from a probabilistic perspective. This program is quite similar to something like Strat-O-Matic; it is given a large set of probabilities for game situations and it "plays" out the games, going through each batter and determining the outcome based on the given probabilities. A basic example would be the following:
The sum of these (.350) would be a player's on-base percentage (OBP). For a plate appearance by this player, the program will generate a random number between 0 and 1. If that number is less than .350, the player reaches base. If the number is between 0 and .232 (the player's probability of a single), the player is credited with a single. Similarly, if the number is between .233 and .267 (the sum of the probability of a single and a double), the player is credited with a double, and so on. The program then adjusts the game situation accordingly and moves on to the next batter.
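A minimal Python sketch of that lookup follows. The single (.232) and double (.035) probabilities come from the thresholds quoted above; the triple, home-run and walk shares are assumed values chosen only so that the total comes to .350, and the function names are hypothetical rather than taken from the actual program.

```python
import random

# Single and double follow the figures quoted in the text. The remaining
# three shares are ASSUMED for illustration so the total equals .350.
OUTCOME_PROBS = [
    ("single", 0.232),
    ("double", 0.035),
    ("triple", 0.005),    # assumed
    ("home_run", 0.020),  # assumed
    ("walk", 0.058),      # assumed
]


def resolve_plate_appearance(probs, roll):
    """Map a roll in [0, 1) to an outcome by walking cumulative thresholds."""
    threshold = 0.0
    for outcome, probability in probs:
        threshold += probability
        if roll < threshold:
            return outcome
    return "out"  # anything at or above the on-base total is an out


def simulate_plate_appearance(probs, rng=random):
    """Draw one random number and resolve it to a plate-appearance outcome."""
    return resolve_plate_appearance(probs, rng.random())
```

A roll of .100 falls under the single threshold, a roll of .250 lands in the double band, and anything at or above .350 is an out.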
By providing the program with a full lineup of player probabilities and then running the program for a full 162 games, we can approximate the number of runs a given lineup would score. It would be prudent to note at this point that there are a number of assumptions built into the program. First off, there are no stolen bases. Secondly, there are several situations in baseball where multiple outcomes can result from the same inputs--e.g., if there is a man on second base when a single is hit, that runner can remain at second, advance to third, score or be thrown out on the bases. Because no speed component of any kind has been employed yet, the major-league average probability is assigned to each of these events.
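The overall loop can be sketched in Python as follows. This is a simplified stand-in, not the program itself: baserunners here advance exactly as many bases as the batter (and only when forced on a walk), whereas the program assigns major-league average probabilities to each baserunning outcome. All names are hypothetical, and the triple, home-run and walk probabilities are assumed.

```python
import random

# Hypothetical hitter: single and double match the figures quoted earlier;
# triple, home run and walk are ASSUMED so the on-base total is .350.
AVERAGE_HITTER = [("single", 0.232), ("double", 0.035), ("triple", 0.005),
                  ("home_run", 0.020), ("walk", 0.058)]
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def resolve(probs, roll):
    """Map a roll in [0, 1) to an outcome via cumulative thresholds."""
    threshold = 0.0
    for outcome, probability in probs:
        threshold += probability
        if roll < threshold:
            return outcome
    return "out"


def advance(bases, outcome):
    """Return (new_bases, runs); bases is a (first, second, third) tuple."""
    first, second, third = bases
    if outcome == "walk":  # runners move only when forced
        runs = 1 if (first and second and third) else 0
        return (True, second or first, third or (first and second)), runs
    gained = BASES_GAINED[outcome]
    new_bases, runs = [False, False, False], 0
    for index, occupied in enumerate(bases):
        if occupied:  # simplification: runner gains as many bases as batter
            if index + gained >= 3:
                runs += 1
            else:
                new_bases[index + gained] = True
    if gained == 4:
        runs += 1  # the batter scores on a home run
    else:
        new_bases[gained - 1] = True
    return tuple(new_bases), runs


def simulate_game(lineup, rng, innings=9):
    """Play nine innings, with no extra innings; return runs scored."""
    runs, batter = 0, 0
    for _ in range(innings):
        outs, bases = 0, (False, False, False)
        while outs < 3:
            outcome = resolve(lineup[batter % len(lineup)], rng.random())
            batter += 1  # the batting order carries over between innings
            if outcome == "out":
                outs += 1
            else:
                bases, scored = advance(bases, outcome)
                runs += scored
    return runs


def simulate_season(lineup, rng, games=162):
    """Total runs scored over a season of independent games."""
    return sum(simulate_game(lineup, rng) for _ in range(games))
```

Calling `simulate_season([AVERAGE_HITTER] * 9, random.Random(2005))` plays one 162-game season for nine identical hitters; repeating that call 1,000 times gives a distribution of season totals.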
With all of that stated, we can now begin testing various theories about lineup construction using the program. For each lineup, the program will run 1,000 seasons, producing a minimum runs scored, a maximum runs scored and a mean. To establish a baseline, we'll use the major-league average for each probability for all nine players in the lineup, essentially seeking to determine what an average major-league lineup would generate in the program.
(Keep in mind that this result is not intended to match major-league run-scoring averages exactly, for several reasons. First, and most obviously, stolen bases, double plays, sacrifices and other more specific game events have yet to be considered. Second, these simulated games will never go into extra innings. Third, park and opposing pitching effects are not calculated.)
For each batter in the lineup, we will use the following probabilities:
Running through 1,000 iterations yields the following results:
Minimum Runs: 657
Further, the results for each player:
Player 1: 799 PA, .260/.331/.415
(The standard deviation of runs per season is about 77, a rather large number. More on this later.)
Thus, the program baseline for an average major-league lineup given the absence of auxiliary game factors is 785 runs. Of course, that says absolutely nothing practical. There's no such thing as a major-league average lineup, much less any team with nine identical hitters. However, it does provide a baseline for the program.
To start addressing some more practical problems, changes to the players in the lineup must be made. In order to minimize the difference between the new players' output and the old, the new lineup will attempt to sum up closely to the old. In essence, home-run power removed from one hitter will be added to another. I'll also generate a group of players comprising a typical major-league lineup, not quite the Red Sox, yet not the Diamondbacks:
Player 1: .305/.361/.346
Players 1 and 2 are the higher on-base percentage players with little power; Player 3 is the big slugger; Players 4 and 5 are the more typical middle-of-the-lineup power hitters; Player 6 is a slightly less powerful hitter with a slightly higher average; Players 7 and 8 are the bottom fillers with little power but respectable OBP; and Player 9 is a pitcher (we'll play NL for now, though it should be noted that using the DH in finding the average ML player does make this a slightly more robust NL lineup). Again, this team's total probability of each batting outcome is the same as the major-league average team above, just distributed more realistically among the players. Running the program with this lineup, in the order listed above, yields the following results:
Minimum Runs: 665
From these results, we can see that this lineup does slightly better than the previous version, to the tune of 16 runs on average. Thus, conventional lineup structure, at least so far, seems to improve upon theoretical lineup structure slightly.
Now it's time to start mixing things up and having a little fun. In an effort to generate an optimal lineup structure, the first step is to verify some of the basic underlying principles. First, the idea that players with higher AVG, OBP or SLG should be higher in the lineup can easily be tested. To avoid tainting the results, each player will have the same stats except for the stat being tested. For example, when testing AVG, each player will have the same OBP and SLG. The program will be given six different lineups, two for each of three "teams." Each of the three teams will have one statistic in which its players all differ, while the other two statistics remain the same. These three teams will be analyzed twice, once with the variant statistic in descending order and once in ascending order. Further, the range of the difference in the variant statistic will be closely mapped to actual major league distribution. So despite the occasional Bonds, the program won't have anyone with a .605 OBP.
After running each lineup, the program produced the following results. Below are the minimum number of runs, the mean, the maximum, and the 25th and 75th percentiles (the first and third quartiles). From the numbers, a fair idea of the curve of each lineup can be gathered.
Lineup      Min   1st Quartile   Mean   3rd Quartile   Max
AVG Desc    672   752            780    806            923
AVG Asc     662   755            782    808            919
OBP Desc    705   790            818    846            947
OBP Asc     660   762            792    821            926
SLG Desc    676   762            790    816            912
SLG Asc     656   747            777    805            926
To maintain the continuity of the two control statistics, the three different "teams" were not equal, so it's important not to compare across statistics. However, comparing the same teams with the different lineups is revealing. The two lineups with variant AVG produced nearly identical results, indicating that a player's batting average has little impact on lineup performance. However, stacking a lineup with respect to SLG and, to a greater extent, OBP does seem to result in an increased chance of run scoring.
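For readers who want to produce the same kind of summary from their own simulated seasons, here is a small Python sketch using the standard library. The function name `summarize` is hypothetical, and the five season totals in the usage example are made-up numbers, not output from the program.

```python
import statistics


def summarize(season_runs):
    """Return min, first quartile, mean, third quartile, max and stdev."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    first_quartile, _median, third_quartile = statistics.quantiles(season_runs, n=4)
    return {
        "min": min(season_runs),
        "q1": first_quartile,
        "mean": statistics.mean(season_runs),
        "q3": third_quartile,
        "max": max(season_runs),
        "stdev": statistics.stdev(season_runs),
    }


# Made-up season totals, purely to show the call; a real run would pass
# in 1,000 simulated season totals for one lineup.
example = summarize([700, 750, 800, 850, 900])
print(example["min"], example["mean"], example["max"])  # 700 800 900
```

Comparing the quartiles of two orderings of the same team, rather than only the means, shows whether the whole distribution shifts or just its tails.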
Another tenet of lineup structure is grouping or bunching better players together. Generating lineups to test this is very simple: a team is created with only two different kinds of players, six average players and three superstars. The first iteration will have the superstars batting second, fifth and eighth. The second will have them batting third, fifth and seventh. The third will have them fourth, fifth and sixth. By keeping them balanced around the middle of the lineup, the effects of having better players higher in the order should be reduced.
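Building those three batting orders is mechanical. The Python sketch below uses placeholder labels rather than real probability tables; `place_stars` and `ARRANGEMENTS` are hypothetical names introduced only for illustration.

```python
def place_stars(star, regular, star_slots, lineup_size=9):
    """Return a batting order with `star` in the 1-indexed `star_slots`."""
    return [star if slot in star_slots else regular
            for slot in range(1, lineup_size + 1)]


# The three arrangements described above, keyed by a descriptive label.
ARRANGEMENTS = {
    "Not bunched": (2, 5, 8),
    "Somewhat bunched": (3, 5, 7),
    "Very bunched": (4, 5, 6),
}

for label, slots in ARRANGEMENTS.items():
    order = place_stars("STAR", "avg", slots)
    average_slot = sum(slots) / len(slots)  # 5.0 for every arrangement
    print(f"{label}: {order} (average star slot {average_slot})")
```

Because the average star slot is 5 in every arrangement, the stars bat at the same average position in the order, so differences in scoring can be attributed mostly to how tightly they are grouped.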
The following results came back:
Lineup                     Min   1st Quartile   Mean   3rd Quartile   Max
Not bunched (2,5,8)        792   952            985    1017           1154
Somewhat bunched (3,5,7)   857   953            986    1019           1136
Very bunched (4,5,6)       824   962            994    1026           1173
Looking at the quartiles and the mean, it's clear that bunching doesn't take much effect until the players are actually batting sequentially. Even then, however, the effect at the mean is fewer than 10 runs.
Getting back to that average lineup that scored 801 runs (with a minimum of 665 and a max of 934), we can quickly test some more theories about lineup construction. First, let's look at Tony LaRussa's infamous movement of the pitcher to the eighth spot in the lineup:
Minimum Runs: 673
Not much difference to speak of; though the difference is slight, all three metrics adjust slightly upward. Next, let's try the idea that players with high OBP should be moved to the top of the lineup regardless of their power numbers, to give them the most plate appearances while using fewer outs:
The Bonds Opening:
Minimum Runs: 670
While the difference is again slight, this measure seems to have decreased the potency of the lineup, though it should be noted that the hypothetical lineup doesn't have a standout OBP player to the degree of Bonds, and its initial structure was already quite close to a descending OBP strategy.
It would be interesting to add things like double plays, stolen bases and baserunning skills to the program, but I'll save that for a follow-up. At this point, the most interesting conclusion to me is just how wide the range of runs-scored results is for identical lineups. For the initial average lineup the standard deviation was 77 runs; the standard deviations for the other lineups were much closer to 40. The implication is that even if a general manager knows exactly what each player is going to hit in a given season, the 95% confidence range (roughly two standard deviations on either side of the mean) is about 160 runs wide. This is something that we don't talk about too much, but think about that: you know exactly how each player is going to hit this year and your team could win 84 games or it could win 100. That's just statistics for you. Yes, about 68% of the time they're going to win between 88 and 96, but keep that in mind when you go back at the end of the year to review preseason predictions.
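The arithmetic behind those win ranges can be checked with a short Python sketch. Two things here are assumptions rather than statements from the program: the conversion of roughly 10 runs per win (a common rule of thumb that matches the 84-to-100 range above), and treating season totals as normally distributed around a 92-win midpoint.

```python
from statistics import NormalDist

RUNS_SD = 40        # standard deviation of season runs quoted above
RUNS_PER_WIN = 10   # ASSUMED rule-of-thumb conversion
EXPECTED_WINS = 92  # midpoint of the 84-to-100 range in the text

# Model season wins as normal, with the run spread converted into wins.
wins = NormalDist(mu=EXPECTED_WINS, sigma=RUNS_SD / RUNS_PER_WIN)


def coverage(dist, low, high):
    """Probability that a draw from `dist` lands between low and high."""
    return dist.cdf(high) - dist.cdf(low)


one_sd = coverage(wins, 88, 96)   # one standard deviation either side
two_sd = coverage(wins, 84, 100)  # two standard deviations either side
print(f"88-96 wins: {one_sd:.1%}, 84-100 wins: {two_sd:.1%}")
```

Under these assumptions, one standard deviation on either side of the mean covers about 68% of seasons and two cover about 95%, which is where the 160-run (16-win) spread comes from.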
With regard to lineup structure, this was far from exhaustive research on the subject, but it appears that bunching better players together and sorting by descending OBP yields the best results for run scoring with similar lineups. However, the differences between those lineups and the traditional lineup structure are minimal. It's entirely possible that adding factors such as steals, extra bases and left-right alternation may make enough of a difference to counteract losses in OBP towards the top of the lineup or bunching of the better hitters.
Back to the Boston question: will Renteria be a good #2 hitter in Boston this year? Renteria will be a good hitter. If he bats second, then he'll be a good #2 hitter, but it doesn't seem to make much difference where he bats.