Last week’s column about lineup order optimization generated a greater

response than I anticipated, especially for one with such loose

conclusions, so I’m going to dig a little deeper into the topic. So

far, the only application of the lineup program has been checking

various basic ideas–like sorting by descending or ascending AVG, OBP,

and SLG and bunching of the better hitters–but there’s a bit more that

can be done before adding some more enhancements to the program to see

if we can attempt to adjust for baserunning, steals, and platoons.

One of the more interesting questions left unanswered last week was

just how important sorting by OBP or SLG is. By using two lineups for

each metric–one in ascending order and one in descending order–it

was clear that players with higher OBP and SLG should be near the top of

the order. Sorting in exactly the wrong direction only changed the lineup

output by 26 runs at the OBP mean and 13 at the SLG mean. Considering

the sample size and the standard deviations, the results were close to

statistically significant, but the confidence was not high. Thus, we

could only loosely conclude that OBP is more important than SLG when

determining a lineup order when all other factors are equal.

What was not addressed was the fact that teams often have to make the

choice between the two. It’s easy to choose to bat a player with a

.260/.330/.500 line earlier than a player with a .260/.330/.400 line,

but things become a little muddled with comparing something like

.260/.310/.500 to .260/.360/.380.

To begin to take a look at that question, I put together a new team,

but to keep things simple, this team only has three players. First up is

**Wily Mo Pena**, the resident high-SLG, low-OBP sample

point with a 2004 line of .259/.316/.527. Pena is the only player last

year who slugged at least .500 with an OBP lower than .320 in at

least 300 PAs. Congratulations, Wily. Next is **Luis Castillo**,

selected for his .291/.373/.348 performance last

year. Castillo’s OBP outpaced his SLG by one of the largest differences

in the league, thus making him the perfect candidate for the high-OBP,

low-SLG slot. Finally, we’ll plug the last hole with

**Morgan Ensberg**, who comes in at an impressively league average

.275/.330/.411. Though he is a little shy in the power department,

Ensberg makes a nice “this porridge is just right” player between Pena

and Castillo.

Each of these players was given three spots in the lineup and then all

possible lineup combinations of these three players were run through the

program (which runs each lineup through 1,000 seasons), giving us a

sample size of well over a million seasons by the time things are all

finished. The program outputs a minimum, mean, and maximum for each

lineup. I also outputted the full results for the first 50 lineups to

check standard deviations, all of which were between 39 and 41 runs. Of

all the lineups, the highest mean runs scored was 834; the lowest mean

was 816. Despite testing every possible combination with these three

players, the range of means over the entire sample was only 18 runs. There’s

just not that much difference.
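For a sense of scale, the arithmetic behind that sample can be sketched in a few lines of Python. This is a back-of-the-envelope check, not the simulation program itself. It assumes that swapping two copies of the same player does not count as a new lineup, that simulated seasons are independent, and that the standard deviation is a round 40 runs (the middle of the 39 to 41 range above). The variable names are made up for illustration.

```python
from math import factorial, sqrt

# Three players, three lineup spots each. Swapping two copies of the same
# player does not create a new lineup, so the number of distinct batting
# orders is the multinomial coefficient 9! / (3! * 3! * 3!).
SPOTS_PER_PLAYER = 3
NUM_PLAYERS = 3
lineup_size = SPOTS_PER_PLAYER * NUM_PLAYERS

distinct_lineups = factorial(lineup_size) // factorial(SPOTS_PER_PLAYER) ** NUM_PLAYERS

# Each lineup is simulated for 1,000 seasons (figure from the column).
SEASONS_PER_LINEUP = 1_000
total_seasons = distinct_lineups * SEASONS_PER_LINEUP

# ASSUMPTION: a season-to-season standard deviation of about 40 runs and
# independent seasons, which gives the standard error of one lineup's mean.
ASSUMED_SD = 40.0
se_of_lineup_mean = ASSUMED_SD / sqrt(SEASONS_PER_LINEUP)

print(distinct_lineups)             # number of distinct lineups
print(total_seasons)                # total simulated seasons
print(round(se_of_lineup_mean, 2))  # standard error of a lineup mean, in runs
```

Under those assumptions there are 1,680 distinct lineups and 1.68 million simulated seasons, and each lineup's 1,000-season mean carries a standard error of a little over one run.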

Still, 18 extra runs can be hard to come by when shopping for

players, so it’s still worth looking into a little more deeply. For each

player, I’ve averaged how many runs the team scored when they were in a

given lineup spot. Here’s what we’ve got:

While the range above is very small, the sample is large

enough to draw a few conclusions. First, notice how Pena

and Castillo are extremely divergent in the #1, #3, and #4 spots in the

order, but are almost equal in the #2 and #5 spots. Having a high-OBP

player in the top spot maximizes run scoring, but the advantage of OBP

is quickly lost to SLG, perhaps as early as the second spot in the

lineup. On-base percentage comes back with a vengeance in the bottom

four spots. Ensberg–the average player data point–appears to outpace

both high-OBP and high-SLG towards the bottom of the lineup, but I

wonder how much of that is simply the fact that he’s not as good a

hitter as the other two; the apparent boost in run scoring when he’s at the

bottom of the order may simply be a result of Pena and Castillo getting

more plate appearances when he’s at the top of the order.
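The per-slot averaging described above could be done roughly as follows. This is a minimal sketch, not the actual program: `placeholder_mean_runs` is a hypothetical stand-in that returns a constant, since the simulator and its per-lineup results are not reproduced here, and the function names are invented for illustration.

```python
from collections import defaultdict
from itertools import permutations

PLAYERS = ("Pena", "Castillo", "Ensberg")


def distinct_lineups(players=PLAYERS, spots_each=3):
    """Every distinct batting order when each player fills `spots_each` spots."""
    roster = [name for name in players for _ in range(spots_each)]
    # permutations() treats the copies as distinct, so de-duplicate with a set.
    return sorted(set(permutations(roster)))


def slot_averages(mean_runs_by_lineup):
    """Average the lineup means over all lineups with a given player in a given slot."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for lineup, mean_runs in mean_runs_by_lineup.items():
        for slot, player in enumerate(lineup, start=1):
            totals[(player, slot)] += mean_runs
            counts[(player, slot)] += 1
    averages = {key: totals[key] / counts[key] for key in totals}
    return averages, dict(counts)


def placeholder_mean_runs(lineup):
    """HYPOTHETICAL stand-in for the simulator's mean runs; not real results."""
    return 825.0


lineups = distinct_lineups()
results = {lineup: placeholder_mean_runs(lineup) for lineup in lineups}
averages, counts = slot_averages(results)

# How many lineups feed each player/slot average.
print(len(lineups), counts[("Pena", 1)])
```

Each player/slot average pools 560 of the 1,680 lineups, exactly one third, because a given slot is held by each of the three players equally often.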

Looking at the best and worst performing lineups confirms a little of

this. Here are the three lineups that mustered the maximum 834 run mean:

| Pos | Lineup 1 | Lineup 2 | Lineup 3 |
|-----|----------|----------|----------|
| #1 | Castillo | Castillo | Castillo |
| #2 | Castillo | Castillo | Castillo |
| #3 | Pena | Pena | Pena |
| #4 | Pena | Pena | Pena |
| #5 | Pena | Pena | Pena |
| #6 | Castillo | Ensberg | Ensberg |
| #7 | Ensberg | Castillo | Ensberg |
| #8 | Ensberg | Ensberg | Ensberg |
| #9 | Ensberg | Ensberg | Castillo |

And the two that notched the minimum 816:

| Pos | Lineup 4 | Lineup 5 |
|-----|----------|----------|
| #1 | Pena | Ensberg |
| #2 | Castillo | Pena |
| #3 | Castillo | Castillo |
| #4 | Castillo | Castillo |
| #5 | Ensberg | Ensberg |
| #6 | Ensberg | Ensberg |
| #7 | Ensberg | Pena |
| #8 | Pena | Pena |
| #9 | Pena | Castillo |

From this small sample, Pena’s power in the fifth spot looks to

slightly outweigh his value in the second spot. Castillo still finds his

way towards the top of the lineup in the first of the worst lineups, but

the biggest difference between the worst and best lineups is the

presence of a couple Wily Mo’s at the bottom of the order. One other

interesting point to note is that Lineup 3 and Lineup 4 both have the

same bunching; they just happen to start at different points in the order. In this

example, bunching of high-SLG or high-OBP hitters does not appear to

have a significant effect on run scoring.

It’s rare for a team to have three Penas or Castillos, so in another

effort to see where their particular talents are best suited, I ran

nine lineups for each: eight average players plus Pena or Castillo, who

batted in each of the nine positions in turn. Here’s how they shook out:

Pena shows a great deal more range in his results than Castillo,

peaking in the third and fourth spots, as expected from the previous

results. Interestingly, this result appears even without the typical

poor hitters at the bottom of a lineup. Most of the criticism of putting

a slugger towards the top of the lineup centers on the reduced

number of runners on base in front of him, but the results

here seem to indicate that the advantage is something else, perhaps the

right combination of leading off the first inning with a better OBP, but

still getting the slugger the maximum number of plate appearances. While

Castillo’s top production is in the first spot, he shows far less change

as he moves down the lineup.
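The setup for this second test is easy to reproduce. The sketch below only builds the batting orders; it does not simulate anything. The `"Average"` label and the function name are hypothetical, and the batting line used for the average player is not given in this column, so it is left out.

```python
def single_star_lineups(star, filler="Average", size=9):
    """Build one lineup per slot: `star` in that slot, `filler` everywhere else."""
    lineups = []
    for slot in range(size):
        lineup = [filler] * size  # eight average hitters plus one slot to overwrite
        lineup[slot] = star
        lineups.append(tuple(lineup))
    return lineups


pena_lineups = single_star_lineups("Pena")
castillo_lineups = single_star_lineups("Castillo")

# Show each of the nine Pena lineups alongside the spot he bats in.
for slot, lineup in enumerate(pena_lineups, start=1):
    print(slot, lineup)
```

Each of the nine lineups per player would then be fed to the simulator, and the resulting means compared by lineup slot.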

So where does this leave us? Remember that we’re dealing with a very

small range of possible outcomes, meaning that many of the conclusions

drawn from these results cannot be considered statistically significant.

That said, when teams have a choice between a high-SLG, low-OBP player

like Pena and a high-OBP, low-SLG player like Castillo, the traditional

lineup structure with Castillo towards the top and Pena in the 3-5 spots

yields near maximum run scoring. Though it may be ideal to bat

baseball’s best hitters–those who are among the league leaders in both

OBP and SLG–towards the top of the lineup, teams that are forced to

choose between high OBP and SLG appear to already be following a

near-optimal model for maximizing run scoring.