
In Episode 1080 of the Effectively Wild podcast, co-host Jeff Sullivan noted that several of the batters who’ve reached base via error three times in a game did so in doubleheaders (1:06:06):

"This seems like something that somebody could research, to see if errors have been more prevalent in doubleheaders. If you adjust for era, there could be something there. I am not nearly interested enough to do the research … so somebody out there, do the hard work!"

I took up the mantle of somebody, but as you’ll see, it was not I who did the hard work.

I didn’t exactly follow Sullivan’s edict, though. I concentrated only on recent doubleheaders, with recent defined as since the 30-team era began in 1998. Doubleheaders have become uncommon in contemporary baseball, enough so that the difference between a twinbill and a regular game may not be the same as when they were more common. The reasons for the decline in doubleheaders have a little to do with collective bargaining (players don’t like them) and a lot to do with economics.

Fifty years ago, the Cardinals, Red Sox, Dodgers, and Mets were the only teams that drew over 1.5 million fans to their home games. A single-admission doubleheader in 1967 could mean a nice boost in ticket sales—those four teams were the only ones to average over 20,000 per game—along with extra concessions. So far this year, the only teams that haven’t drawn 20,000 per game are the A’s and Rays. Nearly half of all teams are averaging over 30,000 per game. There is no economic reason to schedule a doubleheader; you’re not going to get much of an attendance bump.

As a result, doubleheaders are mostly unscheduled, the result of postponements, and even then they’re largely of the day-night, two-separate-admission variety. There were 72 doubleheaders in 1998, the first season of the 30-team era. There were only 28 last year.

Still, Sullivan’s question is valid. Doubleheaders may be infrequent, and the break between games of a day-night twinbill is longer than for a single-admission double dip, but they put similar strains on a team. You go through a bunch of pitchers. You have to split catching duties. And for those players who play both games, the mental and physical fatigue of a nine-inning major-league game is doubled. Those strains, and that fatigue, could affect play on the field. But do they?

To answer that question, our data genius Rob McQuown did the hard work. He retrieved data on every doubleheader played from 1998 through last week. That gave me 994 doubleheaders, comprising 1,988 team games. That number is dwarfed, of course, by the 94,170 non-doubleheaders since 1998. But it’s a robust sample, equivalent to over two-fifths of a full season, or over a dozen team seasons. That’s enough to draw some conclusions.

Now, some of you are going to object to this data. You’re going to say that I should be comparing the doubleheaders between, say, the Tigers and Royals to the non-doubleheader games between the Tigers and Royals, not combining all doubleheader games and all non-doubleheader games. I should be taking the ballpark into account, and the time of year, given how offense tends to rise with temperatures.

Those are all valid observations! And I’m going to ignore them. I want to get a rough idea of what goes on in doubleheaders, that’s all. Close is good enough. It’s not like this is going to change the way the game is played. Here are my results. I think this table is pretty self-explanatory. The only abbreviation that may be unfamiliar is UER for unearned runs.

| Metric | Doubleheaders | Other Games | Difference |
|---|---|---|---|
| Runs/Game | 4.718 | 4.600 | 0.119 |
| Errors/Game | 0.663 | 0.634 | 0.029 |
| UER/Game | 0.376 | 0.363 | 0.013 |
| BA | .2642 | .2615 | .0027 |
| OBP | .3331 | .3293 | .0038 |
| SLG | .4169 | .4163 | .0006 |
| OPS | .7499 | .7456 | .0044 |

One of the problems I find when presenting tables of this type, in which one of the columns is labeled difference, is the sign to use. Should I put a plus sign in front of positive differences and a minus sign in front of negative differences? Or do positive numbers get presented as is, with parentheses around negative results? In this case, I don’t have to make this decision. Every single difference is positive! There’s more offense, more runs, and more errors in doubleheaders than in other games. Without exception.

But some of those changes are pretty small. A .4169 slugging percentage vs. a .4163 slugging percentage … that’s a pretty imperceptible difference. On the other hand, as I pointed out, we’ve got some pretty robust samples here. You get enough data, and even small changes can be statistically significant.

So I’m going to present the same table, but this time I’m going to add a column labeled P Value. That’s the significance level of the difference between the two numbers, using this online calculator. If you’re not into P values, just know this: A value below 0.10 is somewhat statistically significant. A value below 0.05 is pretty clearly statistically significant. To make it easy, I’ll bold the strongly significant differences and put the more weakly significant ones in italics. (If, on the other hand, you are into P values, you probably want to read this footnote. [1])

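| Metric | Doubleheaders | Other Games | Difference | P Value |
|---|---|---|---|---|
| Runs/Game | 4.718 | 4.600 | 0.119 | **.023** |
| Errors/Game | 0.663 | 0.634 | 0.029 | **<.0001** |
| UER/Game | 0.376 | 0.363 | 0.013 | **.001** |
| BA | .2642 | .2615 | .0027 | *.097* |
| OBP | .3331 | .3293 | .0038 | **.0005** |
| SLG | .4169 | .4163 | .0006 | 0.656 |
| OPS | .7499 | .7456 | .0044 | *.0673* |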

There are two conclusions here, I think.

  1. Sullivan’s intuition was right. There is a statistically significant increase in errors in doubleheaders compared to other games, resulting in a statistically significant increase in unearned runs. We can imagine plenty of reasons for the sloppier play: tired players who play both games, and rusty, less talented substitutes who play just one game.
  2. The increase in unearned runs contributes to—but does not solely cause—an increase in runs overall in doubleheaders. That increase is caused largely by a statistically significant increase in on-base percentage, with only a little attributable to more base hits. I didn’t show it in the table, but there’s a statistically significant increase in both walks per game and hit batters per game in doubleheaders, again likely due to fatigued and/or underused pitchers. (We can all recall minor leaguers being called up for a doubleheader and sent back down once it’s over.)

The trends are significant, but are they interesting? Probably not. With so few doubleheaders being played now, the difference isn’t very important. I randomly looked up the 1938 Boston Bees, just to see how much things have changed. They played a doubleheader in May, three in June, eight in July (including July 1, 3, and 4!), eight in August, and nine in September. That’s 29 doubleheaders! Almost two-fifths of their games were in doubleheaders!

Their successor, the Atlanta Braves, played one doubleheader in 2014, one in 2015, and one in 2017. That’s it over the past four seasons. Whether those two games per year featured a few more errors and runs than the other 160 is a curiosity, not a feature.



[1] For those of you statistically inclined, you’ll know that t-tests such as these require means, which are in my table, and sample sizes, which are 1,988 for doubleheaders and 94,170 for non-doubleheaders. They also require standard deviations. Here’s what I did there. The standard deviation for runs per game is about two-thirds of the mean calculated on a per-game basis. It’s more like 12% or so on a per-team basis. I wanted to set my deviations fairly wide, so I arbitrarily set it at 50% of the mean (i.e., closer to the game-to-game variation than team-to-team), and did the same for unearned runs. For the four batting statistics, the team standard deviations are .010 for BA, .012 for OBP, .015 for SLG, and .026 for OPS. Similar to what I did for runs per game, I simply quadrupled them for this experiment: .0429 for BA, .0484 for OBP, .0594 for SLG, .1037 for OPS. Finally, the standard deviation of team errors per game is 0.011, which is really, really small, so I set the standard deviation at 50% of the mean, as for runs per game.

I know, I know, these are borderline laughably imprecise; the key here, I think, is that I chose pretty wide standard deviations across the board, reducing the chance of false positives.
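
For anyone who wants to rerun this kind of test without the online calculator, here is a minimal sketch in Python of a two-sample t-test computed from summary statistics alone. Several things here are assumptions rather than a record of what the calculator does: the function name `t_test_from_summary` is made up for illustration, the test uses a pooled variance, and the P value uses a normal approximation to the t distribution (reasonable with samples this large). Where the standard deviation is set at 50% of the mean, the sketch uses each group's own mean, which is also a guess. The results should land near the table's P values but need not match them exactly.

```python
import math


def t_test_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t-test from means, standard deviations, and sample sizes.

    Assumptions (not from the article): pooled variance, and a normal
    approximation for the two-sided P value.
    Returns (t statistic, two-sided P value).
    """
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    # Standard error of the difference between the two means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_err
    # Two-sided tail probability of a standard normal: P(|Z| > |t|).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value


N_DH, N_OTHER = 1988, 94170  # sample sizes given in the footnote

# Slugging percentage, with the footnote's standard deviation of .0594.
t_slg, p_slg = t_test_from_summary(0.4169, 0.0594, N_DH, 0.4163, 0.0594, N_OTHER)
print(f"SLG: t = {t_slg:.2f}, p = {p_slg:.3f}")

# Errors per game, with the standard deviation set at 50% of each group's mean.
t_err, p_err = t_test_from_summary(
    0.663, 0.5 * 0.663, N_DH, 0.634, 0.5 * 0.634, N_OTHER
)
print(f"Errors/Game: t = {t_err:.2f}, p = {p_err:.6f}")
```

The standard deviations are arguments rather than constants so that it's easy to see how much each P value depends on those admittedly arbitrary choices.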

newsense
7/27
Shouldn't there be a difference between the first and second games of a doubleheader? After all, players shouldn't be fatigued during the first game. Actually the better study would be comparing the first game to the second rather than DH games to non-DH games. (Use a paired t-test)
mainsr
7/27
Perhaps. Maybe even probably! But I didn't do it that way for two reasons. First, I don't assume there'd be a meaningful difference on the pitching side. It's going to be situational; if you have a bunch of quad-A arms in the bullpen, you're just as likely to use them in a blowout first game as in a close second game. Second, while you're undoubtedly right about players playing both games getting worn down, there's not a definite formula regarding which games the nominal subs play. For example, if a team plays a night game Tuesday, a day-night DH Wednesday, and has an off day Thursday, the regular catcher could well catch the second game rather than the first. Third, and most importantly, there'd be a bit of a sample size issue, given how rare DHs are. FWIW, of the 43 games in which a player reached base via error three times, it occurred in the first game of a doubleheader three times and in the second game four times.
mainsr
7/27
*three reasons. Not two.
tribefan204854
7/28
In doubleheaders, pitching use is different in both halves of the twin bill. I would suspect that a pitcher who isn't doing well will pitch to more batters, and that would occur in both games. Whether errors may be related to unfamiliar lighting, boredom (outfield), or fatigue, or whether this may be due to more play by second stringers who aren't as good in the field, might also be further investigated. However, with so few doubleheaders, the issue may be moot.
mainsr
7/30
I agree with you completely about pitching. With fielding, though, given the lack of scheduled DHs (in which the second game is often played in shadows) compared to day-night DHs (in which both games start at more or less regular times), I don't know if lighting would be an issue. And I think the key issue here is your last sentence.
lichtman
7/28
DH games feature inferior players all across the board, batters, pitchers, and probably fielders. The higher run scoring is likely due to inferior pitching being stronger than inferior batting as well as taxing the bullpen. That being said you probably want to look at errors per BIP as it may be that the difference in error rate per game is completely explained by more BIP.
mainsr
7/28
That makes sense. I probably should've gotten Rob to retrieve K's for me so I could look at BIP. I wanted to test Jeff's hypothesis that there are more errors in doubleheaders than other games, and that turned out to be true. But yes, an increase in BIP could underlie both more errors and more base hits.
duncanf
7/30
It'd be interesting to know of the dataset which lineup is presented in which game. Managers could play the "A" team in the second game as opposed to the first.
mainsr
7/30
You can blame me for the methodology here. I just asked Rob to give me annual summary data for all doubleheaders, not individual game lines. So all games are combined and indistinguishable in this analysis.