In 2013, Adam Wainwright led Major League Baseball by pitching five complete games. In 2012, Justin Verlander was much more of an ironman and pitched six. A mere 30 years ago, in 1983, six complete games would have landed Verlander in a tie for 42nd place with such notables as Storm Davis, Bob Forsch, Jim Gott, Ken Schrom, and Bruce Hurst. Even 20 years ago, six complete games would have been good for a tie with David Cone for 15th place in MLB. What happened to finishing what you started?

Last week, we saw that starting pitchers really have seen a reduction in their workload over time. Since 1950, there has been a steady downward trend in the number of batters that pitchers have faced, the number of outs they’ve recorded, and the number of pitches that they’ve thrown. Indeed, the percentage of games in which the starter records at least 27 outs has fallen from 30 percent in 1950 to two percent in 2012.

What happened to the complete game? Well, for one, it’s hard to get through nine innings in 100 pitches, or even 110, and as we saw last week, managers have reined in their starters over time. But perhaps there’s another reason why managers have felt more and more comfortable turning to the bullpen in the seventh inning. Let’s see if we can figure it out.

Warning! Gory Mathematical Details Ahead!
Let’s consider the choice that a manager might face at the end of the fifth inning. His starter is showing signs of tiring, and he must decide whether the starter should go out for one more inning or whether he should tell someone down in the bullpen to get ready. It’s not an easy decision. He has to come up with some estimate of what he thinks the starter is capable of in the sixth. He has to balance that against what the score is, because his first job is to win the game. He also has to think about the state of his bullpen. If the starter can get through the sixth, the bullpen has to pitch only three innings rather than four, and that can affect the next day’s game. He might also be in a situation where this particular starter, tired though he may be, is still a better option than the guy he’d have to bring in.

I tried modeling how this decision has played out over time. I located all cases in which a starter had lasted all the way through the fifth inning (recorded 15 outs). To try to isolate cases in which we can surmise that the manager knew the starter was faltering, but still left him in, I looked for all cases in which the sixth inning (whether the starter completed it or not) was his final act that day. I figured out how many runs (on average) the starter surrendered in those sixth innings. I then found games in which the starter exited after exactly five innings and found out how many runs (on average) the relievers in the sixth inning gave up. For the results that I’m about to show, I considered only games in which the score was still within three runs (in either direction). It actually doesn’t end up making much difference in the overall conclusions when you take that filter off.
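For readers who want to try the same sampling themselves, here is a minimal sketch of the sixth-inning comparison described above. The record layout and every field name (`starter_outs`, `faced_batter_in_6th`, `margin_after_5`, `sp_runs_6th`, `rp_runs_6th`) are assumptions made for illustration; they are not taken from the article or from any particular play-by-play database.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not the article's data format):
#   year                 season of the game
#   starter_outs         outs the starter recorded before leaving
#   faced_batter_in_6th  True if the starter pitched to anyone in the sixth
#   margin_after_5       score margin after five innings (either sign)
#   sp_runs_6th          sixth-inning runs allowed by the starter
#   rp_runs_6th          sixth-inning runs allowed by relievers
def sixth_inning_comparison(starts, max_margin=3):
    """Average sixth-inning runs per year for the two groups in the text."""
    totals = defaultdict(lambda: {"starter": [0.0, 0], "reliever": [0.0, 0]})
    for s in starts:
        if abs(s["margin_after_5"]) > max_margin:
            continue  # keep only games within three runs either way
        outs = s["starter_outs"]
        if s["faced_batter_in_6th"] and 15 <= outs <= 18:
            # Starter went out for the sixth and it was his last inning,
            # whether or not he finished it.
            group, runs = "starter", s["sp_runs_6th"]
        elif outs == 15 and not s["faced_batter_in_6th"]:
            # Starter exited after exactly five; relievers pitched the sixth.
            group, runs = "reliever", s["rp_runs_6th"]
        else:
            continue  # left before the sixth or kept going past it
        bucket = totals[s["year"]][group]
        bucket[0] += runs
        bucket[1] += 1
    return {
        year: {g: (tot / n if n else None) for g, (tot, n) in groups.items()}
        for year, groups in totals.items()
    }
```

Setting `max_margin` to a very large number removes the close-game filter.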

Before you are allowed to see the results, you must memorize the following paragraph. The results that follow do show what actually happened, but the decision as to whether a reliever came into the game was NOT made at random. Managers probably let their better starters go an extra inning and their back-of-the-rotation guys go to the locker room. Managers with good bullpens were probably more likely to pull the plug, and those with a bullpen from Hades probably thought twice about it. Here are the results, by year, going into the sixth inning:

We see that over time, there are peaks and valleys in the number of runs that relievers (the blue line) give up when they go out there, but the numbers fluctuated within a fairly small range, between 0.3 and 0.7 runs. However, when a starter was allowed to go out there in 1950, he was likely to give up more than a run and a half! Over time, though, the gap between what starters did in their sixth and final inning and what relievers did in their first inning of work (the sixth overall) began to narrow. By 2010, the two lines were touching.

Here’s the same graph modeling the same basic decision, only this time, the game is headed into the seventh inning.

We see the same basic pattern at first, but by the time we get to the late 1970s and early 1980s, the lines cross. And for the decision headed into the eighth inning, we see the same pattern again.

All three graphs have the same message. In the 1950s and 1960s, managers were much more likely to leave the starter in for an extra inning than go to the bullpen. In fact, if we graph the number of cases where a starter is left in for another inning vs. the number of times a reliever is brought in (I’m showing the graph for the decision going into the eighth inning—they all basically have the same shape over time), the majority of cases favor “one more inning” in the 1950s, ’60s, and into the ’70s. Somewhere in the ’80s, the trend turns downward and accelerates.

Managers developed quicker hooks in the ’80s. Probably not coincidentally, this was also the time that the results that they got from leaving the starter in vs. bringing in a reliever began to come into line. A theory: over time, managers realized that when they sent just any old pitcher out there for another inning when he was tired, it produced worse results than when they had gone to the bullpen. It wasn’t always the right call to go to the pen, but managers became better at picking their spots for when a starter should be pushed and when he should be restrained. Eventually, managers realized that going with a tired starter—especially a tired, bad starter—when a fresh reliever was available was counter-productive.

For the curious, the decision going into the ninth inning looks like this.

For most of the last six decades, starters have been about equal to their relieving counterparts in the ninth inning, and lately, they’ve been better. We do need to account for the fact that those starters who are pitching in the ninth inning may not be particularly tired, while in the previous innings that we’ve studied, we’ve selected those who have a marker that at least suggests fatigue. Additionally, by the ninth inning, the manager has an eight-inning sample of how the pitcher has been performing that day to consider.

The Benefits of a Bunch of Relievers
The evidence that I’ve presented above is not a direct confirmation of why complete games have become an endangered species over the past few decades. But it does show the development of a force that surely has an impact on those rates. It used to be that managers erred on the side of “one more inning” for their starter, and the results when they did so were awful. Maybe they did so out of some sense that starters should finish their games. Given that even in the 1950s, relievers were far more effective when they were brought in, it wasn’t likely for lack of options out in the pen. Over time though, managers adjusted, calling more often for relievers and letting the starters stay in for that extra inning only when a much more sober cost-benefit analysis suggested that it was a good idea. (You could make the case that they are too conservative now when it comes to letting the starter stay in—ideally, the starter and reliever lines should be close together, meaning that managers have found some sort of equilibrium—but managers may also be thinking that another inning means a higher pitch count, and a higher pitch count is a risk factor for injury.)

So, yes. There was a time when the complete game was much more common than it is now. I’d argue that the reason for the decline isn’t because today’s pitchers are wimps. It’s because it’s always been foolish to send a tired pitcher out to the mound when there’s a better option available. It just took baseball a few decades to figure this out.

To put some sort of number on it, in the 1960s, the gap between the starter and reliever lines floated between about .2 and .3 runs. Let’s call it a quarter of a run, just to have a nice easy number to work with. By the 1980s, the performance of the two groups was equal, which coincided with the development of a quicker hook. The “cost” of the quicker hook was that teams were beginning to have pitching staffs that had 11 and 12 pitchers on them, rather than 9 or 10. The extra bodies were needed because the bullpen was going to be taxed more with a quicker hook. Let’s assume that if teams went back to the days of pushing their starters harder (either to prove how manly they are or because they want to convert some of those reliever roster spots into something more useful), we would see the same sort of discrepancy re-appear. More tired starters would be sent out to work an extra inning, at the cost of a quarter of a run each time.

In the 1960s, managers seemed to choose “send him out for one more” at a rate that was 25 percentage points higher than it is now. Let’s say that now, in 25 percent more games than he would have in 1965 (40.5 games over a 162-game season), the manager makes a decision to pull a tired pitcher, because with the expanded bullpens of this era, he can. Each time he does so, he saves his team a quarter of a run. The modern bullpen thus saves (using some admittedly slapdash math) roughly 10 runs a season over the way things used to be done. There might be other savings that come in the form of not running up pitch counts, and thus preventing injuries, but we haven’t gotten there yet.
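The slapdash math can be written out in a few lines. This is only a restatement of the estimate above; the constant names are made up for the sketch, and the 162-game season is the one figure not stated outright in the text.

```python
GAMES_PER_SEASON = 162        # standard regular-season length (assumed here)
EXTRA_HOOK_SHARE = 0.25       # the 25-percentage-point change in "one more inning"
RUNS_SAVED_PER_HOOK = 0.25    # the quarter-run gap from the 1960s

# Extra games per season in which the tired starter is now pulled.
extra_hooks = GAMES_PER_SEASON * EXTRA_HOOK_SHARE
# Runs saved per season if each of those hooks is worth a quarter of a run.
runs_saved = extra_hooks * RUNS_SAVED_PER_HOOK

print(f"{extra_hooks:.1f} games, {runs_saved:.1f} runs per season")
```

Changing either constant shows how sensitive the 10-run figure is: a gap of .2 runs instead of .25 gives about eight runs a season.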

One common critique of the modern bullpen is that those extra relievers consume roster spots that could be used to have extra hitters available on the bench for use in platoons, or for defensive specialists, pinch hitters, or designated pinch runners. Those things may very well have value, but so do the extra relievers. A few months ago, I looked at the value of some other uses of roster spots that could be facilitated through a player who played multiple positions. Finding someone a good platoon partner would add perhaps 150-200 plate appearances in which the batting team gets a handedness advantage when they would not have otherwise, and perhaps three or four additional on-base events (perhaps two or three runs). A defensive specialist replacing a really bad defender might save a team .04 runs per inning, but get to play only 80-100 innings over the course of a year (three or four runs) as a defensive replacement. Having a good “10th man” on the bench who would divert plate appearances away from the really awful-hitting (but he can play short!) utility infielder is worth a couple of extra runs per year. BP’s own Sam Miller has also shown that being able to carry a designated pinch runner (of the Billy Hamilton variety) would be worth about a tenth of a win in the space of a month, so call it five or six runs over the course of a year. And that’s if you have Billy Hamilton.
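Two of those estimates are simple enough to reproduce. The sketch below redoes the defensive-specialist and pinch-runner arithmetic; the 10-runs-per-win conversion and the six-month season are common rules of thumb assumed here, not figures taken from the article, and the function name is hypothetical.

```python
def defensive_runs(runs_saved_per_inning, innings):
    """Runs saved by a defensive replacement over a season."""
    return runs_saved_per_inning * innings

# .04 runs per inning over 80 to 100 innings.
defense_low = defensive_runs(0.04, 80)
defense_high = defensive_runs(0.04, 100)

RUNS_PER_WIN = 10        # assumption: rough rule of thumb
MONTHS_IN_SEASON = 6     # assumption: April through September
WINS_PER_MONTH = 0.1     # "a tenth of a win in the space of a month"

pinch_runner_runs = WINS_PER_MONTH * MONTHS_IN_SEASON * RUNS_PER_WIN

print(f"defensive specialist: {defense_low:.1f} to {defense_high:.1f} runs")
print(f"designated pinch runner: about {pinch_runner_runs:.0f} runs")
```

Under those assumptions the pinch runner comes out at the top of the "five or six runs" range.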

Let’s pretend that a team went back to “the old days” when pitchers were pushed, and could liberate two roster spots from fringy relievers and re-purpose them to position players. The team would lose about 10 runs of value from having to push a tired starter out there, plus probably put their starters at greater risk for injury. The math behind some of these estimates involves the occasional assumption/best guess, and in specific circumstances, the effects might be bigger (or smaller), so your exact mileage may vary.

However, looking at all of these alternate uses for a roster spot, even if you found two “best-case scenario” guys, the value that they have to replace is roughly 10 runs (on average), plus whatever benefit comes from managing the pitch counts better. Most of these extra batters are worth an upgrade of between three and five runs over the course of a season in ideal circumstances. There are probably specific cases where a team could make up those 10 runs and add some profit. But maybe you can also see that the quick hook and the big bullpen are actually a perfectly reasonable strategy, and it’s not surprising that evolutionary pressures have moved the game to favor that roster construction over time. It may not make for an aesthetically pleasing game, but it’s perfectly reasonable from the point of view of maximizing a team’s chances to win a game.
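As a rough check on that comparison, the sketch below nets two re-purposed roster spots against the bullpen estimate. It uses only the round numbers from the text (three to five runs per bench player, 10 runs for the bullpen) and ignores the pitch-count benefit, which the article does not quantify; the variable names are made up for the example.

```python
BULLPEN_RUNS = 10              # estimated value of the quick hook, per season
BENCH_RUNS_RANGE = (3, 5)      # per-player upgrade in ideal circumstances
SPOTS_CONVERTED = 2            # fringy relievers swapped for position players

# Total value of the two bench players at the low and high end of the range.
bench_low, bench_high = (SPOTS_CONVERTED * r for r in BENCH_RUNS_RANGE)

# Net change for the team; a negative number means it comes out behind.
net_low = bench_low - BULLPEN_RUNS
net_high = bench_high - BULLPEN_RUNS

print(f"net change: {net_low} to {net_high} runs per season")
```

Even at the top of the range, the swap only breaks even before any injury-risk cost is counted.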

Did you consider a chaining effect from heavy workload starters - that relievers will be kept fresher and thus more effective over the course of a season with a 240 inning guy in the rotation over a 180 one? Would there be a way to isolate this effect? Maybe reliever effectiveness by month (acknowledging that September call-ups would skew things)?

Thanks and keep up the great work
In my head, yes, although it seems that over time, the historical trend has dealt with this by having a couple extra guys in the bullpen, so as not to overtax the relief corps.
But isn't there an issue that those extra guys are not very good (sub-replacement on most teams, I'd imagine)?

Also, considering leverage, if your 8th inning guy is tired/less effective in a September pennant race then wouldn't that have a significant effect if lesser relievers move up the 'pecking order'? And wouldn't this effect be exacerbated if the better relievers are tired by the post-season?
"To try to isolate cases in which we can surmise that the manager knew the starter was faltering, but still left him in, I looked for all cases in which the sixth inning (whether the starter completed it or not) was his final act that day."

Wow, that's a selective sample and a half! I don't think you are isolating cases when the manager knows a starter is faltering but leaves him in for at most one more inning. I think you are isolating cases where a pitcher pitched horribly in the 6th and was taken out (either in the middle or after).

By no means is the result of that inning representative of how starters pitch in the 6th inning when their manager thinks they are faltering. If you look at any inning which is the last inning for a starter, you will see a bad inning. The earlier that inning is, the worse that inning will be (because the reason a manager takes out a starter in the early innings is because he pitched badly in that inning - in the later innings it could be just a high pitch count).

I also think the results you are seeing across time is mostly this:

In the early days, the only reason a manager takes out a pitcher in or after the 6th is because he had a terrible inning. In modern days, a manager takes out a starter during or after the 6th if he had a bad inning OR if his pitch count starts to get high.

I'm not really sure what we are to conclude from that...
One thing I did think about was to look only at cases where either the starter or the reliever who entered finished the inning, as a way to delete the possibility of it being a starter who suddenly lost it after everything was looking great.

There is some sense of counter-balance in that relievers can also have just absolutely horrid innings as well.
I must be confused. If I'm reading the graphs right, in the 1950s, starters tended to give up many more runs than relievers in most of the innings investigated. In the present day, this difference is much reduced. Doesn't that suggest _less_ reason to remove the starter now than before?
As I said, if you are looking at the "last inning" pitched that is not an unbiased look at the runs allowed. The "last inning" runs allowed will be inflated because a pitcher who allows lots of runs in an inning will often be taken out during or right after that inning.

The reason you are seeing more runs allowed in that last inning in "the old days" has nothing to do with how good the starters are. It simply means that in the old days, starters were taken out only during or after a terrible 6th inning. In modern times, starters are taken out after and during a bad 6th inning AND when their pitch counts are high.

So, again, these are very misleading numbers, and I don't know why Russell chose this "last inning pitched" criterion. I don't think it yields anything useful.
The argument that I'm making is that managers have gotten better about picking their spots. If they were to revert to the old days, they would send pitchers out whom they shouldn't send. It's not the starters that have improved, but the managers who have done so.
"If they were to revert to the old days, they would send pitchers out whom they shouldn't send."

Can you explain how you came to this conclusion (if you have time)? I'm not following you.
Let's leave aside your point about the selection bias in the sample (it's a valid critique, but for a moment, let's assume that it doesn't make a difference). In the 60's, we see that starters in their final innings were a quarter run worse than relievers in the same inning. Over time, we see that managers moved more toward relievers, and as they did, the lines converged. My hypothesis is that this was the managers realizing (consciously or not) that they were sending out starters when they had a reliever who would have been a better option, and becoming better about telling the difference. In the 60s they weren't. If they went back to the 60s, it would be returning to the days of incorrect assessment.
"Let's leave aside your point about the selection bias in the sample (it's a valid critique, but for a moment, let's assume that it doesn't make a difference)"

I'm sorry, but you can't leave out that point. That is THE whole point (that I am making).

"In the 60's, we see that starters in their final innings were a quarter run worse than relievers in the same inning."

The runs in that "final inning" have no reflection on the talent of the pitchers in that inning. I can create any number of runs I want in the "final inning" simply by deciding when to take out a pitcher.

As I said, the primary reason for the runs allowed being much higher in the old days in that last inning is the reason for taking the starter out. We have NO IDEA of the talent of the pitchers in that last inning. None whatsoever. You have to have an unbiased inning for that.

In fact, the ONLY thing that looking at runs allowed in the last inning tells us is why managers took them out. The higher the runs allowed in that inning are, the more managers took them out because they were bad and not because of any other reason (like a high pitch count). The other reason that the "last inning" has fewer runs allowed in modern days is because managers will leave in a pitcher if he allows lots of runs in that inning but his pitch count is low! That is because the bad inning will sometimes not be the last inning. In the old days, when a pitcher gave up a lot of runs, he was often removed even with a low pitch count.

I'm sorry, but unless I am reading your conclusions wrong, I think you made a fatal mistake by looking at the last inning.
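The selection effect described in this comment can be illustrated with a toy simulation. Everything in it is invented for the illustration: the run distribution, the "two or more runs is a bad inning" rule, and the 50 percent pitch-count hook rate. Every simulated starter has identical talent; only the rule for when an inning becomes his last one changes.

```python
import random

# Made-up distribution of runs allowed in one inning (mean 0.55 runs).
RUNS = [0, 1, 2, 3, 4]
WEIGHTS = [70, 15, 8, 4, 3]

def last_inning_average(pitch_count_hook_rate, n=100_000, seed=0):
    """Average runs allowed in innings that turned out to be the starter's last.

    The starter is always removed during or after a bad inning (2+ runs).
    With probability pitch_count_hook_rate he is also removed after the
    inning for reasons unrelated to how he pitched in it.
    """
    rng = random.Random(seed)
    total = count = 0
    for _ in range(n):
        runs = rng.choices(RUNS, WEIGHTS)[0]
        bad_inning = runs >= 2
        pitch_count_hook = rng.random() < pitch_count_hook_rate
        if bad_inning or pitch_count_hook:
            total += runs
            count += 1
    return total / count

old_days = last_inning_average(0.0)   # pulled only after a bad inning
modern = last_inning_average(0.5)     # bad inning OR pitch-count hook
print(f"old days: {old_days:.2f}, modern: {modern:.2f}, true mean: 0.55")
```

In this toy setup both "last inning" averages sit above the true mean, and the old-days figure is far higher, even though the pitchers are the same. That is the pattern the comment argues could drive the narrowing gap, although the simulation cannot say how much of the real trend it explains.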
I'm not as confident as MGL that the study revealed little or nothing, but I know I would have looked at a lot more data, in search of more information on the evolution of the game and of managerial thinking.

I would have looked at all starting pitching, inning by inning starting with the sixth, in reasonably close games. Break it down four ways: not the pitcher's final inning, lifted after the inning without a pinch-hitter (you'd want to isolate the leagues in the DH era), lifted afterward for a pinch-hitter, lifted mid-inning. I would classify the last bucket by five further variables: pitcher retired last man faced or not, pitcher left inherited runners or not, pitcher already charged with ER when lifted or not, reliever had a platoon advantage (or would have, had the other team not pinch-hit) that the starter lacked, or not; and whether the team was ahead, tied, or behind. (Tied probably acts like behind, but I wouldn't make that assumption to begin with.)

Now, I've just divided the last group of pitchers into 32 or 48 buckets, but we don't worry about that yet. First, let's look at the historical trends for all this data, without worrying about the results. Get a sense of how managerial strategies have shifted.
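The "32 or 48" count follows from four yes/no variables crossed with either two or three score states. A short sketch makes the enumeration explicit; the flag names are labels invented for this example.

```python
from itertools import product

# Hypothetical labels for the four yes/no variables listed above.
BINARY_FLAGS = [
    "retired_last_batter",
    "left_inherited_runners",
    "charged_with_er",
    "reliever_platoon_advantage",
]

def mid_inning_buckets(score_states):
    """Every combination of the four flags with each score state."""
    flag_values = [(True, False)] * len(BINARY_FLAGS)
    return [
        dict(zip(BINARY_FLAGS, flags), score_state=state)
        for flags in product(*flag_values)
        for state in score_states
    ]

three_way = mid_inning_buckets(["ahead", "tied", "behind"])
two_way = mid_inning_buckets(["ahead", "tied_or_behind"])
print(len(three_way), len(two_way))
```

Treating "tied" as its own state gives 48 buckets; folding it into "behind" gives 32.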

Then compile the data for relief pitching, in each inning. I think I'd charge the SP with the RE of his inherited runners and credit or debit the relievers with the difference between that and actual. But maybe not.
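One possible reading of that accounting, sketched with a hypothetical helper: the starter is charged the run expectancy (RE) of the situation he leaves behind, and the reliever gets the difference between that figure and what actually scored. The RE value must come from a run-expectancy table, which is not supplied here, and the numbers in the example are placeholders.

```python
def split_inherited_runs(run_expectancy_at_exit, actual_runs_scored):
    """Split responsibility for the runners a starter leaves on base.

    run_expectancy_at_exit: expected runs from the inherited runners at the
        moment of the pitching change (looked up from an RE table).
    actual_runs_scored: inherited runners who actually came around to score.
    Returns (starter_charge, reliever_delta); a negative reliever_delta means
    the reliever did better than expected.
    """
    starter_charge = run_expectancy_at_exit
    reliever_delta = actual_runs_scored - run_expectancy_at_exit
    return starter_charge, reliever_delta

# Placeholder numbers: 1.5 expected runs handed over, 2 actually scored.
print(split_inherited_runs(1.5, 2))
# Same handoff, but the reliever strands everyone.
print(split_inherited_runs(1.5, 0))
```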

I'd use some sort of running average or multi-year buckets to eliminate some of the year-to-year noise.
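A running average of the yearly figures could look something like the sketch below. The centered five-year window is an arbitrary choice for illustration, and the function name is hypothetical.

```python
def running_average(values, window=5):
    """Centered moving average over a list of yearly values.

    Near the start and end of the series the window shrinks to whatever
    neighboring years exist, so the output has the same length as the input.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Toy series, not real data.
print(running_average([1, 2, 3, 4, 5], window=3))
```

Multi-year buckets would be the coarser alternative: fewer points, but no overlap between neighboring estimates.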

I'd spend far too much time looking at the interactions among the five mid-inning variables. But one goal is to identify any buckets that can be usefully defined as combinations of more than one variable (e.g., pitcher retired last batter + reliever had platoon advantage).

I'd have no idea what I'd discover, but there's nothing more fun than going where the data leads.