Over the past few weeks, I've been taking an in-depth look at a single decision made by a manager: Tim Lincecum's recent 148-pitch no-hitter. Bruce Bochy left Lincecum in well past the usual 100-pitch limit to give him a chance at baseball immortality. But at what cost? We've seen that if a pitcher makes his next start on regular rest, there is a small carry-over effect of throwing a lot of pitches, but it's not all that big and it might even just be a methodological quirk anyway. We've seen some evidence that taking a pitcher out of a shutout (not necessarily a no-hitter) doesn't seem to affect him for good or for ill. But what about the obvious question: are marathon pitching sessions penny-wise and pound-foolish?
There are plenty of people who would blame the demise of Johan Santana's career on his throwing 130 pitches in pursuit of a no-hitter in 2012 (despite the fact that he did miss all of 2011, so it's not like he was structurally sound to begin with). There are still plenty of snide remarks about Dusty Baker's handling of Mark Prior and Kerry Wood during the Cubs' stretch run in 2003. Then again, Nolan Ryan has famously said that he does not believe that teams should be so beholden to the pitch count as they have been. Others have mumbled about how this is a problem of pitchers being soft and that they should "man up and shut up."
Does the evidence support the claim that pitchers who have especially long outings are more prone to injury? In the past here at BP, Rany Jazayerli popularized the idea of Pitcher Abuse Points, "awarded" to managers who allowed their pitchers to throw gobs of pitches. In 2000, Keith Woolner pulled pitch count and injury data from 1988-1998 and looked at career levels of pitcher abuse and the likelihood of injury. He found that "abused" pitchers (those who had games in which they threw a significantly large number of pitches) were more likely to be injured over the course of their careers. Let's take another look at the data, 13 years later, and see whether the basic finding, that abusive pitch counts really are hazardous to a pitcher's health, is still supported.
Warning! Gory Mathematical Details Ahead!
I started with all games from 2003-2012, for the reasons that it's a nice 10-year stretch and we have pitch count data from all of those years. Here at Baseball Prospectus, we also have a database of all (reported) injuries since 2002, although for this study, I used only injuries that involved a pitcher being sent to the disabled list, since they were serious enough that a team realized that they needed to shelve a pitcher for at least two weeks to give him time to recover. I merged the two.
One problem with modeling injury risk is that one side effect of being injured is that you spend a good amount of time not pitching afterward. It might be two weeks or two months, but injured pitchers stop providing data (at least for a while). In research methodology terms, this is a clear bias in the data set. We need to find a way around that.
Fortunately, we can turn to the world of public health (and actuarial science) for some methodological help. One problem with modeling factors that can lead to death is that dead people don't provide much more data in the "after" phase of the research. Thankfully, there are techniques that take this very problem into account.
I'll be using a Cox regression analysis to model trips to the disabled list within a season. I structured my data set to look at each pitcher's starts sequentially from his first to his last start of the season, if he managed to avoid the disabled list all year, or from his first start to the last start he had before the first time he went on the DL. I realize that a pitcher might go on the DL, serve his two weeks, and come back and pitch, but previous research has shown that the greatest risk factor for being on the DL is whether a pitcher has previously been there. Once he's been on, his risk for subsequent trips (and subsequent lost productivity) goes up almost by a factor of 10 in the following year.
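The data layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Start` record, the `starts_at_risk` helper, and the sample dates and pitch counts are all hypothetical, not taken from the actual BP databases. Each pitcher-season keeps its starts in order up to the first DL trip and carries a flag for whether the season ended with one (an "event") or ran out without one (censored).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Start:
    """One start by one pitcher (hypothetical record layout)."""
    game_date: date
    pitches: int


def starts_at_risk(starts: List[Start],
                   first_dl_date: Optional[date]) -> Tuple[List[Start], bool]:
    """Return the starts made before the first DL trip, plus an event flag.

    If the pitcher never went on the DL, every start is kept and the
    flag is False (the season is censored rather than ending in injury).
    """
    ordered = sorted(starts, key=lambda s: s.game_date)
    if first_dl_date is None:
        return ordered, False
    kept = [s for s in ordered if s.game_date <= first_dl_date]
    return kept, True


# Made-up season: four starts, with a DL trip after the third.
season = [
    Start(date(2012, 4, 5), 98),
    Start(date(2012, 4, 10), 105),
    Start(date(2012, 4, 16), 91),
    Start(date(2012, 5, 20), 88),
]
kept, went_on_dl = starts_at_risk(season, date(2012, 4, 18))
print(len(kept), went_on_dl)
```

The start made after the pitcher returned is dropped, which matches the choice to model only the first trip to the DL within a season.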
Cox regression assumes that as time goes on, some percentage of the cases within a sample will "drop out." Not everyone who is exposed to a risk factor will end up on the DL, but we will see that over time, those who are exposed to a risk factor will be more represented among those who drop out. And some pitchers who are not abused may simply land on the DL through dumb luck. Things happen.
I coded all starts for each pitcher for whether he crossed certain thresholds. I started at 75 pitches and worked my way forward in increments of five until I hit 130. For example, a pitcher who threw a standard 100 pitches crossed the threshold for 75, 80, 85, 90, 95, and 100 pitches. Since the theory of pitcher abuse suggests that not all pitches are created equal and that pitch 125 is marginally more damaging than pitch 85, for each line in the data set, I entered how many times a pitcher had crossed each threshold up to that point within that season. I controlled for the total number of pitches that he had thrown (yes, this will conflate a bit with the "threshold crossing" variables… we'll get to that) and for whether he had spent time on the disabled list in the previous year or two years prior.
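To make the threshold coding concrete, here is a minimal sketch. The function and column names are hypothetical, and it assumes that "up to that point" includes the current start and that a pitcher "crosses" a threshold when he throws at least that many pitches (consistent with the 100-pitch example above).

```python
# Thresholds from 75 to 130 in steps of five, as described in the text.
THRESHOLDS = list(range(75, 135, 5))


def crossed(pitches):
    """Thresholds reached in a single start (at-or-above counts as crossed)."""
    return [t for t in THRESHOLDS if pitches >= t]


def cumulative_crossings(pitch_counts):
    """Build one row per start with running totals for the season.

    Each row holds the total pitches thrown to date and, for every
    threshold, how many starts so far have reached it. Including the
    current start in the running count is an assumption of this sketch.
    """
    counts = {t: 0 for t in THRESHOLDS}
    total = 0
    rows = []
    for pitches in pitch_counts:
        total += pitches
        for t in crossed(pitches):
            counts[t] += 1
        row = {"total_pitches": total}
        row.update({f"crossed_{t}": counts[t] for t in THRESHOLDS})
        rows.append(row)
    return rows


# Made-up season: a standard start, a short one, then a marathon.
rows = cumulative_crossings([100, 72, 131])
print(rows[-1]["crossed_100"], rows[-1]["crossed_130"], rows[-1]["total_pitches"])
```

After those three starts the pitcher has crossed 100 pitches twice and 130 once, and every one of those counts would stay with him for the rest of the season.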
I considered only starters who pitched their first game in April (or earlier) because those who came up later may have been in the minors, and who knows what happened there. This sampling method probably does have some noise in it, since some guys who make the team out of spring camp are found wanting and are sent back to Triple-A (and thus disappear from our radar), and there are swingmen who start sometimes and relieve sometimes. I'm willing to accept that noise.
The results of a Cox regression are similar in form to those of a binary logistic regression in that we are building a term that is a natural log of the odds ratio that an event will happen. Here, we are specifically modeling whether the pitcher will make his next start or will succumb to an injury severe enough to put him on the disabled list. This makes eyeball interpretations a little harder, but let's start with a pitcher in his first start who has no previous disabled list history.
First, here are the coefficients from the Cox regression:
Factor | B | Standard Error | Significant?
Crossed 75 pitches | .260 | .095 | Yes
Crossed 80 pitches | .238 | .089 | Yes
Crossed 85 pitches | .140 | .071 | Yes
Crossed 90 pitches | .238 [sic] | .060 | Yes
Crossed 95 pitches | .167 | .049 | Yes
Crossed 100 pitches | .166 | .049 | Yes
Crossed 105 pitches | .165 | .055 | Yes
Crossed 110 pitches | .171 | .065 | Yes
Crossed 115 pitches | .253 | .097 | Yes
Crossed 120 pitches | .004 | .167 | No
Crossed 125 pitches | .333 | .311 | No
Crossed 130 pitches | .678 | .478 | Yes
Total number of pitches to date | (.038) | .002 | Yes
DL stint last year (yes/no) | .560 | .105 | Yes
DL stint 2 years ago (yes/no) | .573 | .107 | Yes
Once a pitcher hits 75 pitches, we build the term for the chance that he will "die" (i.e., end up on the disabled list afterward) by taking the coefficient for crossing 75 pitches (.260) and adding the contribution of the number of pitches, (-.038) * 75. We get -2.59. This is equal to 6.59 percent. That may seem like a lot, and it is. There's an unseen adjustment for the number of starts that a pitcher has made. In theory, his arm is not as tired (and injury-prone?) in his Opening Day start as it is in his 14th start. It isn't that he has a six and a half percent chance of landing on the DL after throwing just 75 pitches on Opening Day. (It's actually much lower.) It's more instructive to look at how each step along the way adds to the overall injury risk.
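The arithmetic for the 75-pitch case can be checked with a few lines of code. This is a sketch using the rounded coefficients from the table above; the function names are hypothetical. The text does not spell out how -2.59 is converted into a percentage, so the inverse-logit step below is an assumption. It gives roughly 7.0 percent rather than the published 6.59 percent, and the gap presumably reflects rounding in the printed coefficients and the unseen adjustment for number of starts.

```python
import math

# Published (rounded) coefficients from the regression table.
B_CROSSED_75 = 0.260
B_PER_PITCH = -0.038  # printed as (.038) in the table


def linear_predictor(threshold_coefs, total_pitches):
    """Sum the coefficients for each threshold crossed, plus the per-pitch term."""
    return sum(threshold_coefs) + B_PER_PITCH * total_pitches


def inverse_logit(x):
    """Assumed conversion from a log-odds-style term to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# First start of the season, exactly 75 pitches, no DL history.
lp = linear_predictor([B_CROSSED_75], 75)
print(f"linear predictor: {lp:.2f}")
print(f"inverse logit:    {inverse_logit(lp):.4f}")
```

The linear term reproduces the -2.59 quoted in the text. The percentage is best read as illustrative, since the full model includes terms that are not shown here.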
Let's look at the predicted contribution toward injury risk that each step on the ladder makes and (more importantly) the change in that contribution over time.
Threshold crossed | Predicted contribution to injury risk | Delta from above
75 pitches | 6.59% | n/a
80 pitches | 6.91% | 0.32%
85 pitches | 6.59% [sic] | (0.32%)
90 pitches | 6.89% | 0.30%
95 pitches | 6.75% | (0.14%)
100 pitches | 6.59% [sic] | (0.16%)
105 pitches | 6.43% | (0.14%)
110 pitches | 6.32% | (0.11%)
115 pitches | 6.71% | 0.39%
120 pitches | 5.62% | (1.09%)
125 pitches | 5.85% | 0.23%
130 pitches | 10.19% | 4.34%
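The "Delta from above" column is just the step-to-step difference in the predicted contribution, which can be recomputed from the printed percentages. The sketch below does that, with variable names of my own choosing and the values (including those marked [sic]) taken as printed. Working from the rounded figures, the step at 105 pitches comes out as a drop of 0.16 rather than the 0.14 listed, presumably because the published deltas were computed from unrounded values. Every other step matches.

```python
# (threshold, predicted contribution in percent), as printed in the table.
contributions = [
    (75, 6.59), (80, 6.91), (85, 6.59), (90, 6.89),
    (95, 6.75), (100, 6.59), (105, 6.43), (110, 6.32),
    (115, 6.71), (120, 5.62), (125, 5.85), (130, 10.19),
]


def deltas(rows):
    """Change in contribution at each threshold relative to the one before it."""
    out = []
    for (_, prev), (threshold, cur) in zip(rows, rows[1:]):
        out.append((threshold, round(cur - prev, 2)))
    return out


for threshold, delta in deltas(contributions):
    print(f"{threshold:>3} pitches: {delta:+.2f} percentage points")
```

Negative values here correspond to the parenthesized entries in the table.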
This is a rather interesting table. Injury likelihood actually stays relatively flat up to 110 pitches. At 115 pitches, there's an upswing, then a downswing at 120. The finding that, of course, stands out is that pushing a pitcher to 130 pitches has an enormous effect, bumping up the chances that the pitcher will suffer an injury that will put him on the disabled list by more than four percentage points, both after this start and in all of his subsequent starts for the year.
It's worth pointing out that if you look at the standard errors for those regression coefficients, they get bigger (that is, more unreliable) as the pitch counts get bigger. This is, in part, because there aren't a lot of games in which a pitcher clocks 130 (or 120) over the last 10 years. Small sample sizes make for unreliable regression coefficients. However, it seems that 130 pitches in one outing is the inflection point where small increases in injury risk (which will always be there) turn into downright hazardous working conditions for a pitcher.
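One rough way to read those standard errors is the ratio of each coefficient to its standard error (a Wald z-score), with a two-sided p-value from the normal approximation. The sketch below applies that to the four highest thresholds using the published B and SE values. This is a back-of-the-envelope check under an assumed normal approximation, not necessarily the test behind the "Significant?" column; the helper names are hypothetical.

```python
import math

# (B, standard error) for the highest thresholds, from the regression table.
high_thresholds = {
    115: (0.253, 0.097),
    120: (0.004, 0.167),
    125: (0.333, 0.311),
    130: (0.678, 0.478),
}


def wald_z(b, se):
    """Coefficient divided by its standard error."""
    return b / se


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


for threshold, (b, se) in high_thresholds.items():
    z = wald_z(b, se)
    print(f"{threshold} pitches: z = {z:.2f}, p = {two_sided_p(z):.3f}")
```

By this check the 130-pitch coefficient has a z of about 1.42, short of the usual 1.96 cutoff, so the "Yes" in the table may rest on a different test or threshold than the one assumed here. Either way, it underlines how wide the uncertainty is at the top of the ladder.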
Additionally, it's a biased sample that gets to go well above 110. Somewhat by definition, it's guys whom managers and pitching coaches believe are able to handle such a workload on one night and not suffer too great an effect. Assuming that they have some idea of what they are doing (i.e., they are smart enough to introduce bias into the sample—and I do worry about how much bias that is and don't know how to fully control for it), we need to be careful about saying that everything would be just peachy for all pitchers who get to 115. In that group, we're getting data only from those who are certified workhorses. Despite that, the effect at 130 shines through, and the effect size is larger than having a previous DL stint. It looks like even workhorses have a point where they break down and are sent to the glue factory.
I ran some supplemental analyses in which I looked to see whether these effects varied by age (on April 1st) by including all of the variables in the original regression, as well as an interaction term for each of those variables with age. None of the interaction terms reached significance. I did the same for body-mass index (perhaps bigger pitchers can handle the workload?). Nothing there. I also used body-mass index squared, to allow for a quadratic function. Perhaps there was a body-mass index that was a happy medium for handling workloads? There was not.
The Society for the Prevention of Cruelty to Pitchers
Let's sum up some take-home messages:
1) There will always be injury risk to starting pitchers. It happens.
2) Going a little bit beyond the usual 100-pitch limit does not appear to appreciably increase the risk of an injury later in the year. Whether or not a pitcher is effective after 100 pitches or whether the team would be better off with a fresh reliever is another issue. Frankly, if your bullpen is so bad that you don't have someone out there who is better than a starter with 110 pitches logged, you have a bigger problem.
3) Letting a pitch count climb up around 120 is entering a danger zone, and the risk is probably understated somewhat by these findings due to the sample bias inherent in who gets to throw 120 pitches in a night. Letting any pitcher exceed 130 pitches has the same effect as going back in time and giving him a significant injury (one big enough to land him on the DL), and a previous DL trip is a very powerful predictor of a future DL visit.
4) Once a pitcher has thrown an extended outing, he will carry the "scar" of that event until the end of the season. He might not go on the disabled list the next day, but he's now a bigger injury risk every time he takes the mound.
5) We are dealing with probability here, not certainty. There will be pitchers who have a marathon outing and never get hurt. There will be those who are handled with kid gloves and get hurt after three starts. Sometimes, you do everything right and it doesn't work. Maybe the obsession with 100 pitches as a hard limit for a start isn't the right cutoff point for preventing injuries, but there is clearly a point where high pitch counts do become abusive to pitchers.