Baseball Therapy: Prioritizing the Pitcher's Health
August 6, 2013
Over the past few weeks, I've been taking an in-depth look at a single decision made by a manager: Tim Lincecum's 148-pitch no-hitter from a few weeks ago. Bruce Bochy left Lincecum in well past the usual 100-pitch limit to give him a chance at baseball immortality. But at what cost? We've seen that if a pitcher makes his next start on regular rest, there is a small carryover effect of throwing a lot of pitches, but it's not all that big and it might even just be a methodological quirk anyway. We've seen some evidence that taking a pitcher out of a shutout (not necessarily a no-hitter) doesn't seem to affect him for good or for ill. But what about the obvious question: are marathon pitching sessions penny-wise and pound-foolish?

There are plenty of people who would blame the demise of Johan Santana's career on his throwing 130 pitches in pursuit of a no-hitter in 2012 (despite the fact that he missed all of 2011, so it's not as though he was structurally sound to begin with). There are still plenty of snide remarks about Dusty Baker's handling of Mark Prior and Kerry Wood during the Cubs' stretch run in 2003. Then again, Nolan Ryan has famously said that he does not believe teams should be as beholden to the pitch count as they have been. Others have mumbled about how this is a problem of pitchers being soft and that they should "man up and shut up." Does the evidence support the claim that pitchers who have especially long outings are more prone to injury?

In the past here at BP, Rany Jazayerli popularized the idea of Pitcher Abuse Points, "awarded" to managers who allowed their pitchers to throw gobs of pitches. In 2000, Keith Woolner pulled pitch count and injury data from 1988-1998 and looked at career levels of pitcher abuse and the likelihood of injury. He found that "abused" pitchers (those who had games in which they threw a significantly large number of pitches) were more likely to be injured over the course of their careers.
Let's take another look at the data, 13 years later, and see whether the basic finding, that abusive pitch counts really are hazardous to a pitcher's health, is still supported.

Warning! Gory Mathematical Details Ahead!

One problem with modeling injury risk is that one side effect of being injured is that you spend a good amount of time not pitching afterward. It might be two weeks or two months, but injured pitchers stop providing data (at least for a while). In research methodology terms, this is a clear bias in the data set. We need to find a way around it. Fortunately, we can turn to the world of public health (and actuarial science) for some methodological help. One problem with modeling factors that can lead to death is that dead people don't provide much more data in the "after" phase of the research. Thankfully, there are techniques that take this very problem into account.

I'll be using a Cox regression analysis to model trips to the disabled list within a season. I structured my data set to look at each pitcher's starts sequentially, from his first to his last start of the season if he managed to avoid the disabled list all year, or from his first start to the last start he had before the first time he went on the DL. I realize that a pitcher might go on the DL, serve his two weeks, and come back and pitch, but previous research has shown that the greatest risk factor for being on the DL is whether a pitcher has previously been there. Once he's been on, his risk of subsequent trips (and subsequent lost productivity) goes up almost by a factor of 10 in the following year.

Cox regression assumes that as time goes on, some percentage of the cases within a sample will "drop out." Not everyone who is exposed to a risk factor will end up on the DL, but we will see that over time, those who are exposed to a risk factor are more heavily represented among those who drop out. And some pitchers who are not abused may simply land on the DL through dumb luck.
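Before going further, here is a minimal sketch of the kind of per-start data set this approach implies. This is my own illustration, not the author's actual code, and all of the names (`season_rows`, `dl_after_start`, the `crossed_*` columns) are mine: each pitcher-season yields one row per start, the final row carries an event flag of 1 if a DL trip followed it, and a row is censored (event 0) if the pitcher simply ran out of season.

```python
# Hypothetical sketch of the per-start survival data set; names are mine.
THRESHOLDS = range(75, 131, 5)  # 75, 80, ..., 130 pitches

def season_rows(pitch_counts, dl_after_start=None):
    """pitch_counts: pitches thrown in each start, in order.
    dl_after_start: 1-based start number after which the pitcher hit
    the DL, or None if he avoided it all year (censored)."""
    crossings = {t: 0 for t in THRESHOLDS}  # running season totals
    rows = []
    last = dl_after_start or len(pitch_counts)
    for i, pc in enumerate(pitch_counts[:last], start=1):
        for t in THRESHOLDS:
            if pc >= t:
                crossings[t] += 1
        rows.append({
            "start_no": i,
            "total_pitches": sum(pitch_counts[:i]),
            "event": 1 if i == dl_after_start else 0,
            **{f"crossed_{t}": n for t, n in crossings.items()},
        })
    return rows
```

A table like this (with a duration column and an event column) is exactly the shape that survival-analysis tooling expects; the rows after a first DL trip are dropped, mirroring the article's decision to stop each pitcher's clock at his first trip.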
Things happen.

I coded all starts for each pitcher for whether he crossed certain thresholds. I started at 75 pitches and worked my way up in increments of five until I hit 130. For example, a pitcher who threw a standard 100 pitches crossed the thresholds for 75, 80, 85, 90, 95, and 100 pitches. Since the theory of pitcher abuse suggests that not all pitches are created equal, and that pitch 125 is marginally more damaging than pitch 85, for each line in the data set I entered how many times the pitcher had crossed each threshold up to that point within the season. I controlled for the total number of pitches he had thrown (yes, this will conflate a bit with the "threshold crossing" variables... we'll get to that) and for whether he had spent time on the disabled list in the previous year or two years prior.

I considered only starters who pitched their first game in April (or earlier), because those who came up later may have been in the minors, and who knows what happened there. This sampling method probably does have some noise in it, since some guys who make the team out of spring camp are found wanting and are sent back to Triple-A (and thus disappear from our radar), and there are swingmen who start sometimes and relieve sometimes. I'm willing to accept that noise.

The results of a Cox regression are similar in form to those of a binary logistic regression, in that we are building a term that is the natural log of the odds ratio that an event will happen. Here, we are specifically modeling whether the pitcher will make his next start or will succumb to an injury severe enough to put him on the disabled list. This makes eyeball interpretations a little harder, but let's start with a pitcher in his first start who has no previous disabled list history. First, here are the coefficients from the Cox regression:
Once a pitcher hits 75 pitches, we build the function for whether he will "die" (i.e., end up on the disabled list afterward) by taking the coefficient for crossing 75 pitches (.260) and adding the contribution of the number of pitches (.038) * 75. We get 2.59. This is equal to 6.59 percent. That may seem like a lot, and it is. There's an unseen adjustment for the number of starts that a pitcher has made. In theory, his arm is not as tired (and injury-prone?) in his Opening Day start as it is in his 14th start. It isn't that he has a six-and-a-half percent chance of landing on the DL after throwing just 75 pitches on Opening Day. (It's actually much lower.) It's more instructive to look at how each step along the way adds to the overall injury risk. Let's look at the predicted contribution toward injury risk that each step on the ladder makes and (more importantly) the change in that contribution over time.
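As an aside on the arithmetic above: models of this family treat the sum of coefficient contributions as a log-odds value, which converts to a probability through the logistic function. This is a generic illustration of that conversion (the article's full coefficient table is not reproduced here, so no specific figure is recomputed):

```python
import math

def log_odds_to_probability(x):
    """Convert a log-odds value x (a sum of regression coefficient
    contributions) into a probability: p = e^x / (1 + e^x)."""
    return math.exp(x) / (1.0 + math.exp(x))
```

A log-odds value of 0 corresponds to a 50/50 proposition, and increasingly negative values correspond to increasingly rare events, which is the regime injury-per-start probabilities live in.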
This is a rather interesting table. Injury likelihood actually stays relatively flat up to 110 pitches. At 115 pitches, there's an upswing, then a downswing at 120. The finding that, of course, stands out is that pushing a pitcher to 130 pitches has an enormous effect, bumping up the chances that the pitcher will suffer an injury that will put him on the disabled list after this start by more than four percent, and in all of his subsequent starts for the year.

It's worth pointing out that if you look at the standard errors for those regression coefficients, they get bigger (that is, more unreliable) as the pitch counts get bigger. This is, in part, because there haven't been a lot of games in which a pitcher clocks 130 (or 120) over the last 10 years. Small sample sizes make for unreliable regression coefficients. However, it seems that 130 pitches in one outing is the inflection point where small increases in injury risk (which will always be there) turn into downright hazardous working conditions for a pitcher.

Additionally, it's a biased sample that gets to go well above 110. Somewhat by definition, it's guys whom managers and pitching coaches believe are able to handle such a workload on one night and not suffer too great an effect. Assuming that they have some idea of what they are doing (i.e., they are smart enough to introduce bias into the sample, and I do worry about how much bias that is and don't know how to fully control for it), we need to be careful about saying that everything would be just peachy for all pitchers who get to 115. In that group, we're getting data only from those who are certified workhorses. Despite that, the effect at 130 shines through, and the effect size is larger than that of having a previous DL stint. It looks like even workhorses have a point where they break down and are sent to the glue factory.
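The small-sample caveat above can be made concrete with a quick simulation. This is my own toy illustration, not the article's data: with the same true injury risk, the observed rate in a handful of outings (like the rare 130-pitch games) bounces around far more than the rate estimated from thousands of ordinary outings, which is why the high-pitch-count coefficients carry larger standard errors.

```python
import random
import statistics

random.seed(42)
TRUE_RISK = 0.05  # assumed per-start injury probability (illustrative)

def observed_rates(sample_size, trials=1000):
    """Simulate many studies, each observing `sample_size` starts, and
    return the injury rate each study would have measured."""
    rates = []
    for _ in range(trials):
        injuries = sum(1 for _ in range(sample_size)
                       if random.random() < TRUE_RISK)
        rates.append(injuries / sample_size)
    return rates

small = observed_rates(20)    # like the rare 130-pitch outings
large = observed_rates(2000)  # like ordinary ~90-pitch outings
# The spread of estimates is roughly 10x wider in the small sample.
```

Same underlying risk, wildly different estimate stability, which is exactly the pattern the widening standard errors reflect.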
I ran some supplemental analyses in which I looked to see whether these effects varied by age (as of April 1st) by including all of the variables in the original regression, as well as an interaction term for each of those variables with age. None of the interaction terms reached significance. I did the same for body-mass index (perhaps bigger pitchers can handle the workload?). Nothing there. I also used body-mass index squared, to allow for a quadratic function; perhaps there was a body-mass index that was a happy medium for handling workloads? There was not.

The Society for the Prevention of Cruelty to Pitchers

1) There will always be injury risk to starting pitchers. It happens.

2) Going a little bit beyond the usual 100-pitch limit does not appear to appreciably increase the risk of an injury later in the year. Whether a pitcher is effective after 100 pitches, or whether the team would be better off with a fresh reliever, is another issue. Frankly, if your bullpen is so bad that you don't have someone out there who is better than a starter with 110 pitches logged, you have a bigger problem.

3) Letting a pitch count climb to around 120 is entering a danger zone, one that is probably understated somewhat by these findings due to the sample bias inherent in who gets to throw 120 pitches in a night. Letting any pitcher exceed 130 pitches has the same effect as going back in time and giving him a significant injury (one big enough to land him on the DL), and a previous DL trip is a very powerful predictor of a future DL visit.

4) Once a pitcher has thrown an extended outing, he will carry the "scar" of that event until the end of the season. He might not go on the disabled list the next day, but he's now a bigger injury risk every time he takes the mound.

5) We are dealing with probability here, not certainty. There will be pitchers who have a marathon outing and never get hurt. There will be those who are handled with kid gloves and get hurt after three starts.
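The supplemental age-interaction check described above amounts to widening each row of the design matrix with an age-times-covariate column. A minimal sketch, with names of my own invention (this is not the author's code):

```python
def add_age_interactions(row, age):
    """Given one start's covariates as a dict (e.g., threshold-crossing
    counts), append an age * covariate interaction term for each one.
    The regression then tests whether workload effects vary with age."""
    out = dict(row)
    for name, value in row.items():
        out["age_x_" + name] = value * age
    return out
```

In the article's analysis, none of these interaction columns reached significance, i.e., the workload effects did not measurably differ by age (or by body-mass index when the same trick was applied there).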
Sometimes, you do everything right and it doesn't work. Maybe the obsession with 100 pitches as a hard limit for a start isn't the right cutoff point for preventing injuries, but there is clearly a point where high pitch counts do become abusive to pitchers.
Russell A. Carleton is an author of Baseball Prospectus. Follow @pizzacutter4
13 comments have been left for this article.

What about a further look into the "pitching load" of a game... or even just a straight look at the speed of pitches in a game? From an intuitive standpoint, not all 100+ pitch games are the same from an effort point of view. Since we should have PITCHf/x data for a good time frame, it should be straightforward to come up with an "effort" metric and then see whether there is a particular profile of 100+ pitch outing that brings more danger than another...
A thought that has crossed my mind as well. One problematic issue is that PITCHf/x doesn't capture mechanics. But the motion for curvy stuff is probably harder on the arm than a fastball's (which is why curves are often banned in Little League). Maybe there's something to it.
When I did the Tommy John studies last year, I found a pretty clear indication of harder throwers being more susceptible (after controlling for age and pitching role). This was present across all pitch types.
In looking at pitch type frequencies, I found pitchers who went for TJS threw slightly more fastballs and sliders than the average pitcher, not curveballs. There seems to be academic literature that points in this same direction.
Really interesting article, Russell!
The other "pitching load" factor I'd consider would be pitches per inning.
Back in the day, 30-pitch innings would just wipe me out. A 30-pitch inning was much more difficult on me than two 15-pitch innings. I'd likewise expect that a 110-pitch complete game is less dangerous than a 110-pitch six-inning stint.