
The following article, written by Keith Woolner with Rany Jazayerli, appeared
in Baseball Prospectus 2001.


Long-term injury risk

In the previous article, we derived a new PAP formula
(dubbed PAP^3) that reflects the typical short-term decline in pitcher
performance following a high pitch count outing. In this article, we will
investigate whether PAP^3 has any value in predicting which pitchers are subject
to injury, and if not, whether any PAP-style metric can be derived that does
have predictive value.

Pitch counts and injuries

Before claiming any success for any measure in predicting injury, we must
fundamentally recognize that any PAP-style metric will be positively correlated
with raw pitch counts. Pitchers with high pitch count totals will tend to have
high PAP totals. If a PAP function provides no additional insight into which
pitchers will be injured than pitch count totals alone, there is no reason to
add the complexity of a PAP system to our sabermetric arsenal. Only if a
PAP function provides injury information above and beyond what can be learned
from aggregate pitch counts should we consider it successful.

Data to be studied

As with the previous study, I looked at starts for all pitchers from 1988
through 1998 for which there was pitch count data in the Baseball
Workshop/Total Sports database. The approach I used was to identify starting
pitchers who suffered
major injuries during that span, and compare them to comparable pitchers who did
not suffer a major injury. Pitcher injury data was taken from Neft & Cohen’s The
Sports Encyclopedia: Baseball 2000.

Identifying Injured Pitchers

In the annual season summary section of TSE:BB 2000, team rosters are presented,
and a notation is made if a player was injured for more than 30 days. For the
purposes of this study, I selected pitchers who were starting pitchers in the
year they were injured, and whose recent history indicated a pattern of starting
pitching. Generally speaking, if a pitcher was a full time or near-full time
reliever in either of the two seasons prior to the injury, he was excluded from
consideration. Pitch counts from relief appearances were not included for any
pitcher, since relief outings are generally low in total pitch counts, and the
hypothesis under consideration is that it is high pitch counts that overextend
pitchers, and lead to injury risk.

Furthermore, only certain types of injuries were considered. A two-letter code
indicated the type of injury (if known). Since pitcher overwork would most often
be associated with arm injuries, the only injury categories included were
shoulder injury, elbow injury, arm injury, and sore arm. Any injured pitcher
with one of these codes was presumed to have injured his pitching arm (the
reference does not specify which arm). Note that this categorization considers
only the most serious arm injuries, namely those which held a pitcher out of
action for a month or more. Less serious injuries, including missed turns in the
rotation, and DL stays of less than 30 days, are ignored (and in fact, these
pitchers are considered "healthy" as they did not miss 30 or more days due to
injury during the season).

Since I wanted to consider pitchers for whom we had pitch count data for most of
their career, any pitcher under consideration who accumulated more than 100
innings in the majors prior to 1988 was excluded.

Note that minor league pitch counts are not widely available at present, and
while a more thorough treatment of the impact of career usage and pitch counts
on pitcher injury susceptibility would certainly include them, I restricted the
investigation to major league pitch counts only.

Finally, several pitchers appeared on the injured list multiple times during
their career. Physiologically, prior injury makes one prone to future injury. To
account for this, only the first season a given pitcher suffered a major injury
is included in our data.

Using these criteria, a total of 73 injured pitchers were identified.

Defining Comparable Pitchers

In order to identify a set of similarly worked pitchers who had not been
injured, I found matches for each injured pitcher’s age and career pitch count
total. By doing so, I would have several pitchers with similar age and usage
profiles, but who had not been injured. More specifically, for each injured
pitcher, I found all pitchers whose careers through the same age had amassed
within 10% of the injured pitcher’s career pitch count total. That is, if a
25-year-old Jason Bere had about 7800 career pitches in 1995, I matched him with
any other 25-year-old pitcher who had between 7020 and 8580 career pitches.

Of course, a further restriction was that any matching pitcher was not one of
the 73 injured pitchers, even if they were injured at a different age than the
one they were being compared for. If a single pitcher-season matched more than
one injured pitcher, the duplicate entries were removed, so that no
pitcher-season was counted more than once. A total of 569 healthy comparable
seasons were identified, for an average of 7.8 healthy comparables per injured
pitcher.
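The matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PitcherSeason` record, its field names, and the `find_comparables` helper are hypothetical and are not taken from the original study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitcherSeason:
    # Hypothetical record layout, assumed for illustration only.
    name: str            # pitcher identifier
    age: int             # age in that season
    career_pitches: int  # career pitches thrown in starts through that season


def find_comparables(injured, candidates, tolerance=0.10):
    """Return healthy pitcher-seasons that match any injured pitcher's
    age exactly and career pitch count within the given tolerance."""
    injured_names = {p.name for p in injured}
    matches = set()  # a set drops duplicate pitcher-seasons automatically
    for target in injured:
        low = target.career_pitches * (1 - tolerance)
        high = target.career_pitches * (1 + tolerance)
        for cand in candidates:
            if cand.name in injured_names:
                continue  # an injured pitcher is never a healthy comparable
            if cand.age == target.age and low <= cand.career_pitches <= high:
                matches.add(cand)
    return matches
```

With the Jason Bere example from the text (age 25, about 7800 pitches), any other 25-year-old with roughly 7020 to 8580 career pitches would be returned, provided he is not himself one of the injured pitchers.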

Note that the term "comparable pitcher" refers only to the aggregate
number of pitches thrown in a pitcher’s starts, not necessarily to the results.
Two 27-year-old pitchers with 5000 career pitches would be considered comparable
in terms of workload, even if one had a 3.00 ERA, and the other a 5.50 ERA. They
are comparable in the total amount of work performed (pitches thrown), not in
the value of the results.

Career PAP as a Predictor of Injury

Our initial hypothesis is that PAP^3 has predictive power beyond raw career
pitch count totals in assessing the likelihood of injury for major league
pitchers. To test this hypothesis, I plotted career PAP^3 vs. career pitch
counts for all the pitcher-seasons in the sample, which is shown in the chart
below:





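[Chart: career PAP^3 versus career pitch counts for every pitcher-season in the sample, with injured pitchers as large dots and the best-fit regression line drawn solid]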

Over the course of any pitcher’s career, he will invariably pick up PAP in some
fraction of his outings. By looking at the usage patterns of many pitchers over
the years, you can ascertain the "typical" amount of PAP a major
league pitcher would accumulate given his pitch counts. Linear regression is
one technique for mathematically determining what this typical PAP level is. The
best fitting linear regression equation is plotted in the chart above as the
solid line.
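As a rough sketch of the trend-line step, the following fits an ordinary least-squares line of career PAP against career pitch count and checks whether a given pitcher sits above it. The function names are hypothetical, and the article does not say which software or regression variant was actually used, so treat this as one plausible reading rather than the study's method.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def above_trend(career_pitches, career_pap, slope, intercept):
    """True when a pitcher has more PAP than the fitted line predicts
    for his career pitch count."""
    return career_pap > slope * career_pitches + intercept
```

A pitcher for whom `above_trend` is true has accumulated more PAP than is typical for his workload, which is the group the comparison that follows focuses on.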

If pitchers with greater than usual PAP are more likely to be injured, we would
expect more of the large dots indicating injured pitchers to lie above the trend
line in the chart above. It’s difficult to tell from visual inspection whether
this is the case or not. We can, however, analyze the data itself to see if
this is true. Looking at the percentage of each group of pitchers that lies above
the trend line, we discover that:

  • 31% of all injured pitchers had above average career PAP totals for their career pitch counts.
  • 9% of all healthy pitchers had above average career PAP totals for their career pitch counts.

This suggests that high PAP pitchers are more than three times as likely to be
injured as low PAP pitchers who’ve thrown similar numbers of pitches. We have
our first piece of evidence that PAP provides predictive information beyond what
pitch counts alone can tell us.
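The two percentages describe the share of each group that lies above the trend line, while the conclusion is about injury rates within the high and low PAP groups. A short worked computation connects the two. The counts below are back-calculated from the rounded percentages and the sample sizes given earlier (73 injured, 569 healthy), so they are approximations, not the study's exact tallies.

```python
INJURED_TOTAL = 73    # injured pitchers in the sample
HEALTHY_TOTAL = 569   # healthy comparable pitcher-seasons

# Approximate counts above the trend line, reconstructed from rounded percentages.
injured_above = round(0.31 * INJURED_TOTAL)    # about 23
healthy_above = round(0.09 * HEALTHY_TOTAL)    # about 51
injured_below = INJURED_TOTAL - injured_above  # about 50
healthy_below = HEALTHY_TOTAL - healthy_above  # about 518

# Injury rate within each PAP group.
rate_above = injured_above / (injured_above + healthy_above)
rate_below = injured_below / (injured_below + healthy_below)
relative_risk = rate_above / rate_below

print(f"above trend: {rate_above:.1%}, below trend: {rate_below:.1%}, "
      f"ratio: {relative_risk:.1f}")
```

Under these reconstructed counts, roughly 31% of above-trend pitchers were injured against roughly 9% of below-trend pitchers, a ratio of about 3.5, which is consistent with the "more than three times" statement.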

As a side note, the careful reader will note that there are four data points
that exceed a career-to-date PAP total of 2,000,000. These four pitcher-seasons
are all from the same pitcher, and far exceed the workload amassed by any other
pitcher. This workhorse is, of course, Randy Johnson, whose career workload
looks like a mistake in the chart. Whatever the results of our analysis of PAP
and injuries, Johnson is almost certainly an extreme outlier, a remarkable
physical specimen for whom comparison to regular major league pitchers may not
apply.

The Workload Stress Metric

Though we now have some indication that high PAP totals are a predictor of
injury risk, the results are somewhat buried in the statistics. The key element
of the findings above is that more PAP for any given number of pitches leads to
higher risk. This leads to the concept of using PAP/NP as a measure of how
intense or stressful a pitcher’s pitches have been. I’ll refer to PAP/NP as
"Workload Stress" or simply "Stress".
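In code, Workload Stress is a single division. The helper name below is hypothetical; the check uses the 2000 season totals for Livan Hernandez that appear in the leaders table later in this article.

```python
def workload_stress(pap_total, pitches):
    """Workload Stress: total PAP divided by total pitches thrown (PAP/NP)."""
    if pitches <= 0:
        raise ValueError("pitches must be positive")
    return pap_total / pitches


# Livan Hernandez, 2000: PAP 422979 over 3825 pitches.
print(round(workload_stress(422979, 3825), 1))  # 110.6
```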

I determined career-to-date Stress factors for each pitcher in our sample, with
the intention of plotting Stress versus rate of injury. However, since each
pitcher in the sample has an injury value of either 0 (healthy) or 1 (injured),
a straightforward plot of points would not be particularly revealing.

What I did instead was sort the list of all pitchers by Stress factors, and
created a moving average or "sliding window" of 50 data points at a
time. That is, I took pitchers 1-50 as one data point, pitchers 2-51 for the 2nd
data point, 3-52 as the third, and so on, such that with every step I was adding
one pitcher with a high Stress factor, and dropping the one with the lowest
Stress. I averaged the Stress factors for every pitcher in the window, and
computed the percentage of pitchers in the window who were injured. This creates
a sample within the sample, for which we can estimate the injury rate for
pitchers with Stress factors similar to the sample’s average Stress. The results
are below:
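The sliding-window procedure can be sketched as follows. The input format, a list of `(stress, injured)` pairs with `injured` equal to 0 or 1, and the function name are assumptions made for illustration.

```python
def sliding_window_injury_rates(pitchers, window=50):
    """Sort pitchers by Stress, then slide a fixed-size window across them.

    pitchers: iterable of (stress, injured) pairs, injured being 0 or 1.
    Returns one (mean stress, injury rate) point per window position.
    """
    ordered = sorted(pitchers, key=lambda p: p[0])
    points = []
    for start in range(len(ordered) - window + 1):
        chunk = ordered[start:start + window]
        mean_stress = sum(s for s, _ in chunk) / window
        injury_rate = sum(i for _, i in chunk) / window
        points.append((mean_stress, injury_rate))
    return points
```

Each step drops the lowest-Stress pitcher from the window and adds the next one up, so consecutive points share 49 of their 50 pitchers. That overlap is what smooths the curve, and it also means neighboring points are not independent observations.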





[Chart: percentage of pitchers injured versus average Workload Stress, computed over a sliding window of 50 pitchers]

Here we see a more compelling representation of the relationship between PAP and
injuries. There’s a clear trend between Stress and the percentage of pitchers
who get injured. There’s a relatively constant increase between 0 and 50, with a
leveling off thereafter. Over a quarter of pitchers with career Stress factors
above 40 have suffered a major injury at some point during the time of the
study, compared with less than 15% of those with career Stress factors below 20.

Interestingly, there are indications of a decline as you approach and exceed a
Stress factor of 100 (the chart is truncated at Stress=100 due to lack of sample
size above this level). However, the injury rate is still well above that of any
Stress factor less than 40. Given the small number of pitchers in the upper ends
of the chart, it could be a sample size effect. If we assume, for the sake of
argument, that this decline is not simply random fluctuation, I would speculate
that this represents a survival effect of sorts. The pitchers who can sustain
that high a workload stress are those whose managers have pushed them harder and
harder until they get a reputation as a workhorse who can consistently shoulder
130 pitch count outings. It takes a while for the pitcher to develop to the
point where he can be effective in the late innings (and hence won’t be pulled
for a reliever). Also, a manager may be cautious with a new arm until he’s
comfortable enough with a pitcher to "know" how far he can go. Thus,
the pitchers who end up with the highest levels of stress are the quality arms
who’ve survived the weeding out process.

The Injury Likelihood Equation

The shape of the line on the chart, with a steeper slope at the beginning and
leveling off as you go higher, suggests a logarithmic curve. An example of such
a curve is shown below:





[Chart: logarithmic trend line for injury rate versus Workload Stress]

The formula for the trend line shown above is (LN() is the natural log function):

Prob(Injury) = 0.06 * LN(Stress)

Or, equivalently:

Prob(Injury) = 0.06 * LN(PAP/NP)

(Technical note: This equation holds for Stress factors greater than or equal to
1. The curve is equal to zero for Stress factors below 1).

What this chart suggests is that a pitcher’s career stress factor can help
predict the likelihood of that pitcher suffering a major arm injury at some
point during his career. For example, a pitcher who’s consistently around a
Workload Stress of 30 has a 20% chance of missing a month or more due to arm
injury at some point in his career.
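The equation and its technical note translate directly into a small function. The function name is hypothetical; the 0.06 coefficient and the zero floor below a Stress of 1 come from the text above.

```python
import math


def injury_probability(stress):
    """Estimated chance of a major arm injury: 0.06 * ln(Stress).

    Per the technical note, the curve is zero for Stress below 1.
    """
    if stress < 1:
        return 0.0
    return 0.06 * math.log(stress)


print(f"{injury_probability(30):.1%}")  # 20.4%
```

A Stress of 30 gives 0.06 * ln(30), or about 0.204, which matches the 20% figure quoted in the example.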

Statistical Significance of Results

Having derived these apparently impressive results, it’s only prudent to ask
whether they are statistically significant or not. One commonly used statistical
test is called a Chi-squared test. Though the details of the test will be
omitted here, for our purposes, the Chi-squared test determines the likelihood
that the results we’ve seen could result from a random split of a uniform
population, given the sample sizes. In other words, Chi-squared will check the
possibility that the high and low PAP pitchers are actually equally likely to be
injured, and the observed differences are due to chance (this is what’s called
the "null hypothesis" — that PAP has no predictive value). If the
resulting probability from the Chi-squared test is too high (traditionally,
above about 5%), then we can’t reject the possibility that the null hypothesis is
true, meaning that the differences could be explained by chance rather than due
to any predictive power of PAP. Conversely, a very low probability result from
the Chi-squared test increases our confidence that the results are not due to
chance, and that separating pitchers based on PAP does provide information about
their relative injury risks.

Turning first to the career PAP totals, we noted that pitchers with above
average PAP totals given their career pitch counts were far more likely to have
been injured than pitchers with below average PAP totals. Computing a
Chi-squared probability for this sample indicates that the split has only a
0.000018% chance of having occurred at random. This easily passes the criterion
for statistical significance.
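For readers who want to see the mechanics, here is a minimal sketch of a Pearson chi-squared test on a 2x2 table of injured and healthy pitchers above and below the trend line. Several things are assumptions: the cell counts are back-calculated from the rounded 31% and 9% figures and the sample sizes (73 injured, 569 healthy), and the article does not say which variant of the test was run. The p-value this produces is therefore of a similar order of smallness but should not be expected to reproduce the published figure exactly.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With one degree of freedom, the chi-squared tail probability
    # equals the complementary error function of sqrt(chi2 / 2).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Approximate counts (assumed): rows are above/below trend, columns injured/healthy.
chi2, p_value = chi_squared_2x2(23, 51, 50, 518)
print(f"chi-squared = {chi2:.1f}, p = {p_value:.2e}")
```

Whatever the exact counts, a statistic this large on one degree of freedom puts the probability far below the 5% threshold.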

Looking then at the Workload Stress factor (PAP/NP), I took a more
granular approach, dividing the sample space into quintiles by PAP/NP, and
computed the injury rates in each of the five groups. I then computed the
Chi-squared probability of this split occurring by chance. The results were
comparable to our previous findings: a relationship like the one observed has
a minuscule 0.0000028% chance of arising at random. Again, the Workload Stress
factor clears the bar for statistical significance.

Other PAP formulae

As with the short-term PAP results, I examined other possible PAP formulae to
see if the relationship to injury risk was noticeably stronger. Though I do not
present the charts here, I tested classic PAP, other polynomial versions of PAP
(e.g. PAP = (NP-100)^2), and varying the baselines (100 pitches, 90 pitches, 110
pitches, etc). There was no dominant winner among the various formulae. In
general, they resulted in predictions similar to those of PAP^3. Perhaps that
isn’t surprising, given that unlike single starts, usage patterns tend to even out
more over the course of a career. Furthermore, even with the results we have,
predicting injury is an inexact science, and Workload Stress factors are no
guarantee for either health or injury. Therefore, any reasonable metric that
gives extra weight to high pitch count outings should yield a risk factor that
is in the same ballpark as PAP^3 (pardon the pun). Given that we have a
preferred metric for short-term impact that does acceptably for long-term injury
risk as well, we will stick to simplicity, and use a single metric for both
purposes. The PAP^3 formula will be the basis for our Pitcher Abuse Point work
going forward.
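The family of polynomial formulae compared here can be expressed with two parameters, a baseline and an exponent. This sketch assumes that PAP^3 for a single start is the cube of the pitches thrown beyond 100 and zero otherwise, in line with the squared variant quoted above; the definition itself is given in the previous article rather than in this one. Classic PAP is not covered, and the function names are hypothetical.

```python
def pap(pitches, baseline=100, exponent=3):
    """Pitcher Abuse Points for one start under a polynomial formula.

    Defaults correspond to the assumed PAP^3 form: (NP - 100)^3 when
    NP exceeds 100, and zero otherwise.
    """
    over = max(0, pitches - baseline)
    return over ** exponent


def season_stress(pitch_counts, baseline=100, exponent=3):
    """Workload Stress (PAP/NP) over a list of per-start pitch counts."""
    total_pap = sum(pap(n, baseline, exponent) for n in pitch_counts)
    return total_pap / sum(pitch_counts)


print(pap(120))                # 8000
print(pap(120, exponent=2))    # 400
print(pap(120, baseline=110))  # 1000
```

Changing `baseline` or `exponent` rescales the totals, but every variant still gives extra weight to the longest outings, which fits the finding that no formula was a dominant winner.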

2000 Workload Stress leaders

Though it is career Workload Stress that has been shown to relate to injury
risk, we can compute Stress factors for individual pitching seasons (or groups
of seasons) to assess whether a
pitcher is "on pace" for difficulties. The list below shows the
pitchers with the highest and lowest Workload Stress rates for the 2000 season
(minimum 10 games started):


PITCHER              GS      PAP     NP   STRESS
Hernandez,Livan      33   422979   3825    110.6
Johnson,Randy        35   439098   4021    109.2
Schmidt,Jason        11   101865   1203     84.7
Helling,Rick         35   313875   3791     82.8
Villone,Ron          23   150263   2246     66.9
Leiter,Al            31   229252   3478     65.9
Clemens,Roger        32   218043   3433     63.5
Hitchcock,Sterling   11    70714   1127     62.7
Wolf,Randy           32   217292   3528     61.6
Martinez,Pedro       29   190327   3165     60.1
Elarton,Scott        30   188275   3139     60.0
Appier,Kevin         31   194467   3314     58.7
Davis,Doug           13    78320   1338     58.5
Miller,Wade          16    97914   1724     56.8
Suppan,Jeff          33   181089   3488     51.9
Mussina,Mike         34   183194   3657     50.1
...
Dreifort,Darren      32     4498   3114      1.4
Karl,Scott           13     1339   1037      1.3
Yan,Esteban          20     2262   1801      1.3
Garland,Jon          13     1407   1198      1.2
Romero,J.C.          11     1009    961      1.0
Glynn,Ryan           16     1512   1456      1.0
Halladay,Roy         13     1253   1208      1.0 
Rose,Brian           24     1728   1862      0.9
Blair,Willie         17     1342   1545      0.9
Guzman,Geraldo       10      737    896      0.8
Perez,Carlos         22     1531   1921      0.8
Ohka,Tomo            12      793   1096      0.7
Rupe,Ryan            18      757   1553      0.5
Bergman,Sean         14      512   1152      0.4 
Gooden,Dwight        14      343   1161      0.3
Fassero,Jeff         23      152   1883      0.1
Schourek,Pete        21      126   1731      0.1
Cornelius,Reid       21      126   1828      0.1
Irabu,Hideki         11       27    853      0.0
Halama,John          30       63   2607      0.0
Arroyo,Bronson       12        8    958      0.0
Johnson,Mike         13        8    981      0.0
Stottlemyre,Todd     18        1   1496      0.0
Eiland,Dave          10        0    667      0.0


Conclusions and Futures

Injuries to a key pitcher can have a devastating effect on a team’s fortunes,
not to mention that they can shorten or hinder a pitcher’s career. With
escalating salaries, proper pitcher usage is increasingly important to
maximizing a team’s investment in its personnel. As a result, pitch counts have
gained prominence, managers and pitching coaches are scrutinized more closely in how
they handle a staff, and player development systems in the minors are
increasingly aware of protecting young arms.

The research presented here has shown, in essence, that not all pitches are
created equal. It is the high pitch count outings that represent the greatest
risk for both short-term ineffectiveness, and long-term potential for injury.
The PAP^3 system represents the most comprehensive attempt to date to quantify
the impact of starting pitcher usage over both time horizons, allowing us to
estimate, based on empirical evidence, the tradeoffs of having a star pitcher
throw deep into a game.

However, before placing too much weight on these discoveries, some caveats
apply. The results of this study should not be considered final because many
active pitchers are included in the study. It will be several years before a
large sample of pitch counts for entire pitcher careers becomes available, and
such a resource is necessary before we can complete the analysis that has been
started here.

It’s important to note that the Workload Stress factor is not a prediction of
injury risk for a specific season, but rather a risk of injury over several
years of pitching at that level. Also, PAP^3 may underestimate the relationship
between high pitch counts and injuries. This study considered only the most
major injuries, and did not look at minor injuries, missed turns in the
rotation, or shifts from starting to relief pitching. We also proceeded
assuming that the injury effect of high pitch counts would manifest itself in
arm problems. It’s possible that there would also be effects for other kinds of
non-arm injuries (especially back and leg injuries).

The research questions are far from resolved, and there are still many facets to
the problem that have yet to be fully addressed. For example, a pitcher’s age
may be of considerable importance when assessing the risks of specific pitch
count limits, but was not included in this study. Important data is still
missing from the study, such as minor league, spring training, and post-season
pitch counts. The interactions and spacing between pitcher outings may prove to
have a significant effect — does starting on 3 days rest vs. 4 days rest
substantially affect the risk of either injury or ineffectiveness? There may yet
be better estimates of injury risk as I did not conduct an exhaustive search for
all mathematical representations, favoring the simplicity of a single measure
like PAP^3. Biomechanical experts may help identify physical characteristics
that indicate which pitchers are more or less susceptible or have greater
endurance, allowing personalized PAP formulae for individual pitchers.

There is also the possibility that the relationship between pitch counts and
injury risk is not static over time. Improved training methods, changing usage
patterns and strategies, new medical technology and techniques, new diagnostics
and screening could all impact the negative effects of high pitch counts. Pitch
count data from 1950 may not be terribly informative about the effects on modern
pitchers. Similarly, twenty years from now, an entirely different PAP formula
may need to be developed to take into account the impact of a machine that
rejuvenates muscle tissue instantly that some scientist has yet to discover.
Clearly, we have not learned all we need to know about the effects of pitcher
usage.

For now, however, we can confidently say that PAP^3 yields information about
pitcher performance and durability not provided by pitch counts alone under
current playing conditions. Long pitch count outings noticeably decrease
expected short-term performance, and high stress workloads over time increase
the chances for serious injury. Any strategic analysis of pitcher usage will
have to consider the tradeoff between winning the current game and the long-term
cost. There are clearly times when you will want to ride a workhorse hard, such
as a key playoff game (though Al Leiter will attest that there are limits even
in the World Series). Finding the right balance between winning now and winning
tomorrow remains an interesting challenge, and today we have another tool in our
arsenal to assess a team’s sustainable pitching strategy.

Acknowledgements

I’d like to thank Dr. Lutz Mueller of Lumina Decision Systems for his advice
and consultation on the design and statistical testing methods in this research.


Keith Woolner is an author of Baseball Prospectus.
