
Throughout the past week, Corey Dawkins and Marc Normandin have been using the Comprehensive Health Index [of] Pitchers [and] Players [with] Evaluative Result, otherwise known as CHIPPER, to break down the expected health of teams. If you've missed any installments and want to page back through them, you can find them on the Team Injury Projection homepage. We've heard your questions about what CHIPPER means, where the projections come from, and how they differ from what others provide, and it's time for us to answer them.

Let's take the last one first. What makes CHIPPER different from the other injury projections out there? First, our injury database contains not only major league injuries from the past eight years, but also minor league, spring training, and winter league data. We're even starting to collect injury data from colleges. All told, the database holds more than 400,000 player days missed to injury. Second, CHIPPER does more than just project whether a player is going to miss time: it also tries to provide a ballpark figure for how much time a player is going to miss.

So how does it work? Let's take a look at the Boston Red Sox Team Injury Projection, which is free for all readers, and we'll cover the details.

BOSTON RED SOX

Dashboard

2010 Recap: Third in AL East; 71 entries; 21 DL trips; 1349 TDL; 19 DMPI

| Season | TDL | TDL rank | DMPI | DMPI rank |
| --- | --- | --- | --- | --- |
| 2010 | 1349 | 27th | 19 | 12th |
| 2009 | 1073 | 19th | 17 | 10th |
| 2008 | 939 | 13th | 14 | 7th |
| 2007 | 884 | 10th | 18 | 4th |

The Dashboard gives you some context about the team's injury situation in aggregate. In the 2010 Recap section, you'll find that Boston had 71 entries in the CHIPPER database for 2010, which covers all of the disclosed injury incidents for which we have a record. Of those entries, 21 were DL trips, or stays on the Disabled List. Boston had 1349 TDL, or Total Days Lost to Injury, and 19 DMPI, or Days Missed Per Injury, during the 2010 season. You can see Boston's historical TDL and DMPI across the rest of the dashboard; the numbers below the graph labels display the team's ranking league-wide, and the graphs are color-coded to reflect the team's injury performance relative to its competitors.
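Boston's 2010 Recap numbers are consistent with DMPI being TDL divided by the number of injury entries (1349 / 71 = 19). Here is a minimal sketch under that assumption; the function name is ours for illustration, and CHIPPER's exact definition may differ.

```python
def days_missed_per_injury(total_days_lost: int, injury_entries: int) -> float:
    """Average days lost per recorded injury entry.

    Assumption: DMPI = TDL / entries. This matches Boston's 2010 figures
    but is not confirmed as the exact CHIPPER definition.
    """
    if injury_entries <= 0:
        raise ValueError("injury_entries must be positive")
    return total_days_lost / injury_entries


# Boston's 2010 Recap figures from the Dashboard.
boston_2010 = days_missed_per_injury(total_days_lost=1349, injury_entries=71)
print(round(boston_2010))  # 19
```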

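Hitters in approximate Depth Charts order at time of publication

| Player | Age | Days lost 2008 | Days lost 2009 | Days lost 2010 | 2011 risk: 1-day | 2011 risk: 15-days | 2011 risk: 30-days |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Dustin Pedroia | 27 | 0 | 4 | 99 | Red | Red | Red |
| Carl Crawford | 29 | 50 | 4 | 9 | Red | Yellow | Yellow |
| Darnell McDonald | 32 | 0 | 0 | 1 | Yellow | Green | Green |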

CHIPPER's goal in life is to predict the chance that a player is going to miss time in 2011. It considers the likelihood of a player missing one or more games to injury, more than 15 games to injury, and more than 30 games to injury, and rates that risk as either green, yellow or red. We've represented these on the player lines with a color scheme you're used to but symbols you aren't.

Green: roughly a 15 percent or lower chance of the player missing this many games with injury; Yellow: roughly a 15-85 percent chance; Red: roughly an 85 percent or higher chance.
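The color bands can be sketched as a simple lookup. This is an illustration only: the cutoffs are approximate in the text above, so treating them as exact values of 0.15 and 0.85 (and deciding which band the boundary values fall in) is our assumption, as is the function name.

```python
def risk_color(probability: float) -> str:
    """Map a probability of missing time to a risk color.

    Assumed cutoffs: <= 0.15 is Green, >= 0.85 is Red, anything
    in between is Yellow. The published bands are only approximate.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability <= 0.15:
        return "Green"
    if probability >= 0.85:
        return "Red"
    return "Yellow"


for p in (0.10, 0.50, 0.90):
    print(p, risk_color(p))
```

Note how wide the Yellow band is under these cutoffs: it covers everything from a modest risk to a near-certain one.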

It's important to note that we're considering only games lost to a disclosed injury, not simply days off. While almost no one plays 162 games anymore, many players don't actually have any injuries reported in a given season. The database behind CHIPPER tracks injury reports beyond just DL visits, but there has to be a reported injury; we're not tracking or reporting routine days off.

Among the sample hitters above, Dustin Pedroia is the best bet to miss a significant amount of time coming off his injury troubles last year, Carl Crawford is very likely to miss a few games, and Darnell McDonald's fairly unscathed past and positive markers give him the lowest-risk profile of the three.

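Pitchers in approximate Depth Charts order at time of publication

| Player | Age | Days lost 2008 | Days lost 2009 | Days lost 2010 | 2011 risk: 1-day | 2011 risk: 15-days | 2011 risk: 30-days |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Jon Lester | 27 | 0 | 0 | 0 | Green | Green | Green |
| John Lackey | 32 | 53 | 50 | 0 | Yellow | Yellow | Green |
| Tim Wakefield | 44 | 19 | 62 | 0 | Red | Red | Red |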

In the sample of pitchers above, Jon Lester profiles as one of the least-risky aces in the majors. John Lackey was durable last year, but he missed considerable playing time in 2008 and 2009 and remains a risk. Tim Wakefield's age and significant time lost in 2009 make him very likely to spend an extended period in the trainer's room in 2011.

CHIPPER uses logistic regression to estimate whether a player is going to miss time at each of the thresholds we've set. For position players, we consider age, position, time lost to injury during the previous three seasons, and proxy variables to represent player type. For pitchers, the inputs include age and time lost to injury during the previous three seasons. Surprisingly, including workload didn't make much of a difference in the results.
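To make the idea concrete, here is a rough sketch of how a fitted logistic model turns a pitcher's age and three seasons of days lost into a probability. Everything specific in it is hypothetical: the coefficients are invented for illustration and are not CHIPPER's fitted values, and the names `PITCHER_MODEL_15_DAYS` and `miss_probability` are ours. In practice there would be one such model per threshold, each fit to the injury database.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for a single threshold model.
# These are NOT CHIPPER's values; they only show the shape of the calculation.
PITCHER_MODEL_15_DAYS = {
    "intercept": -3.0,
    "age": 0.05,
    # Weights for days lost, most recent season first.
    "days_lost": (0.020, 0.010, 0.005),
}


def miss_probability(model: dict, age: int, days_lost_by_season: tuple) -> float:
    """Probability of crossing the model's threshold.

    days_lost_by_season lists days lost for the previous three seasons,
    most recent first.
    """
    z = model["intercept"] + model["age"] * age
    for weight, days in zip(model["days_lost"], days_lost_by_season):
        z += weight * days
    return logistic(z)


healthy = miss_probability(PITCHER_MODEL_15_DAYS, 30, (0, 0, 0))
banged_up = miss_probability(PITCHER_MODEL_15_DAYS, 30, (60, 45, 20))
print(f"clean history: {healthy:.2f}, recent injuries: {banged_up:.2f}")
```

With positive weights on age and days lost, an older player or one with more recent time missed always scores a higher probability; the real model's coefficients come from fitting against historical outcomes rather than being set by hand.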

You're probably wondering what this means to you as a fan or a fantasy owner. If we say a player is a high risk, should you expect him to head to the DL? Well, yes and no. I expect at least 70 percent of the players we indicate as high-risk to hit their injury threshold, but I can't tell you which ones. If I could, I'd be making a killing in Vegas and not sharing my predictions with you. Since you're not going to get a firmer guarantee out of me, let's move on to what's still to come with CHIPPER this season.

The Team Injury Projections we've run to this point are based only on major-league injury history and contain only the players who saw time in the bigs last season. That should change this week, as we include the minor-league data from our injury database. We'll be sure to let you know when things are updated, and the Player Forecast Manager will always contain the most up-to-date projections. We're still just getting to know this data; looking further ahead, we're working on improving the specificity of our injury projections enough to use them as an input to our PECOTA projection system. We'll also be rolling out this information in team reports and player cards.

We're also planning to add more data about previous injuries to the mix. Someone who's had hamstring issues is probably more likely to suffer a recurrence than someone who broke a finger. The difficulties here are small sample sizes and the proper categorization of injuries. That's why we hired an athletic trainer with extensive medical training and experience. Beyond that, we'll continue to refine the model where we see opportunities for improvement, and we hope to introduce additional tools to help you understand, measure, and react to injuries as they occur.

Our goal is to give you the best team injury reports in the business, backed by real-world injury experience, expertise in data analysis, and the only verifiable data set of its kind in the field. Data-driven injury analysis is a relatively untapped area, and there's plenty left to explore. If you have any ideas or suggestions, please let us know.