Baseball Prospectus home


March 15, 2011

Team Injury Projection

How CHIPPER Works

by Dan Turkenkopf

Throughout the past week, Corey Dawkins and Marc Normandin have been using the Comprehensive Health Index [of] Pitchers [and] Players [with] Evaluative Result, otherwise known as CHIPPER, to break down the expected health of teams. If you've missed any installments and want to page back through them, you can visit the Team Injury Projection homepage. We've heard your questions about what CHIPPER means, where the projections came from, and how they differ from what others provide, and it's time for us to answer them.

Let's take the last one first. What makes CHIPPER different from the other injury projections out there? First off, our injury database contains not only major league injuries from the past eight years, but also minor league, spring training, and winter league data. We're even starting to collect injury data from colleges. All told, we have over 400,000 player days missed to injury in the database. Secondly, CHIPPER does more than just project whether a player is going to miss time: it also tries to provide a ballpark figure for how much time a player is going to miss.

So how does it work?  Let's take a look at the Boston Red Sox Team Injury Projection, which is free for all readers, and we'll cover the details.

BOSTON RED SOX

Dashboard

2010 Recap: Third in AL East; 71 entries; 21 DL trips; 1349 TDL; 19 DMPI

Year   TDL    TDL Rank   DMPI   DMPI Rank
2010   1349   27th       19     12th
2009   1073   19th       17     10th
2008   939    13th       14     7th
2007   884    10th       18     4th

The Dashboard gives you some context about the team's injury situation in aggregate. In the 2010 Recap section, you'll find that Boston had 71 entries in the CHIPPER database for 2010--all of the disclosed injury incidents for which we have a record. Of those entries, 21 were DL trips, or stays on the Disabled List. Boston had 1349 TDL, or Total Days Lost to Injury, and 19 DMPI, or Days Missed Per Injury, during the 2010 season. You can see Boston's historical TDL and DMPI across the rest of the dashboard; the numbers below the graph labels display the team's ranking league-wide, and the graphs are color-coded to reflect the team's injury performance relative to its competitors.
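The article does not spell out how DMPI is computed, but Boston's 2010 line is consistent with a simple average: Total Days Lost divided by the number of injury entries. The sketch below works under that assumption; the function name is ours, not something from the CHIPPER database.

```python
def days_missed_per_injury(total_days_lost, injury_entries):
    """Average days lost per recorded injury entry (assumed DMPI formula)."""
    if injury_entries <= 0:
        raise ValueError("need at least one injury entry")
    return total_days_lost / injury_entries


# Boston's 2010 line from the Dashboard: 1349 TDL across 71 entries.
boston_2010_dmpi = days_missed_per_injury(1349, 71)
print(round(boston_2010_dmpi))
```

If the assumption holds, the same division run in reverse gives a rough entry count for earlier seasons, where only TDL and DMPI are shown.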

Hitters in approximate Depth Charts order at time of publication

                         Days Lost to Injury     2011 Injury Risk
Player             Age   2008   2009   2010      1-day    15-days   30-days
Dustin Pedroia     27    0      4      99        Red      Red       Red
Carl Crawford      29    50     4      9         Red      Yellow    Yellow
Darnell McDonald   32    0      0      1         Yellow   Green     Green

CHIPPER's goal in life is to predict the chance that a player is going to miss time in 2011. It considers the likelihood of a player missing one or more games to injury, more than 15 games to injury, and more than 30 games to injury, and rates that risk as either green, yellow or red. We've represented these on the player lines with a color scheme you're used to but symbols you aren't.

Green: ~15 percent or lower chance; Yellow: ~15-85 percent chance; Red: ~85 percent and up chance of the player missing this many games with injury.
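To make the color bands concrete, here is a minimal sketch of how a projected probability could be turned into a rating. The cutoffs are the approximate ones given above; how CHIPPER treats a value sitting exactly on a boundary is not stated, so the handling of 0.15 and 0.85 here is an assumption, as is the function name.

```python
def risk_color(probability, low=0.15, high=0.85):
    """Map a projected chance of missing time to Green, Yellow or Red."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability <= low:
        # Assumption: a value exactly at the low cutoff counts as Green.
        return "Green"
    if probability >= high:
        # Assumption: a value exactly at the high cutoff counts as Red.
        return "Red"
    return "Yellow"


# One rating per threshold (1+, 15+ and 30+ games) for a made-up player.
made_up_player = {"1-day": 0.90, "15-days": 0.40, "30-days": 0.10}
ratings = {threshold: risk_color(p) for threshold, p in made_up_player.items()}
print(ratings)
```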

It's important to note that we're considering only games lost to a disclosed injury, not simply days off. While almost no one plays 162 games anymore, many players don't actually have any injuries reported in a given season. The database behind CHIPPER tracks injury reports beyond just DL visits, but there has to be a reported injury; we're not tracking or reporting routine days off.

Among the sample hitters above, Dustin Pedroia is the best bet to miss a significant amount of time coming off of his injury troubles last year, Carl Crawford is very likely to miss a few games, and Darnell McDonald's fairly unscathed past and positive markers give him a better profile for injury risk.

Pitchers in approximate Depth Charts order at time of publication

                       Days Lost to Injury     2011 Injury Risk
Player           Age   2008   2009   2010      1-day    15-days   30-days
Jon Lester       27    0      0      0         Green    Green     Green
John Lackey      32    53     50     0         Yellow   Yellow    Green
Tim Wakefield    44    19     62     0         Red      Red       Red

In the sample of pitchers above, Jon Lester profiles as one of the least-risky aces in the majors, John Lackey was durable last year but missed considerable playing time in 2008 and 2009 and remains a risk, and Tim Wakefield's age and significant time lost in 2009 make him very likely to spend an extended period in the trainer's room in 2011.

CHIPPER uses logistic regression to determine whether a player is going to miss time at each of the thresholds we've set. For position players, we consider age, position, time lost to injury during the previous three seasons, and proxy variables to represent player type. For pitchers, the categories include age and time lost to injury during the previous three seasons. Surprisingly, including workload didn't make much of a difference in the results.
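For readers who have not met logistic regression before, the sketch below fits one for a single threshold (missing more than 15 games) using only the pitcher inputs named above: age and days lost in each of the previous three seasons. It is an illustration of the technique, not CHIPPER's actual model. The training rows are invented, the feature scaling is our choice, and the fitting routine is plain gradient descent written out by hand.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def scale(age, days_3_ago, days_2_ago, days_last):
    """Put the inputs on similar scales (an arbitrary choice for this toy)."""
    return [age / 40.0, days_3_ago / 100.0, days_2_ago / 100.0, days_last / 100.0]


def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Estimated probability of crossing the injury threshold."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Invented rows: (age, days lost three, two and one season ago, missed 15+ games?)
TOY_SEASONS = [
    (24, 0, 0, 0, 0), (26, 0, 5, 0, 0), (27, 10, 0, 3, 0), (29, 0, 0, 20, 0),
    (31, 5, 0, 0, 0), (25, 0, 30, 50, 1), (30, 40, 10, 60, 1),
    (33, 0, 55, 45, 1), (35, 20, 70, 30, 1), (38, 60, 15, 90, 1),
]
rows = [scale(*season[:4]) for season in TOY_SEASONS]
labels = [season[4] for season in TOY_SEASONS]
weights, bias = fit_logistic(rows, labels)

healthy = predict(weights, bias, scale(25, 0, 0, 0))
banged_up = predict(weights, bias, scale(36, 50, 60, 70))
print(round(healthy, 3), round(banged_up, 3))
```

In practice one such model would be fit per threshold (1, 15 and 30 games), and the position-player version would add position and the player-type proxies as further inputs.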

You're probably wondering what this means to you as a fan or a fantasy owner.  If we say a player is a high risk, should you expect him to head to the DL? Well, yes and no. I expect at least 70 percent of the players we indicate as high-risk to hit their injury threshold, but I can't tell you which ones. If I could, I'd be making a killing in Vegas and not sharing my predictions with you. Since you're not going to get a firmer guarantee out of me, let's move on to what's still to come with CHIPPER this season.

The Team Injury Projections we've run to this point are based only on major-league injury history and contain only the players who saw time in the bigs last season. That should change this week, as we include the minor-league data from our injury database. We'll be sure to let you know when things are updated, and the Player Forecast Manager will always contain the most up-to-date projections. We're still just getting to know this data; looking further ahead, we're working on improving the specificity of our injury projections enough to use them as an input to our PECOTA projection system. We'll also be rolling out this information in team reports and player cards.

We're also planning to add more data about the previous injuries to the mix. Someone who's had hamstring issues is probably more likely to suffer a recurrence than someone who broke a finger. The difficulty here is small sample size and proper categorization of injuries. That's why we hired an athletic trainer with extensive medical training and experience. Beyond that, we'll continue to refine the model where we see opportunities for improvement, and we hope to introduce additional tools to help you understand, measure, and react to injuries as they occur.

Our goal is to give you the best team injury reports in the business, backed by real-world injury experience, expertise in data analysis, and the only verifiable data set of its kind in the field. Data-driven injury analysis is a relatively untapped area, and there's plenty left to explore. If you have any ideas or suggestions, please let us know.


31 comments have been left for this article.

Asinwreck

Thank you for showing your work.

Mar 15, 2011 06:48 AM
rating: 1
 
John Carter

Thanks for this.

Mar 15, 2011 06:48 AM
rating: 0
 
Tom Gorman
(421)

Red = 85% risk of injury. But you only expect 70% of red players to hit the DL?

Did you do any testing on the system? Input data through 2009 and see if it correctly predicted 2010?

Mar 15, 2011 06:57 AM
rating: 4
 
dianagram

You know what I love about BP (amongst many things)?

It's that staff writers will *publicly* comment on and critique other staff writers' work.

It shows this is not a rubber-stamp shop.

Thanks guys!

Mar 15, 2011 07:16 AM
rating: 2
 
BP staff member Dan Turkenkopf
BP staff

Yes, I did test this against 2010 data. There are some limitations of the logistic regression model that cause discrepancies at the extreme. A few more green players than we expect will end up injured, and slightly fewer red players will get injured - the difficulty is predicting which.

Mar 15, 2011 07:42 AM
 
Lopecci

I have to admit, I do like the injury projection system. But I HATE the name CHIPPER for it. Show Larry some respect: the guy's a future Hall of Famer, a superstar in his prime, and a tried & true first-class dude. He restructured his contract to stay on one team. He moved to left field when the Braves needed a left fielder. I mean seriously, with all the data you have at your hands, could you not have come up with a different name? Chipper is at least willing to step on the field if he is hurt; most guys will take a week off & think nothing of it. Sorry for the rant, hope I got my point across!!

Mar 15, 2011 07:38 AM
rating: -2
 
lmarighi

I like Chipper as a player, but "first-class dude" might take it a little too far. Like many celebrities, he has made mistakes (e.g. http://sportsillustrated.cnn.com/baseball/mlb/news/1998/10/22/jones_paternity/ ). Also, I seem to remember that CHIPPER was one of the entries from when Will Carroll ran a contest to name a new system. . .

Mar 15, 2011 07:45 AM
rating: 0
 
Marc Normandin

I had a lot to do with naming it CHIPPER, and let me tell you that I did so because he's a player I love. My other choice was NOMAR, another favorite, so know there is no ill will meant.

Mar 15, 2011 08:03 AM
rating: 0
 
BP staff member Dan Turkenkopf
BP staff

I'm curious, what did you reverse acronymize NOMAR out to?

Mar 15, 2011 08:11 AM
 
Marc Normandin

I never came up with one I was satisfied with, which was part of the reason CHIPPER won that internal struggle of mine.

Mar 15, 2011 20:58 PM
rating: 0
 
tmangell

tremendous job - thanks! I checked out the Phillies page, and there's going to be a big red cross next to J-Roll's name on my auction sheet!

Mar 15, 2011 07:58 AM
rating: 0
 
Matthew Avery

A couple of questions if you don't mind. You said, "I expect at least 70 percent of the players we indicate as high-risk to hit their injury threshold, but I can't tell you which ones." Now, given that you'd already given a point estimate (85%) that these high-risk players will meet their thresholds, I'm curious where 70% came from.

And that sort of leads to my second question, which is how does the system perform with test data? I assume you did assessments based on cross-validation or data from previous years, and I think it would be informative for you to share how well it did. For example, if you ran the system for the 2010 season (obviously using data from previous years), how accurate would it be?

Mar 15, 2011 09:20 AM
rating: 0
 
BP staff member Dan Turkenkopf
BP staff

The model was based off half the data set from 2010 and tested off the other half.

The total number of expected injuries and actual injuries match up quite well, but we do see some discrepancies at the extremes as I mention above.

Basically, for pitchers at 30+ games (the worst estimate), we're over-estimating the red risk by about 40%.

For position players at 1+ games (the best estimate), we're over-estimating the red risk by about 15%.

I'll try to run the model against some earlier seasons later this week if I get a chance, but that's behind adding in minor league injuries on my to-do list.

Mar 15, 2011 10:27 AM
 
Matthew Avery

Cool. Those sound like reasonable results. When you say you over-estimated the red risk for that subset of pitchers by 40%, do you mean, "the risk was really 40% and we projected 80%" or "the risk was really 40% and we projected 56%"?

I assume the latter but it seemed ambiguous.

Mar 15, 2011 12:26 PM
rating: 0
 
BP staff member Dan Turkenkopf
BP staff

Yeah, it was a little ambiguous. Sorry about that.

It's closer to the second.

Let's say we have a hypothetical situation where we have 100 green, 100 yellow, and 100 red players. We'd estimate probably 7 green, 50 yellow and 95 red to get hurt. The actual results are more likely something like 30 green, 50 yellow and 66 red.

Is that more clear?

Mar 16, 2011 05:53 AM
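
The hypothetical in the comment above can be turned into a quick calibration check. The sketch below uses only the bucket counts given there (100 players per color, with the estimated and actual injury counts quoted), and compares the expected and observed injury rates per bucket. The "over_estimate" measure, expected injuries divided by actual injuries minus one, is our reading of what "over-estimating the red risk" means, not a definition from the article.

```python
# (players in bucket, injuries the model expects, injuries actually seen),
# taken from the hypothetical 100/100/100 example in the comment above.
BUCKETS = {
    "Green": (100, 7, 30),
    "Yellow": (100, 50, 50),
    "Red": (100, 95, 66),
}


def calibration_report(buckets):
    """Expected vs. actual injury rate for each color bucket."""
    report = {}
    for color, (players, expected, actual) in buckets.items():
        report[color] = {
            "expected_rate": expected / players,
            "actual_rate": actual / players,
            # Assumed measure: relative gap between expected and actual counts.
            "over_estimate": expected / actual - 1.0,
        }
    return report


def totals(buckets):
    """Total expected and actual injuries across all buckets."""
    expected = sum(e for _, e, _ in buckets.values())
    actual = sum(a for _, _, a in buckets.values())
    return expected, actual


report = calibration_report(BUCKETS)
for color, row in report.items():
    print(color, row)
print(totals(BUCKETS))
```

On these numbers the totals land close together while the Green and Red buckets miss in opposite directions, which is the pattern described in the replies above.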
 
leites

Are types of injury differentiated in this model? For instance, if Jon Lester had missed a year due to shoulder surgery rather than cancer, would his projection be any different?

Mar 15, 2011 10:52 AM
rating: 0
 
BP staff member Dan Turkenkopf
BP staff

Not yet.

If we look very specifically at injury details (bucketing every strained groin together, for example), then our sample size for each category becomes really small.

The right answer I think is to more broadly categorize injuries - maybe muscle and tendon problems in the throwing arm, for example. That's going to take time, and a lot more medical knowledge than I have. But I do have help (http://www.baseballprospectus.com/article.php?articleid=13009).

Mar 15, 2011 11:15 AM
 
leites

I like the way you're approaching it. Another sample issue will be that, for certain types of injuries, the medical treatments and/or rehabilitation regimes have become more effective. So historical data may in some cases be misleading.

Mar 15, 2011 12:13 PM
rating: 0
 
Ogremace

The expansion of this database could also give us a way of comparing team medical staffs against one another, though there would always be a limit to the possible sample size of any group's set.

Mar 15, 2011 15:14 PM
rating: 0
 
BP staff member Corey Dawkins
BP staff

It is true that techniques have advanced, as has therapy. For instance, the treatment for hip labral tears and injuries inside the joint used to be an open surgery where they actually dislocate the hip in the operating room. Thankfully, arthroscopy for the hip is improving and we now have many more surgeons who are trained by the best.

The data that I've gathered is very detailed and has a separate field that covers the injury in greater detail than simply "shoulder strain." As much information as is known goes straight into the database.

So for instance if it was an older player who had the open procedure it would state this (it would have to be a significantly older player) vs a younger player who would be counted as arthroscopy. This is the case for every injury I encounter.

Thanks for the feedback.

Mar 15, 2011 18:04 PM
 
Peter Benedict

I would LOVE to have this linked to each player's page, or better yet, have it in the PFM.

Mar 15, 2011 13:29 PM
rating: 1
 
BP staff member Colin Wyers
BP staff

It's in the PFM. Look in the configuration options, under display, and you can set it to show the injury projections along with the PFM output.

Mar 15, 2011 16:49 PM
 
dREaDS Fan

I was looking for, but never saw in the article, the part IN CAPS below:

"We've heard your questions about what chipper means, where the projections came from, and HOW THEY DIFFER FROM WHAT OTHERS PROVIDE, and it's time for us to answer them."

How does CHIPPER differ from last year's system ... other systems? (NOMAR?)


Mar 15, 2011 20:07 PM
rating: 0
 
BP staff member Corey Dawkins
BP staff

This is the only place where you will be able to find this level of data on any injury or medical condition, not just on DL trips, and use it for projections.

I will leave the other parts of the answer to the smart guys who can explain the mathematical side of things much better than I can.

Mar 15, 2011 20:17 PM
 
Marc Normandin

There is no NOMAR. That was just me talking about names I considered.

Mar 15, 2011 20:59 PM
rating: 0
 
dREaDS Fan

Caught that ... poor humor on my part.

Mar 16, 2011 06:55 AM
rating: 0
 
Matt Kory

There is no NOMAR, only Zuul!

Mar 16, 2011 14:36 PM
rating: 3
 
Jack Thomas

I realize this system is new. However, I think it is important to add other elements to the projections. The Verducci Effect (SPs under age 25 with a 30 IP increase) has proven itself. Your database projected Phil Hughes as the lowest risk of any Yankee pitcher despite his 40+ IP increase in 2010. I have knocked him down my board due to the risk factor.
Keep up the good work -- Very useful information.

Mar 16, 2011 15:23 PM
rating: 1
 
JeffZimmerman

Once a dataset was available to check the data, the Verducci Effect was proven to be not true:

http://baseballanalysts.com/archives/2010/02/verducci_effect.php

Mar 18, 2011 07:44 AM
rating: -1
 
Peter Benedict

The CHIPPER info is not available under "Display" in PFM as far as I can see. I would really love to see it there though! The only options there in my view are playing time, expert rankings, minimum dollars, and biographical data.

Mar 16, 2011 15:43 PM
rating: 0
 
BP staff member Dave Pease
BP staff
(2)

Please try that link again--sorry for the confusion.

Mar 16, 2011 16:17 PM
 