October 22, 2003
Getting PADE, Redux
A Few Adjustments
Last time, we cooked up a way to remove park effects when looking at Bill James' Defensive Efficiency, a stat that measures the percentage of balls in play fielded by a team's defense. The new metric, tentatively called PADE, ranked teams on a zero-centered scale, showing how well a team performed against the league average with their given schedule. The intent was to more fairly judge defenses against each other rather than punish teams like Colorado and Boston for having to play in more difficult venues.
As stated before, defense can be broken down into many facets, but the three most prevalent parts are park factors, pitching, and actual defensive performance. Since we've already figured out how to remove the first one--park factors--the next logical step is attempting to correct for pitching, leaving us closer to a metric that measures only defensive performance.
To do this, we'll take a similar approach to the first version of PADE, but instead of defensive park factors, we'll use defensive pitcher factors. The first step is to determine an expected defensive efficiency for every pitcher, based on their career history.
Looking for defensive efficiency ratings sorted by pitchers inherently concedes that pitchers have some degree of control over the number of hits per balls in play. This concession will not sit well with many readers, mostly owing to Voros McCracken's article of a few years ago. To sum up very quickly, McCracken broke down pitching into statistics over which the pitcher has control (BB, HBP, SO, HR) and those over which he does not (hits on balls in play--a stat he called IPAvg for "In Play AVG"); from there, he pointed out that pitchers have little if any control over the percentage of their balls in play that become hits. Since the publication of this article, there has been a great deal of research surrounding his conclusions, most notably by BP's Keith Woolner and Diamond Mind Baseball's Tom Tippett. Based on their research, McCracken has softened his original conclusion and it's now safe to say that pitchers do have some control over their IPAvg, especially when considering large sample sizes.
Back to what we were getting at: in order to attempt to remove the quality of pitching from our defensive metrics, we need to establish just how difficult it is to play defense behind each team's pitching staff. For a first attempt, we'll establish an expected IPAvg for all pitchers, simply averaging their total career numbers. Rather than look simply at the last three years as we do with park factors, it's important to look at a pitcher's total career because of the large variation in IPAvg from year to year. Armed with an IPAvg for every pitcher, we simply weight that by how much of the pitching they did for their team and produce an expected defensive efficiency based on the pitching staff (PitchDE). Here are the results:
Team                   PitchDE
Seattle Mariners       .7286
Anaheim Angels         .7227
Tampa Bay Devil Rays   .7221
San Francisco Giants   .7182
Arizona Diamondbacks   .7171
Oakland Athletics      .7146
Chicago White Sox      .7140
Los Angeles Dodgers    .7132
Cleveland Indians      .7131
Baltimore Orioles      .7129
St Louis Cardinals     .7116
New York Yankees       .7108
Philadelphia Phillies  .7099
San Diego Padres       .7096
Atlanta Braves         .7093
Boston Red Sox         .7091
Minnesota Twins        .7082
Chicago Cubs           .7077
Cincinnati Reds        .7072
Montreal Expos         .7068
New York Mets          .7061
Houston Astros         .7060
Kansas City Royals     .7059
Florida Marlins        .7053
Pittsburgh Pirates     .7053
Toronto Blue Jays      .7046
Colorado Rockies       .7041
Detroit Tigers         .7003
Milwaukee Brewers      .7000
Texas Rangers          .6964
Obviously there are some problems here. Primarily, there are quite a few rookie or relief pitchers with small sample size problems, making younger staffs more susceptible to outliers that skew the numbers. Next, there's the problem of pitchers having spent too much of their time in one park or in front of one defense. This is mainly a problem with teams laden with homegrown talent who pitch in extreme parks (see: Athletics, Oakland). Finally, there's the slight problem of pitchers who have been around long enough to have significant playing time in the late '80s and early '90s when the game was played slightly differently.
In order to correct for some of these factors, we're going to have to make a few basic adjustments. First, to eliminate the problem of extremely lengthy careers, we'll limit our numbers to the last nine seasons, starting after the 1994-95 strike. Second, similar to what Tom Tippett did, we'll look at pitchers' IPAvg relative to their team's rather than at the raw numbers. By doing so, we can better isolate pitchers from park and team defensive quality effects.
Here come the equations, so if that's not your cup of tea, feel free to scroll down. For each pitcher, we'll compute a year-by-year weighted IPAvg, basically dividing the team's IPAvg (tIPAvg) by the player's IPAvg (pIPAvg). This calculation will give us a number based around 1.000 that tells us whether a player was generally tougher on the defense (lower than 1.000) or easier on the defense (higher than 1.000).
Now that we have this year-by-year IPAvg based on the team environment (tIPAvg/pIPAvg), we need to compute a career average for each pitcher. If we simply averaged the numbers, we'd get some skewed results because of playing time issues--a season in which a pitcher only faced 10 batters would count the same as a season in which he faced 500. So we'll weight each season by playing time and recalculate using the following formula:
pExIPAvg = Sum((tIPAvg/pIPAvg)*(pBIP/sum(pBIP)))
where pBIP is the number of balls in play a pitcher allows in a given season and we're summing year-by-year. This metric, called pExIPAvg, is each pitcher's playing-time-weighted average of that team-relative ratio since 1995, and we'll expect the pitcher to post the same ratio this season.
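For readers who prefer code to equations, the career weighting can be sketched in a few lines of Python. This is only an illustration: the `Season` container and the `career_ex_ipavg` function are names invented here, the sample numbers are made up, and skipping seasons with no balls in play or no hits allowed is an assumption added to avoid dividing by zero.

```python
from typing import Iterable, NamedTuple


class Season(NamedTuple):
    """One pitcher-season (hypothetical container for this sketch)."""
    p_bip: int      # pBIP: balls in play the pitcher allowed that season
    p_ipavg: float  # pIPAvg: the pitcher's hits per ball in play
    t_ipavg: float  # tIPAvg: his team's hits per ball in play


def career_ex_ipavg(seasons: Iterable[Season]) -> float:
    """BIP-weighted career average of the yearly tIPAvg/pIPAvg ratio."""
    # Assumption: drop seasons where the ratio is undefined or carries no weight.
    usable = [s for s in seasons if s.p_bip > 0 and s.p_ipavg > 0]
    total_bip = sum(s.p_bip for s in usable)
    if total_bip == 0:
        raise ValueError("no balls in play to weight by")
    # Each season's ratio counts in proportion to its share of career BIP.
    return sum((s.t_ipavg / s.p_ipavg) * (s.p_bip / total_bip) for s in usable)


# Made-up career: a 10-BIP cup of coffee followed by a 500-BIP full season.
career = [Season(10, 0.400, 0.300), Season(500, 0.280, 0.294)]
print(round(career_ex_ipavg(career), 4))
```

With these invented numbers the yearly ratios are 0.75 and 1.05. A simple average would land at 0.90, while the weighted version stays close to the full season's 1.05, which is the behavior the formula is after.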
Here's where we can correct for pitchers with small sample sizes--rookies and specialty relievers. (Thanks again to Keith Woolner for the shove in the right direction.) For pitchers with fewer than 500 BIP, we'll adjust things slightly, using a combination of their own performance and their team's overall performance to reel in some of the extreme outliers.
For example, in the original formula, a pitcher who has only 100 BIP, yields 30 hits (a .300 IPAvg), and pitches in front of a defense that yields only a .280 IPAvg would get a very extreme baseline of 0.933 (.280/.300). Instead, we'll use ((100*.300)+((500-100)*.280))/500 = (30+112)/500 = .284. Then, .280/.284 = 0.986, instead of 0.933. For those of you who like complicated equations, this can be expressed as:
If (pBIP < 500) then pExIPAvg = tIPAvg/(((pBIP*pIPAvg)+((500-pBIP)*tIPAvg))/500)
I used 500 BIP as a minimum because it removed all the drastic outliers while not affecting pitchers with significant playing time. Results do not change dramatically moving the limit higher.
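The small-sample adjustment is easy to check with a short Python sketch. The function names and the `MIN_BIP` constant are invented for this illustration; the logic simply pads a pitcher's record with team-average balls in play until he reaches 500, then takes the ratio as before.

```python
MIN_BIP = 500  # the BIP threshold used in the text


def regressed_ipavg(p_bip: int, p_ipavg: float, t_ipavg: float,
                    min_bip: int = MIN_BIP) -> float:
    """Blend a pitcher's IPAvg with his team's when he is short of min_bip."""
    if p_bip >= min_bip:
        return p_ipavg  # enough playing time, no adjustment
    # Fill the missing (min_bip - p_bip) balls in play at the team's rate.
    return (p_bip * p_ipavg + (min_bip - p_bip) * t_ipavg) / min_bip


def ex_ratio(p_bip: int, p_ipavg: float, t_ipavg: float) -> float:
    """Team-relative ratio, using the regressed IPAvg for small samples."""
    return t_ipavg / regressed_ipavg(p_bip, p_ipavg, t_ipavg)


# The example from the text: 100 BIP, .300 IPAvg, team at .280.
print(round(regressed_ipavg(100, 0.300, 0.280), 3))  # 0.284
print(round(ex_ratio(100, 0.300, 0.280), 3))         # 0.986
```

A pitcher at or above 500 BIP passes through untouched, so the same function can be applied to everyone.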
Now we can apply our pExIPAvg for each pitcher to his team in a given season and compute an expected tIPAvg for that year. We'll again need to correct for playing time, but this time as a percentage of the team BIP instead of the sum of a pitcher's playing time over his career. It's important to use BIP as the playing time metric instead of batters faced (BFP) because we're looking at how the pitcher affects the defense rather than run scoring in total. By doing so, we correct for the fact that players like Kerry Wood, who throw more than their fair share of strikeouts and walks, don't weigh on the defense as much as players like Greg Maddux or Tom Glavine despite having similar playing time levels based on BFP. So we'll multiply each player's expected IPAvg by his percentage of team BIP and then sum for each team:
tExIPAvg = sum((pExIPAvg)*(pBIP/tBIP))
where tBIP is the number of balls in play the team allowed on the season and we're summing team by team.
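As a rough Python sketch of that team-level sum: the function name and the three-man staff below are invented for illustration, and tBIP is assumed to be the sum of the listed pitchers' balls in play.

```python
from typing import Iterable, Tuple


def team_ex_ipavg(staff: Iterable[Tuple[float, int]]) -> float:
    """Sum each pitcher's pExIPAvg weighted by his share of team BIP.

    `staff` holds (pExIPAvg, pBIP) pairs for one team-season.
    """
    staff = list(staff)
    t_bip = sum(p_bip for _, p_bip in staff)  # tBIP, assumed to be the staff total
    if t_bip == 0:
        raise ValueError("team allowed no balls in play")
    return sum(p_ex * (p_bip / t_bip) for p_ex, p_bip in staff)


# Hypothetical staff: (pExIPAvg, pBIP) for three pitchers.
staff = [(1.02, 600), (0.97, 300), (1.00, 100)]
print(round(team_ex_ipavg(staff), 3))  # 1.003
```

Because the weights are BIP shares rather than batters faced, a strikeout-heavy pitcher automatically counts for less here than a contact pitcher with the same BFP.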
We can call this the team expected IPAvg (tExIPAvg). This metric will again be centered around 1.000, with higher numbers meaning the staff is easier on the defense and lower numbers meaning it is harder. The numbers are slightly off 1.000, probably because pitchers who tend to get shelled don't last very long, so we'll re-center around the actual average. Here are the team-by-team results:
Team                   tExIPAvg
Philadelphia Phillies  0.9781
Houston Astros         0.9786
Detroit Tigers         0.9883
Montreal Expos         0.9886
Pittsburgh Pirates     0.9905
Atlanta Braves         0.9917
Chicago Cubs           0.9924
Texas Rangers          0.9927
Cincinnati Reds        0.9939
New York Yankees       0.9952
Milwaukee Brewers      0.9953
Los Angeles Dodgers    0.9958
St Louis Cardinals     0.9971
Toronto Blue Jays      0.9972
New York Mets          0.9988
Florida Marlins        0.9989
San Diego Padres       1.0004
Cleveland Indians      1.0007
Kansas City Royals     1.0009
Anaheim Angels         1.0026
Chicago White Sox      1.0045
Minnesota Twins        1.0055
Boston Red Sox         1.0085
Arizona Diamondbacks   1.0087
San Francisco Giants   1.0097
Colorado Rockies       1.0110
Oakland Athletics      1.0111
Baltimore Orioles      1.0113
Tampa Bay Devil Rays   1.0209
Seattle Mariners       1.0310
Again, the lower the number, the harder things are on the defense. Based on the numbers above, we can weight every team's defensive performance like we did with PADE, yielding a zero-based scale that tells us how well a defense did given its particular pitching staff. To do so, we use the following formula:
PIDE = (Def_Eff/LgDef_Eff)/tExIPAvg
where Def_Eff is a team's raw defensive efficiency on the season and LgDef_Eff is the league average defensive efficiency. We'll then subtract 1 and multiply by 100 to get a scale of percentages. We'll call this new metric PIDE for Pitching Independent Defensive Efficiency (though reader AC's suggestion of "DEREK: Defensive Efficiency Ratings, Even Keeled" was much better than anything I thought up. Seriously, "PIDE", that's pathetic.) Anyway, the results:
Team                   PIDE
Houston Astros          3.617
Philadelphia Phillies   3.183
Atlanta Braves          1.867
Los Angeles Dodgers     1.856
Oakland Athletics       1.231
Montreal Expos          1.175
Chicago White Sox       0.894
Cleveland Indians       0.823
Pittsburgh Pirates      0.810
Anaheim Angels          0.787
San Francisco Giants    0.751
Chicago Cubs            0.724
St Louis Cardinals      0.577
Detroit Tigers          0.437
San Diego Padres        0.231
Cincinnati Reds         0.014
Seattle Mariners       -0.117
Tampa Bay Devil Rays   -0.244
New York Mets          -0.483
Minnesota Twins        -0.532
Kansas City Royals     -0.673
Florida Marlins        -0.781
Milwaukee Brewers      -0.860
New York Yankees       -1.185
Arizona Diamondbacks   -1.239
Toronto Blue Jays      -1.301
Boston Red Sox         -2.186
Texas Rangers          -2.417
Baltimore Orioles      -2.633
Colorado Rockies       -3.309
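The PIDE arithmetic is short enough to sketch in Python. The function name is invented here, and the inputs in the example are hypothetical, since the raw Def_Eff and LgDef_Eff figures are not listed in this column.

```python
def pide(def_eff: float, lg_def_eff: float, t_ex_ipavg: float) -> float:
    """Pitching Independent Defensive Efficiency, as a percentage.

    Computes (Def_Eff / LgDef_Eff) / tExIPAvg, then subtracts 1 and
    multiplies by 100 so that 0 means league average.
    """
    return ((def_eff / lg_def_eff) / t_ex_ipavg - 1.0) * 100.0


# Hypothetical team: .710 Def_Eff in a .700 league, behind a staff that is
# hard on its defense (tExIPAvg of 0.990).
print(round(pide(0.710, 0.700, 0.990), 3))
```

Dividing by a tExIPAvg below 1.000 pushes the rating up, which is how a defense gets credit for working behind a staff that is hard on its fielders.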
Looking quickly at the results and comparing them with the original list from PADE, there are a few basic points to be made about teams this season:
It would be ideal to combine the two metrics, PADE and PIDE, into a single measure of defensive efficiency. Simply averaging the pitching and park baselines to find an expected performance is one easy way to accomplish this, but that assumes that park and pitching both have an equal effect on the defense. We don't know that this is so. Further, the range of baselines in PIDE (5.3%) is much larger than PADE (3.3%), though we could use standard deviations to normalize them to each other before combining. (The most likely explanation for this discrepancy is the fact that each team plays behind a unique pitching staff, while teams play in many of the same parks a comparable number of times.)
Regardless, combining PADE and PIDE in a way that accurately reflects events on the field requires a longer look and a longer column, and this one has rambled on long enough for today. In the meantime, looking at the PADE and PIDE lists side by side, broad conclusions about the true defensive performance of teams around the league can be drawn with more confidence than simply looking at Bill James' Defensive Efficiency.