July 7, 2010
Prospectus Hit and Run
Fielding and Runs Scored
Last week, ESPN's Rob Neyer picked up on my series regarding this year's dip in scoring and increase in strikeout rates. Echoing a familiar refrain from the comments to both pieces that I received—both online and via the airwaves, where I've discussed the two articles at least half a dozen times—Rob suggested that one reason for both trends may be that teams are focusing more on defense than they have in previous years.
Anecdotally, this certainly rings a bell. In the wake of Moneyball a few years ago, we all heard about A's general manager Billy Beane shifting attention from on-base percentage and plate discipline to defense because that's where the new inefficiencies were in the market for talent. Two years ago, the Rays made a stunning turnaround thanks to an historic 54-point improvement in their Defensive Efficiency. Last year, the Rangers followed that template and emerged as contenders thanks to a 29-point turnaround in that department. Over the winter, GMs of both the Mariners (Jack Zduriencik) and the Red Sox (Theo Epstein) emphasized defense in their off-season plans. Apparently, all the cool kids are doing it.
If it's the case that teams are placing an increasing emphasis on defense, the first place we'd expect this to show up isn't in errors, which are entirely based upon the subjective decisions of official scorers, or in FRAA, UZR, or some other new-fangled defensive metric which has its own issues regarding subjectivity. No, the first place we'd expect a change in defensive performance to show up is in the frequency with which teams convert batted balls into outs, measured either via batting average on balls in play (BABIP) or Defensive Efficiency (DE). While the five aforementioned teams rank among the top six in the current AL Defensive Efficiency rankings, the evidence that such a shift in philosophy is driving scoring rates down this year is faint at best:
Though it takes a hike out to the fourth decimal to find it, this year's BABIP and DE were actually both down from 2009 levels through Sunday. The two can fall at the same time because the rate of hitters reaching on error, which counts against DE but not against BABIP, is higher than last year, rising from 0.93 percent of all plate appearances to 1.04 percent. Not exactly what we'd expect if defense were actually being, y'know, emphasized. Shifting our gaze to recent seasons, the year-to-year changes in BABIP and DE don't line up tremendously well with the recent changes in scoring rates. The DE for all 30 teams was four points higher in 2003, the year Moneyball was published, yet scoring was about three-tenths of a run higher. The same can more or less be said for the entire 2001-05 span.
To put it another way, in the grand scheme, DEs and BABIPs correlate very well with scoring levels over the full range of years shown above, at -.87 and .82, respectively, while for strikeouts it's just .38. But if we narrow our focus to ditch the pre-expansion, pre-strike, pre-juiced ball years and use only complete seasons, the relationship becomes less clear. For the range 1996-2009, the correlations fall to -.71 and .54; for the range 2004 (the year after Moneyball was published) to 2009, it's -.62 and .52.
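For readers who want to run this kind of check themselves, here is a minimal sketch of the Pearson correlation behind figures like those above. The function name `pearson_r` and the season values are illustrative assumptions; the numbers are made-up placeholders, not the league data discussed in this piece.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and the two sums of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical placeholder seasons (NOT real MLB figures):
# league Defensive Efficiency and runs per team per game.
de_by_season = [0.700, 0.695, 0.690, 0.688, 0.692]
rpg_by_season = [4.30, 4.55, 4.80, 4.85, 4.70]

r_all = pearson_r(de_by_season, rpg_by_season)              # full range
r_recent = pearson_r(de_by_season[-3:], rpg_by_season[-3:])  # narrowed range
print(f"all seasons: {r_all:.2f}, last three: {r_recent:.2f}")
```

Narrowing the range of seasons, as done above for 1996-2009 and 2004-2009, amounts to slicing both lists the same way before calling the function. Keep in mind that a correlation over a handful of seasons is far noisier than one over two decades.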
But that's not the end of the story. As Rob astutely pointed out, "If you're selecting for defense, you're taking runs away from your opponents and you're taking runs away from yourself. It's a two-fer." Which again makes sense, as there are plenty of slick-fielding glove men out there who can't hit their hat size. Think of the lineup hit the Mariners took for employing Casey Kotchman (.208 TAv), or the Astros for Pedro Feliz (.209), or the Orioles for Cesar Izturis (.197), not to mention plenty of other players who are above replacement level but carry bats a bit light for their respective positions because they're perceived as top-notch defenders.
If teams are actually choosing glove men more often, we might expect to see the level of offense supplied by the more bat-friendly positions (first base, third base, left field and right field) dropping over time relative to the level supplied by the more glove-friendly positions (catcher, second base, shortstop and center field). Checking in on things like BABIP, home run and strikeout rates, we find that the two classes of players have paralleled each other in rather striking fashion over the past two decades:
Putting all three measures on the same scale, you can see how relatively little such measures have varied. The offense-first positions have shown higher strikeout and homer rates than the defense-first positions in every single year, and only three times did they fall behind in BABIP, twice by less than one point, and not at all since 2003.
Looking at even a crude aggregate measure of offense such as OPS is more helpful, as is another stat Baseball-Reference.com publishes called tOPS+, which is defined as "the OPS+ of this split relative to the player or team's overall OPS," except that in this case, we're looking at MLB as a whole instead of an individual player or team. The formula for this is 100 * ((split OBP / total OBP) + (split SLG / total SLG) - 1). At first glance, the data is hardly a slam dunk:
Either in terms of tOPS+ or simply the raw OPS gap between the two classes, the gap is narrower than it was five or 10 years ago but still not as close as it was prior to the sweeping changes which began taking hold in 1993. Over the 21-season range, there's a definite correlation (.69 for tOPS+, .76 for raw OPS), though if we narrow the focus to the post-strike years, the correlations drop to .36 and .45, respectively. Limiting the focus to the post-Moneyball era full seasons (2003-09)—in case any other front offices were taking note, backlash be damned—they climb back to .62 and .72. So perhaps there is something to Rob's hypothesis.
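The tOPS+ formula quoted earlier is simple enough to express directly. What follows is a small sketch, with `tops_plus` as an assumed function name and the sample OBP/SLG values as hypothetical placeholders rather than actual positional splits.

```python
def tops_plus(split_obp, split_slg, total_obp, total_slg):
    """tOPS+ of a split relative to the overall line:
    100 * ((split OBP / total OBP) + (split SLG / total SLG) - 1)."""
    return 100 * ((split_obp / total_obp) + (split_slg / total_slg) - 1)

# A split identical to the overall line scores 100 by construction.
baseline = tops_plus(0.330, 0.420, 0.330, 0.420)

# Hypothetical splits (NOT real data): a bat-first group hitting a bit
# better than the league, and a glove-first group hitting a bit worse.
bat_first = tops_plus(0.340, 0.440, 0.330, 0.420)
glove_first = tops_plus(0.320, 0.400, 0.330, 0.420)

# The gap between the two classes is the kind of figure tracked above.
gap = bat_first - glove_first
print(f"bat-first {bat_first:.1f}, glove-first {glove_first:.1f}, gap {gap:.1f}")
```

Because both ratios are taken against the same league-wide line, the measure adjusts automatically for the overall offensive level of a given season, which is why it is useful for comparing years with very different scoring environments.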
For the purposes of comparison, I've built a table summarizing the correlations between these various elements and run scoring over the course of the 1990-2010 period and the full-season continuum of 2009:
TB/H is the rate of total bases per hit, the Power Factor which paleo-sabermetrician Eric Walker has invoked as evidence that the ball itself has changed. Note that in just about every case, the correlations are the largest for the samples that include the pre-strike era, the divide across which we've seen most of these rates change substantially.
As it turns out, we can use linear regression on the 21 seasons of data discussed above and in my pieces of recent weeks to construct a fairly accurate model to predict scoring levels. Using just home run rate and Defensive Efficiency, we can build a model which produces a correlation of .93 (an r-squared of .87) and a standard error of 0.10 runs per game. Adding Power Factor to the mix, we can get to a correlation of .978 (r-squared of .957) and a standard error of 0.059 runs per game. We can push it even further by adding strikeout rate, to a correlation of .984 (r-squared of .969) and a standard error of 0.052 runs per game. Adding the batman/gloveman differential in either form doesn't advance the cause, however, and actually increases the standard error a hair, so we'll leave it aside.
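For anyone who wants to try this sort of fit at home, here is a rough sketch of a multiple linear regression using NumPy's least-squares solver. It is not the exact procedure used for this article: the function name `fit_scoring_model`, the synthetic inputs, and the choice of standard error (residual standard error with n - k - 1 degrees of freedom) are all assumptions made for illustration.

```python
import numpy as np

def fit_scoring_model(predictors, runs_per_game):
    """Ordinary least squares fit of runs per game on the given predictors.

    predictors: one row per season, one column per rate (e.g. HR rate, DE).
    Returns (coefficients with intercept first, r-squared, standard error).
    """
    X = np.asarray(predictors, dtype=float)
    y = np.asarray(runs_per_game, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])   # prepend an intercept column
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coefs
    sse = float(resid @ resid)                  # unexplained variation
    sst = float(((y - y.mean()) ** 2).sum())    # total variation
    r_squared = 1.0 - sse / sst
    std_error = (sse / (n - k - 1)) ** 0.5      # assumed definition
    return coefs, r_squared, std_error

# Synthetic stand-in for 21 seasons (NOT real MLB data).
rng = np.random.default_rng(0)
hr_rate = rng.uniform(0.020, 0.030, size=21)
de = rng.uniform(0.685, 0.705, size=21)
rpg = 10.0 + 60.0 * hr_rate - 10.0 * de + rng.normal(0.0, 0.05, size=21)

coefs, r2, se = fit_scoring_model(np.column_stack([hr_rate, de]), rpg)
print(f"r = {r2 ** 0.5:.3f}, r-squared = {r2:.3f}, std error = {se:.3f}")
```

Adding Power Factor or strikeout rate is just a matter of stacking another column into the predictor matrix. Note that r-squared can never go down when a column is added, which is why the standard error, with its degrees-of-freedom penalty, is the better guide to whether an extra variable earns its keep.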
The nasty-looking but nonetheless robust formula we're left with is:
That formula will get you inside of one-tenth of a run per game for any year from 1990 onward, and it's a whole lot closer than that over the last few years:
Not too shabby, eh? What's interesting is that the model suggests that 2010 scoring levels could actually drop based upon the more elemental measures we've observed thus far.
Which doesn't entirely answer the question once and for all about whether an increased emphasis on defense has any role in the change in scoring levels. But over the long-term period covered here, such an emphasis doesn't appear to be a driving factor the way that rising home run and strikeout rates—themselves reflective of physical, philosophical, and technological changes in the game—have been. Over the shorter term, the effect would appear to be small at best, the stuff more of a few good stories told about enlightened front offices zigging while the rest of the field zagged than of the type of trend readily apparent from at least one person's archaeological dig through the stat sheets.