I suppose one way to generate column ideas is to screw up.

Yesterday, a piece I wrote ran, making the case that the National League has become the superior circuit. To bolster my argument, I included a chart showing that at most positions, the NL had more of the top hitters.

Many people wrote in to point out that because the NL has more teams than the AL, it would be expected to have more of the players in any one grouping.

E: Sheehan (21)
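
The readers' objection comes down to simple arithmetic, which can be sketched roughly as follows. The team counts used here (16 in the NL, 14 in the AL) and the top-10 list size are assumptions for illustration, not figures taken from the chart in question.

```python
from fractions import Fraction


def expected_share(teams_in_league: int, teams_total: int) -> Fraction:
    """Share of any leaderboard a league would claim if talent were
    spread evenly across every team in the majors."""
    return Fraction(teams_in_league, teams_total)


# Assumption: 16 NL teams and 14 AL teams.
NL_TEAMS, AL_TEAMS = 16, 14
nl_share = expected_share(NL_TEAMS, NL_TEAMS + AL_TEAMS)

# Hypothetical leaderboard size, chosen only for illustration.
top_n = 10
expected_nl = float(nl_share * top_n)

print(f"Expected NL share of any grouping: {float(nl_share):.1%}")
print(f"Expected NL players in a top {top_n}: {expected_nl:.1f}")
```

Under those assumptions, the NL would be expected to hold a bit more than half of any list before talent enters the picture at all, so a raw count of top hitters overstates its edge.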

The chart stands out as a classic example of a problem you often see in sports coverage, and one that we've spent a lot of time railing against: choosing data to support a preconceived conclusion, rather than examining data to see what's there. I went into the piece convinced that the NL is currently a better league than the AL, and when the method I chose to illustrate the point confirmed that notion, I didn't spend a lot of time questioning it.

Now, I stand by my conclusion and, for that matter, by the rest of the article. I believe the National League has more talent right now, top to bottom, and far fewer hopeless cases than the American League. But the data presented was egregiously flawed and poorly thought-out, and it blew a hole right through the middle of the piece.

Thanks to the many readers–and BP staffers–who took the time to write in and point out the problems. I can do better, and the readership deserves that.

On to other old business…Monday's column on defensive performance generated a lot of e-mail, some of which I'll address here:

  • What makes this calculation viable is the idea that pitchers do not have much control over what happens on balls in play. For more on this concept, read Voros McCracken's piece, published here last January.

    To paraphrase, and hopefully not too blithely, the batting average on balls in play doesn't vary much from pitcher to pitcher, so what happens once a ball is in play is a function of luck and defense. This tool–balls in play converted to outs–attempts to measure the defense.

  • Yes, you can (and should) park-adjust the Defensive Efficiency numbers. I didn't do so because park-adjusting a month's worth of data is problematic, and could be more misleading than anything else.

    It's fair to say that the parks with a lot of foul territory (are there any left other than Network Associates State Park?) help a defense's numbers, while it would take a historic performance for the Rockies to ever do well by this metric.

  • Keep in mind that this is one month's worth of information, and as such, is subject to the same caveats we'd place on one month of any performance.

  • A number of people suggested that I include errors in the calculation. I didn't because I have no good way of separating fielding errors (out/safe ones) from throwing errors or other plays that don't cost an out. It's a fair point, though, especially in extreme cases; almost all the people who mentioned this referenced the Mets, with their high-profile errors and high unearned-runs total.

    It probably wouldn't hurt to run DE in a chart with a list of errors as well. The two numbers together probably provide a better picture than DE alone; they're certainly better than something like range factor or fielding percentage.
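
For anyone who wants to try this at home, here is a minimal sketch of the idea of balls in play converted to outs. The formula is one common approximation and not necessarily the exact calculation behind Monday's numbers: it assumes balls in play can be estimated from a team's pitching line as at-bats minus strikeouts minus home runs, plus sacrifices, and it ignores errors, as discussed above. The function name and the sample totals are hypothetical.

```python
def defensive_efficiency(at_bats, hits, home_runs, strikeouts,
                         sac_hits=0, sac_flies=0):
    """Approximate share of balls in play that the defense turned into outs.

    Assumption: balls in play = AB - SO - HR + SH + SF, using totals
    allowed by the team's pitchers. Errors are ignored entirely.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_hits + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to evaluate")
    # Home runs never reach a fielder, so they come out of the hit total too.
    hits_in_play = hits - home_runs
    return 1 - hits_in_play / balls_in_play


# Hypothetical one-month totals allowed by a team's pitching staff.
de = defensive_efficiency(at_bats=900, hits=240, home_runs=30, strikeouts=170)
print(f"Defensive Efficiency: {de:.3f}")
```

With these made-up totals, 700 balls were put in play and 210 of them fell for hits, so the defense converted 70 percent into outs. Nothing here is park-adjusted, and a month of data carries all the caveats mentioned above.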

I should reiterate that DE is a quick-and-dirty method for measuring a team's range. I happen to like it because it's fairly easily calculated and it's clear what the number is measuring. I think it's important as additional information when we're trying to see why certain teams might be over- or under-performing their expectations, or preventing or allowing more runs than their pitchers' performances would indicate. It's not a magic bullet, just an interesting tool for measuring something–team defense–for which there is no magic bullet. 
