September 17, 2003
Can Of Corn
Game Scores, v2.0
In the 1988 Baseball Abstract, Bill James introduced what he admitted was a "garbage stat"--the game score. For the uninitiated, the game score is a number, generally ranging from zero to 100, that's used to evaluate a starting pitcher's performance in any given outing. It's scaled so that 50 is roughly an average start, 90 or higher is gem status and anything below, say, 15 is in Jaime Navarro territory. Like the man said, it's a junk metric, but it's an entertaining one.
Here's how game scores are calculated:
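For reference, the formula as it is usually published: start with 50, add one point per out recorded, two per inning completed after the fourth and one per strikeout, then subtract two per hit, four per earned run, two per unearned run and one per walk. A minimal Python sketch of that tally follows; the function and argument names are illustrative rather than James' own, and the sample pitching line is made up.

```python
def game_score(outs, hits, earned_runs, unearned_runs, walks, strikeouts):
    """Original game score, following the commonly published formula."""
    innings_completed = outs // 3                 # only full innings count toward the bonus
    score = 50                                    # every start begins at 50
    score += outs                                 # +1 per out recorded
    score += 2 * max(0, innings_completed - 4)    # +2 per inning completed after the fourth
    score += strikeouts                           # +1 per strikeout
    score -= 2 * hits                             # -2 per hit allowed
    score -= 4 * earned_runs                      # -4 per earned run
    score -= 2 * unearned_runs                    # -2 per unearned run
    score -= walks                                # -1 per walk
    return score


# Hypothetical line: 7 IP, 5 H, 2 ER, 0 unearned, 2 BB, 6 K
print(game_score(outs=21, hits=5, earned_runs=2, unearned_runs=0,
                 walks=2, strikeouts=6))  # 63
```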
And there you have it. It's quick and dirty, but it's also a sound thumbnail evaluation of how a starter has performed on any given day. That doesn't mean, however, that it can't be improved upon.
Game scores, as mentioned, were conceived 15 years ago, and in the intervening period we've learned a great deal about how much control pitchers exert over certain events. In light of this, perhaps it's time to roll out Game Scores 2.0, with an eye toward what we now know about the art of pitching.
As many of you know, semi-recent research has found that pitchers don't have as much control over what becomes of balls in play as previously believed. Voros McCracken's initial findings suggested that pitchers had almost no control over the fate of a ball once it left the bat (provided it stayed in the park). Subsequent research by Keith Woolner and Tom Tippett found that while pitchers didn't have a great deal of influence over whether balls in play were converted into outs or fell as base hits, they did have a modicum of control over these events. Whoever's right (I side with the latter position), pitchers appear to have much less influence with regard to hit prevention than we once thought. This principle--separating what pitchers control absolutely from what they control only partially--is where the rubber meets the road for GS 2.0.
Here's how I've decided, by ruthless fiat, Game Scores 2.0 will be tallied:
Allow me to highlight the differences for a moment. James' method awards two points for each inning completed after the fourth. I find this to be a little too accommodating, and I've instead opted to go with the traditional quality-start definition. So no inning points are given out until the pitcher completes the sixth inning.
In keeping with the idea that the pitcher should be rewarded more for the things over which he has the most control, I've doubled the strikeout points from one to two. Likewise, I've increased the walk penalty from one to four but confined it to unintentional walks. While James makes no provision for home runs--the capital offense of pitching--I've opted to dock a hefty seven points for such a transgression.
James provides different penalties for earned and unearned runs. In all honesty, I think errors of any sort should be expunged from the statistical record (this is a column for another day), so I'm not going to make any distinctions between earned and unearned runs. The two are subjectively interchangeable, depending on what kind of mood the official scorer is in. To avoid double deductions for the same run, runs that score on a home run aren't charged a second time; only non-homer runs carry the run penalty.
It might seem counterintuitive that hits allowed cost the pitcher more in version 2.0 than in the original, since my whole premise is applying more weight to those events the pitcher himself holds sway over. Well, think of it in terms of ratios. The original game score apportions point values (ignoring the + or - signs for a moment) to strikeouts, walks and hits in the ratio 1:1:2. GS 2.0, meanwhile, goes with a 2:4:3 ratio. Although hits have a greater stake in 2.0 in terms of raw points, their level of impact in comparison to strikeouts and walks is reduced significantly. Since the pitcher has absolute control over bases on balls, the impact of those on the game score has been drastically increased. The pitcher does determine whether a ball is put into play and has at least some control over what becomes of it once it is in play, so hits allowed warrant a meaningful point value.
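The ratio argument is easy to check with a little arithmetic. The sketch below takes the two sets of point values quoted above (strikeout, walk, hit, signs ignored) and works out what share of the combined weight each event carries; the names are illustrative only.

```python
from fractions import Fraction

# (strikeout, walk, hit) point values, signs ignored, as quoted in the text
ORIGINAL = (1, 1, 2)
GS2 = (2, 4, 3)


def shares(weights):
    """Return each weight as an exact fraction of the combined total."""
    total = sum(weights)
    return tuple(Fraction(w, total) for w in weights)


for label, weights in (("original", ORIGINAL), ("GS 2.0", GS2)):
    k, bb, h = shares(weights)
    print(f"{label}: K {float(k):.0%}, BB {float(bb):.0%}, H {float(h):.0%}")
```

Hits drop from half of the combined weight (2 of 4) to a third (3 of 9), while walks climb from a quarter to four-ninths.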
So how does 2.0 stack up in comparison? Glad I asked. The simplest way to do this is to look at a few sample games and see how the two methods evaluate the performances.
For starters, let's take Kerry Wood's now famous 20-strikeout hate crime against the Astros in 1998. James' version (OK, to be fair they're both, of course, James' versions; I'm just adding some tweaks) gives Wood a game score of 105--the highest total ever for a nine-inning game. GS 2.0, meanwhile, gives Wood a lofty 122, mostly by dint of his ridiculously high K total in tandem with zero walks.
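Wood's 122 can be reproduced from the weights described earlier, under a few assumptions that aren't spelled out in the text: the 50-point base and one point per out carry over from the original, and the inning bonus is two points for each inning completed from the sixth on. The three-point hit penalty is read off the 2:4:3 ratio. The per-run penalty for non-homer runs isn't stated, so this sketch refuses to guess it. Wood's one hit and zero runs allowed come from the box score, and all names here are hypothetical.

```python
def game_score_2(outs, hits, non_hr_runs, home_runs, unintentional_walks,
                 strikeouts, non_hr_run_penalty=None):
    """Rough GS 2.0 tally; lines marked ASSUMPTION are not confirmed by the column."""
    if non_hr_runs and non_hr_run_penalty is None:
        # The column does not state this weight, so the caller must supply it.
        raise ValueError("non-homer run penalty is not given; supply one")
    innings_completed = outs // 3
    score = 50 + outs                              # ASSUMPTION: base and out points as in the original
    score += 2 * max(0, innings_completed - 5)     # ASSUMPTION: +2 per inning completed, sixth onward
    score += 2 * strikeouts                        # strikeouts doubled to two points
    score -= 4 * unintentional_walks               # four points per unintentional walk
    score -= 7 * home_runs                         # seven points per home run
    score -= 3 * hits                              # three points per hit, from the 2:4:3 ratio
    if non_hr_runs:
        score -= non_hr_run_penalty * non_hr_runs  # weight supplied by the caller
    return score


# Kerry Wood, 1998: 9 IP, 1 H, 0 R, 0 BB, 20 K
print(game_score_2(outs=27, hits=1, non_hr_runs=0, home_runs=0,
                   unintentional_walks=0, strikeouts=20))  # 122
```

That these assumptions land on 122 is encouraging, but a single matching game doesn't confirm them.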
One of the games mentioned in James' original article on game scores is Tom Seaver's 12-inning opus on May 1, 1974. James' GS gives Seaver a 106 (a no-decision for Tom Terrific, by the way), while 2.0 slaps him with a fairly outrageous 112. Striking out 16 and unintentionally walking only one in 12 innings of work will do that for you. Try it sometime.
And what kind of performances does 2.0 think less of than its predecessor? Looks like a job for Kirk Rueter, the bane of sabermetric systems near and far. On April 23 of last season, Rueter bested the Cubs in Wrigley, surrendering two runs and striking out one over seven innings. Since he also allowed only two hits, the old system festooned his day with a commendable game score of 63. GS 2.0, however, was roundly unimpressed and gave Rueter a paltry 47 for his troubles--a below-average effort according to 2.0.
It's clear that GS 2.0 favors high-strikeout, low-walk dominators over finesse, let-the-defense-do-the-work types. Fair or not, that's more in keeping with what we know about pitching skills in the here and now.