June 19, 2009
Checking the Numbers
In the first inning of an August 13, 2006 game between the Dodgers and the Giants, fans bore witness to an historic matchup, as Greg Maddux squared off against Barry Bonds. The two had faced each other many times throughout their illustrious careers prior to this specific game, with Bonds accruing 34 hits in 120 at-bats against the four-time Cy Young Award winner, but this particular matchup marked the first time in the history of the sport that a 300-game winner would do battle with a member of the 700 home-run club. Even though both players were on the wrong side of 40 years old, this heavyweight matchup still garnered main-event status. Neither disappointed in that first plate appearance, with both players displaying everything we had come to know about their respective games in a short, two-pitch sequence. After Bonds exhibited his tremendous batting eye, taking a two-seamer with incredible movement that sailed just outside for a ball, he smashed the second pitch up the middle. Maddux, not so arguably the greatest-fielding pitcher of all time, snared the scorching liner in what seemed like a millisecond and doubled Ray Durham off of first base. This confluence of events exemplified the intrigue of the batter vs. pitcher matchup.
What should the hurler throw in a specific situation? How should the hitter alter his approach from the previous trip to the plate? Can he decipher any tells or tipping of pitches? Does the pitcher know which offering the batter is expecting? Many more such questions could flood the firstname.lastname@example.org inbox, reinforcing the idea that much of the variance in the sport can be attributed to different showdowns: some good, some great, and some awful. It stands to reason, then, that when the best of the best square off, the advantages become relatively unclear. This weekend, the Royals will play the Cardinals, and though Zack Greinke and Albert Pujols will not get a chance to face each other, many are left wondering if a showdown of that magnitude would result in the standard Pujols mastery, or if Greinke would make Albert look like Luis Pujols at the plate.
Entering this season, Greinke and Pujols had faced each other in eight plate appearances, with Pujols going 3-for-6 with a double, two walks, and one strikeout. His .500/.625/.667 line may lead some to believe that he owns Greinke, or simply knows how to handle the Royals ace, but that's not necessarily true. In actuality, the Pujols-Greinke matchup perfectly illustrates the main reason why batter/pitcher statistics need to be taken with handfuls of salt. For starters, the first three plate appearances between these two took place on June 25, 2004, during Greinke's rookie campaign. The next three were amassed eleven months later, in a May 20, 2005 contest. The two would not meet again until June of 2007, three years after their initial encounter, when Greinke induced a Pujols groundout on June 14, and issued him a walk on June 20. Suffice it to say, the Greinke of 2004-07 was an entirely different pitcher from the one who figured things out down the stretch last year, and who has held hitters to .223/.259/.319 this year.
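For readers who want to check the arithmetic behind that slash line, here is a minimal sketch in Python. The function names are my own, and the sketch assumes no hit-by-pitches or sacrifice flies in the sample, which would otherwise enter the on-base percentage.

```python
def slash_line(ab, hits, doubles=0, triples=0, homers=0, walks=0):
    """Return (AVG, OBP, SLG). Assumes no hit-by-pitches or sacrifice flies."""
    singles = hits - doubles - triples - homers
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    avg = hits / ab
    obp = (hits + walks) / (ab + walks)
    slg = total_bases / ab
    return avg, obp, slg


def fmt(rate):
    # Baseball convention drops the leading zero: 0.5 becomes ".500"
    return f"{rate:.3f}".lstrip("0")


# Pujols vs. Greinke as described above: 3-for-6, one double, two walks
pujols_vs_greinke = slash_line(ab=6, hits=3, doubles=1, walks=2)
line = "/".join(fmt(rate) for rate in pujols_vs_greinke)
print(line)  # .500/.625/.667
```

Six at-bats and two walks make up the eight plate appearances, and the lone double accounts for the gap between the batting average and the slugging percentage.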
In addition to dealing with different circumstances inherent in a data set, the problem of small sample sizes persists. Knowing what Pujols has done in eight plate appearances against Greinke over a four-year span that ended two seasons ago does not help us understand what will happen in their next meeting. Just as 10 PAs for a hitter would be assigned heavy doses of skepticism when trying to figure out his talent level, definitive claims based on such small samples of data between a batter and pitcher must be avoided. When trying to deduce the odds of success in a given matchup, each player's projection and overall track record are going to be much more accurate predictors than eight or so plate appearances spread out over four seasons. In the case of Bonds vs. Maddux, things are a little different, since they had faced each other several times each season while simultaneously evolving as players. In addition, by the time their data became statistically significant, both were nearing the end of their careers.
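To put a rough number on the small-sample problem, here is a sketch that treats each at-bat as an independent coin flip with a fixed chance of a hit. That is a simplifying assumption of mine, not a claim about how matchups actually work, and the .300 hitter is hypothetical.

```python
import math


def batting_average_se(true_avg, at_bats):
    """Standard error of an observed batting average under a simple
    binomial model (independent at-bats, constant hit probability)."""
    return math.sqrt(true_avg * (1 - true_avg) / at_bats)


# Hypothetical .300 hitter observed over samples of different sizes
for n in (6, 60, 600):
    print(n, round(batting_average_se(0.300, n), 3))
```

Under these assumptions, six at-bats carry a standard error of roughly 187 points of batting average, which is far too wide to separate a great hitter from a poor one. It takes something closer to a full season of at-bats for the noise to shrink below 20 points.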
This does not preclude us from investigating or having some fun, however, as something may exist in the data with regard to how the collective group of best hitters fares against top-of-the-line pitchers. To find out, I queried my database for the aggregate batter-pitcher matchup data from 1999-2008, covering plate appearances in which pitchers with at least 100 innings and an ERA no higher than 3.85 in a season faced hitters with 400+ PA and an OPS of 900 or better. Here are the slash stats and OPS for the elite matchups during each season:
Year   AVG/OBP/SLG      OPS
1999   .279/.348/.512   860
2000   .281/.368/.488   856
2001   .284/.365/.523   888
2002   .285/.375/.512   887
2003   .271/.354/.488   842
2004   .288/.385/.528   913
2005   .278/.360/.504   864
2006   .273/.361/.503   864
2007   .284/.372/.527   899
2008   .286/.367/.525   892
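The selection criteria behind this table can be sketched roughly as follows. This is not the actual query I ran. The record types, field names, and the toy rows are invented purely for illustration, and the on-base calculation ignores hit-by-pitches and sacrifice flies.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    pitcher: str
    year: int
    innings: float
    era: float


@dataclass
class HitterSeason:
    hitter: str
    year: int
    pa: int
    ops: float


@dataclass
class Matchup:
    year: int
    pitcher: str
    hitter: str
    ab: int
    hits: int
    total_bases: int
    walks: int


def is_elite_pitcher(p):
    # At least 100 innings and an ERA no higher than 3.85 in the season
    return p.innings >= 100 and p.era <= 3.85


def is_elite_hitter(h):
    # 400+ PA and an OPS of 900 or better in the season
    return h.pa >= 400 and h.ops >= 0.900


def elite_matchup_line(year, pitchers, hitters, matchups):
    """Aggregate AVG, OBP, SLG, OPS for elite-vs-elite matchups in one season."""
    elite_p = {p.pitcher for p in pitchers if p.year == year and is_elite_pitcher(p)}
    elite_h = {h.hitter for h in hitters if h.year == year and is_elite_hitter(h)}
    rows = [m for m in matchups
            if m.year == year and m.pitcher in elite_p and m.hitter in elite_h]
    ab = sum(m.ab for m in rows)
    if ab == 0:
        return None  # no qualifying matchups that season
    hits = sum(m.hits for m in rows)
    walks = sum(m.walks for m in rows)
    total_bases = sum(m.total_bases for m in rows)
    avg = hits / ab
    obp = (hits + walks) / (ab + walks)
    slg = total_bases / ab
    return avg, obp, slg, obp + slg


# Toy rows with placeholder names; none of this is real data.
pitchers = [PitcherSeason("pitcher_a", 2008, 210.0, 3.10),
            PitcherSeason("pitcher_b", 2008, 180.0, 4.60)]
hitters = [HitterSeason("hitter_x", 2008, 650, 0.950),
           HitterSeason("hitter_y", 2008, 500, 0.780)]
matchups = [Matchup(2008, "pitcher_a", "hitter_x", 10, 3, 5, 2),
            Matchup(2008, "pitcher_b", "hitter_x", 8, 4, 9, 1),
            Matchup(2008, "pitcher_a", "hitter_y", 9, 2, 2, 0)]

result = elite_matchup_line(2008, pitchers, hitters, matchups)
```

In the toy rows, only the pitcher_a vs. hitter_x matchup survives both filters, so the aggregate line reflects that single pairing. Note that qualification is judged season by season, so the same player can be in the elite pool one year and out of it the next.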
The data here is not at all cut and dried in terms of evaluating which side got the best of the other. Context is key, as we need to know how these slash stats compare to what the hitters did against everyone else. Looking at 1999, the hitters in the sample produced an 860 OPS against the elites. Compare that to the 995 OPS the hitters boasted at the expense of all other pitchers, and it becomes easier to see that the pitchers really did do a good job in making the hitters appear mortal at the dish. The table below quantifies the deltas, or differences, between the OPS of these hitters against the elite pitchers and their OPS against everyone else:
Year   Elite   Other   Delta
1999   860     995     -.135
2000   856     1009    -.153
2001   888     1036    -.148
2002   887     1029    -.142
2003   842     999     -.157
2004   913     1006    -.093
2005   864     996     -.132
2006   864     1001    -.137
2007   899     1003    -.104
2008   892     997     -.105
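The deltas are simple subtraction, and they can be reproduced from the two OPS columns with a few lines of Python. The variable names are mine, and the values are expressed in OPS points (thousandths) rather than as decimals.

```python
# OPS (in points) of the elite hitters against elite pitchers, by season
elite = {1999: 860, 2000: 856, 2001: 888, 2002: 887, 2003: 842,
         2004: 913, 2005: 864, 2006: 864, 2007: 899, 2008: 892}

# OPS (in points) of the same hitters against all other pitchers
other = {1999: 995, 2000: 1009, 2001: 1036, 2002: 1029, 2003: 999,
         2004: 1006, 2005: 996, 2006: 1001, 2007: 1003, 2008: 997}

# Negative delta means the hitters fared worse against the elite pitchers
deltas = {year: elite[year] - other[year] for year in elite}

mean_delta = sum(deltas.values()) / len(deltas)
smallest_gap_year = max(deltas, key=deltas.get)  # delta closest to zero
largest_gap_year = min(deltas, key=deltas.get)   # most negative delta

print(deltas)
print(mean_delta, smallest_gap_year, largest_gap_year)
```

Across the ten seasons, the unweighted average drop works out to about 131 points of OPS, with 2004 the narrowest gap (93 points) and 2003 the widest (157 points). The average weights each season equally, since the table does not list plate appearance totals.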
The pitchers won out in each of these seasons, and by fairly large margins at that. Keep in mind, though, that this data holds little predictive value outside of the fact that elite pitchers are bound to fare much better than others when facing the top-tier hitters. The elite pitchers didn't exactly turn these hitters into Bloomquist clones, but routinely reducing an opposing player's OPS by roughly 100 points or more is no easy feat. There will certainly be hitters who perform at a high level regardless of whether Johan Santana or Radhames Liz is toeing the rubber, but the changes in talent and circumstances, as well as the small samples of data accrued over longer time spans, make specific batter-pitcher data difficult to gauge and therefore less meaningful. Is it interesting to know that Hitter A is 7-for-29 against Pitcher B in his career? While nobody would refute its titillating nature, or deny a hitter's ability to adjust his approach based on past experiences with specific hurlers, the knowledge of Hitter A's .325 batting average over the past three seasons is more likely to indicate his chances of success in this specific plate appearance than his .241 average against our hypothetical pitcher. Matchup data between batters and pitchers can be interesting and noteworthy, but we have to be careful not to treat it as gospel, while also understanding what the information does and does not explain.
A version of this story originally appeared on ESPN Insider.