July 31, 2009
Checking the Numbers
A few weeks back, while sitting in an advanced financial markets course, I took a break from designing the target capital structure of a hypothetical corporation and called up MLB Gameday on the laptop. The Cubs were in town to play the Phillies, and the game simply grabbed a good chunk of my attention. Before critiquing my ethics as a student, know that I get high marks, so taking a break here and there to zone out and watch baseball is something I earn and reward myself with. [Ed. Note: What right-thinking individual wouldn't?] What grabbed an even stronger hold of my attention was the pitching matchup that night: Rodrigo Lopez vs. Ted Lilly. Granted, both pitchers had journeyed to this particular point in drastically different fashions (Lilly via a lucrative multi-year deal signed two years prior, and Lopez via above-average performance in Triple-A while working his way back from elbow surgery), but each sported identical 3.18 ERAs as the game began. Though Lilly had produced his ERA in 119 innings, way more than the 11
While an otherwise normal person might spot the similar numbers and be capable of moving on to something, anything, more meaningful, questions blossomed in my head. Does this happen frequently? What is the latest date in a season for such an occurrence? If stats other than ERA are factored in, could we find the most identical matchup ever? Has anyone ever squared off against an earned run prevention doppelganger on more than one occasion in a season? Same question, but how about over a span of years? An in-class quiz revolving around proxy firms and betas caused the questions to subside for a bit, but after some tedious database work, I finally quenched my useless information thirst. Keep in mind, too, that I am not discussing matchups of identical true talent levels, but rather actual numbers amassed at different points throughout a season.
The first step involved the manipulation of a table featuring running totals and rates through each game in a season. Each row in the table holds a pitcher's data through a certain date or, in other words, where he stood at the end of a particular game. To find other instances of identical production entering a game, I linked the statistics accrued through appearance x with all pertinent game information (date, year, and teams involved) for appearance x+1, an example of which can be seen below:
Pitcher      G   IP     SO   ERA    Pitcher         G   IP     SO   ERA
John Danks   32  187.0  155  3.47   Nick Blackburn  32  187.0  93   4.14
This row basically explains that on September 30, 2008, John Danks squared off against Nick Blackburn in what would be Danks's 33rd appearance of the year. Entering that one-game playoff, Danks had logged 187 frames in 32 appearances, generated a 3.47 ERA, and had struck out a raw total of 155 batters. Blackburn was also making his 33rd appearance and had also logged exactly 187 innings in 32 games up to that point, quite a coincidence given that the row had been chosen completely at random. This process was repeated for each game in every season from 1989-2008, incorporating statistics like IP, ERA, SO, HR, LOB% and K/BB ratio. The first two starts for each pitcher in each season were left out, however, to avoid skewed results consisting of 0.00 vs. 0.00 matchups to kick-start a season.
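For anyone who wants to try the linking step at home, here is a minimal Python sketch of the idea, not the actual database work behind this article. The field names (`pitcher`, `season`, `g`, `game_id`, `ip`, `era`), the `min_prior` parameter, and the sample rows are all hypothetical. The sketch also assumes the running-totals table holds only starting pitchers and stores ERA as the displayed two-decimal string.

```python
from collections import defaultdict

def entering_rows(running):
    """Pair each pitcher's totals through appearance x with the game of appearance x+1."""
    by_pitcher = defaultdict(list)
    for row in running:
        by_pitcher[(row["pitcher"], row["season"])].append(row)

    linked = []
    for rows in by_pitcher.values():
        rows.sort(key=lambda r: r["g"])
        for prev, nxt in zip(rows, rows[1:]):
            linked.append({
                "pitcher": prev["pitcher"],
                "game_id": nxt["game_id"],  # the game he is about to pitch
                "g": prev["g"],             # appearances already logged
                "ip": prev["ip"],
                "era": prev["era"],         # ERA entering the game, as displayed
            })
    return linked

def identical_era_matchups(running, min_prior=2):
    """Return games where both starters enter with the same displayed ERA.

    min_prior=2 skips each pitcher's first two starts of a season.
    """
    by_game = defaultdict(list)
    for row in entering_rows(running):
        if row["g"] >= min_prior:
            by_game[row["game_id"]].append(row)

    matches = []
    for game_id in sorted(by_game):
        starters = by_game[game_id]
        if len(starters) == 2 and starters[0]["era"] == starters[1]["era"]:
            matches.append((game_id, starters[0]["pitcher"],
                            starters[1]["pitcher"], starters[0]["era"]))
    return matches

# Hypothetical running totals: two pitchers who meet in game "g09".
sample = [
    {"pitcher": "Pitcher A", "season": 2008, "g": 1, "game_id": "g01", "ip": "6.0", "era": "1.50"},
    {"pitcher": "Pitcher A", "season": 2008, "g": 2, "game_id": "g05", "ip": "12.0", "era": "3.00"},
    {"pitcher": "Pitcher A", "season": 2008, "g": 3, "game_id": "g09", "ip": "19.0", "era": "2.84"},
    {"pitcher": "Pitcher B", "season": 2008, "g": 1, "game_id": "g02", "ip": "5.0", "era": "5.40"},
    {"pitcher": "Pitcher B", "season": 2008, "g": 2, "game_id": "g06", "ip": "12.0", "era": "3.00"},
    {"pitcher": "Pitcher B", "season": 2008, "g": 3, "game_id": "g09", "ip": "18.0", "era": "3.50"},
]
print(identical_era_matchups(sample))
```

Comparing ERAs as two-decimal strings sidesteps floating-point equality problems; a real query would likely do the equivalent by rounding in the database.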
So, with the data available, let's answer some of the aforementioned questions. Over the last 20 seasons, only 99 of the 34,823 games in the sample involved two pitchers with identical ERAs squaring off against one another. Of the returned rows, 46 games were held before July 1 in a given year, so it isn't as if matchups in the early part of a season have any true advantage in this area. The most recent such matchup took place on August 23, 2008, when Chris Volstad and his 3.02 ERA in 44
The matchup featuring the lowest ERA came on May 3, 2002, when Kirk Rueter and Jose Rijo each sported gaudy 1.89 marks. However, that game took place early in the season, so the "award" for lowest identical ERA matchup for pitchers with higher innings totals belongs to Ed Whitson and Norm Charlton, who had put together 2.77 ERAs through 70+ innings in their game on August 4, 1990. The matchup antonym came on May 22, 2006, when Oliver Perez and Orlando Hernandez sported identically ugly 6.98 ERAs. If more innings are desired in order to deem a matchup as that with the highest ERA, our focus would shift to a July 15, 2001 matchup between Livan Hernandez and Darren Oliver, when each entered the game with a 6.07 ERA.
The matchup featuring the most innings pitched in the entire span was held on September 15, 1997, when Tom Glavine squared off against Shawn Estes; both had 3.11 ERAs, with Glavine having accrued 217 innings entering the game and Estes holding strong at 185 frames. Here are the top five matchups with the highest combination of innings pitched:
Date        Pitcher          G   IP     ERA    Pitcher        G   IP     ERA
9/15/1997   Tom Glavine      30  217.0  3.11   Shawn Estes    29  185.0  3.11
9/30/2000   Javier Vazquez   32  210.1  4.02   Glendon Rusch  30  185.2  4.02
9/23/2001   Jeff Weaver      30  207.2  4.16   Hideo Nomo     30  181.2  4.16
9/28/1996   Hideo Nomo       32  221.2  3.21   Andy Ashby     23  145.2  3.21
8/19/1989   John Smoltz      25  180.2  2.84   Doug Drabek    26  180.1  2.84
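A note on the arithmetic behind that ranking: assuming the table uses the usual baseball shorthand, where the digit after the decimal point counts thirds of an inning (so 210.1 means 210 1/3), the totals cannot simply be added as decimals. A small sketch of one way to combine them follows; the helper names `ip_to_outs` and `outs_to_ip` are made up for illustration.

```python
def ip_to_outs(ip):
    """Convert innings in baseball notation ("210.1" = 210 1/3) to outs recorded."""
    whole, _, frac = ip.partition(".")
    return int(whole) * 3 + int(frac or 0)

def outs_to_ip(outs):
    """Convert outs recorded back to baseball innings notation."""
    return f"{outs // 3}.{outs % 3}"

# Innings totals taken from the table above, deliberately listed out of order.
matchups = [
    ("Smoltz/Drabek", "180.2", "180.1"),
    ("Nomo/Ashby", "221.2", "145.2"),
    ("Glavine/Estes", "217.0", "185.0"),
    ("Weaver/Nomo", "207.2", "181.2"),
    ("Vazquez/Rusch", "210.1", "185.2"),
]

# Rank by combined outs, most first.
ranked = sorted(matchups,
                key=lambda m: ip_to_outs(m[1]) + ip_to_outs(m[2]),
                reverse=True)

for label, first, second in ranked:
    print(label, outs_to_ip(ip_to_outs(first) + ip_to_outs(second)))
```

Working in outs keeps everything in whole numbers, which is why 210.1 plus 185.2 comes out to an even 396.0 rather than 395.3.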
At least one matchup of this ilk took place in each of the 20 years in the sample, with certain years boasting more occurrences than others. The year sporting the highest total of identical ERA matchups is 2001, with a whopping nine different games. Livan Hernandez participated in two separate identical matchups throughout that season, helping to segue into the question of whether or not a single pitcher appeared on such a list multiple times in a given season. On May 9, 2001, both he and Javier Vazquez opposed one another with 6.75 ERAs, and a little over two months later, on July 15, the aforementioned Livan versus Darren Oliver matchup took place.
While Livan and a handful of others (Tom Glavine, Willie Blair, and Shawn Estes in 1997, Brett Tomko in 2002, and Brad Penny in 2005) squared off against ERA doppelgangers twice in a given season, nobody was able to accomplish this "feat" in consecutive starts. In fact, I had to go back to 1985 before finding such a situation, as Mark Langston faced earned run prevention twins in his 14th and 15th appearances of the season: on August 2, 1985, Langston and Tommy John dueled with 4.06 ERAs, and on August 8, 1985, Langston and Chris Codiroli of the Athletics squared off with 4.10 marks.
The highest overall tally in this 20-year span belongs to Tom Glavine, at four separate matchups. Glendon Rusch, Jose Rijo, Javier Vazquez, Brad Penny, Livan Hernandez, Chuck Finley, and Ted Lilly each had three different doppelganging matchups, with a grand total of 158 different pitchers partaking at least one time over the course of these two decades. Interestingly enough, Lilly's name surfaced towards the top of the leaderboards, and given that he and Lopez opposed one another while sporting 3.18 ERAs up to that point, the Cubs left-hander has now tied Glavine for the most overall identical ERA matchups since 1989. That might not be the equivalent of the Cy Young Award but, well, it's something, right?
What happens when more data gets incorporated? Has there ever been a matchup of two pitchers with identical ERAs as well as K/BB ratios? Unfortunately, the answer is no when the ratio gets computed out to four decimal places. What about if we truncate that down to the more standard and commonly used two-decimal format? Still no… never mind, then. How about if the LOB percentage is substituted for K/BB ratio? Once again, no luck when the strand rate is computed to three or four decimal places, but nine different matchups do surface once the rates are rounded off to two, one of which we have already discussed:
Date        Pitcher          Pitcher         ERA    LOB
7/13/1989   Greg Maddux      Bruce Hurst     2.93   0.77
7/24/2003   Brad Penny       Kris Benson     3.40   0.74
6/16/1991   Pat Combs        Norm Charlton   3.92   0.72
8/25/1992   Ramon Martinez   Danny Jackson   3.93   0.69
8/23/2005   Nate Robertson   Dan Haren       4.00   0.70
8/13/2008   John Maine       Jay Bergmann    4.13   0.72
8/18/2001   CC Sabathia      Pat Rapp        4.42   0.70
5/30/2004   Mike Maroth      Erik Bedard     4.91   0.70
5/12/1996   Orel Hershiser   Jim Abbott      5.67   0.67
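The effect of precision on what counts as "identical" can be shown with a short sketch. The two strand rates below are invented for illustration and are not the actual figures for any pitcher in the table; the function names are hypothetical as well. Python's `decimal` module is used so that rounding behaves the way it would on paper.

```python
from decimal import Decimal, ROUND_HALF_UP

def rounded(value, places):
    """Round a decimal string to the given number of places, halves rounding up."""
    return Decimal(value).quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)

def finest_match(a, b, max_places=4):
    """Return the most decimal places at which a and b still round to the same
    value, or None if they differ even as whole numbers."""
    for places in range(max_places, -1, -1):
        if rounded(a, places) == rounded(b, places):
            return places
    return None

# Hypothetical LOB rates for two opposing starters.
lob_a, lob_b = "0.7183", "0.7224"

for places in (4, 3, 2):
    same = rounded(lob_a, places) == rounded(lob_b, places)
    print(places, rounded(lob_a, places), rounded(lob_b, places), same)

print(finest_match(lob_a, lob_b))
```

With these made-up inputs the two rates differ at four and three places but both round to 0.72 at two, which is the kind of match the table above reports.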
Moving further, have any of these rows involved two pitchers squaring off with identical innings pitched totals in addition to the matching ERAs and LOB percentages? With my fingers crossed, I ran the query, and one row returned: on June 16, 1991, then-Phillies pitcher Pat Combs faced future Phillies hurler Norm Charlton in a battle for the ages, as both pitchers entered with 59
Given that this matchup featured three identical data components with no competition from any of the other 99 featured games, it feels pretty safe to say that the epic Combs/Charlton duel on June 16, 1991 is the most identical pitching matchup of the last 20 years. Of course, neither of these pitchers was a "true" 3.92 ERA talent or anything along those lines, but no other game since 1989 has featured two pitchers with more similar data entering the game. The aforementioned Lopez-Lilly dance did not turn out to be a game for the ages, but it did feature a pitching quirk witnessed in just 0.28 percent of qualified games over the past two decades. If you ever happen to see a pitching matchup in which both sides boast identical data in one form or another, know that it does not happen often, and you could be witnessing history… albeit in a very unimportant form.