A few weeks back, while sitting in an advanced financial markets course, I took a break from designing the target capital structure of a hypothetical corporation and called up MLB Gameday on the laptop. The Cubs were in town to play the Phillies, and the game grabbed a good chunk of my attention. Before critiquing my ethics as a student, know that I get high marks, so taking a break here and there to zone out and watch baseball is something I earn and reward myself with. [Ed. Note: What right-thinking individual wouldn’t?] What grabbed an even stronger hold of my attention was the pitching matchup that night: Rodrigo Lopez vs. Ted Lilly. Granted, the two pitchers had journeyed to this particular point in drastically different fashions (Lilly via a lucrative multi-year deal signed two years prior, and Lopez via above-average performance in Triple-A while working his way back from elbow surgery), but they sported identical 3.18 ERAs as the game began. Lilly had produced his ERA in 119 innings, far more than Lopez’s 11 1/3 and therefore much more indicative of actual performance, but the identical marks still piqued my interest.

While an otherwise normal person might spot the similar numbers and be capable of moving on to something, anything, more meaningful, questions blossomed in my head. Does this happen frequently? What is the latest date in a season for such an occurrence? If stats other than ERA are factored in, could we find the most identical matchup ever? Has anyone ever squared off against an earned run prevention doppelganger on more than one occasion in a season? Same question, but how about over a span of years? An in-class quiz revolving around proxy firms and betas caused the questions to subside for a bit, but after some tedious database work, I finally quenched my thirst for useless information. Keep in mind, too, that I am not discussing matchups of identical true talent levels, but rather actual numbers amassed at different points throughout a season.

The first step involved the manipulation of a table featuring running totals and rates through each game in a season. Each row in the table informs on the data for a pitcher through a certain date, or in other words where he stood at the end of a particular game. In order to figure out other instances of identical production entering a game, the task involved linking up the accrued statistics through appearance x with all pertinent game information, like date, year, and teams involved on appearance x+1, an example of which can be seen below:

Pitcher       G   IP     SO   ERA    Pitcher           G   IP     SO  ERA
John Danks   32  187.0  155  3.47    Nick Blackburn   32  187.0  93  4.14

This row basically explains that on September 30, 2008, John Danks squared off against Nick Blackburn in what would be Danks’s 33rd appearance of the year. Entering that one-game playoff, Danks had logged 187 frames in 32 appearances, generated a 3.47 ERA, and struck out 155 batters. Blackburn was also making his 33rd appearance and had also logged exactly 187 innings in 32 games up to that point, an interesting coincidence given that the row had been chosen completely at random. This process was repeated for each game in every season from 1989 through 2008, incorporating statistics like IP, ERA, SO, HR, LOB% and K/BB ratio. The first two starts for each pitcher in each season were left out, however, to avoid skewed results consisting of 0.00 vs. 0.00 matchups to kick-start a season.
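For readers who want to try something similar, the linking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database work behind this article: the `STARTS` rows are made up, the field layout is hypothetical, and "left out the first two starts" is interpreted here as requiring at least two prior starts from both pitchers.

```python
from collections import defaultdict

# Made-up sample rows, one per start: (game_id, pitcher, outs, earned_runs).
# Rows are assumed to be in chronological order within one season.
STARTS = [
    ("g1", "A", 18, 2), ("g1", "X", 21, 3),
    ("g2", "A", 21, 3), ("g2", "Y", 18, 1),
    ("g3", "B", 18, 2), ("g3", "X", 18, 4),
    ("g4", "B", 21, 3), ("g4", "Y", 15, 5),
    ("g5", "A", 27, 0), ("g5", "B", 27, 1),
]

# Assumption: a game counts only if both pitchers already have two starts.
MIN_PRIOR_STARTS = 2


def era(earned_runs, outs):
    """ERA = 9 * ER / IP with IP = outs / 3, rounded to two decimals."""
    return round(27 * earned_runs / outs, 2)


def entering_lines(starts):
    """Attach to each start the pitcher's totals through his previous start."""
    totals = defaultdict(lambda: [0, 0, 0])  # pitcher -> [starts, outs, ER]
    by_game = defaultdict(list)
    for game_id, pitcher, outs, earned_runs in starts:
        n, prior_outs, prior_er = totals[pitcher]
        by_game[game_id].append((pitcher, n, prior_outs, prior_er))
        totals[pitcher] = [n + 1, prior_outs + outs, prior_er + earned_runs]
    return by_game


def identical_era_games(starts):
    """Return (game_id, pitcher, pitcher, ERA) when both entering ERAs match."""
    matches = []
    for game_id, lines in entering_lines(starts).items():
        if len(lines) != 2:
            continue
        (p1, n1, outs1, er1), (p2, n2, outs2, er2) = lines
        if min(n1, n2) < MIN_PRIOR_STARTS:
            continue
        if era(er1, outs1) == era(er2, outs2):
            matches.append((game_id, p1, p2, era(er1, outs1)))
    return matches


print(identical_era_games(STARTS))
```

In the toy data only game `g5` qualifies, because pitchers A and B each enter it with five earned runs in 13 innings. Tracking outs rather than decimal innings avoids any ambiguity about what "59.2" means.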

So, with the data available, let’s answer some of the aforementioned questions. Over the last 20 seasons, only 99 of the 34,823 games in the sample involved two pitchers with identical ERAs squaring off against one another. Of the returned rows, 46 games were held before July 1 in a given year, so it isn’t as if matchups in the early part of a season have any true advantage in this area. The most recent such matchup took place on August 23, 2008, when Chris Volstad and his 3.02 ERA in 44 2/3 IP faced Yusmeiro Petit and his 3.02 ERA in 41 2/3 innings. The identical ERA matchup involving the most innings pitched that season took place ten days earlier, when John Maine of the Mets (4.13 ERA in 124 1/3 IP) took on Jason Bergmann of the Nats (4.13 ERA in 106 2/3 IP).

The matchup featuring the lowest ERA came on May 3, 2002, when Kirk Rueter and Jose Rijo each sported gaudy 1.89 marks. However, that game took place early in the season, so the “award” for lowest identical ERA matchup for pitchers with higher innings totals belongs to Ed Whitson and Norm Charlton, who had put together 2.77 ERAs through 70+ innings in their game on August 4, 1990. The opposite extreme came on May 22, 2006, when Oliver Perez and Orlando Hernandez sported identically ugly 6.98 ERAs. If more innings are required to qualify as the highest-ERA matchup, our focus shifts to a July 15, 2001 matchup between Livan Hernandez and Darren Oliver, when each entered the game with a 6.07 ERA.

The matchup featuring the most innings pitched in the entire span was held on September 15, 1997, when Tom Glavine squared off against Shawn Estes; both had 3.11 ERAs, with Glavine having accrued 217 innings entering the game and Estes holding strong at 185 frames. Here are the top five matchups by combined innings pitched:

Date        Pitcher           G   IP    ERA    Pitcher          G   IP     ERA
9/15/1997   Tom Glavine      30  217.0  3.11   Shawn Estes     29  185.0  3.11
9/30/2000   Javier Vazquez   32  210.1  4.02   Glendon Rusch   30  185.2  4.02
9/23/2001   Jeff Weaver      30  207.2  4.16   Hideo Nomo      30  181.2  4.16
9/28/1996   Hideo Nomo       32  221.2  3.21   Andy Ashby      23  145.2  3.21
8/19/1989   John Smoltz      25  180.2  2.84   Doug Drabek     26  180.1  2.84
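One wrinkle when ranking by combined innings: box-score notation writes one-third and two-thirds of an inning as ".1" and ".2", so the IP columns cannot simply be added as decimals. The sketch below, which uses only the five rows printed above and a hypothetical helper named `innings`, converts each entry to an exact fraction before summing.

```python
from fractions import Fraction


def innings(box_score_ip):
    """Convert box-score innings such as "210.1" (210 1/3) to a Fraction."""
    whole, _, thirds = box_score_ip.partition(".")
    return int(whole) + Fraction(int(thirds or 0), 3)


# (date, pitcher, IP, pitcher, IP), copied from the table above.
MATCHUPS = [
    ("9/15/1997", "Tom Glavine", "217.0", "Shawn Estes", "185.0"),
    ("9/30/2000", "Javier Vazquez", "210.1", "Glendon Rusch", "185.2"),
    ("9/23/2001", "Jeff Weaver", "207.2", "Hideo Nomo", "181.2"),
    ("9/28/1996", "Hideo Nomo", "221.2", "Andy Ashby", "145.2"),
    ("8/19/1989", "John Smoltz", "180.2", "Doug Drabek", "180.1"),
]


def combined_innings(row):
    """Total innings both pitchers had logged entering the game."""
    return innings(row[2]) + innings(row[4])


ranked = sorted(MATCHUPS, key=combined_innings, reverse=True)
for date, first, _, second, _ in ranked:
    total = combined_innings((date, first, _, second, _)) if False else None
for row in ranked:
    print(row[0], row[1], "vs.", row[3], combined_innings(row))
```

Sorting this way reproduces the order of the table, from 402 combined innings for Glavine and Estes down to 361 for Smoltz and Drabek.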

At least one matchup of this ilk took place in each of the 20 years in the sample, with certain years boasting more occurrences than others. The year sporting the highest total of identical ERA matchups is 2001, with a whopping nine different games. Livan Hernandez participated in two separate identical matchups throughout that season, helping to segue into the question of whether or not a single pitcher appeared on such a list multiple times in a given season. On May 9, 2001, both he and Javier Vazquez opposed one another with 6.75 ERAs, and a little over two months later, on July 15, the aforementioned Livan versus Darren Oliver matchup took place.

While Livan and a handful of others (Tom Glavine, Willie Blair, and Shawn Estes in 1997, Brett Tomko in 2002, and Brad Penny in 2005) squared off against ERA doppelgangers twice in a given season, nobody was able to accomplish this “feat” in consecutive starts. In fact, I had to go back to 1985 before finding such a situation, as Mark Langston faced earned run prevention twins in his 14th and 15th appearances of the season: on August 2, 1985, Langston and Tommy John dueled with 4.06 ERAs, and on August 8, 1985, Langston and Chris Codiroli of the Athletics squared off with 4.10 marks.

The highest overall tally in this 20-year span belongs to Tom Glavine, at four separate matchups. Glendon Rusch, Jose Rijo, Javier Vazquez, Brad Penny, Livan Hernandez, Chuck Finley, and Ted Lilly each had three different doppelganging matchups, with a grand total of 158 different pitchers partaking at least once over the course of these two decades. Interestingly enough, Lilly’s name surfaced towards the top of the leaderboards, and with full knowledge that he and Lopez opposed one another while sporting 3.18 ERAs, the Cubs left-hander has now tied Glavine for the most overall identical ERA matchups since 1989. That might not be the equivalent of the Cy Young Award but, well, it’s something, right?

What happens when more data gets incorporated? Has there ever been a matchup of two pitchers with identical ERAs as well as K/BB ratios? Unfortunately, the answer is no when the ratio gets computed out to four decimal places. What about if we truncate that down to the more standard and commonly used two-decimal format? Still no… never mind, then. How about if LOB percentage is substituted for K/BB ratio? Once again, no luck when the strand rate is computed to three or four decimal places, but nine different matchups do surface once the rates are rounded to two decimals, one of which we have already discussed:
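Whether two rates count as "identical" depends entirely on how many decimals are kept and on whether the extra digits are rounded or truncated. The snippet below is a small illustration of that point using Python's `decimal` module; the two strand rates are invented for the example and do not come from the data set.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def at_precision(value, places, mode):
    """Reduce a decimal string to `places` digits using the given rounding mode."""
    step = Decimal(10) ** -places  # places=2 gives Decimal("0.01")
    return Decimal(value).quantize(step, rounding=mode)


def matches(a, b, places, mode):
    """True when both values are equal after reduction to `places` digits."""
    return at_precision(a, places, mode) == at_precision(b, places, mode)


# Hypothetical strand rates, chosen only to show the effect.
lob_a, lob_b = "0.7234", "0.7189"

for places in (4, 3, 2):
    print(
        places,
        "rounded:", matches(lob_a, lob_b, places, ROUND_HALF_UP),
        "truncated:", matches(lob_a, lob_b, places, ROUND_DOWN),
    )
```

With these made-up values the pair matches only when rounded to two decimals (both become 0.72); truncating to two decimals leaves 0.72 against 0.71, and three or four decimals never match.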

Date        Pitcher         Pitcher         ERA     LOB
7/13/1989   Greg Maddux     Bruce Hurst    2.93    0.77
7/24/2003   Brad Penny      Kris Benson    3.40    0.74
6/16/1991   Pat Combs       Norm Charlton  3.92    0.72
8/25/1992   Ramon Martinez  Danny Jackson  3.93    0.69
8/23/2005   Nate Robertson  Dan Haren      4.00    0.70
8/13/2008   John Maine      Jason Bergmann 4.13    0.72
8/18/2001   CC Sabathia     Pat Rapp       4.42    0.70
5/30/2004   Mike Maroth     Erik Bedard    4.91    0.70
5/12/1996   Orel Hershiser  Jim Abbott     5.67    0.67

Moving further, have any of these rows involved two pitchers squaring off with identical innings pitched totals in addition to the matching ERAs and LOB percentages? With my fingers crossed, I ran the query, and one row returned: on June 16, 1991, then-Phillies pitcher Pat Combs faced future Phillies hurler Norm Charlton in a battle for the ages, as both pitchers entered with 59 2/3 innings pitched, 3.92 ERAs, and 72 percent strand rates. Combs lasted just 4 1/3 frames, allowing four hits and four earned runs, issuing seven free passes while fanning an equal number of hitters. By the end of the game, he had tallied 64 total innings with a 4.22 ERA. Charlton surrendered five runs in six innings, leaving him at 65 2/3 frames and a 4.25 ERA entering his next start.
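The before-and-after ERAs in that game can be checked with a little arithmetic. The sketch below backs out each pitcher's prior earned-run total from the published 3.92 ERA and 59 2/3 innings, then adds the game line. Two assumptions are baked in: the earned-run total is inferred rather than looked up, and all five of Charlton's runs are treated as earned. The function names are hypothetical.

```python
from fractions import Fraction


def era(earned_runs, innings):
    """ERA = 9 * ER / IP, rounded to two decimals."""
    return round(float(9 * Fraction(earned_runs) / innings), 2)


def implied_earned_runs(era_text, innings):
    """Recover the whole number of earned runs behind a published ERA."""
    return round(Fraction(era_text) * innings / 9)


entering_ip = Fraction(179, 3)  # 59 2/3 innings for both pitchers
prior_er = implied_earned_runs("3.92", entering_ip)

# Combs: 4 1/3 innings, four earned runs.
combs_ip = entering_ip + Fraction(13, 3)
combs_era = era(prior_er + 4, combs_ip)

# Charlton: six innings, five runs (assumed to be all earned).
charlton_ip = entering_ip + 6
charlton_era = era(prior_er + 5, charlton_ip)

print(prior_er, combs_ip, combs_era, charlton_ip, charlton_era)
```

Under those assumptions each pitcher entered with 26 earned runs, and the results land on the figures quoted above: 64 innings and a 4.22 ERA for Combs, 65 2/3 innings and a 4.25 ERA for Charlton.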

Given that this matchup featured three identical data components with no competition from any of the 98 other featured games, it feels pretty safe to say that the epic Combs/Charlton duel on June 16, 1991 is the most identical pitching matchup of the last 20 years. Of course, neither of these pitchers was a “true” 3.92 ERA talent or anything along those lines, but no other game since 1989 has featured two pitchers with more similar data entering the game. The aforementioned Lopez-Lilly dance did not turn out to be a game for the ages, but it did feature a facet of pitching data witnessed in just 0.28 percent of qualified games over the past two decades (99 of 34,823). If you ever happen to see a pitching matchup in which both sides boast identical data in one form or another, know that it does not happen often, and you could be witnessing history… albeit in a very unimportant form.

Please don't go into finance. There are enough of us already.
But maybe Eric has all the answers to make the financial markets work.
Well, I'm an Accountant. I'm in grad school to take some courses to help my eligibility for the CPA exam down the road and I'm also concentrating in Finance to hopefully sit for the CFA at some point. I don't necessarily want to do anything, just have the titles for my e-mail signature.
I'm a CPA - and I'd much rather write about baseball.
It's too awesome that you get paid to do this. I too get paid for compiling all sorts of mounds of worthless information, but you actually get paid for it. Sweet.
I get paid WHILE compiling mounds of information of interest to myself. Not FOR.
The funniest part is that to do a fun but not groundbreaking article like this takes about 4-5 days of database work given how long the scripts run--and I have a pretty fast computer. Yet my previous article, showing that ERA is equally predictive as FIP in the second half of a season for guys with big ERA-FIP discrepancies took about 35 minutes to code.
Pretty much always the case. Things that come out looking simple always have 1,000,000 things going on behind the scenes that make it hard to do, while groundbreaking stuff is usually from an interesting insight, but fairly basic number crunching.
Here's another fun question. How often did pitching doppelgangers end the night with identical ERAs?