I feel guilty writing anything these days that doesn’t involve the postseason, but my reactions this time of year tend to stray far from the realm of objectivity and well into the territory of shock (Phil Garner did what?) and awe (Johnny Damon did what?). Let’s take a quick break from all the catastrophe and success and answer a PECOTA question that I’ve been long overdue in addressing.
I look and wonder why Curt Schilling is nowhere to be found on your Ben Sheets Similarity Index. And then look at Brad Lidge’s Index and see a guy at the top who had a 5.87 ERA, a 2:1 K/BB ratio and a 6.74 K/9 and I have to wonder about what’s going on?
- PECOTA comparables do not take into account performance in the current season, but rather the three previous seasons.
Sheets had a promising campaign a year ago, and PECOTA assigned him a 19% chance of having a breakout, finding favorable comparables in folks like Robin Roberts, Dave Stieb and Mike Mussina. Still, the system expected baby steps forward, rather than the quantum leap that Sheets took this year, one that is succinctly expressed in two numbers: 264 strikeouts, 32 walks.
The problem is that PECOTA is a predictive tool and not a retrospective one. It selected Sheets’ comparables based on the pitchers whose performance was most similar to his track record heading into the season. That suggested a pitcher, like Mussina, who was pretty durable and had some pretty good strikeout and walk numbers, but not a pitcher who had reached the elite circle occupied by folks like Schilling, Bret Saberhagen and so forth.
A similar argument holds for Lidge. Yes, the 10.27 K/9 that Lidge managed last year was impressive. It is nowhere near the 14.93 K/9–Fourteen Point Nine Three!–that Lidge posted this year. Players whose performance departed radically from PECOTA’s expectations, especially in a key category like pitcher strikeouts, are likely to draw a very different group of comparables when the 2005 PECOTAs are posted this winter.
Similarly, the years listed next to a player’s comparables reflect their performance heading into the year in question. So, when PECOTA lists Dan Naulty, 1997, as Lidge’s best comparable, it means that it thinks that Naulty’s performance in 1997 would tell us something about how Lidge would perform in 2004. In 1996, Naulty posted a 3.79 ERA against a league average of 5.15–similar to the 3.60 ERA that Lidge recorded last season against a league average of 4.61. This particular comparison turned out not to be prescient–Naulty regressed in 1997, and never had much of a career, while Lidge took a big step forward–but the pitchers looked pretty similar heading into their age-27 seasons.
- All statistics are adjusted for league context. The league context has changed more than you’d think.
As with almost all of our statistical tools, PECOTA adjusts its statistics for park and league averages. This is crucially important for a prediction engine. It wouldn’t be very fair to compare a hitter who hit 35 homers playing his home games in Coors Field in 2003 to one who hit the same number as an Astro in 1972.
Let’s take a look at Lidge and Naulty once again. Here are their performances in their previous seasons in three critical categories–strikeout rate, walk rate and home run rate–compared against league averages:
League          K/G     BB/G    HR/G
2003 NL         6.59    3.40    1.10
1996 AL         6.20    3.80    1.21

Pitcher         K/G     BB/G    HR/G
Lidge 2003      10.27   4.48    0.76
Naulty 1996     8.84    5.52    0.80
Indexed to League Average (100 = average)
Pitcher         K/G     BB/G    HR/G
Lidge 2003      156     132     69
Naulty 1996     143     145     66
Naulty struck out fewer batters than Lidge did, and walked more. He also pitched in a league environment in which walks were more common and strikeouts less so. When adjusted for league averages, their rates look much closer.
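For readers who want to check the arithmetic, the indexing step can be sketched in a few lines of Python. This is only an illustration of dividing a pitcher's rate by his league's rate, using the figures quoted above; the function name `index_to_league` and the dictionary layout are my own assumptions, not PECOTA's actual code.

```python
def index_to_league(player_rate, league_rate):
    """Express a player's rate as a percentage of league average (100 = average)."""
    return round(100 * player_rate / league_rate)

# League averages per game, as quoted in the article.
league = {
    "2003 NL": {"K/G": 6.59, "BB/G": 3.40, "HR/G": 1.10},
    "1996 AL": {"K/G": 6.20, "BB/G": 3.80, "HR/G": 1.21},
}

# Each pitcher's prior-season rates, paired with the league he pitched in.
pitchers = {
    "Lidge 2003": ("2003 NL", {"K/G": 10.27, "BB/G": 4.48, "HR/G": 0.76}),
    "Naulty 1996": ("1996 AL", {"K/G": 8.84, "BB/G": 5.52, "HR/G": 0.80}),
}

# Index every category against the matching league average.
indexed = {
    name: {cat: index_to_league(rate, league[lg][cat]) for cat, rate in rates.items()}
    for name, (lg, rates) in pitchers.items()
}

for name, row in indexed.items():
    print(name, row)
```

The raw gap of nearly a strikeout and a half per game shrinks to 13 index points once each pitcher is measured against his own league.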
This is particularly important to consider in the case of categories such as home runs and pitcher strikeouts, which have grown significantly in frequency over time. In the 1951 American League, for example, pitchers averaged 3.72 strikeouts per game and 4.00 walks. A Red Sox pitcher named Mickey McDermott struck out 6.64 batters per nine innings, a figure less than the National League average this year, but it was good enough to lead the league. McDermott’s 127 strikeouts and 92 walks in the 1951 AL would translate to about 225 strikeouts and 78 walks in the 2003 NL.
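The McDermott translation is consistent with a simple proportional scaling: multiply the raw total by the ratio of the target league's rate to the source league's rate. The sketch below assumes that is all that is going on, which may understate what PECOTA actually does (park adjustments, for one, are ignored here); the helper name `translate` is hypothetical.

```python
def translate(count, source_league_rate, target_league_rate):
    """Scale a raw total from one league context to another.

    Assumes a purely proportional adjustment: the player keeps the same
    ratio to league average in the new environment.
    """
    return count * target_league_rate / source_league_rate

# 1951 AL averages: 3.72 K/G and 4.00 BB/G. 2003 NL averages: 6.59 K/G and 3.40 BB/G.
mcdermott_k = translate(127, source_league_rate=3.72, target_league_rate=6.59)
mcdermott_bb = translate(92, source_league_rate=4.00, target_league_rate=3.40)

print(round(mcdermott_k), round(mcdermott_bb))
```

Note that the two adjustments run in opposite directions: strikeouts scale up because they became far more common, while walks scale down because they became slightly rarer.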
- The small stuff adds up.
While PECOTA places the most emphasis on factors like strikeouts, walks and isolated power, it considers 13 categories total for pitchers and 12 total for hitters. Dan Naulty was 6’6″ and weighed 210 pounds. Brad Lidge is 6’5″ and weighs 200 pounds. Both Naulty and Lidge had about one full major-league season under their belts heading into their age-27 years. They also had similar groundball/flyball tendencies. These aren’t the most important factors that PECOTA considers, but they do have an influence on a player’s comparable list, in an amount proportional to the predictive value of the statistic in question. Because we don’t have a firm memory of some of these secondary characteristics for older players–Naulty, for all I remember, could have been built like Cliff Politte–they can sometimes have an unexpected and even counterintuitive impact on a player’s comparables list.
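To make the idea of weighting concrete, here is a toy weighted distance between two players. Everything about it is an assumption for illustration: the five categories, the weights, the scales and the Euclidean form are invented, and none of it is PECOTA's actual formula or category list. It only shows how lightly weighted traits like height and weight can still nudge a comparison.

```python
import math

# Hypothetical weights: a larger value means the category is assumed more predictive.
WEIGHTS = {"K": 3.0, "BB": 2.0, "HR": 2.0, "height_in": 0.5, "weight_lb": 0.5}

# Hypothetical scales that put each category on roughly comparable footing.
SCALES = {"K": 20.0, "BB": 20.0, "HR": 20.0, "height_in": 2.0, "weight_lb": 15.0}

def weighted_distance(a, b, weights=WEIGHTS):
    """Weighted Euclidean distance between two players; smaller means more alike."""
    total = 0.0
    for cat, w in weights.items():
        diff = (a[cat] - b[cat]) / SCALES[cat]
        total += w * diff * diff
    return math.sqrt(total)

# Indexed rates from the table above, plus the listed heights (inches) and weights (pounds).
lidge = {"K": 156, "BB": 132, "HR": 69, "height_in": 77, "weight_lb": 200}
naulty = {"K": 143, "BB": 145, "HR": 66, "height_in": 78, "weight_lb": 210}

# Compare the full distance with one that ignores the body-type categories.
rates_only = {cat: w for cat, w in WEIGHTS.items() if cat in ("K", "BB", "HR")}
print(weighted_distance(lidge, naulty), weighted_distance(lidge, naulty, rates_only))
```

In this toy version the secondary categories can only add distance, so two pitchers with near-identical builds stay close while a mismatch in build pushes an otherwise similar pitcher down the list.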
- Comparability is relative.
It was relatively easy to identify comparables for Lidge and Sheets heading into the season. Lidge had five cohorts who registered a score of 50 or higher, which translates as “extremely comparable” in the PECOTA system, while Sheets had seven.
Both pitchers, by virtue of their outstanding performances, are going to find things a little bit lonelier next time around. Lidge’s 2005 forecast is likely to look quite a bit like Eric Gagne’s forecast this year. Gagne had no comparables with a similarity score of 50 or higher, and just three with a similarity score of 30 or higher. There aren’t many pitchers who put up numbers like Gagne’s.
What does PECOTA do in situations like these? Well, it does the best it can. Gagne’s third-best comparable is the 2000 version of Pedro Martinez, who was then long removed from his tenure in the Dodgers’ bullpen. Would you rather compare Gagne to another relief pitcher? All else being equal, sure. But Pedro’s performance certainly tells us more about Gagne than, say, Bobby Thigpen’s. When a player is truly unique, PECOTA sacrifices the small stuff in an effort to get the big stuff right.
Of course there are a few players–like that guy in San Francisco–for whom finding appropriate comparables is impossible. Not only is the small stuff wrong, but the big stuff is wrong too. In those cases, PECOTA makes like a drunken frat boy and lowers its standards. In the case of Bonds, for example, it’s willing to bed pretty much any 40-year-old with some power and some plate discipline. That isn’t an ideal solution–Bonds is quite a bit different from someone like Edgar Martinez, his fourth-best comparable–but it’s still better than comparing a player to everyone in his age cohort, which is what other projection systems do. Bonds isn’t Edgar Martinez, but he sure as hell isn’t Brett Butler.
Thanks for the question, S.B.