August 2, 2008
Minor League Statistics and EqA
The bookmark on my browser that brings me to Baseball Prospectus every day is set to the Statistics page. Even if I'm not looking something up—say if I want to go to one of the chats or see what's new on Unfiltered—that's where I start off. Often, I don't really look at the Stats page—either because it's a way station to wherever else I'm headed, or because I'm so familiar with it that I take it for granted. A few days ago, I was en route to the Glossary when something stopped me cold.
The Stats page had changed. At first, it was difficult to discern how, but ultimately my eyes settled on a single report, labeled "Minor League Statistics and Translations." I immediately logged out of my account and logged back in as a non-Premium subscriber. The report was still there, perfectly usable, and available to the public. It was all I could do to keep myself from having one of those Tom-Cruise-on-Oprah's-couch moments.
You could say I was a little happy. You see, we're constantly checking and using minor league numbers here at Baseball Prospectus, but generally we had to go to other sources to get updated stats (like the official minor league page at MiLB.com) and if we wanted any of the advanced metrics we have for major leaguers applied to the minors, we usually either had to make special requests from our stats crew, or do without. With this new report, a cornucopia of data is laid out at your fingertips.
The report is actually three different resources in one place. First, you have the "real" statistics for each level from A-ball to the Majors, divided by league, and defined by actual, unadjusted game events. So when you see that the Reds got Danny Richar as part of the Ken Griffey Jr. trade, you can go to the International League section of the report, click on "real stats" and you'll find Richar's line at Charlotte: .262/.321/.427 (that's batting average/on-base percentage/slugging percentage, for those of you who are new around here), 9 homers, 39 RBI, 11 steals. However, there's a lot of context that goes into reading these stats. Each minor league is its own environment, where due to climate, altitude, stadium construction, or a host of other factors, pitchers or hitters may be intrinsically favored, even before you consider the differences in ballparks within the league.
Helping to adjust for all of that context is the second report—Translated Statistics (called "Regular Translations" in the report). Created by Clay Davenport, our translated statistics apply league difficulty and ballpark factors to those real statistics, enabling us to make apples-to-apples comparisons between players of various levels, all adjusted to what their estimated production would be at the major league level. It's a concept similar to Bill James's Major League Equivalencies (MLEs). Our version (Davenport Translations, or DTs) pegs Richar's 2008 season as translating to a .229/.285/.386 line in the majors, with the same number of homers and virtually the same number of RBI, but with fewer stolen bases (7), probably accounted for by the fact that the translated Richar is on base much less often and has fewer chances to steal than in real life.
But there are other considerations when looking at minor leaguers that aren't covered by these translations. The minor leagues are built like a pyramid, with players advancing from amateur status up the ladder to the majors, and older prospects who can't thrive at the next level getting pushed out or displaced by the younger crop of prospects rising behind them. If a player manages to get stuck at a low level, the result can be something like the scene in Billy Madison where a twenty-something Adam Sandler is sent back to grade school: he'll find that this time around, he can really rock fourth-grade dodgeball. You can get very excited about a minor leaguer's prospects, even after looking at his translated numbers, only to realize that they belong to a 26-year-old kicking butt and taking names in the Sally League.
One tool that helps take this factor (and several others) into account is the Peak Translation. Peak translation projects a player's translated performance to an expected peak level, using age adjustments compared to each level at which they've played, and (for batters) adjustments based on the power component in their hitting performance and their strikeout rate. The result isn't quite as sophisticated as Nate Silver's PECOTA projections, but it does help a bit in separating the wheat from the chaff.
On the batters' side of this report, the big advanced statistic in use is Equivalent Average (EqA). EqA uses all the components of a batter's offensive performance, including how often they get on base, how many extra bases they get, and even how many steals, and digests them into a single rate statistic. If you've followed the last several installments of Toolbox, you know that I've been working on describing the scales of various advanced metrics. EqA makes this simple—it's designed to function on a scale similar to batting average. The untranslated league average for any league is always .260. Applying the method I've used in the last few installments, I took a sampling of the 270 players with the most plate appearances, both this year and last year. Based on that sample, the standard deviation was .030, giving us a chart that looks like this:
          EqA    Pct    2007 Poster Child
2.0 SD    .320    4.3   Miguel Cabrera (.322)
1.5 SD    .305    9.4   Adam Dunn (.304)
1.0 SD    .290   20.7   Jimmy Rollins (.290)
0.5 SD    .275   40.2   Brian Giles (.275)
Mean      .260   36.9   Orlando Cabrera (.260)
0.5 SD    .245   22.8   Alex Gordon (.245)
1.0 SD    .230   10.7   Marcus Giles (.229)
1.5 SD    .215    3.5   Gerald Laird (.216)
2.0 SD    .200    0.9   Mark Kotsay (.200)
The distribution is clearly lopsided toward the high side of the spectrum. Most of that is selection bias—guys who post EqAs under .230 don't tend to get very many plate appearances.
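For readers who want to check the arithmetic, the EqA cutoffs in the chart follow from the two numbers quoted above: a league mean of .260 and a standard deviation of .030. Here is a minimal Python sketch of that scale. The function names are made up for illustration, and this is not how the report itself computes anything.

```python
# Sketch only: rebuilds the chart's EqA cutoffs from the mean and standard
# deviation quoted in the article. Function names are illustrative.

LEAGUE_MEAN_EQA = 0.260  # league-average EqA, as stated in the article
EQA_STD_DEV = 0.030      # standard deviation from the 270-player sample


def eqa_cutoff(sd_steps: float) -> float:
    """Return the EqA that sits sd_steps standard deviations from the mean."""
    return round(LEAGUE_MEAN_EQA + sd_steps * EQA_STD_DEV, 3)


def sd_from_mean(eqa: float) -> float:
    """Return how many standard deviations an EqA is above (+) or below (-) the mean."""
    return round((eqa - LEAGUE_MEAN_EQA) / EQA_STD_DEV, 2)


if __name__ == "__main__":
    # Walk the same steps as the chart, from two SDs above to two SDs below.
    for steps in (2.0, 1.5, 1.0, 0.5, 0.0, -0.5, -1.0, -1.5, -2.0):
        print(f"{steps:+.1f} SD  {eqa_cutoff(steps):.3f}")
```

Going the other way, `sd_from_mean(0.322)` puts Miguel Cabrera's 2007 EqA a shade more than two standard deviations above the mean, which is why he lands on the top row.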
Equivalent Runs (EqR) are the counting-stat version of EqA. Using EqR and the Peak Translations, we can compile a list of guys who are, statistically speaking, some of the most promising hitters in the high minors:
Name               Age  Team  League   AB  HR    BA   OBP   SLG   EqA  EqR
Dexter Fowler       22  TUL   Tex     407  19  .307  .404  .526  .314   82
Mat Gamel           22  HUN   Sou     441  17  .329  .388  .537  .307   80
Kyle Blanks         21  SAN   Tex     385  21  .312  .398  .551  .317   77
Austin Jackson      21  TRN   Eas     435  14  .294  .384  .483  .298   77
Nelson Cruz         27  OKL   PCL     368  30  .277  .362  .571  .306   72
Brent Clevlen       24  TOL   Int     360  22  .294  .380  .550  .310   69
Cameron Maybin      21  CAR   Sou     321  22  .283  .385  .561  .312   65
Neil Walker         22  IND   Int     391  25  .281  .319  .563  .287   65
Andrew McCutchen    21  IND   Int     401  12  .287  .385  .434  .281   64
Allen Craig         23  SFD   Tex     438  20  .283  .341  .473  .277   64
Travis Snider       20  NHP   Eas     351  20  .274  .374  .501  .297   63
This list is derived from the best peak EqRs at the Double- and Triple-A levels, ignoring for the moment the Mexican League, which is not by and large a developmental league like its Triple-A peers. The only guy on this list who's disqualified from prospect status on the basis of his age is sabermetric darling Nelson Cruz—he's in his age 27 season, so that peak should be... now. Of the others, only Allen Craig and Brent Clevlen didn't make their franchises' Top 11 Lists, and Andrew McCutchen, Travis Snider, and Cameron Maybin were their respective teams' top prospects.
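If you'd like to run the same kind of sort yourself, here is a rough Python sketch that ranks a handful of rows from the table above by peak EqR and flags anyone already at peak age. The `PeakLine` type and the age-27 cutoff are assumptions for illustration. The cutoff simply mirrors the remark about Cruz and is not a rule taken from the report.

```python
# Sketch only: rank a few rows from the article's table by peak EqR and
# flag players who are already at the assumed peak age.
from typing import List, NamedTuple


class PeakLine(NamedTuple):
    """One row of peak-translated stats (hypothetical type for this example)."""
    name: str
    age: int
    league: str
    eqa: float
    eqr: int


# A few rows copied from the table, deliberately out of order.
ROWS = [
    PeakLine("Nelson Cruz", 27, "PCL", 0.306, 72),
    PeakLine("Travis Snider", 20, "Eas", 0.297, 63),
    PeakLine("Dexter Fowler", 22, "Tex", 0.314, 82),
    PeakLine("Kyle Blanks", 21, "Tex", 0.317, 77),
]

PEAK_AGE = 27  # assumption: a player in his age-27 season is treated as at peak


def rank_by_peak_eqr(rows: List[PeakLine]) -> List[PeakLine]:
    """Return rows sorted from highest to lowest peak EqR."""
    return sorted(rows, key=lambda row: row.eqr, reverse=True)


def still_a_prospect(row: PeakLine, peak_age: int = PEAK_AGE) -> bool:
    """True if the player is younger than the assumed peak age."""
    return row.age < peak_age


if __name__ == "__main__":
    for row in rank_by_peak_eqr(ROWS):
        note = "" if still_a_prospect(row) else "  (already at peak age)"
        print(f"{row.name:<16}{row.age:>3}  {row.eqr:>3}{note}")
```

Sorting on EqR rather than EqA rewards playing time as well as rate of production, which is why Fowler leads this sample even though Blanks has the higher EqA.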
Next time, we'll talk about some of the special pitching metrics used in the Minor League Statistics and Translations page.