
May 17, 2009

# Prospectus Idol Entry

## Brian Cartwright's Initial Entry

Bio: I've been a stats geek since before Bill James started self-publishing his Abstracts, and have been a Prospectus subscriber for about ten years now. Recently, I've written for Seamheads, StatSpeak, and FanGraphs, and am waiting for the call-up to the big leagues. My personal interests and writing have focused on statistical analysis, and I must admit I've been somewhat disappointed in the number of research articles published at BP since Dan Fox left last year. Eric has been a welcome addition, and I am hoping that I can also contribute to BP's published analysis.

Entry: Major League Equivalencies

Major League Equivalencies (MLEs) are a set of formulas that translate a player's minor league statistics into those he would be expected to produce if he were in the major leagues. They form a part of most projection systems, including BP's PECOTA and the Davenport Translations. Modeling the level of competition at each stop in the minors can prove much more daunting than dealing strictly with major league data, as there are several potential selection biases which can markedly affect the accuracy of the projections.

1. How much elapsed time should be allowed between samples?

2. Does including bench players, who may suffer a pinch hit penalty, bias the factors?

3. Should all players be sampled, or only those who advance all the way to MLB?

4. Should the lower minors be compared directly to MLB, or to the next highest level?

The results show a wide variance in HR and SO rates, and increasingly large overall discrepancies in projections from Double-A and High-A. The best approach might not be the one you would expect.

To test these scenarios, I created matched pairs of batting data with different selection criteria in order to calculate the ratios between major and minor league performance under each method. In the following tables, the factors listed are the expected ratios between the minor and major league rates. For example, if a player has a BB% of .100 in High-A, and the factor is 0.71, he would be expected to have a BB% of (.100 x .71) = .071 in the majors.
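The translation itself is just a multiplication; a minimal sketch, using the example numbers from the text:

```python
def translate_rate(minor_rate: float, factor: float) -> float:
    """Translate a minor league rate into its major league equivalent
    by applying the level factor."""
    return minor_rate * factor

# Example from the text: a High-A BB% of .100 with a factor of 0.71
mle_bb = translate_rate(0.100, 0.71)
print(round(mle_bb, 3))  # 0.071
```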

By recording statistics for each player in each season, we are taking a sample, over a given period of time, which estimates the player's "true talent" in each of the various categories. As a player ages through his twenties, he will on average lose speed, but gain power, strike zone judgement and contact skills. If it takes two years of statistics to get a good measure of a player, at the end of the two years he is likely not exactly the same player he was before. It would then make sense not to let too much time elapse between the two sets of stats being compared.

With such a time constraint, there is not a sufficient number of players at Double-A or lower who can be compared to their Major League stats. In my first test, I set a time restriction of one year, and collected all batting stats for Class A Advanced (A+), comparing them to what the same players did in Double-A (AA) in the year before, the same year, or a year later. Double-A was compared to Triple-A, and Triple-A to MLB. In order to calculate the factors for the lower minors, the results must be "chained" - that is, to know the factor from High-A to MLB, take (A+ to AA) times (AA to AAA) times (AAA to MLB). This first test, with all players, and using chaining, is labeled "All Chained".
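The chaining step described above can be sketched as a simple product of per-step factors. The numbers below are purely illustrative, not taken from the tables:

```python
from functools import reduce
from operator import mul

def chain_factors(step_factors):
    """Chain per-step level factors (e.g. A+ -> AA, AA -> AAA, AAA -> MLB)
    into one overall minor-to-MLB factor by multiplying them together."""
    return reduce(mul, step_factors, 1.0)

# Hypothetical per-step BB factors for A+ -> AA, AA -> AAA, AAA -> MLB
print(round(chain_factors([0.87, 0.89, 0.82]), 2))  # 0.63
```

Note that any bias in an individual step is carried into, and multiplied through, the chained result; this becomes relevant later when comparing chaining against direct comparison.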

```
SDT = (H-HR)/(AB-SO-HR)  {Singles, doubles, triples}
DO  = DO/(AB-SO-HR)      {Doubles}
TR  = TR/(AB-SO-HR)      {Triples}
HR  = HR/(AB-SO)         {Home runs}
HP  = HP/(AB+HP+BB)      {Hit by pitch}
BB  = BB/(AB+HP+BB)      {Walks}
SO  = SO/(AB+HP+BB)      {Strikeouts}
```
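A sketch of these rate definitions, computed from simple counting stats (the input totals here are invented for illustration):

```python
def batting_rates(AB, H, DO, TR, HR, HP, BB, SO):
    """Compute the rate stats defined above from raw counting stats."""
    bip = AB - SO - HR   # denominator for SDT, DO, TR (contact, non-HR)
    pa = AB + HP + BB    # denominator used for HP, BB, SO
    return {
        "SDT": (H - HR) / bip,
        "DO": DO / bip,
        "TR": TR / bip,
        "HR": HR / (AB - SO),
        "HP": HP / pa,
        "BB": BB / pa,
        "SO": SO / pa,
    }

# Hypothetical season line
rates = batting_rates(AB=500, H=150, DO=30, TR=5, HR=20, HP=5, BB=50, SO=100)
print(round(rates["HR"], 3))  # 20 / (500 - 100) = 0.05
```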

```
All Chained
Level  SDT    DO      TR     HR     HP    BB    SO
AAA   0.90   0.94    1.02   0.78   0.85  0.82  1.20
AA    0.88   0.91    1.03   0.75   0.74  0.73  1.26
A+    0.84   0.92    1.03   0.73   0.66  0.71  1.33
```

A bias which exists when using all players is the "pinch-hit penalty": it has been shown that most players do not hit as well coming off the bench as they do when starting and playing regularly. The factors will be depressed by a disproportionate number of players at the higher level (particularly in the majors) playing sparingly. To account for this, I used only players who averaged more than 2.5 plate appearances per game at each level. This test is labeled "Min PA Chained"; removing the bench players makes the HR and SO factors, and to a lesser extent SDT, more favorable to the batter.
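The playing-time filter is straightforward; a minimal sketch, with the field names my own:

```python
def regular_playing_time(pa: int, games: int, min_pa_per_game: float = 2.5) -> bool:
    """Keep only player-seasons averaging more than 2.5 PA per game,
    screening out bench players subject to the pinch-hit penalty."""
    return games > 0 and pa / games > min_pa_per_game

print(regular_playing_time(pa=450, games=140))  # True  (3.21 PA/G)
print(regular_playing_time(pa=120, games=80))   # False (1.50 PA/G)
```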

```
Min PA Chained
Level  SDT    DO    TR    HR    HP    BB    SO
AAA   0.91   0.94  1.02  0.80  0.84  0.82  1.18
AA    0.89   0.91  1.03  0.76  0.73  0.73  1.23
A+    0.85   0.92  1.02  0.74  0.66  0.71  1.30
```

The first two tests included many players who never advanced through every level, failing to get to the majors. If the MLEs are being used to judge how well a player will perform if and when he makes the majors, is it correct to base the factors partly on the records of players who failed to advance? In the third test, I produced a list of 368 MLB "rookies" from 2003 to 2008. My definition of a rookie season is a player who had 150 or fewer career major league plate appearances entering the season, and more than 150 during that season. Factors were calculated using only the records of these 368 players, in seasons where they had 2.5 or more PA per game at each level. These results are labeled "MLB Chained," and show virtually the same factors at Triple-A (MLB Chained is limited to players who achieved at least 150 PA in their rookie season, while Min PA only requires 2.5 PA per game). It's in the lower minors where larger factors favoring the batter are seen across the board.
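The rookie-season definition above reduces to a two-condition test; a sketch, with the parameter names my own:

```python
def is_rookie_season(career_pa_before: int, season_pa: int) -> bool:
    """A rookie season: 150 or fewer career MLB plate appearances
    entering the season, and more than 150 PA during that season."""
    return career_pa_before <= 150 and season_pa > 150

print(is_rookie_season(career_pa_before=40, season_pa=300))   # True
print(is_rookie_season(career_pa_before=200, season_pa=400))  # False
```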

```
MLB Chained
Level  SDT    DO     TR     HR     HP     BB     SO
AAA   0.91   0.97   1.01   0.82   0.82   0.82   1.18
AA    0.90   0.94   1.02   0.81   0.71   0.76   1.22
A+    0.87   0.99   1.00   0.88   0.66   0.76   1.24
```

In case any bias or distortion remained in the method of chaining the factors through multiple levels, the fourth and final test compared the minor league records at each level directly to the MLB records compiled no later than one year after the player's rookie season, setting no other maximum on elapsed time. This test, labeled "MLB Direct," uses the same list of players and the same playing-time criteria as "MLB Chained." The differences are a direct comparison instead of chaining and, for the lower minors, a longer elapsed time between the records being compared, which builds more aging effects into the level factors. The Triple-A factors differ from "MLB Chained" in that the samples no longer need to be within a year of each other. Again, the HR and SO factors at Triple-A improve slightly for the batter, with larger gains in all categories in the lower minors.

```
MLB Direct
Level  SDT    DO     TR    HR    HP     BB     SO
AAA   0.92   0.98   1.01  0.85  0.85   0.82   1.15
AA    0.94   0.98   1.01  0.94  0.75   0.83   1.16
A+    0.93   1.03   0.99  1.08  0.72   0.83   1.14
```

At all levels, "All Chained" has the least favorable factors for batters, while "MLB Direct" is the most favorable. Using only players who reached MLB is always more favorable than using all players. There is little difference between methods for Triple-A, except in HR and SO. Going into the lower minors, the differences between chaining and direct comparison become more pronounced, as each level requires another multiplication to generate the final factors, which also then multiplies any biases that exist between each level.

```
Method          Level  SDT    DO     TR     HR     HP     BB     SO
All Chained     AAA    0.90   0.94   1.02   0.78   0.85   0.82   1.20
Min PA Chained  AAA    0.91   0.94   1.02   0.80   0.84   0.82   1.18
MLB Chained     AAA    0.91   0.97   1.01   0.82   0.82   0.82   1.18
MLB Direct      AAA    0.92   0.98   1.01   0.85   0.85   0.82   1.15

Method          Level  SDT    DO     TR     HR     HP     BB     SO
All Chained     AA     0.88   0.91   1.03   0.75   0.74   0.73   1.26
Min PA Chained  AA     0.89   0.91   1.03   0.76   0.73   0.73   1.23
MLB Chained     AA     0.90   0.94   1.02   0.81   0.71   0.76   1.22
MLB Direct      AA     0.94   0.98   1.01   0.94   0.75   0.83   1.16

Method          Level  SDT    DO     TR     HR     HP     BB     SO
All Chained     A+     0.84   0.92   1.03   0.73   0.66   0.71   1.33
Min PA Chained  A+     0.85   0.92   1.02   0.74   0.66   0.71   1.30
MLB Chained     A+     0.87   0.99   1.00   0.88   0.66   0.76   1.24
MLB Direct      A+     0.93   1.03   0.99   1.08   0.72   0.83   1.14
```

Now that we have seen how the factors compare to one another, how can we judge their relative accuracy? The purpose of the MLEs is to show how well a player in the minors will perform if and when he reaches the majors. I took the list of 368 rookies from 2003-2008 and tested how well each method translated their statistics at each level, compared against each player's MLB record.

Tom Tango's Marcel system was used to generate the baseline MLB records. Marcel uses three years of data, weighted 5/4/3. I generated the Marcels one year after each player's rookie season, giving more time for the player to collect a sufficient sample size, while not going too far into the future, when the player's skills might be somewhat different from when he entered the majors.
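Marcel's 5/4/3 weighting can be sketched as a playing-time-weighted average of per-season rates. This omits the regression to the mean and aging adjustments that full Marcel also applies; the input numbers are hypothetical:

```python
def weighted_rate(rates, pas, weights=(5, 4, 3)):
    """Weighted average of per-season rates (most recent first), with
    each season's rate weighted by (Marcel weight x that season's PA).
    Regression and aging, part of full Marcel, are omitted here."""
    num = sum(w * r * pa for w, r, pa in zip(weights, rates, pas))
    den = sum(w * pa for w, pa in zip(weights, pas))
    return num / den

# Hypothetical HR rates and PA totals over three seasons, most recent first
print(round(weighted_rate([0.040, 0.035, 0.030], [600, 550, 500]), 4))  # 0.0361
```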

I used three methods to test the accuracy:

1. Comparing the weighted means of each player's projections and Marcel

2. Calculating the root mean square error between each player's projections and Marcel

3. Calculating a similarity score, in which the difference between each player's projection and his Marcel is expressed as a fraction of the standard deviation (a t-score) of all players' stats in each category, and the Pythagorean theorem is then used to determine the "distance" in t-scores, across all categories together, from the projection to the observed record.
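The third method above amounts to a Euclidean distance in standard-deviation units; a sketch, with all input numbers illustrative rather than taken from the study:

```python
import math

def similarity(projected, observed, std_devs):
    """Distance between a projected and an observed stat line, with each
    category's error scaled by that stat's standard deviation across all
    players, then combined via the Pythagorean theorem."""
    return math.sqrt(sum(
        ((projected[k] - observed[k]) / std_devs[k]) ** 2
        for k in projected
    ))

# Hypothetical projection, observed line, and population standard deviations
proj = {"HR": 0.030, "BB": 0.073, "SO": 0.201}
obs = {"HR": 0.037, "BB": 0.078, "SO": 0.175}
sds = {"HR": 0.012, "BB": 0.025, "SO": 0.045}
print(round(similarity(proj, obs, sds), 3))
```

A smaller distance means the projection landed closer to the observed record, so lower similarity scores in the table below indicate a more accurate method.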

```
Method         Level pSDT  pXBH  pHR   pBB   pSO   eSDT  eXBH  eHR   eBB   eSO   vSDT  vXBH  vHR   vBB   vSO   Sim
All Chained    AAA   0.301 0.247 0.030 0.073 0.201 0.307 0.255 0.037 0.078 0.175 0.029 0.052 0.016 0.025 0.068 1.212
MinPA Chained  AAA   0.305 0.248 0.031 0.073 0.197 0.307 0.255 0.037 0.078 0.175 0.029 0.052 0.016 0.025 0.065 1.154
MLB Chained    AAA   0.304 0.255 0.032 0.073 0.197 0.307 0.255 0.037 0.078 0.175 0.029 0.053 0.017 0.025 0.065 1.158
MLB Direct     AAA   0.309 0.258 0.033 0.073 0.192 0.307 0.255 0.037 0.078 0.175 0.029 0.054 0.017 0.026 0.061 1.096

Method         Level pSDT  pXBH  pHR   pBB   pSO   eSDT  eXBH  eHR   eBB   eSO   vSDT  vXBH  vHR   vBB   vSO   Sim
All Chained    AA    0.295 0.242 0.028 0.067 0.214 0.308 0.255 0.037 0.078 0.177 0.034 0.053 0.017 0.024 0.072 1.287
MinPA Chained  AA    0.298 0.243 0.028 0.067 0.209 0.308 0.255 0.037 0.078 0.177 0.033 0.053 0.017 0.024 0.068 1.217
MLB Chained    AA    0.299 0.250 0.030 0.069 0.207 0.308 0.255 0.037 0.078 0.177 0.033 0.053 0.018 0.024 0.066 1.184
MLB Direct     AA    0.313 0.260 0.035 0.075 0.197 0.308 0.255 0.037 0.078 0.177 0.033 0.056 0.020 0.026 0.059 1.047

Method         Level pSDT  pXBH  pHR   pBB   pSO   eSDT  eXBH  eHR   eBB   eSO   vSDT  vXBH  vHR   vBB   vSO   Sim
All Chained    A+    0.286 0.235 0.024 0.065 0.228 0.309 0.255 0.037 0.078 0.177 0.038 0.061 0.019 0.025 0.089 1.580
MinPA Chained  A+    0.290 0.236 0.024 0.065 0.222 0.309 0.255 0.037 0.078 0.177 0.036 0.061 0.019 0.025 0.084 1.495
MLB Chained    A+    0.294 0.252 0.028 0.070 0.212 0.309 0.255 0.037 0.078 0.177 0.034 0.063 0.019 0.026 0.075 1.340
MLB Direct     A+    0.315 0.262 0.035 0.076 0.195 0.309 0.255 0.037 0.078 0.177 0.034 0.066 0.021 0.028 0.061 1.102
0.302 0.254 0.040 0.082 0.164
```

All of the methods under-projected HR, BB, and SO at Triple-A, with "MLB Direct" slightly high on base hits and extra-base hits, while the others were a little low. At Double-A and High-A, "MLB Direct" gives much the same projections for the test group as it did at Triple-A, while the others, which employed chaining, give progressively worse projections for the same players the more steps removed they are from MLB. I believe this is because each level-to-level multiplication required in the chaining process also multiplies any biases found between those levels.

Despite the longer passage of time inherent in directly comparing minor league to major league statistics, as opposed to chaining comparisons of consecutive seasons, the direct comparison method consistently gives the closest estimate of future MLB performance. In addition, direct comparison produces virtually the same MLB projection regardless of which level of the minors was used in the calculation, whereas chaining produces projections which are increasingly in error the further down into the minors one goes.