Bio: I’ve been a stats geek since before Bill James started self-publishing Abstracts, and have been a Prospectus subscriber for about ten years now. Recently, I’ve written for Seamheads, StatSpeak and FanGraphs, and am waiting for the call-up to the big leagues. My personal interests and writing have focused on statistical analysis, and I must admit I’ve been somewhat disappointed in the number of research articles published at BP since Dan Fox left last year. Eric has been a welcome addition, and I am hoping that I can also contribute to BP’s publishing of analysis.
Entry: Major League Equivalencies
Major League Equivalencies (MLEs) are a set of formulas that translate a player’s minor league statistics into those he would be expected to produce if he were in the major leagues. They form a part of most projection systems, including BP’s PECOTA and Davenport Translations. Modeling the level of competition at each stop in the minors can prove much more daunting than dealing strictly with major league data, as there are several potential selection biases which can markedly affect the accuracy of the projections.

How much elapsed time should be allowed between samples?

Does including bench players, who may suffer a pinch hit penalty, bias the factors?

Should all players be sampled, or only those who advance all the way to MLB?

Should the lower minors be compared directly to MLB, or to the next highest level?
The results show a wide variance in HR and SO rates, and increasingly large overall discrepancies in projections from Double-A and High-A. The best approach might not be the one you would expect.
To test these scenarios, I created matched pairs of batting data with different selection criteria in order to calculate the ratios between major and minor league performance using each method. In the following tables, each factor listed is the expected ratio of the major league rate to the minor league rate. For example, if a player has a BB% of .100 in High-A, and a factor of 0.71, he would be expected to have a BB% of (.100 x .71) = .071 in the majors.
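The arithmetic in that example can be written as a one-line function. This is only an illustrative sketch; the name `translate_rate` is hypothetical and not part of any projection system.

```python
def translate_rate(minor_rate, factor):
    """Scale a minor league rate by a level factor to get the MLB-equivalent rate."""
    return minor_rate * factor


# The walk-rate example from the text: .100 in High-A with a factor of 0.71.
mle_bb = translate_rate(0.100, 0.71)
print(round(mle_bb, 3))
```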
By recording statistics for each player in each season, we are taking a sample, over a given period of time, which estimates the player’s “true talent” in each of the various categories. As a player ages through his twenties, he will on average lose speed, but gain power, strike zone judgement and contact skills. If it takes two years of statistics to get a good measure of a player, at the end of the two years he is likely not exactly the same player he was before. It would then make sense not to let too much time elapse between the two sets of stats being compared.
With such a time constraint, too few players at Double-A or lower can be compared directly to their major league stats. In my first test, I set a time restriction of one year, and collected all batting stats for Class A Advanced (A+), comparing them to what the same players did in Double-A (AA) in the year before, the same year, or a year later. Double-A was compared to Triple-A, and Triple-A to MLB. In order to calculate the factors for the lower minors, the results must be “chained”; that is, to know the factor from High-A to MLB, take (A+ to AA) times (AA to AAA) times (AAA to MLB). This first test, with all players, and using chaining, is labeled “All Chained.”
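Chaining amounts to multiplying the level-to-level factors together. A minimal sketch follows; the step factors are made-up placeholders for illustration, not values from this study, and `chain_factor` is a hypothetical name.

```python
from math import prod


def chain_factor(step_factors):
    """Multiply consecutive level-to-level factors into one factor to MLB."""
    return prod(step_factors)


# Hypothetical step factors for a single category (NOT taken from the article):
# A+ -> AA, AA -> AAA, AAA -> MLB
steps = [0.95, 0.93, 0.90]
a_plus_to_mlb = chain_factor(steps)
print(round(a_plus_to_mlb, 3))
```

Because the result is a product, any error in one step factor carries through to every level below it.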
The rate for each category is defined as follows:
SDT = (H - HR) / (AB - SO - HR) {Singles, doubles, triples}
DO = DO / (AB - SO - HR) {Doubles}
TR = TR / (AB - SO - HR) {Triples}
HR = HR / (AB - SO) {Home runs}
HP = HP / (AB + HP + BB) {Hit by pitch}
BB = BB / (AB + HP + BB) {Walks}
SO = SO / (AB + HP + BB) {Strikeouts}
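As a sketch, the seven rates can be computed from a season line as shown below. It assumes the first three denominators are AB minus SO minus HR (balls in play) and the HR denominator is AB minus SO (contacted at bats); that reading is an assumption about the definitions above. The function name and the sample batting line are hypothetical.

```python
def batting_rates(ab, h, do, tr, hr, hp, bb, so):
    """Compute the seven rate categories from one batting line."""
    balls_in_play = ab - so - hr   # assumed denominator for SDT, DO, TR
    contact = ab - so              # assumed denominator for HR
    ab_hp_bb = ab + hp + bb        # denominator for HP, BB, SO
    return {
        "SDT": (h - hr) / balls_in_play,
        "DO": do / balls_in_play,
        "TR": tr / balls_in_play,
        "HR": hr / contact,
        "HP": hp / ab_hp_bb,
        "BB": bb / ab_hp_bb,
        "SO": so / ab_hp_bb,
    }


# Hypothetical season line, for illustration only.
rates = batting_rates(ab=500, h=140, do=30, tr=4, hr=20, hp=5, bb=45, so=100)
for name, value in rates.items():
    print(name, round(value, 3))
```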
All Chained
Level  SDT   DO    TR    HR    HP    BB    SO
AAA    0.90  0.94  1.02  0.78  0.85  0.82  1.20
AA     0.88  0.91  1.03  0.75  0.74  0.73  1.26
A+     0.84  0.92  1.03  0.73  0.66  0.71  1.33
One bias in using all players is the “pinch hit penalty.” It has been shown that most players do not hit as well coming off the bench as they do when starting and playing regularly. The factors will be depressed by a disproportionate number of players at the higher level (particularly in the majors) playing sparingly. To account for this, I used only players who averaged more than 2.5 plate appearances per game at each level. This test is labeled “Min PA Chained.” Compared with “All Chained,” it makes the HR and SO factors, and to a lesser extent SDT, more favorable to the batter.
Min PA Chained
Level  SDT   DO    TR    HR    HP    BB    SO
AAA    0.91  0.94  1.02  0.80  0.84  0.82  1.18
AA     0.89  0.91  1.03  0.76  0.73  0.73  1.23
A+     0.85  0.92  1.02  0.74  0.66  0.71  1.30
The first two tests included many players who never reached the majors. If the MLEs are being used to judge how well a player will perform if and when he makes the majors, is it correct to base the factors partly on the records of players who failed to advance? In the third test, I produced a list of 368 MLB “rookies” from 2003 to 2008. I define a rookie season as one in which a player entered the season with 150 or fewer career major league plate appearances and had more than 150 during that season. Factors were calculated using only the records of these 368 players, in seasons where they had 2.5 or more PA per game at each level. These results are labeled “MLB Chained,” and show virtually the same factors at Triple-A as “Min PA Chained” (MLB Chained is limited to players who achieved at least 150 PA in their rookie season, while Min PA only requires 2.5 PA per game). It’s in the lower minors where larger factors favoring the batter are seen across the board.
MLB Chained
Level  SDT   DO    TR    HR    HP    BB    SO
AAA    0.91  0.97  1.01  0.82  0.82  0.82  1.18
AA     0.90  0.94  1.02  0.81  0.71  0.76  1.22
A+     0.87  0.99  1.00  0.88  0.66  0.76  1.24
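The two selection rules described so far can be sketched as simple predicates. The function names are hypothetical, and the 2.5 PA per game cutoff is treated as inclusive here, following the “2.5 or more” wording; treating it as strictly greater would be an equally reasonable reading.

```python
def is_regular(pa, games, min_pa_per_game=2.5):
    """True if the player averaged at least min_pa_per_game PA per game at a level."""
    return games > 0 and pa / games >= min_pa_per_game


def is_rookie_season(career_pa_before, season_pa, limit=150):
    """True if the player entered the season with `limit` or fewer career MLB PA
    and had more than `limit` PA during the season."""
    return career_pa_before <= limit and season_pa > limit


# Hypothetical players, for illustration only.
print(is_rookie_season(career_pa_before=40, season_pa=420))   # qualifies
print(is_rookie_season(career_pa_before=300, season_pa=500))  # too much prior MLB time
print(is_regular(pa=180, games=90))                           # 2.0 PA per game
```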
In case any bias or distortion remained in the method of chaining the factors through multiple levels, the fourth and final test compared the minor league records at each level directly to the MLB records compiled no later than one year after the player’s rookie season, without otherwise setting any maximum elapsed time. This is labeled “MLB Direct,” and it uses the same list of players and the same playing time criteria as “MLB Chained.” The differences are the direct comparison instead of chaining and, for the lower minors, a longer elapsed time between the records being compared, which builds more aging effects into the level factors. The Triple-A factors vary from “MLB Chained” in that the samples do not need to be within a year of each other. Again, HR and SO factors at Triple-A improve slightly for the batter, with larger gains in all categories in the lower minors.
MLB Direct
Level  SDT   DO    TR    HR    HP    BB    SO
AAA    0.92  0.98  1.01  0.85  0.85  0.82  1.15
AA     0.94  0.98  1.01  0.94  0.75  0.83  1.16
A+     0.93  1.03  0.99  1.08  0.72  0.83  1.14
At all levels, “All Chained” has the least favorable factors for batters, while “MLB Direct” is the most favorable. Using only players who reached MLB is always more favorable than using all players. There is little difference between methods for Triple-A, except in HR and SO. Going into the lower minors, the differences between chaining and direct comparison become more pronounced, as each level requires another multiplication to generate the final factors, which also then multiplies any biases that exist between each level.
Method          Level  SDT   DO    TR    HR    HP    BB    SO
All Chained     AAA    0.90  0.94  1.02  0.78  0.85  0.82  1.20
Min PA Chained  AAA    0.91  0.94  1.02  0.80  0.84  0.82  1.18
MLB Chained     AAA    0.91  0.97  1.01  0.82  0.82  0.82  1.18
MLB Direct      AAA    0.92  0.98  1.01  0.85  0.85  0.82  1.15

All Chained     AA     0.88  0.91  1.03  0.75  0.74  0.73  1.26
Min PA Chained  AA     0.89  0.91  1.03  0.76  0.73  0.73  1.23
MLB Chained     AA     0.90  0.94  1.02  0.81  0.71  0.76  1.22
MLB Direct      AA     0.94  0.98  1.01  0.94  0.75  0.83  1.16

All Chained     A+     0.84  0.92  1.03  0.73  0.66  0.71  1.33
Min PA Chained  A+     0.85  0.92  1.02  0.74  0.66  0.71  1.30
MLB Chained     A+     0.87  0.99  1.00  0.88  0.66  0.76  1.24
MLB Direct      A+     0.93  1.03  0.99  1.08  0.72  0.83  1.14
Now that we see how the factors compare to one another, how can we judge their relative accuracy? The purpose of the MLEs is to show how well a player in the minors will perform if and when he reaches the majors. I took the list of 368 rookies from 2003 to 2008 to see how well each of the methods translated their statistics at each level, compared to each player’s MLB records.
Tom Tango’s Marcel system was used to generate the baseline MLB records. Marcel uses three years of data, weighted 5/4/3. I generated the Marcels one year after each player’s rookie season, giving more time for the player to collect a sufficient sample size, while not going too far into the future, when the player’s skills might be somewhat different from when he entered the majors.
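The 5/4/3 weighting can be illustrated with a weighted rate over up to three seasons. This sketch covers only the weighting step; it leaves out the regression toward league average and the age adjustment that the full Marcel system applies, and the function name and sample numbers are hypothetical.

```python
def weighted_rate(seasons, weights=(5, 4, 3)):
    """Weighted rate over up to three seasons, most recent season first.

    Each season is a (numerator, denominator) pair, e.g. (HR, AB - SO).
    Returns None if there is no playing time at all.
    """
    num = sum(w * n for w, (n, _) in zip(weights, seasons))
    den = sum(w * d for w, (_, d) in zip(weights, seasons))
    return num / den if den else None


# Hypothetical HR and (AB - SO) totals for three seasons, most recent first.
hr_rate = weighted_rate([(20, 400), (15, 350), (5, 100)])
print(round(hr_rate, 4))
```

A player with fewer than three seasons simply contributes fewer pairs, since `zip` stops at the shorter input.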
I used three methods to test the accuracy:

Comparing the weighted means of each player’s projections and Marcel

Calculating the root mean square error between each player’s projections and Marcel

Calculating a similarity score, where the difference between each player’s projections and Marcel is expressed as a percentage of the standard deviation (t-score) of all players’ stats in each of the categories, and then using the Pythagorean Theorem to determine the “distance” in t-scores, in all categories together, from the projection to the observed.
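The second and third measures can be sketched as follows for a single category and a single player, respectively. The function names, the sample rates and the standard deviations are all hypothetical, and how per-player scores are combined across the test group is not specified here.

```python
from math import sqrt


def rmse(projected, observed):
    """Root mean square error between paired projections and baselines."""
    squared = [(p - o) ** 2 for p, o in zip(projected, observed)]
    return sqrt(sum(squared) / len(squared))


def similarity(projection, baseline, std_devs):
    """Euclidean distance between projection and baseline, with each
    category's difference scaled by that category's standard deviation."""
    return sqrt(sum(
        ((projection[c] - baseline[c]) / std_devs[c]) ** 2 for c in std_devs
    ))


# Hypothetical one-player example with two categories.
proj = {"HR": 0.030, "SO": 0.201}
base = {"HR": 0.037, "SO": 0.175}
sds = {"HR": 0.014, "SO": 0.052}   # made-up standard deviations
sim = similarity(proj, base, sds)
err = rmse([0.03, 0.05], [0.04, 0.04])
print(round(sim, 3), round(err, 3))
```

A score of 0 would mean the projection matched the baseline in every category; larger values mean the projection sits further away.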
Method          Level  pSDT   pXBH   pHR    pBB    pSO    eSDT   eXBH   eHR    eBB    eSO    vSDT   vXBH   vHR    vBB    vSO    Sim
All Chained     AAA    0.301  0.247  0.030  0.073  0.201  0.307  0.255  0.037  0.078  0.175  0.029  0.052  0.016  0.025  0.068  1.212
Min PA Chained  AAA    0.305  0.248  0.031  0.073  0.197  0.307  0.255  0.037  0.078  0.175  0.029  0.052  0.016  0.025  0.065  1.154
MLB Chained     AAA    0.304  0.255  0.032  0.073  0.197  0.307  0.255  0.037  0.078  0.175  0.029  0.053  0.017  0.025  0.065  1.158
MLB Direct      AAA    0.309  0.258  0.033  0.073  0.192  0.307  0.255  0.037  0.078  0.175  0.029  0.054  0.017  0.026  0.061  1.096

All Chained     AA     0.295  0.242  0.028  0.067  0.214  0.308  0.255  0.037  0.078  0.177  0.034  0.053  0.017  0.024  0.072  1.287
Min PA Chained  AA     0.298  0.243  0.028  0.067  0.209  0.308  0.255  0.037  0.078  0.177  0.033  0.053  0.017  0.024  0.068  1.217
MLB Chained     AA     0.299  0.250  0.030  0.069  0.207  0.308  0.255  0.037  0.078  0.177  0.033  0.053  0.018  0.024  0.066  1.184
MLB Direct      AA     0.313  0.260  0.035  0.075  0.197  0.308  0.255  0.037  0.078  0.177  0.033  0.056  0.020  0.026  0.059  1.047

All Chained     A+     0.286  0.235  0.024  0.065  0.228  0.309  0.255  0.037  0.078  0.177  0.038  0.061  0.019  0.025  0.089  1.580
Min PA Chained  A+     0.290  0.236  0.024  0.065  0.222  0.309  0.255  0.037  0.078  0.177  0.036  0.061  0.019  0.025  0.084  1.495
MLB Chained     A+     0.294  0.252  0.028  0.070  0.212  0.309  0.255  0.037  0.078  0.177  0.034  0.063  0.019  0.026  0.075  1.340
MLB Direct      A+     0.315  0.262  0.035  0.076  0.195  0.309  0.255  0.037  0.078  0.177  0.034  0.066  0.021  0.028  0.061  1.102

0.302 0.254 0.040 0.082 0.164
All of the methods underprojected HR and BB and overprojected SO for Triple-A, with “MLB Direct” slightly high on base hits and extra base hits, while the others were a little low. At Double-A and High-A, “MLB Direct” gives much the same projections for the test group as it did at Triple-A, while the others, which employed chaining, give progressively worse projections for the same players the more steps removed they are from MLB. I believe this is because each multiplication of one level to another required in the chaining process also multiplies any biases found between each level.
Despite the increased passage of time inherent in direct comparison of minor to major league statistics, as compared to chaining comparisons of data in consecutive seasons, the direct comparison method consistently gives the closest estimate of future MLB performance. In addition, direct comparison produces virtually the same MLB projection regardless of which minor league level was used in the calculation, whereas chaining produces projections that are increasingly in error the further down in the minors the data come from.