

July 17, 2009 Checking the Numbers: MauerQuest!
Prior to the 2006 season, during which Joe Mauer produced a gaudy .347/.429/.507 triple-slash line, no junior circuit catcher had ever won a batting title. Last season, Mauer once again found his name atop the batting-average charts thanks to a .328/.413/.451 showing, and this season he is in solid position for a third batting title, hitting at a .373/.447/.622 clip. The 26-year-old backstop has always been lauded for his plate discipline, gap power, and defense behind the plate, but as a hitter he has kicked the gears into overdrive this season, reaching base and smashing the ball around at Pujolsian levels. After going hitless in four at-bats against the Astros on June 21, Mauer's line rested at .407/.475/.727, naturally inviting speculation, similar to that surrounding Chipper Jones at this time last season, as to whether the .400 batting average could be sustained over the balance of the season. Mauer's playing time has thrown a wrench into the mix, however, mainly because catchers naturally accrue fewer plate appearances due to the rest prescribed against the normal wear and tear of the position, but also because he spent the first month of the season on the disabled list. These playing-time issues are generally considered a benefit to the Twins catcher, as an extremely high batting average is much tougher to sustain over a larger number of at-bats.
Probabilities have been discussed in this space before, per our earlier look into the likelihood that someone with as historically poor a record of reaching base as Pedro Feliz would shoot up to a .360 OBP through a third of the season. The methodology for determining how likely Mauer is to finish the season at or above .400 is a tad different. Mauer entered this season with a specific projection, and in order to calculate the probability of an event occurring, we need to reconcile what has occurred so far with preseason expectations.
Mauer is not a true .373 hitter, but with the season past the halfway point, it can be safely assumed that his estimate moving forward has increased. In order to find the probability that he will hit .400, we first need to wed the current stretch to our prior knowledge of his performance, with regression to the mean serving as best man, in order to determine his likely success rate in any given at-bat. As with any form of regression to the mean, the more we know about a player, the less weight the regression toward the league average carries. Even with the knowledge of Mauer's past performance, until he has an infinite supply of plate appearances we will never know if his exact talent level is an eHarmony match for his actual numbers. As more appearances are amassed, Mauer's own statistics become more prominent, much as we are more comfortable calling a 1,000-for-3,000 player a .333-level talent than we are saying the same of someone who went 1-for-3 in a single game, but the regression will always carry some weight. Players drastically above or below the mean in a particular area are likely better or worse than the average, but luck cannot be ignored as a factor. Regressing to the mean affords analysts the opportunity to bypass the luck factor to a certain extent and gauge a player's true level of effectiveness. Research from Tom Tango showed that the recipe for the true talent estimate of a player about whom no prior knowledge exists calls for the addition of approximately 400 league-average at-bats to the pot. With a league average of .269, right around the AL's batting average since 2007, Mauer would add 107 hits out of 400 at-bats to his current totals of 90 and 241, respectively. Therefore, if we knew nothing about Mauer other than that he had gone 90-for-241 to date in a league that hits .269, his estimated true talent level would be 197-for-641, or .307. For what it's worth, PECOTA pegged him as a .307 hitter entering the season.
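The regression step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as the article states it, not Tango's own code; the function name `regressed_average` and its parameter names are made up for this example, and the 107-hit ballast is the article's rounded figure for 400 at-bats at .269.

```python
def regressed_average(hits, at_bats, ballast_hits=107, ballast_at_bats=400):
    """Add league-average 'ballast' at-bats to an observed batting record.

    The defaults are the article's figures: 400 at-bats at a .269 league
    average, rounded to 107 hits. The function name is hypothetical.
    """
    return (hits + ballast_hits) / (at_bats + ballast_at_bats)


# Mauer's 2009 line to date, treated as if nothing else were known about him.
mauer_estimate = regressed_average(90, 241)
print(f"Mauer, no prior knowledge: {mauer_estimate:.3f}")

# The same ballast pulls a tiny sample almost all the way to the league
# average, while a large sample keeps most of its own batting average.
small_sample = regressed_average(1, 3)
large_sample = regressed_average(1000, 3000)
print(f"1-for-3: {small_sample:.3f}, 1,000-for-3,000: {large_sample:.3f}")
```

Both hypothetical players hit .333 on the surface, but the 1-for-3 player's estimate stays close to the league average while the 1,000-for-3,000 player's estimate stays close to his own record, which is the point of the comparison in the text.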
We do have prior knowledge about Mauer, though, given his marks of .347, .293, and .328 over the last three years. Progressively weighting the last three years introduces an additional 380 hits in 1173 at-bats. Therefore, to gauge Mauer's talent level at this point in time, we add together 90-for-241, 380-for-1173, and 107-for-400 to ultimately arrive at .318. Given the league, his track record, and his current performance, Mauer is a true .318 hitter, but one with an actual mark of .373. If his current 90-for-241 stretch were eliminated and we merely utilized the other components, the resulting talent level would hover right around .309; Mauer entered the season as a true .309 hitter and has since jumped up to .318 with his fantastic data to date. Of course, with information such as his balls-in-play rates, age, position, and physical type we could get much more granular, but for the purposes of our look today, the above methodology works fine. With the true talent in tow, the next step involves determining how many hits he would need in his remaining at-bats this season in order to hit .400. Due to the previously mentioned playing-time constraints on his year, I am comfortable estimating that Mauer will finish this season with right around 480 at-bats, meaning he would need a grand total of at least 192 hits in order to boast a .400+ batting average; since Mauer has already gone 90-for-241, he will need to record 102 hits in his final 239 at-bats in order to hit .400. It is tempting to plug the current .373 batting average into the calculation, since a success rate of .373 is much more likely to yield 102 successes in 239 chances, but doing so would be misleading. The .373 merely acts as one of several ingredients in the 'true talent' stew; it is not the sole determining factor in his likelihood of hitting .400, serving rather as a reviser of expectations.
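As a rough check on the arithmetic in this step, the pooling of components and the count of hits still needed can be written out as below. The helper names `pooled_average` and `remaining_needed` are invented for this sketch, and the 380-for-1173 prior is taken as given from the article rather than rebuilt from the individual seasons, since the exact weights are not stated.

```python
import math
from fractions import Fraction


def pooled_average(components):
    """Batting average of several (hits, at_bats) records added together."""
    hits = sum(h for h, _ in components)
    at_bats = sum(ab for _, ab in components)
    return hits / at_bats


def remaining_needed(target, season_at_bats, hits_so_far, at_bats_so_far):
    """Hits and at-bats still required to finish at or above `target`."""
    total_hits = math.ceil(target * season_at_bats)
    return total_hits - hits_so_far, season_at_bats - at_bats_so_far


current = (90, 241)               # 2009 to date
prior_three_years = (380, 1173)   # article's weighted 2006-08 figure
league_ballast = (107, 400)       # 400 at-bats at roughly .269

true_talent = pooled_average([current, prior_three_years, league_ballast])
preseason_talent = pooled_average([prior_three_years, league_ballast])
print(f"true talent now: {true_talent:.3f}, entering 2009: {preseason_talent:.3f}")

# A Fraction keeps .400 exact, so the ceiling is not thrown off by
# floating-point rounding.
target = Fraction(2, 5)
needed_480 = remaining_needed(target, 480, *current)
needed_600 = remaining_needed(target, 600, *current)
print(f"480 AB season: {needed_480}, 600 AB season: {needed_600}")
```

The 600 at-bat case is the full-season hypothetical the article turns to later; it is included here only because it reuses the same helper.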
The goal then becomes determining the probability that someone with a .318 success rate of getting a hit in a given at-bat would experience at least 102 successes in 239 chances. Since we are interested in at least 102 successes in 239 chances as opposed to at most, it actually becomes easier to input the reverse: the probability of at most 137 failures in 239 chances. The probability of a failure (an out) in any given at-bat would then be 1 - 0.318, or 0.682, making the formula equal to the cumulative distribution function: BINOMDIST(137,239,0.682,TRUE). The result will inform us of the likelihood that a player who makes outs in 68.2 percent of his at-bats would make outs in no more than 57.3 percent of the total number of chances in this experiment. Once entered, the formula outputs 0.000267, which equates to a 0.0267 percent chance that this occurs. Relative to odds, that equates to 3744 to 1, meaning that Mauer would be expected to hit .426 (the percentage representation of 102/239) over the theoretically remaining 239 at-bats this year just once every 3745 such stretches. So, you're saying there's a chance!? Now, if Mauer had been in action since the start of the season, holding all else constant, he would be more likely to finish the year with 600 at-bats. Running through the same formulas, Mauer, who starts at 90-for-241 in this hypothetical, would need to go 150-for-359 from that point on to end the season at .400 on the dot. The probability of at least 150 successes in 359 chances given a .318 success rate amounts to 4.222 x 10^-5, or 22,612:1. So, yeah, the lost at-bats certainly play into his favor, but the difference is essentially akin to comparing the likelihood Adam Eaton throws two straight no-hitters to his chances of throwing one, period.
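For readers without a spreadsheet handy, the binomial calculation can be reproduced with a short Python sketch using only the standard library. The functions `binom_cdf` and `prob_at_least` are written for this illustration and are not the spreadsheet's implementation; the first is meant to mirror what BINOMDIST(k, n, p, TRUE) computes, and the second sums the upper tail directly so the two routes can be compared.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


hit_rate = 0.318  # the article's true-talent estimate

# Route used in the article: at most 137 outs in 239 at-bats.
via_outs = binom_cdf(137, 239, 1 - hit_rate)

# Direct route: at least 102 hits in 239 at-bats.
p_480 = prob_at_least(102, 239, hit_rate)

# Full-season hypothetical: at least 150 hits in 359 at-bats.
p_600 = prob_at_least(150, 359, hit_rate)

print(f"480 AB season: {p_480:.6f} (via outs: {via_outs:.6f})")
print(f"600 AB season: {p_600:.3e}")
print(f"approximate odds against, 480 AB season: {1 / p_480 - 1:,.0f} to 1")
```

The two routes for the 480 at-bat case should agree up to floating-point rounding, which shows that flipping to failures is a convenience for the spreadsheet function rather than a different calculation.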
Nate Silver penned a post similar in content last season when Chipper Jones got off to his magical 92-for-219 start, running through a complex simulation in order to better replicate real-life working conditions, incorporating aspects such as the types of pitchers Jones would face. While such a simulation would be fantastic to utilize, we need not go to such great lengths in this instance, as the probability is going to remain incredibly low regardless, and it isn't as if Mauer will face extreme opposites of pitcher quality in every trip to the dish. He is in the midst of a remarkable season, one that might finally convince those with voting privileges to give him more credit than Justin Morneau, but he is not going to hit .400, even with the decreased playing time that comes with a missed month and catcher rest patterns. This should not detract from his seasonal merits in any way, shape or form, but let's not get too carried away with the .400 talk, especially given that he has already begun his slow climb down from that mark. A version of this story originally appeared on ESPN Insider.
Eric Seidman is an author of Baseball Prospectus. 19 comments have been left for this article.

newsense (5112) I think you underestimate Mauer's "true" level. If you take a Bayesian approach using his PECOTA as a prior, you get a "true" expectation based on his performance this year of a .332 batting average, which makes the probability of hitting .400 to be 0.135% or 1 in 740. Jul 17, 2009 10:57 AM

There was a ton of debate about this on Tango's blog when Nate's article came out last year, but ultimately I feel that the regression approach utilized here is the most accurate method. But that is one of the great things about the numbers: you can adjust them in this case and recalculate. I am much more comfortable in saying he is now a true talent .318 hitter with a .373 clip than using the PECOTA as a prior with a Bayesian approach. But it sort of boils down to sample-size taste: you might be convinced that with his 2006-08 seasons and current numbers that Mauer really is a .330 hitter, which he very well may be. I might be more inclined to regress him to the mean more than simply accept the .330 as gospel. Jul 17, 2009 11:02 AM

newsense (5112) Not to reopen the debate, but the chance of a "true" .318 hitter hitting .373 over half a season is pretty low (less than 4%), as opposed to about 10% for a .332 hitter. Jul 17, 2009 11:26 AM

Right, I'm not debating that at all. He very well COULD be a true talent .325-.330 hitter, but in going through the appropriate regression to the mean methodology we arrive at .318 right now. I don't have as big an issue with Nate's Bayesian approach last year as some others did, but this is a different approach. No matter what approach you use, however, you are going to get something well below 1 percent. Jul 17, 2009 11:34 AM

Sam F (11452) Great article. Two questions (I'll post separately): Jul 17, 2009 11:38 AM

Sam F (11452) Second, rather than regressing with league-average ABs, why not regress using ABs at the BA his LD% predicts (or BA predicted by regressing his Line, Fly, Ground rates to the league averages)? Jul 17, 2009 11:48 AM

With the second question first, are you referring to weighted regression towards some form of an expected BA based on the number of different balls in play hit? As in, if Mauer hit .347, .293, .328 from 2006-08, instead of adding in the 380-for-1173 (his weighted prior three seasons), find his EXPECTED HITS given the number of each batted ball multiplied by the 0.73, 0.24, 0.15 expected values, and divide that by the 1173? Jul 17, 2009 12:13 PM

Sam F (11452) That's actually not what I was thinking, although I wish I had... good idea to remove most of the luck from Mauer's historical ABs. Jul 17, 2009 17:04 PM

Rowen Bell (5629) Very good, Eric. Jul 17, 2009 12:14 PM

Rowen, I agree... now trade me Chase Utley. Is there such a thing as Strat-tampering? Jul 17, 2009 12:21 PM

sbnirish77 (17711) Hell ... Mauer is even putting up numbers comparable to Matt Wieters' PECOTA projection ... so you know he must be good. Jul 17, 2009 13:16 PM

Brian Cartwright (4519) But then you would be regressing Mauer's historical stats to a different set of Mauer's historical stats... and how do you know that the estimate of BA (or BABIP) from batted-ball components is any less lucky than the weighted historical record? Jul 17, 2009 17:15 PM

Brian Cartwright (4519) Sometimes when I click 'reply' it still puts the post at the very bottom... the above was in reply to Jul 17, 2009 17:20 PM

awayish (20768) Was that bit about calculating the failure rate really necessary, since this is not an intro stats lecture? Jul 20, 2009 09:55 AM

Let's not get carried away. Joe Mauer would need approx. 441 ABs to qualify for the batting title (assumes a 14% walk rate, his career avg.). Joe currently sits at 90 hits in 241 ABs; to hit .3995 at 441 ABs he needs to get 176 hits. Simply put, that means he would have to go 86 for 200 the rest of the season, a .430 avg. Ain't gonna happen. I don't care how big of a Joe Mauer fan you are.
Please read the article, not the little one sentence synopsis, before commenting.