August 20, 2009
Checking the Numbers: Two Out of Three Ain’t Bad
Forgive me a lapse of obviousness, but Albert Pujols is one of the greatest players of all time, the type of all-around talent I will take pride in declaring incomparable when describing his career to my future children. He makes an ample amount of contact, knocks the ball out of the yard at least 30 times a year, drives plenty of his teammates in, plays Gold Glove-caliber defense, and makes up for a lack of raw baserunning speed with smarts on the basepaths. This confluence of characteristics makes Pujols the perfect specimen, sort of like the baseball equivalent of the comic book character Deadpool.
It also makes Pujols the virtually unanimous choice as the most plausible Triple Crown heir to Carl Yastrzemski, the last man to accomplish the near-impossible feat, back in 1967. Sticking solely to the senior circuit, nobody has topped the leaderboards in batting average, home runs, and RBI in the same season since Joe Medwick did it for the Cardinals in the 1937 campaign.
Pujols is no stranger to the idea of a Triple Crown either, routinely finishing in the top five or ten in each category; however, as odd as it may sound, he has actually led the league in one of the three categories only once, when he led the National League with a .359 average back in 2003. That fact alone is a testament to the difficulty of attaining a Triple Crown: the premier talent of the league, one of the more balanced and tremendous players in the history of the sport, has led in a category on only that one solitary occasion. Tim Kurkjian attributed some of this to talent specialization, wherein players sacrifice some of their power output for an increase in batting average, or vice versa. Still, if anyone in the game were to realize a Triple Crown in the next few seasons, you would think that it would be Pujols. What are the odds that he actually wins it this year, given the fervent speculation pointed in that direction?
About a month ago Pujols led in both dingers and ribbies, and sat just a few points behind Hanley Ramirez on the batting-average front. In the relatively brief span from then up until Monday, however, Mark Reynolds has jimmy-jacked his way into contention for a Dave Kingman-like home-run title, Prince Fielder has knocked in teammates with reckless abandon, and Ramirez has increased the batting-average gap between himself and the reigning most valuable player. While Pujols might be the odds-on favorite to win a Triple Crown out of all active players, it certainly stands to reason that his projected end-of-season line (a .326 average, 53 home runs, and 148 RBI) might not win in any of the categories, let alone all three. Fat Albert has a solid shot at taking home the top prize in both the homer and RBI contests, but will likely have to settle for silver in the batting-average department. Assuming Ramirez lives up to his adjusted projection and finishes the year at .342, Pujols would need to hit roughly .384 the rest of the way to finish at .343 instead of his projected .326.
Suffice it to say, this isn't terribly likely. Translating the above into a definitive odds ratio also proves much tougher than our earlier look at Joe Mauer's chances of hitting .400, primarily because in this case we have to evaluate not just Pujols' performance, but the relative performance of other players as well. We can calculate the probability that Pujols hits .384 over his final 221 PA and ends up at .343, but that ignores how Hanley or others perform. For all we know, Hanley could continue his torrid pace and finish the year at .365, with Pablo Sandoval catching equal fire and ending his campaign at .345. In short, the methodology proposed here is going to produce ballpark results at best, as the underlying assumption will be that all players within reach of a categorical lead will play to their projection from here on out.
Essentially, to determine the overall probability of a Triple Crown using the aforementioned methodology, we need to calculate one of two things. In a category in which Pujols trails, it is the likelihood that he reaches or surpasses the PECOTA-projected threshold of the leader. In a category he leads, it is the likelihood that each of his closest competitors surpasses his title-worthy mark; the probabilities for these competitors are then added up and subtracted from one, and the result is Pujols's probability of winning that category.
For instance, PECOTA projects Pujols to end the season at 53 home runs, which would lead Reynolds by four, and Adrian Gonzalez and Adam Dunn by eight. Find the HR/PA rate for each of the players, the number of probable PAs remaining to them, and the number of home runs needed to reach at most 52. Mark Reynolds projects to step to the dish 200 more times this year, and would have a HR/PA of right around 0.070. With 38 long balls already in the books, he needs 14 to reach 52, so the Excel formula would be 1-BINOMDIST(14,200,0.070,TRUE). The TRUE stipulates that the player will experience at most 14 successes in 200 chances given a 0.070 rate; subtracting that result from one provides the probability that the player exceeds that threshold, in this case the likelihood that Reynolds hits more than 52 home runs.
For Reynolds, the resulting percentage stays strong at 42.9 percent, with Gonzalez at 3.9 percent, Dunn at 2.1 percent, and Ryan Howard at a mere 0.8 percent. Added together and subtracted from one, these give Pujols a 50.3 percent shot at winning the home-run title, provided that 53 home runs will, in fact, win him the title and everyone within striking distance plays to his projection. This process is then rinsed and repeated for the RBI title, but the only players with a realistic shot of winning the RBI title are Prince Fielder and Ryan Howard (note that when these probabilities were calculated, Pujols and Fielder were tied at 104 RBI).
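For readers without Excel handy, the BINOMDIST step can be sketched in Python using only the standard library. This is a minimal illustration rather than the code behind the numbers above: the helper names `binom_cdf` and `prob_exceeds` are made up for the example, and the inputs are the Reynolds figures quoted in the text (14 home runs, 200 PA, a 0.070 HR/PA rate) plus the four rival percentages as printed.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); plays the role of BINOMDIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def prob_exceeds(k, n, p):
    """Probability of MORE than k successes in n chances at rate p."""
    return 1.0 - binom_cdf(k, n, p)

# Reynolds: chance of more than 14 home runs in 200 PA at a 0.070 HR/PA rate.
reynolds = prob_exceeds(14, 200, 0.070)
print(f"Reynolds tops 52 HR: {reynolds:.3f}")

# The article's add-and-subtract step, using the rival percentages as quoted
# (Reynolds, Gonzalez, Dunn, Howard).
rivals = [0.429, 0.039, 0.021, 0.008]
pujols_hr_title = 1.0 - sum(rivals)
print(f"Pujols wins HR title (additive method): {pujols_hr_title:.3f}")
```

The Reynolds figure should land close to the 42.9 percent quoted above, and the additive step reproduces the 50.3 percent figure.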
The projections peg both Pujols and Fielder as 148 RBI men, with Howard a fair distance behind at 134. Utilizing the binomial distribution once more, Fielder emerges with a 56.3 percent shot at exceeding 148 RBI, with Howard at a mere 0.2 percent. This leaves Pujols with a 43.5 percent probability of winning the RBI title. Multiplying the two probabilities together (0.503 * 0.435 = 0.2188) gives Pujols a 21.9 percent chance of winning both the home-run and RBI titles.
The third leg, batting average, is bound to produce a much lower probability, given the vast gap between himself and Hanley, and the relatively limited time in which ground can be made up. Given his rate of walking and the batting average he would need to realize in the playing time remaining, Pujols basically would have to go 70-for-183 (with a .326 established talent level) in order to finish the season at .343, a mark that would best Ramirez's projection by a lone point of batting average. The resulting binomial probability, 3.4 percent, agrees that such an occurrence is very unlikely.
Multiply the 0.034 by the other two legs of the Triple Crown, 0.503 and 0.435, and Pujols ends up with a minuscule 0.74 percent chance of winning the Triple Crown this season, or odds of 134 to 1. That is an astounding number and odds ratio given that he may very well finish the year with a final line of .326/53/148. If the rest of the current season were replayed over and over again, on average Pujols would win the Triple Crown once every 134 replays. Again, this is not an exact probability, given the number of assumptions made based on the in-season projection and how the variables, the other players, throw wrenches into the probabilistic machine. Regardless, it is hard to fathom that the Triple Crown probability would increase past, say, two percent with a more accurate and time-consuming methodology.
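The arithmetic in this step is easy to check by hand, but a short Python sketch makes the chain explicit. It simply multiplies the three leg probabilities quoted above, which amounts to treating the legs as independent; the variable names are made up for the example.

```python
# Leg probabilities as quoted in the article.
p_hr_title = 0.503    # home-run title
p_rbi_title = 0.435   # RBI title
p_avg_title = 0.034   # batting title (70-for-183 the rest of the way)

# Multiplying the legs treats them as independent events.
p_two_legs = p_hr_title * p_rbi_title
p_triple_crown = p_two_legs * p_avg_title

print(f"HR and RBI titles: {p_two_legs:.4f}")          # 0.2188
print(f"Triple Crown:      {p_triple_crown:.2%}")      # 0.74%
print(f"Once every {1 / p_triple_crown:.0f} replays")  # 134
```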
I feel like a broken record in stating that the very low probability of achieving an historical feat should not take away from the season for the player in question, but that would be the understatement of the century for Pujols. Albert may win his third MVP award this season and, Triple Crown or not, he will remain one of the best players in history, one with another tremendous season to add to the back of his baseball card.
A version of this story originally appeared on ESPN Insider.
Eric Seidman is an author of Baseball Prospectus.
13 comments have been left for this article.

misterjohnny (925): Last night the "slow" Pujols stole 2nd in the ninth inning and scored the go-ahead run. I love this guy. Whatever needs to be done to win, he does it. (Aug 20, 2009 11:14 AM)

Dave Pomerantz (47849): The stolen base wouldn't have been enough to win the game. Freakin Russell Martin threw the ball into center field, allowing Pujols to take 3rd and score on a fly ball. (Aug 20, 2009 14:14 PM)

dtung (13775): Instead of subtracting from 1 the probability of each of Pujols' competitors passing him, shouldn't we multiply the probabilities of each competitor NOT passing him? In the case of HRs, this would be: .571*.961*.979*.992 = .533 or 53.3%. (Aug 20, 2009 11:17 AM)

jdseal (46813): Yes, that's exactly right. You can't just add the probabilities of the different competitors catching him, because there's a chance that 2 of them would catch him, which also means that there is a greater chance than (1 - sum) that no one does. To illustrate, what if 4 different guys each had a 30% chance of catching him (adds to 120%)? Would that mean Pujols has no chance of winning the title? (Aug 24, 2009 07:05 AM)

RedsManRick (23592): Regarding the interdependency of HR and AVG, I think we should consider them likely to be positively correlated in this case. Assuming a fixed contact rate, which we can do because we are looking at 1 player in isolation, HRs have the highest likelihood of being base hits (100%). (Aug 20, 2009 17:09 PM)

ZacharyRD (36601): Multiple people already commented on this, but treating these three crowns as independent events is clearly incorrect: not only are HRs and RBIs positively (and obviously) correlated, but if he goes on a hitting tear enough to win the AVG title, he's probably got the other two by default. (Aug 21, 2009 10:31 AM)

JPM16 (40201): Using an Excel function that I created, which measures the odds of Hanley getting exactly 0 hits * the odds that Pujols will have a higher BA than Hanley if Hanley gets 0 hits ... for all the possibilities, it gave Pujols just a 0.13% chance at the BA crown, without even figuring any other players. With this, I doubt he has better than a 1 in 2000 chance. (Aug 28, 2009 13:44 PM)

Mike M (24398): Several readers have commented on the interdependence of the three events. There's also an implicit error in the way Eric calculates the probabilities for the individual outcomes. As it stands now, the analysis takes Pujols' PECOTA-implied results as given, then computes the probability of Reynolds matching this number given his HR/PA rate. However, what one really wants to know is how likely Reynolds is to catch Pujols however many home runs Albert hits. This is the JOINT distribution of outcomes of two binomial distributions. (Sep 01, 2009 13:50 PM)
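The correction that dtung and jdseal describe for the home-run leg can be checked in a few lines of Python. This is only a sketch of their arithmetic: it uses the four "does not pass Pujols" probabilities from dtung's comment, assumes (as the commenters implicitly do) that the competitors' outcomes are independent of one another, and the variable names are made up for the example.

```python
from math import prod

# Probability that each rival does NOT pass Pujols, from dtung's comment
# (Reynolds, Gonzalez, Dunn, Howard).
not_passing = [0.571, 0.961, 0.979, 0.992]

# Article's method: one minus the SUM of the rivals' chances of passing him.
additive_estimate = 1.0 - sum(1.0 - q for q in not_passing)

# Commenters' method: PRODUCT of the chances that each rival fails to pass him.
product_estimate = prod(not_passing)

print(f"additive: {additive_estimate:.3f}")  # 0.503
print(f"product:  {product_estimate:.3f}")   # 0.533

# jdseal's thought experiment: four rivals, each with a 30% chance of catching him.
toy_additive = 1.0 - 4 * 0.30   # negative, which cannot be a probability
toy_product = 0.70 ** 4         # 0.2401
print(f"toy additive: {toy_additive:.2f}, toy product: {toy_product:.4f}")
```

The gap between the two methods is small here because only Reynolds has a sizable chance, but jdseal's example shows how the additive shortcut breaks down once several rivals are live threats.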

Eric, you've implicitly assumed (by multiplying the probabilities together) that winning the HR title and winning the RBI title are independent events. Since Pujols' HR total & RBI total are clearly positively correlated, that assumption isn't correct. Similarly, there would be some positive correlation (albeit very mild) between Pujols' HR/RBI totals and his final BA.
I think your ultimate conclusion remains valid, but it would be interesting to see (e.g., via a Monte Carlo simulation) a better estimate of the probabilities.
I think the effect could be much bigger than Rowan suggests. Sure, the chance that Pujols goes on a tear and wins the batting title is remote, but if he does he's almost sure to win the RBI crown as well. Similarly, in the unlikely event that he hit 20 more homers, the extra RBIs will almost certainly give him the RBI title.
The interdependence between Pujols' average and home run rate is less clear.
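To get a feel for how much the independence assumption could matter, here is a rough Monte Carlo sketch in Python of the kind the commenters suggest. It is a toy, not a baseball model: it keeps the article's two marginal probabilities (0.503 for the home-run title, 0.435 for the RBI title) and links them through a pair of correlated normal draws. The correlation value `rho` is purely an assumption chosen for illustration, as are the function name and the trial count; nothing in the article or the comments estimates it.

```python
import random
from math import sqrt
from statistics import NormalDist

def joint_title_probability(p_hr, p_rbi, rho, trials=100_000, seed=0):
    """Estimate P(HR title AND RBI title) when the two events are linked
    by correlated standard-normal draws with correlation rho (assumed)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Cutoffs chosen so each event keeps its stated marginal probability.
    hr_cut = std_normal.inv_cdf(p_hr)
    rbi_cut = std_normal.inv_cdf(p_rbi)
    both = 0
    for _ in range(trials):
        z_hr = rng.gauss(0.0, 1.0)
        # Second draw shares a rho-sized component with the first.
        z_rbi = rho * z_hr + sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        if z_hr < hr_cut and z_rbi < rbi_cut:
            both += 1
    return both / trials

independent = joint_title_probability(0.503, 0.435, rho=0.0)
correlated = joint_title_probability(0.503, 0.435, rho=0.6)  # rho is hypothetical
print(f"rho = 0.0: {independent:.3f}")
print(f"rho = 0.6: {correlated:.3f}")
```

With `rho` set to zero the estimate should sit near the article's 0.2188; any positive `rho` pushes the joint probability higher, which is the direction the commenters argue for. How far it moves depends entirely on the assumed correlation.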