May 18, 2005

Lies, Damned Lies

Can A-Rod and Pujols Beat Aaron's Record?

Joe Sheehan asked me, prior to his appearance to discuss Barry Bonds' future on ESPN's Outside the Lines last night, if I had any way to estimate the chance that Albert Pujols and Alex Rodriguez will break Hank Aaron's home run record. In other words, a version of Bill James' Favorite Toy, no doubt inspired in large part by PECOTA.

It's worth mentioning something before we proceed further. Though the Favorite Toy is one of James' more popular and accessible inventions, it has not, to my knowledge, been validated empirically. That is, while it produces some answers that look about right and can spark some lively barroom discussions, we have no way of knowing whether it is accurate.

My guess, actually, is that the Favorite Toy tends to overestimate the chance that a certain record will be surpassed, mostly because it doesn't account for the way in which problematic events in a player's career path tend to snowball. In other words, the Favorite Toy might estimate that, say, Ivan Rodriguez has a break-even chance of reaching 3,000 hits, based on an assumption that he will play about seven more seasons and average 140 hits per year (which would give him 3,031). The problem is that, if Rodriguez gets only, say, 90 hits in 2007, that likely indicates that something has gone seriously wrong with him (probably an injury), and would radically reduce his projection for future seasons. But if Rodriguez had a good year in 2007 and had, say, 170 hits, it would probably not substantially increase our estimate of his productivity in the years beyond that, as he'd still be on the wrong side of the aging curve.

A potentially more accurate way to go about estimating a player's chances of breaking a certain record is to examine comparable players, which is exactly what PECOTA does.
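The Ivan Rodriguez arithmetic above can be checked with a short sketch. The 3,000-hit target, the seven seasons, the 140-hit average, the 3,031 total, and the 90- and 170-hit seasons come from the text; the starting total is only implied by those numbers, and the function name `career_total` is a hypothetical helper. Holding every other season at 140 is a deliberate simplification: it ignores the snowballing described above, so the bad-year case is, if anything, too optimistic.

```python
# Figures quoted in the text.
TARGET_HITS = 3000
PROJECTED_CAREER_HITS = 3031
SEASONS_LEFT = 7
HITS_PER_SEASON = 140


def career_total(current_hits, season_hits):
    """Hypothetical helper: hits already banked plus assumed future seasons."""
    return current_hits + sum(season_hits)


# Not stated in the text; implied by 3,031 = current + 7 * 140.
current_hits = PROJECTED_CAREER_HITS - SEASONS_LEFT * HITS_PER_SEASON

# Baseline: seven seasons at 140 hits each.
baseline = career_total(current_hits, [HITS_PER_SEASON] * SEASONS_LEFT)

# Assumption for illustration: one season changes, the other six stay at 140.
one_bad_year = career_total(current_hits, [90] + [HITS_PER_SEASON] * 6)
one_good_year = career_total(current_hits, [170] + [HITS_PER_SEASON] * 6)

for label, total in [
    ("baseline", baseline),
    ("one 90-hit season", one_bad_year),
    ("one 170-hit season", one_good_year),
]:
    print(f"{label}: {total} hits, margin over target {total - TARGET_HITS:+d}")
```

Even before any knock-on effect on later seasons, a single 90-hit year is enough to push the total under 3,000, which shows how thin the margin behind a "break-even" estimate is.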
Of course it will require some care to do this properly, but intuitively it seems reasonable that, if we can identify a certain number of similar players, and a certain number of those players ended their careers favorably, then the player in question has about that likelihood of ending his career favorably.

PECOTA uses as many as 100 comparable players in order to form its estimates. For purposes of this exercise, we will restrict things to the top 20 comparables, as listed on Rodriguez's and Pujols' PECOTA cards. Here, for example, are A-Rod's best 20 comparables:

1. Dale Murphy
2. Mike Schmidt
3. Tony Perez
4. Sal Bando
5. Johnny Bench
6. Frank Robinson
7. Dave Winfield
8. Ken Boyer
9. Bobby Bonds
10. Gil Hodges
11. Pedro Guerrero
12. Eddie Murray
13. Doug DeCinces
14. Chet Lemon
15. Vern Stephens
16. Eddie Mathews
17. Dick Allen
18. Reggie Smith
19. Richie Sexson
20. Rocky Colavito

Intuitively you will recognize right away that comparables like Mike Schmidt and Dave Winfield--players who were productive well into their 30s--are favorable, while others like Pedro Guerrero and Dale Murphy are unfavorable. You may also have some questions about some of the comparables further down the list, and it's worth noting that the PECOTA comparables are motivated mostly by performance during a three-season period, and not over a player's entire career. If the goal of PECOTA were to produce career forecasts, rather than single-season forecasts, I might have done things a bit differently, and that is a limiting factor here.

Let's use Winfield as our example and think about how we might use his career to make some inferences about how Rodriguez is going to perform in the remainder of his. Winfield hit a raw total of 311 homers from age 29 onward. If we combine that number with the 381 that Rodriguez hit through age 28, we come up with 692, a fair bit short of Aaron. But this math underestimates the production implied for Rodriguez for a couple of reasons: