

February 8, 2012
Reintroducing PECOTA
The Weighting is the Hardest Part
To access the 2012 PECOTA spreadsheet, click here.
Madame Sosostris, famous clairvoyante, —T.S. Eliot, The Waste Land
BP’s projection system, at its core, follows the same basic principles as it has before. We begin with our baseline projections, which start with a weighted average of past performance, with decreasing emphasis placed on seasons further removed from the season being projected. Then that performance is regressed to the mean. After that, we use the baseline forecast to find comparable players (while also taking into account things like position and body type) and use those to account for the effects of aging on performance.
Every season we put PECOTA under the knife, looking for things we can improve to make sure we’re coming up with the best forecasts possible. Sometimes what we come up with is a minor tweak. At other times, though, what we unearth is not only more significant, but an interesting baseball insight in its own right, even aside from its inclusion in PECOTA.
This season, we’ve made some rather radical changes to how we handle the weighted averages for the PECOTA baselines: we still deemphasize past seasons, but nowhere near as much as we used to. With such a dramatic and counterintuitive change, we thought it best to give our users an explanation of what was changed and why so that they could correctly use and interpret the PECOTA forecasts.
Last year, I was asked to appear on a Chicago sports talk station to discuss the town’s two teams, in particular how PECOTA saw them faring. I said many things, most of which don’t bear repeating (or for that matter remembering) this far past, but there was one thing I remember saying, and it probably does bear repeating: I expected Adam Dunn to be the best hitter on the White Sox in 2011. Suffice it to say, this statement does not represent my finest hour as a baseball analyst. Consequently, I’ve spent a bit of time thinking about Adam Dunn and whether there was anything in 2010 or earlier that hinted he might be capable of a season like 2011.
In other words, is there anything that I know now about forecasting in general that would allow me to predict what happened using only what I could reasonably have known about Adam Dunn before the start of the season? The conclusion I’ve come to is that no, there really wasn’t. What happened to Dunn was, in essence, unforeseeable given what we knew heading into last season. That’s the bane of forecasting: no matter what you do, reality in all its many variations is always going to be able to surprise you.
Now it’s time to predict 2012’s stats, and PECOTA has learned from its mistake. No longer does it declare Dunn the best hitter on the White Sox. It has been humbled, dropping Dunn… all the way to second place, behind Paul Konerko. This is partly due to the fact that the White Sox are not a very good hitting team as currently constituted, having traded away Carlos Quentin during the offseason, but part of it is because PECOTA sees a far greater chance of the Adam Dunn that mashed baseballs for the better part of a decade showing up next year than the putrid Adam Dunn the White Sox saw in his first season on the South Side.
Naturally, some of you are going to look at PECOTA’s forecast for Dunn, think back to his abysmal season, and say, “I’ll take the under, thanks.” But PECOTA knows about his terrible performance just as we do; at its core, PECOTA takes past baseball statistics and applies a set of rules to them to come up with an estimate of what a player’s future statistics will be. If PECOTA is too optimistic about Adam Dunn, the culprit can be found in the rules governing the amount of emphasis to be placed on recent performance.
Of course, in tying myself so explicitly to Dunn, I run the risk that, to be blunt about it, he sucks again. I’m reminded of an article Ron Shandler wrote prior to the 2005 season, where he said:
Shandler probably should have left well enough alone; Pujols hit 41 home runs in 2005, and he’s never hit 50 or more home runs in a season. But it all comes down to the same set of questions: How much emphasis should we put on Dunn’s utter collapse, or on a young Pujols’ second-half power index? We don’t just have our eyeballs to rely on; we have decades of past baseball stats we can use to come up with an idea of how to weight baseball stats in relation to one another.
So, let’s build ourselves a forecasting model and see how various changes to the backweighting affect the forecasts, as well as try to determine the correct way to derive the backweights. For the sake of illustration, we’re going to use a much, much simpler model than PECOTA (it will remind many of you of the Marcels done by Tom Tango). To predict future TAv (from here on out, TAv_OBS), we will use three years of past TAv, where TAv_1 means one season prior to TAv_OBS, TAv_2 is two seasons prior, and TAv_3 is three seasons prior. The simplest model we can come up with is:
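Written out, and assuming the weights are each season’s plate appearances (PA_1 through PA_3, a notation assumed here from the PA weights discussed later in the piece), the simplest model would be a PA-weighted average along these lines:

```latex
\[
\mathrm{TAv}_{OBS} =
\frac{\mathrm{TAv}_{1}\,\mathrm{PA}_{1} + \mathrm{TAv}_{2}\,\mathrm{PA}_{2} + \mathrm{TAv}_{3}\,\mathrm{PA}_{3}}
     {\mathrm{PA}_{1} + \mathrm{PA}_{2} + \mathrm{PA}_{3}}
\]
```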
What we have here is a weighted average of a player’s TAv for the past three seasons. But let’s suppose that we want to downweight less recent seasons based on our intuition that more recent seasons are more reflective of a player’s current ability level. We would modify the formula as such:
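One way to write the modified formula, assuming yearly weights named W_1, W_2, and W_3 (hypothetical symbols) and PA_i for the plate appearances i seasons prior, is to scale each season’s PA by its yearly weight in both the numerator and the denominator:

```latex
\[
\mathrm{TAv}_{OBS} =
\frac{W_{1}\,\mathrm{TAv}_{1}\,\mathrm{PA}_{1} + W_{2}\,\mathrm{TAv}_{2}\,\mathrm{PA}_{2} + W_{3}\,\mathrm{TAv}_{3}\,\mathrm{PA}_{3}}
     {W_{1}\,\mathrm{PA}_{1} + W_{2}\,\mathrm{PA}_{2} + W_{3}\,\mathrm{PA}_{3}}
\]
```

With all three weights set to 1, this reduces to a plain PA-weighted average of the three seasons.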
So how do we come up with our yearly weights? What we can do (and what many other forecasters have done) is use an ordinary least squares regression to come up with weights for each prior season. The simplest way to do this is to use TAv_1 through TAv_3 to predict TAv_OBS in our regression. If we do so, we get:
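As a rough sketch of the fitting step only (not the article’s actual output), the regression can be run as a least-squares fit with no intercept. The data below is synthetic and noise-free, and the coefficients 0.5/0.3/0.2 are invented purely so the recast result is easy to check; the function names are hypothetical.

```python
import numpy as np

def fit_year_weights(past_tav, tav_obs):
    """Least-squares fit of tav_obs on the columns of past_tav, with no intercept."""
    coef, *_ = np.linalg.lstsq(past_tav, tav_obs, rcond=None)
    return coef

def recast(coef):
    """Rescale the coefficients so the most recent season equals 1."""
    return coef / coef[0]

# Synthetic demo data: 500 made-up players, columns are TAv_1, TAv_2, TAv_3.
rng = np.random.default_rng(0)
past_tav = rng.normal(0.260, 0.030, size=(500, 3))

# Invented coefficients (an assumption for the demo, not fitted values from real data).
made_up_coef = np.array([0.5, 0.3, 0.2])
tav_obs = past_tav @ made_up_coef  # noise-free, so the fit recovers them

coef = fit_year_weights(past_tav, tav_obs)
relative = recast(coef)
print(np.round(relative, 2))
```

On real data the columns would be each player’s three prior TAv values and the target his observed TAv; the recast step is what turns raw coefficients into the 1/x/y form used in the text.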
According to this model, the most recent season is nearly 1.5 times as predictive as the second-most recent season and over 2.5 times as predictive as the third-most recent season. Recasting the coefficients so that the first season is equal to one, I get 1/.6/.4. (This is similar but not an exact match to the weights used in the Marcels, which work out to 1/.8/.6.) [I’ve set the intercept to zero, because our weighted average formula lacks an intercept and this makes it a slightly more representative model, although the effect on the relative (rather than absolute) value of the weights is rather modest. If you include an intercept, it will essentially behave as the regression to the mean component of the forecast, which we’ll address separately in a moment.]
The trouble is that this kind of regression doesn’t truly model how the weights will be used in practice. From now on, we’ll call it our unweighted model. With a little bit of algebra, we can redistribute the simple weighted-average formula like so:
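Assuming the starting point is a plain PA-weighted average of the three seasons, with PA_i the plate appearances i seasons prior (a notation assumed here), the redistribution moves each season’s share of total PA next to its TAv:

```latex
\[
\mathrm{TAv}_{OBS} =
\mathrm{TAv}_{1}\cdot\frac{\mathrm{PA}_{1}}{\mathrm{PA}_{1}+\mathrm{PA}_{2}+\mathrm{PA}_{3}}
+ \mathrm{TAv}_{2}\cdot\frac{\mathrm{PA}_{2}}{\mathrm{PA}_{1}+\mathrm{PA}_{2}+\mathrm{PA}_{3}}
+ \mathrm{TAv}_{3}\cdot\frac{\mathrm{PA}_{3}}{\mathrm{PA}_{1}+\mathrm{PA}_{2}+\mathrm{PA}_{3}}
\]
```

The three PA fractions sum to one for every player, which matters for the scale of the regression coefficients discussed next.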
If there were no need for downweighting of past data, this would provide the proper weighted average we need for our forecasting model. For the sake of brevity, we will refer to TAv_1 × PA_1 / (PA_1 + PA_2 + PA_3), where PA_i is the player’s plate appearances i seasons prior, as TAv_1_W (for weighted), and so on. If we plug those into our regression model, we get some radically different weights:
These values are on a very different scale, since due to the lack of an intercept the values have to sum to one for the first regression and to three for the second regression, but they’re also very different in a more meaningful sense; recasting the first year to 1 (which is practically already done for us), we get weights of 1/.92/.90. In this second method, we get a result that seems contrary to our intuition: the most recent season is only slightly more predictive than older seasons.
How can we assure ourselves that the less intuitive model is still more correct? We can look to the regressions themselves for one piece of evidence. The r-squared of the first regression is .27, compared to .38 for the second regression. It’s also more consistent with the way the weights will actually be used in practice.
What’s interesting is that by themselves, the PA weights have no meaningful predictive value; by definition, they have to sum to one for every player, and including them in the regression as separate variables doesn’t do anything to increase the predictive power of the regression. It’s not the distribution of past playing time that’s affecting the model, but rather what that distribution tells us about the TAv values themselves.
Ideally, we’d compare both methods with known good values for what the seasonal weights ought to be and determine the correct method by whichever provides the more accurate results. But we don’t have known good values; if we had, we could’ve used those instead without messing around with any of this in the first place. While we can’t get known good values for real data, though, we can get known good values for fake data: in other words, a simulation. In this case, a simulation is startlingly simple to do; we assume that a player’s TAv_OBS is his true talent level and that all past seasons are equally predictive if PA are held constant.
Then we simply take a player’s PA in each of the three preceding seasons and use a random number to come up with TAv values for each preceding season that reflect a combination of a player’s true talent and random variance. (For those who care about the technical details: we generate a random number between 0 and 1, convert that from a percentage to a z-score, multiply by the size of the expected random variation (its standard deviation, assuming TAv is a binomial), and add that to TAv_OBS.) Running regressions on our simulated data, we get weights of 1/.8/.3 for our unweighted model compared to 1/1/1 for the weighted model. We constructed our simulation to behave as though player talent was absolutely stable from season to season, so we can confirm that the second set of weightings is correct here, which we couldn’t do with the first set of regressions that featured real-world data. The unweighted method, in this case, still downweights past seasons, which shouldn’t be the case.
There are three important practical takeaways from this finding. The first and most obvious one is that projection systems that dramatically emphasize a player’s most recent performance will be biased against players with poor recent results and toward players with good recent results. Players are more likely to bounce back from poor seasons or revert back to type after exceptional seasons than those sorts of models would predict.
The second is that three years is not enough data for a forecasting model to use. If you assume the Marcel weights are accurate, then it makes sense that older seasons wouldn’t add much value to your forecasting models. However, if the decline in value of older seasons is much more subtle than that, you can make good use of five or even seven years of data, if not more.
The third, and perhaps most important, takeaway has to do with regression to the mean.
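The simulation described above can be sketched in a few lines. This is a minimal illustration rather than PECOTA’s code: the talent spread, the PA ranges (chosen so that older seasons tend to have fewer PA), the player count, and the function names are all assumptions made for the example, and a direct standard normal draw stands in for the uniform-to-z-score step.

```python
import numpy as np

def simulate(n_players=50_000, seed=0):
    """Generate stable true talent plus three noisy past seasons per player."""
    rng = np.random.default_rng(seed)
    # Hypothetical true-talent distribution; this plays the role of TAv_OBS.
    talent = rng.normal(0.260, 0.025, n_players)
    # Hypothetical PA ranges: columns are 1, 2, and 3 seasons prior.
    pa = np.column_stack([
        rng.uniform(300, 700, n_players),
        rng.uniform(200, 700, n_players),
        rng.uniform(100, 700, n_players),
    ])
    # Binomial-style standard deviation of a rate observed over PA trials.
    sd = np.sqrt(talent[:, None] * (1 - talent[:, None]) / pa)
    # Past TAv = true talent + z-score * standard deviation.
    tav = talent[:, None] + rng.standard_normal(pa.shape) * sd
    return talent, tav, pa

def fit_no_intercept(X, y):
    """Ordinary least squares through the origin."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

talent, tav, pa = simulate()

# Unweighted model: regress talent on the raw past TAv values.
unweighted = fit_no_intercept(tav, talent)

# Weighted model: scale each season's TAv by its share of the player's total PA.
pa_share = pa / pa.sum(axis=1, keepdims=True)
weighted = fit_no_intercept(tav * pa_share, talent)

print("unweighted, recast:", np.round(unweighted / unweighted[0], 2))
print("weighted, recast:  ", np.round(weighted / weighted[0], 2))
```

With these invented PA ranges, the unweighted fit should downweight older seasons simply because they are noisier, while the PA-weighted fit should land near equal weights. The exact figures depend on the assumed inputs and are not expected to reproduce the 1/.8/.3 quoted above.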
We can add a simplistic version of regression to the mean to our forecasting model by adding a TAv_REG of .260 (the league average) with a PA_REG of 1200. (The PA_REG comes from the Marcels; it’s included here mostly for the purposes of illustration. The regression component in PECOTA is a more rigorous model based on random binomial variance; again, the purpose here is only to illustrate the concepts.) Consider a player with 650 PAs in three straight seasons, or 1950 total PA. Using the Marcel weighting of 1/.8/.6, that comes out to 1560 effective PA; in other words, throwing out 20 percent of a player’s PAs during that time period. That means about 56.5 percent of a player’s forecast comes from his own performance, and about 43.5 percent comes from the regression to the mean component. Using weights of 1/.92/.90 yields 1833 effective PA, throwing out only about six percent. Using the same regression component, that’s about 60 percent of a player’s forecast coming from his own production and only about 40 percent coming from regression to the mean. (And if you follow from the conclusions above and start using more years to forecast a player as well, even less regression to the mean is necessary.)
Regression to the mean is a valuable concept to keep in mind when forecasting, but increasing statistical power (in other words, the amount of data used to make a forecast) is a far better solution whenever possible. Discarding data (or in this case, downweighting it) in favor of regression to the mean is only advisable when there is conclusive evidence that the data being discarded or downweighted is less predictive.
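The effective-PA arithmetic above can be checked with a short sketch. The function names are hypothetical, and the regression step is the simplistic TAv_REG/PA_REG version from this example, not PECOTA’s actual regression component.

```python
def effective_pa(pa_by_season, year_weights):
    """Sum of each season's PA scaled by its yearly weight."""
    return sum(pa * w for pa, w in zip(pa_by_season, year_weights))

def own_share(eff_pa, pa_reg=1200):
    """Fraction of the forecast that comes from the player's own performance."""
    return eff_pa / (eff_pa + pa_reg)

def regressed_forecast(tav_by_season, pa_by_season, year_weights,
                       tav_reg=0.260, pa_reg=1200):
    """Weighted average of past TAv with pa_reg PA of league-average TAv added."""
    weighted_pa = [pa * w for pa, w in zip(pa_by_season, year_weights)]
    numerator = sum(t * p for t, p in zip(tav_by_season, weighted_pa))
    numerator += tav_reg * pa_reg
    return numerator / (sum(weighted_pa) + pa_reg)

pa = [650, 650, 650]  # most recent season first
for label, weights in [("Marcel 1/.8/.6", (1.0, 0.8, 0.6)),
                       ("1/.92/.90", (1.0, 0.92, 0.90))]:
    eff = effective_pa(pa, weights)
    print(f"{label}: {eff:.0f} effective PA, "
          f"{own_share(eff):.1%} from own performance")
```

For a hypothetical .300 TAv hitter in all three seasons, `regressed_forecast([0.300] * 3, pa, (1.0, 0.8, 0.6))` pulls the forecast part of the way back toward .260, and the lighter 1/.92/.90 downweighting pulls it back slightly less.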
As a result of its revamped weighting, PECOTA is going to be more bullish on players coming off a bad year and more bearish on players coming off a great year than many other forecasting systems. We’re okay with that. We believe that a full accounting of the historical data supports what we’re doing with PECOTA, and we think a forecasting system with a uniquely accurate outlook is more valuable than one that conforms.
UPDATED: Coming soon, we'll have a more in-depth look at how the new PECOTA stacks up, including RMSEs against Marcels. Some quick examples beforehand: the recent poster boy for “New PECOTA” would probably be Francisco Liriano, whose 3.60 PECOTA forecast for 2010 was almost identical to his real-life 3.62 ERA, while Marcels weighted his recent past (2009 was horrific, and many observers wondered if he'd ever return from his injury woes) and forecast a 4.88 ERA. The ERAs cited here were derived using a third-party version of Marcels (don't want anyone thinking we cooked the books), against the “New PECOTA” system applied retroactively. Some hits are obviously due to differences between the systems, such as Aaron Harang moving to PETCO (4.01 PECOTA, 3.64 real, 4.74 for Marcel, which doesn’t account for park effects). With other pitchers, it's just a matter of missing the least, such as when Mike Scott unveiled his nuclear splitter for 1986 (4.55 PECOTA, 2.22 actual, 3.79 Marcels). Usually, pitchers don't leap like Mike; their dramatic improvements are quirky statistical samplings which need to be included, but should be weighted little more than earlier seasons. A more recent example is Tim Redding, who posted ERAs of 5.72, 10.57 (in just 30 innings), DNP, 3.64, and 4.95, the last in 2008. PECOTA wasn't impressed with his recent exploits, and projected a 5.30 ERA, compared to 4.51 for Marcels. His actual 2009 ERA was 5.10, and his latest pitching exploits involved a combined 6.24 ERA for two Triple-A teams. (Rob McQuown)
Colin Wyers is an author of Baseball Prospectus.
150 comments have been left for this article.
I'm fixing that link now... Premium subscribers can access it through their "manage profile" link on the login bar. Feb 08, 2012 04:11 AM
PhillyPhreak (64216) I can't see it on the manage profile link. Oh well, I'll just sit tight. Although this means I won't get any work done today. Feb 08, 2012 04:22 AM
Wade (30207) what's up with the "directory access is forbidden" when trying to download the spreadsheet? Feb 08, 2012 04:29 AM
mjglenn (34786) If you go to "Depth Charts" and click "Raw CSV Data" you can get the 2012 PECOTA Spreadsheet. Feb 08, 2012 04:33 AM
All access problems *should* be cleared up now, though I'll verify 100% that the people who posted here are not blocked. Feel free to use the Contact form to tell me directly if you're still having trouble. Feb 08, 2012 05:06 AM
"manage your profile" is a link on top of the home page inside the blue login bar, just below the BP logo. It takes you to a page where you can view and edit information about your account, including your past few comments, password changes, and any downloads you're entitled to receive. Feb 08, 2012 05:08 AM
Egg on my face, I was wrong - one more setting which was blocking a few people. *Now* everyone should be fine. Feb 08, 2012 05:11 AM
apbadogs (9256) Jackpot!! So much for any productivity today at work!! WOOHOOO!!!! :) Feb 08, 2012 05:23 AM
kcshankd (1089) ...and thank you very much. Small complaint: please never list by first name. Pujols, Albert, or separate cells. My first action is always to sort alphabetically so I can search easily. Now, not so much. Feb 08, 2012 05:39 AM
Randy Brown (189) Agreed. Excel does have a workaround though. Insert a blank column to the right of the names, go to Data>Text to Columns, and check the 'space' button on the second menu screen. This will separate the first and last names into two separate columns. Feb 08, 2012 07:14 AM
kcshankd (1089) Did that after posting this morning and had the same result. Feb 08, 2012 20:40 PM
frampton (870) Copy the name field to the first empty column on the right side of the spreadsheet, and THEN run text to columns. You can then filter for non-blanks on the column to the right of the last name field, and tidy up 10 or so multiple-word last names. Feb 09, 2012 07:13 AM
nschneider (20893) The comparables always make me laugh. According to the spreadsheet, the Blue Jays big choice at left field this season is between the next Boog Powell, or the next Barry Bonds. Feb 08, 2012 05:50 AM
Update: Just added a few notes on new PECOTA vs. the Marcels from Mr. Rob McQwown. Feb 08, 2012 05:50 AM
thatfnmb (40377) I appreciate the effort, but have to say, this is not the PECOTA of old. The projections are a bit disappointing, especially pitcher projections. One clear bug is RP's moving to SP. Top 3 SP's in SO9 are Chapman, Sale & Moore. #7 is Bard, #12 Feliz. Those are reasonable if they're pitching 60 IP not 110+ Feb 08, 2012 05:58 AM
Actually, I was skeptical of some of the new PECOTA wrinkles myself, but it's actually quite similar to the "PECOTA of old" - in fact, Nate's early publications on PECOTA still apply almost in their entirety to the current PECOTA process. Feb 08, 2012 06:29 AM
tommybones (1168) Plugged in my 5x5 parameters into PFM and Mark Reynolds popped up us the most valuable 3B... don't know what to make of that. Feb 08, 2012 06:50 AM
backbrush (60277) What's the timing for the updated player cards? Feb 08, 2012 06:24 AM
phuturephillies (26368) Ryan Drese with the highest BREAKOUT score! Feb 08, 2012 06:28 AM
jtanker33 (47067) Not really a problem, but, in the interest of maybe spurring conversation or improvement, I see Corey Dickerson of the Rockies is projected for 14 HRs and a .487 slg (which ranks 21st above Youkilis and Pablo Sandoval) in 250 PA. I believe his big power numbers this season were inflated by his playing in Ashville in the minors. Does Pecota look at minor league park factors? I found this at BA: http://www.baseballamerica.com/blog/prospects/2011/08/dailydishthecuriouscaseofashevillescoreydickerson/ Feb 08, 2012 06:42 AM
McCormick Field is a dreamland for any lefty hitter with loft in his swing. 373 to dead center, 320 to right center, and 297 down the RF line, albeit with a 36 ft fence from the gap to the line. Feb 08, 2012 06:53 AM
wjmyers (4565) Kershaw's comparables include Matusz, Clay Buchholz, and Rich Harden. Wow. PECOTA likes him far less than I do. I think comparables are the biggest issue with PECOTA since Nate left. And that was the key driver behind PECOTA, as it was originally conceived, was it not? Feb 08, 2012 06:56 AM
Richard Bergstrom (36532) I've noticed the change too though I'll admit that I miss seeing Andre Dawson as a comparable to half the Dominican players. Feb 08, 2012 08:31 AM
rmelby (32378)
Shankweather (1804) At first glance, it looks like CF is the strongest position, beating 1B easily. Feb 08, 2012 07:23 AM
johnorpheus (60047) Freddy Freeman at 1 warp for the season sticks out most through a cursory glance through. I guess 17 defense will do that you. I find it hard to believe he will be that bad in the field. Feb 08, 2012 07:30 AM
jrmayne (1468) Quick hits, because I gotta go to work: Feb 08, 2012 07:41 AM
evo34 (33584) Regarding number 6, wouldn't an ideal projection system try to determine customized past season weights based on age/experience to handle the outliers (very old and young)? It appears PECOTA does not. That is, one would want to look at what the optimal weights are for a 22 year old vs. a 40 year old. There is almost no chance they will be the same as a average-aged player (recent perf. will in fact have more value, as will the rate of change), so why force them to be static in the projection system? Yes, I know that an aging curve is applied after raw rate stats are projected. But this assumes that all players age identically, and that there is zero information contained in the rate of change of past stats as to how this particular player is developing/aging. Fine for a mid-career guy, but not so good for players in the middle of a steep incline/decline. Basically, you want to take more seriously very recent changes in performance when players are of an age when very large, real changes are likely to occur. Feb 08, 2012 08:30 AM
I did some specific research into incorporating trends into a player's forecast. The results were not statistically significant, and somewhat counterintuitively they actually ran in an opposite direction from what you're suggesting - a player who was on an upward trend over multiple years of performance was actually more likely to *underperform* than overperform, relative to other players of the same age. Feb 08, 2012 09:34 AM
evo34 (33584) Did you take a look at very old and young players specifically? I would think that if a 39 year old player's stats are falling off a cliff over last 3 years, it would make sense to use a more aggressive aging curve than the curve used on an average 39 year old. Feb 08, 2012 10:16 AM
I did look at very old players specifically - I'd have to dig up my notes to see if I did the same for very young players, but I believe I did. Feb 08, 2012 11:19 AM
jberkon (28225) When will depth charts be viewable? I get this message ("Depth Charts will return in 2012") when I go to http://www.baseballprospectus.com/fantasy/dc/ Feb 08, 2012 07:50 AM
Hoff (37596) you can look at the individual team DC's for now, but they look like they're a bit of a mess. Or is the plan going forward to list by batting order spot? That would not be useful. Feb 08, 2012 07:55 AM
jberkon (28225) How do you look at individual team DCs? Feb 08, 2012 08:31 AM
We're still putting some finishing touches on the depth charts so for now they've been disabled. Feb 08, 2012 08:35 AM
naehring (786) Where is VORP for hitters? Many are using this for fantasy purposes and while including WARP is good, it includes BP's estimation of the players defense. In all fantasy formats, defense is either not included (standard roto, point formats), or included in a different way (scoresheet, strat, etc). It is difficult to use this to rank hitters if you have to try to back out fielding from WARP. I posted a similar comment last year and was quickly rewarded with the VORP column in an updated spreadsheet. Here's hoping for similar excellent service this year. Feb 08, 2012 07:59 AM
jj0501 (60272) How do I get to the new Depth Charts ? Is there a link ? Feb 08, 2012 08:02 AM
hessshaun (41493) Chris Sale is coming up for me as pitching 168 innings and striking out 198. Strasburg is projected to throw the same amount of innings with 184. Feb 08, 2012 08:25 AM
PhillyPhreak (64216) I can only imagine that when you were putting these projections together that everyday you see one more card. Feb 08, 2012 08:27 AM
evo34 (33584) The ten-year forecasts look poorly smoothed at best, poorly conceived at worst. E.g., Kershaw's 10-year eqERA goes: Feb 08, 2012 08:39 AM
derekdeg (65228) Is the PFM up? I thought I saw it active about an hour ago and now it's not. Feb 08, 2012 08:40 AM
No, you're not dumb. It was active but wasn't using the current year data. We've disabled it so there's no confusion while we put some finishing touches on the program. Feb 08, 2012 09:03 AM
Randy Brown (189) Not so fast there Joe. If there is one thing I've learned around here over the years, it is that Jaffe has a fantastic mustache. But if there are two things, it is that correlation does not equal causation. The current functionality of PFM does not provide sufficient evidence to determine whether or not derekdeg is in fact dumb. Feb 08, 2012 10:23 AM
tbwhite (361) Before I raise some questions, I'd like to say that I enjoyed the article and the extra transparency into the process. I think that there is room for improvement still, but I appreciate and enjoyed the article. I'm also happy to hear about the downplaying of regressing to a ML mean, that always bothered me, essentially by weighting previous seasons more it feels like you replacing regression towards the ML mean with regression towards the specific player's mean. Feb 08, 2012 08:41 AM
evo34 (33584) I think Boras should use PECOTA aging curves in his next negotiations. Old guys get older, but their performance apparently stays about the same. Torii Hunter will apparently be a league-average hitter well into his mid-40s. But it won't even be a market inefficiency for teams to exploit, as the entire league will be filled with 45-year-old .260 TAv hitters by 2018. Feb 08, 2012 08:55 AM
pakdawgie (27451) Looking forward to seeing the PECOTA player cards - for me by far the most interesting thing about PECOTA is to see the whole range of outcomes to guage things like upside potential. I have trouble knowing exactly what to do with single deterministic projections. Feb 08, 2012 09:09 AM
Quick note...I'm putting together a list of all your questions, concerns, and inquiries. We will attempt to address all issues either here in the comments or in a separate post (or possibly a FAQ page). Feb 08, 2012 09:11 AM
doog7642 (3522) I'm glad to see the Breakout scores for hitters are far more substantial than last year's iteration, but I'm still surprised that breakout scores for the top pitchers are more than double those for the top hitters. Feb 08, 2012 09:11 AM
doog7642 (3522) Kind of a bummer that neither singles nor total hits are included for hitters. I can discern the numbers from avg. and PAs, but it's kind of a pain. Feb 08, 2012 09:14 AM
Robert Bishop (63541) This is my first year messing with PECOTA, so maybe I'm missing something obvious...is there no way to sort by position? Feb 08, 2012 09:30 AM
Gordon (1198) Was there any attempt to incorporate injury data into the Pecotas? I think that it would be a difficult thing to do, but in the case of Tommy John procedures it might work: there are good data out there as to who got the operation, when they got it, and there seems to be some evidence of a season-long increase in walk rates as they pitch and recover more. Feb 08, 2012 09:52 AM
Gordon (1198) Breakout rates for many batters seem really low. 2% for Dominic Brown? Feb 08, 2012 09:54 AM
The primary input into the PAs for players not in the depth charts was past MLB playing time. For players with little or no MLB experience whatsoever, we put in a "floor" of 250 PAs for the sake of readability. Again, those players are not in the depth charts, so in reality we're not expecting them to play in MLB much if at all. There's a column in there called DC_FL which will tell you if a player's PA forecast comes from the depth charts or from historic playing time only. Feb 08, 2012 10:00 AM
Richard Bergstrom (36532) Ya know spring training is about to start when the questions about PECOTA come out... Feb 08, 2012 10:22 AM
Hi all, we've got a post up on current Fantasy status... we'll keep it updated as events warrant. Please drop by http://www.baseballprospectus.com/article.php?articleid=15999 if you've got a fantasy question or comment. Feb 08, 2012 10:31 AM
Richard Bergstrom (36532) "Some quick examples beforehand: the recent poster boy for “New PECOTA” would probably be Francisco Liriano, whose 3.60 PECOTA forecast for 2010 was almost identical to his real-life 3.62 ERA, while Marcels weighted his recent past (2009 was horrific, and many observers wondered if he'd ever return from his injury woes) and forecast a 4.88 ERA. The ERAs cited here were derived using a 3rd-party version of Marcels (don't want anyone thinking we cooked the books), against the “New PECOTA” system applied retroactively." Feb 08, 2012 11:00 AM
RonEckstein (64808) I'm sure it's in front of my nose and I'm missing it, but where can I find a glossary and how the numbers are reached? Feb 08, 2012 11:20 AM
hi Ron, good question Feb 08, 2012 11:59 AM
KJOK (31016) If I understand correctly, for hitters a 1/.92/.90 weighting out to about 7 years in the past is used. Feb 08, 2012 12:22 PM
David Greene (25846) Someone ought to say it: The title of this article, "The Weighting is the Hardest Part," cracked me up. Verrry clever! Feb 08, 2012 13:19 PM
mdthomp (65017) New to Pecota as well, is it me are do they seem a little on the conservative side? Feb 08, 2012 14:07 PM
rreading (63656) Thanks for this! Feb 08, 2012 14:17 PM
amazin_mess (9525) Check out www.tpfs.com. Our email tripplay16@aol.com Feb 08, 2012 14:42 PM
mariotti (47800) I really like BP, but I have been disappointed with the PECOTA ratings for the past few years. Some of the projections have been real head-scratchers, and this year's ratings are no exception. Does anyone really think Erik Bedard will have the same WHIP as Clayton Kershaw, or that David Wright will have a higher TAv than Jose Bautista? Results like that are so off that it makes me question PECOTA as a whole. Feb 08, 2012 14:26 PM
For everyone with PECOTA questions, Colin will be chatting tomorrow at 1:00 PM ET http://www.baseballprospectus.com/chat/chat.php?chatId=897 Feb 08, 2012 15:21 PM
Nate Meyvis (61752) I am amused that PECOTA is trusting enough to believe that Miguel Cabrera is a 3B! Feb 08, 2012 15:39 PM
MGL (2121) Downloaded the spreadsheets. For batters, there is no hits or singles column that I can find (I assume B2 and B3 are doubles and triples - is it too much trouble to use the letters D and T or 2B and 3B?). Am I missing it? I can almost infer them from the BA and PA but there is also no AB column and it is not clear if AB is PA-BB (there are no SH and SF and ROE, etc.). Feb 09, 2012 00:51 AM
Projections are for the listed team's expected park factor - so a minor league player that's listed with the Rockies has a forecast for half his games in Coors Field, for instance. BB includes IBB but not HBP. Feb 09, 2012 07:46 AM
mlive78 (34591) FYI...until they release a new version with hits and/or singles included, I used SLG and AVG to calculate it on my own. Use the following formulas to calculate singles (B1) and at-bats (AB). (Just insert a new column, use the formula, and paste it all the way down the column.): Feb 09, 2012 23:12 PM
PhillyPhreak (64216) Maybe someone has said this above but I'd really like SP and RP designations too. Feb 09, 2012 04:11 AM
chrisgoddu (3352) Can you include SS/SIM column in the spreadsheet. Many of us are starting scoresheet drafts and that would be helpful. Feb 09, 2012 05:25 AM
tomterp (32514) I'm thinking that giving Matt Stairs 250 PA's for the Nats this year was just a way of finding out if any Nats followers are paying attention. I would be stunned if an Indy team gave him that kind of playing time, and I am 100% sure he gets zero PA's with the Nats this year. Feb 09, 2012 05:50 AM
joepeta (35285) Ignore any player with an "F" (like Stairs) DC_FL. It's a true/false field for Major League roster presence. Anyone with an "F" is defaulted to 250 ABs (with minor exceptions I can't explain like Posada and Ibanez.) If you do a complete projection on Washington, ignore the guys with "F"s and you will see a reasonable team projection. Feb 09, 2012 08:43 AM
TangoTiger (57181) For the record: Feb 09, 2012 06:31 AM
fairacres (1980) It strikes me that PECOTA has a very conservative bias. Every year, as I peruse the projections for players, particularly established players, I mentally think "would I take the over or under on that stat line" and while I have not recorded those thoughts systematically, my sense is that each season, I feel like I would take a lot more "overs" than "unders." Feb 09, 2012 06:41 AM
joepeta (35285) Fairacres is onto a topic here that I hope Colin addresses either here or on the chat. If you run a simulation for each team based on the component players, PECOTA is calling for total scoring in 2012 to be 21,353 runs. That's a not unreasonable increase from the 2011 total of 20,808. Feb 09, 2012 08:52 AM
molokai (2744) Would just like to say any spreadsheet would be more useful if you had Berkman, Lance or a column for last name then first name. Sure it only takes a few more minutes for us to text to column but then we have to weed out the Wily Mo Pena's. Feb 09, 2012 08:53 AM
MGL (2121) "...BB includes IBB but not HBP." Feb 09, 2012 13:18 PM
joepeta (35285) Colin, Feb 09, 2012 15:16 PM
saint09 (47194) Not sure if this is the forum for a PFM issue, but I continue to see Kevin Youkilis listed as a 1B and not a 3B... either Gonzo was traded, my system is acting up... or PFM is off? Feb 09, 2012 17:58 PM
Tommy Fastball (19193) All I see in the hitter spreadsheet is 2011 stats. What am I doing wrong??? Feb 09, 2012 22:35 PM
Hi, Tommy. Please download the PECOTA Weighted Means spreadsheet again from the Fantasy page. We have corrected the year on the hitters sheet. Sorry for the confusion. Thanks for your support. Feb 10, 2012 08:44 AM
fieldofdreams (9235) So Jonny Venters has pitched 171 innings in MLB with 189 Ks and a 1.89 ERA, yet PECOTA thinks he'll strikeout only 62 in 74 IP this year with a 4.07 ERA. What am I missing here? Feb 10, 2012 08:11 AM
mlive78 (34591) Any chance we can get SH, SF, and GIDP included for hitters in the Pecota Spreadsheet? Those categories are already incorporated into your PFM tool, so I would think they should be relatively easy things to add. I would also like to see IBB, if possible. On the pitching side, GIDP projections would be helpful there, also, as well as breaking out the Hits against category into 1B, 2B, 3B, and HR (obviously, HR is already done). Feb 12, 2012 12:07 PM
jbergey11 (64163) In regards to Dunn. He was awful the entire year and showed no improvement from month to month at all. I wonder if he didnt forget how to hit. Apparently he has slimmed down so we shall see. Feb 13, 2012 04:07 AM
jbergey11 (64163) It would be nice if this was sortable. And came with the positions separated. Feb 13, 2012 07:38 AM
WilliamWright (4460) I noticed you haven't corrected Fausto Carmona's numbers yet. First off, he's now Roberto Hernandez Heredia, Age 31, and based on his legal issues probably won't approach 169.33333 IP Feb 13, 2012 12:08 PM
mlive78 (34591) Should I be concerned that PECOTA is projecting the Houston Astros to lead the NL Central in scoring by a LARGE margin? Feb 13, 2012 13:04 PM

Access is forbidden...and when you go to the fantasy page, it gives you the 2011 pecota. I almost had a heart attack when I saw K Escobar expected to win 10 games for the Mets again!