Let’s jump right into it: the PECOTA Weighted Means Spreadsheet, Depth Charts, and PFM were updated Sunday, February 20, with the following:
* hitter projections are virtually unchanged other than the addition of Upside.
* pitcher improvements: Clay has found and accounted for the condition causing the 75th percentile weirdness in pitcher PECOTAs the last time around. The pitcher PECOTAs available in the latest run are now stable at their 50th percentile. This probably concludes our major modifications for pitcher projections for 2010, though of course the playing time projections will continue to develop as roles are firmed up during spring training.
* pitcher Upside is available for some pitchers now; it’s being run more or less in descending 2010 projected value order, so the Upside for the pitchers you will be most interested in is there. Everyone else is coming on a Wednesday, February 24 update.
* SSSim fans, we feel your pain. We do not have an ETA on the availability of this stat. It is high on our list of things to do.
* higher still are the PECOTA cards. We have the data generated and are in the process of building graphs and cards for hitters. They will be ready for beta release on Thursday, February 25. Pitchers will follow. As previously threatened, the cards have been improved more this year than any release since their initial design, and we’re excited to see what you think.
On a related topic, we’ve seen statements on the internets that the team triple-slash batting stats don’t mesh up with expected runs scored in the depth chart projected standings. I want to take this opportunity to categorically confirm these claims. The issue is not with the PECOTA projections themselves, but how they’re playing with the depth chart process. Our newest full-timer here at Baseball Prospectus will be addressing this issue later in the week… more on him in a day or two.
Thanks for your patience, and if you have a question or issue that hasn’t been addressed yet, please let us know and we’ll do our best.
This is definitely good news. I think there has been some confusion the last few weeks about how much of the difficulty comes from the underlying PECOTA estimation process (e.g., using a broader set of comparables), how much from possible issues in getting the new code to run PECOTA as translated from Nate Silver's Excel files, and how much from the PFM.
But when you guys get to a point where you're willing to say "this is it," we're going to be a lot happier. Besides, we have something very new to look forward to in the reformatted PECOTA cards.
I just received The Book, and am confused regarding VORP & PECOTA data on a number of levels.
Why are "in the Book" 2009 VORP scores different from those at the BP stats page? E.g., Miguel Cabrera's 2009 VORP in the Book is 40.3, but on the BP stats page Batter Season Standard report it is 55.9.
Also, projected VORP scores differ between the download spreadsheet and the book. E.g., Miguel Cabrera's 2010 VORP projection in the Book is 45.8, but it is 44.1 in the download spreadsheet.
Granted, this is only one player (the first I looked up) but this seems to be a wide variance, and I need some guidance in how to interpret all this. Is the book data correct? Is the Stats Page data correct? Is the PECOTA weighted means data correct? They can't all be correct, can they?
As far as the data goes, you have to remember that the book version of PECOTA is the earliest run. This isn't the first year that they are different, I just think people are scrutinizing it more due to the other things that have popped up. We continue to optimize PECOTA as the season nears in order to get you the most relevant information--the book still has loads of value, as you get an early look in addition to the hundreds upon hundreds of player comments, that, in many cases, are more meaningful than the data from any projection.
The spreadsheets don't necessarily take playing time into account to the degree of accuracy that the depth charts do, so you're bound to see differences on a counting level. If you're looking for the most accurate representation of what will happen, playing time wise, then utilize the depth chart projections in conjunction with the commentary found for each player within the annual. That's how I use it, and I find it's the best use of both resources.
Regarding the PECOTA projections, I can certainly understand the rationale for the variance in the VORP projections. What I don't understand is the almost completely different list of comparable players, which shouldn't be a function of depth charts I wouldn't think (especially for an established full-time guy like Miguel Cabrera).
As far as the 2009 actual VORP scores, that I don't understand at all. The past shouldn't change, so seeing a 2009 final VORP difference of over 15.0 for Miguel Cabrera (it's a nudge over 40.0 in The Book, and 55.7 on the Batter Season report) is very unsettling. (http://www.baseballprospectus.com/statistics/sortable/index.php?cid=78210)
Upside doesn't compare a player with how much he can improve. This is from the glossary:
"UPSIDE is determined by evaluating the performance of a player's PECOTA comparables. If a comparable player turned in a performance better than league average, including both his batting and fielding performance, then twice the number of runs he contributed above average is counted toward his UPSIDE. If the player was worse than league average, or he dropped out of the database, the performance is counted as zero."
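Read literally, the glossary rule quoted above comes down to a few lines of arithmetic. The following is only a sketch of that reading, not BP's actual implementation: the function name and the input format (one runs-above-average figure per comparable, with `None` for a comparable who dropped out of the database) are assumptions made for illustration, and any weighting of comparables that PECOTA may apply is ignored.

```python
from typing import Optional, Sequence


def upside(runs_above_average: Sequence[Optional[float]]) -> float:
    """Sum UPSIDE contributions over a player's comparables.

    Each entry is one comparable's runs above league average (batting
    plus fielding combined). None marks a comparable who dropped out
    of the database. Both names and format are hypothetical.
    """
    total = 0.0
    for raa in runs_above_average:
        if raa is not None and raa > 0:
            # Above-average performance: twice the runs above average.
            total += 2 * raa
        # Below-average performances and dropouts contribute zero.
    return total


# Hypothetical comparables: +10 runs, -5 runs, dropped out, +2.5 runs
print(upside([10.0, -5.0, None, 2.5]))  # 25.0
```

Because bad outcomes are floored at zero rather than subtracted, a comparables list with a few big seasons can produce a high figure even when most of the list is mediocre.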
Upside is based on the player's comparables. Wieters' comparables are the not-so-stellar group of Kurt Suzuki, Donnie Scott, Eddie Taubensee, and Gerald Laird, so it looks to be an issue with the comparables.
I LOVED Ed Taubensee. I know this has no bearing on the PECOTA conversation, but in case Ed is a stathead I just wanted to let him know plucky LH backup catchers are getting a little love from this corner :-)
That would require an extreme and extremely unlikely fluke. It's not as if there are just a few players in the comparable list. BABIP influence would likely balance out in the set of comparables taken together.
ARod's quite different in the book and on the spreadsheet.
Some of the hitter projections are difficult to understand. PECOTA uses last three years, right? Luis Castillo's last three:
301/362/359
245/355/305
302/387/346
Castillo's 34, and we get a projection of:
288/373/373
I guess the OBP is justifiable, but I have a tough time with that slugging percentage. There are a number of other projections that look very odd; Kelly Johnson's expected career year sort of jumps out. Brian Roberts is expected to do better at the age of 32 than he did at 29-31... this seems like a pretty serious red flag. I don't remember these sorts of outlier projections for older players before.
Maybe I'm wrong. Maybe we should project a year for Brian Roberts that's better than all of the past three years. But I'm inclined to believe a more conservative projection (CHONE has Roberts with 50 points of OPS fewer than PECOTA) that doesn't project Roberts to hit better than he has for the past three years. It seems to me that something's wrong. (Reversing the effect of the very good pitching in the AL East?)
I'm more concerned that Roberts has gone 50-40-30 in the SB department the last 3 years, and now is supposed to bump back up to 45, based on...? I think the SB issue is greatly overinflating his value in PFM.
Agree that something looks goofy on the Upside projections. I can't put my finger on last year's projections, but in 2008 there were 17 players at 200 or higher. This year there are 342. Also, wondering on the position algorithm. I completely get the positional difference adjustments, but Iannetta, C. and Lucroy, J. are higher than Reyes, J., Heyward, J., and Braun, R.? I love catchers and all, but that just doesn't seem right.
For Castillo, it could be that players add power as they age, gaining old hitter stats. Alternatively, it could be something about the new park. Kelly Johnson's probably all park-related, plus the fact that he lost his job at the very nadir of his season, despite doing okay in the second half and having a good prior year.
thanks for the update and for addressing many of our concerns.
Can we expect a correction to hitters' upside within the next couple of days as well? As a couple people have pointed out already, these numbers (and the rankings) seem way off. Unlike all the qualitative evaluations of prospects, UPSIDE allows us to quantitatively compare prospects not just within the same year but between years as well. This is lost with these new inflated numbers.
Dave, thank you for acknowledging the importance of getting SSSIM on board as soon as possible. Scoresheet drafts are happening right now, and unless you're able to get it on soon, this entire year will have been lost, with regard to using BP's projections to aid in Scoresheet. Would be a difficult thing for paying customers to accept, when it's an advertised component of PFM, and PFM is part of the incentive to buy a premium subscription. Thanks again for your work pushing that forward.
This. And it happens every year. Maybe someday I'll learn... in the meantime, one draft starts Friday for me, and the other Tuesday. If the SSSIM numbers get in by the end of next week, at least they can help me with the tail end of my drafts.
It's really disappointing. I'm not trying to be hard on you guys. Just trying to state very clearly the reality that we paid for this because of an important feature, and when that feature is not available, it's very difficult to accept.
I could be wrong, but I imagine that a very significant portion of Baseball Prospectus' customers subscribe in part because they want to use BP for their fantasy baseball team.
I'm not sure that the people at BP fully realize this.
I say this not just as a criticism, but also as a suggestion: BP offers statistical and analytical tools (the Player Forecast Manager, most prominently, PECOTA in general, and Upside in particular for prospects) that are, in my opinion, significantly better than anything else out there.
The problem, however, is that these tools are useless for many fantasy drafts if they are not ready to go by mid-February. And given the number of errors and problems with all of these tools this year, I'm honestly not sure if I can rely on them for fantasy purposes. I hate that, because I love BP when it's working. As someone who drafts almost exclusively prospects in my 100% keeper league, I feel that I have an advantage over the other guys in my league who have yet to discover BP or at least its hidden gem called Upside. Nate Silver used to put out an annual series on prospects, using Upside (and at one point WARP as well) to statistically analyze prospects. He openly compared the use of Upside to the more qualitative evaluations, which was very helpful for seeing Upside's weaknesses as well as its strengths. But now we not only have to use an autopilot (i.e., Upside in the PFM) to do these analyses, but we can't even rely on the validity of that autopilot.
All I'm saying to BP is, with all due respect: understand your market, take advantage of the tools you have at your disposal, and deliver a less clunky product.
"tools are useless for many fantasy drafts if they are not ready to go by mid February." . . . or earlier - deadlines such as Scoresheet's we have no control over.
Yeah, I brought that issue up in the prior thread too. I simply stopped playing SSSIM because of BP's issue with that and the fact that I can't trust that they'll get it up in usable form in time.
Maybe it's just me, but I can't have a lot of confidence in the Upside numbers when I see Colby Lewis comparable to Beckett, Halladay, Kershaw and Nolasco. Maybe it's the Upside projections, maybe it's Lewis comped at age 30 to Schilling, Schmidt and Lonborg -- but something still looks broken. Are Japan stats now used to create a comp to Schilling et al.?
The PECOTA Weighted Means Spreadsheet looks like you just threw it together in a couple of minutes. I work with spreadsheets about 10 hours a day. If I turned in something looking like that to my boss, I wouldn't have a job for much longer.
It would really help a lot if each of the hitters tabs and each of the pitchers tabs were formatted in the same way, with the same column headers, the same font size. Then for categories/columns that are on both the pitchers tabs and hitters tabs (like VORP) they should be in the same column for all tabs.
For $20 I would just expect it to look a little more professional, and not like you just ran a query out of the database and plunked it into Excel. Especially because it's probably the spreadsheet your customers are going to use the most.
Spend the extra 15 minutes to format it correctly and put out a professional looking product.
Thanks for continuing to address these issues. I wish the numbers had been finalized better before printing the annual, but I'll nonetheless love reading it like I always do.
I'm still not sure why some pitchers' performances went down and why relative pitcher performance changed so significantly from the early PECOTA versions to the more recent ones, but I guess ultimately I don't really need to understand. I just want to hear that the latest version is the one you can take to the bank, and it sounds like you are close to saying that.
I, like most BP subscribers, am eager for the PECOTA (Please Endure Constant Online Tinkering and Adjustments) projections, but I am willing to be patient and appreciate updates from BP such as this. I am not a statistician and I am fairly new to sabermetrics, so many of the calculations (or "sausage grinding," to quote Jay Jaffe) still involve a lot of hocus pocus to me. Personally, the PECOTA numbers still have the wonder of a rabbit pulled out of a magician's hat. If said magician were to announce, "I'm going to pull out a rabbit from this hat . . . No, no, no, that's an elephant. Let me put that back. . . . Ta Da! A rabbit!" I would still applaud the magician and his, or her, magical rabbit. I know there is a different set of standards & expectations for paid content, but in my view BP has done a fine job of recognizing the elephant and informing its customers of its efforts to correct the PECOTA projections. Thanks, from an anxious, but still very satisfied, customer.
What I'd like is for there to be bench slots in the PFM. Particularly in auction leagues, knowing that you have to spend for your bench is pretty crucial.
I'm not entirely clear on how to factor in bench slots either. At the moment I've been dividing the number in two and assigning half to utility slots and the other half to pitcher slots. Is this a mistake?
One of the most common strategies for bench players is to only pay the minimum for them. So if you have a $260 league with a 23-man active roster and 10 bench slots, assign $10 to the bench players, leaving you $250 for the active roster. Use $250 to calculate what you should pay for the players that are going to be starters. Paying a guy $8 to sit on the bench (unless it's a keeper league and you're already ditching the season) is normally a mistake. Plus, you're more willing to pick up hot players off the waiver wire if you only spend a dollar on your bench guys to begin with. And are you really going to take Jeremy Affeldt with your last dollar and bench slot if you never intend to play him? Use that slot on a future closer (like Drew Storen) or a high-upside minor leaguer that needs a place to play. Paying $1 for a guy because the PFM says he's going to be worth $1.33 is not a good strategy.
Plus, with the way people value different players, you might have people available for your bench slots that are expected to earn $4 by PFM (hello Luis Castillo!).
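For anyone wanting to plug their own league into the bench arithmetic above, here is a quick sketch. The function name and the $1 default minimum bid are assumptions; the figures in the example are the $260 budget and 10 bench slots from the comment.

```python
def active_roster_budget(total_budget: int, bench_slots: int,
                         min_bid: int = 1) -> int:
    """Dollars left for the active roster after reserving the minimum
    bid (assumed to be $1) for every bench slot."""
    return total_budget - bench_slots * min_bid


# $260 league with 10 bench slots, as in the comment above
print(active_roster_budget(260, 10))  # 250
```

The result is the budget you would enter when valuing only the starters, with the bench left out of the roster slots.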
Hello--the 2/24 update has been posted. Clay will be dropping by as soon as he can to touch on specific questions; running the update kept him occupied throughout the day. I can tell you that Upside has been adjusted to more closely resemble the scale of previous years.
To get this information into the cards, we're going to have to re-run some of our backend stuff. They won't be ready until about noon Pac tomorrow. I'm really sorry for the delay.
Is this going to be the latest (time of year) release of PECOTA cards ever? I honestly don't know of a company that over-promises deadlines more consistently than BP.
Going to leave this in the other thread too, but I think you and the BP team deserve a public thank you. Your customers pointed out a problem, you recognized the time crunch and made it a priority, and you got it done in short order. And everyone was able to behave like adults throughout the discussion -- even on the Internet. If every business could run like this...
What is WMS? What is SSSim? Where are they? I clicked on the link to PFM and found nothing to do with Scoresheet Baseball. Clicking under Fantasy, I found none of the above, either.
There is a quirk when you try to download PFM data into a spreadsheet. Some of the names that are in the PFM are cut off / not included in the spreadsheet output.
Also the auction dollar amounts are different in the web PFM versus the spreadsheet that is then downloaded.
It looks like all players whose value is less than $1.00 in the web PFM do not get included in the spreadsheet. So right now, for a league that drafts 175 players, the spreadsheet is giving me only 168 names because 21 names are not getting pulled into the Excel / CSV file.
Any update on providing uniform standardized IDs for each player across all fantasy info sources? (be it HOWEID, BPID or other similar form)?
Let's revisit this after we get the above stuff out the door, but we'll definitely do it soon.
We're trying to wrap up all of the things Dave mentioned above, but I promise that we've seen your request in the past and it's on our short list of updates to add.
We're going to be trying to update things like PFM to better fit your needs very soon. I'll post something to Unfiltered once I know we have the time to accommodate requests, but you know where to find me in the meantime.
thanks guys .... you rock ... (and I won't ask about it again) :-)
With PFM, my big need is faster updates when using inflation. It could be my computer, of course, but when I've got Yahoo's draft running and your thing running, it can take half a minute to refresh.
Yeah, during draft season (pretty much all of March), there's considerable lag in the PFM. Pie in the sky request: executable PFM that checks for updates rather than web-based.
This would be very, very, very useful. Footballguys.com does a similar thing with their draft software, converting spreadsheets to an executable with many features -- you could take a page out of their book. Or hire their guy, Bruce Henderson, to do it.
absolutely agree on this. my big problem with the PFM was the time it took to get a response from the webserver after updating.
One thing I think would be great for the PFM would be if we could hide or remove players without having it impact inflation. E.g., there are a lot of middle relievers projected to have value in my league, but there's no way they'll get drafted. As such, it would be easier for me to sort through all the players if they weren't visible.
Mike, then you're really ignoring what the PFM is telling you. You're better off taking one of those middle relievers than a high ERA/WHIP starting pitcher. I would suggest you show dollar values down to -$5 to see which players would take their place if you don't intend to draft them. This wouldn't mess with the rest of the dollar values in your league. Another possibility: assign them to the "other team" with a salary of what their PFM projection was. A final possibility: change your 9 pitcher slots to 7 starting pitchers and 2 relief pitchers (or 6/3 for a mixed league where all closers will get drafted).
Thanks for the suggestions. None are perfect, but your second suggestion is the best solution at this time. However, I'm pretty sure it will result in unwanted inflation.
My main concern is being able to ignore or hide a player in order to sort through the players and data more easily. My suspicion is other users might find this useful too. Most users likely have a master list of players they are likely to draft, or who are likely to be drafted in their league. The PFM is helpful in pointing out players you or your league-mates might overlook, but only to a certain extent (nobody in my league is drafting Jeff Keppinger no matter what his projected value). As such, I would like to sort through the PFM using only the players on my master list. I can do this easily in Excel, but not on the web application.
Anyway, I realize that in some cases I may not be forward thinking enough to fully accept what the PFM is telling me (as might be the case with middle relievers). But I think there are other cases where I might want to ignore a player. Most users would agree that they don't take the PFM as gospel. And with good reason, as the Annual is filled with warnings that PECOTA might not provide an ideal projection for a specific player. As such, users might find it helpful to ignore players whose projections are just clutter before a Rick Porcello, or whose projections they don't completely buy (Colby Lewis, anyone?).
Yeah, I look at a lot of projection sets so I'm pretty good at knowing what to keep and what to ignore about different sets. I also tinker around with some stuff. Since my league is a Deep NL-only with high inflation, I usually toss an extra team into the mix to make sure there are still valuable players at the end of the auction for each position.
Another way I've downplayed middle relievers in the past is to add a category like IP in. That way your starters increase in value just by throwing more innings. And it also devalues closers, which commonly happens in 5x5 auctions and snake drafts anyways.