Welcome to PECOTA week here at Baseball Prospectus. All week, we'll be running content on the state of our projection system, covering where we're at and where we're going. To kick things off, let's pull back the curtain and have a look at the history of PECOTA production, which should answer a lot of questions readers have asked.
The original PECOTA process, as long-time readers know, was designed by Nate Silver and first offered at Baseball Prospectus when we went to a subscription model in 2003. The basis of the system was groundbreaking for the time: similarity modeling of all possible comparables in baseball history to determine likely future performance, separated into percentile bands. We heard from people who just loved that they could take guys who "they had a feeling about" and kick them up or down the percentile ranges a little, while taking the weighted mean for players they didn't have a feel for. We had some neat graphs on the PECOTA cards which helped break out of the sea-of-tables presentation that was common for player cards of the day. BP staffers and subscribers alike had fun using PECOTA projections to do mean things to their fantasy leagues and predicting, quite seriously, that Nate had a future in PR or politics.
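For readers who want a feel for what "similarity modeling" with percentile bands can look like, here is a toy Python sketch. It is not PECOTA's actual method. The distance-based similarity score, the feature tuples, and every function name below are assumptions made purely for illustration.

```python
import math

def similarity(a, b):
    """Hypothetical similarity score: 1 for identical feature tuples,
    shrinking toward 0 as the Euclidean distance between them grows."""
    return 1.0 / (1.0 + math.dist(a, b))

def weighted_percentile(outcomes, weights, q):
    """Outcome at weighted percentile q (0-100), where each comparable's
    outcome counts in proportion to its similarity weight."""
    pairs = sorted(zip(outcomes, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for value, w in pairs:
        running += w
        if running / total >= q / 100.0:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall

def project(player, comparables, bands=(10, 25, 50, 75, 90)):
    """comparables: list of (features, next_season_outcome) pairs.
    Returns a similarity-weighted mean plus percentile bands."""
    weights = [similarity(player, feats) for feats, _ in comparables]
    outcomes = [outcome for _, outcome in comparables]
    mean = sum(w * o for w, o in zip(weights, outcomes)) / sum(weights)
    percentiles = {q: weighted_percentile(outcomes, weights, q) for q in bands}
    return {"weighted_mean": mean, "percentiles": percentiles}

# Made-up example: features are (age, prior-season rate stat).
player = (28, 0.290)
comps = [
    ((28, 0.290), 0.280),
    ((28, 0.290), 0.300),
    ((28, 0.290), 0.320),
]
result = project(player, comps)
```

A reader with "a feeling" about a player would look at a higher or lower entry in `result["percentiles"]`, and fall back on `result["weighted_mean"]` otherwise.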
Behind the scenes, the PECOTA process has always been like Von Hayes: large, complex, and full of creaky interactions and pinch points. It started with Clay Davenport and Keith Woolner delivering Nate the data he needed, which took some time to compile after every season. Once Nate got the data, he picked up a case of Red Bull. Then he took that data and built constants and preprocessed the dataset with STATA, which took days for each iteration. He took that output and loaded the results into the heart of PECOTA—a Rube Goldberg contraption of an Excel spreadsheet. He'd finish making his changes to the PECOTA methodology for the year to the spreadsheet, kick off the macros, and generate the output on a player-by-player basis on his laptop. Just the Excel portion of the processing took days, barring a memory leak or computer crash that might necessitate starting the entire Excel process over. The output would be a monster CSV and a bunch of image files that I'd run a Perl script on to build the PECOTA cards. The output seemed to change in small ways every year, so the script had to change to accommodate those. For my part, I wrote the card-building script in late 2002, and just thinking of looking at the code today makes me cringe.
Every year, there would be errors and omissions, which really isn't surprising for a system of this level of complexity. With the system constructed as it was, though, we were especially ill-prepared to fix them because of the turnaround time, and because Nate couldn't use his computer while it was steaming through its Excel gyrations. Nate didn't own another computer, and he was writing for and managing the operations of Baseball Prospectus during most of the PECOTA generation time, so even if he didn't have any other interests or hobbies online, this was a problem. One obvious avenue of relief would have been to put the process on dedicated, non-laptop-form-factor hardware, but there are people in this world who think nothing of configuring, maintaining, and using multiple computers in their homes, and Nate Silver, who has often led off discussions about the PECOTA process with "now, I'm not a programmer," is not one of these people.
The number crunching for PECOTA ended up taking weeks upon weeks every year, making for a frustrating delay for both authors of the Baseball Prospectus annual and fantasy baseball players nationwide. Bottlenecks where an individual was working furiously on one part of the process while everyone else was stuck waiting for them were not uncommon. To make matters worse, we were dealing with multiple sets of numbers. The 'official' Baseball Prospectus statistics lived on our database server by the middle of the decade, in permutations and schema originally designed by Keith Woolner. The Davenport Translations and many of the eventual inputs to PECOTA came from Clay Davenport, who has his own statistics, processes, and player identification scheme. Like a Bizarro world subway system where texting while drunk is mandatory for on-duty drivers, there were many possible points of derailment, and diagnosing problems across a set of busy people in different time zones often took longer than it should have. But we plowed along with the system with few changes despite its obvious drawbacks; Nate knew the ins and outs of it, in the end it produced results, and rebuilding the thing sensibly would be a huge undertaking. We knew that we weren't adequately prepared in the event that Nate got hit by a bus, but such is the plight of the small partnership.
Nate didn't get hit by a bus, but he did get crazy famous—you might have heard—and that was close to the same thing as far as a predictable and orderly PECOTA generation process went.
The 2009 season was a tough year for us PECOTA-wise. There was the infamous Matt Wieters projection, but we’ll have more on that later in the week. From a process standpoint, we continued to use the original spreadsheet, but it took even longer than usual to get the projections run considering Nate's other obligations. We got the code running on a dedicated machine, but the lack of organizational expertise in the PECOTA generation process gave away the processing time advantage and then some. We ended up giving Fantasy subscribers a free upgrade to Premium for the delay.
As the season progressed, we had some of our top men—not in the Raiders of the Lost Ark meaning of the term—look at the spreadsheet to see how we could wring the intellectual property out of it and chuck what was left. But in addition to the copious lack of documentation, the measurables from the latest version of the spreadsheet I've got include nice round numbers like 26 worksheets, 532 variables, and a 103MB file size. The file takes two and a half minutes to open on this computer, a fairly modern laptop. The file takes 30 seconds to close on this computer. There's some color coding, and a few notes, but you're not going to sit down with a nice cup of tea and pick this thing up in an afternoon. More than one of the big brains on the team threw up their hands while saying uncle. Finally, Clay Davenport stepped up and, essentially by himself, produced PECOTAs based on the logic from the original spreadsheet, and Baseball Prospectus 2010 was saved. We thought we’d reached the promised land.
Then January and February rolled around, and we still didn't have PECOTA cards. The complexity of generating the multi-year projections and producing the expanded output of the player cards, versus just the book projections, was proving to be a much more difficult problem to solve, and the well-documented issues we were having re-rolling the depth charts processes from scratch were just screwing things up further. Clay works in Fortran, and those on staff with Fortran experience didn't want to admit how long ago we last used it because we've gotten self-conscious about sounding old, so collaborative problem-solving wasn’t going to happen. Clay was still working with his own data on his own systems, and the linkages between our database server and the PECOTA data were as shifty and error-prone as ever. Worst of all, even if everything was working tip-top, all we'd done is switch victims in the Murphy's Law BP-staffer-getting-hit-by-a-bus scenario.
We eventually produced a release of our standard Fantasy package, and there were some tantalizing big-picture advantages to the new PECOTAs versus the old—the more automated process and better integration with Clay's raw stats meant we could run PECOTA projections and cards for over twice as many players as we did with the Excel process, for example. Still, it was late, there was understandable uncertainty about the product, and we ended up giving Fantasy subscribers the free Premium upgrade and extended Premium subscribers by a month for the trouble… and it was such a hectic time I don't think we actually announced this. Enjoy the free baseball coverage, folks; when we screw up, we try to make things right, and the suits can't stop us from doing it because we are the suits.
We’ve continued to push out PECOTA updates throughout the 2010 season, but we haven’t been happy with their presentation or documentation, and it’s become clear to everyone that it’s time to fix the problem once and for all. The year 2003 seems like an eternity ago; we’ve undergone a huge amount of change since then, and so has the competitive marketplace for baseball analysis. We want PECOTA to be hands-down the best baseball performance projection system in the world, and over the next few days we’re going to break down what we’re going to do, and what we’ve already done, to get there. Stay tuned.
Based on Dave's recounting above, I am really, really glad Nate never got back in touch :)
Progress is good, thanks for keeping us in the loop.
Wow.
My respect and admiration for Clay Davenport just doubled.
And I say that with all the respect of someone who came thisclose to taking a class on Fortran, before at the last minute, my college decided that it really was better to teach future engineers and computer scientists C/C++.
2007: 0.292
2008: 0.298
2009: 0.285
PECOTA Weighted Mean: .295
PECOTA was basically saying that a guy at age 28 (where the age curve is typically pretty flat) is going to do about what he's done the past three seasons. I mean, that ain't what I'd call sticking your neck out. And any other projection system I've seen said roughly the same thing.
The only trouble is Nate McLouth didn't cooperate. And it may be a while before we know if it was just him having one bad season or if he's the newest Marcus Giles.
I look forward to the remaining articles. As ever, the proof will be in the results - I think PECOTA, once in front, now has a little catching up to do, and I think CHONE is going to try to stay ahead - but this should be fun.
--JRM
I mean, I'm glad you guys are taking the time to really fix this thing, but trust is a hard thing to earn back, and perhaps even more importantly the CHONEs and ZIPs of the world have become solid and free contenders.
I do have a quibble though. Late PECOTA projections last spring were a major problem. However, an even bigger problem was poor communications with subscribers about (i) when various PECOTA-related projects were going to be released or fixed and (ii) what the quality problems were. Just admitting that you should have put the word "BETA" on the projections for a couple months doesn't begin to scratch the surface of the problems. You may consider this ancient history, but I read today's article and I question whether the miscommunication and noncommunication lessons were learned.
That said, I look forward to future installments of this series to see if sincere and thorough efforts to improve the system are in process.
What I don't believe in is the Ron Shandler attitude that projection accuracy doesn't matter. As long as I'm the guy running the PECOTAs (and right now, I am that guy) we're going to be forthright about what our performance has been and diligent in making sure that we're doing the best we can going forward.
As I've shown, there's a half-dozen legitimate ways to test for accuracy, all depending on exactly what it is that you want. In one test, I had Marcel as #1 in a group of 22 forecasting systems. In another, it was middle of the pack.
My preferred method is to run all half-dozen ways, report the results, and let the reader choose which way most closely aligns with his needs.
First off, anyone who has abandoned PECOTA, I think you have put too much faith in the system in the first place. If this was an exact science, every member of the staff would have retired by now.
Finally, from all of my preparation, I don't think any one person or site has nailed fantasy coverage or projections. Ultimately, I just don't see how people can put blind faith into anything, be it PECOTA, CHONE, ZIP, or any other flavor of the month.
however, PECOTA had enough errors that I ignored it completely
probably even more important than PECOTA though is the $ valuation engine on this site. I have begun to greatly favor an alternative source which makes a lot more intuitive sense, and gives 1 truer number. the $ engine here has all kinds of inputs that don't make any sense - at a very base level I think they are going about fantasy valuation incorrectly. you shouldn't have to pick "moderate" or "aggressive" or decide what kind of positional adjustment you want. these aren't questions an end user should have to answer, that's approaching the problem the wrong way. standard scores based off projections and projected playing time are all an end user should want or need for a basic $ valuation
Another item, and I know this will fall on deaf ears, because no one ever really responds. A forum where people who use the software could talk would do wonders for both the system, and the customers. It might put PECOTA into the proactive category as opposed to the reactive category. I understand it's complex, I read the article.
Couple of PECOTA experiences.....
While I know it's not perfect, I nailed it this year. First off, pitchers have been stellar for my entire duration of using the software. I never once had an issue and we run 12 teams, 10 starters per. Some teams slightly more, some less.
Bats, I have had issues with this before. I flat out ignore any PECOTA SB projection. Way over valued in my opinion. I just think there is way too much weight put on SB, and especially as categories increase. The more you have, the more they should dilute, and they do not. But again, I worked around that and just ignore them. I learned my lesson last year on that one.
Other than that, you are going to have injuries and bad luck in baseball. For the most part I think the projections were decent. The system allows you to identify value through the later rounds, which is generally what you need to win leagues.
This goes double for Jay's playoff odds.
Data collection.
DT's of current player stats; historical player stats.
Generation of baseline PECOTA numbers for the book. Does this still require STATA estimation, or is this integrated into a single large program, e.g., using R?
Generation of depth-chart adjusted PECOTA numbers for spreadsheet. Requires depth charts, of course, which will be done when?
Generation of charts, etc., for PECOTA cards.
Question 1: In one communication last year, Dave referred to it taking days for Clay to process a full run -- but was that due to slow processor time or low capacity? Or was he having to take out his wrenches and fix the data during that stage?
Question 2: What's your projected timetable for 2011 season data releases for internal use (BP201 authors) or for us consumers of the 2011 PECOTA's?
I also appreciate the effort from this season to identify the issues that PECOTA was experiencing, and even for putting together your beta-test team from subscriber-volunteers. Seems so long ago already, hard to believe it was just a few months ago.
FWIW, I was suckered into several of the notorious fantasy busts this year. Fielder in the first round, and other later round picks like Beckett, Nolasco, McLouth, Figgins, Lopez, and Iannetta. And yet, I'm still going to finish in the money in my league.
I'm very much looking forward to reading what the BP team has planned for BP!