
March 18, 2013

Baseball Prospectus News

Replacement Level and 10-Year Projections

by Joe Hamrahi

From time to time—if not at all times—organizations must examine their own operations and ask some difficult questions.

The answers often reveal a range of things done right and things done wrong. Healthy organizations can handle those answers in more than one way—there are many routes to success, but even more to failure—but one hallmark of organizational integrity, to borrow from James Collins, is looking in the mirror when assigning blame and out the window when giving praise.

Here at BP we’ve been faced with an opportunity to ask ourselves some questions, and we’ve decided to grapple with the answers, even though in some cases we don't like them. In short, we have work to do in order to live up to our own high expectations. Despite our pride in much of the progress Baseball Prospectus has made, now is not the time to rest on our laurels. And some recent events make that abundantly clear.

After the 2010 season, Colin Wyers wrote about replacement level and how he was improving its integration with the rest of the component stats at Baseball Prospectus.

This is something of a culmination of work I’ve been doing over the past few months—taking a menagerie of stats available here at Baseball Prospectus and merging them together under the heading of “Wins Above Replacement Level.” We’ve had WARP for quite a while—and its close sibling, VORP, as well—but it has been rather distinct from the rest of our offerings. That’s coming to an end.

The goal of making WARP play well with the component statistics left behind at BP by previous staffers was worthwhile, but the implementation caused problems: We inadvertently raised replacement level for 2011 and 2012. Taking a summation of the WARP or VORP values for those two seasons resulted in league totals which weren't in line with pre-2011 data. They were much lower. By implication, this meant that replacement level was much higher, or that a “replacement level team” would win more games than the data had indicated for previous seasons.

At any point starting in about May of 2011, it should have been clear to anyone looking closely at the stats that something was different, and not just because Colin had re-engineered (read: greatly improved) some of the WARP formulae or because offense was down in 2011.

For the record, we know that these re-engineered formulae work. The chart below shows league-wide WARP totals by year since 2000, along with the winning percentage of a notional “replacement level team” (really, it's just a subtraction of WARP from wins, so there's some noise there for a variety of good reasons, but it's close enough to give a good idea).

Year  WARP  BWARP  PWARP  WARP per Team  Rep. Wins per Team  Rep. Win%
2000   884    598    286           29.5                51.5      0.318
2001   913    588    326           30.4                50.5      0.312
2002   921    576    345           30.7                50.1      0.310
2003   939    585    354           31.3                49.7      0.307
2004   927    578    349           30.9                50.0      0.309
2005   941    577    364           31.4                49.6      0.306
2006   940    592    348           31.3                49.6      0.306
2007   907    588    319           30.2                50.8      0.314
2008   906    596    310           30.2                50.7      0.314
2009   887    596    291           29.6                51.4      0.317
2010   912    602    309           30.4                50.6      0.312
2011   838    563    275           27.9                53.0      0.328
2012   891    573    318           29.7                51.3      0.317
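
For readers who want to follow the arithmetic behind the last three columns, here is a rough sketch of the subtraction. It assumes 30 teams, a 162-game schedule, and an average of 81 wins per team; those constants and the function name are illustrative assumptions, not a description of the production calculation. Because of the noise mentioned above, the output lands close to the table but does not match every row exactly.

```python
# Sketch only: replacement-level wins and win% derived from league-wide WARP.
# Assumptions (not spelled out in the article): 30 teams, 162 games each,
# so the average team wins 81 games.
TEAMS = 30
GAMES = 162
AVG_WINS = GAMES / 2


def replacement_line(league_warp):
    """Return (WARP per team, replacement wins per team, replacement win%)."""
    warp_per_team = league_warp / TEAMS
    rep_wins = AVG_WINS - warp_per_team  # wins left after subtracting WARP
    return warp_per_team, rep_wins, rep_wins / GAMES


# League WARP totals taken from the table.
for year, warp in [(2000, 884), (2011, 838), (2012, 891)]:
    per_team, rep_wins, rep_pct = replacement_line(warp)
    print(f"{year}: {per_team:.1f} WARP/team, "
          f"{rep_wins:.1f} rep. wins/team, {rep_pct:.3f} rep. win%")
```

The direction of the relationship is the point: a lower league WARP total leaves more wins attributed to the notional replacement-level team, which is the same thing as a higher replacement level.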

Voila! Exactly the results we'd hoped to get.

Except…

One of the steps we take to improve the speed of queries—and thus to expand the scope of subjects we are able to research—is to put the seasonal replacement level for each event into our events database. In that process, we allowed some bad data to be introduced in 2011. We didn't catch it. It really was that simple, the data equivalent of a typo. We’ve corrected the data, and Baseball Prospectus WARP values for 2011 and 2012 are now representative of the theory we meant for them to represent.
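
As an illustration of the kind of check that can flag this sort of problem, here is a minimal sketch that compares a season's league-wide WARP total against the 2000-2010 totals from the table above. The function name, the 10 percent tolerance, and the third test value are assumptions made up for the example; this is not a description of our actual pipeline, and the erroneous totals themselves are not reproduced here.

```python
# Sketch of a league-total sanity check. Illustrative only.
from statistics import mean

# League-wide WARP totals for 2000-2010, copied from the table above.
BASELINE_TOTALS = [884, 913, 921, 939, 927, 941, 940, 907, 906, 887, 912]


def out_of_line(season_total, baseline=BASELINE_TOTALS, tolerance=0.10):
    """Return True if a season total strays from the baseline mean by more
    than `tolerance` (a fraction; 0.10 is an arbitrary, assumed threshold)."""
    center = mean(baseline)
    return abs(season_total - center) / center > tolerance


print(out_of_line(838))  # corrected 2011 total from the table
print(out_of_line(891))  # corrected 2012 total from the table
print(out_of_line(700))  # invented total, for illustration only
```

With this assumed threshold, the corrected 2011 and 2012 totals pass and the invented total is flagged. A tighter tolerance would also flag 2011, so the threshold has to be chosen with the longer historical range in mind.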

Two additional things need to be pointed out about the scope of this problem: first, VORP was also affected, though FRAA and BRR were not—this was entirely an “at the plate” and “on the mound” problem. Also, slight adjustments to some previous-season WARP values were made, as some of our calculations rely on a multi-year smoothing of baseline data, even including forward-looking data when available.

While we're on the subject of evaluating our data, we've decided, after extensive testing, that the 10-year projections just weren't producing the results we desired. It’s difficult to evaluate long-term projections, and we intend to make that a more standardized, easily repeatable process in the future, but we hold our work up to a certain standard, and in this case, we didn't feel that that standard was being met. Instead of putting out an inferior product, we’ve essentially ordered the design team back to the drawing board to get 10-year projections and UPSIDE correctly formulated and out to the public in a timely manner going forward. We will be releasing the PECOTA percentiles soon, and that will conclude our pre-season projections releases.

It's not enough to fix these problems. We will be addressing these issues at their root—with a hard look at and overhaul of our internal processes and quality control.

But we also want to regain your trust. So we’re going to open the kimono and make our work transparent. Not only will this create a wealth of knowledge for everyone involved—readers and writers alike—but it will give BP the opportunity to leverage the wisdom of crowds.

We've recently named Harry Pavlidis our Director of Data Analysis. His first responsibility is to lead this effort. It will be a team undertaking, with all hands on deck. We will be sharing our progress and plans as they develop. But right now we're looking in the mirror. Looking hard.

Harry's first task is to conduct a full audit of our systems and stats. In essence, we're making him do his "day job"—assessing our systems and developing a plan to move forward. Harry will be bringing a process-driven approach to the effort, with the ultimate goal of improving our stat offerings. The experience he has in this area ranges from tiny start-ups to large, publicly traded companies. We'll all be working together to find the best-fitting tools and processes to bring BP up to the level of operational excellence we all expect.

Finally, I want to personally apologize for any inconvenience we may have caused our readers. The people we employ at BP are perfectionists. They spend more hours than anyone knows to get things done right and in a timely manner. They love this game and this company with a passion and will gladly fall on their sword if it means building a bigger and better Baseball Prospectus in the future. But if something goes awry at BP, it’s my fault and mine alone. I’m ultimately in charge, and I take full responsibility for any and all of our shortcomings. I’ve made mistakes and deserve any criticisms I receive. I may hold Baseball Prospectus to a high standard, but I hold myself to an even higher one. I’m sincerely sorry, and I promise you that I will continue to devote my blood, sweat, and tears to make BP the best it can possibly be.

Joe Hamrahi is an author of Baseball Prospectus. 

Related Content:  WARP

36 comments have been left for this article.

evo34

Any way to dig up the code from Nate Silver's last year running PECOTA, and produce some provisional projections for this year? Even if they were flawed, they seemed to be fundamentally sound and offered an alternative view of the career paths of important players.

Mar 18, 2013 03:36 AM
rating: 2
 
BP staff member Joe Hamrahi

I assume you mean the 10-year projections...I really wish it were that simple. And believe me, something that may have been sound 5 years ago may not be sound today. I know it's hard to take my word for that, but it's true. And honestly, we want to get things right for the long haul. Even if we could reconstruct Nate's code from 2008, and it was relevant, we don't just want to patch in a one-time fix. Otherwise we'll be going through this situation every year. Thank you though. I understand your desire for the numbers, and that's why we're taking this VERY seriously.

Mar 18, 2013 10:41 AM
 
jrmayne

Joe, thanks for this.

As a pitchfork-and-torch salesman to the PECOTA mob the past few years, I've been very disappointed in the contrast between confidence level expressed and actual performance. (When Wayne Causey is your best comp for Bryce Harper, you're doing it wrong; predicted results for teams should average fewer than 86 wins.)

I hope that things go well on this front, and I wish y'all the best. I agree that there's a significant process problem. There are some good things BP is doing (Scoresheet Draft Aid!), and there are good articles. I'm glad to see that efforts are being made to fix some product rather than just have BP go to pure cash-cow mode.

Shorter ramble: Humility - while often a vice - is good if it leads to fixing.

--JRM

Mar 18, 2013 07:10 AM
rating: 4
 
BP staff member Joe Hamrahi

The plan is to look at all aspects of our projections...from single year projections to multiple year projections and all the inputs that go into them. There's a lot more data available now than when Nate started the PECOTA process, and I, for one, would like to start incorporating more of it into our forecasts.

Mar 18, 2013 10:46 AM
 
doog7642

I'm thankful that UPSIDE is being unretired and fixed. Thanks, BP.

Mar 18, 2013 07:18 AM
rating: 5
 
Kevin Ebert

We all appreciate the amount of work and effort that you guys put into the site. Believe me, it doesn't go unnoticed. I'm sure you guys will get it right.

One question - Dave Cameron and Sean Forman are rumored to be trying to come together on an agreed upon replacement level. Even though their WAR stats are computed differently, the idea is that people will have more faith in them if they start from the same place. Has BP given any thought to joining the discussion? The idea of an industry consensus on replacement level is highly intriguing.

Mar 18, 2013 07:23 AM
rating: 4
 
BP staff member Joe Hamrahi

Yes, in fact, Colin is in discussion with Sean and the guys at Fangraphs.

Mar 18, 2013 10:34 AM
 
Kevin Ebert

That's great. Looking forward to hearing the resolution of these discussions.

Mar 18, 2013 11:18 AM
rating: 1
 
BP staff member Joe Hamrahi

Oh, and thank you for the kind words. I didn't want to miss that :)

Mar 18, 2013 10:43 AM
 
kmbart

I, for one, would be perfectly content with a seven-year projection. Ten years seems to me to be the equivalent of searching for signs of life on an extra-terrestrial planet with an optical telescope. What's the length of the average major-league career? Something like four years? I understand about the urge to get it right for a longer timeframe, but more correct over a shorter range is better, I feel.

Mar 18, 2013 07:51 AM
rating: 11
 
BP staff member Joe Hamrahi

If we can do 7, we can do 10. It's not so much the amount of years as it is the formula, the aging curves, the inputs, etc. But I appreciate the comment, and we will look at all the options.

Mar 18, 2013 10:36 AM
 
RedsManRick

Joe, I'm sure you can do it, but at what point does the projection not only lose its ability to convey meaningful information but actually have its necessary uncertainty undermine the perceived value of the rest of the system? That is to say, at a certain point of uncertainty, the mere presence of having data may suggest a level of confidence that simply cannot be "undone". Or put differently still, simply having a 10-year projection may project a hubris that turns off a less sophisticated consumer and which no amount of "but look at the confidence intervals" can offset.

Mar 19, 2013 10:09 AM
rating: 4
 
BP staff member Joe Hamrahi

Those are excellent points...even better because I had the same thoughts the past few years! It seems as though the consumers, though, want the 10-year projections, so that's what we're going to try first. It doesn't mean we can't alter the plan in the future. This is an ongoing process, and I don't think we'll be disappearing anytime soon...at least I hope we won't be. There are lots of interesting things we want to uncover and experiment with, and forecasts and projections are some of them. Thanks.

Mar 19, 2013 10:52 AM
 
gpurcell

Does this have any effect on the data from the Player Forecast Manager?

Mar 18, 2013 09:05 AM
rating: 0
 
BP staff member Rob McQuown

2013 projections should not be impacted in any way, so you won't see any PFM changes due to this.

Mar 18, 2013 09:29 AM
 
Lloyd Cole

Good for you. This is a great way to run a business, and to communicate with your clients.

ps I do play in fantasy keeper leagues, and a useful UPSIDE number seems like a great way to evaluate players for the (very) long term. Thanks for taking a new look at that and trying to come up with a more meaningful number. I hope that forward-looking GMs may look at it too (are you listening, Ruben Amaro Jr.?)

Lloyd Cole
*PHILADELPHIA* (land of short-term baseball thinking)

Mar 18, 2013 09:48 AM
rating: 3
 
Grasul

I think just some basic stat availability enhancements would be very useful; like what team a player is currently on, whether he hits Left/Right/Switch, etc, downloadable into a CSV. The ability to download CSVs is useful and appreciated, but another consideration might be the development of a baseball statistic API for subscribers. To my knowledge, there isn't a good one out there today aimed at individuals and having an API could be a differentiator for BP.

Mar 18, 2013 10:09 AM
rating: 3
 
Matt

I work in software development, so I'm familiar with mistakes being made, processes being scrutinized and improved, apologies made with difficult explanations, etc. Personally, I wince at these kinds of apologies. It is tempting for the business side to see a mistake in development and feel compelled to solve every problem with broad strokes, like improving the process.

I'm sure Harry will do a great job with that task. But I have seen efforts to improve development process lead to one or more of the following outcomes:
1. Being overly ambitious and not actually implemented
2. Having too great an impact on the delivery of the product
3. Too general a solution and not solving the original problem

I am less concerned that a mistake was made and more concerned that lessons are learned. Whatever you can share would be appreciated, especially if you have some confidence you can actually improve the process.

You said something like this is ultimately your fault Joe... Did Colin and team tell you the product was susceptible to errors if they didn't improve the development process? And you ignored said advice? What mistakes did you make? What would you have done differently? What are you going to do differently? Are you proposing changes that will affect the product delivery timeline? Or affecting subscriber fees?

As was mentioned, we all appreciate your hard work and effort. Most of us understand mistakes can be made. The apology in the last paragraph makes me uncomfortable though. I'm not sure the right way to put this, other than that if you are apologizing emphatically for a problem which may be a natural consequence of developing a complex system, why would I think even more of your blood, sweat and tears is going to help?

Thanks.

Mar 18, 2013 11:56 AM
rating: -1
 
BP staff member Joe Hamrahi

It's up to me to prove to you what I can do Matt. I totally get that. How we roll out the solutions to the issues will become apparent in time. That's why Harry is here, and we will be working together closely.

Mar 18, 2013 12:16 PM
 
BarryR

What percentage of BWARP is hitting and what percentage is fielding? Is this a fixed percentage or does it change from season to season?
Logic tells me that since offense and defense play an equal part in scoring, the fielding component should be half the difference between BWARP and PWARP, but there may be reasons why that isn't true, so I am curious as to the numbers here.

Mar 18, 2013 11:58 AM
rating: 0
 
BP staff member Rob McQuown

Hi Barry,

Thanks for writing. It's not a constant. It changes over time as a function of BIP rates, mostly (the more TTO, the fewer BIP and so the less fielding matters and so the pitching WARP as a percentage of total WARP grows).

Mar 18, 2013 12:32 PM
 
BarryR

Rob

Okay, so fielding WARP is relative to pitching WARP. (As an aside, note that if you mouse over PWARP you get "Wins above replacement level as a BATTER" - really)

The question then is, if we break BWARP into its components, does batting WARP = pitching WARP + fielding WARP?
If it doesn't, what causes the variance between offense and defense? Is there more or less BWARP based on increased (or decreased) offense, or vice versa? Just trying to pin things down here.

Mar 18, 2013 14:37 PM
rating: 1
 
BP staff member Dave Pease

Fixed PWARP, thank you.

Mar 18, 2013 14:41 PM
 
BarryR

Joe

I'm just trying to figure out what the chart is. I see a drop in WARP between 2010 and 2011 of about 8%, with a little over 6% being recovered in 2012. Is this significant drop due to the "typo", as you put it? Was the 2012 "correction" due to your fixing the "mistake"? Or are these the numbers that are the end result of all the efforts you've made to get things where they should be? If these are the "correct" numbers, then what causes that kind of severe drop in a relative statistic, followed by a significant return in the other direction? Even if there is a drop in offense, the relative WARP should be fairly stable, unless the numbers inherently differ between higher and lower scoring eras. A year-to-year variance like that in the quality of replacement player seems quite odd, if that was the case.

Mar 18, 2013 14:45 PM
rating: -1
 
BP staff member Rob McQuown

The erroneous numbers represented a much lower WARP total, and are not shown here.

You are correct that the numbers are inherently different based on the league offensive rates (among other things). While this chart seems to indicate that 2011 had a high level of wins for a 'replacement team' (a nebulous concept at best), that's a quirk of starting it at the year 2000 - going back further, 2011 is well within the range that was evidenced, not an outlier as it seems to be by choosing 2000 as the starting year.

Mar 18, 2013 19:11 PM
 
BarryR

Okay. Before I can use WARP and refer to it like it's a meaningful metric, I need to have some confidence in it. I need to be able to understand how numbers are arrived at which are counter-intuitive.
So let's take a look at 2002 and 2011. Between 2002 and 2011, run scoring declined approximately 7%. At the same time, strikeouts increased 10% and HR declined 10% - both increasing the TTO positives for the pitchers, as did the decline in walks, also 7%. You said in your previous post that the more TTO, the more PWARP increases as a percentage of total WARP. Yet despite these changes in TTO, all in the pitchers' favor, the percentage of PWARP dropped from 37.4% of total WARP to 32.8%. How did this happen?
Also what caused the 8% drop in total WARP in 2011 and the subsequent 6% increase in 2012? Was it a great year in the Pacific Coast League?

Mar 18, 2013 19:52 PM
rating: 1
 
BP staff member Joe Hamrahi

There are two different topics really being addressed here Barry. The focus of today's announcement was purely on replacement level. Your question goes more to the core of WARP. We didn't discuss it here, but we intend on peeling back WARP one layer at a time over the next several months to provide insight into understanding the metric. So if you can hold your thoughts until we start that discussion, I think you'll get more of the answers you're looking for.

Mar 18, 2013 22:11 PM
 
BarryR

Sorry, Joe, I didn't mean to hijack your topic. It's just that you raised the subject and presented me with numbers in a format which immediately led me to questions which couldn't really wait, as the numbers are sitting there and will probably not be presented in the future.
It's been over 40 years since I took a college math course and I freely admit that when confronted with a series of formulas and equations, my eyes glaze over. But before I dive in to any series laying bare the entrails of WARP, I still want answers to my questions.
You see, Rob answered my first question with a very logical construct, TTO goes up, pitcher impact, as stated on PWARP as a percentage of total WARP, increases. Makes absolute sense. Unfortunately, the numbers don't support it, they are the equivalent of dropping a rock and having it float skyward. Now I would wonder why something like that happened, especially if I was in the rock-dropping business. I expect your numbers people asked and answered that question and I would like to hear it. If they don't have an explanation, then I have little interest in seeing the layers peeled away, I don't want to cook with that onion.
Similarly, the second question seems like an obvious one. An 8% drop in WARP one year followed by a 6% rise the next - if I was involved in the analysis of that metric, my first question would be why did this happen. There may be a simple answer, one for each year. I just want to hear it, because there is another question which should follow that one - what is the effect of this drop/rebound? Was there an 8% drop in WARP across the board? Or was some class or group of WARP scores changed more than others? This goes to the heart of the reliability of the metric itself.
I need answers.

Mar 18, 2013 23:26 PM
rating: 1
 
BP staff member Joe Hamrahi

You're not hijacking the topic, and I appreciate the fact you want answers, but we're not going to go into all the details, calculations, assumptions, etc. here. After we roll out the information on WARP, and you have many more details, if you still have questions, we'll answer them. That discussion will go to the heart of the reliability of the metric itself...which is what you're looking for.

Mar 19, 2013 09:38 AM
 
drmorris

In recent years, BP's articles -- which is to say the insights, prose, and wit of its editorial staff -- have become a bigger draw for me than PECOTA. I'm obviously all for the process review and (presumed) improvements in your core analytical offerings, but let me take this moment to congratulate everyone at BP on a site that is delightful and thought-provoking far beyond the projections arms race.

Mar 18, 2013 19:03 PM
rating: 12
 
BP staff member Joe Hamrahi

Thank you. I appreciate the kind words. I really do. We do some great things, but I'm not satisfied with that. I want to do even better things, and products like the 10-year projections and UPSIDE are the next to get the bulk of our attention.

Mar 18, 2013 22:03 PM
 
Richard Bergstrom

I'll echo that. I originally came to BP for PECOTA and comparables and projections and VORP/WARP. I stay for the articles and insight but, sadly, haven't cared for WARP in a while (though the comparables in this year's Annual were better than previous years). FRAA just seems to affect WARP so much that it's hard for me to really consider it anymore.

Out of curiosity, full disclosure-wise, but when did BP become aware that bad data had entered the system?

Mar 18, 2013 23:39 PM
rating: 0
 
BP staff member Joe Hamrahi

It was probably about 4-6 weeks ago when we were looking at the framework for the long term projections. From there we had to perform several tests to make sure what we thought was wrong, was truly wrong. Then we had to clean the data, rerun the new calculations, test it again, etc. and that leaves us where we are today.

Mar 19, 2013 05:43 AM
 
BirdlandPGH

I really appreciate this message. UPSIDE was the reason I continued subscribing to BP for my first few years here and its demise was the reason I stopped (a friend was kind enough to gift me a subscription this year). Do you expect UPSIDE/10-year projections to be ready for next year?

Mar 18, 2013 20:52 PM
rating: -1
 
BP staff member Joe Hamrahi

Yes, I fully expect both UPSIDE and 10-year projections to be ready for next year. I don't want to put a specific time frame on it this far in advance, but the main purpose behind adding resources is to devote more attention to both those products.

Mar 18, 2013 22:00 PM
 
BirdlandPGH

Awesome. Thank you!

Mar 19, 2013 07:56 AM
rating: -1
 