Thanks to the hard work of our statistical and technical teams, our signature BP stats have gotten an overhaul this season, both on the surface and under the hood. As often tends to be the case with ambitious projects, we've had a few hiccups along the way, some of them more noticeable than others, and none of them the kind you can cure by holding your breath and chugging a glass of water (not that we didn't try that, just in case). We've been reexamining old ideas and assumptions, and that's why you've seen some values change or fluctuate. In the process, we've also made a few regrettable missteps. Fortunately, we've managed to resolve the most serious issues, so it's time for an update on where some of our statistical offerings stand:

  • Pitcher WARP: As a number of you noticed, Derek Lowe's WARP looked oddly high for a pitcher with his unspectacular peripherals. We've examined our code and uncovered a bug for which Lowe was the poster boy, if not the lone pitcher affected. The bug resulted in an incorrect boost in FAIR_IP for extreme groundball pitchers like Lowe, which in turn produced bumps in VORP and WARP, counting stats that are based on FAIR_IP. As soon as this fix was implemented, Lowe's PWARP dropped from 4.0 to 2.4, which is likely to be regarded as an uncontroversial change by everyone other than Lowe.
     
  • Rest-of-season PECOTA: We've made changes to the weighting of recent results that reduced the impact of 2009-2010 performance on players' rest-of-season projections, making them less susceptible to being swayed by small-sample stats. Those changes are now reflected in the "2011 Projections" table at the top of each player card, as well as in our Playoff Odds.
     
  • PADE: Park-Adjusted Defensive Efficiency is now fully operational. The Rays are doing so well in the PADE department, you'd think a member of their front office might've invented it.

In our 15 years of existence, BP has created a number of new ways of looking at the game from a statistical point of view. Some have stood the test of time, while others have been superseded by new research. In our ongoing effort to hack through the statistical thicket and boil our offerings down to the essentials, our goal is to create clarity where there is now a tangle of overlapping and sometimes contradictory statistics offered not only by us, but by our respected colleagues and competitors.
 
In every case, it is our goal to present you with all the information necessary to construct the most accurate picture of baseball as it is played today. In the coming weeks, you will see us continue to add statistics and sharpen others. We regret any errors, an inevitable consequence of experimentation, and we welcome further feedback.

Thank you for reading.

makewayhomer
7/19
Re: ros pecota...

If you reduce the 2009/2010 weighting, doesn't that increase, not decrease, the influence of the smaller samples from this season?
brownsugar
7/19
The article (at least what I'm reading; maybe it's been edited) states that the weighting has been changed to avoid reducing the impact of 2009-2010, not to achieve a reduction of that impact. It would be clearer if it said "that previously reduced" instead of "that reduced," but I think you and Ben are making the same point.
jrmayne
7/19
That seems an interesting (and counterintuitive) reading, but it's at least a potential explanation for the baffling language. Further reducing the weight on current performance seems misguided to me given Tango's writeups on the subject, but it's a possibility.
markpadden
7/19
Either way, we need to know what exactly the new weighting scheme is and why it was chosen.
mtr464
7/19
Just out of curiosity, with Matt Swartz bringing SIERA over to Fangraphs and presenting new updates to the stat over there, will BP be incorporating these updates? Or have other staff taken up providing tweaks to SIERA? Is the SIERA presented here going to diverge from the Fangraphs SIERA?
cwyers
7/19
I have just now seen Matt's new formula for SIERA. He says he's improved the predictive power of SIERA, but has not yet published his testing that shows that SIERA is more accurate now. Before Matt's departure from BP, he and I had an extensive conversation about testing of ERA estimators, and I do not yet know if he's taken any of my points to heart.

That said, a cursory examination reveals that new SIERA resembles old SIERA much more than new Coke resembled old Coke. Looking at 2010, the root mean square error between the SIERA values published on Fangraphs and the SIERA values at BP (weighted by innings pitched) was only .19. Mean absolute error was only .12. These are intensely minor differences, all told. Nor has the standard deviation changed appreciably - I think there might be a .01 difference.
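
For anyone curious how an innings-weighted comparison like that is computed, here is a minimal sketch; the pitcher values and innings below are made-up placeholders, not the actual BP or Fangraphs numbers.

```python
import numpy as np

def weighted_rmse_mae(a, b, weights):
    """Innings-weighted RMSE and MAE between two sets of estimator values.

    a, b    -- arrays of the two SIERA versions for the same pitchers
    weights -- innings pitched for each pitcher (used as weights)
    """
    a, b, w = map(np.asarray, (a, b, weights))
    diff = a - b
    w = w / w.sum()                      # normalize the weights
    rmse = np.sqrt(np.sum(w * diff**2))  # weighted root mean square error
    mae = np.sum(w * np.abs(diff))       # weighted mean absolute error
    return rmse, mae

# Hypothetical example: three pitchers' old vs. new SIERA and their IP
old_siera = [3.45, 4.10, 2.95]
new_siera = [3.60, 3.98, 3.05]
ip        = [210, 180, 95]
print(weighted_rmse_mae(old_siera, new_siera, ip))
```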

In order to get these modest changes out of SIERA, Matt seems to have solved a problem that nobody had - that SIERA just wasn't complicated enough. He's added four additional coefficients, which is quite a lot considering that formulas like FIP only have four coefficients total. The sign on one of his interaction effects has changed from negative to positive - I don't think anyone thinks the relationship between walks and ground balls has fundamentally changed in baseball since SIERA was initially published. To the extent that SIERA has improved in predictive power (which, as I said, he has not yet presented his evidence for), it seems to have diminished in explanatory power. It is now harder to reason out *why* SIERA says what it says.
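
For reference, a minimal sketch of the standard FIP formula being used as the point of comparison; the league constant drifts from season to season, and the default shown here is only a rough ballpark value.

```python
def fip(hr, bb, hbp, so, ip, c=3.10):
    """Fielding Independent Pitching: three fixed coefficients (13, 3, 2)
    plus a league constant c (roughly 3.0-3.2, recalculated each season)."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + c

# Hypothetical season line: 20 HR, 50 BB, 5 HBP, 180 K over 200 IP
print(fip(hr=20, bb=50, hbp=5, so=180, ip=200.0))
```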

Also, shortly I hope to finally publish some of my own research findings on the matter that I've been working on for a few months now. I think that will answer a lot of people's questions about this issue.
markpadden
7/20
More info:

http://www.fangraphs.com/blogs/index.php/new-siera-part-two-of-five-unlocking-underrated-pitching-skills
jrmayne
7/19
I am a huge longtime fan of BP. I'm very frustrated. Y'all are better than this. Or maybe not.

I appreciate the general admission of errors, but there are some specific unacknowledged errors that ought to be addressed.

1. What makewayhomer said.

1a. You started the RoS PECOTA by announcing that "Fangraphs is wrong." Now you've apparently gone toward Fangraphs. My quick calculations indicated Fangraphs' weighting was pretty good; rather more tellingly, Tom Tango's did too. I'm pretty sure an apology is in order; if you're going to wrongly disrespect competitors, you owe them an apology when you're wrong.

This is really bad form not to do that.

2. If you're going to talk about PECOTA problems and not address the Kila/Bowker problem - which was obvious and mentioned at the time by more than one person - that seems unfortunate. Some indication that you're trying to repair the error would be nice.

3. If you're going to talk about PECOTA problems and not talk about the Trout problem, that seems unfortunate. Some indication that you're trying to repair the error would be nice. The continued insistence at the time that the comp list problems for those with recent minor league histories weren't problems was very annoying, and remains so. One of PECOTA's great assets has left the building and there's no indication that it's coming back.

4. There's been an ongoing bug with the A's in the Playoff Odds for at least a week. The Rockies are similarly problematic due to a Mark Ellis bug. Someone should check these periodically.

5. The way Nate did the Playoff Odds, if I understood it correctly, was to assume team quality around a baseline; the Monte Carlo sim used a distribution of quality assumptions vs. the remaining schedule. (That is, if we assume the A's are a .642 team the rest of the way, the sim might play them out as .655 or .617, rather than just as .642.) The playoff odds report appears neither to assume a distribution of potential goodness nor to take schedule into account. I could be wrong about this.
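
For readers unfamiliar with the distinction being drawn, here is a minimal sketch of a Monte Carlo simulation that first draws each team's true quality around a baseline and then plays out the remaining schedule; the team baselines, spread, and log5 matchup model are illustrative assumptions, not a description of how BP's report actually works.

```python
import random

def simulate_remaining(schedule, baseline, spread=0.03, n_sims=10000):
    """Monte Carlo playoff-odds sketch: each simulation first draws a
    'true talent' winning pct for every team around its baseline, then
    plays out the remaining schedule game by game.

    schedule -- list of (home, away) matchups left to play
    baseline -- dict of team -> assumed rest-of-season winning pct
    spread   -- std dev of the uncertainty around each baseline
    """
    win_totals = {t: 0 for t in baseline}
    for _ in range(n_sims):
        # Draw this simulation's quality for each team (e.g. a .642 team
        # might play out as .655 or .617 rather than exactly .642).
        quality = {t: random.gauss(p, spread) for t, p in baseline.items()}
        for home, away in schedule:
            # Simple log5-style head-to-head probability for the home team.
            h, a = quality[home], quality[away]
            p_home = h * (1 - a) / (h * (1 - a) + a * (1 - h))
            if random.random() < p_home:
                win_totals[home] += 1
            else:
                win_totals[away] += 1
    # Average wins per team over the remaining schedule.
    return {t: w / n_sims for t, w in win_totals.items()}

# Illustrative two-team example (not real depth-chart numbers):
odds = simulate_remaining([("OAK", "COL"), ("COL", "OAK")],
                          {"OAK": 0.520, "COL": 0.505})
print(odds)
```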

Short version: Grownups acknowledge error. I get that you want to sell the product, but there are some of us in the unwashed masses who aren't buying diet pills from Steve Garvey, penis extensions from Jimmy Johnson, and Fangraphs is Wrong from BP. Admit specific screwups.

I want to love you guys. And there are lots of good articles. And it makes me happy that Team Tracker has improved in some details (if not in daily reliability). But you're making it hard.

--JRM

cwyers
7/19
I'm typing this on a phone, which makes it hard to respond to all your points in as much detail as they deserve. I just wanted you to know I saw this comment and will have a response for you later today.
jrmayne
7/19
Charming! Thank you.
cwyers
7/19
Going out of order:

* Rob has located the Ellis bug and is working on a fix - the problem was a mistaken entry in the depth charts that was causing Ellis to appear on multiple teams at once. Longer term, there was a set of safeguards in the depth charts code to catch some of those issues, and once the issue is fixed I need to go back into the code and figure out why the safeguards aren't working any more.

* I don't think RoS PECOTA has moved towards RoS ZiPS in the way you describe - I look at the forecast for someone like Bautista and RoS PECOTA is still more conservative.

I don't take making errors lightly. For all of the errors listed above, I am wholly sorry. End of statement - no qualifiers or excuses.

But I stand by my methodological critiques of rest-of-season ZiPS, particularly in terms of how prior-season playing time is not incorporated in the weighting of current-season stats. Tango, whom you reference, agreed with that critique. After the season, I will take a look at how each measure performed, and that may go further toward resolving the disagreement. Also, people above have requested a more detailed explanation of the methodology behind the updated forecasts, and while doing so will take more than a comment, I will work on a blog post addressing those concerns as well.

And as a philosophical point - I think the field of sabermetrics is better served if people have their disagreements out in the open. I'm not trying to slander Fangraphs or anyone else. When I offer a critique of anyone else's work, it's because it's something I truly believe in; sometimes I'm wrong, but it's because I'm a human and humans err.

* The playoff odds report is a Monte Carlo sim, and it does take strength of schedule into account.
jrmayne
7/19
Thanks, Colin.

I agree that open disputes are often healthy (like this one!). And I concur with the methodological point you make; however, I expect that the more 2011-centric ZiPS projections will more often end up closer to actual results than the RoS PECOTA projections from their original date. Maybe I'll turn out to be wrong.

I am certainly pleased that you'll revisit this - I'd love to see an article at the end of the year on how RoS PECOTA fared against RoS ZiPS from the inception of RoS PECOTA.

I'd very much like to see an article early in the offseason (or earlier) about the other flagged PECOTA issues. I view them as serious.

Thanks again for your response.

--JRM

sahmed
7/19
I also value open disagreements, and many people will find that they are valuable learning experiences. Where I struggle is with the strength of your criticisms -- saying something is flat wrong -- weighed against the difficulty of coming up with sound, constructive alternatives. The former is much easier than the latter, and more balance from you would, from my perspective, earn you more patience in rolling out such ambitious projects.
markpadden
7/19
If we are bringing up unresolved PECOTA/projections issues, here are a few [all of these issues were raised in the comments, but not answered]:

1) What mix of projection vs. current season does the in-season Postseason Odds report use, and how was this weighting algorithm tested? Why is this system not transparent, and why not show both the projection-based team strength and the current-season-only team strength, so people can see the actual team vs. expectations?

2) This page, http://www.baseballprospectus.com/odds/ is, to use one of your words, wrong. Look at the expected win pcts. Oakland = .641; Colorado = .627.

3) Why were pitcher win projections massively inflated in pre-season PECOTAs? E.g., all of the Mariners pitchers were projected with a W-L record >= .500, yet the team was projected with 71 wins. Many other teams looked as bad -- e.g., Twins and Mets. Overall, there was a huge problem with PECOTA pitcher W-L projections, an issue which was first raised in early March. To my knowledge, it has never been addressed.
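
A quick arithmetic check illustrates the inconsistency being described; apart from the 71-win team projection, the staff numbers below are invented for illustration.

```python
# If a team is projected for 71 wins (71-91), its pitchers' decisions have to
# sum to roughly that record, so they can't all be projected at .500 or better.
team_wins, team_losses = 71, 162 - 71

# Hypothetical staff projections in which every pitcher is >= .500 (W, L):
staff = [(12, 10), (11, 9), (10, 10), (9, 8), (8, 7), (21, 17)]
staff_wins = sum(w for w, _ in staff)
staff_losses = sum(l for _, l in staff)

print(staff_wins, staff_losses)  # 71 wins but only 61 losses
print(team_wins, team_losses)    # 71-91: 30 losses unaccounted for
```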

markpadden
7/20
"After the season, I will take a look at how each measure performed, and that may go further toward resolving the disagreement."

Wouldn't you want to backtest this over, say, 10 years of data instead of just 2011?
Michael
7/20
Hard to balance (1) using a large data set to test the differences against (2) not using the data set that was used as input to develop the formulas to test whether the formulas work.
markpadden
7/21
I would agree. But a single season clearly is not going to prove a whole lot.
markpadden
7/19
I appreciate the update. But I am still left to wonder: what is the current status (and future) of the missing 10-year player projections?
philly604
7/19
Hi Colin

If this is going to be a general gripe with the stats, I'd like to add something. I've been disappointed by the disappearance of Clay's minor league translations.

I was shocked to recently learn that he's posting them at his own website and is apparently no longer involved at BP? Who would know, as we've never been told.

How far you want to get into that issue is up to you, but I must say I very much miss having something like those reports available. Are there any plans to provide something similar in the future? If not, why not?

Thanks.
benrosenberg02
7/19
question - only tangentially related to pitching stats - is the FLAKE stat still posted?
TangoTiger1
7/19
Colin,

I think you have a bug with the "rest of season" (RoS) forecasts. I remember a while ago, Felix's RoS was 2.30 ERA, which was quite bold, considering that:
a) his mean forecast entering 2011 was around 2.60
b) his season performance to that point was worse than 2.60

So, given more information, his RoS should have been somewhat worse than 2.60.

Now, his RoS is even lower at 2.19:
http://www.baseballprospectus.com/card/card.php?id=HERNANDEZ19860408A
That is even bolder since his current performance is basically a match to his career totals, and so, you'd need to have his mean forecast be higher than 2.60.

His RoS is a 3.2 WARP, on RoS 98IP. His WARP in his Cy season was 4.4 (250 IP) and 5.6 in 2009 (239 IP).

***

Also, the percentile forecasts are showing this timestamp:
Last Update: 3/26/2010 14:48 ET

Note the year (2010, not 2011).
cwyers
7/19
Tango, I will go ahead and review Felix's forecast to make sure there's nothing going on there. But comparing his RoS forecast for 2011 to his career stats is misleading, given the significant drop in league ERA. For Felix's career prior to 2011, weighted by his IP, the league average was 4.39. The league average for 2011 is 3.87.

In Felix's case, there's also the change in park factor at his home park. Seattle has gone from being a modest pitcher's park from 2006 through 2009 to being essentially the AL's new Petco in 2010. I find myself curious as to what's driving the change in Seattle - whether some of it is changes to the park itself, versus changes to other parks around the league - but I'm fairly convinced that this is a substantial change, not just a data artifact.
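
To make the league-context point concrete, here is a minimal sketch expressing the same raw ERA relative to the two league averages quoted above; the 2.60 figure is just an illustrative input.

```python
def relative_era(era, league_era):
    """ERA expressed relative to league average (below 1.00 = better than league)."""
    return era / league_era

# The same raw ERA looks different against different league contexts:
print(relative_era(2.60, 4.39))  # ~0.59 of league average in Felix's prior seasons
print(relative_era(2.60, 3.87))  # ~0.67 of league average in the 2011 environment
```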

As for the timestamps, that's just a typo. We'll get that fixed.
TangoTiger1
7/19
I had considered that you have a different park factor, but WARP takes care of that, right? And so, having a forecast of 3.2 WARP on 98 IP is a rate that is far higher than his best season. That was the point of my showing it in tandem with his WARP in 2009 and 2010.

Can you show the top 10 in RoS WARP (and their IP), both entering 2011, and right now?

lopkhan00
7/19
Maybe the cooler temperatures in Seattle could have brought about the cooler bats. April was 4.7 degrees below normal, May 3.5, June 1.3 and July has been 1 degree below normal so far.
markpadden
7/19
Is it safe to say RoS PECOTA is using park factors only from 2010-2011? If so, that does not seem wise.
cwyers
7/19
Park factors are based upon up to five years of data - it's a rolling average centered on the candidate year, so for 2009 and on it's using less than the full five years. (This means that historic park factors for very recent years may change slightly as 2011 progresses.)

Currently, everything on the site is using 2010 park factors, but the 2010 park factors incorporate 2011 data so the current season results are having a (small) effect on those park factors.
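
Here is a minimal sketch of the centered rolling average described above; the single-season factors are invented placeholders, not BP's actual park-factor inputs.

```python
def rolling_park_factor(yearly_factors, year, window=5):
    """Park factor as a rolling average centered on the candidate year,
    using up to `window` seasons; years near the edge of the data simply
    use fewer seasons (which is why recent factors shift as 2011 fills in)."""
    half = window // 2
    years = [y for y in range(year - half, year + half + 1) if y in yearly_factors]
    return sum(yearly_factors[y] for y in years) / len(years)

# Hypothetical single-season factors for one park:
raw = {2006: 0.97, 2007: 0.96, 2008: 0.98, 2009: 0.95, 2010: 0.88, 2011: 0.93}
print(rolling_park_factor(raw, 2008))  # full five-year window (2006-2010)
print(rolling_park_factor(raw, 2010))  # only 2008-2011 available so far
```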

If you'll indulge me in a little speculation - it's widely surmised in the press that the Mariners are building a team around run prevention, which is to say fielding and pitching. I'm at least slightly curious if they've made any sort of changes to the park that would help such a team more than the average team, and thus intentionally brought about what we're seeing in their park factors.
TangoTiger1
7/19
The runs scored at home relative to away in 2011 is 93%, just like in 2009. In 2010, it was 81%.
TangoTiger1
7/19
And in any case, it doesn't matter to WARP. Felix's RoS WARP is wrong. There's a bug in there.
BarryR
7/20
They've also built their offense around run prevention.
markpadden
7/20
Not sure what you mean. What park environments are rest-of-season forecasts assuming?
TangoTiger1
7/19
Since I was mentioned twice (for each side!), here is the thread I have on my blog.

My explanation as to how to do RoS is in post 9. And, I agree in post 12 that you can't keep the weights constant.

Dan chimes in in post 14, and if you look at that post and the link in post 10, it sounds like Dan implemented his weighting scheme (irrespective of actual PA) as a quick effort to get something rolling for this year. I'd expect Dan to improve upon it for next season.

***

In any case, of all the things where we have disagreement in the saber community, weighting of performance by timeline is not one of them.

The weighting of daily performance will ALL follow something along the lines of:
weight=.9994^daysAgo for hitters
weight=.9990^daysAgo for pitchers

That is, the further back in time, the less weight. You can quibble about whether to use .9992 or something for hitters, etc., or whether you want it to accelerate faster, like:
weight = .9998^(daysAgo^1.2)
but, basically, we're all dancing around that scheme.
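
To make the decay scheme concrete, a minimal sketch using the constants quoted above; the dates and stat values in the example are placeholders, just to show how quickly the weights fall off.

```python
def decay_weight(days_ago, base=0.9994):
    """Weight on a single day's performance: the further back, the less weight."""
    return base ** days_ago

def weighted_rate(daily_values, daily_days_ago, base=0.9994):
    """Exponentially weighted average of daily performance values."""
    weights = [decay_weight(d, base) for d in daily_days_ago]
    return sum(w * v for w, v in zip(weights, daily_values)) / sum(weights)

# A day from two seasons (~730 days) ago still carries a substantial weight:
print(decay_weight(730))          # ~0.65 for hitters
print(decay_weight(730, 0.9990))  # ~0.48 for pitchers

# Blend three daily values observed 0, 365, and 730 days ago:
print(weighted_rate([0.350, 0.320, 0.300], [0, 365, 730]))
```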

***

I agree with Colin that all discussions should take place out in the open. It makes life easier, and 2000 heads are better than 2.
pobothecat
7/19
my head hurts
markpadden
7/20
One more unsolved pre-season mystery: why did the projected team win-loss records differ between the Postseason Odds and Depth Charts reports?

Both should have been using the same data (projections and schedule), yet there were differences in the numbers.