Perhaps you didn't read your own article?
To remind you of your claim: "The precipitous <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BABIP" onmouseover="doTooltip(event, jpfl_getStat('BABIP'))" onmouseout="hideTip()">BABIP</span></a> decline indicates that unless he gets back to the line-drive and ground-ball approach that defined his first seasons, he’s extremely unlikely to challenge for a .300 average again."
This is false. A guy with his exit velocity and angle profile is much more likely to hit .300 than .250 going forward.
No, I read it. Your panic is completely unwarranted. As I said, Lindor's batted-ball profile this year (incl. all his flyballs and popups) would result in close to a .300 BA, if he had had normal luck.
But by all means, continue to publish "the sky is falling" articles. It will just lower his price for the rest of us.
You're reading way too much into 300 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PA" onmouseover="doTooltip(event, jpfl_getStat('PA'))" onmouseout="hideTip()">PA</span></a> of crazy-low <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BABIP" onmouseover="doTooltip(event, jpfl_getStat('BABIP'))" onmouseout="hideTip()">BABIP</span></a>. According to xStats.org, his xAVG is actually .296. I.e., his batted ball profile is still that of an elite batting average hitter.
So, basically, he's getting better, not worse.
I think the author is assuming that most people thought that:
a) umpires change the zone to benefit whoever is behind in the count.
b) umpires are motivated to do this out of emotional considerations.
I understand (as of five minutes ago) that one author proposed this theory in a 2010 article, but I have never heard anyone else make these rather specific claims.
So while the data are certainly interesting and valuable, the framing of the article ("Hey, that commonly accepted theory that umpires are 'compassionate?' It's dead wrong") seems both out of place [straw man] and unsubstantiated [we still do not know specific motivations].
As per usual, the aging curves are comical:
Goldschmidt (age 30 season): .313 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=TAv" onmouseover="doTooltip(event, jpfl_getStat('TAv'))" onmouseout="hideTip()">TAv</span></a>
Goldschmidt (age 37): .300
<span class="playerdef"><a href="http://www.baseballprospectus.com/player_search.php?search_name=Miguel+Cabrera">Miguel Cabrera</a></span> (age 34): .305
Miguel Cabrera (age 39): .301
<span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=70262">Kolten Wong</a></span> (age 26): .250
Kolten Wong (age 34): .250
<span class="playerdef"><a href="http://www.baseballprospectus.com/player_search.php?search_name=Matt+Carpenter">Matt Carpenter</a></span> (age 32): .274
Matt Carpenter (age 36): .275
No one has presented an update as to how <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PECOTA" onmouseover="doTooltip(event, jpfl_getStat('PECOTA'))" onmouseout="hideTip()">PECOTA</span></a> works for the last several years. Has it been updated and/or tested at all during this time? Or is it still blindly cranking out whatever <a href="http://www.baseballprospectus.com/author/colin_wyers">Colin Wyers</a>' black box produces?
Did you just refer to yourself in the third person and then make an Internet bet-threat?
"I’m hoping that after guys in the big leagues show Appel that he’s not just going to be able to ‘out-stuff’ the baseball world forever..."
Are we talking about the same guy? In his last three stops he has posted an <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=ERA" onmouseover="doTooltip(event, jpfl_getStat('ERA'))" onmouseout="hideTip()">ERA</span></a> of 9.74, 4.48 and 4.26. So he's not exactly out-stuffing anyone as it is.
My point is that there is not nearly enough data to test any of the hypotheses -- the original paper should never have been written. The behavior of a handful of individuals -- no matter what that behavior is -- simply cannot be used to infer the behavior and intent of a large group of people.
Guess how many Latin umpires were used in these studies?
Exactly. There is an outer limit to how aggressively a catcher can frame a pitch (even if he has the skill to do it). Once he goes too far, most umps will know he is a framer and adjust. It's only starting to happen recently (perhaps in part due to the ~10 framing articles a season BP et al. put out). The sweet spot of being a good framer but not too good is shrinking every season. It will all be moot within 10 years, once ball/strike calls are automated.
I'm a buyer of <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=60834">Yan Gomes</a></span> at #13. Numbers depressed last year due to injury (<span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=OPS" onmouseover="doTooltip(event, jpfl_getStat('OPS'))" onmouseout="hideTip()">OPS</span></a> 165 pts. higher after AS break). Legit 20 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</span></a> potential for next five years.
Does your commitment to transparency extend to <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PECOTA" onmouseover="doTooltip(event, jpfl_getStat('PECOTA'))" onmouseout="hideTip()">PECOTA</span></a>, or will it be a black box mystery forever?
Do you troll readers?
Well said -- though I doubt a single BP staffer took it to heart.
Terrible analogy. The band is gone. This is a high-school clique trying to do a couple cover songs.
Also, what was the significant new metric? They've had one each of the last 4-5 years, and none of them have stuck.
When you have a bunch of non-scouts (who won't even post their qualifications) speculating randomly on young players, you're usually looking at a baseball blog from 2002. Just because one person decided to abandon analytics doesn't mean the site is forever doomed to follow his edict.
So you're "predicting" something that has already happened? Not really how it works.
"While <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PECOTA" onmouseover="doTooltip(event, jpfl_getStat('PECOTA'))" onmouseout="hideTip()">PECOTA</span></a> aspires to be perfect..."
What actual changes have you guys made to the algorithm in this aspiration toward perfection? As far as I know, while the rest of the baseball world has made incredible advances over the past five years, PECOTA has stood still. So being wrong on a team is not as surprising as you appear to think it is.
This is a great point. Unless the division title is already locked up, resting players has a very real cost in MLB.
In baseball, a lot of what might be driving the sitting of "regulars" for a particular team is how aggressive the team is in platooning players -- which may have little or nothing to do with attempted prevention of injuries.
The real history being made here is the <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BB" onmouseover="doTooltip(event, jpfl_getStat('BB'))" onmouseout="hideTip()">BB</span></a>% vs. <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BABIP" onmouseover="doTooltip(event, jpfl_getStat('BABIP'))" onmouseout="hideTip()">BABIP</span></a>. Guess how many qualified players have had a BB% < 5% and a BABIP > .370 in a single season since 2005?
Digging deeper (to 1990), I found one player, the immortal <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=494">Homer Bush</a></span> (TOR), who posted a 4.0% BB and .374 BABIP -- to go along with 32 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=SB" onmouseover="doTooltip(event, jpfl_getStat('SB'))" onmouseout="hideTip()">SB</span></a> -- in 1999. He would never play a full season again.
The fantasy community is (should be) concerned with the future, not the past. If you think past R, <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=RBI" onmouseover="doTooltip(event, jpfl_getStat('RBI'))" onmouseout="hideTip()">RBI</span></a> and BA are even decent predictors of future R, RBI and BA, I might just have an opening for you in my league!
1) Why use overall value metrics (which have a ton of very volatile defense baked in) when evaluating fantasy potential? Defense only matters to the extent it allows a player to stay at his position. "Excess" defensive skill -- even if we were to assume that defensive metrics were any good -- largely goes to waste from a fantasy perspective.
2) None of your metrics make any attempt to normalize Bogaerts' ridiculously high .369 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BABIP" onmouseover="doTooltip(event, jpfl_getStat('BABIP'))" onmouseout="hideTip()">BABIP</span></a>. A 4% <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BB" onmouseover="doTooltip(event, jpfl_getStat('BB'))" onmouseout="hideTip()">BB</span></a> rate (bottom 15 in all of baseball), minimal speed, and a .369 BABIP do not tend to survive together.
Plenty to like about the player long-term, but I think the fantasy community's short-term skepticism has been warranted.
Actually, a lot of the best online poker players would play a ton of small to mid-range tables simultaneously -- both to ensure weaker competition and to reduce variance...at the expense of a lot of manual labor and boredom, of course.
Out of curiosity, how much of a role (if you happen to know) does weather analysis play in tools like the BAT? I'm sure it's factored in to some extent. Wondering how much room for further edge there is on that front. IMO, btw, weather is still not being valued properly for baseball totals betting -- particularly dissecting what conditions produce what results at the most susceptible stadiums, such as Coors and Wrigley.
Not sure I follow. What data are you using as your source? Do you classify pitches with your own algorithm? Finally, is your data available on this site?
What source are you using for your data, raw Pitch F/x? In any case, there's clearly a pitch classification issue going on with Harvey, as evidenced by all of his sub-92 mph fastballs suddenly being called non-FBs 10 starts ago:
With such inconsistent data, I would hesitate to draw even mild conclusions about his in-season velocity change. The good thing is that his max velo has been relatively stable all season (97-98). Plus, his post-AS Break xFIP has been 3.23, vs. 3.44 in the first half.
I'm a long-time baseball handicapper and a reasonably avid full-season fantasy player. For whatever reason, DFS has never quite appealed to me (which is a bit odd, considering that 20+ years ago, I tried [and failed miserably!] to run a one-month salary-cap fantasy league via a 4" x 6" print ad in Baseball Weekly).
Anyway, while reading this article, I was struck by how similar the landscape sounds to that of the online poker world in the early 2000s: a bunch of sharks gradually weeding out the easy prey by player profiling, etc. until the game became too difficult to warrant the hassle/time/risk involved (for most players).
I realize that's a cynical view and that people play the game for more than money -- and also that the US govt. played a large role in the decline of online poker. However...I can't help thinking that we are within a couple years of "peak DFS." Who knows what's next, but I'm just not sure the current ecosystem is self-sustaining long term, esp. considering the amount the sites are raking off the top. Would you agree, or do you think this style of competition will last for decades?
Hits and misses should be relative -- that is, compared to systems like ZIPs and Steamer.
"Projection systems aren't designed to account for dramatic changes in skill level during the pre-season"
There is nothing preventing a projection system from doing this properly; it's just that yours doesn't.
Instead, how about sending me the ~50 hard-hitting analytical articles that would have been written by this time of year in past seasons...
The problem is that every employee at BP *is* essentially an intern. Hours of manpower wasted on "clever" jokes are the result.
It's just a bunch of bored 24 year olds who are resting on the laurels of the hard work of the site's founders and its hard-working crew over the past 15 years. Happens to every business over time, but usually not so suddenly and awkwardly.
The middle ground = desperation.
When someone says "We don't give a damn about minor league numbers, to be quite blunt" -- I'm not sure what there is to interpret or nitpick.
Are you a scout? I'm genuinely curious what your background is.
When did BP decide that prospect rankings would be driven solely by scouting evaluations? I think I missed the memo.
"Remember, we take a tools/scouting based approach on these"
Since when? Did someone mandate that you ignore statistics in your analysis? Do you at least appreciate the irony that someone has to ask this question on this site? The whole point of BP's prospect rankings 10-15 years ago was to apply a performance-oriented lens to the prospect evaluation process. I.e., a nice complement to BA's scout-driven lists.
Now, apparently, it's just a group of 25-year-old bloggers who like to hear themselves talk in scout-speak. If that's all you are -- and I've seen no evidence to the contrary -- you're not adding value. You've just become "BA Lite."
"We don't give a damn about minor league numbers, to be quite blunt."
That's sad, if true. And it might very well be the shark-jumping moment for BP. It's amazing how far this site has fallen. From a cutting-edge analysis site to a bunch of wannabe scouts drooling over tools.
What you say doesn't make much sense to me. You're saying that a raw talent for raw talent deal cannot be evaluated in real time, but a raw talent for cash deal can? In both cases, we have no way of knowing how the talent will translate to future performance. Having one side of the deal be a fixed asset doesn't make valuation of the other side any more predictable.
Thanks. Frankly, I'm not impressed with <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=cFIP" onmouseover="doTooltip(event, jpfl_getStat('cFIP'))" onmouseout="hideTip()">cFIP</span></a> for a whole host of reasons (http://www.hardballtimes.com/fip-in-context/#comment-61647). The author simply showed that it correlates y2y with *itself* much better than other metrics. He never performed a test to measure avg. error vs. rest-of-season <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=ERA" onmouseover="doTooltip(event, jpfl_getStat('ERA'))" onmouseout="hideTip()">ERA</span></a>, which is what fantasy owners care the most about. The one in-season test he did do (correl. of each metric to RoS RE24/PA) showed cFIP with essentially no improvement over something like <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=SIERA" onmouseover="doTooltip(event, jpfl_getStat('SIERA'))" onmouseout="hideTip()">SIERA</span></a> -- which is troubling since his model was created (overfit, IMO) using the same data he used for testing, while SIERA et al. did not have that same advantage (for years 2011-2014).
Most importantly, he never tested cFIP vs. <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=DRA" onmouseover="doTooltip(event, jpfl_getStat('DRA'))" onmouseout="hideTip()">DRA</span></a>. So it just seems like a curious choice to assume that cFIP-DRA will reveal the best buy-low or sell-high fantasy pitchers, until/unless more work has been done to show that cFIP actually predicts ERA accurately out of sample.
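The test asked for here can be sketched in a few lines: freeze each metric at some cutoff date, treat it as a forecast of rest-of-season ERA, and compare mean absolute error on pitchers the metric was not fit to. This is only a minimal sketch. The numbers are made up for illustration, the function names are hypothetical, and it assumes every metric has already been converted to an ERA scale.

```python
def mean_abs_error(predicted, actual):
    """Average absolute gap between a metric's forecast and the realized value."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rank_metrics(forecasts, ros_era):
    """Return (metric_name, MAE) pairs sorted from most to least accurate."""
    scores = {name: mean_abs_error(vals, ros_era) for name, vals in forecasts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical first-half values for four pitchers (ERA scale) and their
# hypothetical rest-of-season ERA. None of this is real data.
forecasts = {
    "metric_a": [3.10, 4.20, 3.80, 4.90],
    "metric_b": [3.60, 3.90, 4.10, 4.40],
}
ros_era = [3.50, 4.00, 4.00, 4.50]

for name, mae in rank_metrics(forecasts, ros_era):
    print(f"{name}: MAE = {mae:.3f}")
```

The key design point is that the evaluation seasons should be ones the model never saw during fitting; otherwise the comparison flatters whichever metric was tuned on them.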
Is there any evidence that <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=cFIP" onmouseover="doTooltip(event, jpfl_getStat('cFIP'))" onmouseout="hideTip()">cFIP</span></a> is a better rest-of-season <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=ERA" onmouseover="doTooltip(event, jpfl_getStat('ERA'))" onmouseout="hideTip()">ERA</span></a> predictor than <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=DRA" onmouseover="doTooltip(event, jpfl_getStat('DRA'))" onmouseout="hideTip()">DRA</span></a>? The original article shows that DRA is actually more accurate (in predicting future runs) than <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=FIP" onmouseover="doTooltip(event, jpfl_getStat('FIP'))" onmouseout="hideTip()">FIP</span></a> for all sample sizes tested. I don't see where anyone has tested DRA vs. cFIP, however.
Nice article. That's a disturbing trend to see time between pitches rise by 7-8% in just five years. Shortening time between innings is just a short-term band-aid for a problem that looks like it will just continue to get worse.
Is the local site format any better? It just looks like font changes to me. I'm with you, though, that BP was the worst site design of any sport site. What on earth is with the tons of wasted white space on each side of every page?
So Volquez and Young are going to continue to post sub-3.00 ERAs?
It dilutes the information. If you try to make a linear coefficient for temp. across all stadiums (in addition to a generic park adjustment), it will be much less informative (contain more error) than if you did it by stadium. Same for humidity and wind vectors.
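A toy illustration of the dilution, assuming a plain least-squares slope of runs on temperature. The two parks and all run totals are hypothetical: one park responds strongly to temperature, the other not at all, and the pooled slope describes neither.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x (simple one-variable regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical (temperature in F, total runs) observations per park.
temps = [50, 70, 90]
runs_open_air = [7, 9, 11]   # scoring climbs with temperature
runs_dome = [9, 9, 9]        # scoring ignores temperature

slope_open_air = ols_slope(temps, runs_open_air)
slope_dome = ols_slope(temps, runs_dome)
slope_pooled = ols_slope(temps + temps, runs_open_air + runs_dome)

print(f"open-air: {slope_open_air:.3f} runs per degree")
print(f"dome:     {slope_dome:.3f} runs per degree")
print(f"pooled:   {slope_pooled:.3f} runs per degree")
```

In this made-up case the pooled coefficient lands halfway between the two parks, so it understates the effect where it matters and invents one where it does not.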
It's about shit's relative adhesion to a wall...
Not sure where Goldstein gets the idea that McGuire's "realistic floor" is an everyday player. The guy is getting destroyed by the FSL (0.588 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=OPS" onmouseover="doTooltip(event, jpfl_getStat('OPS'))" onmouseout="hideTip()">OPS</span></a>). His realistic floor is never becoming a major leaguer.
http://mlbfarm.com/data/weather.csv (no affiliation with site)
And it's definitely not a trivial variable when predicting run scoring.
Not sure what your point is, but just FYI -- historical wind/temp data is definitely available for every game. And, yes, the wind direction-to-temp correlation is huge for Wrigley. Weather is a complex issue to dissect, but it absolutely needs to be accounted for in any serious attempt to adjust pitching performance by degree of difficulty.
Temperature is huge. Ask anyone who bets baseball totals for a living.
I think Verducci needs a lesson in causality.
Nope. The guys who deter runners are going to have to throw out far better base stealers, on average, than those catchers against whom everyone runs. So if you fail to adjust <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=CS" onmouseover="doTooltip(event, jpfl_getStat('CS'))" onmouseout="hideTip()">CS</span></a> rates for quality of runners against, you'll massively underrate the skill of the deterring catchers.
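One way to sketch the adjustment, assuming each runner has a known baseline stolen-base success rate against all catchers: credit the catcher only for caught-stealings beyond what those particular runners would normally give up. The function name and the attempt data are hypothetical.

```python
def cs_above_expected(attempts):
    """attempts: list of (runner_baseline_success_rate, was_caught) pairs.

    Returns actual caught-stealings minus the number expected from the
    quality of the runners who chose to go.
    """
    actual = sum(1 for _, caught in attempts if caught)
    expected = sum(1.0 - rate for rate, _ in attempts)
    return actual - expected


# Hypothetical catcher A: only elite runners (85% baseline) dare to run; 2 of 10 caught.
catcher_a = [(0.85, True)] * 2 + [(0.85, False)] * 8
# Hypothetical catcher B: everyone runs (65% baseline); 3 of 10 caught.
catcher_b = [(0.65, True)] * 3 + [(0.65, False)] * 7

for name, attempts in (("A", catcher_a), ("B", catcher_b)):
    raw = sum(1 for _, caught in attempts if caught) / len(attempts)
    print(f"catcher {name}: raw CS% = {raw:.0%}, CS above expected = {cs_above_expected(attempts):+.2f}")
```

Raw CS% favors catcher B here, while the runner-adjusted number favors catcher A, which is the underrating the comment describes.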
Time flies. Some of those shiny new parks, like Camden Yards, are 20+ years old now. We're prob. within 10 years of another refresh cycle.
How are you able to isolate the effect of an aggressively shrinking strike zone from what you assert is an organic and permanent downward slope of offense over time?
The fact that you genuinely think games will average 2-3 total runs in ~5 years shows you really don't have a handle on how the game is (and isn't) changing.
Yeah, I don't follow the conclusion of this article at all. Is he seriously suggesting teams "play with the batting order" based on tiny-sample #-of-outs splits? That's lunacy.
Other research has suggested that day/night and dome/outdoor conditions may have a significant effect on umpires' pitch calling at the borders. Can you verify and, if necessary, control for this?
Wait, what? First of all, that's incredibly noisy data. Starting the season 7-0 might have a tiny, tiny bit more predictive value than going from 2-2 to 9-2, but that's not even worth noting.
The real problem with this article:
"But if you really want to know a team’s true colors, look at what it does when the weather turns a bit hotter."
*Of course* a 7-0 streak is going to correlate higher with season win pct. the later in the season it occurs. Is this not obvious? [And the dropoff in Sept. is due to roster expansion]. Teams that are 80-50 are a hell of a lot more likely to run off seven wins in a row than teams that are 50-80. The 7-0 run is not the cause or predictor of the great season record in this case; it's simply a symptom of being a great team over a large sample of games.
How was this article even published?
I haven't seen anything in three years that indicates you've taken any steps (even wild attempts) to improve your algorithm. Meanwhile, the rest of the world advances at a dizzying pace. You're quickly becoming the AOL of player forecasting.
Correction: should have read "modified xFIP," not "modified xERA."
Absolutely, on the double-counting of temp. Once you park-adjust, all weather variables should be vs. park average, not raw.
Humidity is an under-rated variable. Hint: the sogginess of the ball far outweighs the lighter air in humid conditions.
After reading this article, I have no idea what method you used to test it. Are you saying you calculated <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=DRA" onmouseover="doTooltip(event, jpfl_getStat('DRA'))" onmouseout="hideTip()">DRA</span></a> in Year-1 and compared to Year? Or something else?
Let's assume that's the case, and the DRA beats <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=FIP" onmouseover="doTooltip(event, jpfl_getStat('FIP'))" onmouseout="hideTip()">FIP</span></a> when using only last year to predict this year. I would claim that one could quite easily use a modified xERA (adjusted for park and some pitcher qualities) that crushes FIP in accuracy for 1-year predictions. [Source: 10 years as a pro gambler]. I.e., you chose an incredibly easy target in FIP, since it relies on the absurdly noisy <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</span></a> allowed stat. Don't get me wrong: I'm all in favor of this research. But your validity test should never be something as trivial as beating raw single-season FIP. Beat <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PECOTA" onmouseover="doTooltip(event, jpfl_getStat('PECOTA'))" onmouseout="hideTip()">PECOTA</span></a>. Beat Steamer. Beat something that a few people actually use for prediction.
No, that makes no sense. The only way you should be using temperature as an input is by how much it differs from the stadium norm. Why would you look at raw temps. after adjusting for park?
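A minimal sketch of that input, assuming the "stadium norm" is simply the mean game-time temperature at each park in the sample (a real model might use a seasonal or monthly norm instead). Park names are real, the temperatures are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def temp_vs_park_norm(games):
    """games: list of (park, temp_f) pairs.

    Returns (park, temp_f, deviation_from_park_mean) for every game, so the
    temperature input no longer double-counts what the park factor captures.
    """
    by_park = defaultdict(list)
    for park, temp in games:
        by_park[park].append(temp)
    norms = {park: mean(temps) for park, temps in by_park.items()}
    return [(park, temp, temp - norms[park]) for park, temp in games]


# Hypothetical game-time temperatures.
games = [("Wrigley", 50), ("Wrigley", 70), ("Wrigley", 90), ("Coors", 60), ("Coors", 80)]

for park, temp, dev in temp_vs_park_norm(games):
    print(f"{park:8s} {temp:3d}F  {dev:+.0f} vs park norm")
```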
Good stuff. Keep it coming.
Wood was playing in a much higher run scoring era (early 2000s), yet had a much lower <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</span></a> rate. Gallo had 104 HR (1264 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PA" onmouseover="doTooltip(event, jpfl_getStat('PA'))" onmouseout="hideTip()">PA</span></a>) in his first three seasons. Wood had 73 HR (1403 PA) in his first three seasons. Gallo's 8.2% HR rate today is quite a bit different than Wood's 5.2% HR rate in 2005. I have no idea how Gallo will turn out long-term, but he has a lot more power than Wood ever did.
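For anyone who wants to check the rates, here is the arithmetic computed directly from the HR and PA totals quoted above (the helper name is just for illustration):

```python
def hr_rate(home_runs: int, plate_appearances: int) -> float:
    """Home runs per plate appearance, expressed as a percentage."""
    return 100.0 * home_runs / plate_appearances


gallo = hr_rate(104, 1264)  # first three seasons, per the comment
wood = hr_rate(73, 1403)    # first three seasons, per the comment

print(f"Gallo: {gallo:.1f}%  Wood: {wood:.1f}%  ratio: {gallo / wood:.2f}x")
```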
Would like to see Thorburn's take on Tate's delivery. Has to be the quickest leg kick ever, which I assume is not a good omen.
Anyone standing out yet as a lock to be a top 5 pick?
Ans: Walk rate doesn't matter much for the low minors.
"And the average final rankings are 20 for trajectory up, 26 for trajectory down, and 22 for trajectory staying the same, which is not enough of a difference to explain the lifetime <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=WARP" onmouseover="doTooltip(event, jpfl_getStat('WARP'))" onmouseout="hideTip()">WARP</span></a> variation that you see."
Well, that depends what the actual distribution looks like. If the "risers" group had an abnormal number of top 10 guys (and 31-40 guys), the WARP would have been higher, even though the avg. and median rank wouldn't have shown it. That's because WARP does not decline linearly when you go from #1 to #40.
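A toy example of the point, using a made-up convex value curve (expected WARP = 40 / rank). That curve is purely an assumption for illustration, not a fitted relationship. Two groups share the same mean and median rank but have very different expected WARP.

```python
from statistics import mean, median


def expected_warp(rank):
    """Hypothetical convex value curve: steep at the top of the list, flat later."""
    return 40.0 / rank


risers = [1, 39]    # one elite prospect plus one fringe prospect
steady = [20, 20]   # two mid-list prospects

for name, ranks in (("risers", risers), ("steady", steady)):
    avg_value = mean(expected_warp(r) for r in ranks)
    print(f"{name}: mean rank {mean(ranks)}, median rank {median(ranks)}, "
          f"mean expected WARP {avg_value:.1f}")
```

Because the curve is not linear, averaging ranks first and then reading off value is not the same as averaging the values themselves.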
The proper test here is to compare expected WARP (for given final rank, age and position) to actual WARP for ~five bins of ranking trajectories [lots of ways to define this]. If the effect is real, it will be obvious.
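The proposed test could be sketched roughly as follows. It assumes each player record already carries an expected WARP (from final rank, age and position) and a trajectory bin label; how those are built is left open, and the field names and sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def residual_by_bin(players):
    """Mean (actual - expected) WARP for each ranking-trajectory bin."""
    residuals = defaultdict(list)
    for p in players:
        residuals[p["bin"]].append(p["actual_warp"] - p["expected_warp"])
    return {bin_name: mean(vals) for bin_name, vals in residuals.items()}


# Hypothetical records; a real test would use every ranked prospect.
players = [
    {"bin": "big riser", "expected_warp": 10.0, "actual_warp": 14.0},
    {"bin": "big riser", "expected_warp": 6.0, "actual_warp": 8.0},
    {"bin": "flat", "expected_warp": 8.0, "actual_warp": 8.0},
    {"bin": "big faller", "expected_warp": 9.0, "actual_warp": 5.0},
    {"bin": "big faller", "expected_warp": 5.0, "actual_warp": 4.0},
]

for bin_name, resid in residual_by_bin(players).items():
    print(f"{bin_name:10s} mean residual = {resid:+.1f} WARP")
```

If trajectory carries information beyond final rank, the residuals should line up by bin; if not, they should all sit near zero.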
Weather, park, travel schedule, training staff, player age, unique player skill (in improving/declining during the season)...as far as I can tell, NONE of these were controlled for. This isn't a nitpick. Failing to control for any of the likely causes completely invalidates the analysis.
Exactly. The largest impact on mid-season vs. opening day batter performance is going to be change in weather -- very large for some cities, while not for others. Another potentially huge factor is going to be travel schedule, which can vary significantly from team to team.
The only way this analysis would be credible is if it looked only at managers who managed multiple teams.
That flyball hitter vs. groundball pitcher "inefficiency" theory never really made any sense to me. The only way it would offer the team an advantage is if Oakland's schedule involved an unusually high percentage of <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=GB" onmouseover="doTooltip(event, jpfl_getStat('GB'))" onmouseout="hideTip()">GB</span></a> pitchers (otherwise, it just adds game-to-game volatility on offense, which is only really useful if you are a bad team).
In 2013 and 2014, their division opponents consisted of two groundball pitching teams (Sea and Hou), and two flyball pitching teams (LAA and Tex). Overall, it was an average FB/GB opponent pitching schedule.
It's true. No one likes to talk about it, but there are a handful of 30+ year old vets who have consistently outperformed their peripherals (and thus their projections). See: Young, Chris. It's time to quantify it and factor it into intelligent projection algorithms.
Have you guys thought about opening the PECOTA algorithm for public commentary? By BP's own admission, PECOTA has no real advantage over competing projection systems. So why not open up the black box and accept constructive criticism that will enable you to regain the lost advantage?
Love that you simply ignored my comments. Am I to assume you agree with them but are too lazy to address them via actual changes to your algorithm?
Am I the only one here who has no idea who Jim Walsh is?
"Cingrani has never been good against righties."
Never? In 223 IP in the minors, his FIP was 1.72 vs. righties and 2.94 vs. lefties. Yes, it's the minors, but that's a massive *reverse* platoon split -- pretty rare for any LHP over 200+ IP. Considering he hasn't even logged 200 IP in the majors, it may be a bit premature to label him a LOOGY in the making.
@therealn0d I'm not aware of any international 16-year-olds on top 100 lists in the past 10 years.
My theory (that perhaps the author could test) is that there is far too much inertia when changing prospect rankings. a.k.a. the VanBenSchoten effect. To test that, you might look at players moved either up or down 25+ spots in a given year, and see how much they out/under-performed their position (in the second year).
The other really interesting thing would be to look at which ranking sources have been leaders vs. followers. That is, if source X ranks a player a certain distance from the average rank, do the rest of the sources tend to follow the next year or not? It's a simpler/quicker test than career WAR, and perhaps more valuable as it would be good to know who is best at predicting future predictions.
Is there any evidence that plate discipline, when used out of sample, is predictive of MLB career success (in excess of that predicted by either his prospect pedigree or his statistics excluding plate discipline)?
What looks random to you isn't random to more sophisticated owners.
You must not be too familiar with the backgrounds of career RPs/current closers like Holland, Melancon, Street, Robertson, Cishek, Jansen, Allen, etc., etc. Basically all the projected save leaders for this season were strictly relievers in the minors.
The old "many stud closers used to be starters" adage is sorely outdated and unfortunately being propagated by your cherry-picking comment.
All of the "floors" in this list are incredibly ambitious. Probably only about half of the list will play more than 2-3 years in the majors. There's nothing wrong with saying a guy's floor is "part-time player for a couple years." Instead, all 50 players apparently have the floor of an average MLB player. Or at least ~30 that weren't wasted with weak attempts at humor.
For this list, I would rename floor as "80th percentile outcome," and ceiling as "99th percentile outcome." Then again, if you have to do that, it becomes a little harder to trust the author's ability to rank the players in the first place.
Wondering why you think D.J. Peterson will have a "very helpful" batting average? That seems extremely unlikely.
Don't know where else to post this, so I'll ask it here: is there any sign of PECOTA 10-year projections?
Interesting take on Moran, widely considered a bust -- even by your own prospect staff. I think the message is quite the opposite of what you present. Taking a corner infielder with no power at the top of the first round is usually not a worthwhile investment.
Ah, yes. I remember when forecasting algorithms were innovative and data-driven. As opposed to today's BP motto of: "well, we've done the same thing for the last three years; might as well not change it now."
Not sure I follow. According to Sam Miller, you guys have not changed a thing in your algorithm from last year to this year, all data has been available since last November, and yet no firm date has been set for this season? What's taking all the time?
Is there an actual schedule?
Read the comments on all the PECOTA articles over the past two seasons if you are looking for suggestions. The 10-year forecasts often make no sense as they are over-fit to a small set of similar players rather than smoothed to reflect more general (and more likely) aging curves.
Really, no changes were deemed necessary since last March? Are you guys content with the overfit, rollercoaster-riding 10-year projections, where players go from a TAv of .263 to .251, and back up to .265 over 3 seasons, all predicted 5+ years in advance? I know plenty of your readers are not content with it.
I won't rehash all the examples people noted last year, but they are not hard to find in last year's comments (or the previous year's...or the ones from the year before that).
Have there been any substantial changes made to the PECOTA algorithm this year? If so, will they be documented?
I wonder how many guys who were listed at 6-2, 195 out of HS went on to become elite defensive CFs. Not to mention the decaying value of elite CFs as contact rates continue to fall.
Of your five grade categories, which (if any) would you say are correlated with injury risk?
Also, I assume the grades are absolute, right? So a single-A pitcher with an overall C grade is considered to have the same current quality of mechanics as a MLB C-grade pitcher?
"The point of this article, and the "little better than random" factoid at the end, was to point out that there *is* substantial room for improvement, and the most fruitful place I see to pursue that improvement is in terms of identifying some of the major breakouts that happen every year, as they contribute a disproportionate amount of forecasting error to the total."
My point is that you should *not* be concerned about missing a handful of "breakout" players, nor with minimizing absolute error. You should be focused on trying to improve clear shortcomings in your algorithm. A partial list:
Pitchers who are unique enough for long enough that it's clear they are defying BABIP norms and will continue to do so. See: Young, Chris.
Pitchers who perform abnormally well/poorly with men on base -- thus affecting projected ERA.
Hitter platoon effects. How much of a hitter's past performance was driven by favorable L/R match-ups, vs. the projected mix he will face this season?
Long-term forecasts. This topic deserves a separate post. But suffice it to say when you find your algo spits out a 6-year TAv projection of: .280,.281,.270,.266,.285,.240, you have a serious over-fitting issue...
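One rough way to flag that kind of year-to-year whiplash is to count how often a projected series changes direction, then compare it with a smoothed version. This is a minimal sketch only: the three-year centered moving average is an illustrative smoother (not anything PECOTA is known to use), and the function names are hypothetical.

```python
def reversals(series):
    """Count how many times the year-over-year change flips sign."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    signs = [d > 0 for d in diffs if d != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def moving_average(series, window=3):
    """Centered moving average; endpoints use whatever neighbors exist."""
    half = window // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# The six-year TAv projection quoted above.
tav = [0.280, 0.281, 0.270, 0.266, 0.285, 0.240]
smoothed = moving_average(tav)

print("raw reversals:", reversals(tav))
print("smoothed reversals:", reversals(smoothed))
print("smoothed series:", [round(x, 3) for x in smoothed])
```

A real aging curve would presumably be fit across many players rather than smoothed one player at a time; the point here is only that a reversal count makes the "rollercoaster" complaint measurable.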
"It sounds like you are looking for a solid comparison between projection algorithms, which I think is a very worthwhile endeavor, but as I noted, shouldn't be done by me. I'm more focused on whether and how to improve the performance of PECOTA."
Improving the performance of PECOTA *requires* "a solid comparison between projection algorithms." How else are you going to know how good you are?
Exactly. Thank you for writing this. Anyone who thinks projections are "not much" better than random has never tried to win a fantasy league using league-average rate projections for every player. You wouldn't just finish below the average team; you'd finish in dead last by a massive, massive margin. And good luck trying to beat the oddsmakers in gambling with league-average ERA projections for every pitcher.
The "Look -- your fancy-pants projection algos 'barely' outperformed a monkey/dart/etc. in absolute accuracy" articles really need to stop. What we need is practical evaluation of *relative* accuracy over several seasons, such as what one site has done with weekly football player rankings already: http://www.fantasypros.com/about/faq/accuracy/#pay
I have no affiliation with that site, nor do I think their methodology is perfect. I do think they have the right *idea*, however, and it should be the general framework used to evaluate any ranking or projection system. That is, in some simulated competition, how would each projection system have fared? And by all means, include the all-players-are-the-same projection set as a control...so we can see just how damaging it would be to use anything like them in real life.
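To make the "relative accuracy with a control" idea concrete, here is a minimal sketch that scores each projection set by how many player pairs it orders correctly. Everything in it is an assumption for illustration: the scoring rule, the function name, and the numbers, which are made up rather than taken from any real projection system.

```python
from itertools import combinations


def pairwise_accuracy(projected, actual):
    """Share of player pairs that a projection orders correctly.

    Ties in the projection count as misses, so a flat projection scores 0.
    Pairs that are tied in the actual results are skipped.
    """
    correct = total = 0
    for i, j in combinations(range(len(actual)), 2):
        if actual[i] == actual[j]:
            continue
        total += 1
        if (projected[i] - projected[j]) * (actual[i] - actual[j]) > 0:
            correct += 1
    return correct / total if total else 0.0


# Made-up batting averages for four players (illustration only).
actual = [0.300, 0.270, 0.250, 0.240]
systems = {
    "system_a": [0.290, 0.275, 0.260, 0.245],
    "system_b": [0.265, 0.280, 0.255, 0.250],
    "league_average_control": [0.260] * 4,  # every player projected the same
}

for name, projected in systems.items():
    print(name, round(pairwise_accuracy(projected, actual), 3))
```

The flat control cannot separate any two players, which is the practical cost that absolute-error comparisons tend to hide.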
That would be strange, since an article on BP this past winter showed that groundball pitchers fare worse (than expected) against flyball hitters, of which the A's team is almost entirely comprised. Based on that research, I would not think throwing an unusually high pct. of sinkers would work well against the A's.
Brilliant use of division. Wonder what the control split would be: a stud catcher from age 22-25 vs. his career from age 26-36? You think there would be a wee bit of a drop-off in the WAR/season rates? It's called an aging curve.
Not to mention longevity. Do most catchers with career-altering leg injuries go on to play 12 more seasons? Do they steal 22 bags and leg out 6 triples the very next season after the injury -- the same season in which they rank in the top 5 (wOBA) of all MLB catchers?
It's a nice narrative, but it just doesn't hold much water.
Good point. One possible causal chain: A's hitters perform worse --> Underperforming hitters get into worse batting counts --> Worse batting counts lead to a lower pct. of fastballs. That seems more reasonable than a league-wide effort to throw more breaking pitches in the second half.
Not sure raw FIP is that useful (vs. ERA) over such a small sample:
FIP still uses HR rates, which are highly erratic over a half-season. Better would be to use xFIP after adjusting for park HR rates.
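For reference, the difference between the two stats is just where the home run term comes from. A minimal sketch using the commonly published FIP and xFIP formulas; the additive constant and the league HR/FB rate vary by season, so both are passed in as parameters, the sample line is hypothetical, and no park adjustment is applied here.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """Fielding Independent Pitching from actual home runs allowed."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant):
    """xFIP: swap actual HR for fly balls times the league HR/FB rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical half-season line (not a real pitcher).
line = dict(bb=20, hbp=2, k=80, ip=90, constant=3.10)
print(round(fip(hr=10, **line), 2))
print(round(xfip(fly_balls=60, lg_hr_per_fb=0.10, **line), 2))
```

A park adjustment could be layered on by scaling `lg_hr_per_fb` with a park HR factor, but which factor to use is a separate judgment call.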
I'm sure JK would love a lesson in semantics.
...except that 60% of his career WARP was achieved after that injury.
You can't protect the wrist. I'd like to see what pct. of wrist/hand injuries to batters were caused by a HBP.
Personally, I have no desire to own a player who is an HBP magnet. Too many wrist/hand injuries.
Publicly, of course not. Privately and functionally, it is. It's not my "personal belief"; it's how the majority of present-day unions operate in practice.
Didn't forget about any of them. Your comment was about a crowded infield, not the outfield, which is rather uncrowded. Lake isn't going to block anyone, btw. They have essentially all three OF spots to fill long-term. If by some chance all three of Bryant, Almora and Soler pan out as outfielders, that's great for the Cubs. McKinney is not considered a top prospect at this point.
The "up arrow", btw, is for fantasy purposes. Russell will now play for a team with deeper pockets, in a potentially great lineup, in a much friendlier park. And his odds of playing SS have gone down a bit. Net, probably a wash or a slight uptick in long-term fantasy value.
Like any other union, the MLBPA is designed solely to line the pockets of its most senior members, at the expense of junior members, non-members and the union's employers. Who do you think is paying for guaranteed contracts of aging veteran players? It's the guys in the minors, and those MLBers who get washed out or injured before hitting free agency. It's essentially financial hazing in a fraternity where < 5% of the rushees get in. If it sounds like an unfair deal, it is. But I highly doubt it will ever change.
If all five guys (Castro, Bryant, Russell, Baez, Alcantara) end up hitting their potentials, they move Bryant to OF, and have four players for three positions. Odds are at least one will get injured and/or be a bust. So chances are slim that a really good player will be "blocked" for any length of time. P.S. Bogaerts is only 21. A little too soon to pass judgment.
Straily has more than homer issues to deal with. He's throwing 87-88 now: http://www.fangraphs.com/pitchfxo.aspx?playerid=9460&position=P&pitch=FA
That's not the velocity trend of a healthy pitcher.
It should also be noted that Norris (Dunedin) played in a significantly more hitter-friendly park than Berrios (Ft. Myers) does in the FSL.
Why not just save the drama, and have a top 50 list which is updated weekly?
Is there an ETA for the mid-season top prospect list?
Allow me to spell out the main "risk": getting worse instead of better as your career progresses suggests that your future fantasy earnings are riskier than those of a player on a more traditional, upward trajectory.
There is no greater red flag than that, yet by ranking him #22, you are essentially ignoring it. And then in your comment, you confirm your dismissal of his struggles by trying to attribute them to a "small sample size," and assuring us that his BABIP is "about the same" this season. I guess we should just ignore the fact that K/BB stabilizes long before BABIP?
The truth is his peripherals have not been good since last July, or 140+ IP ago -- i.e., for more than half of his career. Let's stick to quickly stabilizing stats. The average K%-BB% for an NL starter is ~12%. Here are Miller's last five full months:
July 13: 13.7%
Aug 13: 14.2%
Sep 13: 1.6%
Apr 14: 3.4%
May 14: 6.7% [estimated after tonight's game]
Total K%-BB% since last July (141 IP): 7.5%
How bad is that? Of the 65 NL SPs with 140+ IP across 2013-2014, a grand total of five of them had a K%-BB% worse than 7.5%.
So, yeah, that implies just a touch more "risk" than I would tolerate in a supposed top 25 U25 fantasy property. The reason I didn't waste time/space with these stats in my original comment is because I think everyone here is already at least reasonably aware of the trend and how long it has been occurring. Maybe you are not. Or maybe you consider 140 IP to be a tiny sample for K and BB rates. I have no idea. Nor am I interested in engaging in any further argument over it. (FWIW, I'm actually rooting for Miller to turn it around).
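For anyone who wants to reproduce numbers like the monthly K%-BB% figures above, the stat is just strikeouts minus walks per batter faced. A minimal sketch; the function name and the sample counts are hypothetical, and the raw monthly K/BB/batters-faced totals are not given in this thread.

```python
def k_minus_bb_pct(strikeouts, walks, batters_faced):
    """K%-BB% in percentage points: (K - BB) / batters faced * 100."""
    if batters_faced <= 0:
        raise ValueError("batters_faced must be positive")
    return 100.0 * (strikeouts - walks) / batters_faced


# Hypothetical month: 20 K and 8 BB against 100 batters.
print(k_minus_bb_pct(20, 8, 100))

# Unweighted mean of the five monthly figures quoted above.
monthly = [13.7, 14.2, 1.6, 3.4, 6.7]
simple_mean = sum(monthly) / len(monthly)
print(round(simple_mean, 2))
```

The unweighted monthly mean need not match the 7.5% total, since a cumulative figure is presumably computed over all batters faced, which weights busier months more heavily.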
As you may recall, I simply stated an opinion on two players you ranked, which for some reason triggered a snarky (non-)reply, followed by a depressingly shallow "defense" of your ranking: Small sample (incorrect) + injury (unsubstantiated, at best) + work ethic (no specifics), when mixed with a recently bottom-10-percent K%-BB% pitcher, sprouts the #22 most valuable U25 asset in fantasy baseball.
Yours? One of the least helpful I've seen on this site. Might be more "helpful" to your readership if you defend ranking a guy 22nd overall who has regressed significantly over his first 250 IP in MLB -- so much that his own team didn't want him anywhere near the mound last post-season.
I think both players I mentioned have a shot to be solid; I just think that each is significantly overvalued in this list because there is not enough upside to overcome the risk of being a bust.
There are some serious risk factors around Shelby Miller and Mark Appel, that I don't think are being factored into their rankings fully.
Any update on Addison Russell's health (out for last 7 weeks with hamstring injury)?
These PECOTA projections are from the pre-season, right? So many of these players would have significantly different forecasts if they incorporated stats from this season.
What exactly is the model you came up with?
Yeah, +15 career WAR by age 27 is quite the all-time bust... PECOTA was the bust, not Wieters. As for Goldman, let's just say it was addition by subtraction to get him out of here.
If you enjoy intentional over-reaction, there is always WWE. The only way for replay to work in baseball is to ban managers from leaving the dugout. Allowing both challenging and arguing is an incredible waste of time from the fan's perspective.
Exactly. A batter should never be able to delay the game, nor should a pitcher be able to step off. Just as unlimited timeouts would ruin basketball and football, baseball has been ruined to a large number of would-be viewers. I follow baseball because I find aspects of the sport intriguing; I rarely watch games because I find them so painful to sit through.
Would be nice to see umpire strike zone tendency stats somewhere on BP. Brooksbaseball used to have very revealing tables, but they have since disappeared.
Any chance you can add umpire performance reports (pct. of called strikes/balls which were correct/incorrect) to the Pitch f/x section?
Gallo has 14, not 3 Ks on the season (42 AB through Apr 16). Improved contact rate, but still very poor in that dept.
Some feedback: can you reduce the number of decimal places in your tables from eight to, say, two? Hard to read as is.
Also, what was the actual formula you came up with to convert last year velo. and current year velo. to predicted rest-of-season velo.? Thanks.
Weak sauce. Don't tell me, "if he does this, he'll be that." Tell me if you think the guy is overrated or underrated, and why I should believe you.
These are projections, not raw data dumps. The idea of an algorithmic projection is to use past data to predict the future as accurately as possible by whatever means available. No one is being constrained at BP to publishing raw data derived solely from similarity scores; nothing is preventing them from using comparable player data as one input to a more sophisticated predictive model. For all we know, they already are doing so. My argument is simply that the current black box model's output is not looking terribly logical, and doesn't appear to me to represent the best estimate of what a given player's TAv will be in a given year in the future.
As for the curves looking reasonable to you, only a high prevalence of *non*-career ending injuries would cause a lot of ramps in skill followed by immediate declines in a player's early 20s. Not sure there are enough damaging-but-not-catastrophic injuries occurring to young position players (plus legitimate age ~21-onset skill declines) to make that the average case. Note that in many cases, both the ramps up and the declines are projected; it's not like the model is simply predicting regression for outperforming minor leaguers.
I agree with you on the utility of upside. Would like to see 25th and 75th percentile projections for each year, so we get an idea of the volatility/uncertainty levels.
More generally, I don't think you want to be using a 1-year projection as fact when creating the projection for years 2+. For the same reason that you wouldn't try, before the NCAA tourney started, to project second round winners after assuming specific outcomes of first round games. Unless you were averaging the results of many simulations, but it doesn't sound like you are doing that.
I think the issue is that by using a projection as the input for another projection, the expected high-performers for 2014 are going to be treated as outlierish by your long-term algorithm, which will look for regression. This is my best guess as to what is creating the odd "pop-and-drop" aging curves for prospects (see my comments below).
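To illustrate the "average many simulations" alternative, and the 25th/75th percentile bands requested above, here is a minimal sketch. It assumes a plain random walk with no aging drift, and the starting TAv and yearly standard deviation are made-up inputs; none of this reflects how PECOTA actually works.

```python
import random
import statistics


def simulate_paths(start, years, yearly_sd, trials=5000, seed=0):
    """Random-walk career paths: each year adds independent normal noise."""
    rng = random.Random(seed)
    paths = []
    for _ in range(trials):
        level, path = start, []
        for _ in range(years):
            level += rng.gauss(0.0, yearly_sd)
            path.append(level)
        paths.append(path)
    return paths


def percentile_bands(paths):
    """Per-year (25th, 50th, 75th) percentiles across simulated paths."""
    bands = []
    for year_values in zip(*paths):
        q1, q2, q3 = statistics.quantiles(year_values, n=4)
        bands.append((q1, q2, q3))
    return bands


# Hypothetical inputs: a .260 TAv starting point and a .015 yearly SD.
paths = simulate_paths(start=0.260, years=3, yearly_sd=0.015)
for year, (q1, q2, q3) in enumerate(percentile_bands(paths), start=1):
    print(year, round(q1, 3), round(q2, 3), round(q3, 3))
```

Because each later year is built on a simulated earlier year rather than a single point estimate, the bands widen with time instead of producing a spike-and-crash path.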
No, I wasn't confident at all in the previous effort from Wyers et al. 2-3 years ago -- which is why I am disappointed the glaring issues present in those forecasts appear to remain unaddressed. The long-term forecasts by Silver (pre-2011) at least made sense. They had rational aging curves and were smoothed appropriately to reflect uncertainty. The two forecast algorithms -- the one used prior to 2011 and the one(s) used in 2011, 2012 and 2014 -- are quite different.
The aging curves for prospects really don't pass the eye test. For example:
age 20 (2014): .249 TAv
PECOTA is saying all of these guys are essentially major league ready for 2014 (very bullish), but will experience declines in performance from 2015 to 2017 (extremely bearish/bizarre). In addition to the general curve shapes not looking right, there is the issue of lack of sufficient smoothing of the data (see Pederson's predicted roller coaster from age 23 to 26). I had high hopes for the revamped long-term projections, but honestly these numbers do not instill confidence.
It seems unusual that you are projecting a TAv of .277 in 2014 and then .273/.278 in 2015/16 for Bogaerts. Is there any way to create internal consistency between the 1-year forecasts and the long-term forecasts? Not critically important, as I realize these are rough estimates. But just curious if there is an obvious answer as to why 1-year and long-term PECOTAs might not agree.
Please add the 2014 projection to the Long Term Forecast section. It would also be nice to see last year's (2013) MLE stats in this section, to get an idea of what kind of change you are projecting vs. current performance levels. Thanks.
I wouldn't be surprised if a certain portion of pitch framing is actually just adaptive pitch calling. That is, knowing what location and pitch type a specific umpire is most likely to call a strike erroneously and trying to get your pitching staff to locate there more often. That would be a skill, obviously, but a very different kind of skill.
When are long-term projections due out? Opening Day is creeping up.
BP, What numbers exactly are being regressed and what numbers are not (for data listed in PECOTA cards)? And what, right now, is considered your best estimate of each catcher's true talent?
I could be missing something here, but I don't really follow how stocking up on flyball hitters qualifies as exploiting an inefficiency. I understand that flyball hitters are far better than groundball hitters, on average, but unless this is not properly reflected in industry-standard projection systems, that fact by itself is not going to be an exploitable inefficiency. Unless you are saying teams are overly reluctant to assemble an extreme flyball team, leading to individual flyballers being slightly undervalued -- but that does not appear to be the point of this article.
The thing that struck me about the MLB-wide data presented is that, after adjusting it (for the fact that high FB hitters and low FB pitchers are better than avg. overall), the so-called platoon edge for flyball hitters appears symmetrical. That is, flyball hitters perform approx. 9 points of TAv better than "expected" (based on the intersections of overall skill levels of each bucket) when facing GB pitchers and 9 points worse than expected when facing FB pitchers. The only way to exploit this is to make sure your FB batters get more PA than expected vs. GB pitchers. But last year, Oakland's FB hitters faced FB pitchers (bad) 13.4% of the time, while they faced GB pitchers (good) only 8.7% of the time -- for a ratio of 1.54. The average MLB team's FB hitters faced a FB/GB pitcher type ratio of 1.30 -- significantly better than the A's. Part of this is that they play in a division with fewer GB pitchers than average, but that should be part of Beane's calculus, no? If he plays a schedule low in GB pitchers, why assemble an extreme team designed to exploit them?
To summarize, I'm not seeing any conclusive evidence the A's are creating more net beneficial matchups than the average team is. A more volatile team? Definitely. I'm wondering if game theory (if you have mediocre talent, you want to maximize volatility) isn't as big or bigger a factor here.
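The arithmetic behind that ratio, plus a back-of-envelope net effect, can be checked in a few lines. A minimal sketch: the shares and the 9-point edge are the figures quoted above, the function names are hypothetical, and the net-effect calculation assumes all remaining PA (vs. neutral pitchers) contribute zero edge.

```python
def fb_gb_ratio(fb_share, gb_share):
    """Ratio of PA vs. flyball pitchers to PA vs. groundball pitchers."""
    return fb_share / gb_share


def net_platoon_points(fb_share, gb_share, edge_points):
    """Net TAv points across all PA, given a symmetric +/- edge.

    Shares are percentages of total PA; favorable (GB) matchups add the
    edge and unfavorable (FB) matchups subtract it.
    """
    return edge_points * (gb_share - fb_share) / 100


# Oakland's flyball hitters, per the figures quoted above.
ratio = fb_gb_ratio(fb_share=13.4, gb_share=8.7)
net = net_platoon_points(fb_share=13.4, gb_share=8.7, edge_points=9)

print(round(ratio, 2))
print(round(net, 3))
```

Under those assumptions the matchup mix is a small net negative for the A's, which is consistent with the argument that the roster is not obviously exploiting the split.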
@Alex Kantekcki -- I'm not particularly bullish on Granderson myself this season, but I'm curious what data you used to conclude that Citi Field inhibits LHB power? I can't find any handed park factors that show Citi with a LHB HR PF of less than 100.
Steven Moya is a top-5 raw power talent? He was already 22 in single-A and has logged over 1,100 PA in the minors with just 36 HR and a 0.433 SLG to show for it. I.e., he's had time to convert at least some of his raw strength into standout baseball driving ability, but it really hasn't happened yet. I understand this article is about scouting and projection, but top 5 seems like a reach to me.
Just curious: who were nos. 6 and 7 in the list?
Bogaerts is a prime example. People were very slow to upgrade, despite the fact that he was putting up a historic age-adjusted year in single-A.
I suppose a better way to state it is, "what is your opinion of this pitcher's long-term potential?" vs. "what is your opinion of this pitcher right now?" I.e., a single-A player is going to have much worse mechanics, on average, than a major league pitcher; but you may view the minor leaguer as having far greater potential than the major leaguer 3-4 years from now.
Is there an ETA for the PECOTA long-term projections?
Always look fwd. to these articles. One request: is there any way you could show the mechanics report card with both current grades and your projected future grades?
Great article. I've always felt that people are too conservative in upgrading and downgrading prospects. Nice to see some numbers behind it. Hope you can do something similar for "rising" prospects.
The Pinto playing time projection seems awfully low..
Is there any chance you will be reinstating (either at BrooksBaseball or here) pitch f/x ball/strike data organized by umpire? Umpire queries were a great, unique feature at BB.
Something like this might be an interesting way to score each stat category performance:
I don't think the current PECOTA factors in sequencing of seasons (that may suggest real improvement or decline). So you get stuff like Mike Olt projected to have an OPS of .738 in the majors this year...
Any chance you can make the Team Tracker stat tables sortable?
How about adding 2014 projected stats to each block of stats on the page (e.g., Standard Stats, Advanced Stats)? I am not talking about percentiles or 10 years, but rather presenting the 2014 stats in line with the players career stats for easy visual comparison.
Here are some other analyses on the topic:
Please analyze rate projections (e.g., HR/PA) as well as playing time projections for each source. I would consider both types of projections to require skill, but unless they are separated, it is very difficult to tell if the source was good at just PT or just stat rates, or perhaps both.
Agreed. Fangraphs really raised the bar last year in terms of near-daily updates to playing time projections. Hopefully BP can follow suit and provide another source of updated PT projections.
It looks like Remington's article was taken off the ballot. Can't say I disagree with the decision. Among many issues with that article, Fangraphs deleted all the critical comments about it.
I'm with you on this. A guy who throws 91ish, and gets less than 8 K/9 in Japan is very unlikely to be a long-term ace, IMO.
That said, apparently Japanese radar readings "run" about 1.5 mph colder than MLB's:
Also, FWIW, Davenport's projection is rather bullish:
"At 28, the Matt Wieters Facts aren’t coming true."
If you want to talk facts, at least mention a few of the relevant ones. Namely, that Wieters' BABIP was 36 points below his career average last season. I.e., if anything, he is likely to be a relative value this year.
When you say the player "everyone" expected when he was in the minors, I assume you mean "everyone" who bought into PECOTA's HOF-by-age-26 projection back then? Meanwhile, he has ranked #2 in RBI, #3 in HR and #4 in runs among all catchers over the past four years.
If you really don't believe 13% inefficiencies exist in MLB, you're not trying hard/intelligently enough. Source: 15 years of betting baseball.
Cool. Thanks for the update.
One of my main hopes is that you can restore focus on projections (PECOTA), open up the process of developing and testing the algorithm for readers to see, and expand the offering to include long-term projections for prospects.
The main problem I have with this article is that you assume that past TTOP is the only information one can use to predict future TTOP. At least that is what it sounds like when you proclaim, "We can assume that all starting pitchers have roughly the same “true talent” TTOP". No, we can't make this assumption. All we can state is that in this specific case, if one is limited to a certain data set to project a certain skill, that data set can be largely ignored. It does not say anything about whether there can be individual pitchers with truly abnormal, sustainable TTOP, or anything about how one might identify them. There are only 150 starting pitchers at any given time. There is no need to save data processing time by adopting generalizations, when plenty of intellectual throughput exists to evaluate each player individually. The same goes for DIPS, platoon splits, home splits, etc.
Wish I could +10 this comment.
Imagine the conceit required to write a farewell email focusing on the "brain drain" in the organization you were leaving. And then thank your parents and colleagues for allowing you to reach the pinnacle. This isn't the Oscars. You got a relatively low-paying job that countless others have turned down due to location, lack of pay and/or poor work/life balance.
Ah, now I know why you don't ring a bell.. You've made a grand total of three comments here in the past year. Yet somehow you've been stealthily tracking my personal happiness via my BP comments. Awesome.
If you'd like to contradict any of my claims about Wyers' performance, I'm all ears. Something tells me you're not going to have much to say. Meanwhile, here's a list of all my comments: http://www.baseballprospectus.com/i/33584. 100% complaining, eh?
Can I get an "amen"?
Not sure I know who you are. Do you know me?
Somehow I think we will survive without:
- The chronically missed deadlines for PECOTA,
- The last-second retirement of key features (long-term projections anyone?) from the same product,
- The inexplicable hostility toward others in the field. (Remember the amount of energy spent on trying to "prove" why ZIPS in-season projections are garbage, instead of engaging in an open process to produce the best ones of your own?)
- The inexcusable errors with every set of pre-season projections and refusal to answer questions/criticisms of them. No, we don't forget the year you projected, among tons of other absurdities, that all Sea pitchers would have 20 more aggregate wins than the team itself. You were alerted to this multiple times in Feb/Mar, and never once acknowledged it, much less fixed it.
- The declining amount of original research at BP.
- The impossible-to-use, late-90's interface of the stats pages.
Why am I writing this? It's because I think BP is a unique resource; yet I have seen it head in the wrong direction over the past few years. Certainly not all of this is Wyers' fault; but frankly, a large portion of it is. I hope that BP will look for a new statistical director who will be: more transparent in his/her methodologies, more deadline-oriented, and more committed to producing original research.
This article would be a lot stronger if it showed exactly how sub-optimal ("stupid") Baker's decisions have been. The article basically assumes this as fact. It's certainly the popular opinion of him, but I would expect more evidence to be presented.
Apologies if I am missing this somewhere on the site, but can you post the full updated catcher framing list? Ideally, both current season and career lists. Thanks.
The question is whether 50 innings in a certain park indicates a sustainable trend or random fluctuation? In the case of Zito and Petco, it's almost certainly random.
I second that. Insider quotes are always interesting.
Why has this not been updated in over a month for the hitters? All sorts of changes, namely that Rendon is now the full-time 2B. You still have him listed with just 10% playing time.
Are there any top 100 prospects who are known to have possible age fraud issues? This kind of thing is never talked about publicly until a guy is caught. Wondering how widespread the issue is today and if any teams have inside info. (vs. the entire league sharing the suspicions about the same players).
Any view on Tony Sanchez's season and odds of becoming a starting MLB catcher?
There's nothing to "get to the bottom of" from an analytical point of view. It doesn't matter how much you care or don't care about the issue; there is no way to intelligently adjust past performance for drug use because we don't know who used what when. Throwing your hands in the air and saying that all past stats are useless (as you imply in your PECOTA house of cards comment) is your option. The rest of us will continue to treat drug use as just another one of hundreds of variables that we can't measure accurately enough to include in a predictive model.
Is there any concern whatsoever about the legitimacy of Sano's age? Not that I have heard anything, but just asking if any teams feel that way. More generally, do some teams factor in an age "adjustment" if they suspect a player has a decent chance of using a fake age, or has this practice become so rare that it would be foolish to do so?
Absolutely. It might seem ridiculous, but I think there should be a monthly overall top 101 released during the season. It's easy to say player x is off to a good/bad start, but harder to estimate just how much his prospect stock has changed as a result.
Except once you have part of a crew doing the replay, his objectivity is in question. Anything remotely close, he will probably side with the umps' original call...and expect the same treatment when he is on the field in 4 out of 5 games. Best solution is to have a non-umpire in the booth.
I choked on my coffee when I read this: "With his purse strings wound tight around the Steinbrenners' fingers in hopes of staying under a $189 million budget in 2014..."
And which is why if you're serious about pitching projections, you'll use Steamer instead of PECOTA. It at least attempts to factor in major velocity loss into its projections.
I hope you are joking. A player projection system does not try to project front office moves. If you are that certain you know what a front office will do, apply that knowledge to the raw projections yourself.
Disagree. I like to hear about the players whose performances have been the most unusual -- that is, those most likely to represent a change in prospect value -- whether it's an 18 year old or a 24 year old.
I called it a "horrific start." How is that putting "weight" on anything? If you have no information to add (and clearly you don't), that's totally fine....and why I posed the question to Zach, not you.
A 21/17 K/BB ratio for a top pitching prospect is horrific in my book.
If you ran this same analysis on fake hitting coaches and seasons (randomly assigning batters to groups of the same sizes), would the above tables all be expected to show zeroes? I.e., what are the odds this is all noise?
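The check being asked for is essentially a permutation test: shuffle batters across "coaches" while keeping group sizes fixed, and see how often the shuffled spread is as large as the real one. A minimal sketch with made-up numbers; the spread statistic, function names, and data are all assumptions, not the article's method.

```python
import random


def group_spread(values, sizes):
    """Largest minus smallest group mean when values are split by sizes."""
    means, start = [], 0
    for n in sizes:
        means.append(sum(values[start:start + n]) / n)
        start += n
    return max(means) - min(means)


def permutation_p_value(values, sizes, trials=2000, seed=0):
    """Share of random reassignments whose spread matches or beats the real one."""
    rng = random.Random(seed)
    observed = group_spread(values, sizes)
    shuffled = list(values)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if group_spread(shuffled, sizes) >= observed:
            hits += 1
    return hits / trials


# Made-up changes in TAv for nine batters under three coaches (3 each).
changes = [0.012, 0.008, -0.003, 0.005, -0.010, -0.006, 0.001, -0.002, 0.004]
sizes = [3, 3, 3]

print(round(group_spread(changes, sizes), 4))
print(permutation_p_value(changes, sizes))
```

A large share of shuffles matching the observed spread would suggest the coach "effects" are indistinguishable from noise.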
Anything on Gerrit Cole or Mike Olt? Truly horrific starts of the season for each. Olt appears to still be out (but not on the DL). Kind of odd.
How does a subpar catcher throw out 26% of runners for his career? That's right around MLB average. Only 8 guys have attempted to run on him this season, so it seems like teams are reading a different scouting report than you are.
P.S. I would agree with you that pitch-framing analysis is becoming a bit overexposed. Not sure we need weekly columns on it.
A big part of framing is making some kind of effort to hold the ball after the catch and allow the ump time to make a (presumably difficult) decision. Norris' "effort" is terrible (above); he's already started to throw the ball back to the pitcher before the ump even has a chance to make a strike call.
Any idea what the Nats' plan is for Rendon? It seems like if they want him to be a 2B long-term, they should be having him play 2B in the minors now...but he's not.
Couldn't disagree more. Hate the expanded rosters in September. It's a completely different and less entertaining game.
The smaller the number of RPs, the fewer advantageous LHP vs. LHB matchups you can engineer, and thus the higher your overall bullpen ERA.
For pitching, you'd be nuts to use PECOTA over Steamer. Major issues. For a random 2013 example, see: Marmol, Carlos.
Thanks, but honestly too late to help for fantasy purposes.
I'll certainly sell you that $3 Rivera-Chapman spread. Marmol actual: 6.5 BB/9 last 2 years. Marmol projected: 1.20 WHIP. Can't wait to hear what the auditors have to say about the PECOTA process next year..
Ben -- any chance you can post an article examining pitching velocity gainers/losers this spring?
Any way to dig up the code from Nate Silver's last year running PECOTA, and produce some provisional projections for this year? Even if they were flawed, they seemed to be fundamentally sound and offered an alternative view of the career paths of important players.
Yeah, that would be great.
Can someone at BP run a similar article for this pre-season?
I second that. Would be great to have a running list (updated monthly year-round) showing the top ~500 dynasty values (incl. college/HS/foreign prospects, minor leaguers, and MLBers).
aaaand, still waiting.
When are long-term forecasts coming out?
Just curious: did you have pitches per inning as an available factor?
Couldn't you look at groundball rates instead?
What is the ETA for PECOTA long-term projections?
I agree. Are we to assume that a performance evaluation was done behind the scenes? Wyers said, "A full list of changes would be rather long and frankly a little tedious". I find that to be hard to believe (a few hours of labor would be well worth it). More generally, I cannot understand why PECOTA has become a blacker and blacker box since Silver left. Throw us a bone here. What improvements were made and why?
General prospect question: do teams quietly apply an age inflation factor to the listed ages of Latin American players? It seems like whenever an age has proven to be false in the past, articles were written suggesting that teams widely "knew" the player was a couple years older than listed, and that the info. had already been factored into the player's contract/trade value.
The question is how widespread is lying about age believed to be today? If it is still significant, do teams have a decent feel for which players are higher risk or rather are they forced to hedge and apply a slight inflation factor across the board based on nationality?
I totally agree with your last paragraph. People are so eager to claim that outliers will not repeat that they miss the very real possibility that a certain human being might behave abnormally in certain situations due to a very real and repeatable cause. Josh Hamilton cannot possibly have persistent day/night splits bc most players do not. Jered Weaver is apparently no more likely than the average pitcher to outperform at home next year. Justin Upton's ridiculous career home/road splits: sheer noise. Luke Hochevar: chronically unlucky, mostly because Hellickson stole all his luck. The list goes on and on.
One reason I am confident that people have gone off the deep end in applying generalizations to tail cases is that I have made a living betting on baseball for the past 10 years, and it's been in no small part due to taking the other side of these inaccurate predictions of mean reversion (provided the sample is large enough to refute it).
I hope someone here or elsewhere can devote an entire series to examining serious career outliers (in FIP vs. ERA, handedness splits, etc.) apart from the rest of the population and determine if such players should be lumped in with the rest of the player universe, or rather if we should be applying different predictions of regression.
Never cared about the hall of fame; never will.
I know you are not necessarily suggesting something this extreme, but I just wanted to mention that the Orioles attempted to profile prospects based on an objective measure of makeup/drive about 10 years ago. Based on their prospects' (lack of) success from that era, it didn't work so well:
"Since Ritterpusch returned to the fold, the Orioles have loaded up on prospects who rate five or five-plus on their scale. Most are pitchers: Adam Loewen, Kurt Ainsworth, Ryan Hannaman, Don Levinski, and Chris Ray are among them. Daniel Cabrera graded well on a Spanish version of the test that the Orioles recently developed, and his profile convinced the Orioles that he could mentally handle the callup to the majors from Double-A, according to Ritterpusch."
The odds that Tabata is really under 25 y.o. are very small...
Abnormal, 1-year home/road splits are not at all predictive.
Turns out I was pretty dead on. How's Bleacher Report treating you?
Nice. Any chance of adding TAv to the table, and of allowing queries for multiple seasons (e.g., 2009-2011)? Thanks.
In order for this to be true, these aces must be underperforming, on average, in the playoffs. The first step would be to show the effect explicitly and then, if verified, speculate as to why the anomaly is occurring.
A couple issues:
1) There is no healthy "ace" on the Orioles. The starters are all pretty similarly bad.
2) Even assuming there was an ace a full 1.0 RA/9 better than the next best pitcher, the math behind your theory doesn't work. Assume the following:
By fully punting the game (starting AAA scrubs), you have a 10% chance of winning game 1 (of these 2 theoretical games).
With your ace on the mound and normal batters vs. Sabathia, you have a 40% chance of winning game 1.
If you play Oakland with your ace and an extra-rested team, you have 50% chance of winning game 2.
With a tired team and your #2 pitcher vs. Oakland, you have a 40% chance of winning (8% delta due to SP downgrade of 1.00 RA9, and 2% due to less rest for team).
So, punting game 1 means odds of advancing to ALDS = 0.1 + 0.9*0.5 = 55%.
Starting ace in game 1 means odds of advancing = 0.4 + 0.6*0.4 = 64%.
Not even close, and that is assuming you have a true ace. You can even make odds of beating Sabathia 30% (with your tired ace and team), and it still is not advantageous to punt a game.
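For anyone who wants to plug in their own numbers, here is a minimal sketch of the same arithmetic. The function name is made up for illustration, and the probabilities are just the assumed figures from the comment above, not measured values.

```python
def advance_odds(p_game1, p_game2):
    """Odds of advancing using the formula above:
    win game 1, or lose game 1 and then win game 2."""
    return p_game1 + (1 - p_game1) * p_game2

# Assumed win probabilities taken from the scenario above.
punt_game1 = advance_odds(0.10, 0.50)  # AAA scrubs in game 1, rested ace in game 2
start_ace = advance_odds(0.40, 0.40)   # ace in game 1, #2 starter and tired team in game 2
tired_ace = advance_odds(0.30, 0.40)   # same, but only 30% vs. Sabathia

print(f"punt game 1: {punt_game1:.0%}")
print(f"start ace:   {start_ace:.0%}")
print(f"tired ace:   {tired_ace:.0%}")
```

Under these assumptions, punting only comes out ahead if the game 1 win probability with the ace drops to 25% or lower.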
Game theory certainly applies when deciding whom to start in a particular game of a sequence, as Marchi has already pointed out (http://www.baseballprospectus.com/article.php?articleid=18263), but frankly that has always been true for postseason situations. The concept of not sending your best hitters in a given game would not make sense in any realistic situation in my opinion.
I think you are measuring the wrong thing here. People don't necessarily think you need an elite pitcher to outperform in the post-season; they think you need a group of top 3 starting pitchers that is significantly better than your overall starting pitching for the season.
No answers from BP? Do you guys not even read the comments?
I am no fan of the current iteration of PECOTA myself, but sbnirish77 is making no sense. Let's assume BP decides that it is going to assume that 30% of all players were using PEDs over the past 20 years. How would this change PECOTA? Should they shut down any effort to project players simply because not all variables are known?
I don't agree about it being reasonable at all. You're an elite athlete (in theory). You're getting paid tens of millions of dollars to be in peak physical and mental form. Would it kill you to lay off beer for six months a year? You have from age 40 till death to drink as much beer as you please. I happen to love beer, and drink it regularly. I would, however, like to think I have the shred of self-control needed to not drink it less than 24 hours before I was about to be paid ~$50,000 to perform athletically for three hours as well as I possibly could.
Name another 19-year-old with a .840 OPS at any position in the Carolina League... It's only happened once in the last 10 years (Wil Myers 2010). Either Bogaerts is much older than his listed age or he is an underrated prospect.
How exactly would a person not in your league use this info.? Presumably he already knows what the hometown bias is in his own draft, so glancing at what that bias looked like in some random Phillies draft is not really going to help.
Shhhh. Be quiet. You're not allowed to mention that the site is falling apart when BP's leaders are too busy picking internet fights.
You should know better than to comment on things of which you have no knowledge. I read the entire article, incl. the 75% of it devoted to strained historical references. I elucidated the point the article made quite clearly and refuted it. If you have something to add, go for it. If not, you're better served lurking.
For people who find it painful to watch BP aimlessly try to debug a black box system for 16 months, I humbly suggest you google "clay davenport Postseason Odds". His methodology is not perfect IMO, but at least it is stable and logical.
What specific shortcomings in Colin's work which I pointed out are "ignoring facts and context"? I asked very specific, important questions, and he ignored all of them. Meanwhile, he took to Twitter to mock the entire exercise.
"If age is not a factor, then should we just tell 14-year olds to toss 110 pitches per game, with the hope that they have resilient enough ligaments/muscles to withstand such abuse? "
This is completely misreading my comment. My comment says nothing about changing practices. It merely makes the observation that you have two possible explanatory variables, and your article is assuming that just one (age) is sufficient. Pitchers may be getting hurt in clusters around a certain age because:
1) There is something unique to that age that drives injury.
2) The majority of pitchers of that age have just recently received sufficient lifetime workload to cause body parts to break down for the first time.
My point is that there is nothing terribly special about the age; it's the cumulative arm use that matters.
Once a player has shown that he can survive long-term use without injury (by whatever exact measure used and by whatever age this may occur), it's reasonable to guess that he has technique or physiology that is better than average.
The only way to show that age alone is a significant factor would be to measure injury rates by age after controlling for lifetime workloads. I actually think that would be an interesting study that could yield useful findings.
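As a rough illustration of what "controlling for lifetime workload" could look like, here is a minimal sketch that groups pitchers into workload buckets and then compares injury rates by age inside each bucket. The function name, the 500-inning bucket width, and the sample records are all made up for illustration; a real study would need actual injury and workload data and probably a proper regression.

```python
from collections import defaultdict

def injury_rate_by_age_within_workload(records, bucket_size=500):
    """records: iterable of (age, lifetime_innings, injured) tuples.
    Returns {workload_bucket_start: {age: injury_rate}}."""
    # counts[bucket][age] = [number injured, number of pitchers]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for age, innings, injured in records:
        bucket = int(innings // bucket_size) * bucket_size
        cell = counts[bucket][age]
        cell[0] += 1 if injured else 0
        cell[1] += 1
    return {
        bucket: {age: inj / n for age, (inj, n) in ages.items()}
        for bucket, ages in counts.items()
    }

# Entirely hypothetical records: (age, lifetime innings, injured that season).
sample = [
    (21, 300, True), (21, 350, False), (25, 320, False), (25, 400, True),
    (21, 900, True), (25, 950, True), (25, 980, False),
]
rates = injury_rate_by_age_within_workload(sample)
print(rates)
```

If age matters on its own, the rates should still differ by age within a single workload bucket; if cumulative arm use is the driver, they should differ mainly across buckets.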
Is there anywhere on the site to find sortable rankings of catchers for these three metrics?
One likely explanation for this effect is individual game environmental effects. That is, higher than normal temp. and/or wind blowing out (relative to park avgs.) will lead to longer innings for both teams and poorer pitcher performance.
If only someone would try to explain PECOTA as lucidly as James did Win Shares, there might be a chance people could understand, critique and try to improve it.
Any idea what the park factor was like for Koufax?
How much has Shelby Miller's velocity dropped this season?
Totally agree with your complaint about the need to catch strike three. Similarly (in my mind), why are runners apparently allowed to skip touching a base as long as the opponent doesn't "appeal" by throwing to the skipped base after the fact? I think a pretty simple rule change would be if a runner touches a base out of order (third after missing second, e.g.), he would be called out immediately.
I think it's also worth noting that all four of Scherzer's opponents in May (ChW twice, Pit and Oak) have ranked in the top 10 in team K%, whereas only one April opponent did.
It should probably be renamed to reflect it is a representation of the team's true strength, and is not the win pct. 'expected' to be seen going fwd. -- which is certainly what the current name implies.
"Under such circumstances, it is a miracle that Prior pitched at all in 2004."
Um what? If PAP was so perfectly predictive of next season injury, don't you think we would have seen a massive drop in pitcher injuries over the past 10 years?
Also, the "injury nexus" is, in my opinion, very likely to be a selection effect -- not an age-related effect. That is, certain physiologies can withstand chronic abuse better than others; naturally the differentiation won't show itself until sufficient abuse has been accumulated -- typically the pitcher's early 20s. The only ones who make it past that age intact are those with unusually resilient ligaments/muscles, etc. I think this drives most, if not all of the observed effect.
So, in conclusion, extreme outliers in mid-May may (or may not) revert to something closer to mediocre the rest of the season?
Retail through insurance may be billed at $2k, but certainly high-volume with no insurance involved is a lot closer to $500. Either way, it's a trivial cost compared to top pitcher salaries.
Really? So if the MRI shows a torn UCL and there is no pain, you just assume the pitcher doesn't need a UCL?
Regarding #11, it's pretty clearly driven by the increase in avg. pitch speed (coupled with the lack of increase in hand-eye coordination of the batters).
If I am reading the article correctly, your "neutral park" in the last two tables is implicitly assumed to have MLB-average temperature (for a given time/date), right?
If so, it's pretty surprising to see the Rangers with the coldest "gun". Their avg. game temp. is 10-12 deg. higher than rest of MLB, which looks like it should be worth an additional ~0.25 mph.
By the way, what would in theory cause a systemic miscalibration of the system at a given park? I don't know the exact technology, but I would think measuring velocity would be far less prone to calibration issues than measuring location. But maybe not.
Really impressive stuff. Can you post the park factors you have calculated?
One comment: time of day and temp. should have a pretty sizeable correlation. Is this accounted for? Also, in some parks (e.g., Fenway), temp. is correlated with wind direction. To assess the true impact of temp., I would think you would have to evaluate that.
And among relievers with 10+ saves since 2010, a 3.15 xFIP ranks only 17th best. I guess I am not clear on the conclusion of the article. Is Walden a quality closer and improperly used, or is he just a decent reliever in your view?
It's not the blown saves. It's not the small sample. It's the 24 BB in 50 IP in the minors in 2010, the 26 BB in 60 IP last season, and the 4 BB in 4.1 IP this season. Unless you are sporting a 15 K/9 like Kimbrel, that's probably not going to cut it as a closer on a contending team.
Why don't teams perform routine monthly MRIs on all quality pitchers in the org.? It takes 30 min., is non-invasive and costs all of $500. Why wait until the pitcher complains of pain or shows decreased performance?
And what exactly makes you think that a whopping 62 career IP vs. left-handed batters is predictive of future split performance? Even if you ignore the research and assume his past splits are predictive, his xFIP (generally considered a better predictor than FIP for small samples) vs. LHB is 3.93, and it's 2.46 vs. RHB. Seems like the Giants might actually know a bit more about platoon splits than you do..
Because pitchers are throwing a ton harder than they used to. None of these workhorses from the 60s would even make it out of triple-A today. When you max out the human anatomy, parts will break. Technique, increased muscle size and increased frame size have all resulted in greater forces on body parts that one cannot always strengthen and develop equally to handle the added load.
TAv-against is not adjusted for league (presence of DH). So you cannot fairly compare pitchers from different leagues using this metric. E.g., pitching in Florida (no DH) will make Johnson's numbers appear better than his true skill.
Couldn't disagree more. As pointed out by numerous Amazon reviews, editorial quality went severely downhill under Goldman. Hopefully this will be addition by subtraction.
Good riddance, Goldman. The editorial quality of the BP annual went severely downhill under your leadership. The number of articles needlessly mentioning Yankees and/or political views skyrocketed. Subscribership plummeted. Addition by subtraction I suppose.
One goal of sabermetrics is to understand basic statistical concepts, like selection bias. I'm stunned by this article.
http://adblockplus.org/en/ can work wonders in taming aggressive ads.
Where is Upside on the player cards? It's the day before Opening Day, and no sign of it thus far.
Is the projected record based on actual schedule, or a generic AL schedule?
Does this use actual team schedules or generic schedules?
This is an incredibly long-winded article to say absolutely nothing. For anyone skipping to the comments, I can summarize the whole article as follows:
Matusz had a bad year last year, partly due to injury. He has really good peripheral stats this spring training, which apparently have no predictive value. Also, he got a couple "lucky" double plays and hit 2 batters. Also, apparently BABIP and HBP rates are predictive from one outing to the next.
So the author criticizes Matusz for BABIP, extra double plays and hitting 2 batters, while dismissing the exceptional K9 and BB9 rates he has had this spring? First of all, Mike Podhorzer recently published research showing that spring training K and BB rates are surprisingly correlated to regular season rate stats. So odds are Matusz's spring contains at least some information.
More generally, the author has zero handle on what is and is not predictive with regard to pitching. You don't need to manually compensate for "lucky" double plays if you are looking at the correct peripheral stats to begin with. I could go on, but not worth anyone's time. Truly shocked this article made it on to BP.
ETA for 10-year projections?
I don't understand how you consider the fact that some teams have a bad player at a given position a "major inefficiency at work." Talent is never evenly distributed among all teams, nor would one expect it to be given the long-term nature of most contracts. The only way to show it is a true inefficiency is to prove that there are free agents or bench players teams are willing to give away who can hit much better than the bad LFers you profile, while not giving up the gains through worse defense -- unclear given that you only measured the defense of players who actually played a full season in LF. Without naming these replacement LFers, projecting their actual defense, and showing that they would cost nothing to acquire/promote, I'm not seeing any obvious action teams should take to close the supposed inefficiency.
Also, I think any comparison of LF vs. RF defense needs to consider the number of fielding chances each position sees.
Apparently not available yet. No one at BP seems willing to commit to an ETA. (Last year, they were delayed until after the season had started).
Thanks. Would also be good if BP could add the option to input keeper (or otherwise unavailable) players in the main interface, independent of other settings. E.g., provide a link to "edit available player pool."
How do you mark players as taken? I do not see any option to do so (I have read the Detailed Help section, and there is no mention of it).
roughly, are we talking days or weeks away?
ETA on the cards?
A cap on sentence length -- consider it... I would suggest some number under 98 (words):
"I leave it to those who have studied the schedule more closely than I to argue that the lack of off-days in the modern calendar requires that teams carry more than ten pitchers lest they fatigue the lot, but unless confronted with incontrovertible evidence that the last man cannot be stashed at Triple-A until the appropriate moment and then used in a taxiing rotation with a few other similarly semi-talented unworthies, I refuse to believe that the baker's dozen pitching staff is anything more than a security blanket for wet managers, or a wet blanket for insecure managers. "
This article is so irrelevant to baseball. It just strikes me as comment fly paper. Honestly, Goldman, there have to be better ways to spend your baseball-directed time with opening day less than two months away.
Correction: I was looking at an old card for Kershaw. (The cards are apparently not updated yet).
Can you write up what methodology is used to generate peripheral stat projections? E.g., R, RBI, SB, W, L.
Did you take a look at very old and young players specifically? I would think that if a 39 year old player's stats are falling off a cliff over last 3 years, it would make sense to use a more aggressive aging curve than the curve used on an average 39 year old.
Also, why not let the season sequence info. contribute to the comparable player-finding? Right now, you generate a single number (baseline forecast) using certain weights, and then try to find historical players with similar stats and other attributes. [Correct me if this is inaccurate]. I would suggest looking for historical players who have experienced season sequences similar to the player in question (say, .250,.290,.230) over the last x years, and see how they performed. x would certainly need to be greater than 3. It may not even be the sequencing that provides extra info.; it may be the y2y volatility or lack thereof. And clearly the criteria for "similar" season sequences (a.k.a., "career paths") would need to be pretty loose to create a sufficient sample and not overfit.
Is this type of approach (using more than 3 past seasons and using loosely-bucketed career paths to find comps) under consideration for future versions of PECOTA?
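To make the idea concrete, here is a minimal sketch of loosely-bucketed career-path matching. To be clear, this is not how PECOTA works; the function names, the 15-point tolerance, and the player data are all made up for illustration. Each year-to-year change is reduced to "up", "flat" or "down", and comps are the historical players whose sequence of buckets matches.

```python
def path_signature(seasons, tolerance=0.015):
    """Reduce a list of season rates (e.g., TAv) to a tuple of loose
    year-to-year buckets: 'up', 'flat' or 'down'."""
    signature = []
    for prev, cur in zip(seasons, seasons[1:]):
        delta = cur - prev
        if delta > tolerance:
            signature.append("up")
        elif delta < -tolerance:
            signature.append("down")
        else:
            signature.append("flat")
    return tuple(signature)

def find_comps(target, history):
    """Return names of historical players whose career path has the
    same signature as the target's (same number of seasons required)."""
    target_sig = path_signature(target)
    return [
        name for name, seasons in history.items()
        if len(seasons) == len(target) and path_signature(seasons) == target_sig
    ]

# Hypothetical four-season sequences (more than 3, per the comment above).
target = [0.250, 0.290, 0.230, 0.260]
history = {
    "Player A": [0.240, 0.285, 0.235, 0.270],
    "Player B": [0.300, 0.290, 0.280, 0.270],
}
comps = find_comps(target, history)
print(path_signature(target), comps)
```

The loose buckets are the point: a tighter definition of "similar" shrinks the comp pool quickly and invites overfitting.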
I think Boras should use PECOTA aging curves in his next negotiations. Old guys get older, but their performance apparently stays about the same. Torii Hunter will apparently be a league-average hitter well into his mid-40s. But it won't even be a market inefficiency for teams to exploit, as the entire league will be filled with 45-year-old .260 TAv hitters by 2018.
The ten-year forecasts look poorly smoothed at best, poorly conceived at worst. E.g., Kershaw's 10-year eqERA goes:
So he is still getting worse right now (age 23), but when he turns 30, he'll finally turn the corner and start to improve... I think the erratic K rate projections are mostly to blame here.
Regarding number 6, wouldn't an ideal projection system try to determine customized past season weights based on age/experience to handle the outliers (very old and young)? It appears PECOTA does not. That is, one would want to look at what the optimal weights are for a 22 year old vs. a 40 year old. There is almost no chance they will be the same as for an average-aged player (recent perf. will in fact have more value, as will the rate of change), so why force them to be static in the projection system? Yes, I know that an aging curve is applied after raw rate stats are projected. But this assumes that all players age identically, and that there is zero information contained in the rate of change of past stats as to how this particular player is developing/aging. Fine for a mid-career guy, but not so good for players in the middle of a steep incline/decline. Basically, you want to take more seriously very recent changes in performance when players are of an age when very large, real changes are likely to occur.
Speaking of which, was the order of past seasons considered? That is, if a player had TAvs of .250, .280, .310 in his last three seasons, his next season projection would look roughly the same as someone who had gone .320, .280, .250 -- assuming I am reading the explanation correctly. I would think that for a player of any age, a clear trend (and you probably need more than 3 years to establish a trend) would merit some level of inclusion in the predictive model.
I'm pretty sure most people know AFL stats are going to be a tiny number of PA. The only reason to mention PAs is if the number is less than expected for the relevant league. And if so, it's just as easy to write, "in 130 PA" as it is "". I.e., not sure what this article is trying to establish that couldn't have been stated in a single sentence.
Here's a look at one season:
Given the amount of money they have spent in the last two drafts, I'd say they shouldn't be all that happy..
Will your formula be used for save projections in next year's PECOTA? Do you have a similar study on predicting pitcher wins and losses?
Is there any reason why BP does not provide park factors? Always been a mystery to me..
Would enjoy seeing a study of whether hitters tend to improve vs. a particular pitcher the more they have seen him. This effect does exist on a team level within season, but I have not seen a player-level study. Would be difficult to do a full-career experiment due to survivorship bias, but a 1-2 year study of # of PA vs. TAv would be interesting.
61 comments later...is there anyone here who would put Arthur Rhodes into a close World Series game -- for any reason? If so, would love to hear why. About half the people here (incl. Goldman) defend LaRussa, but no one has made even a shred of case that he is a major league caliber pitcher, much less a guy you would look to in a close post-season game.
This is not Monday morning QBing. In the game in question, Rhodes' performance was actually pretty neutral -- i.e., the odds of winning were roughly the same when he entered as when he left. It's more a question of expected results going forward.
I think it's even more remarkable that:
1) Wyers admits he didn't even watch the game he wrote about so critically. (Would anyone read a scathing movie review from a critic who merely scanned a plot summary in the middle of the night?)
2) In a subsequent attempt to cover for his error (i.e., to claim his point would still have validity in the game sequence that actually occurred), he proceeded to make a horrendously misleading case about whether the next inning would lead off with a pitcher or a leadoff hitter. Why so horrendous? His rationale is that, "The average team scores .60 runs in an inning where their leadoff hitter leads off the inning; when the pitcher leads off the inning that drops to .49 runs." This apparently offsets the benefits gained in the current inning. The problem is that the next inning is not the last inning. It seems to me that the goal in the 4th inning is to minimize runs scored for the rest of the game, not for the immediate next two innings. I.e., if you are going to claim that run-prevention in inning N+1 is negatively affected by a certain strategy in inning N, at least take the time to measure its cumulative impact on innings N+2, N+3...9.
Where did you show that past high-leverage performance is predictive of future high-leverage performance? Especially specific outcomes of high-leverage PAs. A certain pitcher induces more flyballs in high-leverage situations over a single season (tiny, tiny sample), and he should be expected to continue to do so? I don't think so.
I'm just stunned that Rhodes is on a World Series roster right now -- regardless of the actual outcome of this game. Tony: step away from the 42-year-old, 89-mph-fastball, 5.90 FIP reliever. However bad LaRussa projected Motte to be vs. Hamilton, I fail to see how that projection could be worse than that of Rhodes pitching. Oh, and calling his heater 89 mph is being very charitable when you look at http://www.fangraphs.com/pitchfxo.aspx?playerid=1097&position=P&pitch=FA. He was in massive decline this year and was throwing about 86 mph by the end of the season -- unheard of for even the softest of LOOGY relief pitchers.
I think you'll find the answer if you look at the average horizontal angle a pitch comes in to a LHB and to a RHB. LHBs see more pitches that are moving from inside to outside than RHBs do. The umpire's strike zone is going to naturally shift to where pitchers are trying to pitch, even if that results in a shifted strike zone. By no means is this acceptable, but it's what happens.
Would be interesting to see if this age effect exists at all among college draftees.
Your assumption is that whatever beneficial effect comes from growing up older than most teammates on little league teams has totally evaporated by the time the player is 17. In fact, the month-of-year-born effect (surmised to result in superior coaching during formative years, higher self-confidence, etc.) could very well have a permanent impact and still be at work after a player is drafted.
That would be great. Ideally, for each given catcher/ump pair, you'd like to be able to estimate the net expected impact on called strikes -- looking at independent catcher impact, independent umpire impact, and also any possible catcher/umpire interaction effect.
Never mind. It's discussed below..
Curious what the source is that you use for "runs per called ball-->strike." I've seen a few numbers thrown out there in studies, from ~0.08 to 0.25 runs/changed call. Do you have a view on what is accurate?
Well said. This article is pure hindsight. The advantage play was to do exactly as the Rays did -- until MLB changes the rules.
It comes down to whether you think awards should be given based on what "should have" happened or what did actually happen. I "happen" to think actual results should determine awards, regardless of the role luck is assumed to have played in said results.
Number of Goldman articles since July 1:
Number of Goldman articles mentioning the Yankees since July 1:
Both fair and balanced.
Melky Cabrera is "a switch-hitter who is a far weaker hitter from the right side of the plate"? His career OPS from the right side is .700, vs. .740 from the left side. A spread of 40 OPS pts. for a switch hitter is not large at all.
A system is indicted because it produced two pieces of independent data rather than zero?
Maybe Maddon read this: http://www.fanduel.com/insider/2011/05/27/how-important-are-hot-streaks-for-hitters/
"Matt Wieters displayed exceptional defense while becoming more of an offensive presence than he had been in his first two seasons. "
I.e., "he shed his inexplicable 'Top 50 Bust of All-Time' label by...well, continuing not to resemble a bust in any way, shape or form." Although, to be fair, most of his recent HRs appeared to be aided by a mysteriously gusty tailwind that would seemingly crop up mid-swing.
Agree. No idea where the preference for Mickey Mouse font sizes started.
You understand that FIP is not park-neutral, right? Parks affect HR rates, K rates and BB rates; therefore, FIP should not be compared across teams. Specifically, Arizona gets unfairly penalized in such a raw comparison.
Looks like a nearly-25-year-old scraping by in single-A.
Since this is as good a place as any to ask: why not publish a top 100 prospects list every month, 12 months a year?
Yeah, it's just sheer chance that teams with a tiny pool of potential customers tend to be the cheap-asses and the organizations in cities that could easily support five teams are the big spenders. Clearly they are all getting the same amount of money thanks to revenue sharing. So it's just a question of whether you want to win or not, right?
Gimme a break. Note to baseball: institute a salary cap, or reduce the number of teams (by a lot). Not that hard. Every other sport seems to get it.
Target Field has played about as neutral as a park can be, so that's not a valid factor. And his defense at catcher is average.
Another thought is to check if there is a correlation between prospect ranking, and performance rel. to PECOTA projection. PECOTA is probably the best thing we have for performance-based projections of very inexperienced players, since it tries to use minor league performance. I.e., you'd be trying to isolate (for a certain set of players with a decent-sized minor league career) whether subjective rankings provide any predictive information not already contained within the player's stats and/or the fact that he made it to the majors.
I like the concept here, but the problem is that Marcel uses an extremely generic (regressed) projection for young players -- much more so than with 3+ year veterans. So the "projection" for most 23-year-olds, for example, is going to be nearly constant. Thus, the above data basically shows that young prospects are better than young non-prospects, not that prospects are beating any actual performance expectations. The only time Marcel truly represents an estimate of ability is after the player has played a few seasons in the majors, which is not coincidentally when the effects described above start to evaporate. This is probably shown most clearly in the K/9 data, where the prospects and non-prospects are going to receive roughly the same Marcel projection for their first couple years (regardless of minor league performance), and yet by the process of self-selection, prospects as a group will always have a much better K rate than non-prospects.
I.e., not sure that this study is actually measuring breakout likelihood as much as it is measuring raw performance. Perhaps taking a look only at player seasons that came after at least ~1,000 PA/300IP over the previous three seasons would make it more of a breakout vs. expectations measurement, though obviously you wouldn't have a lot of data from young players.
Bowden's and Hendry's situations were not at all similar. Bowden basically performed in line with his decaying payroll.
It's certainly implied in this article, but I think it needs to be stated explicitly: Hendry has had a top 10 payroll to work with for each of the past seven years, has enjoyed a well-below-average strength of schedule, and yet has a .500 record over this time span. That's fairly tough to do, even after factoring in some of the erratic behavior from ownership described above. Unrelatedly, I wonder if Dave Littlefield has kept his job.
If the A's ever did find undervalued short pitchers, they haven't lately. They have had just one SP (Gaudin) shorter than six feet since 2003.
Do you have a link to this study?
"the degree of race correlation in player comparisons was almost perfect (except that black and Latin players were, very rarely, compared to each other)"
Can you clarify what you mean?
"The question isn't whether teams will give short players a chance -- the question is how much of a chance."
Which is exactly what the author originally said. He never said, "short players never get drafted or signed." How does your point differ at all from what was presented?
"going over the cap" What the hell are you talking about? This isn't the NFL.
I read the St. Louis capsule three times. It still makes no sense. He's "disgruntled" and "streaky," so his ship has sailed? Same guy with a .300 TAv in 2010 at age 23... Who would want that?
Well, he was awesome in 2010, so not exactly a huge stretch.
The article itself doesn't mention anything about power or home runs (just hitting slumps), but then the player capsules talk mostly about HR and the author comments also suggest this is a power outage article. So I think there is a bit of a disconnect between what the article says it's about and what it is actually about. After reading everything, I assume it was intended to be "the four biggest power disappointments among lower-tier players."
If you care to read his post, he is talking about adjusting flyball and groundball rates by park, which makes no sense and is not at all what I am talking about.
So, just to clarify, if you were to make a fair over/under line on Padres pitchers for next season HRs allowed, you would take the current staff's flyball rate, and apply the NL average HR/FB rate to come up with the number? Cool. As a guy who has made a living betting on baseball for the past 10 years, I would be more than happy to take your action.
Questions that remain curiously unanswered:
1) Why were PECOTA pitcher win-loss record projections so screwed up in the pre-season?
2) What was changed in the PECOTA rest-of-season weighting scheme a few days ago, and why was it done?
3) Why is it assumed that a good test of ERA estimators is using the past 1000 IP to predict the next 200 IP?
4) Why was the park environment ignored in assessing HR rates vs. flyball rates as predictive indicators?
5) Why was schedule ignored in assessing HR rates vs. flyball rates as predictive indicators?
6) "Aren't you graphing actual HR rates vs. predicted HR rates based on flyball rates? Of course the actual HR rate is going to be wider--that's the nature of real life vs. projection, right?
Are you implying that the distribution of expected home runs based on contact rate is wider than that for fly ball rate?"
7) "A good test might be to ask whether HR/FB is more predictive of second half home runs than HR/CON?"
8) "...there is a reason that this sort of analysis can be unsatisfactory: it assumes that a fly ball that wasn’t a home run is equally predictive of future home runs as a fly ball that is a home run."
But isn't this just as true of HR/CON? In fact, couldn't you turn this argument around and say that this is why HR/FB is preferable to HR/CON? Because fly balls are more predictive of future home runs than, say, ground balls?
Yep. What we need is consensus. Not innovation.
"To the first, all I can say is this - up until the point that this article was published, SIERA was our own metric."
This is, of course, untrue. The article was published on July 25, a full week after SIERA was introduced at fangraphs.
It sounds to me like you have thrown your hands in the air and said, "FIP is the best that anyone could possibly do." How does that benefit subscribers? Unrelatedly, what metrics have you developed since your start date at BP?
I would challenge anyone to find a worse-formatted, more-difficult-to-use page than: http://www.baseballprospectus.com/sortable/index.php?cid=975549. Assume creation date = > 2002.
The only person misusing SIERA as a long-term projection system is you. How about looking at your own article, and its odd, self-congratulatory conclusions about flyball rates? No one seems to be able to replicate your results on other data. Nor would any rational baseball person ignore park effects on home runs in such an analysis.
I sincerely hope this translates into more tangible gains for the subscriber in years to come. Thus far, to be frank, you have had a few very interesting articles, but the net data analysis gain since Wyers has joined has been pretty well hidden, if it exists at all.
Will there be a full article describing the development of the algorithm used to create the 10-year projections?
I agree with the aging curve being pretty far off -- both for plate appearances and for TAv. It's both way too optimistic and also too erratic (some players will jump from .230 to .260 and back to .230 in the long term projections).
Nice to see Colin run away from responding to comments on his own article while taking the time to spew snide tweets all weekend. E.g.,
"cwyers Colin Wyers
I developed a new ERA estimator: 1*ERA+0. The r-squared is f@cking fantastic."
Oh, it will drop off the home page and will be forgotten. I agree. But will Wyers bother to answer the many valid questions raised in the comments about his approach, or will he ignore them and wait till the article disappears? Based on past history, it will be the latter.
Yet another problem is the fact that pitcher schedule strengths are correlated within season and year-to-year. So if a pitcher faces an abnormally HR-heavy set of batters in a given year, chances are the following season will also have a higher than avg. percent of power hitters. xFIP will suffer because it explicitly assumes the pitcher will be facing a league-average schedule.
Estimators like xFIP try to show what HR rate a pitcher would likely yield IF he was magically placed in a random/neutral park against a random/neutral schedule. Wyers tests what happens when the player is NOT in this random/neutral environment, but rather in one that is highly correlated to his previous (known) environment. There is no reason one should try to use raw xFIP or SIERA to try to predict a season in a partially-known environment. If the goal of these metrics is to predict next season accurately (which Wyers seems to be assuming), the correct method is to estimate the biases of each pitcher's next-season environment and make the necessary adjustments to xFIP/SIERA formulas or to the stats being measured.
I'll ask Lindbergh and Wyers again: what was changed in the PECOTA rest-of-season weighting scheme a few days ago, and why was it done?
As some feedback for next season, I don't think the current presentation of UPSIDE is as useful as it has been in years past. I don't think most people care what the exact upside prediction for 2017 is (down to a decimal point); they would like an approximation of the player's total upside over the next 5 years, or 10 years. Upside used to be listed this way (as a total). Now it's either listed for 2011 only (in PFM and the PECOTA spreadsheet), or as a row of individual year numbers on the PECOTA cards.
Also, it was mentioned several times in the pre-season, but the order of tables in the PECOTA cards seems odd. The injury and contract info. tables, which are not of primary interest to most users and take up a ton of room, are listed above projected playing time, forecast percentiles, and now the 10-year forecast.
Something tells me that if xFIP and SIERA were adjusted properly to reflect the pitcher's expected park environment (which is already built into ERA and FIP to varying degrees), the advantage would shift a little more to xFIP and SIERA. xFIP and SIERA were not designed as projection systems; if they were, they would certainly contain a simple park adjustment to reflect the actual environment (avg. HR/FB) of the player in the year being projected.
So if someone decides on his own to use them as projection systems (as Colin does), he needs to make the adjustments to make it a fair test. Obviously, not every pitcher remains on the same team year-to-year; but most of them do. Same thing with fielding: there is going to be a reasonably high correlation of team defense year-to-year, which will help ERA as a predictor and hurt the fielding-independent estimators.
The short version is that if something is designed to neutralize park and defense (as FIP, xFIP and SIERA are to various degrees), it's not right to use it as a raw predictor of something (actual ERA) that is affected heavily by park and defense. Either make a new metric that is designed to be a predictor of raw stats, or create a new test.
More critically, it appears you fail to control for park in your comparison of flyball rate vs. HR/contact rate in predicting future home run rate. Clearly, you would not use a league-average HR/FB rate to predict the second half of a pitcher in an extreme park. No one thinks that HR/FB rates regress to MLB-average rates. They regress to park-average rates (the net HR/FB park factor of a pitcher's schedule).
So no one would use a generic HR/FB rate or raw xFIP to predict second-half ERA. You'd obviously under-project HR rate and ERA of every Rockie pitcher and over-project every Padre by doing so. A stat like xFIP is only useful when compared to itself (across pitchers on same team, or across years of same pitcher); it's not well-designed to directly predict actual ERA and I don't think this is a particularly new revelation.
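To make the park point concrete, here is a rough sketch in Python. Everything numeric in it is an assumption for illustration only: the fly-ball total, the league HR/FB rate, and the two park factors are made up, and `expected_hr` is a hypothetical helper, not part of any published xFIP formula.

```python
def expected_hr(fly_balls, league_hr_per_fb, park_factor=1.0):
    """Expected home runs allowed: fly balls times a park-scaled HR/FB rate."""
    return fly_balls * league_hr_per_fb * park_factor


LEAGUE_HR_PER_FB = 0.10  # assumed league-average HR/FB rate
FLY_BALLS = 200          # assumed fly balls allowed by the pitcher

# Generic (park-neutral) expectation, which is what a raw estimator assumes.
neutral = expected_hr(FLY_BALLS, LEAGUE_HR_PER_FB)

# Same pitcher, same fly balls, in an assumed pitcher-friendly park
# and an assumed hitter-friendly park.
pitcher_park = expected_hr(FLY_BALLS, LEAGUE_HR_PER_FB, park_factor=0.85)
hitter_park = expected_hr(FLY_BALLS, LEAGUE_HR_PER_FB, park_factor=1.15)

print(neutral, pitcher_park, hitter_park)
```

With these invented inputs, the neutral figure sits between the two park-adjusted figures, so using it as a raw prediction would miss high for one pitcher and low for the other before any question of skill comes up.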
Forgive me if I glossed over the three-quarters of the article devoted to dated politics, but it seems your sole point is that managers should pull closers as soon as they blow a few saves.
It's hard to think of a policy that would fly more in the face of modern statistical analysis. What you are suggesting is that managers make personnel decisions based largely upon whether four or five save opportunities were converted or not. That's not just recency bias; it's also using arguably the game's most flawed statistic (save pct.) to gauge that recent performance. It's a binary stat that has incredible variance independent of player skill. If a manager were to follow your advice, he'd be endlessly chasing the "hot hand" rather than playing the players who gave him the best chance of winning.
If a guy's peripheral stats are awful for a couple months, obviously he should lose playing time. But you are not saying that. You're saying managers should kick guys out of high-leverage roles as soon as they blow a few saves: "most of them should be about as secure as the average NFL placekicker—you shank a couple, you’re out of a job." Bizarre to see that on this site.
Can you summarize what changes have been made? Not trying to be snide; I just don't have a handle on what has been added/improved from the data reporting/analysis side.
Do any of the year-to-year tests here adjust for park? If not, some of the error observed is not actual error, but a park bias. I.e., for a given park (in cases where pitcher stays with same team for both years), a pitcher should be expected to have an actual ERA that is higher or lower than his estimator ERA. (E.g., a Padres pitcher will always be expected to have an ERA lower than his xFIP [Petco depresses HR/FB], and xFIP should not be penalized for this since it was not designed to be park-dependent). This will vary by estimator, as some assume league-avg. HR/FB rate, etc.
Who is using ERA estimators on a full season to predict the next season? That is what projections algorithms (which use more than one year) are for. ERA estimators are useful for very small samples (50-100 IP), and should only be judged on that basis.
Why would simplicity matter very much for an estimator? Are people jotting down FIP calcs. on a napkin, or looking up the final numbers on a web site?
At least Swartz is 100% transparent in what he is doing, and is very responsive to reader input. Contrast that with BP, which has apparently changed (improved?) PECOTA, but has published nothing to indicate what has changed or why. We were all expecting a massive research piece on what steps were taken to improve the algorithm this past off-season. Instead we got crickets and a late product.
Now we get an incredibly time-consuming rebuttal to someone else's work on a different site. This comes on the heels of an even more off-target critique of Fangraphs' rest of season weighting algorithm. Exactly what weighting are you using again for PECOTA RoS, Colin? Oh wait: you never answered that repeated question, instead spending a whole article doing forensic analysis to uncover and slam someone else's algorithm.
I have no allegiance to Fangraphs or BP or any other site for that matter. But the trend here at BP is undeniably disturbing. More effort is being put toward saying someone else is wrong than it is toward creating and optimizing your own models and tools -- at least based on what is being published, which is all that really matters. The PECOTA pitcher win-loss projections were unbelievably screwed up, beyond any shred of credibility; the 10-year projections never came. But instead of reading articles about what happened and how it will be improved for next year, we're reading about how some ex-employee should stop trying to improve a once-prized algorithm for estimating true pitcher skill over small samples? Color me saddened as well.
Settle down, Beavis. You made an absurd comment/accusation, and got called out for it.
I would agree. But a single season clearly is not going to prove a whole lot.
One more unsolved pre-season mystery: why did the projected team win-loss records differ between the Postseason Odds and Depth Charts reports?
Both should have been using the same data (projections and schedule), yet there were differences in the numbers.
Not sure what you mean. What park environments are rest-of-season forecasts assuming?
"After the season, I will take a look at how each measure performed, and that may go further toward resolving the disagreement."
Wouldn't you want to backtest this over, say, 10 years of data instead of just 2011?
If we are bringing up unresolved PECOTA/projections issues, here are a few [all of these issues were raised in the comments, but not answered]:
1) What mix of projection vs. current season does the in-season Postseason Odds report use, and how was this weighting algorithm tested? Why is this system not transparent, and why not show both the projection team strength and current-season-only team strength, so people can see the actual team vs. expectations?
2) This page, http://www.baseballprospectus.com/odds/ is, to use one of your words, wrong. Look at the expected win pcts. Oakland = .641; Colorado = .627.
3) Why were pitcher win projections massively inflated in pre-season PECOTAs? E.g., all of the Mariners pitchers were projected with a W-L record >= .500, yet the team was projected with 71 wins. Many other teams looked as bad -- e.g., Twins and Mets. Overall, there was a huge problem with PECOTA pitcher W-L projections, an issue which was first raised in early March. To my knowledge, it has never been addressed.
Please change "Expected Win Pct" to a name that is more accurate. It is not the expected winning percentage of the team; it's the true, schedule-neutral strength of the team.
Is it safe to say RoS PECOTA is using park factors only from 2010-2011? If so, that does not seem wise.
Either way, we need to know what exactly the new weighting scheme is and why it was chosen.
I appreciate the update. But I am still left to wonder: what is the current status (and future) of the missing 10-year player projections?
Well, the question is how much cash the Jays paid the Dodgers to take Rivera.
Well, if a team knows that going with a KRod/Axford sequence in the 8th/9th is going to save $14 million over using an Axford/KRod sequence, I think it's pretty obvious what that team would choose to do.
Here's an idea: if you're a player, don't agree to huge incentives based on counting stats.
I'm sure if the Cards want to pay $10 million for half a season of a good reliever, they can find other options...
Look up the actual research on the topic. It's been done. http://www.hardballtimes.com/main/fantasy/article/do-hitters-decline-after-the-home-run-derby/. There is no need to use anecdotes to speculate.
No one harps on an obvious point more than Sutcliffe. I'm including the ghost of Joe Morgan. I have yet to finish watching one of his games.
"With the power play, the home team outscored the visitors 30-25, or by 20 percent"
So something happened during one season 30 times instead of 27.5 times, and that's conclusive?
If you genuinely don't understand why this is a negligible sample, I can't really help you.
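For anyone who wants the sample-size point spelled out, a quick sketch in Python follows. It rests on a simplifying assumption that is mine, not the article's: each of the 55 goals is treated as an independent coin flip between home and visiting team. `tail_prob` is a hypothetical helper name.

```python
from math import comb


def tail_prob(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 30 home goals out of 55 total (30 + 25), under a 50/50 null.
p_value = tail_prob(30, 55)
print(round(p_value, 3))
```

Under that coin-flip assumption, a 30-25 split or better for the home side turns up by chance alone close to three times in ten, which is nowhere near rare enough to call conclusive.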
The goal of a salary cap is not to force parity; it's to make skill the primary factor for success (vs. squatting on a premiere market and/or entering the business with cash to burn).
So Chamberlain would be a much better pitcher now and/or would never have become injured had he not been "mishandled?" Please explain.
Sweet nit to pick. So the scorers are actually the ones involved in the conspiracy... Even more likely.
Where do you get the idea that Wieters has already played a 10-year career of mediocrity?
You don't seem to comprehend position scarcity, among other things.
It's not Wieters fanboys who were offended by your absurd and continuing claim that he is a bust. Rather, it's every rational baseball fan who happens to understand that:
1) In order to be a "bust," by definition, most players drafted after you need to have outperformed you. To date, Wieters has a career WARP of 3.9. Only three players drafted in the first 20 picks of 2007 have achieved a higher WARP than Wieters since the draft. If you want to look at just the college guys in the top 20, Wieters ranks 2nd out of 8, behind only David Price (#1 overall pick). I.e., exactly where you would expect him to be, if not a little higher.
2) Calling anyone a "bust" at age 24 is risky, unless he has already experienced massive failure. But calling someone an all-time bust, when that someone is still in the very beginning of what looks like it will be a long career...well, that's just desperate page view grubbing.
You're basically projecting your own misguided expectations (that apparently a top 10 draft pick should be an All-Star immediately) onto the rest of the world. The original article you wrote was an embarrassment to the site, but not nearly as much as this follow-up. I mean, if umpires are conspiring to keep Wieters' passed ball numbers down, who knows what else they are doing to help him along and try to make you look bad? I'm sure you'll think of something by the time you do the next follow-up.
Wrong answer. It's not the pitcher's special ability to control HR allowed that causes this discrepancy; it's the park. This is why there is a large diff. between xFIPs of Giants pitchers and ERAs of Giants pitchers historically. xFIP assumes a normal HR/FB rate but the SF park has a very low HR/FB rate. So much of Cain's "beating" his xFIP is very much expected and contained within the limitations of the xFIP formula.
I'd rather see the hot vs. the hot. Just like I prefer the World Series champion to be whichever team was hottest during a given season and a given playoffs -- not whichever team has the highest lifetime achievement.
You understand that most Giants pitchers have a better xFIP than ERA historically, right? Look up the formula for xFIP and tell me why this occurs.
I don't think it's an accident that the year Lee became great, his velocity jumped up by 1.5 mph. Similarly, Santana was one of the harder throwers (93 mph) in his great years. I.e., studs rarely emerge without better than average velocity -- regardless of what their main out pitch is.
I like the premise of this article, and hope to see more lists like it (mid-season views on who has the highest upside in the minors).
I agree. This is an angry, angry young man -- I mean, article. For very little impact. I personally would prefer to see the 10-year PECOTA projections arrive sooner than a random rant against a fairly innocuous opponent. Four months and counting.
So, you are performing the following calc., right: (ExpYear2-ExpYear1)-(ControlYear2-ControlYear1)? And the result is construed as the change in skill for year 2?
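Written out as code, the calculation I am asking about would look something like the sketch below. The function name and the four strikeout rates are invented for illustration; they are not taken from the article's data.

```python
def diff_in_diff(exp_y1, exp_y2, ctrl_y1, ctrl_y2):
    """(ExpYear2 - ExpYear1) - (ControlYear2 - ControlYear1)."""
    return (exp_y2 - exp_y1) - (ctrl_y2 - ctrl_y1)


# Made-up strikeout rates (K per PA) for the two groups in each year.
change = diff_in_diff(exp_y1=0.170, exp_y2=0.185, ctrl_y1=0.172, ctrl_y2=0.177)
print(round(change, 3))
```

Here the experimental group rises by 0.015 and the control group by 0.005, so the quantity being read as a skill change is the 0.010 left over after the control group's movement is subtracted.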
I think I get it now...
My only remaining concern would be if the avg. age of the exp. group was for some reason trending one way over a decade or so, it could lead to erroneous conclusions about true player ability changes (which would be better explained by changes in how aggressively/cautiously players are brought up from the minors). No idea if this is the case, but just mentioning as a potential factor.
And any other team would have taken one look at him and said, "Yep, this looks like a textbook 9-month concussion. Don't even let this guy think about coming back until next year"?
Don't think so. Unfortunately, that's not the way concussions work.
The problem I have is that you are measuring the diff. between how given players perform vs. past opponents and comparing it to how they perform vs. brand new opponents, during a given season. From that comparison, you are concluding that strikeout skill is going up for pitchers each and every year.
For example, looking at the year 2 comparison of control vs. experimental for pitchers... I would argue that the experimental bin for these pitchers will contain an abnormal pct. of rookie batters or other young batters whose playing time has increased from year 1 to year 2. A batter with increasing playing time from year 1 to year 2 will be more likely to have his plate appearances qualify for any given pitcher's experimental bin. Younger batters tend to strike out more. The net result is that the experimental bin for a pitcher in year 2 will contain more plate appearances vs. high-K batters than the control bin will for that same pitcher. So the effect of pitchers' improving K skill every year may just be a measurement of the bias between the control and exp. groups.
A similar and additive bias should also be present in the year 1 control vs. exp. comparison. The exp. group will have an abnormal number of players with declining playing time (older players), which will depress observed K skills for pitchers (older batters are harder to strike out). Since you are comparing year 2 to year 1 to assess pitcher skill change, this will further inflate the increase in K skill.
Overall, comparing the experimental year 1 results with the experimental year 2 results requires the assumption that the two experimental groups of opponents (batters) are roughly equal in skills. I'm not sure they are.
Does this make sense?
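A toy version of the bias I am describing, in Python. All of the numbers are assumptions chosen only to show the mechanism: the strikeout rates for young and veteran batters, and the share of young batters in each bin, are invented, and `bin_k_rate` is a hypothetical helper.

```python
def bin_k_rate(share_young, k_young=0.22, k_vet=0.16):
    """Observed K rate for a bin, given the share of PA against young batters."""
    return share_young * k_young + (1 - share_young) * k_vet


# The same pitcher with the same skill faces two bins of opponents.
control = bin_k_rate(share_young=0.20)       # repeat matchups from last year
experimental = bin_k_rate(share_young=0.40)  # new matchups, heavier in young batters

gap = experimental - control
print(round(control, 3), round(experimental, 3), round(gap, 3))
```

The pitcher's ability is identical in both bins by construction, yet the experimental bin shows a higher strikeout rate purely because of who is in it. That gap is the part that could be mistaken for improving skill.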
"Dickered endlessly?" To my knowledge, Morneau has had more time off due to a concussion than any other baseball player. Long-term brain damage due to a concussion is nearly impossible to assess. You cannot see the damage done and therefore cannot see if/when it has healed. So the Twins' approach, in my opinion, is what any other team would have done. But you are saying that most teams would have helped Morneau's brain heal better?
Thanks. But wouldn't the experimental group (matchups between guys who did not see each other last year) be over-represented by young players? It seems like young and/or injured-part-of-last-season players would be selected for abnormally in the exp. bin, as they would often have far fewer matchups last season vs. this season. I.e., they would generate a disproportionate pct. of the "new" matchups this season by virtue of their limited playing time last season.
I'm not sure I am interpreting your methodology correctly, so the above comments could be off. But maybe you could write a follow-on article to break down your exact process and also what each graph represents in more detail.
This article is much more pro-DePodesta than it is anti-Dodgers. Remind me again what DePodesta has accomplished after being liberated from the Dodgers?
It would be nice to know how exactly PECOTA projects wins, as it had every single Seattle pitcher with a >= .500 record this pre-season, and yet the team was projected to win 71 games. I still have not heard an explanation for this.
"Or I’m sure PECOTA does something similar to arrive at a pitcher’s projected win total, so when looking at a pitcher’s projection, don’t just look at the ERA and call it a day."
To follow up further: in order for SIERA to be systematically inflating numbers, something fundamental would have had to have changed in how runs are scored this season. That is, even with a lower-offense environment, the peripheral stats SIERA uses should be reduced proportionately, thus leaving the algorithm to work as is, without adjustment. Unless I am totally misunderstanding how SIERA is calculated.
Not sure I follow the ~7% discount applied to Stauffer's ERA. Is it correct to assume that SIERA is overestimating by 7% this season?
So, essentially, your "experimental" group is comprised of baseball's youngest players, correct? And these youngest pitchers have been improving their strikeout rates at an average of ~1.5% per year -- so ~15% aggregate over the past 10 years? Just making sure I am interpreting correctly.
Meanwhile, the veterans have been striking out fewer and walking more batters. So the conclusion is that rookie pitchers over the past couple of years have driven the overall decline in run scoring?
Beg to differ regarding Gorkys Hernandez. The guy is almost 24, and hitting a DT-translated .226 EQA this season. That doesn't bode well for his future, regardless of his CF defense.
Davenport projects him with a peak OPS of .630; Oliver projects a peak OPS of .597. Both of these factor in his current-season performance.
Is there evidence for this? That is, do infielders have strong year-to-year correlations for pct. of BIP converted to flyouts credited to the infielder?
How does a 3.50 SIERA for Stauffer equate to a projected 55% team win pct. in games he starts? SD starting pitchers have had a 3.68 ERA YTD (only slightly worse than Stauffer), and the Padres' 3rd-order win pct. stands at .406. A 0.18 improvement in SP ERA is worth roughly 3% in expected win pct., so the end result should be in the 44%-45% range -- nowhere near 55%.
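The arithmetic behind that range, as a short Python sketch. The only inputs are the figures quoted above; the conversion of 0.18 runs of ERA into about 3 points of win pct. is my own rule of thumb, not a published constant.

```python
team_win_pct = 0.406    # Padres' 3rd-order win pct.
staff_era = 3.68        # SD starting pitchers' ERA, year to date
stauffer_siera = 3.50   # Stauffer's SIERA

# Rule of thumb from the comment: 0.18 runs of ERA ~ 0.03 of win pct.
win_pct_per_run = 0.03 / 0.18

era_gain = staff_era - stauffer_siera
projected = team_win_pct + era_gain * win_pct_per_run
print(round(projected, 3))
```

That lands at roughly .436, or about 44%, which is more than ten points short of the 55% figure in the article.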
That's if you assume that whatever phenomenon causing past data to lose its predictive value over time is generated solely by a ticking clock. The alternative hypothesis is that a player is more likely to implement/undergo fundamental changes in the off-season (new delivery/stance, fitness change, nagging injury healing, etc.), and so the existence of the off-season would cause more information value to evaporate from the past performance than would the passage of the same amount of time within a continuous season.
If the theory is true, April's performance would be more predictive of same-season October's performance than October's performance would be predictive of the following April's.
Hence, the question.
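To show what the two hypotheses would look like in a weighting scheme, here is a minimal Python sketch. The functional form, the daily decay rate, the off-season penalty, and the 180-day gaps are all assumptions of mine for illustration; none of this is the actual weighting used by PECOTA or any other system.

```python
def weight(days_ago, offseasons_crossed=0, daily_decay=0.999, offseason_penalty=1.0):
    """Weight on a past performance: time decay times an optional off-season penalty."""
    return (daily_decay ** days_ago) * (offseason_penalty ** offseasons_crossed)


GAP_DAYS = 180  # assumed gap for both April -> October and October -> April

# Clock-only hypothesis: the off-season itself costs nothing extra.
same_season = weight(GAP_DAYS, offseasons_crossed=0)
across_clock_only = weight(GAP_DAYS, offseasons_crossed=1)

# Alternative hypothesis: crossing an off-season destroys extra information.
across_with_penalty = weight(GAP_DAYS, offseasons_crossed=1, offseason_penalty=0.8)

print(same_season, across_clock_only, across_with_penalty)
```

If only the clock matters, the April-to-October and October-to-April weights come out identical. If the off-season matters, the second one is smaller, and that difference is what a proper test would be looking for.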
Also, can you post a link to your study?
If the optimal weighting scheme is that well known and accepted, I'm not totally sure why none of the three in-season projection systems (ZIPs, Oliver and PECOTA) are using it. Or is PECOTA now using that weighting scheme?
Does "daysAgo" refer to calendar days, or days of a season?
I find it hard to take a prospect seriously when he (Iglesias) has yet to hit a single HR in 450 career pro plate appearances. The guy has just *one* extra base hit in 150+ ABs this season. Just 3 out of 5 in SB att., so it's not like he's a legendary speed burner, either.
I don't think anyone is questioning how the theory of weighting by PA/IP works, or how one would implement it. The question is what are the optimal weights to use to provide the most accurate in-season forecasts? It's an interesting problem and one I would like to hear from Colin on.
I also am more interested in what PECOTA is doing *right* than what Fangraphs is doing wrong. I.e., what weighting per stat is in-season PECOTA using and more importantly, what was the method used to arrive at these weightings?
I hear you. Obviously, a high-quality rest-of-season projection is what we are all looking for. But until we are there, people continue to generate lists of xFIP-to-ERA or SIERA-to-ERA differences mid-season for fantasy purposes. My argument is that these are not good tools to use to evaluate what a pitcher "should" have done thus far this season. We agree that xFIP and SIERA are not great raw predictors of ROS stats, but this presently is not stopping people from using them exactly as such over and over.
I'm all in favor of quality ROS projections, but I believe there is still a need for a single-season ERA predictor metric -- a SIERA-type stat that assumes the player's environment is not changing. This would satisfy those of us who want to get a simple look at how a pitcher is truly performing in the context of his current team/stadium, without the baggage of an opaquely-weighted multi-season projection.
You should take a snapshot of ROS projections from Fangraphs and from PECOTA right now, and then evaluate at the end of the season. A single year won't prove anything conclusively, but if the difference in accuracy is huge, it will certainly suggest which method is superior.
I don't follow why MRIs are not done immediately for any pitcher who is showing any kind of arm issue, and then even routinely (every couple months) for pitchers without known issues. It's not like the $500 per MRI is going to put a huge dent in a team's finances.
It wasn't just Baltimore who roundly mocked your selection of Wieters; it was a wide range of rational baseball analysts who were stunned someone would mistake a solid career start with a historic bust. In reality, his 2011 is on pace to be worth 4.5 wins in his age 25 season. Not exactly busty...
I would agree. An "About Us" page with all current writers, their titles, bios and dates of hire would seem quite useful.
My impression is that BP is currently operating a merry-go-round. Every week, there is a brand new name or three on the front page. I'm all for discovering new talent, but at this point the site feels much more like an anthology of independent blog posts than a coherent brand.
It's interesting that page views are a criterion for retention. I am all for this, but would hope that the concept would be taken a step further and a reader rating system would be implemented. Something where readers could rate each article 1-5 stars. I feel like BP is never shy to judge others, and so should be very open to accepting transparent feedback from its customers.
The main problem with using SIERA (or xFIP) for fantasy purposes is that ballparks have a large influence on HR/FB, and these ERA estimators totally ignore it. E.g., as far as I know, SIERA/xFIP use the same expected HR/FB for Padres and Rockies pitchers. Ditto for defense. Assuming an MLB-average HR/FB rate and MLB-average defense is appropriate for evaluating true skill of pitchers in between seasons. It's not, however, a good practice for predicting second-half fantasy stats, which I believe more people care about.
In other words, please fill this void and create a metric that assumes the pitcher will remain in his current park/defense environment for the rest of the season.
Was research done as to the optimal weight to give to in-season stats vs. prior-season stats [that is, what weighting system has best predicted ROS stats historically]? Or was it assumed that current year stats are exactly as informative as those from three years ago (for veterans)? I.e., is there evidence that Fangraphs' weighting system is not optimal, and yours is?
Also, I think there is a typo in Headley's numbers above. He has outperformed projections thus far, but his ROS is lower than his pre-season projection.
I think you're right that this is a bad example to use (the catcher in this case was not interfering with the runner), but that doesn't take away from the general idea Davenport is expressing, which is that catchers should not be able to plant themselves in front of the runner.
Well said. The current (lack of) rule is absurd, even without considering the injury increase factor. I've always wondered why putting your leg down right in front of the plate is any different than just sitting indian-style on top of the plate? To take it to a more ridiculous level, why not call in the first baseman and have both him and the catcher sit down on top of home plate, each vaguely trying to field the ball thrown from the outfield?
I'm glad someone finally called out the shortcomings of the current rule, and hope MLB is reading this.
What happened to the 10-year PECOTA projections this year? Happening or not?
We're quite a ways off from the trade deadline, and odds are the Pirates will not be close enough to .500 at that time to make it a difficult decision. The decision will most likely be whether to try to get ~75 wins or 70 wins.
It's pretty unusual to hear that a guy (Cole) has increasing control problems when he cuts his walk rate in half (from 3.8 BB/9 in 2010 to 1.8 BB/9 in 2011). I understand that you are talking about recent trend, but I also understand that stats from a few starts tend not to be very predictive... Ditto for his BAA. A bump from .205 to .231 is not exactly telling for a 92 IP sample.
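A rough sketch of why a 26-point BAA bump over 92 IP is weak evidence. It treats each at-bat as an independent trial and estimates the at-bat count from innings pitched; both are simplifying assumptions, and the function names are only illustrative. The .205 figure is treated as a fixed baseline, which if anything understates the uncertainty.

```python
import math

def estimated_at_bats(innings, baa):
    # Assumption: every out is recorded on an at-bat (ignores double plays,
    # sacrifices and outs on the bases), so outs = AB * (1 - BAA).
    return 3 * innings / (1 - baa)

def baa_standard_error(baa, at_bats):
    # Binomial standard error of a batting average over `at_bats` trials.
    return math.sqrt(baa * (1 - baa) / at_bats)

ab = estimated_at_bats(92, 0.231)
se = baa_standard_error(0.231, ab)
z = (0.231 - 0.205) / se  # size of the bump in standard errors

print(f"~{ab:.0f} AB, standard error {se:.3f}, change = {z:.1f} standard errors")
```

Under these assumptions the change is only a little more than one standard error, which is well within ordinary sampling noise.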
What does PECOTA say about his 10-year forecast? That would seem to be the most direct way to assess his future.
Would love to see the old True Average report available again. It was the best way to see at a glance how each team's hitters were performing. The new format has the same information but is much more difficult to browse at a glance.
Again, we need to know how actual stats vs. projected stats are being used to generate the expected win pct. As of now (May 4), none of us has any idea how much projections are being used.
I agree that a Transaction Analysis column, if it exists, should be broad and comprehensive. I disagree that the column was ever, or will ever be, great. It's pretty bland stuff for the most part, and when you de-flower Kahrl's prose, you're left with formulaic complaining about mostly insignificant events.
I consider Lindbergh to be a significant upgrade over Kahrl in terms of overall insightfulness and ability, but...I'm not sure this column is the best use of his talents. Would love to see more in-depth work from Lindbergh on a wide variety of topics, and TA left to someone else, or even abandoned if necessary.
Yeah, this is getting (has gotten) bizarre. Make a big splash with a new stat that is supposed to be the best pitcher evaluation metric, and then not have it working for the current season (already in May)?
Would be nice for BP to take a step back from the editorial fire hydrant and address the ongoing data issues -- from still-missing 10-year projections to missing current-season stats. Something like, "This is a black swan event. A, B and C happened. X, Y and Z are what we have done to prevent similar issues going forward." Alternatively..."This is just the way it's going to be. We'll update stats/projections whenever we can, but don't count on any predictable schedule -- now or in the future."
I think people need to have an idea which one of these two is the current policy at BP. If this was a free blog, I'd stay quiet and enjoy my erratic updates. But a pay service? At least explain what customers should be expecting. That's all we ask for.
You might want to look more closely...
In 2007, (the "89" mph season), he pitched 21 innings, so the sample is too small. Also, not surprisingly, his ERA that season was terrible at 5.48.
So his avg. FB velocity in full seasons was:
Hardly a lot of variance. Yes, within that context, 1 mph is "significant," as I stated. Never said the difference was "massive."
Again, read up on the subject at http://www.hardballtimes.com/main/article/lose-a-tick-gain-a-tick/
By all means, do a study with a sample size of two....or look at the work of people who have analyzed the entire league.
Shockingly, Wright hit 12 of his 29 HRs at home in 2010. Just slightly less than half his HRs erased by Citi Field..
This is pretty obvious... If it is agreed a guy would hit 30 HR in a neutral environment, and someone claims that due to his actual environment he will hit 15 HR, it means his home park (where half the games are played) is being assumed to allow zero HR. Not that tough to understand.
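The arithmetic, as a minimal sketch. It assumes half the games are on the road and that road games play like a neutral park; the function name is hypothetical.

```python
def implied_home_hr(neutral_hr, projected_hr):
    # Assumption: half the games are on the road, where the hitter
    # homers at his neutral-park rate.
    road_hr = neutral_hr / 2
    # Whatever is left of the projection has to come from home games.
    return projected_hr - road_hr

print(implied_home_hr(30, 15))  # prints 0.0

# Wright's actual 2010 split from the comment above: 12 of 29 HR at home.
home_share = 12 / 29
print(f"home share of HR: {home_share:.0%}")
```

A 15 HR projection leaves nothing for the home park, while the actual home share in 2010 was a bit over 40%.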
Will the 10-year PECOTA card projections ever be released this season?
Glad you noted this. I feel like it's one of the better kept secrets that most pitchers lose velocity starting pretty early in their careers.
Sort of off-topic, but do you know of a study on the aging curve of GB/FB rates for both hitters and pitchers?
Can we at least have an indication of what the current PECOTA weighting is?
As a control, you would need to look at how hitters perform after missing ~8 games due to something other than a concussion. I have a feeling that taking time off is generally detrimental to hitting, and might very well explain the 6 points of TAv difference.
I'd prefer to see more analysis in this type of column. I think if one is going to offer a column with betting picks at BP, it should try to find a quantitative edge instead of saying a certain pitcher is "due" for a win, or that a certain team is abnormally poor on the road, based on the last 50 games (negligible sample). Personally, I have profited from betting against this kind of mentality. Most trend-type observations are pure noise in baseball -- whether it's a pitcher doing really well for 2 years in day games, or a team going winless in its last 10 games vs. a certain opponent.
Not wasted, but certainly a poor investment, even for a team that is trying to overpay for talent.
Any clarification of this?
This report needs the team's actual W-L pct., as well as the current weighting of PECOTA vs. actual that is being used for the expected win pct. Overall, a smaller, more professional/readable font, and more data (like last year's report) would be helpful. Thanks.
Sorry to rain on the lovefest, but I have always found Kahrl's writing to be labored, convoluted and oddly condescending. Her columns read as if they had been written by Warren Buffett chastising beginning traders for the smallest of supposed trading errors. The most obvious problem is that Kahrl has never had the credibility to cop such an attitude. What teams has she managed? What groundbreaking research has she contributed? It's fine to answer "none" to both questions, as long as one keeps the snideness in check. Instead, it is always an over-the-top, "what have these idiot kids [GMs] done now?" mentality that is tough to tolerate, much less enjoy.
The more fundamental issue is that the subject matter (largely insignificant transactions) does not warrant such a high level of attention/resources, whether in the form of run-on sentences or not. I happen to like Lindbergh's work thus far, but feel that his talents would be better used in generating new analysis. Let's allow TA to die a peaceful death, and focus resources on innovative work.
For the same reason that Nate Silver decided to put his work on BP 10 years ago, rather than simply joining every hi-stakes fantasy league he could find.
A) If you know what you are doing, you know that Accuscore is not very good.
B) The point is to go through the exercise of what goes into predicting the outcome of a game. That is what people would find interesting.
Interesting column. Would love to see some of the BP staff try to develop a simple model to predict the odds of a team winning a given game. I think it would be an entertaining project to document as enhancements and improvements were made to the model.
Would love to have the option to turn on/off PECOTA adjustment. It would give the user more flexibility to decide if he or she wants to use only current-season stats or not. Thanks.
Please consider adding back to the Adjusted Standings report each team's AEQR scored and AEQR allowed (3rd order). It's the only way to see the schedule-and-luck-adjusted offensive and defensive strength of teams. This info. has always existed in this report. Please put it back in, or make a different, more detailed report avail. with it.
Please put back in this report 3rd order runs scored and runs allowed. It is very useful for measuring a team's adjusted run creating vs. run preventing ability at a glance. I would recommend including this data in the current report. Or, you could make a separate, more detailed text report for subscribers seeking this data that is no longer shown.
Are the PECOTA 10-year forecasts coming out this year, or have they been cancelled?
Where are the 10-year projections?
When are 10 year player cards due?
Add a col. for the 10-year upside, and put it on the card as well.
Why on earth would you restrict a minor league player's comps. (and thus his Upside calc.) to major league players? That defeats the point of using comps. for prospects.
I fear the 10-year projections for prospects (probably the single most valuable output of this whole process) will be ruined by this policy.
That's just a poor calculation of variance in a projection. No player should have an 80% chance to finish within 3 HR of his projection...
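To illustrate how wide the spread should be, here is a sketch that assumes a hitter whose true talent is exactly 25 HR (a made-up figure) and models his season total as a Poisson count. Both are assumptions, and the Poisson model ignores uncertainty about talent and playing time, which would only widen the range.

```python
import math

def poisson_pmf(k, mean):
    # Probability of exactly k events when the expected count is `mean`.
    return math.exp(-mean) * mean**k / math.factorial(k)

def prob_within(mean, margin):
    # Probability that the count lands within `margin` of the mean, inclusive.
    low = max(0, math.ceil(mean - margin))
    high = math.floor(mean + margin)
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))

p = prob_within(25, 3)
print(f"P(22 <= HR <= 28) = {p:.2f}")
```

Even with the true rate known perfectly, this hypothetical hitter lands within 3 HR only about half the time. The figure depends on the size of the projection: for a hitter projected for only a handful of HR, a 3 HR window covers most of the distribution.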
Pretty interesting. Looking forward to more of this during the season. One observation is that run scoring is actually positively correlated with temp., so the reduced air resistance appears to benefit hitters even more than the added velocity benefits pitchers.
How was the population chosen for the temp. analysis? You only used guys who played the whole season I assume?
Projected Playing Time (with projected stats) should be moved to the very top of the cards. Right now, if you want to know how many ABs BP is projecting for a given player, you have to scroll down 2 pages.
As stated by another user, Upside should be listed as a 5-year or 10-year total value -- both in the PECOTA cards and in the PFM spreadsheet.
Did I miss an update on the 10-year projections? Maybe for the All-Star break? Seriously, I don't understand how BP management (assuming there is such a thing) can remain silent for a month and a half while a core product is MIA. I've never quite seen a company (fail to) work like this..
Speaking of Lindor, what would you say the odds are he stays at SS long-term?
Can you also address the inflation of wins projections for pitchers? E.g., why every single Mariners pitcher is projected to have a .500 record or better when the team is projected to be 70-92. Obviously, at this point, there is no way one can trust your pitcher win-loss projections. But I would like to hear why this issue has not been fixed or even so much as acknowledged by BP in the five weeks since it has been pointed out.
Better late than never. But I think we could all use a coherent explanation of what happened over the last four weeks. I am pretty shocked, given the magnitude of the issues last season, that the apparent bugs preventing a timely production of long-term projections were not addressed before spring training.
The spreadsheet should show upside ratings...
Yes, injury, compensation and fantasy sections should be at the bottom. 2011 forecasts should be up much higher.
It would make sense to test any defensive metric to see how predictive it has been historically. I suspect the predictive value of FRAA would be quite low, but I would certainly be open to reading an analysis of it.
Does anyone at BP care to address the pitcher win issue? E.g., take a look at the collective W-L record of the Mariners' pitchers, and compare it to the team's projected W-L record. Not even remotely close (all the pitchers are projected to be above .500 whereas the team is projected to be 70-92). Same with many other teams. I.e., there is a major problem with the pitcher W-L projection algorithm. This problem was pointed out more than three weeks ago. I have yet to see a response from BP.
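The check being asked for is simple to sketch. The pitcher lines below are made-up illustrative numbers, not BP's projections; only the 70-92 team record comes from the comment. The function name and the rounding tolerance are assumptions as well.

```python
def staff_matches_team(pitcher_records, team_wins, team_losses, tolerance=2):
    # Every team win and loss is credited to exactly one pitcher, so the
    # staff totals should equal the team record, up to rounding.
    wins = sum(w for w, _ in pitcher_records)
    losses = sum(l for _, l in pitcher_records)
    consistent = (abs(wins - team_wins) <= tolerance
                  and abs(losses - team_losses) <= tolerance)
    return (wins, losses), consistent

# Hypothetical staff in which every pitcher is projected at .500 or better.
staff = [(14, 11), (13, 11), (12, 10), (11, 10), (10, 10),
         (6, 5), (6, 5), (6, 4), (5, 4), (5, 4)]

totals, ok = staff_matches_team(staff, team_wins=70, team_losses=92)
print(totals, ok)  # prints (88, 74) False
```

If every pitcher is at or above .500, the staff total has to be at or above .500 too, so it can never add up to a 70-92 team.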
My top concern is his velocity drop last season: 91.1 mph avg. fastball, which is a career low.
Your scoring system should penalize Phils' starting pitchers, if anything (no wins). But I have to say this seems like a pretty random, borderline useless BP article (trying to show that one draft has a home team bias)...
I don't really see why anyone would have suspected that quality start pct. did not correlate well with avg. runs scored, or with ERA, or with team win pct. I.e., I don't see why this analysis was performed or what it really proves.
The question is, what additional value (beyond RA or ERA) does the quality start metric provide? I would say not much. It shows how well-spaced runs allowed have been for the pitcher. The problem is that this is not likely to be a skill -- i.e., past run spacing is not predictive of future run spacing. And if you're in the market for a stat that is not predictive of future skill, you might as well use W-L record.
Did the Marlins steal tens of millions of dollars from Willis when they paid him only $200k-$400k in his first three seasons?
Draft picks are gambles; contracts are gambles; life is a gamble. The Tigers gambled; they lost. Get over it.
I would pose this on a Wyers article, but since he has not published anything since Feb 24, I'll ask it here:
Where are the player card updates?
Just curious: where do you get this data for spring training games? Is it publicly available?
Unfortunately, there is no "90% PECOTA" this year. BP (apparently) has stopped updating player cards for 2011. So all the percentiles and upside numbers are still from pre-season 2010. Even the bio info. is old. Nice work...
I certainly wouldn't trust a 1-year trend to be predictive. If you look at his numbers from 2008-2010 (http://sports.espn.go.com/mlb/players/splits?playerId=5641&type=pitching3&three=1), he was actually slightly better on no rest (lower OPS allowed) than on one or two days rest.
These are valid points. I just get the feeling that the fantasy stats forecast algorithms are half-baked at best. No one seems to be in charge of the full process. And certainly the process of projecting stats, esp. counting stats, is a complex one. The main issue I have is the dead silence from BP whenever a discrepancy/issue is pointed out. E.g., the bizarre win-loss totals for pitchers. The tiniest bit of responsiveness and/or transparency would go a long way to improve BP's eroding credibility as a top source for fantasy projections.
It's hard to be bullish on a guy whose avg. fastball is 88 mph at best, and who *might* have regained some semblance of control. Would be interested to hear how hard he is throwing with his new delivery.
Also, FWIW, his downward spiral did not begin after he was traded. His xFIP in his last two seasons with the Marlins was 4.48 and 4.74.
I find it odd that every pitcher is green for odds of missing 30+ days, whereas less than half of the hitters are. This doesn't jibe well with subjective experience, which is that pitchers have more major injuries than hitters. Is there something unusual about this team, or are pitcher injuries less common than people typically think?
Random question for Corey: Do you think that teams should schedule routine (every 6 mos. or so) MRIs for pitchers? It seems like MRIs can often show partial damage to a shoulder or elbow. They're non-invasive and pretty inexpensive relative to payroll. I've always wondered why MRIs aren't used more in baseball for proactive screening, instead of just to investigate symptoms.
A salary cap is not even remotely similar to tenure. Tenure is an agreement educators make with their employers to protect their jobs/benefits indefinitely without regard to job skill/performance.
A salary cap is an agreement between organizations to limit spending in order to make a sport measure skill instead of wealth.
Think of it in terms of a fantasy league. Tenure would be guaranteeing each owner the right to continue to participate in the league no matter how much his/her effort or skill level dropped. A salary cap would be limiting how much each fantasy owner spent in the draft each season. They are totally different concepts.
Plenty of franchises in salary-capped leagues have gone under, as they should. Putting a cap in place does not mean granting each city with a current team the right to have a team indefinitely.
Well, it's pretty obvious that the players would be against a cap on their total income pool. But that's not really the point (employees don't usually get everything they want). The question is what is in the best interests of the fans and the best interests of the long-term health of the sport. Right now, the players are clearly winning. The majority of fans and the long-term interest level in the sport are losing. This can, should and will change eventually.
As for "parity," the goal is not to create parity. *If* it's an equal playing field, and a single team dominates for a long period of time, that is an amazing and highly respected accomplishment. If, on the other hand, a baseball team dips into its owner's pockets as much as it wants and uses its geographical fortune to its advantage year after year...well, then you have no idea how much skill vs. situational advantage was involved, and the accomplishment is diminished.
Baseball is the only sport in America where a GM who has not even sniffed a championship series in his 13-year career is considered a genius. I'm not even saying Beane is or is not a great GM. I'm saying that the current structure of baseball makes it difficult to know how good he or any other GM is; and any time a sport is failing to measure/reward skill accurately, it's a bad thing...
No one is arguing that a salary cap would make any specific team profitable. People in favor of a cap generally argue simply that the sport would become much more interesting once you start separating baseball skill from the geographical monopoly/duopoly a team happens to have been given.
Either reduce baseball to ~8 remaining teams, or put in salary cap. No other solution creates a reasonable sport.
Way to find the outliers... The relatively high variance in baseball does not make it a fair sport. If I wanted to root for giant corporations to put up better numbers year after year over smaller firms, I would join a stock market fantasy league and wear an IBM jersey.
Sports is about creating an even playing field and measuring athletic and strategic skill. The only way to do this -- and most other sports understand this fact -- is to strip away money from the equation as much as possible. I can't imagine anything less satisfying than being a "fervent" Red Sox or Yankees fan. If you lose, you did so while having 4x the salary of many other teams; if you win, you did what a team with a massively unfair advantage should have done. To me, that's not what sports is about. It's about creating a structure that measures true skill.
Regarding Francisco: Last I checked (http://www.baseballprospectus.com/unfiltered/?p=669, among other studies), looking at 180 AB in a single season to assess apparent L/R split trends is not useful. He has a career .806/.762 L/R OPS split (i.e., normal), and has seen his fair share of RHPs over his career (782 AB vs. RHP, 311 AB vs. LHP).
Ideally, the projections would have a column for "expected PA/IP missed due to injury." That way, people could figure out how much of the playing time forecast is based on injury expectations, vs. other causes of less than full playing time (platooning, position competition and other benching).
The projected TAv's in the PFM do not match the player card projections for many (most?) players. No idea what is going on.
There has been a significant and inexplicable lack of response to major issues raised by BP users over the past three weeks. E.g.,
Where are the player card updates (incl. 10-year forecasts, percentile forecasts and upside projections)?
Why are the projected records of teams different between Depth Charts and Post-season Odds pages? (Are they using different schedule assumptions; if so, how/why?). No, Colin's article (http://www.baseballprospectus.com/article.php?articleid=13040) does not answer these questions.
Why are some of the projected team records from PS Odds Report different from the "Expected Win Pct" on the same report? E.g., Cards expected win pct. is .522 and expected wins is 85.8; Brewers expected win pct. is higher (.524) but expected wins is lower (85.2).
Why are the projected win-loss records of individual pitchers so far off from what they should be? Wins are being massively inflated, as has been pointed out by several users.
How often and how aggressively are the Depth Charts being updated? (Jason Castro, e.g., is still being projected with 400 PA, ~48 hours after a season-ending injury).
None of these questions is new; rather, they have been raised repeatedly over the past several weeks by a wide range of users. I have yet to hear a definitive answer from anyone at BP -- either via customer service email, or via comments.
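The Cards/Brewers discrepancy can be shown with a quick sketch. It assumes that expected win pct. is meant to apply to a full 162-game season, which may not be how the report actually derives its expected wins.

```python
GAMES = 162  # assumption: a full regular-season schedule

# (expected win pct., expected wins) as quoted from the PS Odds Report
reported = {"Cardinals": (0.522, 85.8), "Brewers": (0.524, 85.2)}

# Wins implied by the win pct. alone, under the assumption above.
implied = {team: pct * GAMES for team, (pct, _) in reported.items()}

for team, (pct, wins) in reported.items():
    gap = wins - implied[team]
    print(f"{team}: implied {implied[team]:.1f} wins, reported {wins}, gap {gap:+.1f}")
```

The win pct. alone ranks the Brewers ahead, while the reported expected wins rank the Cards ahead, so the two columns cannot both come from the same simple calculation.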
I would be interested to know what exactly the W-L projection algorithm is. Clearly, it's not even close to working properly (e.g., LA Angels). Commenters have pointed this out on individual team pages since as far back as Feb 21, but no acknowledgement or fix from BP as of yet.
Colin only says, "Due to unbalanced schedules, we come up with win-loss records that diverge a bit from the numbers on the depth charts." But this does not make much sense, since both systems should be using actual schedules as far as I know. If not, what exactly do they each use for schedules?
There is going to be some rounding error, esp. with the middle relievers. But yes, there is obviously a problem beyond rounding going on. Not just here, but with most teams. No acknowledgement from BP yet.
This says it was updated at 2:24pm on Mar 4. Castro was diagnosed with a torn ACL at 8:00am on Mar 4, yet is still showing with 400 projected PA. Certainly I am not saying that playing time adjustments need to be made intraday, but I am saying that when you do make an update, every effort should be made to include the latest injury info. as of that time...
"Jason Kubel needs to be something more than the guy he's looked like outside of the abandoned Metrodome, more like the hitter he was in 2009; a career .257/.330/.449 performance on the road suggests that, highlights aside, there's only adequacy on tap."
What does this mean? His career home/road splits are well within the range of normal: .819 OPS at home, and .779 on the road.
Would be nice to have a single page that lists which teams' depth charts were adjusted on a given day. Right now, the only way to see which teams have been updated recently is to go to every team's page and look at the time stamp.
Do you really think Wieters is a top 50 failure, or are you just reaching for attention? (I'm not sure which would be more damaging to your credibility).
The player cards with 10-year projections as well... No ETA, no mention from anyone at BP in over two weeks. I think a status update is in order.
I don't really get the point of this article. Bullpens are important. Given that bullpens pitch almost 1/3 of all innings, that's pretty intuitive. I'm just not sure what else one was supposed to glean from this article.
Do you have data showing that a ~5-10 degree drop in temp. causes a significant change in pitch velocity?
Love the article. Looking forward to more. tbwhite raises some really good points that make the causality difficult to establish, but I would imagine that extreme cases (over several seasons) are of value in determining which pitchers likely have unusual fatigue patterns.
Can you post a note on http://pfm.baseballprospectus.com/updates.php each time you update playing time projections? Right now, it's unclear, e.g., if any updates have been made since the initial release on Feb 16.
This is incorrect. All the traditional stats projected by PECOTA are done so factoring in the player's park.
Exactly. Catcher development is highly unpredictable. It's embarrassing (for the entire BP staff) to read Wieters' name in this list.
In addition to the aforementioned Varitek, Victor Martinez and Jorge Posada were similar "busts" at age 25. Based on the "irredeemably perfect" editing of the 2011 BP annual...I find it spectacularly unlikely that Goldman will be running the show at BP three years from now.
Ah, I see. Valid point. I hope Colin can provide more detail on what is going on.
Can Colin (or anyone at BP) please reply with an explanation of how/why the depth charts use a different schedule projection than the PS Odds? Specifically, what is "unbalanced schedules" referring to?
I have asked this question here because I have received no answer from customer service (or anywhere else on the site). And yes, long-term projections are very relevant to the topic of evaluating subjective top prospect lists.
Player cards release date?
Well, you need to assume something about the schedule in order to make a runs scored/allowed projection -- if it's actually a "projection" and not a simple power rating. Given that their header (on http://www.baseballprospectus.com/fantasy/dc/) is "American League, ranked by projected 2011 record," I would have assumed it uses actual schedules as a factor. But now I have no idea.
Relatedly, are player projections using actual schedules, or are they assuming a league-average schedule? I sure hope it's the former -- or at the very least a division-average schedule. Would not be much point in creating advanced fantasy projections if you didn't take this step.
I think you are misreading the table. It says that the Brewers division odds have increased from 27.0% to 32.6%.
"Due to unbalanced schedules, we come up with win-loss records that diverge a bit from the numbers on the depth charts."
I don't follow this. What schedule assumptions are you making for the Depth Charts win-loss projections?
Please keep the "last update" date accurate.
...and the Jays have outperformed in all but one of the past two seasons (BP projected 81.2 wins in 2009 when they won 75). I hardly think the system is magically biased against the Jays.
Last Update still says "02-16," but clearly Wainwright was removed well after that date. Please keep the last update date accurate, so we know what news (esp. when it may be more subtle) has already been incorporated.
"You don't put up those kind of numbers with severe splits"
Right. The majors are littered with LHPs with very good overall numbers and huge splits. Do some research; check it out. I.e., there is a reason quality LH relievers only rarely become closers.
Thanks. What source do you use? Minorleaguesplits.com has shut down, and milb.com has only the player's most recent team, so I am in search of an alternative.
Wow. Pineda just ranked as #16 in all of MLB at BA, yet not in your top 15 for the Mariners alone..
No idea who is right, but this might be the biggest prospect opinion disparity between BP and BA that I have ever seen.
Any idea what McGee's L/R splits are like from the minors? Unless they are unusually balanced, he would seem to be a longshot to become a closer.
I think TAv would be a nice addition to the depth charts. I realize you cannot add every stat (in the interest of clarity), but since TAv is a cornerstone of BP projections, it should be listed...
Agreed. There needs to be a dated "Change List" on every team page listing what was adjusted when.
Your single-season sample of NHL goal scoring doesn't "empirically contradict" anything... I can't believe I just read your conclusion that "In terms of raw goals, 75 percent of HFA (15 goals out of 20) comes in even-strength situations. That empirically contradicts the book’s assertion that it’s all penalty calls."
So a sample of 20 goals in a single season is conclusive? It shows nothing.
ETA for player cards?
I agree. To me, the article is basically saying, "here is what his career would look like IF he stays really healthy and really good... Guess what? Under those assumptions, his value would be really high." I don't see the value of this.
Nor was Alfonso Soriano...
Please post any important updates to your products in an appropriate, consistent, easy-to-find location.
Buried within the comments within a blog post is not the proper location for such information.
How about putting important product release information where it belongs -- on the home page and on the Fantasy page?
Why is every announcement relegated to BP Blogs (beta)? Announcements are not blog content, and for that matter, how is a blog in "beta?"
What "cliff" are they falling off of?
Stanton's OPS is projected to decline by 13 pts; Heyward by 49.
Heyward had an extremely lucky year by any accepted metric of luck. .335 BABIP (top 20 in MLB) and 16.8% HR/FB (top 20 in MLB). Both are likely to regress.
Wholeheartedly agree. There needs to be an easy way to look at each player's offensive value.
"Discussions" implies that there was some back-and-forth. If it was truly unanimous, what exactly were the discussions about?
Perhaps the pros and cons of publishing non-baseball commentary?
Pros: Mildly amusing half of your subscribers in a forgettable way.
Cons: Irritating/pissing off the other half in a memorable way.
Well said. As far as I can tell (and I am about as liberal as one can get), this is BP's cry for help.
Or...just keep non-baseball articles off the site -- and yes, politics aside, this is a non-baseball article. Saves countless subscribers time, and presumably saves some resources for BP.
Oh, my fault. Sorry for not loving articles *about* gay porn. Or not thinking it's an awesome use of limited resources by BP...
I don't know you and certainly haven't called you anything. I am pointing out the long-standing pattern of missed deadlines/promises/indications/randomly-generated-dates that has plagued BP for years now. I would love to think they have acknowledged and fixed the core problem. I have yet to see an acknowledgement, and if this particular instance is any indication, the issues remain to be fixed.
"Winning" would include: acknowledging the causes of issues in the past, detailing what has been done to fix them, and then delivering products on time.
So should we grant RBI to players who hit a HR just after a teammate gets picked off? After all, the hitter didn't deserve to get deprived of the RBI. All baseball stats are flawed. Not sure I follow the level of agitation with ERA. If "stat reform" is the platform, I think wins, losses and saves deserve attention before ERA.
Are the full 2011 projected standings posted anywhere yet?
I'd agree that it seems like the catcher position has an abnormally high flameout rate, but according to:
it looks fairly normal.
It's not the two days that even matters. It's the blatant spinning of this as some sort of evidence of BP's commitment to excellence that leaves a bad taste. I don't think even Microsoft would dare use "Irredeemable Perfectionists" as the title of what amounts to a bug report and release delay.
Well said. The next time I postpone something on the very day I had promised it, I am going to avoid using the words "apologize" and "sorry" -- opting instead for "irredeemable perfectionist." I might throw in a few really obscure, off-topic references just to add to the smokescreen effect.
Presumably, you've been working on this for 10 months now. To let it get down to the last day like this is just poor planning. There have been major missed deadlines for each of the past five years. To continue the [bizarre] alcoholism parallel that Goldman leads with...the first step to fixing a problem is admitting you have it. The "problem" in this case is not an individual error or technical issue; it is a poor process for planning, managing, and controlling quality.
Not sure I follow: are you saying that a one-year change in flyball rate is less predictive of a persisting change than a one-year change in HR/FB rate? I am concluding this from "his HR/FB rate was a not-terrible 10.2 percent last season" and "Prior to 2010, he had a pretty consistent fly ball heavy batted ball profile with the Marlins, and I suspect we'll see that again in 2011."
I would think a change in flyball rate is more alarming (for one season) than a change in HR/FB rate, since the latter tends to fluctuate a lot y2y regardless of the player.
"Gassed," or did he just draw the Red Sox/Yanks for three of his six Sept. starts? Looking at his games log, it would appear to be the latter. His avg. FB velocity also supports the notion he was not "gassed" in Sep., as his highest avg. velocity was actually recorded in his final outing of the year.
Single season home/road splits are not even close to being predictive. Thinking you are smarter than you are is how you lose in fantasy baseball...
Do you have aggregate data on this? Care to publish a study on it? "Crushed" means a statistically significant difference. CHONE, for example, has shown itself to be superior to Marcel.
But according to you, they (the forecasts) are all the same in avg. quality, and there is no hope of outperforming consistently. Contradicting facts aside, even if that was the case, why would you (BP) build/re-vamp a very complex projection system, when you know the work will not result in any consistent outperformance over a simple past-years' stat average system? Just for kicks?
BP really can't have it both ways. The stance needs to be one of the following:
(1) We are committed to producing the most accurate baseball projection system possible. We have worked tirelessly on possible methods of attacking this problem and believe ours is the best. We plan to charge money for these projections (alone or as part of an editorial package), and will in return provide objective evidence of our progress -- both in backtested results and in real-time (end of season) performance measurement, vs. various other methods/systems.
(2) We have looked at the issue of how best to project player stats and have concluded that there is *no* benefit to doing anything more than averaging the players' past stats (Marcel). We believe that Marcel is and will always be about as good as the very best projection system, so we have formally given up. We will no longer be charging for fantasy projections and will refer all of our past customers to a calculator.
Obviously, stance #1 is preferable, but #2 would be fine as well. But trying to have both at once is....a problem.
Marcel most certainly is not "among the best." This has been proven over and over. CHONE has crushed Marcel over the years, and PECOTA used to until 2009.
My question remains: is this (the spreadsheet in this post) some kind of stripped-down pre-PECOTA algorithm, or is it the final algorithm you plan to use for the pre-season 2011 projections?
I, too, do not understand how/why Marcel is anywhere in this article. The goal is to be accurate, not just "no worse than the worst projection system out there." If your model is not ready for release yet, then don't release it. People can get/generate Marcel projections easily enough on their own, if they need something for today.
The goal should be to improve accuracy. Period. Any smart user is well aware of variance. So no, the goal of PECOTA should not be to explain what variance is; it should be to improve forecast quality (average error).
I don't get this (below). Are you saying you are currently using some algorithm that is different from the one you will use to make final forecasts? That would make no sense.
"This is the first release of PECOTA, and as such will continue to undergo revisions through the remainder of the offseason. The program we use to generate the PECOTAs is continually evolving, and when we discover new ways to improve the forecasts, we'll make those changes and pass the updated forecasts on to you. We’ll also be updating periodically to keep up with players who switch teams."
Can you post, or point me to, a summary of what has been changed in the PECOTA algorithm for this season?
How about the top int'l prospects who have not yet been signed by an MLB team?
Why don't you create and test a "predictive FIP" then?
"2. FIP is not meant to be predictive! FIP merely represents current performance. In no way should one even think that you would regress K rates the same as HR rates. If I wanted a "predictive FIP", I would probably do something like (5*HR + 2*BB - 2*SO)/PA + constant or something."
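For anyone who wants to actually test the rough form quoted above, here is a minimal sketch. The weights (5, 2, -2) come straight from the quote; the function name is hypothetical, and the constant is left as a parameter because the quote never specifies it.

```python
def predictive_fip_sketch(hr, bb, so, pa, constant=0.0):
    """Rough 'predictive FIP' as quoted: (5*HR + 2*BB - 2*SO) / PA + constant.

    The default constant of 0.0 is a placeholder; the quote does not give
    a value, so it would have to be fit to put the result on an ERA scale.
    """
    if pa <= 0:
        raise ValueError("pa must be positive")
    return (5 * hr + 2 * bb - 2 * so) / pa + constant


# Made-up pitcher line, for illustration only.
example = predictive_fip_sketch(hr=20, bb=60, so=180, pa=800)
```

Note that the result is per plate appearance rather than per nine innings, so it is only useful for ranking pitchers until the constant (and probably a scale factor) is fit.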
I think the next step is to look at which estimator performs best for in-season projections. That is, compare the first-half ERA estimator with the second-half actual ERA (filtering out team changers and guys with not enough IP in both halves), and assess the average errors.
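The comparison described above can be sketched roughly as follows. The field names, the 40 IP cutoff, and the sample rows are all assumptions made up for illustration; none of this is real data or anyone's actual method.

```python
from statistics import mean


def mean_abs_error(pitchers, estimator_key, min_ip=40.0):
    """Average |first-half estimate - second-half ERA| over qualifying pitchers.

    Drops pitchers who changed teams or who fall short of min_ip in either half.
    """
    errors = [
        abs(p[estimator_key] - p["era_2h"])
        for p in pitchers
        if not p["changed_teams"]
        and p["ip_1h"] >= min_ip
        and p["ip_2h"] >= min_ip
    ]
    if not errors:
        raise ValueError("no pitchers pass the filters")
    return mean(errors)


# Made-up rows: first-half FIP and ERA, second-half ERA, innings in each half.
sample = [
    {"fip_1h": 3.50, "era_1h": 2.90, "era_2h": 3.70, "ip_1h": 95.0, "ip_2h": 88.0, "changed_teams": False},
    {"fip_1h": 4.20, "era_1h": 5.10, "era_2h": 4.40, "ip_1h": 80.0, "ip_2h": 75.0, "changed_teams": False},
    {"fip_1h": 3.90, "era_1h": 3.00, "era_2h": 4.80, "ip_1h": 60.0, "ip_2h": 20.0, "changed_teams": False},
    {"fip_1h": 4.00, "era_1h": 4.00, "era_2h": 3.00, "ip_1h": 90.0, "ip_2h": 90.0, "changed_teams": True},
]

fip_error = mean_abs_error(sample, "fip_1h")
era_error = mean_abs_error(sample, "era_1h")
```

Running the same function once per estimator key gives directly comparable average errors on the same filtered set of pitchers.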
I hope you can provide detail at some point about the process you plan to use to generate in-season projections. That is, how much to weight current season stats vs. pre-season projection in coming up with a rest-of-season projection.
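One common way to frame the weighting question is to treat the pre-season projection as if it were worth some fixed number of plate appearances and blend it with the current-season rate. This is only a sketch of that idea: the 400 PA prior weight is an arbitrary assumption, not a researched value, and it is not a description of how PECOTA works.

```python
def rest_of_season_rate(preseason_rate, current_rate, current_pa, prior_weight_pa=400.0):
    """Blend a pre-season projected rate with the current-season rate.

    prior_weight_pa is a hypothetical tuning knob: how many PA of current-season
    evidence it takes to count as much as the pre-season projection.
    """
    if current_pa < 0 or prior_weight_pa <= 0:
        raise ValueError("current_pa must be >= 0 and prior_weight_pa must be > 0")
    total = prior_weight_pa + current_pa
    return (preseason_rate * prior_weight_pa + current_rate * current_pa) / total


# Made-up example: projected .340 OBP, hitting .300 OBP through 400 PA.
blended = rest_of_season_rate(0.340, 0.300, 400)
```

The research question is then what value of the prior weight minimizes rest-of-season error for each stat, since stats that stabilize quickly would presumably need a smaller weight.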
Also, would love to see any work you have done to test how well the current PECOTA algorithm has performed in backtesting -- similar to what Swartz did in his article about SIERA.
I do not think proper resources are being directed to developing a valuation engine (it is not a trivial process), nor to updating playing time projections as accurately as possible. There is no quantitative way to project playing time; it requires significant expertise and time. I would think this expertise exists somewhere on the staff, but not convinced it is being applied fully at this time.
...and there is no point in using a projection system as an input unless it is expected to be one of the very best available. A state-of-the-art algorithm from five years ago, awkwardly dragged through several obsolete/inappropriate technologies while putting little effort into actually improving the algorithm, is not necessarily one of the best available inputs in 2011.
That said, I look forward to future installments of this series to see if sincere and thorough efforts to improve the system are in process.
More detail on pitchers' average fastball velocities. Less fluff (replace "Ephemera" section with useful content).
Any serious rating system already takes into account quality of opposition. Various studies in other sports have shown that trying to underweight performances against poor teams (thus emphasizing performances against good teams) reduces the accuracy of forecasted team strength. I.e., teams, on average, don't have a special skill to perform well/poorly against good/bad teams.
Exactly. Abnormal single-year home/road splits are not predictive....and thus not all that interesting.
Agreed. 41 Ks in 90 AB in the AZL is pretty stunning.
Just curious -- what is the source of your data?
"I do think it would be ideal to come up with a way to translate SIERAs into "expected ERAs" "
I agree and actually think this development is critical for fantasy purposes. Ideally, I think BP should be shooting for a constantly-updated (at least once a week) rest-of-season ERA projection for every pitcher -- based on current season metrics as well as (appropriately weighted) past season data. Something akin to what baseballprojections.com is doing this season (updated monthly), but hopefully with more frequent updates and research-driven formulas to weight past seasons vs. current season.
Why waste time/space asking BP authors if a particular game went extra innings (vs. the oh-so-common post-season doubleheader), when the info. is two mouse clicks away?
Sheer laziness. And the Ackley comment was a weak attempt to snidely imply that Ackley's line was somehow overvalued by Goldstein bc it contained runs and RBI. Poor.
1) Go to milb.com. This isn't a box score site.
2) Do you not "value" home runs?
Get a clue.
What player was being pigeonholed? It was a general comment.
Woke up on the wrong side of a secret mid-season firing?
Right...that's why run scoring is way down.
How do you know if a given series is being measured too high or too low? I can understand how you could estimate park effects from season data, but unless you are manually gunning each game, I don't see what your method would be for measuring a specific game.
First off, I don't think MLB policies should be dictated by what may or may not help a ridiculously tiny number of poor teenagers. Secondly, even if one was in favor of MLB trying to save the world, are there any numbers to indicate exactly how many low-income kids actually benefit from the current system?
I think #5 is the best point. More manual work needs to be done to adjust for this effect, as it is not small. Matt mentions two players in his rebuttal, but that's not enough to assess the impact of the annual mid-season transfer of talent from low payroll teams to high payroll teams.
This is somewhat tangential, but I would love to see a thorough, objective analysis of Billy Beane's performance over the past five years. Obviously, the end results have not been that good; but would like to see evidence for and against the thesis that he is no longer a top GM.
Perhaps because he had an incredibly short season due to a random injury? His peripherals were not far from his career levels. Odds are that had he stayed healthy, he would have started to perform more in line with expectations.
Do teams give precautionary/baseline MRIs to pitcher prospects? I.e., what are the odds Strasburg had some damage to his UCL prior to the start of the season?
I enjoyed this topic. Would be cool to create a list of veteran pitchers who have shown the most extreme home/road career SIERA splits.
Same with left/right splits.
The main assumption (used in your conclusions after the second graph) is that there is no correlation between quality of starting pitching and quality of relief pitching from team to team.
The problem is that there is a pretty strong correlation. This year, for example, it's around 0.44. Some of this is the fact that the teams that spend more and/or scout better for SPs tend to do so for RPs as well. Also, park effects play a major role. Padre starting pitchers will have low RAs on average, and so will their relief pitchers, e.g., which makes it look like there is some causality between starterRA and relieverRA that may not exist.
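To illustrate the park-effect point, here is a toy sketch with made-up numbers (not real team data). Starter and reliever skill are constructed to be uncorrelated, yet multiplying both by a shared park factor produces a strong correlation in the observed RAs.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical park-neutral RA for four teams; uncorrelated by construction.
starter_skill = [4.0, 5.0, 4.0, 5.0]
reliever_skill = [4.0, 4.0, 5.0, 5.0]
# Hypothetical park factors: two pitcher's parks, two hitter's parks.
park = [0.8, 0.8, 1.2, 1.2]

starter_ra = [s * p for s, p in zip(starter_skill, park)]
reliever_ra = [r * p for r, p in zip(reliever_skill, park)]

r_skill = pearson_r(starter_skill, reliever_skill)   # zero by construction
r_observed = pearson_r(starter_ra, reliever_ra)      # strongly positive
```

Park-adjusting both series before correlating them would be one way to see how much of the observed relationship survives.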
I loved his answers, but the obvious follow-up questions were not asked:
1) What stats do you care about?
2) What specifically are your stat-based "objectives for the long term"?
Might be nice to show some actual numbers rather than just state that they almost never work out and expect that the reader will take your word for it. I had assumed from the article's teaser line on the main page that there would be even some light analysis of deadline deal results, rather than simple assertions and cherry-picked examples.
"It would be easy— or lazy—to cite his BABIP, and note that his walk rate has remained the same. However, considering that his ISO had dropped by half this season, there was more going on here than just some unlucky guy hitting 'em where they wuz. He's been hitting more fly balls, and a few more popups, but neither problem is epic. As much (or, more appropriately, as little) faith as you can invest in his "documented" line-drive rate also suggests he's hitting the ball with less authority—not that his pancaked power didn't really sort of already tell you that. "
No, actually it wouldn't be lazy to cite his BABIP; it would just be accurate and concise. Most of his power drop can be attributed to his abnormally low HR/FB rate. And "hitting more fly balls" is generally not considered a "problem," contrary to your assertion. It in fact suggests even more strongly that his lack of power is more random chance than a sudden change in skill this season.
There is no evidence that unusual home/away or day/night splits in the past -- esp. mid-season splits -- are predictive of future splits. I.e., it's almost definitely random. Of more relevance is that his velocity has increased over the course of the season from 92-93 mph avg. fastball in April to 95-96 over his last several appearances.
Worth noting that the overall MLB K/BB ratio is on pace to be the highest in at least 10 years. Ratios for 2010 through 2000:
2.09 (partial season)
Kotchman has batted cleanup exactly zero times this season...
There is not much similarity at all between Kotchman and Smoak. Kotchman was a HS draft pick who always hit for high average in the minors, but with very limited power. Smoak showed great power at South Carolina (.757 SLG with .374 ISO in a pitcher's park his junior year). In the minors, Kotchman was much more of a groundballer than Smoak, with a 59/13/28 G/L/F rate, vs. Smoak's 45/23/32. In the majors, this discrepancy has increased, with Smoak hitting 39% groundballs, and Kotchman 57%.
I have no idea how well Smoak will turn out, but using Casey Kotchman as any kind of predictor is not useful.
Except that they are essentially the same age...not much point of a "comp" unless you know how one of the players developed already.
I think this idea (daily links to stories of interest) has some potential, but only if the stories are actually of interest. E.g., novel research pieces or unusual news stories with inside info. But "Roger Mooney of the Tampa Tribune writes that the Rays swept the Red Sox and won their fifth straight game." isn't worth noting.
Any idea of Crow's velocity? His K rate has been abysmal.
FanGraphs charges a fee for some services, but not for seeing pitch data. This appears to satisfy MLB. Could BP offer just the pitch data in its free section as well?
Is there any reason BP continues to ignore pitcher velocity data? The avg. velocity tables and charts are prob. the single most useful feature at fangraphs. I have no problem continuing to use them for this information, but it seems like it would be a good addition to player pages here. Ultimately, using pitch types and avg. speeds should allow for creating much better comparable players lists (even if the historical data is not yet deep enough to allow for full use in PECOTA).
Wow. That's quite a diatribe from someone who was consistently the dullest knife in BP's drawer of analytical thought. Your pre-season projections and "my guys" posts became increasingly sad/comical over the years. Simply put, whenever you went out on a limb, you were dead wrong an astounding percentage of the time. I don't know you or any other person in this thread and thus have no axe to grind; I just know substandard baseball analysis when I see it. Good riddance.
I cannot speak for the original poster, but to me, Kahrl's pieces are ridiculously condescending in tone. Unsubstantiated opinions are stated as obvious fact -- as if somehow she is the teacher and baseball GMs are the fifth-graders acting up. The style is long-winded and excessively showy for subject matter that would be better served by brevity. To me, the articles read more like thesaurus-fueled rants than objective commentary.
Could not agree more with your first point.
To give some context, what Swartz wrote was:
"Also, "you can win with the lowest payroll" is laughable. The correlation is high between winning percentage and payroll, and getting higher. Simply noting some counterexamples of smart teams does not change the fact that winning with a low payroll is hard. Having good talent evaluators helps too of course. That does not change the fact that having a higher draft pick helps in addition to having money and good talent evaluators. It's like you have decided that a few examples of other ways that teams have been successful changes anything about a backwards incentive structure."
I, for one, could not agree more with what he wrote. In an unrelated note, adding an "Ignore" feature for subscribers to hide comments from particular users would be a nice addition to the site.
The best short-term solution is to do a Google search on your search terms and site:baseballprospectus.com. E.g., "Ryan Howard strikeouts 2009 site:baseballprospectus.com" Not perfect (it logs you out of BP when you follow a Google search result), but infinitely better than BP search. This problem has been acknowledged by BP for the last ten years, but has never really been improved.
"Last week, ESPN's Rob Neyer picked up on my series regarding this year's dip in scoring and increase in strikeout rates."
This is misleading, as your own article shows that K rates this season were not significantly higher than last season (18.1% vs. 18.0%).
More importantly, I do not get the point of running a regression to create "a fairly accurate model to predict scoring levels." Unless I am completely misreading the data presented (possible), you are using current-season data as input to estimate current (the same) season run scoring. Where is the evidence that this model predicts future scoring better than simply using past run scoring? It seems that these underlying stats are likely to be just as noisy as the stat (runs) you are trying to predict. Optimizing four such stats over a very limited set of training data has almost no chance of working well as a predictive model going forward.
1) Scrap the layout of the entire site and start from scratch. The statistics pages in particular are not even up to early 2000s standards in terms of readability/usability. It's not a few simple tweaks away from being good; it's going to require a true overhaul from a professional designer.
2) Introduce ratings for articles. I.e., readers could rate them from 1 to 5 stars. This would help readers find the most interesting articles quickly, and more importantly, provide a way for BP to figure out what subscribers like and don't like.
3) Get rid of the blogs beta section. If ~70% of site content is contained in this section, it shouldn't be obscured by this designation.
4) Produce more original, cutting-edge research on how to predict player performance -- both for future seasons and for rest-of-season stats. E.g., more Swartz and Lindbergh; fewer five-page rants about minor transactions.
Nice article. The fact that most batters are facing the same team (Red Sox) is probably the largest potential bias of the study -- as Nate Sheetz points out.
Look forward to a follow-up.
Would you honestly write this post if you were not an employee of BP? I should hope not. (Some other Colin Wyers apparently wrote a more objective critique here: http://www.hardballtimes.com/main/article/the-death-of-superman/).
There was a time not too terribly long ago when the motto of PECOTA was more ambitious than what sounds a whole lot like: "not that much worse than the average projection system." Any impartial observer reviewing the data would quickly realize that PECOTA was a relative disaster in 2009; and yes, I am talking about rate stats, not counting stats. I am also talking about standings predictions: http://vegaswatch.net/2009/11/evaluating-april-mlb-predictions-2005.html.
The additional problem with the "not much worse than the others" argument is that the others are far simpler systems and are freely available. I.e., the systems' authors would readily admit there is not a ton of scientific rigor behind the algorithms used. These systems should be used by PECOTA as the benchmarks to beat -- rather than shooting only to beat using last year's stats.
Nate Silver used to pride himself on beating other projections. I sense none of this drive to outperform in this post or any other by a BP author with regard to PECOTA over the past year. After the pre-season debacle this season, one would think there would be an in-depth article (or ten) on the steps BP was taking to improve PECOTA. You know, cool research articles on possible ways to improve forecast accuracy -- the former bread 'n' butter of BP. One would think BP would be putting its top minds on the project *now* -- not next Jan or Feb.
As a final note, how can you praise the PECOTA depth chart playing time forecasts, when among the four systems designed to project playing time, it finished third?
Again, would be much easier and more useful if you created a version of SIERA that already reflects defense -- rather than having to manually factor it in.
If SIERA is going to be used as a fantasy tool, it should really be redesigned as defense-dependent, since the defense a pitcher has had YTD will not suddenly become league-average for the rest of the season. As it is now (according to Swartz), SIERA is designed to reflect how a pitcher would perform with MLB-average defense.
My guess is that organizations are realizing that it is more advantageous to call up a young, hard-throwing pitcher early rather than let him burn through innings in the minors. Much like NFL running backs and career rushing attempts, many (though not all) pitchers seem only to have so many pitches in them before slowing down gradually or incurring a catastrophic injury. Studies show that the avg. pitcher loses velocity the longer he pitches in the majors, regardless of age. I.e., the act of pitching is damaging to the arm in most cases. I believe teams are calling up young pitchers to capture velocity-driven performance before it starts to decline and/or major injury strikes.
Is this a recent report?
It's not like Alvarez's performance was exactly screaming Major League stud over the first two months of the season. He posted an .883 OPS through the end of May in Indianapolis, which translates to a .737 Major League equivalent OPS. Also, he had 45 K in 174 AB through May.
Through May, LaRoche had a .673 OPS. He is rated as an average defender this season, and was above average last season. Alvarez's defense (as of April) is described by Goldstein as follows: "at 6-3, 225 (and probably more than his listed weight) his range at the hot corner is a bit limited." PECOTA projected him to be -10 in defense this year, which is about as negative as PECOTA gets on a prospect's defensive ability.
So while it *might* have earned the Pirates an extra fraction of a win or so to have promoted Alvarez earlier, this is far from the most egregious example of holding down a player whose absence was clearly costing the parent team wins.
Why don't you look up the value of 0.28 runs per game for a starting pitcher in terms of MORP and then decide if it is "material" or not?
Or does actual research not have a place in this column?
You went out of your way to state that Nolasco is *not* throwing any slower this season, even using italics to emphasize the point. This is dead wrong. Now you are defensively swinging from the corner you backed yourself into. I feel like this column is a true outlier almost every week in that it fails to measure up to the analytic rigor of most of the other columns at BP.
I do not know you or care to know you. My only interest is in accuracy and coherent analysis. If you continue to be this lazy in your research, I will continue to alert others to the massive flaws in your arguments.
All of this is sheer speculation. I read BP for in-depth statistical analysis of why a certain player/team has performed a certain way, with an eye to creating a framework for evaluating future instances of similar performance. I don't see any of that in this article.
Surely you are not this illiterate with regard to the current research on velocity changes and pitching:
A 1.0 mph change between seasons is certainly significant. It's estimated to be worth 0.28 additional runs allowed per nine innings.
I have no idea how the author can write, "Again, he is not throwing the ball slower nor allocating his repertoire much differently"
10 seconds of research shows a significant decline in FB velocity this season.
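As a back-of-the-envelope sketch of what the 0.28 runs per nine innings figure quoted above would mean over a season: the 180-inning workload below is a made-up assumption, and scaling the effect linearly with the size of the velocity drop is also an assumption rather than something the cited estimate is known to support.

```python
RUNS_PER_NINE_PER_MPH = 0.28  # figure quoted in the comment above


def extra_runs_allowed(velocity_drop_mph, innings_pitched):
    """Estimated additional runs allowed, assuming the effect scales linearly."""
    if innings_pitched < 0:
        raise ValueError("innings_pitched must be non-negative")
    return RUNS_PER_NINE_PER_MPH * velocity_drop_mph * innings_pitched / 9.0


# Hypothetical workload: a 1.0 mph drop over 180 innings.
season_cost = extra_runs_allowed(1.0, 180.0)
```

Under those assumptions the drop costs between five and six runs over the season, which is the quantity one would then convert to wins or dollars.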
That's quite a rant against the Pirates. The secret extensions are unorthodox, but nothing else contained in the rant is remotely new. Is the sample size of Huntington's work large enough to judge yet? Is there any evidence that changing a manager on a young, bad team has any long-term beneficial effect on the franchise? These are the questions to be asking.
salamander: Try a grammar lesson on for size next time if you are going to be a pedantic DB. It's "I read your column closely enough," not "I read your column close enough."
Huh? His BA is as low as ever this season (.227 as of Jun 17, vs. .237 career). So it's his power that has spiked this season, not batting average. Certainly, his HR/FB rate is elevated, but more striking is his increase in FB rate to 55% this season from 44% over his career. I.e., this power spike (current ISO of .310 vs. .180 career) probably has some staying power to it.
"Will it stay that way? I doubt anyone reading this really thinks that it will."
Clearly, you did. One doesn't generally write a whole article about a statistical fluke without at least mentioning that it is in fact a fluke. Instead, the reader is asked, "But what can you do about Fielder?" -- implying there is something that could/should be done to address the (non-)issue.
That's tiny variance for such a ridiculously small sample. We're talking about a total of 8 IFFBs here. So his supposed increase in IFFB% amounts to an additional 1.7 IFFBs this season. That's totally negligible.
His line drive rate, while down slightly from his career rate, is exactly what it was last year, when he had a .788 OPS.
So, to repeat: mostly bad luck here.
The question of the alleged "weakness" of Jimenez's opposition has already been answered in the comments above. He has faced average quality hitters for an NL starting pitcher.
On the other issue: generally speaking, hitters benefit more than pitchers from familiarity.
I think Hamels' ERA improving from 4.32 last year to 3.74 this year qualifies as a rebound.
Good point. In other words, he has not faced an easier than usual set of batters.
Most of it is BABIP. Not sure how you could write an article such as this one and not mention it. It's .221, vs. .279 career. His GB/FB ratio is the same as his norm. His HR/FB rate is half his norm (mostly luck). Both his K rate and BB rate are up significantly, but his K/BB ratio is normal. His swing rate and contact rate are both close to normal as well.
I.e., mostly bad luck here. I would expect his rest-of-season OPS to be close to his career number of .777.
"But what else you gonna do?" Hmm. Maybe be smart enough to realize that abnormal performance with RISP is not a skill, and thus not predictive of future abnormal performance with RISP?
Thanks for the responsiveness. Any advice for a simple way (for the user) to adjust SIERA to make it a pure predictor of actual future ERA? Obviously, I am coming at this from a fantasy perspective, where you seek the best possible intra-season indicator of actual future ERA on the same team -- not what the ERA would look like if all the pitcher's teammates were replaced by average defenders. Fielding-independent metrics are certainly useful for trying to isolate pitcher skill between seasons, but for pure numerical projections of same-season ERA, a fielding-dependent (but otherwise luck-adjusted) metric is needed. Is there any simple way for someone to tweak SIERA to reflect the pitcher's team defense impact? Obviously, I would prefer to see this offered on the web site, but if not, would be open to suggestions as to how I could do the calculation myself.
Bottom line: does SIERA project what a pitcher's ERA is likely to be with his *current* team, or what his ERA would be if he played on a team with league-average defense?
A few requests: can you make the BP stat pages (e.g., http://www.baseballprospectus.com/statistics/sortable/index.php?cid=244079) sortable by player *last* name, and able to show all players on a single HTML page, if desired?
More importantly, can you make available team SIERAs (preferably with the option to see team starter SIERA and team reliever SIERA)?
Thanks. Just to refresh, SIERA does not try to neutralize team defense, correct? So SIERA attempts to predict *actual* ERA going fwd. (vs. trying to assess team-neutral pitcher quality like FIP does)?
Are full SIERA numbers available on the web site somewhere?
Any reason why almost every player is projected to be above average defensively -- both for 2010 and for the 10-year projection? It never looked like this in past years.
Can someone from BP address the seeming defensive rating inflation in this year's PECOTA cards? It's not a subtle change.
Any reason why almost every player is projected to be above average defensively -- both for 2010 and for the 10-year projection? It never looked like this in past years.
Has anyone addressed the 10-year projections for defense? To me, it was always one of the more interesting parts of PECOTA cards. This year, however, just about every player is projected to be above average defensively for the next 10 years. Clearly something is wrong (as of March 31).
Dee Gordon PECOTA card does not exist. (He is the Dodgers' #1 prospect).
Ian Desmond's 2010 PECOTA card is missing.
Aardsma's PECOTA card is missing.
What is the point of pulling three player projections and saying all is well because they generally resemble another firm's projections? That's the most absurd notion I have heard in this entire thread.
It's not a question of "resorting to something like that." It's a much less sinister question of how best to test a new prediction algorithm. I (and others) have suggested that optimizing parameters on a data set (all past seasons), and then testing the algorithm's accuracy on a subset of that data, is not a valid approach.
It's the same as developing a stock trading system using all data through 2009, and then testing it on 2009 data. And then comparing to a system that was developed using only data through 2008. Ideally, you want to test the system on data that has never been used in the development process.
*No one* is suggesting that the new PECOTA somehow has access to future results as it makes its projections. Rather, there is simply some interest in using more rigorous methods when trying to assess accuracy. In this case, to compare the old vs. new PECOTA, it would only make sense to look at past years that neither system had access to during *development*.
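A toy sketch of the out-of-sample idea: tune a parameter using only seasons up to a cutoff, then measure error on seasons after it. The one-parameter "model" (shrinking a prior OBP toward an assumed league average), the field names, and the rows are all made up for illustration and have nothing to do with the actual PECOTA algorithm.

```python
from math import sqrt

LEAGUE_AVG_OBP = 0.330  # assumed placeholder value


def predict(row, weight):
    """Shrink the player's prior OBP toward the league average."""
    return weight * row["prior_obp"] + (1 - weight) * LEAGUE_AVG_OBP


def rmse(rows, weight):
    """Root mean squared error of the predictions against actual OBP."""
    return sqrt(sum((predict(r, weight) - r["obp"]) ** 2 for r in rows) / len(rows))


def fit_and_test(rows, last_training_season, candidates=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the weight on training seasons only, then score it on held-out seasons."""
    train = [r for r in rows if r["season"] <= last_training_season]
    test = [r for r in rows if r["season"] > last_training_season]
    if not train or not test:
        raise ValueError("need at least one training row and one held-out row")
    best = min(candidates, key=lambda w: rmse(train, w))
    return best, rmse(train, best), rmse(test, best)


# Made-up rows; 2009 is held out and never touches the fitting step.
rows = [
    {"season": 2007, "prior_obp": 0.350, "obp": 0.340},
    {"season": 2008, "prior_obp": 0.310, "obp": 0.320},
    {"season": 2009, "prior_obp": 0.370, "obp": 0.340},
]

best_weight, in_sample_rmse, out_of_sample_rmse = fit_and_test(rows, last_training_season=2008)
```

In this toy case the in-sample error is essentially zero while the held-out error is not, which is the gap that an in-sample "test" hides.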
What years of data did the "old" PECOTA system use as its input for optimization of the algorithm? I.e., when was the last time PECOTA was changed prior to this season?
Thanks for the info. I understand that it was not run with any explicit knowledge of the future, but was the new PECOTA system originally developed (optimized) using any data from the 2009 season? That has not been made clear yet.
Would like to see RMSEs for SLG and OBP (not counting stats), new vs. old, for at least the last five seasons.
Valid point. I would still maintain that the age 30-to-35 curves look suspiciously flat this year compared to previous PECOTA iterations. The fact that there are 10-year projections instead of 7-year projections is probably also a factor. I have not done any systematic checks, just looked at 15-20 players.
If they were simply interested in replicating the old PECOTA formula [which is what they *should* have done for this season], they would have simply compared the generated 2009 projections for each system on a player by player basis and noted the discrepancies. They clearly changed the way the algorithm works, and their post was an attempt to show that the new system is slightly better than the old one -- i.e., that the changes they implemented had some merit. It's a valid concept certainly. But the problem is, as a previous poster noted, that the sample size is tiny (one year) and they are using in-sample data to test the new algo. So the RMS error table for 2009 does not really establish anything meaningful.
All these points are valid, especially #1 and #2. It appears they used 2009 results both to create the new formula and to "test" its accuracy. That just doesn't work in any kind of forecasting business.
It's unclear from this post whether 10-year projections for hitters have been fully "fixed" yet, but it doesn't look like it. Basically, most players are projected to remain stable or improve from mid/late-20s through age 35 or so. This is not a normal aging curve, and differs significantly from past long-term PECOTA forecasts. E.g., Dustin Pedroia's TAv taken from his latest 10-year forecast:
Age 26: .305
Other examples (pulled quickly by spot checking random players) include Sizemore, Miguel Cabrera, Adam Lind. Basically almost every good player is projected to experience no decline between age 30 and age 35. This does not correspond to observed reality (http://www.baseballprospectus.com/article.php?articleid=4464 and http://www.baseballprospectus.com/article.php?articleid=9933), or to the way past PECOTA projections looked.
Exactly. It is one thing to screw up massively. Quite another to flatly ignore your customers once it occurs [yes, remaining silent for 12 days after the problem was acknowledged counts as ignoring customers]. This has been a case study in how NOT to handle a significant problem encountered by a consumer-facing business. The increasingly apparent arrogance of BP is bordering on astounding at this point.
How about an update (from *anyone* at BP) on the PECOTA errors and missing pitcher cards? It's been two weeks of silence, while one of your core products (long-term projections) remains broken.
Again, what is the point of a green light with a disclaimer like that? Equiv. to: "He's a very low health risk, unless of course the obvious sign that there may be something seriously wrong with him turns out to show there is something seriously wrong..." And it doesn't even matter that in this particular case the pitcher did end up having a torn ligament. No pitcher who has recently left a game due to arm soreness and has an MRI pending should get a green light for injury risk. It's that simple.
Yeah, I think it makes sense to give a green light to a guy heading for an MRI on his pitching elbow. What is the point of these ratings if they fail to factor in obvious risk? Do you have stats on avg. number of days missed to DL by pitchers who have an elbow MRI? Something tells me that number is just a smidge higher than normal.
Why not use SLG or TAv instead of OBP? OBP for small samples is going to be driven largely by BABIP, which is already known to be highly volatile.
I think a daily update is warranted this late into the preseason.
In order to lose half your home runs "to the ballpark," the ballpark would have to allow no home runs to anyone. Not only is this obviously untrue, but Citi Field had a HR/FB park factor of 98 last year -- essentially neutral, and an overall HR park factor of 106.
Unless you assume that a player's precise flyball distribution (locations where his FBs land) repeats year-to-year, there is no reasonable way to assume that Citi will continue to have a dramatic negative effect on Wright.
Would be interesting to see what normally happens the season after a player has an extreme HR/FB rate (vs. his norm). I'm sure it mostly reverts, but there could be some predictive value in it.
Prob. the most relevant stat (not mentioned here) is his HR/FlyBall rate. It plunged to 6.9% last year (vs. 13.9% for his career). His FB rate (best predictor of future HR rate) was 35.9% last year vs. 38.9% overall -- certainly a drop, but not massive. Citi Field has a pretty normal HR/FB rate overall, so this looks primarily to be a fluke.
When will pitching metrics advance to the point where quality of opposition faced is used to calculate an opponent-neutral ERA? A key component of the opp. "quality" calculation would be handedness of the batter rel. to the pitcher. Continuing to evaluate pitchers without doing this is like using a college football power rating that doesn't incorporate strength of schedule.
Probably not a great idea for BP to bring up PECOTA's performance in predicting team wins. As shown here,
PECOTA was the best by a tiny margin in 2008, but was the worst projection system by a huge margin in 2009. Overall, from 2006-2009, PECOTA ranks third out of four systems tracked in RMSE (CHONE, MGL, PECOTA, ZiPS).
Sadly, not subscribing -- if enough people do it -- is the only way to force BP to treat this problem with sufficient resources and a true sense of urgency. I have been a subscriber for eight years, and have seen plenty of incrementally positive and negative changes over the years at BP. But what has happened with PECOTA since Silver left is nothing short of negligent.
So let's take a look at how PECOTA projects the top five hitting prospects in baseball to "grow" over the next five years [TAv taken from 10-year forecast]:
Jason Heyward (age 20):
Mike Stanton (age 20):
Desmond Jennings (23 years old):
Buster Posey (22 years old):
Pedro Alvarez (23 years old):
So, basically none of the top five prospects in baseball are projected to improve over the next five years. Apparently, each has already peaked as a mediocre MLB regular. Anyone who has used PECOTA projections over the years will understand how massively different these projections look than those of years past. They (Pease et al.) have essentially diluted the informational content out of prospect projections to the point where all major prospects are projected to follow an eerily similar career path.
In short, this is worse than New Coke. Someone has significantly changed the algorithm (intentional or not), and there is no documentation of what has changed or why. There is simply no way to trust any of the PECOTA projections for this season -- esp. those of prospects. This is extremely unfortunate as long-term projections were the last remaining competitive advantage BP had over competing forecast services (for data forecasts, not editorial content). A full article on this debacle (not another "Unfiltered" side-note) is warranted.
I am not trying to bash BP as much as I am expressing my personal disappointment at not having a source for accurate long-term prospect projections for the first season in a very long time. I honestly don't know of anyone else who takes a numerical approach to evaluating minor leaguers. If anyone does, please post.
Yes. This looks like a total disaster that will not be fixed until next season. When the core of your offering is accuracy of projection, and you are experimenting aimlessly in March for the current season, it's a very, very bad sign. Glad I have not renewed my sub. yet.
To clarify further, player age is missing both in the bio info. at the top of the page, and in the 10-year performance forecast.
Overall, they look good. As a practical matter, I am not sure that making the "cards" approx. 20 browser pages long is necessary. The related articles, roundtables and chats should be separate links, rather than part of the player's main page.
Also, the player's age is missing.
When are the PECOTA cards really coming out?
Is this going to be the latest (time of year) release of PECOTA cards ever? I honestly don't know of a company that over-promises deadlines more consistently than BP.
Shouldn't this be tested, rather than asserted [that using both xFIP and SIERA produces a more accurate prediction than xFIP alone]?
Absolutely. This is embarrassing at best.
Give readers the ability to rate articles on a scale of 1 to 5, and let us search for the highest rated articles.
Where is the evidence that your ratings are predictive (more so than using a player's historical injury rate combined with his position's avg. injury rate)?
Totally agree. Splitting the entire population into two groups is not going to prove anything.
This is great research. I actually expected stronger trends to emerge, but am glad finally to have data on this topic. Thank you.
Allow me to say that the "Hit List" is easily the most useless recurring feature at BP.
I agree with many other subscribers. While I like the idea here (of making in-season projections), your (Eric's) methodology is arbitrary at best and seriously flawed at worst.
First, recalculating everything based on 2-month park factors is chasing noise, and surely adding nothing but increased average error to your projections. There is no evidence whatsoever that if a first-half park factor differs from what was expected in the pre-season, the second-half park factor will be any closer to the first-half one. I.e., the sample is way too small for partial season park factors to be predictive. Frankly, you should already know this.
So the main concept behind in-season projections is to figure out how to best use any *new* information the current season has provided. Instead you arbitrarily add adjusted rate stats from past seasons to the equation. PECOTA is specifically designed to replace the method of using past stats directly to predict the future. You are basically throwing that out the window and saying that past rate stats -- as long as they are adjusted -- are more useful than PECOTA alone for predicting the current season. Where is the evidence for this? And if there is any, then why don't your pre-season projections use adjusted past rate stats to create the projections, rather than trusting PECOTA to handle all interpretation of the past on its own?
In short, you have created an arbitrary hodge-podge of a formula to project rest-of-season stats. It's not even clear what stats you are using. Are you using this season's raw ERA as an input? This season's raw BA as an input? I should hope not, as these stats are not predictive.
My hope is that you or someone at BP will do the research to figure out the optimal choice and weighting of in-season stats to use to come up with the most accurate rest of season projections. Whatever it is [and I have done some research on this topic], I am pretty confident it will differ significantly from what you threw together this season.
Repeating post in proper place (at end of comment thread):
Can someone explain which numbers are rest of season projections, which are YTD, and which are neither? For the rate stats listed for each player above, how are they calculated? They match neither the PECOTA projected rate stats nor the actual YTD rate stats.
This page needs a full explanation/glossary ASAP.
This is interesting, but I hope you can break down the avg. HFA by game number in a series and/or by day of the week. Travel is presumed to represent a significant portion of HFA in every sport, but no one has tested it. Baseball teams have many road games where they have already at least spent one day in that city. So it is ripe for an analysis of how much travel actually impacts next-day performance in the sport.
I would agree. This was an interesting topic, but not a very readable presentation. A true Q&A interview would have been best. If not possible, then a more integrated use of quotes would have been appropriate. As it is, the author inserts the implied questions before every implied answer the interviewee gives. So you get a manufactured interview that the reader is not sure really occurred in the way it was written.
Stadium-specific effects are minimal. Travel and psychology are the primary drivers of HFA in any sport.
Yeah, the raw HFA numbers are misleading if you try to compare across sports. In the NBA, e.g., the better team wins much more frequently than it does in MLB. This is simply due to the way points/runs are scored and their frequency.
The interesting thing will be to isolate travel effects in MLB, since it is the only sport offering a decent sample of road games played in same city as previous game. To my knowledge, no one has looked at this analytically yet, so I am looking fwd. to the future installments.
I hope you will compare HFA not only vs. game number of the series, but also wrt number of days rest. Will be interesting to see if unrested road teams in game 1 fare more poorly than rested road teams in game 1, e.g.
I like the direction this is headed...
One thing to look at is whether team-specific HFA is correlated y2y (assuming no stadium change). I.e., is it more accurate to use a generic HFA for all teams, or to tailor it based on a team's historical HFA?
On a similar note, I hope you can extend the analysis at some point to look at individual player home/road splits and whether they have any predictive value (vs. using generic team home/road adjustments). I suspect past individual splits are mostly worthless for predicting future splits.
I don't follow your explanation. Additional volatility (variance) in per-game scoring will *always* help the inferior team's chances of beating an opponent. Run scoring does not have to be normally distributed for this to be true.
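One constructed case (not a proof) of the variance point, using exact arithmetic on two small, deliberately non-normal scoring distributions. The distributions and the function name are invented for illustration.

```python
from itertools import product

def underdog_win_prob(favorite, underdog):
    """P(underdog outscores favorite), counting a tie as half a win.

    Each argument maps a run total to its probability.
    """
    prob = 0.0
    for (f_runs, f_p), (u_runs, u_p) in product(favorite.items(), underdog.items()):
        if u_runs > f_runs:
            prob += f_p * u_p
        elif u_runs == f_runs:
            prob += 0.5 * f_p * u_p
    return prob

# Favorite averages 5 runs and underdog averages 4 in both scenarios.
low_variance = underdog_win_prob({5: 1.0}, {4: 1.0})
high_variance = underdog_win_prob(
    {3: 0.5, 7: 0.5},  # still 5 runs per game on average
    {2: 0.5, 6: 0.5},  # still 4 runs per game on average
)
```

With no spread the underdog never wins; with the same means and a wider spread it wins a quarter of the time, and neither distribution is anywhere near normal.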
The only problem with calling Bell a "righty-killer" is that he usually isn't particularly tough on righties (relative to his overall effectiveness). In fact, his PECOTA card projects an RHB OPS split of just -.046 (27 points less than the league RHP avg. of -.073 vs. RHBs). His historical 3-year OPS-against vs. righties from 2006-2008 was actually worse (.649) than vs. lefties (.598) by 51 points, according to espn.com. This season, obviously, he has performed better vs. righties. But the sample is tiny and it's highly unlikely he changed into a righty-killer during the off-season.
To my knowledge, the following is not accurate:
"In that report [regular post-season odds report], each team's current record and third-order Pythagorean record—their record after adjusting for scoring environment, run elements, and quality of opposition—are factored into a Monte Carlo simulation of the rest of the season, with their records regressing not to .500 but to their third-order winning percentages. Run differentials play a big part here; a team that's above .500 but being outscored won't see favorable odds."
In the regular post-season odds report (http://www.baseballprospectus.com/statistics/ps_odds.php), actual win pct. is not used at all, and it is in fact regressed to .500. The projected team strength calc. starts with 3rd-order win pct., and regresses it toward .500. Your statement says that it uses actual W-L records and 3rd-order records as inputs, and somehow regresses the records to their third-order winning percentages.
From the page itself:
"Expected winning percentages (EWP) for each team starts with their W3 and L3 from the Adjusted Standings. A regression is applied to derive the EWP for the rest of the season, which is going to be between the current winning percentage and .500."
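A rough sketch of what "start with W3 and L3, then regress toward .500" could look like. The linear shrinkage form and the `regression_weight` value are assumptions for illustration; the quoted page does not say what regression the report actually applies.

```python
def expected_win_pct(w3, l3, regression_weight):
    """Shrink a third-order winning percentage toward .500.

    regression_weight is a placeholder in [0, 1]: 1 keeps the third-order
    percentage unchanged, 0 returns exactly .500. The real report's weight
    is not given in the quoted text.
    """
    if not 0.0 <= regression_weight <= 1.0:
        raise ValueError("regression_weight must be between 0 and 1")
    w3_pct = w3 / (w3 + l3)
    return 0.5 + regression_weight * (w3_pct - 0.5)

# Hypothetical team with a 36-24 third-order record and an assumed weight of 0.5.
ewp = expected_win_pct(36, 24, regression_weight=0.5)
```

Note that actual W-L record never enters the calculation in this sketch, which is the distinction the comment above is drawing.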
I couldn't agree more...well said. There is more to baseball than transactions/injuries/prospects. All three are important, but remain peripheral topics. Somehow -- I think mostly unintentionally -- these topics have moved from the periphery to become the core of BP. As a result, original analysis has become the exception, rather than the rule on this site.
I found the face dimension discussion useless and somewhat awkward (as was the original article on the subject), but to be fair, the host really forced the conversation in that direction. The first half was solid, except for: "Rick Porcello has been just a revelation in his rookie season." The guy has 47 K, 31 BB, 96 H and 13 HR in 87 IP (4.14 ERA, 1.46 WHIP). By any advanced metric, that's bad. With neutral luck, he'd probably be in the bullpen or Toledo by now. Certainly a case could be made that he will improve on those numbers, but when a host specifically asks if a starting rotation can continue its outstanding performance, and rather than pointing out Porcello's bad peripherals, you simply call him a "revelation," it instantly erodes credibility.
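For anyone who wants to check the peripherals quoted above, a quick sketch of the arithmetic. It assumes the 87 IP figure is exactly 87.0 innings (no partial innings); the helper names are just for illustration.

```python
def whip(walks, hits, innings):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings

def per_nine(count, innings):
    """Scale a counting stat to a nine-inning rate."""
    return 9 * count / innings

# Line quoted in the comment: 47 K, 31 BB, 96 H, 13 HR in 87 IP.
so, bb, h, hr, ip = 47, 31, 96, 13, 87

porcello_whip = whip(bb, h, ip)
k_per_9 = per_nine(so, ip)
bb_per_9 = per_nine(bb, ip)
hr_per_9 = per_nine(hr, ip)
k_to_bb = so / bb
```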
Count me among those who would rather hear directly from someone (Swartz) who has done original research on the topic, vs. someone (Kahrl) who is nitpicking from afar.
This is really good. I think Cecil is definitely a player PECOTA users were wondering about heading into the season. I liked the analysis employed. The only thing missing was more context for the velocity numbers. How hard did scouts say he was throwing in the minors? What exactly are his offerings, the pct. he throws them, and the avg. velo of each? I think that this kind of info. is needed in the introductory paragraphs -- well before any statistical analysis begins. I am guessing his abnormally high velo in the first few innings of his first MLB start was more due to adrenaline than due to injury or lack of stamina.
An NFL running back has never entirely turned a franchise around -- Tomlinson included. The only position with enough leverage even to make this possible is quarterback.
I don't know about aloof, but certainly inappropriate and misleading.
If a woman had the biceps of Piazza, she might very well get a token late-round draft spot... The idea that women might be undervalued because they have fewer developed bad habits and/or have fresher arms is truly absurd...probably the most inane comment made yet in the entire competition. There is no "inefficiency" resulting from the failure to draft and develop women; by contrast, it would be incredibly inefficient -- and thus beneficial to the competition -- to do so.
It was spelled correctly in the opening sentence, but then misspelled in the next two mentions.
Just to be clear: a simple broken wrist with full recovery after 5 weeks last season, combined with a minor hamstring problem that has sidelined him for only a handful of games this season, makes you question Longoria's ability to stay healthy? I think missing only ~5 games to non-fluke injuries over a player's first season's worth of games is a better than average outcome, and certainly not an indicator of future fragility. A series of related muscle pulls or tendon/ligament issues would be another story. An isolated bone break from a HBP and a non-DL hammy don't really warrant the same level of concern.
Comments like this will do nothing but inhibit future innovation. Just because a topic has been addressed before, does not mean it should be ignored going forward.
Overrated? I don't think anyone -- incl. the Yankees -- thought that Nady's first couple months in 2008 was his expected future level of production. And how exactly does the Braves trade show that McLouth is overrated?
I didn't know McLouth was a boxer.
Interesting. One issue over which you have no control is the accuracy of reported attendance numbers. Certain minor league teams are known to call in well after game day with suspicious "corrections" to past att. numbers -- in what appears to be an attempt to rise in the league's attendance standings. I don't think this will impact your general analysis much, if at all, but for many specific teams, there are non-zero "park effects" for attendance reporting.
One more factor to add to your list for future study is the local experience level of the opening day roster. That is, how many players are familiar to the home team fans from last season or earlier, vs. how many are brand new. Maybe the metric is total number of games played on current team -- added up for all members of opening roster.
Nice article. I particularly enjoyed the pure talent eval. without the team/bonus/agent aspects -- as those factors are of far less interest to me. I just wish you could post an updated version every week throughout the year. I recognize that it would be even more speculative the further away from the next draft it is and that many weeks would see no major changes, but I think a constantly updated evaluation of amateur talent is missing on the internet. E.g., if someone wants an expert's view of the best available HS/college talent as of, say, January in a given year, it's difficult to find anything even remotely comprehensive.
Any reason? Not sure I follow the point of hiding actual results from voters.
I'd say neither one is likely.
Value = expected relative player performance divided by expected player cost over the life of a contract. Not too hard to understand.
"Who knows what 4.5 million could buy in 2010 and 6 million could buy in 2011?"
Let's see...how about less than half of an Aaron Rowand contract ($12M/yr)? Read up on MORP if you need more examples.
An "equal" or "stable" home/road HR rate split for a Pirates hitter does not mean he is showing no HR park effect, since an average Pirates player would be expected to hit more HR on the road than at home. A player who has equal raw home/road HR rates in such a park is showing some degree of HR outperformance at home.
One aspect of the trade that surprised me a little bit is that the Pirates are losing a lefthanded hitter for a righthanded hitter. According to BBHQ, PNC Park has a -28% LHB HR 3-yr. park factor, vs. a nearly-flat RHB HR factor. McLouth shows a +40-point career home/road SLG pct. differential, which is pretty decent-sized, given the overall SLG suppression of PNC. Turner Field has actually reduced LHB HR by 15% and has boosted RHB HR by 7%. Obviously, you don't build your team solely on handedness park factors, but it does seem like McLouth would have a bit more raw value in Pittsburgh.
"Sure, he's better than replacement value, but does that mean you want to throw 4.5 million next year and 6 million in 2011 to that kind of player?"
Yes, it does. Apparently, you haven't been following player salaries very closely. There is no one on this comment board besides you who would argue that McLouth's current contract has negative value.
I believe he is basically saying that defensive stats are still rather imperfect, so a subjective evaluation of defense should actually carry at least a little weight -- just like a scouting report has value in supplementing a statistical profile.
All I would say is that if I was forced to bet on which player had the better "true" defense [assume that could magically be determined after the fact]: A) a CF with a 94 Rate and a recent Gold Glove, or B) a CF with a 94 Rate and no Gold Glove votes, I would bet on A.
I agree. This trade does not show anything about how the industry values players. It merely suggests that a single team -- the Pirates -- is placing much more value on defensive metrics (for better or for worse) than the average team is. The "industry" has not expressed its opinion in any way on the trade.
As a side note, based on a big ol' sample size of two, it appears as if the Pirates may be valuing groundball pitchers more than average.
Finally, Sheehan's thought that, "For the Pirates, though, [Hernandez] becomes a rated prospect who could be a fair regular in a corner, at low cost, for a few seasons," strikes me as pretty absurd. Why would they pay up for premium CF def., and then stick the guy on a corner? Furthermore, Hernandez could not look like more of a slap hitter than he has thus far in his career. If he does start for the Pirates in the next couple of years, it will almost certainly be at CF.
Obviously, I mean a tally of how many positive votes vs. total views each author got -- generally, the whole point of any contest or election. Not a listing of who each particular reader voted for...
No, actually, I prefer to read intelligent compilations of relevant salary info. Hence, value. And if you are seriously using USA Today for your baseball contract info., then you clearly *don't* value your time...
Good stuff. I like the quick summary of existing work on the topic, and then your addition to it. A couple questions: did all three groups have roughly the same number of IP/season per pitcher? That is, did the "Good" pitchers have more innings/season in their sample than the "Bad" pitchers? I ask bc, if so, the larger sample could account for some of the decreased volatility (vs. other pitchers) seen in their strand rates. I.e., you would be using more information for Good pitchers than you would for Bad pitchers.
On a similar note, is strand rate consistency more influenced by pitcher quality (FIP) or by pitcher consistency? That is, I would be interested in creating three buckets of pitchers: inconsistent, average, and consistent, and running the same kind of test as was run on Good, Medium and Bad. The issue is how to define "consistent." I might try using stdev of season FIP over the sample, so those Ps with the highest FIP stdev over the past several years would be labeled "Inconsistent," etc. Just trying to assess if a consistently mediocre pitcher is more likely to maintain strand rate than a mediocre pitcher who swings wildly from good to bad each season.
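A sketch of the bucketing idea floated above: rank pitchers by the standard deviation of their season FIPs and split them into thirds. The tercile cutoffs, the two-season minimum, and the sample FIPs are all assumptions for illustration.

```python
from statistics import stdev

def bucket_by_fip_consistency(fip_by_pitcher):
    """Label pitchers by tercile of the sample stdev of their season FIPs."""
    # stdev needs at least two seasons, so shorter samples are skipped.
    spread = {
        name: stdev(seasons)
        for name, seasons in fip_by_pitcher.items()
        if len(seasons) >= 2
    }
    ranked = sorted(spread, key=spread.get)  # smallest spread first
    third = len(ranked) / 3
    labels = {}
    for i, name in enumerate(ranked):
        if i < third:
            labels[name] = "Consistent"
        elif i < 2 * third:
            labels[name] = "Average"
        else:
            labels[name] = "Inconsistent"
    return labels

# Three made-up pitchers, each averaging a 4.00 FIP over three seasons.
sample = {
    "Pitcher A": [4.00, 4.10, 3.90],
    "Pitcher B": [3.50, 4.50, 4.00],
    "Pitcher C": [3.00, 5.00, 4.00],
}
labels = bucket_by_fip_consistency(sample)
```

All three sample pitchers have the same average FIP, so any difference in strand-rate stability between buckets would come from consistency rather than quality, which is the comparison the comment is after.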
Definitely there have been improvements every year, but they tend to be rather incremental compared to the ambitiousness of the original concept of PECOTA, and, frankly compared to the increasing ambitiousness of other baseball analysts in the past few years. It's disheartening to see people like Jacques act so defensive and then shoot down new ideas without even doing enough homework to determine if they have in fact already been implemented as described. [Note: based on the links he sent, none of what I suggest is already incorporated in player cards]. This attitude of "why are you criticizing anything?; we already addressed those issues long ago...at least I think we did" simply cannot help PECOTA grow and improve.
In the interest of transparency, how about posting the actual voting numbers each week?
I don't follow how OBI attempts to answer how "reliable" the external component of the RBI stat is. There is no test of the reliability or predictive quality of this measure. Is it almost all noise, or is it somewhat predictive? I would lean to the former, but the author makes no attempt to answer that question. Instead, the second half of the article consists of anecdotal predictions for individual players.
"Once Hawpe's line begins to dip back toward normalcy, we should see his R2BI rate drop, which would also mean a drop in his overall OBI and RBI totals"
Wait, so Hawpe's individual performance is due to revert, and once it does, it is likely that he will drive in fewer runs from 2nd base, which means he will also be driving in fewer non-self runs overall, which means he will be getting fewer total RBI. Therefore, sell high.
But is there anyone who would have expected his RBI production to stay constant as his performance leveled off?
Sorry for all the negativity -- as I think in-season secondary stat projection is an area ripe for development -- but this article fails to keep its analysis clean and useful. If OBI is the topic, then try to identify which players have abnormally low or high OBI%s and make a case that these rates will persist (due to great/awful surrounding offense) or revert to lg. avg....rather than obscure the discussion with general talk of which players are off to hot/cold starts.
This (getting at the core of RBI creation) is an interesting and neglected topic in fantasy. I hope the author or someone else can do some more in-depth research on it.
I am not interested in helping your specific fantasy team.
I am interested in learning about player contract statuses with as little effort as possible on my part.
Tell me something: who is currently managing PECOTA development at BP? I did not read the book this year [one less tree family shattered], but I am assuming from Nate Silver's absence on the site that it is no longer him, and from your cursory knowledge of the inner workings of the system that it is not you.
I do not have the 2006 book in storage, so I cannot check the specific adjustments to PECOTA you are referring to -- again, would be nice to have a detailed explanation avail. to subscribers in which the improvements/revisions made over the years were documented.
When I say "strength of schedule," I mean the quality of opponent and environment (park) each player actually experienced on a per-AB basis -- not a generic "the player played in this division, so we adjust him by this amount." E.g., if a Padre miraculously managed to play all 30 of his season PAs at Coors Field, your system would not account for this (to my knowledge). Furthermore, the link you quote refers to incorporating SoS in team win projections, which is not the topic of this discussion. [BTW, Silver's article *manually* applied SoS factors to the team projections in this article].
I.e., you have the data to know exactly whom each player was facing and where it was for every plate appearance. Why not use this to normalize the difficulty (for both pitcher and batter) of each PA? I recognize it is not a trivial procedure; but certainly something you should at the very least be in the process of developing presently.
As for handedness adjustments, the "Platoon Splits" section you refer to from BP 2008 is quite vague. It says that PECOTA now tries to estimate the handedness mix of the opponents a player is "likely" to face for the upcoming season, and adjusts the raw projections accordingly. It does not say that it evaluates on an AB-by-AB basis what handedness mix the player has faced for the past three years and how that is used to adjust his historical eq-Stats, if at all. Again, it's a matter of looking at the specific difficulty of each PA (including handedness) and averaging all PA to come up with an adjustment, rather than making division-wide or league-wide assumptions as to the difficulty a player has faced.
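To make the per-PA idea concrete, here is a minimal sketch, assuming each plate appearance carries hypothetical multipliers for park, opposing pitcher quality, and platoon matchup (1.0 = neutral, above 1.0 favors the hitter). Multiplying the three factors together and dividing the raw stat by their average are simplifying assumptions of mine, not anything PECOTA is documented to do.

```python
def average_pa_context(plate_appearances):
    """Mean hitter-friendliness across a player's actual plate appearances."""
    if not plate_appearances:
        raise ValueError("need at least one plate appearance")
    total = 0.0
    for pa in plate_appearances:
        # Assumed model: the three context factors combine multiplicatively.
        total += pa["park"] * pa["opponent"] * pa["platoon"]
    return total / len(plate_appearances)

def context_neutral(raw_stat, plate_appearances):
    """Deflate a raw rate stat by the average context the player faced."""
    return raw_stat / average_pa_context(plate_appearances)

# The Padre from the example above: all 30 PAs in a very hitter-friendly
# park (the 1.20 factor is made up), neutral opponents and platoon mix.
coors_only = [{"park": 1.20, "opponent": 1.0, "platoon": 1.0}] * 30
avg_context = average_pa_context(coors_only)
neutral_ops = context_neutral(0.900, coors_only)
```

A division-wide or home-park adjustment would miss this player entirely, while the per-PA average picks up exactly where he hit.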
I am certain you will (try to) correct me if I am wrong. I am not a PECOTA historian, but rather someone who wants to see the most accurate baseball projection system possible. That is my sole motive. I am not sure what your motive is, and you appear more interested in playing defense than in looking for new ideas. And BP Idol clearly is *not* a solicitation of new ideas to improve PECOTA. It is a solicitation for articles and future employees. There is a big difference.
Value = how much time and effort is saved.
This research is descriptive, not predictive. There is no evidence presented that YTD OBI rates are correlated to rest-of-season RBI rates (or even OBI rates), without which I cannot see any fantasy usefulness for this article.
FWIW, when I did some research several years ago on in-season RBI predictors, team OBP had only a near-zero correlation to RBI -- which was surprising and may have been an anomaly due to limited sample. It needs to be evaluated more thoroughly. Also slightly interesting: 1st-half slugging pct. was not the best single-stat predictor of 2nd-half RBI. A modified slugging pct. (altering the weights of 2B, 3B and HR) was more accurate by a non-trivial amount.
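A sketch of what a "modified slugging pct." means mechanically: the same formula as SLG, with the hit weights exposed as a parameter. The alternative weights and the stat line below are placeholders only; the comment does not give the weights that were actually found to work better.

```python
def weighted_slg(singles, doubles, triples, home_runs, at_bats,
                 weights=(1, 2, 3, 4)):
    """Slugging-style rate; the default weights give standard SLG."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    w1, w2, w3, w4 = weights
    bases = w1 * singles + w2 * doubles + w3 * triples + w4 * home_runs
    return bases / at_bats

# Made-up first-half line: 50 1B, 15 2B, 2 3B, 10 HR in 300 AB.
standard = weighted_slg(50, 15, 2, 10, 300)
# Placeholder re-weighting (triples down, homers up), purely illustrative.
modified = weighted_slg(50, 15, 2, 10, 300, weights=(1, 2, 2, 5))
```

Finding the weights that best predict second-half RBI would then be a fitting exercise on past seasons, ideally checked on seasons held out of the fit.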
I don't know the proper way to predict future RBI, but I was disappointed this article didn't tackle the issue.
To follow up further: the analysis used in this article is, in fact, pretty basic for anyone with experience with Excel. I have no idea why the author even mentions, "pivot tables," as they are in no way needed to do the analysis presented. It is very basic stuff he discusses, and honestly, I get the feeling VLOOKUP and Pivot Tables were mentioned chiefly to dress up the simplicity of the analysis.
This article is adequate for Fantasy 101, but a bit too introductory for the average BP reader. I feel like if you didn't already understand everything he mentioned, you really wouldn't get much out of a BP subscription.
As people have noted, fairly boiler-plate analysis. But well-executed, and one of the few articles this week to offer direct usefulness to the fantasy player.
Absolutely (from another long-time BP sub.). This will surely land me on the permanent BP hate list, but why are three (apparently) non-quantitative people acting as judges for a contest to gain employment writing for a site that was founded on innovative quantitative analysis?
I'll be blunt: the comments from the judges have been almost uniformly disappointing -- not even as they pertain to this contest, but as they reflect the future direction of the company as a whole.
Not very exciting, but contains critical strategy elements that many above-average fantasy players still do not fully grasp. I personally did not get a whole lot out of it, but I applaud the author for taking the time to address this fantasy technique in an organized way. Overall, the article is worthwhile because it offers a specific strategy to improve your fantasy performance. How many of the other articles this week can make that claim?
I liked this article. It was an easy-to-read list of players filtered for likelihood of being traded. That is not something I can find elsewhere. I found the author's opinions to be more convincing than those of most "deadline deal prognosticators." It may have been a bit off-topic, but the article was quick, painless and informative. I don't need to wade through tortured prose to get at information; I appreciate the author's direct style.
There is something very odd about this article. It feels like it was written by three different people -- two of whom may know something about fantasy baseball.
The fact that you are "pretty sure" something has been addressed in the past more or less proves my point. No one appears to be taking the lead with driving PECOTA to become better as aggressively/swiftly as possible. We get daily updates on injuries and transactions [sorry, not everyone is absorbed with such topics], but an article on PECOTA research (or related topics) appears maybe once or twice a year.
If you find them, please point me to where ideas #2 and #3 were addressed -- as well as #4. And more generally, please be more transparent with what exactly PECOTA uses as factors and how it uses them. The secret's been out of the bag for a while now. It's time to actively solicit ideas and tweaks from your readership (after first explaining to them what you are currently doing).
I agree. It kind of defeats the purpose of the contest structure to hide the weekly results in some sidebar article...
Easily the most worthless of this week's entries. Yes, I have played some Strat. No, the topic is not worthy of an introductory article at a site like BP. Solid writer, but topic selection skills are lacking -- and finding the right topic is more than half the battle...
Liked the topic. Did not like the readability/presentation of the data. Did not like the shift from general analysis to single-player anecdotes in the second half.
This article is great. The author needs to be hired immediately. PECOTA development is being neglected to the point where it is on a slippery slope to obsolescence -- regardless of what the BP brass may believe. It desperately needs someone to look at ways to improve accuracy, preferably an outsider.
Other factors I feel PECOTA is ignoring/undervaluing at its own peril:
1) Using pitch type/velocity/movement data to find player comps. more accurately.
2) Grading past performance as reliever vs. as starter on different scales (and doing the work to find out exactly what that adjustment should be).
3) Using strength of schedule. No credible team performance prediction system would ever consider making the assumption that all opponents are league-average. Why would PECOTA? Sure, it's more difficult; but it cannot really be that hard to look at the quality of opponents faced for each season.
4) Dealing with platoon advantage more fairly. This ties in with #3, as players who are used abnormally more against beneficially-sided opponents (LHPs facing mostly LHBs, or LHBs facing only RHPs) will get inflated rate stats. This needs to be accounted for in the schedule difficulty calc., so that a situational lefty who posts a 3.00 eqERA will be penalized to reflect what he would have posted vs. a normal mix of batters.
Apologies if any of these topics have been fully addressed already in the latest PECOTA. But to my knowledge, they have not; and I am not getting the feeling that anyone at BP is committed to improving the algorithm as aggressively as it needs to be done.
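To make idea #4 concrete, here is a rough sketch of a platoon re-weighting. It assumes you already have a pitcher's run rate split by batter handedness and simply re-mixes the two splits at a league-normal share of LHBs. The function name, the split rates and the 40% league share are all hypothetical, and this is not a description of anything PECOTA does.

```python
def mix_weighted_rate(rate_vs_lhb, rate_vs_rhb, lhb_share):
    """Blend two handedness splits by the share of LHBs faced.

    Simplification (an assumption): both split rates are on the same
    scale and can be averaged linearly by share of batters faced.
    """
    if not 0.0 <= lhb_share <= 1.0:
        raise ValueError("lhb_share must be between 0 and 1")
    return lhb_share * rate_vs_lhb + (1.0 - lhb_share) * rate_vs_rhb


# Hypothetical situational lefty: 2.00 vs. LHB, 5.00 vs. RHB.
# Used against LHBs two-thirds of the time, he shows about a 3.00 rate.
observed = mix_weighted_rate(2.0, 5.0, 2 / 3)

# Re-mixed at an assumed league-normal 40% LHB share, the same splits
# come out to about 3.80.
neutral = mix_weighted_rate(2.0, 5.0, 0.4)
```

The gap between `observed` and `neutral` is the penalty the comment is asking for; the hard part in practice would be getting reliable split rates out of small samples.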
Strand rates show basically no correl. y2y:
"The fastball velocity issue has raised eyebrows as well, since it is fairly rare to see a 24-year-old with fluid mechanics and no real prior injury history suddenly drop from 93 mph to 91.7 mph in under three seasons."
I totally disagree with this. I have not done the research fully, but based on scanning the data for every current MLB pitcher, I believe that it is not at all unusual for a young, "healthy" pitcher to lose that much FB velocity (1-2 mph) 600+ IP into his MLB career. The truth is for most pitchers, the act of pitching itself is a gradually damaging process, even if no sudden injury occurs. This is sometimes offset by or even exceeded by the gains he makes in control/knowledge/strategy. But raw velocity rarely remains steady several years into a career. To cherry pick one example, Cole Hamels was once a "mid-90s" fastball prospect. There are a surprising number of pitchers whose velocity peaked in their 1st or 2nd MLB seasons.
Would love to see someone at BP test this hypothesis fully. Obviously, would have to adjust for usage (starter vs. reliever), and possibly examine peak FB as well as avg. FB, to account for the possibility that the pitcher is simply choosing not to throw as hard as often.
I agree. This is a very good article. And it merely touches the surface of the ineptitude of Littlefield. The man should never enter a stadium without a ticket again.
So why exactly are people thinking Dontrelle has changed? His sparkling 5/4 K/BB ratio?
Am I the only one who thinks this is terrible? Yeah, I get the joke. The problem is that it's not very funny. It comes off as a tortured attempt by the author to advertise his own cleverness.
I am sure you are right, as the *relative* slg. pct. of leadoff hitters is actually down from its peak in the 80s. Leadoff slg. pct. was only about 10 points lower than MLB avg. slg. pct. in the 80s, vs. ~18 pts. lower than avg. in both the 90s and 00s.
The relative OBP of leadoff hitters was stable at about +14 pts. in the 80s and 90s; but it has dropped to +5 pts. in the 2000s.
BA was +12, +10, and +9 pts. better than MLB avg. for 80s, 90s, and 00s, respectively.
Isolated power (SLG-BA) went from about -22 pts. vs. avg. in the 80s, to -27 in the 90s and 00s. So there has been no long-term increase in the power of leadoff hitters relative to the rest of the lineup.
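As a quick consistency check on the figures above: since ISO is SLG minus BA, the leadoff ISO gap vs. MLB avg. should equal the SLG gap minus the BA gap. The snippet below just redoes that subtraction with the numbers quoted in this comment (the ~18 pt. SLG gap is taken as -18).

```python
# Leadoff hitters vs. MLB avg., in points, as quoted above.
relative_slg = {"80s": -10, "90s": -18, "00s": -18}
relative_ba = {"80s": 12, "90s": 10, "00s": 9}

# ISO = SLG - BA, so the gaps subtract the same way.
relative_iso = {
    decade: relative_slg[decade] - relative_ba[decade]
    for decade in relative_slg
}

print(relative_iso)  # {'80s': -22, '90s': -28, '00s': -27}
```

The 90s figure comes out one point off the -27 quoted, which is within rounding given that the inputs are themselves approximate.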
I agree with fredlummis. This is a great format to relay info. we would never otherwise see. The more you can supply direct scout opinions, the better.
"I’m not going to be first on some stories and I’m going to stay away from the rumor mills. I’m just going to try to be the best at what I do … on my pace."
Was this your same policy when you irresponsibly "broke" the Pete Rose news several years ago? [Note that your report was not even a blog entry; it was a formal, look-at-me news story]. I totally agree with what you say in this post, but I do not think you have followed your own philosophy in the past.
I agree...for all the words published on BP every season, there are very, very few devoted to discussing the specific mechanics of how players and teams are projected. Some of us have done independent research on the topic, and would like to at least have a slightly more detailed look at the methodology used.
No mention of Feliz's control problems? If he's a top 5 overall prospect with 23 BB in 45 IP in AA, I would expect the issue at least to be addressed in the writeup.
BP needs to start re-calculating ERA/WHIP based on projected usage. I.e., it needs to project both a starter ERA and a reliever ERA for Ps that are used as both. There is a huge difference in expected rate stats for a given pitcher when used as a starter vs. a reliever. To lump them into one number (thus forcing users to manually look up what PECOTA arbitrarily sees as the usage mix and adjust accordingly) is not very helpful for fantasy purposes.
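A rough sketch of what a user-side adjustment might look like if separate starter and reliever ERAs were published: weight each role's ERA by the innings you expect in that role. The function name and all numbers are hypothetical; nothing here reflects how PECOTA actually combines roles.

```python
def usage_weighted_era(era_as_starter, ip_as_starter,
                       era_as_reliever, ip_as_reliever):
    """Combine role-specific ERAs, weighting each by projected innings.

    Because ERA is earned runs per nine innings, an innings-weighted
    average of the two ERAs equals the ERA over the combined innings.
    """
    total_ip = ip_as_starter + ip_as_reliever
    if total_ip <= 0:
        raise ValueError("total projected innings must be positive")
    weighted_sum = (era_as_starter * ip_as_starter
                    + era_as_reliever * ip_as_reliever)
    return weighted_sum / total_ip


# Hypothetical swingman: 4.50 ERA over 100 IP starting, 3.50 over 50 IP
# in relief. The blended figure lands between the two, about 4.17.
blended = usage_weighted_era(4.50, 100, 3.50, 50)
```

With the two role-specific numbers in hand, a fantasy owner who expects a different usage mix could just change the innings and recompute.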
"It's because a high pitch count total, as a singular event on a particular day, is where the damage to the arm comes from. If you throw 200 pitches on the 15th of every month, and I throw eight pitches 300 times per year, I bet your arm is in more trouble than mine."
If this were true, closers would not get injured nearly as often as starters. Where is the evidence for this?