The thing about the splitter and the changeup is that ordinarily the two pitches behave the same, so that pitchers throw one or the other. There were only three MLB starters last year who threw both: Freddy Garcia and Brad Penny had splitters that broke more vertically than most and hence resembled their four-seamers, and used them in tandem with conventional changes that had the usual run in on RHB; while Tim Hudson's change was a bit like Buchholz's, with less armside run than usual (intermediate between his two- and four-seamer, though), so he was able to pair it with a splitter that had a lot of run.
But none of these guys had a velocity separation between the two pitches of more than about 1 MPH, which is to say, negligible.
Buchholz is throwing an unusually slow change with (even more unusually) even less armside run than his 4-seamer, and a somewhat hard splitter with good, typical run in. There's no velocity overlap at all between the pitches. I haven't gone back to see if anyone else has done this in the pitch/fx era, but I think it's fair to say that, for hitters, it's something many or most have never had to deal with.
The pitches ought to be easy to confuse, so the key question may be whether the splitter is making the change even more effective.
Here's one you missed: 3B Michael Almanzar, he of the $1.5M Red Sox bonus four years ago, went 3-3, 2B, HR, HBP for Salem, and has hit .467 / .568 / .967 over his last 8 games, with 24% of his season's walks, 5% of his strikeouts, and 4 of his 9 homers. He's six days older than Brandon Jacobs, a better defender, and now has better offensive numbers on the season. Worth watching.
Shaq broke his second and longest strikeout streak at 21 today, with a sharp line drive to RF.
Update, 7/17: After going 0-5, 5K, to give him 15 consecutive AB with a K since breaking the 16-for-16 streak, he now owns the 2 longest streaks since 2005. He goes for his own record tomorrow.
If the Red Sox were to trade both Daisuke Matsuzaka and Aaron Cook and move Felix Doubront to the pen in order to trade for Dempster, Garza, or any of the other "usual names" who might provide a "stabilizing force" (which I imagine is to be contrasted to any kind of actual likely upgrade) for the rotation, they would then be rightly mocked and excoriated right here and everywhere else online. (You'd probably do that to make room for Hamels, Greinke, or King Felix if he were available, but that doesn't seem likely.) And the notion of a team with 17 MLB pitchers on their roster (not counting Daniel Bard) opting to pick up a "depth piece" suggests an author who ... well, failed to look at that roster.
KG, any thoughts on Kyle Stroup? I spent too much of the winter reading all his professional game logs and crunching his numbers in insane detail and came to the conclusion that he's a borderline three star prospect who is somewhere around 16 in the system -- without, of course, ever having seen him pitch. Then BA put him 20, which made me happy.
Actually, it was July 29. And he had already homered once. He and Nomar (5-10-99) are the only players in the PBP era to have 3 HR including 2 GS in a game.
July 24, 2004 was his 2-run walk-off HR against Rivera in the Varitek-ARod fight game.
That's actually a pretty good job, though. It's not uncommon to see articles like this where the odds add up to 50% or (less often) 150%.
You can't focus on this one play. The better question is, why did McNamara let Buckner (.218 / .257 / .322 vs. LHP) hit against Jesse Orosco (.187 / .235 / .253 vs. LHB, second best marks in the league) with the bases loaded and 2 out in the 8th, when he had Don Baylor to pinch-hit? (The answer appears to be because Buckner told Mac "I can hit this guy" as Baylor was getting a bat out of the rack.) The even better question is why was Buckner not only starting the game but hitting 3rd against Bobby Ojeda (.150 / .207 / .206 vs. LHB, best marks in the league) when Baylor could have started, since there was no DH?
Buckner had a -.468 WPA in the series, so objectively he was indeed the goat. That the ball went through his legs when it did was essentially Divine Shorthand. But it's his job to say "Put me in, coach, I can play." It was McNamara's job to say "No, you can't, you're terrible."
Bogaerts has an .085 Home Runs / Contact; in the decade 1995-2004 the only 18 year-olds in low-A to top .050 were Delmon Young (.063) and Andy Marte (.055). The only 19 year-old to top .085 was Alex Escobar (.094). Harper this year was .071. Mike Stanton was .123.
Blue text on a light blue background? I can't be the only one suffering eyestrain from that. Is black text unhip or something?
I've been writing about this for a while. Weaver is viewed and regarded as a three-quarter delivery guy, and apparently hitters read the expected movement on his fastball from that upper arm angle. So they're expecting much more armside run than rise. But pitch/fx data shows that he gets very little run and lots of rise, as if he were coming straight over the top, apparently because of his forearm angle and wrist position. Voila, hitters are under the ball.
If this explanation is correct, you'd expect Weaver to have bigger times-around-batting-order splits than usual, as hitters adjust to the discrepancy, and sure enough, he does.
I'll second the nomination for (2B + 3B) / (1B + 2B + 3B), and add 1B / (1B + AB - H - SO + SF), which is to say 1B / (1B + Outs in Play).
It's very clear that the first three things to look at are K%, UBB%, and HR/Contact (and interesting to see that the latter generally stabilizes before HR/OFFB, which means that variations in OFFB rate do not generally bring with them the baseline HR/OFFB rate).
It's quite unclear how to best break down balls in play, however. You could do 1B/BIP and XBH/BIP independently, and they both stabilize before BABIP. If you did BABIP first, you would want to look next at XBH / Hits in Play. Alternatively, you could first do XBH / BIP and then do 1B / (1B + Outs in Play), the question being whether that stabilizes before 1B / BIP.
And then there's the very interesting question of whether finding a stat that stabilizes very *slowly* is actually desirable, because that way you can isolate the luck. It's at least informative. If in fact 1B / (1B + Outs in Play) stabilizes even slower than BABIP, then we've demonstrated that most of the luck on BIP resides in the singles, since the former removes XBH entirely.
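For anyone who wants to try these breakdowns on their own data, here is a minimal sketch of the rates discussed above. The `BattingLine` class, the function names, and the sample stat line are all made up for illustration; the formulas are the ones given in the comment, plus the usual BABIP definition.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical container for one season's counting stats."""
    ab: int
    h: int
    doubles: int
    triples: int
    hr: int
    so: int
    sf: int

    @property
    def singles(self) -> int:
        return self.h - self.doubles - self.triples - self.hr

    @property
    def outs_in_play(self) -> int:
        # AB - H - SO + SF, as written in the comment above
        return self.ab - self.h - self.so + self.sf


def xbh_per_hit_in_play(b: BattingLine) -> float:
    """(2B + 3B) / (1B + 2B + 3B)."""
    return (b.doubles + b.triples) / (b.singles + b.doubles + b.triples)


def singles_rate(b: BattingLine) -> float:
    """1B / (1B + Outs in Play); extra-base hits are removed entirely."""
    return b.singles / (b.singles + b.outs_in_play)


def babip(b: BattingLine) -> float:
    """Standard BABIP: (H - HR) / (AB - SO - HR + SF)."""
    return (b.h - b.hr) / (b.ab - b.so - b.hr + b.sf)


# Invented stat line, purely to exercise the functions.
sample = BattingLine(ab=500, h=140, doubles=30, triples=4, hr=20, so=100, sf=5)
print(round(xbh_per_hit_in_play(sample), 3),
      round(singles_rate(sample), 3),
      round(babip(sample), 3))
```

Comparing how quickly each of these stabilizes would still require a split-half or year-to-year correlation on real samples, which this sketch does not attempt.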
Kevin, Cabral's first three weeks / six outings in high-A were every bit as spectacular as his low-A performance: 6 H, 1 BB, 14 K in 12.1 IP. Then he fell apart. So the level change seems to be irrelevant; he just needs to recover what he had during the first half to be able to make the Rays.
Well, you'd better send a memo to Clay, because the last time I crunched the numbers on his Davenport Peak Translations*, he had Anthony Rizzo and Lars Anderson as the 6th and 7th best hitting prospects in baseball (after Montero, Freeman, Hosmer, Moustakas, and D. Brown. Not adjusted for position).
*Combining across levels and adjusting for the fact that promoted guys tended to gain a little value, indicating that his level-adjustments weren't quite neutral.
What we need is the sortable report covering everybody, back to the beginning of time.
And the report giving team totals, too. The argument that UZR and Plus/Minus underrate the spread of performance due to range bias is really interesting. Having the data for everyone and for teams would give us terrific tools for testing that. The obvious question: if you back team nFRAA and team UZR out of pitching totals, which makes more sense of the pitching numbers that are left behind (especially BABIP)?
Excellent point -- except that self-confidence isn't an emotion, it's an attitude or belief that is compatible with any emotion on the anxious-to-calm spectrum. I have no idea whether Papi was fueled by extra adrenaline or extra calm but I don't think any of these athletes can succeed disproportionately under pressure without extraordinary self-confidence. Nor do I think that a period of lack of success or a physical decline is going to change the *emotional* response to high-pressure situations -- just the belief in the inevitability of success.
And I think that great self-confidence increases success under pressure by essentially eliminating distracting thought processes. All the cognitive horsepower is dedicated to figuring out the strategy and tactics of the situation with complete clarity, and then the thinking brain can get completely out of the way so that "muscle memory" (procedural memory, technically speaking) can take over. When any doubt enters the equation, that distracts from the clarity of the conscious thought and impedes the purity of the unconscious skill execution.
(Yes, this is all absolute speculation, but it's informed by a ton of psych classes and the normal amount of personal experience, although I have to admit that the "sport" in which I once in a while achieved unusual success via supreme self-confidence was pinball.)
Finally, in Papi's case, the opposing pitcher's lack of confidence and corresponding diminished quality of execution was almost certainly a factor as well.
Great article, but David Ortiz did not earn the "clutch" appellation because of some perceptual psychological quirk. He won it because from the 2004 post-season to mid-August of 2006, the Red Sox won 14 consecutive games in which he had a chance for a walk-off hit, with Ortiz getting 11 walk-offs (7 of them HR) and being on base after a BB when they won the other three times. In these 20 potential walk-off PAs he hit .786 / .850 / 2.286 (11-14, 7 HR, 6 BB). All three times he was retired (twice in the ninth and once in the 10th) he won the game in a subsequent PA.
He came up eight times needing just a 1B to win the game. Rivera got him to pop up in the 9th inning of 2004 ALCS Game 4, but subsequently he went 1B, BB, 1B, BB, 1B, IBB, 1B.
He came up five times needing a 2B for a victory and went HR, HR, HR, HR, BB.
He came up seven times needing a HR to win, and went K, BB, BB, 1 out solo HR off of Scott Shields (9/6/05), 2-out 3-run HR off of Otsuka (6/11/06), GO, and 1-out 3-run HR off of Carmona (7/31/06).
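As a quick check, the slash line can be rebuilt from the PA-by-PA outcomes listed above. This is just the arithmetic; the variable names are mine.

```python
# Tally of the 20 potential walk-off PA listed above.
singles = 4      # the four 1B when a single would win it
home_runs = 7    # four when a 2B would win it, three when a HR was needed
walks = 6        # includes the one IBB
outs = 3         # pop-up vs. Rivera, K, GO

pa = singles + home_runs + walks + outs
ab = pa - walks
hits = singles + home_runs
total_bases = singles + 4 * home_runs

avg = hits / ab
obp = (hits + walks) / pa   # no HBP or SF in this sample
slg = total_bases / ab

print(pa, ab, f"{avg:.3f} / {obp:.3f} / {slg:.3f}")
```

That gives 20 PA, 14 AB, and the same .786 / .850 / 2.286 quoted above.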
I messed around with some numbers and estimated that the odds against having a hitter of Ortiz's caliber doing all this in a random simulation like Diamond Mind were a billion or a trillion to one. Hence my assertion on ESPN the next winter that if clutch hitting were a drug, Ortiz would have been the first person ever to be certified as effective by the FDA.
People make the mistake of thinking that "being clutch" (or its opposite) is a trait variable when it is actually a state variable. The 2004-2006 results are so extreme that they can only be explained by an overwhelming but ultimately fragile self-confidence. Once Papi failed a few times he stopped believing in the inevitability of his success ... but for a few years, he had that belief and it became as good as prophecy.
The only problem with the "Kalish recall lights a fire under Reddick" theory is that Reddick hit .387 / .424 / .677 from July 15th through Kalish's last Pawtucket game and "only" .363 / .387 / .637 since.
The turnaround actually happened exactly at the All-Star Break and was even more dramatic than you report: .207 / .255 / .383 in his first 71 games and .368 / .396 / .647 in the subsequent 31.
Two corrections -- first, Kalish has been playing CF vs RHP with Nava, Hall, and McDonald sharing LF.
The other one, and it's a biggy, is that they've absolutely settled on Felix Doubront as the 7th inning guy they've lacked all year. He's only faced 28 batters but he's fanned 9 and walked 1, rates that are hugely better (and already statistically significant) than his rates as a starter (72 BFP, 10 K, 8 BB). He hasn't shied away from leverage, either, as he has a 3.03 Component ERA (not BABIP-driven, either, it's .313) and a WPA which translates to 1.13. He's throwing five pitches (4-seamer, very distinct 2-seamer, cutter with slider spin and break, change, and big curve) and commanding them all. He could implode tomorrow but so far he looks like the real deal.
As I just explored over at SoSH, almost all of Beltre's improvement is in his third time around the batting order (where he's gone from bad to terrific), and his behind-in-count vs. ahead splits have widened a lot, too. That means he's been a *smarter* hitter as best as we can measure that objectively, and suggests a possible explanation for the breakout: significantly better pre-game prep.
BTW, there's a pretty good chance that Hermida was actually placed on optional assignment waivers and will still be on the 40-man roster at Pawtucket. Where he'll have a month to prove that he's worth tendering a contract to in December (and maybe even beating out Kalish for the last post-season roster spot, if there is one).
I love that the Padres signed him and I'll take a little bit of credit for that, as I did a huge and largely very glowing analysis when Jed Hoyer and I were both with the Sox (although the conclusion was that he had huge physical talent but serious weaknesses on the mental side which needed to be dealt with). Still rooting for the guy.
It's worth noting that WARP3 is not a measure of actual value, but a measure of value assuming a neutral distribution of events relative to game leverage*. After prorating this year, Lackey over the last three seasons is averaging 1.0 win of actual value (based on WPA) per year more than his WARP3. It may be luck, but he may also be saving his bullets for high-lev situations.
*And it's based on rate components for hitters when it could just as easily be based on change in run expectancy, while the opposite is true for pitchers.
It's seldom mentioned that the crappy way earned runs are calculated often contributes a bit to these differences. Besides the obvious issue of inherited runner support, there's the bogus way that pitchers are absolved from all responsibility after they should have gotten three outs in an inning, no matter how hard they're hit (any batter who reaches bases after the deserved third out and then scores without benefit of a subsequent error should be an earned, not an unearned run, unless the third out would have ended the game).
In Buchholz's case, he coughed up 4 "unearned runs" on 4/17 after an error with a delta-RE of 1.5. Oddly enough, he has subsequently pitched out of error-instigated jams at a better than average rate, negating that difference, but he has also gotten 0.9 extra runs of inherited runner support from the pen. His "True ERA" (adjusting for inherited runners plus a similar adjustment for errors, crediting the delta-RE instead of the number of actual unearned runs) is 2.72.
More widespread use of such a True ERA would nicely take out a little bit of noise.
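One way to read that adjustment, as a rough sketch only: start from all runs charged to the pitcher, add back any better-than-average inherited runner support, and subtract the run expectancy the errors added rather than the unearned runs that actually scored. The function name, its parameters, and the season totals below are hypothetical, and this is not necessarily the exact calculation behind the 2.72 figure.

```python
def true_era(total_runs: float, innings: float,
             extra_bullpen_support: float, error_delta_re: float) -> float:
    """Hypothetical 'True ERA' per 9 innings.

    total_runs            -- every run charged to the pitcher, earned or not
    extra_bullpen_support -- runs the pen saved him beyond an average pen
                             (added back, since they flatter his line)
    error_delta_re        -- total rise in run expectancy caused by errors
                             (removed in place of actual unearned runs)
    """
    fair_runs = total_runs + extra_bullpen_support - error_delta_re
    return 9.0 * fair_runs / innings


# Invented season totals, only to show the mechanics.
example = true_era(total_runs=50, innings=150.0,
                   extra_bullpen_support=0.9, error_delta_re=4.0)
print(round(example, 2))
```

The point of the design is that the pitcher is charged the same amount for an error regardless of how the rest of the inning happens to play out.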
Re Kalish: I don't think any club values "makeup" more than the Red Sox, and from what I've read Kalish's is terrific. The last guy who profiled the way Kalish does -- solid regular tools but maybe not a star's, terrific makeup -- was Dustin Pedroia. Now, I'm not saying Kalish will be the 2013 AL MVP, and neither are the Sox. But I think it's very likely that they give him J.D. Drew's job in 2012 because they've had so much success with guys like him and are convinced that he will be better than expected by scouts who are looking only at tools. Far from a cloudy future, he may have the clearest of all the Sox position player prospects.
Kevin, the Oscar Tejeda line looks like the opposition is attacking him like a guy who just had a 643 OPS in low-A (which he is); if they throw everything over the middle of the plate and you're good, you will have a .295 Iso and .003 IsoD. My question: how much scouting of the opposition and going over opposing hitters is there at this level? He's been hitting 6th for Salem; at what point do they start pitching him like they would a #3 hitter based just on his numbers so far?
Just to rub it in for Mets fans, Jonathan Papelbon has made three appearances at home and two of them were with the score tied in the ninth.
Just a note: the historical correlation of manager to team BABIP is very high, seemingly much higher than you could explain by changes in defensive personnel. So what you're identifying here as "team" includes not only the defensive skill of the fielders, but the quality of their coaching.
At about the same time as I did that study I did a correlation study of BABIP for pitchers who changed teams -- which in fact was the first post-Voros study of any kind to demonstrate that BABIP was a pitching skill (all of this work is buried in the bowels of rec.sport.baseball). I seem to recall that the luck % that I came up with (based on the r^2 of the regression) was less than 75%. I'll have to dig up that study and think about the results.
All this is true, but I'm specifically talking about the difference between teams that have every draft prospect sit down and take at least one psych test of some sort and those that don't. According to KG, the latter easily outnumber the former. If the former are in fact kicking the butts of the latter in drafting and developing players, eventually the whole industry will come around and the edge will go to the organizations who do the best testing. But this may take ten or twenty years.
You might try thinking of a team that is generally regarded as being on the cutting edge of everything and has had extraordinary recent draft success, especially with overperforming players whose success has been credited to their outstanding makeup. You would think that the clubs who aren't trying to measure makeup would wonder whether said club were doing so, and in fact with some success.
There's definitely a fascinating psychological effect at work here. Ortiz hit .286 / .364 / .616 in 173 PA from June 6th to the day the PED story broke, and (after going nearly 0 for his next two weeks, when by his own testimony he wasn't sleeping at night) .290 / .397 / .619 in 184 PA from August 14 to the end of the season. I watched all but a handful of games and scored every pitch and he certainly didn't look awful to me; he looked like David Ortiz only a couple of years older.
So the psychological factor at work here appears to be observer bias (and I'm not asserting that my view was necessarily neutral, either, although I suspect pitch/fx data could help back it up). There isn't even a consensus "what our eyes could see."
Re the batting order: Ellsbury is definitely 1 and Scutaro is probably 9.
3-4-5-6 is toughest but the key is that Ortiz and Drew will be separated and I just don't see Drew hitting 7 behind Beltre or Cameron. That's a big reason not to hit VMart 3, the other being that when he's not in the lineup, you have to do much rearranging.
I'm pretty sure Tito would like to do (revitalized) Ortiz-Youkilis-VMart-Drew but he may start the season off Drew-Youk-Ortiz-VMart.
Cameron may hit 7 and Beltre 8, hard to say.
According to mlb.com*, Bedard won't be ready until June at the earliest, so I think it is a camp battle, at least at the beginning.
Isn't there a massive free-for-all for Erik Bedard's eventual spot, between Doug Fister, Lucas French, Jason Vargas, Yusmeiro Petit and Garret Olson?
I found the SIERA articles struck a nice balance. In fact, and perhaps quite ironically, if they hadn't stopped to explain Applying Park Factors 101 at the start of the first article, I wouldn't have caught that they did them wrong (how did everyone at BP miss that?). This is another reason why clarity is good.
All the glossary definitions should have both rewritten plain-English explanations and complete technical descriptions. There may well be other methodological mistakes that have gone undetected -- in fact, I've found that PADE isn't remotely correct at all. I haven't been able to figure out why from the available description of the method.
The field does need more writers and speakers who are capable of explaining the complexities to a wider audience. I've had some success doing that with physics and I hope to get a chance to do that with baseball at some point. But it is so much easier to write for an assumed informed audience that I think many of us are just sucked into writing for the cognoscenti by sheer inertia.
That OPS is initially difficult to grasp is another good reason to bury it. Some version of RC/27 would not only be more accurate but much more transparent to the average fan. Call it HRA for Hitter's Run Average and every fan would instantly understand what it meant.
The big problem with that, of course, is that different people would be getting different numbers depending on which metric they used. I've been working for a while now on improving BaseRuns, which I (and many others) think is the best put-together metric conceptually; if we could come up with a version with more terms that was clearly more accurate than RC/27, EqA, Linear Weights, etc., we might shoot for universal acceptance.
I don't follow this logic at all. As far as I can tell, the minimum effect of GB happens when GB = FB + PU in either form of the metric. When that happens, the linear term, the squared term, and the interaction terms all become zero. When GB > FB + PU or GB < FB + PU the term becomes non-zero and you start to see GB loading on the metric. Your final equation does reflect the reality of the situation (GB rate is minimized at an unknown value) but your constant e is just an unknown portion of the overall constant a + b + e.
In general, I don't think there's any rationale for keeping a term as both linear and squared if the squared term is significant and leaving out the linear term improves the overall regression. In this case, the interactions of GB with itself and with K and BB rates appear to be so important that if you include them you don't need to include the term directly. That may make the seeming illogic of not having the term directly more palatable to consider.
The one problem I can see in general is that a pitcher with GB = FB + PU is not an average pitcher and yet he's the baseline that's determining the constant (i.e., he's contributing your unknown variable "e" to it). I would have begun by normalizing all the data, so that GB_FB = 0 meant a pitcher with an average rate. This would be the best solution to the problem you're worrying about, since by definition the effect of GB can be regarded as minimized for an average pitcher. Then you run the regressions and you convert the coefficients to useful ones by reversing the normalization.
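Reversing the normalization is just algebra. Here is a minimal sketch for a single quadratic term, assuming "normalizing" means subtracting the league mean (centering) and nothing more; the coefficient values and the mean are placeholders, not fitted numbers.

```python
def uncenter_quadratic(a: float, b: float, c: float, mean: float):
    """Convert y = a + b*z + c*z**2, with z = x - mean,
    into coefficients on the raw variable x.

    Expanding gives:
      y = (a - b*mean + c*mean**2) + (b - 2*c*mean)*x + c*x**2
    """
    return (a - b * mean + c * mean ** 2, b - 2 * c * mean, c)


def predict(coefs, x: float) -> float:
    """Evaluate constant + linear*x + squared*x**2."""
    const, lin, sq = coefs
    return const + lin * x + sq * x ** 2


# Placeholder values: coefficients fitted on centered data, hypothetical mean.
centered = (4.2, -1.0, 3.0)
league_mean = 0.05
raw = uncenter_quadratic(*centered, league_mean)

# Both forms give the same prediction for any pitcher.
x = 0.20
print(predict(centered, x - league_mean), predict(raw, x))
```

Note that the squared coefficient is unchanged, while the raw linear coefficient picks up a piece of it. So a centered fit that drops the linear term still implies a non-zero linear term on the raw scale whenever the mean is not zero.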
40 IP is actually lower than I think might be safe; that's about 170 BFP and the Y2Y correlations seem to start falling off more steeply below 200. But probably no big deal*.
You really should try removing the straight GB term (the rationale being that you've already got it squared and there's no logic that says it needs to be fully quadratic) and see what happens with the GB*BB one. I'm just personally curious because I've done so many of these multiple regressions and I've seen a lot of funky things happen when you take out one term.
*It's worth noting, though, that increasing the sample size with noisy data can give you worse (less significant) regression terms.
First of all, and this is a big we-need-to-start-over-and-run-the-numbers-again mistake (although I think ultimately we're only talking about tweaking the coefficients): I am fairly certain that the (Pete-Palmer, Total Baseball designed) Park Factors in the Lahman database do not have to be cut in half, because they are already designed to be straight multipliers. Compare them to the straight Run Indexes published each year in the Bill James Annual, or plow your way through the technical explanation at
Biggest unanswered question: what's the minimum BFP for inclusion? I've found no loss in year-to-year K/PA correlation down to 260. If you didn't go that low, you can increase your sample size.
My biggest disappointment is that you started with ERA. Granted, that's the stat we look at. But there's no good reason to ignore the (very accurately) quantifiable errors in RA caused by good or bad inherited runner support. You have that data here ("Fair RA," IIRC).
And you probably should have wrestled with R vs ER. Personally, I believe in keeping track of UER but doing it exactly the way you adjust for inherited runners -- the pitcher is credited with the average change in Run Expectancy caused by the error rather than the number of UER that actually end up scoring. (This actually only works for ROE; for errors leading only to base advances you need a "subsequently rendered moot" adjustment, so it does get tricky.) I bet there's a correlation of GB% to errors and hence UER ... you may have been better off regressing to Fair RA (adjusted only for inherited runners) with a separate term estimating ER/R. Or regressed to Fair RA and used a fixed ER/R, which is just using RA but scaling it to look like ERA.
Finally, I've never kept a term with p = .56 no matter how strongly I felt it deserved to stay. That is not trending towards significance and I think it's wishful thinking to expect it to get there with a bigger sample. Although I am at a loss to explain why it's not showing up. I would experiment with taking out the straight, non-squared GB term and see if that helps this one.
Not an artifact. Same pitchers, look at year-to-year changes in BB rate, and HR / Contact follows very mildly but with immense statistical significance. See below for the details.
When not using HR / FB (i.e., in the many cases where FB data is unavailable), you should always use HR / Contact, which is not only the most logical but also has a stronger year-to-year correlation than HR with any other denominator I've looked at.
Change in HR/Contact, adjusted for age and for any change in role, correlates to change in (BB-IBB)/(PA-IBB), r = .130, p = .000015 (n = the 1107 pitchers who faced 200+ BFP in consecutive seasons for the same team playing in the same park, 2002-2009).
(Without the adjustments, r = .126, p = .000026.)
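For readers wondering how correlations this small can be so significant, here is a rough check using the usual t-statistic for a correlation and a normal approximation to the tail (reasonable with n this large). It assumes the same n = 1107 for both versions, and it is not the original calculation.

```python
import math


def corr_significance(r: float, n: int):
    """Return (t, two-sided p) for a Pearson r from n pairs.

    t = r * sqrt((n - 2) / (1 - r^2)); the p-value uses the normal
    approximation 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p


t_adj, p_adj = corr_significance(0.130, 1107)
t_raw, p_raw = corr_significance(0.126, 1107)
print(f"adjusted:   t = {t_adj:.2f}, p = {p_adj:.6f}")
print(f"unadjusted: t = {t_raw:.2f}, p = {p_raw:.6f}")
```

Both p-values come out on the order of 1 in 50,000 to 1 in 100,000, in the same neighborhood as the figures quoted. The small differences are what you'd expect from using a normal tail instead of an exact t distribution.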
Here's food for thought for version 2.0.
I'm sure you're aware of the positive correlation between BB% and HR rates and your methodology will capture that. However, the baseball c.w. includes a historical class of pitcher (Jenkins, Hunter, et al) who featured elevated HR rates in conjunction with low BB rates. They may be worth looking at.
Essentially, good control improves K and BB rates and depresses BABIP and perhaps HR/FB. But pounding the strike zone rather than nibbling (a difference of approach, not skill, perhaps) also improves K and BB rates but may well increase BABIP and HR/FB. Daisuke Matsuzaka looks like he is actually achieving lower BABIP by nibbling (whether this is sustainable is another question entirely) -- again, a correlation of walk rate to hardness of contact that's opposite the expected.
Another guy I can think of who appears to have a true BABIP skill is Jered Weaver, who gets a ridiculous BABIP on his FB given his swing-and-miss rate. I think that's a function of the deception in his delivery, where the movement on the pitch doesn't match the upper arm angle.
One thing a metric like SIERA will allow us to do is identify the consistent under- and over-performers better than past metrics, and then we can examine them with pitch/fx and the like. Those findings may never be included in the metric but would allow us to determine after a single year's over- or under-performance whether a pitcher might be one of the rare guys who has a true, non-SIERA measurable BABIP or HR/FB skill (or lack of same).
Good! Grabbing a chunk of the true BABIP skill via the K and BB factors in the regression makes the metric even better than I realized.
I remain somewhat skeptical about whether HR/FB really should be regressed essentially to the mean, but admittedly I've barely studied it, let alone studied it as much as I've studied BABIP.
This looks like a very nice advance in metrics that assume that pitchers do not have significant variations in hardness of contact allowed. That is an incorrect assumption*, but the variations are small enough at the MLB level to make such a metric very useful, and may help get a handle on those slight but real differences.
*I've lost count of the number of ways this can be demonstrated. Most obviously, there is a significant correlation between team BABIP allowed and the three true outcomes; staffs that are better according to the latter also allow a lower BABIP just as you'd expect. But here's perhaps the best bit of evidence yet:
Take all the pitchers with 200+ BFP in consecutive seasons for the same club in the same park since we've had UZR data (2002-2009). The change in BABIP is of course correlated to the change in team UZR Range + Error, with the correlation surprisingly weak (r = .19) because BABIP is just so damn noisy. Now, adjust the yearly change in K/BFP for age and change in role (starter vs. relief) and toss that into the regression. Surprise! The change in BABIP is also significantly (p = .02) correlated to the change in K rate. That is very unlikely to be caused by changes in FB/GB ratio that correlate to K rate changes and very likely to be what it looks like: you strike out more batters, they also hit the ball less hard (and the opposite).
It's also interesting that the change in BABIP with age (again, with change in UZR included in the regression) precisely mirrors the change in HR / Contact with age. In both cases, there is no change until age 28 or 29, then a worsening. (Contrast to K rate, which significantly improves to age 27, then declines at the same rate). The worsening of BABIP after age 29 is not significant (p = .25), but in this data set neither is the improvement in BB rate by young pitchers (p = .26), and we're pretty sure that's real.
(Some of this is in a thread at SoSH looking at the impact of team defense on pitching.)
Re Lester (and Bard), it comes down to this: common sense tells us that 2007 is much less relevant than usual. What you want is an objective, pure quant system that is just as smart as common sense.
I always assumed that PECOTA worked by finding the best match for both the size and *shape* of the previous three seasons or even the whole career. The Bard and Lester predictions make me 90% certain that the system starts by taking a 3-year weighted performance average and then looking for comps.
OK, this is demonstrably wrong.
They have Adrian Beltre projected as a +1 defender.
And David Ortiz as a +12 (in 118 games!).
And of course these wacky defensive projections are presumably factored into the pitching projections.
It appears as if they are projecting the starters to be about +15 defensively ... which is a run less than Beltre alone has averaged in his career according to UZR and FB.
Folks might try taking 0.35 off all the ERAs and adding 5 wins to the team total if they want to know what this would look like if BP had a PBP defensive metric in place already.
Those Breakout and Improve numbers are indeed a year old. Get the weighted mean spreadsheet for this year's numbers.
Lester has a 39% improve, so they expect him to regress even compared to his 3-year baseline, which includes a 2007 which he broke out from. When you factor in the improved defense, they are projecting a big decline, on the order of 1.00 of ERA. Who knows why?
OK, I've figured out the answer to that one: they think that the 7.7 K and 5.7 BB are Bard's baseline rates based on his last three seasons. Which they get by including his 7.08 ERA ( 5.6 K, 9.4 BB) season in A ball (mostly low-A) three years ago, which common sense would tell you couldn't possibly be relevant.
Are we really supposed to believe that PECOTA has a sizeable body of comps who were historically, unimaginably awful their first year in the pros and two years later went an entire month in MLB fanning 50%+ of all BFP? And therefore 2007 is actually relevant?
Sure they do.
Something tells me that PECOTA *starts with* the weighted 3-year average and does its comps based on that, when I always assumed it did the comps by matching each year separately. A smarter system would know that 2007 meant nothing, because after matching 2009 and 2008 it couldn't find anything remotely resembling it.
The Beckett VORP isn't the only thing that makes less than zero sense. How about this one:
Daniel Bard is projected to go from 11.5 K/9 to 7.7, and from 4.0 BB/9 to 5.7, but he has the highest "Improve" percentage on the team, at 51%, and the fourth lowest collapse. A 50% improve is supposed to mean the same performance, so why does his weighted mean projection show a collapse? And that's ignoring the fact that the defense behind him will be 0.50 of ERA better.
The biggest relevance I can think of for baseball is on subjective evaluation of players, especially fielding. It's not quite inattentive *blindness*, but it's the same phenomenon writ smaller -- the brain saving CPU cycles by not bothering to actually see, and instead substituting what it expects to see. The jumps we "see" an outfielder get on a ball are likely to be filled in by the brain rather than perceived accurately.
A related phenomenon is when the brain tries but is unable to perceive accurately and fills in with its best guess. We've always known that "late movement" on a pitch is a physical impossibility and hence an optical illusion. What pitch/fx has shown us is that fastball movement in general is an optical illusion -- any swing and miss looks to us like more movement than any contact. Hideki Okajima gets as much FB movement as -- and more rise than -- Jonathan Papelbon, for instance. When Dan Bard had trouble locating his FB in his first few outings and got it pounded, more than one observer decreed that his FB was straight and couldn't get MLB hitters out. He then proceeded to fan half the hitters he faced in July, showing awesome "late movement" -- which was, according to pitch/fx, identical to the "straight."
Pitchers who succeed by deception are exploiting the difference between the movement the hitter expects to see based on the general style of delivery and what the actual movement is.
What I heard from my classmates subsequently (or quite possibly Dan Simon himself; I can remember it either way :)) was that they tried the experiment with varying degrees of difference between the before and after subjects and got a nice, roughly linear relationship to the percentage of people who noticed. I do know that they tried it with a gender change and that some people were oblivious to it, but I can't recall whether the percentage was 10% or 25% or some other low number. I also believe that men were better at spotting swapped women and vice versa.
Earlier experiments by Simon show that who the other person is really matters. If the social situation is such that the other is encoded generically (a construction worker on campus versus a student), change blindness rates are very high.
The most famous change blindness experiment was done by Simon and Chabris just after Simon came to Harvard (a year or two before I took his introductory cog psych course). Watch the video at
and count the number of passes made by the team *dressed in white.* Did you notice anything unusual? If not (and I didn't, nor did 80% of my classmates), watch the video a second time.
Simon was amazingly cool -- he also told us all about Google when it was early in its beta test!
Oh. My. God!
I helped design this experiment as a member of Dan Simon's Cognitive Psych Lab class at Harvard in (IIRC) the spring of 1999. I had to drop the course (it was my 5th) after we had finished the design but before we got to the point of running it. It's great to actually see the results -- all I knew about them was from running into classmates subsequently.
(And when I say I helped design it, I actually made no specific contribution. Change blindness was Simon's cognitive psych specialty and the lab class involved each of us designing our own little experiment to test it. But half our time was spent on brainstorming a group experiment, and continually refining our ideas for the experimental design. It's quite possible that the resulting design was what Simon had in mind all along, but it felt like we were making it up, not him, because he's a brilliant teacher.)
It's a small world when this shows up here.
(And yes, the bunch of psych classes I took from 1998-2002 back at my alma mater have profoundly influenced the way I think about baseball. Someone who wants to pursue sabermetrics as a career or serious hobby could do worse than major in psychology.)
Truly interesting. I think their plan A is to sign Adrian Beltre and just have the most ridiculous run-prevention team in anyone's memory, simply because I think they're appropriately skeptical about Youkilis at 3B for a long stretch of years (this is exactly the point in his career where they would be thinking of moving him the other way had they never traded for Mike Lowell).
But trading Beckett to the Rangers for a prospect haul that they pass on to the Padres -- that's a great plan B.
I'd take Clay Buchholz over Porcello and probably Hanson and Anderson, too. And just to prove I'm not a fanboy, maybe over Lester. It's kind of strange that a guy can be the #1 pitching prospect in MLB, add a very good two-seamer and boost his G/F, develop his slider into a legitimate plus pitch, have a very solid and sometimes spectacular half-season in MLB (with a FB that averaged 93-4 and touched 97), and not even be a "just missed."
There is a further complication or two that makes it even more challenging for Anthopoulos and an even closer parallel to the Ultimatum Game.
There is a sound argument that he should do whatever he can to guarantee that Halladay does not end up with either the Red Sox or Yankees. His goal, after all, is to finish ahead of one of them sometime soon. He's only going to trade Halladay within the division if he gets overwhelmed relative to the value of the two picks, and it's unlikely that either the Sox or Yankees will make that kind of offer; they will both be content to let him go elsewhere, as they were with Santana.
That leaves the Phillies and Angels, and Halladay is very unlikely to sign an extension with the latter because he wants to play on the East Coast. In fact, he's stated that the Phillies are his ideal destination.
So it's basically the Phillies (who are reportedly trying to deal Blanton's last arb year to create salary space) or bust. Trading Halladay for Michael Taylor and Travis D'Arnaud would be a huge win -- you'd save Halladay's 2010 salary, save the signing bonuses on the two picks, improve your 2011 draft position, end up with two better players more quickly, and guarantee that Halladay does not pitch the next five years for a team you're trying to catch. (The Phillies would then let Cliff Lee walk a year from now and recover some of the value they gave up with those two draft picks.)
Getting two picks for Doc and having him sign with the Phillies would be clearly less desirable; getting just the two picks and having him sign with the Yankees or Sox would be a disaster.
Of course, as good as it is when examined rationally, this trade would give Ed from Scarborough an aneurysm.
Hell of a dilemma. But if I can explain the logic so succinctly here, why can't Anthopoulos explain it to Ed? (You'd leave out the "we're trying to get worse" part, of course.)
Mark Grace had a career .383 OBP and was a legitimate GG defender. Kotchman's also an excellent defender, but he's in danger of losing his job not because of a lack of power but because he has a career *.337 OBP*. That's an enormous gap in offensive value. James Loney is an average defender with a .354 career OBP. Again, not anywhere near the player.
I think you're actually proving my point here: folks will immediately lump all power-challenged 1B together without discriminating between bad, OK, and excellent OBP, and between OK and terrific defense. (I mean, really, James Loney is similar to Mark Grace the way Bill Gates is just like Oprah.) If Anthony Rizzo can hit 15 HR a year but put up a .380 OBP and play GG defense (and I think a lot of scouts feel that is a very realistic upside for him) he will be an All-Star caliber player exactly like Grace was.
The only thing that bugs me about an otherwise extraordinarily informative writeup is the assertion that Rizzo's HR power "needs" to show up soon simply because he's a 1B prospect. While HR power is the single most important attribute for a 1B, it's not *necessary* the way K rate is for a pitcher. If Rizzo's HR power never shows up, you're still essentially describing Mark Grace there. A 1B prospect who projected to be an average defender instead of Rizzo's +10 or so and had shown an extra +10 runs of projected power would, I think, be a four star prospect, simply because of that ultimately irrational bias that a 1B "needs" to have plus HR totals. But it's the whole package that matters, not the way the value is distributed.
In 2004 David McCarty converted himself in ST into exactly the sort of player you're talking about -- the last guy on the bench who could double as an extra LOOGY. Francona brought him into a game on April 9 with one out in the top of 9th with the Sox down 8-5 and he gave up an inherited runner and one of his own. That more or less ended the experiment. He pitched the 9th inning of a blowout loss on June 12th and got the side in order, including a K of Jayson Werth. On the last day of the season, he pitched two 1-hit innings against the O's, fanning Palmeiro, Bigbie, and Newhan (all looking). If he had started with an outing like that, the story might have been completely different.
Joe, this is by and large an exemplary summary, with one glaring exception and one minor quibble.
And I have to give you credit for sticking with your guns re the supposed demise of David Ortiz. But the fact remains that after you called for his benching here and in SI, after you were so certain of his washed-up uselessness, he hit .281 / .383 / .579 in 201 PA, a line entirely predictable from his season to that point (I can say that because I did just that at the time, based on his hitting an almost identically valuable .285 / .364 / .616 in 173 PA from June 6 to the day the PED scandal broke).
Which means that his failure to hit in this series, like virtually all his teammates, is as meaningless as all the other small samples you correctly dismiss. They probably should start platooning him rather than Drew against tough LHP, but the notion that they desperately need to dump him (as opposed to looking to upgrade him, say, with a major trade for Prince Fielder), or that they should have PH Brian Anderson for him on Friday night, is just wack.
The minor quibble involves dismissing A-Rod's post-season performance -- indeed, his whole career in the clutch -- as random variation. A-Rod was a terrific post-season player through 2004 ALCS game 3, then was awful until this year, and now is terrific again. Meanwhile, his regular season "Clutch" (as measured by FanGraphs) was an average +.25 wins per year before and after (in his playoff years) and -1.06 in between. That doesn't strike me as random; that's bipolar, and I think you'd grow to be very old indeed before you replicated that pattern replaying his career in Diamond Mind. I think the mistake everyone makes re "clutchness" is thinking it's a trait variable rather than a state variable. A-Rod's neither a clutch player nor a choker in the long run (like nearly everyone else), but I honestly think he's a guy whose confidence swings to such extremes that his performance, one way or the other, is unlikely to be attributable to chance at any point in time. As long as he's feeling good about himself the Yankees are going to be a little tiny bit scarier.
You've got the six best NL teams ranked 2, 5, 6, 7, 8, and 9.
Last year, according to a properly adjusted EqA RS / RA, the six best NL teams finished 5, 10, 11, 12, 14, and 16.
To say that this initial Hit List catastrophically fails to adjust for league imbalance would be an understatement. I'm not even sure you're including the c. 2.2 win league difference you guys think exists (it was actually 9.3 last year).
I know this sounds like nitpicking, but it's a pet peeve of mine: none of these odds make a whit of sense mathematically. For instance, Viciedo has a 10% chance of being #1 (that's what 9-1 means), Poreda has a 7.7% chance (12-1 is 1 in 13), and Beckham has a 42.9% chance (4-3 is 3/7) ... so who has the remaining 39.4% chance of being the White Sox #1 prospect? Blagojevich?
(It's probably Beckham, I'd guess, which means the odds on him should have been 1-4, not 4-3).
I'm not surprised that KG made this mistake (it's pretty common in such lists, although the more usual error is to assign more than 100% of the probability), but frankly I'm a little disturbed that none of the more numerical types at the corporation caught it. Does anyone edit this stuff?
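For anyone who wants to check the arithmetic in the odds complaint above, here is a minimal sketch. The helper name `implied_probability` is made up for illustration; the only inputs are the three odds quoted in the comment, read as "against-in favor."

```python
from fractions import Fraction


def implied_probability(against: int, in_favor: int) -> Fraction:
    """Convert 'against-in_favor' odds (e.g. 9-1) into an implied probability."""
    return Fraction(in_favor, against + in_favor)


# Odds quoted in the comment, as (against, in_favor) pairs.
quoted_odds = {
    "Viciedo": (9, 1),
    "Poreda": (12, 1),
    "Beckham": (4, 3),
}

probabilities = {
    name: implied_probability(*odds) for name, odds in quoted_odds.items()
}
total = sum(probabilities.values())
unassigned = 1 - total  # probability not given to any listed player

for name, p in probabilities.items():
    print(f"{name}: {float(p):.1%}")
print(f"assigned: {float(total):.1%}, unassigned: {float(unassigned):.1%}")

# If the unassigned share really belongs to Beckham, compare it with 1-4 odds.
print(f"Beckham plus unassigned: {float(probabilities['Beckham'] + unassigned):.1%}")
print(f"1-4 odds imply: {float(implied_probability(1, 4)):.1%}")
```

Working in exact fractions, the unassigned share is 359/910, a hair above the figure you get by subtracting the rounded percentages; the point of the complaint is unchanged either way.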
A rotation with one of Beckett, Matsuzaka, and Lester, and the best four from Smoltz, Masterson, Buchholz, Penny, Wakefield, and Bowden might still be the best rotation in MLB after the Yankees and Rays. The Sox might indeed be "done" in the two-injury scenario only because the division is insanely strong, but they could still be the third best team in MLB. A two-injury scenario for the Yankees or Rays does not leave them among the three best teams in the game, IMHO.
So there is a significant difference in pitching depth.
The full, complex methodology would be smart enough to handle this and more.
For instance, you project the MLB 1B to hit .274 / .395 / .582 and the blocked AAA 1B to hit .259 / .341 / .508. In certain runs of the simulator, the MLB guy gets hurt and puts up something like a .207 / .360 / .352 through the end of June before being shut down. In those runs, the simulator calls up the kid from AAA and maybe in some of them he hits .288 / .356 / .567. Voila, extra accuracy! But without such a methodology, you are not factoring in the way having a Ryan Howard in AAA protects you from the unexpected collapse of a Jim Thome.
Even the simpler methodology I suggested would do a better job than simply averaging PT. If the Yankees suffer an injury to one of their top 5, the replacement will be whoever is pitching best among Hughes, Aceves, and Kennedy. So whoever gets more starts than projected from that group (at the expense of the others) will also be likely outperforming his PECOTA mean projection. This puts a team like the Yankees in a better position than one with just one or two extra viable rotation candidates.
I'm curious as to whether the model handles positional battles / depth correctly, or whether it could be improved, perhaps dramatically (project for next year).
Team A has one SS with a PECOTA projection of a 750 OPS. Team B has an identical player, but also a second guy with a 730 projection. You reasonably assign a 50 / 50 PT weight to the latter pair, even though (as you explain) in some universes the actual split will be 90 / 10 or 10 / 90 depending on who wins the job in ST.
I'm guessing that, in your model, team A ends up getting a better projection from SS (750) than team B (740). In the real world, of course, the opposite is true, because often the team will pick the player having the better season (or likely destined to, based on physical condition). Significantly often, the guy who wins the job will outperform his PECOTA mean projection, and (perhaps more importantly) the winner is much less likely to grossly underperform. No matter who wins the job in ST, if he's at his 10% projection in mid-May, the other guy is likely to get a shot.
The proper way to do this would be with a season simulator that has both a real and random component for variation about a player's mean projection and which dynamically adjusts PT. The simpler way would be to just use the 60% (or 75% or whatever) PECOTA projection for players at deep positions.
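To make the team A / team B point concrete, here is a toy sketch, not the season simulator described above and certainly not PECOTA. It assumes each shortstop's true season level is an independent normal draw around his mean projection with a made-up spread of 40 OPS points, and it assumes team B can always play whoever is having the better year, which overstates what a real front office can do. The function name and every parameter are hypothetical.

```python
import random
import statistics


def simulate(n_seasons: int = 50_000, sd: float = 40.0, seed: int = 0):
    """Compare a fixed 50/50 playing-time split with 'play the better guy'."""
    rng = random.Random(seed)
    fixed_split = []
    best_of_two = []
    for _ in range(n_seasons):
        # Season-long performance levels for team B's two shortstops
        # (mean projections of 750 and 730 OPS, as in the comment).
        a = rng.gauss(750, sd)
        b = rng.gauss(730, sd)
        fixed_split.append(0.5 * a + 0.5 * b)  # static PT weights
        best_of_two.append(max(a, b))          # PT goes to whoever is better
    return statistics.fmean(fixed_split), statistics.fmean(best_of_two)


fixed, best = simulate()
print(f"Team A (one SS, mean projection): 750")
print(f"Team B, fixed 50/50 split:        {fixed:.0f}")
print(f"Team B, play the better season:   {best:.0f}")
```

Under these assumptions the fixed split lands near 740, while the play-the-better-guy version comes out above team A's 750. How far above depends entirely on the assumed spread and on how well the team actually identifies the better player, so treat the size of the gap as illustrative only.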
1) You might look up the word "stereotypical" in the dictionary. It doesn't mean "widely held."
2) And why should the politics of the person writing the introduction to a baseball book matter in the least? That's what none of the conservatives have been able to explain.
3) You really expect me to believe you would have been just as upset if Rush Limbaugh or Ann Coulter had backgrounds in sports commentary and had written the foreword?
4) The funniest thing about all of this is the folks excoriating Olbermann for being biased and hence unscientific and hence inappropriate to write for BP. This is a complaint, mind you, from supporters of an administration who not only opposed the "reality-based community" (their term), but mocked it.
Wow. Wow. And wow.
Echoing CubbyFan23: there is no possible way that any of the anti-Olbermann crowd's dislike of him can begin to compare with my revulsion and hatred for the policies of George W. Bush and for the job he did as President. (Olbermann's just talking, Bush did actual severe damage to the country and probably the planet). And yet if W. had penned this year's introduction, I would be curious as to what he had to say, as someone who loves the game. I'm actually kind of fond of Bush the baseball fan; it's the only aspect of him I find remotely palatable.
Of course, it's a characteristic of the conservative mind that it sees only forests, not trees, and is generally incapable of having a mixed emotional response to different aspects of the same thing (which is why conservatives misread liberal criticism of America as hatred). So it's actually not surprising in the least that the conservatives are incapable of feeling differently about Olbermann the baseball analyst and Olbermann the political analyst.
The Phillies had more hits, more doubles, more homers, more walks, more stolen bases, and fewer GDP's -- and somehow lost.
And when was the last time you saw a team fan three straight innings with a runner on 3B and 1 out -- and they were the only 3 K's the opposing pitcher had recorded to that point?
And, BTW, the Phillies were just 1-15 with RISP.
Lester's G/F jumped because he added a new pitch, a sinking "1-seam fastball" (2-seamer variant) that is nearly as distinct from his 4-seamer as his cutter is. Compared to the 4-seamer, it breaks 4" in and 5" down (the cutter breaks 5-6" away and 3-4" down; the 4-seamer has been averaging 94-95, the cutter 90-91, the sinker 93).
His transformation into an ace is pretty much the result of that, increased strength and hence velocity, and the improved command that is very common for young pitchers.
Umm, he's better than Paul Byrd? He's the fourth starter on the team and he's pitching game 4. It's not rocket surgery.
I've been advocating for automated balls and strikes all my life -- there's a thread discussing it over at Sons of Sam Horn:
I've also been posting analyses of umpire performance in the ALCS thread. According to pitch/fx, Sam Holbrook missed as many as 35 calls Saturday night, 27 of them obviously, with the calls favoring the Rays 2 to 1. Considering how consistently Beckett was squeezed in comparison to Kazmir, it's hard to escape the conclusion that the Sox would have won the game with neutral umpiring.
There's an assumption that Youkilis is a defensive downgrade at 3B, but in 452 innings the last 3 seasons (admittedly a small sample) he's +16 runs per 150 games according to Fielding Bible Plus / Minus (this year he's even better); Lowell this year was +9. Nor does he look like a downgrade; he has much better range to his left and the rest is hard to say.
Wow, you really don't want to lose all your credibility within the first two paragraphs. The Sox victories were by 4, 8, 4, 3, 4, 6, 3, and 8 runs; the Rays were by 1 (in 11), 1, 3, 1, 2, 1, 1, 2 (in 14), 1, and 7. Yup, the Sox and Rays played seven games that were either decided by 1 run or went into extra innings, and the Rays won them all. The Rays bullpen was better in these games, but no bullpen deficit gives you a 0-7 record without you being more than a "wee bit unlucky."
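A quick back-of-the-envelope check on the 0-7 point, as a sketch only. It counts the close games from the margins listed above (assuming none of the Sox wins went to extra innings, since none is flagged) and then treats each close game as an independent trial. The 0.40 win probability standing in for a bullpen deficit is a hypothetical number, not something measured.

```python
# Rays wins as (margin, went_to_extra_innings), in the order listed above.
rays_wins = [
    (1, True), (1, False), (3, False), (1, False), (2, False),
    (1, False), (1, False), (2, True), (1, False), (7, False),
]
# Sox wins by margin; assumed to be nine-inning games.
sox_wins = [4, 8, 4, 3, 4, 6, 3, 8]

close_rays_wins = [g for g in rays_wins if g[0] == 1 or g[1]]
close_sox_wins = [m for m in sox_wins if m == 1]
n_close = len(close_rays_wins) + len(close_sox_wins)


def prob_lose_all(n_games: int, win_prob: float) -> float:
    """Chance of losing every one of n independent games."""
    return (1 - win_prob) ** n_games


print(f"close games: {n_close}, Sox wins among them: {len(close_sox_wins)}")
print(f"P(0-{n_close}) at a coin flip:       {prob_lose_all(n_close, 0.50):.2%}")
print(f"P(0-{n_close}) at 0.40 (assumption): {prob_lose_all(n_close, 0.40):.2%}")
```

Even with the Sox handicapped to a 40% chance in each close game, losing all seven stays a low-single-digit-percent outcome under this simple model.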
The slight advantage of D3 over D2 is almost certainly meaningless, especially considering that the W3 / D3 numbers are completely inaccurate, and that the adjustment for strength of schedule really should be made separately for actual, Pyth, and EqRS/RA Pyth win percentages, the raw versions of which can then be safely ignored.
(See http://sonsofsamhorn.net/index.php?showtopic=36962 for the problem with W3.)
Facts, a/k/a Fielding Bible Plus / Minus converted to Runs / 150 G.
Youkilis +7, Pedroia +12, Lowrie +24, Lowell +9, Bay -10, Ellsbury +10, Drew -5 = +47.
Teixeira +19, Kendrick -4, Aybar +10, Figgins +11, Anderson 0, Hunter -4 (he hasn't been good for years), Matthews -13 = +19.
Lowrie is almost certainly exaggerated by a SSS but, OTOH, the Sox OF numbers are massively hurt by Fenway (Bay and Manny both finished the season at -10; at the time of the trade, IIRC, it was something like -5 Bay, -20 Manny).
30 runs in a season is a pretty large edge.
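The two totals are easy to verify. This sketch just sums the Runs / 150 G figures exactly as listed above; the variable names are mine.

```python
# Fielding Bible Plus / Minus converted to Runs / 150 G, as listed above.
sox = {
    "Youkilis": 7, "Pedroia": 12, "Lowrie": 24, "Lowell": 9,
    "Bay": -10, "Ellsbury": 10, "Drew": -5,
}
second_lineup = {
    "Teixeira": 19, "Kendrick": -4, "Aybar": 10, "Figgins": 11,
    "Anderson": 0, "Hunter": -4, "Matthews": -13,
}

sox_total = sum(sox.values())
second_total = sum(second_lineup.values())
edge = sox_total - second_total

print(f"Sox: {sox_total:+d}, second lineup: {second_total:+d}, edge: {edge:+d}")
```

The listed figures give an edge of 28 runs, which is where the "30 runs" round number comes from.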
And re the Secret Sauce, a huge part of the Sox underachievement comes from a change in Papelbon's luck, which is reflected in the huge drop in WXRL and big drop in FRA, while his component RA has only gone from 2.20 to 2.46. It would be interesting to rejigger the Secret Sauce with a component RA substituted for WXRL, which can contain a significant component of "clutchiness" which is probably not predictive.
The covariance shows that good teams are likelier to outperform their Pyth than bad ones. Which would mean that part of Pyth differential is real and not random.
In fact, the correlation of Pyth differential to the previous year's differential has been statistically significant since 1959 (r = .06, p = .04). The historical timing of this effect is consistent with the thought that closer or bullpen quality affects Pyth differential in a real way, and this is confirmed by the data: if you exclude the worst underperformers (who could be expected to change their closer), the correlation strengthens, and the teams in the excluded group that did in fact change their closer did significantly better the next year than those who didn't. (Sorry I don't have the specific numbers on the last two assertions; they're buried in an Excel file from last December with 52 worksheets.)
The fascinating thing is that there was a marked *inverse* correlation from 1997-2002 that seems very unlikely to be random. (Without those six years, the correlation is much stronger, and in the last five years it's been r = .14, p = .08.) The reason I haven't published this stuff is that I've been wanting to figure that anomaly out. But the notion that all Pyth differential is "luck" is clearly wrong.