And only one OF FB, versus 3 GB and 3 PU. It's actually a stronger pick for PotD, and that it didn't earn a mention has to be some kind of embarrassing oversight.
There's actually no declining trend for 1B TAv beginning from 2012: .277, .280, .279, .284, .275*. The previous 15 years' average was .284, so there's been a drop of about .005. RF is down .003 in the same time frame.
The biggest mess in the good end of the defensive spectrum right now, though, is in LF, where offense fell below CF starting in 2011, after having averaged .013 better the previous decade (CF is up .005 and LF down .008). Last time I checked it had fewer innings by identifiable regulars than any position -- including catcher.
Meanwhile, 2B and 3B offense this year are at crazy 25-year highs, beating the second-best marks by .005 and .004. There's just a shortage of guys who hit much better than they field.
*You're reporting .278, but all the figures are high, so I'm renormalizing.
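The "drop of about .005" follows directly from the renormalized figures. A minimal sketch of the arithmetic, assuming the five listed values are consecutive seasons starting in 2012; the variable names are illustrative only.

```python
# 1B TAv by season, as listed in the comment (assumed to be consecutive
# seasons beginning with 2012, after the commenter's renormalization).
recent_tav = [0.277, 0.280, 0.279, 0.284, 0.275]
prior_15yr_avg = 0.284  # average of the previous 15 years, per the comment

recent_avg = sum(recent_tav) / len(recent_tav)
drop = prior_15yr_avg - recent_avg

print(f"recent average: {recent_avg:.3f}")
print(f"drop versus prior 15 years: {drop:.3f}")
```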
Wow, your linking code is messed up here, too.
I thought it would be amusing to use your May 3 projection for Steven Wright, and revise it after five more days of the season ... and then compare it to his season so far:
Pro 34 IP, 5 GS, 1 QS, 43 H, 14 BB, 24 SO, 4 HR, 5.29 ERA
YTD 41 IP, 6 GS, 6 QS, 25 H, 16 BB, 38 SO, 2 HR, 1.52 ERA
He had a 3.23 ERA in 120 IP, 2014-16, when you made that 4.25 ERA projection.
Re Henry Owens' upside, granted this is a small sample, but here's the leaderboard for Z-Contact%, minimum 60 IP as a starter:
That's a good #2 starter's swing-and-miss %, based on the correlation with ERA. He's nowhere near that now, of course, but I think this is a very significant upside marker.
Oh, given his defense, what TAv does Vazquez need to top, to be a first-division starter (i.e., better than league-average)?
How good to be a top 10 catcher?
I've regressed the everloving beejeesus out of Christian Vazquez's framing (using a bunch of regressions on your own data) and, when I add it to the rest of his equally elite defensive package, I still have him as easily the best defensive player in MLB, with his defense ranking just a bit behind Mike Trout's CF offense as the single most dominant skill set in baseball (based on gap between #1 and #2). If his elbow comes back 100%, I have him as the second best catcher in MLB even with the modest hitting projections he's getting. I wouldn't hesitate to put him 3rd on the 25-and-under talent list.
People who think this is wack didn't see him catch and/or haven't parsed how important catcher defense is.
Two comments. First, positional adjustments are yet one more example of why we ought to separate WAR into a retrospective version, that attempts to measure the actual value of a season just completed, and a prospective version, that measures the expected value were the season to be repeated, and hence involves isolating the talent underlying the performance.
For retrospective WAR, all you really need is an adjustment that is largely based on the average offense at the position in question. How long a period of years you use for smoothing depends on whether you are comparing players within or across seasons. I would argue that in a period where, for instance, SS offense is unusually high or low, a WAR designed to identify an MVP should reflect that. However, if you're trying to identify HOF worthiness, you'd want to use the long-time historical trend. (OK, so that's three different WARs. So far.)
There will be some complications -- right now, for instance, LF offense is less than RF offense because there aren't enough good OFers to go around. But the proper adjustment for that can be approached empirically in several ways -- for instance, it's also true (last time I looked) that fewer LF innings are played by identifiable regulars than any other position.
Prospective WAR is actually good for little more than pre-season player rankings, but of course it's based on work crucial for accurate projections (and hence comes in two versions as well, park-neutral and park-specific). Positional adjustments here are much trickier. Now you do want to identify how well players actually play other positions, the value of being a utility guy, and so forth.
The second comment concerns the Red Sox Hanley disaster. The year before, the Sox had thrown Brock Holt into the OF despite zero innings of experience there at any level of baseball, and in 382 innings (mostly in RF) he put up a +28 R/150 DRS and +17 UZR. That may have made them overly optimistic about the Ramirez conversion.
I argued all winter that Miley was the Sox's 7th best SP, after Steven Wright and Joe Kelly. Not by a big margin in either case, but clear enough.* I suggested they trade Miley for precisely what Dombrowski says he was telling other teams. And in fact it's hard to explain the trade unless they think as I did. If they've cleared a path for guys who are actually a bit better, it's a clear win.
*Kelly has tinkered with his pitch mix his entire career, and he had an absolutely novel one, throwing a career-low FB%, in putting up a 3.87 FIP over his last 8 starts. Wright's simply been good, especially so when starting on regular rest. (If you want more analytic rationales, search for posts by me at SoxProspects.com).
Re Montas: when pitchers miss their spots with fastballs, the fastballs become very hittable because, of all wacky things, they have missed their spots, not because they "straighten out." That sort of significant within-game or between-game variation in fastball movement just doesn't happen.
Pitch/fx data has revealed that human beings are completely incapable of judging vertical fastball movement by eye. What is perceived is entirely an illusion, essentially propagated backwards by the result, i.e., position of the bat relative to ball.
Significant reductions in FB movement over the course of multiple seasons do happen, and they generally go entirely unnoticed until they become so extreme as to start changing the results. At which point they are then spotted largely by people looking at the pitch/fx data.
I'm guessing they'll play Bradley in CF and Castillo in RF most of the time, and the other way around occasionally. They may give Castillo some LF starts, too, since a Hanley or Ortiz injury would open up that spot and Castillo may be better than a Nava / Craig platoon.
Two of your chief findings contradict each other. If the final ranking of the fallers should have been lower, and that of the risers should have been higher, then their average rankings should be less predictive of the outcome, not more.
My guess is that the superiority of average rankings is being driven by players with U- and inverted-U-shaped trajectories, which you're ignoring. I'm especially thinking of guys like Trot Nixon, whose BA rankings went 13-46-39-x-x-99. In retrospect, a former elite prospect who has just hit .310 / .400 / .513 in AAA while playing 65 or 70 defense in RF should have been ranked much higher than 99. Nixon was actually a big riser *at the end*, but you have him as a faller.
You can solve this riddle by using more sophisticated trajectory measures. You can divide the straight risers and fallers from the more complex trajectories. And you can look at slope, correlation, and the endpoint of the trendline through all the rankings.
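The three trajectory measures named above (slope, correlation, and trendline endpoint) can be sketched with an ordinary least-squares fit. This is only an illustration under stated assumptions: the function name is hypothetical, unranked years (the "x" entries) are simply skipped, and the Nixon sequence quoted earlier is used as sample input.

```python
from math import sqrt

def trajectory_summary(ranks):
    """Fit a least-squares line through (year index, rank) pairs.

    `ranks` holds one entry per year; None marks an unranked year, which is
    skipped here (an assumption, not something the comment specifies).
    """
    points = [(i, r) for i, r in enumerate(ranks) if r is not None]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    last_year = len(ranks) - 1
    return {
        "slope": slope,                                   # rank change per year
        "correlation": sxy / sqrt(sxx * syy),             # how linear the path is
        "trend_endpoint": intercept + slope * last_year,  # fitted rank in final year
    }

# Trot Nixon's ranking sequence from the comment: 13-46-39-x-x-99.
nixon = trajectory_summary([13, 46, 39, None, None, 99])
print(nixon)
```

A positive slope here means the ranking number is getting larger, i.e. the player is sliding down the lists. A straight-line fit labels Nixon a faller, which is the misclassification the comment describes; separating U-shaped paths would need something beyond a single trendline.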
There's no way Christian Vazquez, quite probably the best defensive player in MLB, ranks lower than 4th in the 25 and under talent list. Steamer projects him to hit .257 / .321 / .360, which is perfectly attainable, and then they heavily regress his D to arrive at a 1.9 WAR projection. But every scout says his D is Yadier-level and for real, which pulls the projection up to 2.4. Now assume that all pitch framing estimates are wildly exaggerated, and cut them all in half. OK, now Vazquez is a 4.1 WAR player, and is projected to rank 6th in MLB. In his first full season. The pitch framing estimates are probably not that exaggerated, though (StatCorner's are of the same magnitude as the ones here, with which they correlate at 0.93), and he's probably already a 6 when you include them. The only reason you'd put him after Swihart is that we don't know how good *his* pitch framing will be.
What's awesome (I keep reminding whiny Sox fans) is that the Sox had average luck overall, 2013-2014. But they had an incredibly lucky distribution of good and bad luck (I suppose that's meta-luck). As a result, instead of just missing the playoffs two years in a row, they won a WS, and then were able to both sell off impending free agents, and nab an early first-round draft pick. If you knew the overall luck was going to be average, you couldn't script it better.
Two things, Chris. First, I've seen enough of Cecchini to know that he has power potential; you may remember him hitting a huge HR in the AFL All-Star game last fall. So an adjustment to swing path could cause a dramatic improvement in HR rate. That's not true of too many guys.
Second is that you do see guys having hot and cold stretches, but you really don't see a 378% increase in HR rate too often, for an obvious reason: you need a very low baseline to get that. You can hit better because you were "seeing the ball well," or because you got lucky, or for the reasons you suggest, but I think it's really hard for a guy with a long-established .010 HR/Contact rate to hit the ball out of the park almost five times as often (.049) because of those reasons. BABIP can get boosted big-time, sure, but my sense is that you need at least a swing path tweak to get those HR results.
To put it another way (and maybe this more detailed narrative is a third reason), you've got a guy who has never hit for the power he seems he ought to possess, and is struggling mightily for the first time in his pro career, and who as a result has homered twice in his last 540 PA ... and he homers twice in the span of *five* (which in fact triggered the start of the strong finish). Your first thought is:
A) Must have got lucky.
B) I wonder if he's made an adjustment?
Oh, BTW, tonight he hit a ball off of Cobb that would have been a HR in a bunch of MLB parks, lined out, and had a LD 1B. He looks very good.
In the past Mercedes has sat at 91-96 rather than 88-92, and reportedly touched 100. He also lost velocity late last year, presumably due to stamina problems. Clearly he needs to work on both that and his mechanics, but confirming Blair's observation that he ought to be throwing harder than he does probably boosts his projection somewhat.
While Cecchini certainly belongs on this list based on his overall numbers, it's worth noting that he hit .318 / .397 / .523 over his last 122 PA (29 games). He had 4 HR in his last 81 times making contact; versus 4 HR in his previous 387 times going back to the previous year. That strikes me as much likelier to be the result of an adjustment than luck.
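The HR/Contact figures quoted elsewhere in this thread (.010, .049, and a 378% increase) can be reproduced from the counts given here. A minimal sketch; the helper name is illustrative.

```python
def hr_per_contact(home_runs, contacts):
    """Home runs per ball put in play or hit out (HR/Contact)."""
    return home_runs / contacts

before = hr_per_contact(4, 387)  # previous 387 times making contact
after = hr_per_contact(4, 81)    # last 81 times making contact

pct_increase = (after / before - 1) * 100

print(f"before: {before:.3f}, after: {after:.3f}")
print(f"increase: {pct_increase:.0f}%")
```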
Xander Bogaerts hit .305 / .394 / .464 in his first 238 PA this year and has hit .375 / .400 / .667 in his last 51. That's .318 / .396 / .501 over 289 PA, sandwiched around .147 / .193 / .212 over 250 PA. That's more than "glimpses" of the top prospect he was; that's sustained demonstration. The terrible prolonged slump, of course, started two games after they signed Stephen Drew and moved him to 3B, and largely coincided with him playing -29 (UZR) to -32 (DRS) defense there. The move to 3B didn't start the slump, but it's pretty clear that simultaneously butchering an unfamiliar position he didn't want or expect to play was not exactly conducive to his working his way out of it. He'll be a monster next year.
Ah, I should have looked that up (or some sportswriter, somewhere, should have mentioned it!). Still, moving down a round is a relatively small price.
It's important to note that the A's also gave the Sox their 2015 competitive balance pick. So, compared to keeping Lester and losing him in free agency, the cost for one year of Cespedes was moving down a handful of picks in the 2015 supplemental first round, plus the C prospect Gomes could have gotten them. Which is close to nothing. If they re-sign Lester, then they've bought the year of Cespedes and the draft pick for the extra money they're likely to have to pay Lester because he no longer will cost teams a draft pick.
There's evidence that Kelly may have a BABIP skill and hence be better than people think, i.e. his ERA-FIP may not all be air. In 268 PA, he's held 3 and 4 hitters to about .025 points of TAv below league average (via a .345 SA allowed), while fanning 13.8% and walking 10.8%. Which frankly seems impossible (the whole set of splits by batting order position, in fact, is inverted.) And that in turn suggests his low BABIP with RISP may have a real component. (Whether such shenanigans can be sustained is another question, of course.)
The thing is, the trade of a good closer is relatively rare, and the trade of a great one even more so.
Based on (B-ref) (WPA/aLI)/G at the time of the trade, none of these guys were having a season anywhere near as good as Street. Two -- Sherrill and K-Rod in 2013 (with a big asterisk for limited PT) -- were having seasons comparable to Koji Uehara this year.
Rodriguez in '11, Rauch, and Veras were having solid seasons, but not good enough for the receiving team to think they'd be getting any kind of Mo-like post-season advantage. Dotel was league-average and Myers, Capps, League, Woods, and Qualls had all been below average.
Selling the closer is certainly a myth, but there's still reason to believe that teams will [seem to] overpay for a guy perceived as a 9th-inning force.
Career numbers simply increase the sample size and also allow you to assess any trend (note that all are improving).
Of course there are confounding variables in any defensive metric. I'm not sure how many of them Clay tries to eliminate (as many as he can, I'm sure). But if one guy has allowed 0 PB and another, say, 15, that obviously represents something real that can't be entirely explained because of a difference in the staffs being handled (not sure if Clay excluded knuckleballs).
Again, if someone's throwing out 55% of runners and someone else is throwing out 10%, some of that is clearly the catcher.
A proper metric regresses to the mean to account for the confounding variables.
BP's catcher FRAA's all seem to be within the +2 to -2 runs per 150 games range, which can't be right.
Is it my imagination, or did CJ Wittmann answer the "who's your runner-up" question by extolling Alfaro for two additional paragraphs while essentially saying nothing about Swihart or Hedges? Did he slip Nick a $20 or something?
Clay Davenport calculates a defensive run number for catchers, based, I believe, on SB, CS, PB, and WP (and maybe E). Here's R/150, first career, then 2014:
Alfaro, -2, +12
Hedges, +8, +15
Swihart, +27, +36
Swihart this year, 0 PB, > 50% CS rate.
Your FRAA for minor league catchers seems broken. Swihart is +2/150 career, Alfaro +1, Hedges 0.
Excellent stuff. Zone distance and swing distance should be made available in the stats section.
I think the explanation for the fly ball data point is twofold: fly balls are the most common result of the average swing path, and the bat is tapered. I bet it's being driven by the most inside pitches, where the narrowness of the bat turns flyballs into the other three outcomes (that it appears to do so equally is the one thing that seems less than likely.)
A final note. Of course, it follows from this work that pitchers who can, over a sustained period, have a larger overall zone distance (which is to say, work the corners and avoid the heart of the plate) will allow a lower BABIP than pitchers with a low zone distance. Of course, we really have known that all along. But it's one more proof that the DIPS revolution has been largely misunderstood: the revelation was about signal versus noise, not skill variation versus uniformity (an argument I made to Voros, on Usenet, from the outset).
I just remembered this: John Farrell had a great comment at last week's Boston SABR chapter meeting that underscores how wrong it is to believe that pitching the 9th is exactly like pitching the 7th or 8th.
If you're down by 3 runs, a 2-run rally in the bottom of the 8th is worth +.078 WPA. In the 9th inning, it's worth -.047.* That suggests that there ought to be a significant difference in hitters' approaches. And some pitchers will be better suited to facing hitters as their approach changes.
Dan Brooks had earlier shown a chart revealing a huge 2013 increase in Koji Uehara's use of the splitter, which has always easily been his best pitch. Farrell was asked about that, and said that Uehara has a great sense of hitters' approaches, and that hitters are more aggressive in the 9th, which allowed him to throw the splitter more: a hitter who might lay off it in the 8th is more likely to chase it in the 9th.
There's always more to this game than we think there is. (Jim Rice's career OPS by inning, starting with the 5th: 935, 871, 854, 798, 733.)
*There's arguably a flaw in the logic of WPA for successful 9th-inning rallies. If you come back from 4 or 5 runs down, the hits and walks early in the rally get almost no WPA compared to the tying and winning hits, yet their success is ultimately just as necessary. The solution would be to take some portion (maybe all) of the total WPA of the rally and redistribute it according to changes in RE. That reflects the reality of the rally being an all-or-nothing affair, something which every participant understands and which greatly affects strategy on both sides of the ball.
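One way the proposed redistribution might look, as a rough sketch only: the function name, the `share` parameter, and the sample numbers are all hypothetical, and the rule used (reallocate in proportion to each event's change in run expectancy) is just one reading of "redistribute it according to changes in RE."

```python
def redistribute_rally_wpa(wpa, re_change, share=1.0):
    """Reallocate a rally's WPA across its events in proportion to RE change.

    wpa       -- WPA credited to each plate appearance in the rally
    re_change -- change in run expectancy produced by each plate appearance
    share     -- fraction of the rally's total WPA to redistribute (0 to 1)
    """
    total_wpa = sum(wpa)
    total_re = sum(re_change)
    if total_re == 0:
        raise ValueError("rally has no net change in run expectancy")
    return [
        (1 - share) * w + share * total_wpa * r / total_re
        for w, r in zip(wpa, re_change)
    ]

# Hypothetical four-batter rally: early hits earn little WPA, the last one most.
raw_wpa = [0.01, 0.02, 0.07, 0.40]
raw_re = [0.4, 0.5, 0.6, 1.0]
adjusted = redistribute_rally_wpa(raw_wpa, raw_re)
print([round(a, 3) for a in adjusted])
```

The total WPA for the rally is unchanged; only the split among participants moves. Setting `share` below 1.0 keeps part of the original timing-based credit.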
WPA without adjusting for defense (for hitters as well as pitchers) is too noisy to be very useful, but an adjusted WPA would be a great tool for prospective estimates of situational performance differences, which we know are real.
Another caveat with using WPA is that it can be extraordinarily sensitive to defensive support. One might, for instance, wonder how Koji Uehara failed to lead MLB in reliever WPA. The answer is that on July 6, Brandon Snyder, playing 3B, had an absolutely trivial game-ending force-out at second and threw the ball into CF, allowing the tying run to score and costing Uehara 0.5 WPA. So there went 11% of his season WPA, lost in 0.4% of his batters faced. Who knows who else suffered a similar fate -- or benefited from a play that turned a meltdown into a shutdown?
Ideally we'd have a WPA that divided credit and blame between the pitcher and defenders on every play, using UZR's methodology. And yes, failing that, there is and has always been an argument for using errors in assigning WPA. A crude binary and unidirectional implementation of our preferred methodology is still better than no adjustment at all.
I'm not as confident as MGL that the study revealed little or nothing, but I know I would have looked at a lot more data, in search of more information on the evolution of the game and of managerial thinking.
I would have looked at all starting pitching, inning by inning starting with the sixth, in reasonably close games. Break it down four ways: not the pitcher's final inning, lifted after the inning without a pinch-hitter (you'd want to isolate the leagues in the DH era), lifted afterward for a pinch-hitter, lifted mid-inning. I would classify the last bucket by five further variables: pitcher retired last man faced or not, pitcher left inherited runners or not, pitcher already charged with ER when lifted or not, reliever had a platoon advantage (or would have, had the other team not pinch-hit) that the starter lacked, or not; and whether the team was ahead, tied, or behind. (Tied probably acts like behind, but I wouldn't make that assumption to begin with.)
Now, I've just divided the last group of pitchers into 32 or 48 buckets, but we don't worry about that yet. First, let's look at the historical trends for all this data, without worrying about the results. Get a sense of how managerial strategies have shifted.
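The "32 or 48 buckets" count comes from four yes/no variables crossed with either two or three score states. A minimal sketch that enumerates them; the variable labels are shorthand invented for this example.

```python
from itertools import product

# The four yes/no variables for a starter lifted mid-inning (labels are
# illustrative shorthand for the conditions described above).
binary_vars = [
    "retired_last_batter",
    "left_inherited_runners",
    "already_charged_with_er",
    "reliever_gained_platoon_advantage",
]

def enumerate_buckets(score_states):
    """Return every combination of the binary variables and a score state."""
    yes_no = [(True, False)] * len(binary_vars)
    return list(product(*yes_no, score_states))

three_way = enumerate_buckets(("ahead", "tied", "behind"))
two_way = enumerate_buckets(("ahead", "tied_or_behind"))  # if tied acts like behind

print(len(three_way), len(two_way))
```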
Then compile the data for relief pitching, in each inning. I think I'd charge the SP with the RE of his inherited runners and credit or debit the relievers with the difference between that and actual. But maybe not.
I'd use some sort of running average or multi-year buckets to eliminate some of the year-to-year noise.
I'd spend far too much time looking at the interactions among the five mid-inning variables. But one goal is to identify any buckets that can be usefully defined as combinations of more than one variable (e.g., pitcher retired last batter + reliever had platoon advantage).
I'd have no idea what I'd discover, but there's nothing more fun than going where the data leads.
I just wrote a pretty extensive critique, with a suggested revised methodology, here:
I think the general idea here is fabulous. Unfortunately, WARP is currently a very bad metric (especially for pitchers), so that adds a lot of noise. And when you have an idea like this, you always want to try to create a model that *empirically predicts the thing it claims to predict*, rather than guessing at weighting factors.
Re not assuming the intercept is zero or the MLB minimum ... I just found the regression analysis I did for the 2006-07 free agent starting pitchers. The R^2 is .96. The equation for projected AAV is 5.5 + 2.6 * VORP wins (last 2 years weighted 70/30) +/- 0.4 years of age below / above 32 - 0.7 if re-signing. That's right, the intercept was $5.5M (95% confidence interval $3.5M - $7.5M). Talk about paying for "proven talent."
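Written out as a function, the regression equation above looks like the following. This is a sketch under assumptions: the function and parameter names are made up, the 70% weight is taken to apply to the more recent season, and the age term is read as +0.4 per year under 32 and -0.4 per year over 32. Output is in millions of dollars.

```python
def projected_aav(vorp_wins_last, vorp_wins_prev, age, re_signing):
    """Projected average annual value ($M) for a free-agent starting pitcher.

    Assumes the 70/30 weighting puts 70% on the most recent season and that
    the age term is linear around 32; both are readings of the comment.
    """
    weighted_wins = 0.7 * vorp_wins_last + 0.3 * vorp_wins_prev
    age_term = 0.4 * (32 - age)          # positive when younger than 32
    re_sign_term = -0.7 if re_signing else 0.0
    return 5.5 + 2.6 * weighted_wins + age_term + re_sign_term

# A pitcher with zero VORP wins at age 32 still projects to the intercept.
print(projected_aav(0.0, 0.0, 32, False))
# Hypothetical 30-year-old with 4 and 2 VORP wins in the last two seasons.
print(round(projected_aav(4.0, 2.0, 30, False), 2))
```

The first call shows the point of the comment: with no measured value at all, the fitted salary is still $5.5M.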
I do think I saw this come down in a subsequent season, but I've never forgotten that.
have been skill, not luck. Why are these the only comments on the planet that can't be edited?
Most of that seems to have not been skill, not luck. Compared to the rest of the league, they had 137 more hits than expected based on their LD / FB / GB breakdown. But that subdivides as follows:
0.5 extra LD singles
4.5 extra LD doubles / triples
There's not a lot of variation in hardness of line drives, by definition. Not much luck here; 4 or 5 more balls found gaps than expected.
7 extra FB singles
78.5 extra FB doubles and triples
FB singles are almost all luck, but FB XBH are mostly the product of hitting the ball really hard (and playing half your games in Fenway Park). The Sox had a .112 XBH-BIP, the rest of the league .054.
39 extra ground ball singles
7.5 extra ground ball doubles and triples
We know that GB BABIP is a function of hardness of contact, hence much or all of this would be predicted by the previous set of numbers. If you can have twice as many fly balls go for XBH, I think that 10% more ground balls going through holes seems reasonable.
But let's say that the 4 line drive XBHs and 7 fly ball singles were luck, as well as 10% each of the fly ball XBH, and the ground balls. That knocks the .329 BABIP all the way down to .325.
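The breakdown can be tallied to confirm it accounts for all 137 extra hits, and to see how many hits the "luck" allowance removes. A sketch only: it follows the comment in rounding the line drive XBH to 4, and it reads "10% each of the fly ball XBH, and the ground balls" as 10% of the fly ball XBH plus 10% of all extra ground ball hits.

```python
# Extra hits versus expectation, by batted-ball type and hit type,
# as itemized in the comment.
extra_hits = {
    ("LD", "1B"): 0.5,
    ("LD", "XBH"): 4.5,
    ("FB", "1B"): 7.0,
    ("FB", "XBH"): 78.5,
    ("GB", "1B"): 39.0,
    ("GB", "XBH"): 7.5,
}

total_extra = sum(extra_hits.values())

# Hits written off as luck under the comment's allowance.
gb_extra = extra_hits[("GB", "1B")] + extra_hits[("GB", "XBH")]
luck_hits = (
    4.0                                  # line drive XBH (comment rounds 4.5 to 4)
    + extra_hits[("FB", "1B")]           # all fly ball singles
    + 0.10 * extra_hits[("FB", "XBH")]   # 10% of fly ball XBH
    + 0.10 * gb_extra                    # 10% of ground ball hits
)

print(f"total extra hits: {total_extra}")
print(f"hits attributed to luck: {luck_hits:.1f}")
```

Under this reading only about 23.5 of the 137 extra hits are written off as luck, which is why the adjusted BABIP barely moves.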
The 2012 team, at the time of the Dodgers trade, had a second order Pyth deficit approaching a projected 10 runs (IIRC), and that was largely due to astonishingly bad hitting in two well-defined high-leverage situations. Trailing by 1-3 runs in the 9th or later, the team went 4/7, 2 2B, 3B, 2 SF in their first two games in April, then hit .139 / .191 / .235 in 199 PA thereafter. And they were shut out in 21 extra innings at home, hitting .164 / .228 / .178. Their approach was palpably terrible; after June 15 they had 34 SO and 2 BB in those opponent save situations, including a 23 K, 0 BB stretch in 87 PA from June 16 to September 4. You have to blame all that on the manager, whose job it is to establish the team's psychological tone, and identify and fix counter-productive mind-sets and approaches.
The other thing is that almost half of the PECOTA rosy outcomes were really not that surprising, and reflect the limitations of PECOTA more than the Sox' good fortune. Victorino, Saltalamacchia, and Nava were genuine surprises. While no one expected Buchholz to be this good, some of the apparent over-performance is PECOTA's inability to recognize that he has a real and large BABIP skill, and some is its inability to factor in the impact of his 2011 back injury on 2012, where a slow start added 1.25 to his ERA.
PECOTA has been predicting Ortiz to get old now for a bunch of years, and it's yet to happen. And Lackey was actually a good bet to return to his '08-09 form after TJ surgery; I bet PECOTA doesn't accurately model a guy whose elbow shredded gradually over two full seasons. As for Doubront, I have nothing objective to support his matching my personal projection, and I'll tell you right now that he'll kill his PECOTA projection in 2014. Sometimes you just have to watch a guy play.
Why does PECOTA like the Cardinals? It simply appears to be awfully stubborn about accepting the reality of the season just played. David Ortiz had a mean projected TAv of .293 and put up a .324, his 90th percentile on the nose; PECOTA's current / final projection is .297. That's something like 83% original guess, 17% reality.
The situation is even worse with pitching, where the projections seem to rely on FRA, which pretty clearly doesn't work. After two seasons with BABIPs of .197 and .203, and hence WHIPs of 0.78 and 0.64, PECOTA projected a .289 BABIP and 1.11 WHIP. After Uehara put up a .189 BABIP and 0.56 WHIP, PECOTA projects his WHIP at 1.01 (and his ERA at 2.51). It's hard to overstate how colossally wrong these rest-of-season projections were, all along.
Essentially, PECOTA is acting like the Republicans who still, on the eve of the election, thought Mitt Romney was going to win. Irony much?
Yes, that explains why we do it the way we do. It doesn't explain the weird lack of interest that the sabermetric community seems to have for a full context-neutral pitching metric. Yes, we know that some of BABIP is luck, but we know that much of it isn't (for instance, on a start-by-start basis, most pitchers have a significant correlation between FIP and BABIP). What I want is that full context-neutral metric (essentially a pitcher's TAv allowed) and the same thing with league-average BABIP substituted, and better yet, with the smartest possible estimate of the pitcher's true BABIP skill.
(I might mention that every metric I've suggested I used to do while with the Red Sox, using my simplified (conceptually) / expanded (number of terms) version of Base Runs, so they are very doable! You do things like substitute league-average rates of runners out on base and passed balls. All very straightforward.)
What is most frustrating about the neglect of RE24 for hitters in favor of a context-neutral approach is that *we do the precise opposite for pitchers.* RA, once adjusted for inherited runners, is essentially a measure of pitchers' net delta Run Expectancy, with context. We should be using linear weights or Base Runs to calculate every pitcher's context-neutral RA, but nobody does that. Bill James' ERC is an attempt at an estimate, but we can do better.
In a perfect world, we have a context-neutral run value, one that includes the base-out situation, and one that includes the inning / score situation by converting WPA back to equivalent changes in RE. For both hitters and pitchers. It's the middle figure that's "real" and not any kind of estimate; the other two numbers attempt to subtract context that we think may lack predictive value, and add further context that we also believe is non-predictive. But having all three measures (including some addenda that measure the contribution of leverage and opportunity alone) handy for everyone will allow us to address many interesting questions.
It's interesting that you point out that Betts needs to work on his plate discipline, since ... [cough] ... he had the highest BB/K ratio of all 211 players in low-A (minimum 300 PA) before his promotion to high-A, was voted as having the best plate discipline by league managers in BA's annual survey, and has more BB than K at every level he's played at, including high-A after his promotion.
Clay Davenport's Peak (age-adjusted) Translations have him ranked as the 3rd best hitter in the SAL relative to his position, after Bird and Rosell Herrera (neither of whom have been challenged with a promotion to high-A), and, after his promotion, the 3rd best hitter in the Carolina League, after Garin Cecchini and Michael Ohlman.
The Papelbon story is missing an important episode. After his first year as closer, the Sox did indeed decide he would have significantly more value as a mid-rotation starter, and planned to make that conversion. They spent all winter trying to figure out who their closer might be, signing Joel Pineiro as a potential starter-to-closer conversion, signing J.C. Romero, and trading for Brendan Donnelly. (I spent the winter trying to convince them to give Tim Wakefield a shot.) During ST, Papelbon came to the team and solved the problem by saying he wanted to close after all.
Next thing to look at: the impact of the identity of the hitter, and of the pitcher.
For each, I'd look at three factors: experience, overall quality, and strike zone command. Do veteran hitters have a smaller zone than rookies? Among hitters with three previous years of experience, do good hitters (based on the last three years) have a smaller zone than bad ones? And do hitters who draw a lot of walks have a smaller zone than those who don't? (You can also look at this by including strikeout totals, and I'd use K% - 2 * BB% as a measure of overall strike zone command.) The last two questions are obviously related, so you might want to also create four groups, including hitters who aren't very good overall but draw a lot of walks, and hitters who are good but don't walk much.
And of course you'd look at the pitchers the same way.
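A rough sketch of how the command measure and the four-way grouping might be set up. Everything here beyond the K% - 2 * BB% formula is an assumption: the median split, the field names, and the four made-up hitters are only for illustration.

```python
from statistics import median

def zone_command(k_pct, bb_pct):
    """K% - 2 * BB%. For a hitter, lower should mean better command (my inference)."""
    return k_pct - 2 * bb_pct

def four_groups(hitters):
    """Split hitters into good/bad x walks/few walks at the sample medians.

    hitters: list of dicts with 'name', 'quality' (any overall hitting rate
    over the last three years) and 'bb_pct'. The median cutoffs are an assumption.
    """
    quality_cut = median(h["quality"] for h in hitters)
    walk_cut = median(h["bb_pct"] for h in hitters)
    groups = {}
    for h in hitters:
        key = (
            "good" if h["quality"] > quality_cut else "bad",
            "walks" if h["bb_pct"] > walk_cut else "few walks",
        )
        groups.setdefault(key, []).append(h["name"])
    return groups

# Made-up hitters, one per group.
sample = [
    {"name": "A", "quality": 0.300, "bb_pct": 12.0},
    {"name": "B", "quality": 0.300, "bb_pct": 5.0},
    {"name": "C", "quality": 0.240, "bb_pct": 13.0},
    {"name": "D", "quality": 0.240, "bb_pct": 4.0},
]
groups = four_groups(sample)
```

You would then compare the measured zone size across the four groups. For pitchers the same code applies, with the direction of the command measure reversed.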
Oh, and though Cano has the edge in raw hitting stats, he also has a big leverage split that Pedroia lacks. Pedroia's WPA per 150 games is 1.29, while Cano's is 0.99. So Pedroia has actually been the more valuable player (real, measurable, value, not necessarily predictive) at the plate.
Oh, and he has a 7.5 run career edge in baserunning.
While the "resident defensive metric" gives Cano the edge over Pedroia, +5.5 to -0.7 career R/150, UZR has Pedroia at +9.5 and Cano at -3.2. DRS has Pedroia with a +12 to +2 edge, Total Zone has him with a +11 to +5 edge, and Fans Scouting Report (2009-12) has him with a +13 to +7 edge.
In general, FRAA correlates poorly with the other three metrics, while the others correlate much better with each other (especially for infielders). I know that FRAA tries to eliminate scoring location bias, but with so many results like this one, one has to wonder whether the methodology is sound.
I question the entire notion that you can see a pitcher once and draw any kind of firm conclusion about him. Owens' line since this report was filed: 11 0 0 0 7 19. Personally, I can't imagine the pitcher described above doing that.
Oh, and by the way, pitch/fx data makes it pretty clear that human beings are incapable of judging vertical movement on fastballs. Anything in a scouting report about fastball movement other than an assessment of armside run is a description of an illusion, reverse-engineered from effectiveness, and actually derived from command and deception.
The thing about the splitter and the changeup is that ordinarily the two pitches behave the same, so that pitchers throw one or the other. There were only three MLB starters last year who threw both: Freddy Garcia and Brad Penny had splitters that broke more vertically than most and hence resembled their four-seamers, and used them in tandem with conventional changes that had the usual run in on RHB; while Tim Hudson's change was a bit like Buchholz's, with less armside run than usual (intermediate between his two- and four-seamer, though), so he was able to pair it with a splitter that had a lot of run.
But none of these guys had a velocity separation between the two pitches of more than about 1 MPH, which is to say, negligible.
Buchholz is throwing an unusually slow change with (even more unusually) even less armside run than his 4-seamer, and a somewhat hard splitter with good, typical run in. There's no velocity overlap at all between the pitches. I haven't gone back to see if anyone else has done this in the pitch/fx era, but I think it's fair to say that, for hitters, it's something many or most have never had to deal with.
The pitches ought to be easy to confuse, so the key question may be whether the splitter is making the change even more effective.
Here's one you missed: 3B Michael Almanzar, he of the $1.5M Red Sox bonus four years ago, went 3-3, 2B, HR, HBP for Salem, and has hit .467 / .568 / .967 over his last 8 games, with 24% of his season's walks, 5% of his strikeouts, and 4 of his 9 homers. He's six days older than Brandon Jacobs, a better defender, and now has better offensive numbers on the season. Worth watching.
Shaq broke his second and longest strikeout streak at 21 today, with a sharp line drive to RF.
Update, 7/17: After going 0-5, 5K, to give him 15 consecutive AB with a K since breaking the 16-for-16 streak, he now owns the 2 longest streaks since 2005. He goes for his own record tomorrow.
If the Red Sox were to trade both Daisuke Matsuzaka and Aaron Cook and move Felix Doubront to the pen in order to trade for Dempster, Garza, or any of the other "usual names" who might provide a "stabilizing force" (which I imagine is to be contrasted to any kind of actual likely upgrade) for the rotation, they would then be rightly mocked and excoriated right here and everywhere else online. (You'd probably do that to make room for Hamels, Greinke, or King Felix if he were available, but that doesn't seem likely.) And the notion of a team with 17 MLB pitchers on their roster (not counting Daniel Bard) opting to pick up a "depth piece" suggests an author who ... well, failed to look at that roster.
KG, any thoughts on Kyle Stroup? I spent too much of the winter reading all his professional game logs and crunching his numbers in insane detail and came to the conclusion that he's a borderline three star prospect who is somewhere around 16 in the system -- without, of course, ever having seen him pitch. Then BA put him 20, which made me happy.
Actually, it was July 29. And he had already homered once. He and Nomar (5-10-99) are the only players in the PBP era to have 3 HR including 2 GS in a game.
July 24, 2004 was his 2-run walk-off HR against Rivera in the Varitek-ARod fight game.
That's actually a pretty good job, though. It's not uncommon to see articles like this where the odds add up to 50% or (less often) 150%.
You can't focus on this one play. The better question is, why did McNamara let Buckner (.218 / .257 / .322 vs. LHP) hit against Jesse Orosco (.187 / .235 / .253 vs. LHB, second best marks in the league) with the bases loaded and 2 out in the 8th, when he had Don Baylor to pinch-hit? (The answer appears to be because Buckner told Mac "I can hit this guy" as Baylor was getting a bat out of the rack.) The even better question is why was Buckner not only starting the game but hitting 3rd against Bobby Ojeda (.150 / .207 / .206 vs. LHB, best marks in the league) when Baylor could have started, since there was no DH?
Buckner had a -.468 WPA in the series, so objectively he was indeed the goat. That the ball went through his legs when it did was essentially Divine Shorthand. But it's his job to say "Put me in, coach, I can play." It was McNamara's job to say "No, you can't, you're terrible."
Bogaerts has an .085 Home Runs / Contact; in the decade 1995-2004 the only 18 year-olds in low-A to top .050 were Delmon Young (.063) and Andy Marte (.055). The only 19 year-old to top .085 was Alex Escobar (.094). Harper this year was .071. Mike Stanton was .123.
Blue text on a light blue background? I can't be the only one suffering eyestrain from that. Is black text unhip or something?
I've been writing about this for a while. Weaver is viewed and regarded as a three-quarter delivery guy, and apparently hitters read the expected movement on his fastball from that upper arm angle. So they're expecting much more armside run than rise. But pitch/fx data shows that he gets very little run and lots of rise, as if he were coming straight over the top, apparently because of his forearm angle and wrist position. Voila, hitters are under the ball.
If this explanation is correct, you'd expect Weaver to have bigger times-around-batting-order splits than usual, as hitters adjust to the discrepancy, and sure enough, he does.
I'll second the nomination for (2B + 3B) / (1B + 2B + 3B), and add 1B / (1B + AB - H - SO + SF), which is to say 1B / (1B + Outs in Play).
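Spelled out as code, in case the notation is hard to parse. The function names are mine, and the sample batting line is made up purely to show the arithmetic.

```python
def xbh_share_of_non_hr_hits(singles, doubles, triples):
    """(2B + 3B) / (1B + 2B + 3B)"""
    return (doubles + triples) / (singles + doubles + triples)

def singles_rate_on_non_xbh_balls_in_play(singles, ab, hits, so, sf):
    """1B / (1B + AB - H - SO + SF), i.e. 1B / (1B + Outs in Play)."""
    outs_in_play = ab - hits - so + sf
    return singles / (singles + outs_in_play)

# Made-up line: 500 AB, 150 H (100 1B, 30 2B, 5 3B, 15 HR), 100 SO, 5 SF.
xbh_share = xbh_share_of_non_hr_hits(singles=100, doubles=30, triples=5)
singles_rate = singles_rate_on_non_xbh_balls_in_play(
    singles=100, ab=500, hits=150, so=100, sf=5
)
```

The second denominator removes extra-base hits from balls in play entirely, so it only asks how often a ball that stayed in the park and didn't go for extra bases fell in for a single.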
It's very clear that the first three things to look at are K%, UBB%, and HR/Contact (and interesting to see that the latter generally stabilizes before HR/OFFB, which means that variations in OFFB rate do not generally bring with them the baseline HR/OFFB rate).
It's quite unclear how to best break down balls in play, however. You could do 1B/BIP and XBH/BIP independently, and they both stabilize before BABIP. If you did BABIP first, you would want to look next at XBH / Hits in Play. Alternatively, you could first do XBH / BIP and then do 1B / (1B + Outs in Play), the question being whether that stabilizes before 1B / BIP.
And then there's the very interesting question of whether finding a stat that stabilizes very *slowly* is actually desirable, because that way you can isolate the luck. It's at least informative. If in fact 1B / (1B + Outs in Play) stabilizes even slower than BABIP, then we've demonstrated that most of the luck on BIP resides in the singles, since the former removes XBH entirely.
Kevin, Cabral's first three weeks / six outings in high-A were every bit as spectacular as his low-A performance: 6 H, 1 BB, 14 K in 12.1 IP. Then he fell apart. So the level change seems to be irrelevant; he just needs to recover what he had during the first half to be able to make the Rays.
Well, you'd better send a memo to Clay, because the last time I crunched the numbers on his Davenport Peak Translations*, he had Anthony Rizzo and Lars Anderson as the 6th and 7th best hitting prospects in baseball (after Montero, Freeman, Hosmer, Moustakas, and D. Brown. Not adjusted for position).
*Combining across levels and adjusting for the fact that promoted guys tended to gain a little value, indicating that his level-adjustments weren't quite neutral.
What we need is the sortable report covering everybody, back to the beginning of time.
And the report giving team totals, too. The argument that UZR and Plus/Minus underrate the spread of performance due to range bias is really interesting. Having the data for everyone and for teams would give us terrific tools for testing that. The obvious question: if you back team nFRAA and team UZR out of pitching totals, which makes more sense of the pitching numbers that are left behind (especially BABIP)?
Excellent point -- except that self-confidence isn't an emotion, it's an attitude or belief that is compatible with any emotion on the anxious to calm spectrum. I have no idea whether Papi was fueled by extra adrenaline or extra calm but I don't think any of these athletes can succeed disproportionately under pressure without extraordinary self-confidence. Nor do I think that a period of lack of success or a physical decline is going to change the *emotional* response to high-pressure situations -- just the belief in the inevitability of success.
And I think that great self-confidence increases success under pressure by essentially eliminating distracting thought processes. All the cognitive horsepower is dedicated to figuring out the strategy and tactics of the situation with complete clarity, and then the thinking brain can get completely out of the way so that "muscle memory" (procedural memory, technically speaking) can take over. When any doubt enters the equation, that distracts from the clarity of the conscious thought and impedes the purity of the unconscious skill execution.
(Yes, this is all absolute speculation, but it's informed by a ton of psych classes and the normal amount of personal experience, although I have to admit that the "sport" in which I once in a while achieved unusual success via supreme self-confidence was pinball.)
Finally, in Papi's case, the opposing pitcher's lack of confidence and corresponding diminished quality of execution was almost certainly a factor as well.
Great article, but David Ortiz did not earn the "clutch" appellation because of some perceptual psychological quirk. He won it because from the 2004 post-season to mid-August of 2006, the Red Sox won 14 consecutive games in which he had a chance for a walk-off hit, with Ortiz getting 11 walk-offs (7 of them HR) and being on base after a BB when they won the other three times. In these 20 potential walk-off PAs he hit .786 / .850 / 2.286 (11-14, 7 HR, 6 BB). All three times he was retired (twice in the ninth and once in the 10th) he won the game in a subsequent PA.
He came up eight times needing just a 1B to win the game. Rivera got him to pop up in the 9th inning of 2004 ALCS Game 4, but subsequently he went 1B, BB, 1B, BB, 1B, IBB, 1B.
He came up five times needing a 2B for a victory and went HR, HR, HR, HR, BB.
He came up seven times needing a HR to win, and went K, BB, BB, 1 out solo HR off of Scott Shields (9/6/05), 2-out 3-run HR off of Otsuka (6/11/06), GO, and 1-out 3-run HR off of Carmona (7/31/06).
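The slash line quoted above can be checked directly from these three lists. A quick tally, assuming I've counted the outcomes right (4 singles, 7 HR, 6 walks including the IBB, and 3 outs: the pop-up, the K, and the GO):

```python
# Outcomes tallied from the three lists above.
singles, home_runs, walks, outs = 4, 7, 6, 3

hits = singles + home_runs
at_bats = hits + outs                  # walks are not at-bats
plate_appearances = at_bats + walks    # no HBP or SF in this sample

avg = hits / at_bats
obp = (hits + walks) / plate_appearances
slg = (singles + 4 * home_runs) / at_bats   # total bases per at-bat

print(f"{avg:.3f} / {obp:.3f} / {slg:.3f}")  # 0.786 / 0.850 / 2.286
```

That reproduces the 11-14, 20 PA line, so the breakdown by situation and the overall totals agree.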
I messed around with some numbers and estimated that the odds against having a hitter of Ortiz's caliber doing all this in a random simulation like Diamond Mind were a billion or a trillion to one. Hence my assertion on ESPN the next winter that if clutch hitting were a drug, Ortiz would have been the first person ever to be certified as effective by the FDA.
People make the mistake of thinking that "being clutch" (or its opposite) is a trait variable when it is actually a state variable. The 2004-2006 results are so extreme that they can only be explained by an overwhelming but ultimately fragile self-confidence. Once Papi failed a few times he stopped believing in the inevitability of his success ... but for a few years, he had that belief and it became as good as prophecy.
The only problem with the "Kalish recall lights a fire under Reddick" theory is that Reddick hit .387 / .424 / .677 from July 15th through Kalish's last Pawtucket game and "only" .363 / .387 / .637 since.
The turnaround actually happened exactly at the All-Star Break and was even more dramatic than you report: .207 / .255 / .383 in his first 71 games and .368 / .396 / .647 in the subsequent 31.
Two corrections -- first, Kalish has been playing CF vs RHP with Nava, Hall, and McDonald sharing LF.
The other one, and it's a biggy, is that they've absolutely settled on Felix Doubront as the 7th inning guy they've lacked all year. He's only faced 28 batters but he's fanned 9 and walked 1, rates that are hugely better (and already statistically significant) than his rates as a starter (72 BFP, 10 K, 8 BB). He hasn't shied away from leverage, either, as he has a 3.03 Component ERA (not BABIP-driven, either, it's .313) and a WPA which translates to 1.13. He's throwing five pitches (4-seamer, very distinct 2-seamer, cutter with slider spin and break, change, and big curve) and commanding them all. He could implode tomorrow but so far he looks like the real deal.
As I just explored over at SoSH, almost all of Beltre's improvement is in his third time around the batting order (where he's gone from bad to terrific), and his behind-in-count vs. ahead splits have widened a lot, too. That means he's been a *smarter* hitter as best as we can measure that objectively, and suggests a possible explanation for the breakout: significantly better pre-game prep.
BTW, there's a pretty good chance that Hermida was actually placed on optional assignment waivers and will still be on the 40-man roster at Pawtucket. Where he'll have a month to prove that he's worth tendering a contract to in December (and maybe even beating out Kalish for the last post-season roster spot, if there is one).
I love that the Padres signed him and I'll take a little bit of credit for that, as I did a huge and largely very glowing analysis when Jed Hoyer and I were both with the Sox (although the conclusion was that he had huge physical talent but serious weaknesses on the mental side which needed to be dealt with). Still rooting for the guy.
It's worth noting that WARP3 is not a measure of actual value, but a measure of value assuming a neutral distribution of events relative to game leverage*. After prorating this year, Lackey over the last three seasons is averaging 1.0 win of actual value (based on WPA) per year more than his WARP3. It may be luck, but he may also be saving his bullets for high-lev situations.
*And it's based on rate components for hitters when it could just as easily be based on change in run expectancy, while the opposite is true for pitchers.
It's seldom mentioned that the crappy way earned runs are calculated often contributes a bit to these differences. Besides the obvious issue of inherited runner support, there's the bogus way that pitchers are absolved from all responsibility after they should have gotten three outs in an inning, no matter how hard they're hit (any batter who reaches base after the deserved third out and then scores without benefit of a subsequent error should be an earned, not an unearned run, unless the third out would have ended the game).
In Buchholz's case, he coughed up 4 "unearned runs" on 4/17 after an error with a delta-RE of 1.5. Oddly enough, he has subsequently pitched out of error-instigated jams at a better than average rate, negating that difference, but he has also gotten 0.9 extra runs of inherited runner support from the pen. His "True ERA" (adjusting for inherited runners plus a similar adjustment for errors, crediting the delta-RE instead of the number of actual unearned runs) is 2.72.
More widespread use of such a True ERA would nicely take out a little bit of noise.
Re Kalish: I don't think any club values "makeup" more than the Red Sox, and from what I've read Kalish's is terrific. The last guy who profiled the way Kalish does -- solid regular tools but maybe not a star's, terrific makeup -- was Dustin Pedroia. Now, I'm not saying Kalish will be the 2013 AL MVP, and neither are the Sox. But I think it's very likely that they give him J.D. Drew's job in 2012 because they've had so much success with guys like him and are convinced that he will be better than expected by scouts who are looking only at tools. Far from a cloudy future, he may have the clearest of all the Sox position player prospects.
Kevin, the Oscar Tejeda line looks like the opposition is attacking him like a guy who just had a 643 OPS in low-A (which he is); if they throw everything over the middle of the plate and you're good, you will have a .295 Iso and .003 IsoD. My question: how much scouting of the opposition and going over opposing hitters is there at this level? He's been hitting 6th for Salem; at what point do they start pitching him like they would a #3 hitter based just on his numbers so far?
Just to rub it in for Mets fans, Jonathan Papelbon has made three appearances at home and two of them were with the score tied in the ninth.
Just a note: the historical correlation of manager to team BABIP is very high, seemingly much higher than you could explain by changes in defensive personnel. So what you're identifying here as "team" includes not only the defensive skill of the fielders, but the quality of their coaching.
At about the same time as I did that study I did a correlation study of BABIP for pitchers who changed teams -- which in fact was the first post-Voros study of any kind to demonstrate that BABIP was a pitching skill (all of this work is buried in the bowels of rec.sport.baseball). I seem to recall that the luck % that I came up with (based on the r^2 of the regression) was less than 75%. I'll have to dig up that study and think about the results.
All this is true, but I'm specifically talking about the difference between teams that have every draft prospect sit down and take at least one psych test of some sort and those that don't. According to KG, the latter easily outnumber the former. If the former are in fact kicking the butts of the latter in drafting and developing players, eventually the whole industry will come around and the edge will go to the organizations who do the best testing. But this may take ten or twenty years.
You might try thinking of a team that is generally regarded as being on the cutting edge of everything and has had extraordinary recent draft success, especially with overperforming players whose success has been credited to their outstanding makeup. You would think that the clubs who aren't trying to measure makeup would wonder whether said club were doing so, and in fact with some success.
There's definitely a fascinating psychological effect at work here. Ortiz hit .286 / .364 / .616 in 173 PA from June 6th to the day the PED story broke, and (after going nearly 0 for his next two weeks, when by his own testimony he wasn't sleeping at night) .290 / .397 / .619 in 184 PA from August 14 to the end of the season. I watched all but a handful of games and scored every pitch and he certainly didn't look awful to me; he looked like David Ortiz only a couple of years older.
So the psychological factor at work here appears to be observer bias (and I'm not asserting that my view was necessarily neutral, either, although I suspect pitch/fx data could help back it up). There isn't even a consensus on "what our eyes could see."
Re the batting order: Ellsbury is definitely 1 and Scutaro is probably 9.
3-4-5-6 is toughest but the key is that Ortiz and Drew will be separated and I just don't see Drew hitting 7 behind Beltre or Cameron. That's a big reason not to hit VMart 3, the other being that when he's not in the lineup, you have to do much rearranging.
I'm pretty sure Tito would like to do (revitalized) Ortiz-Youkilis-VMart-Drew but he may start the season off Drew-Youk-Ortiz-VMart.
Cameron may hit 7 and Beltre 8, hard to say.
According to mlb.com*, Bedard won't be ready until June at the earliest, so I think it is a camp battle, at least at the beginning.
Isn't there a massive free-for-all for Erik Bedard's eventual spot, between Doug Fister, Lucas French, Jason Vargas, Yusmeiro Petit and Garret Olson?
I found the SIERA articles struck a nice balance. In fact, and perhaps quite ironically, if they hadn't stopped to explain Applying Park Factors 101 at the start of the first article, I wouldn't have caught that they did them wrong (how did everyone at BP miss that?). This is another reason why clarity is good.
All the glossary definitions should have both rewritten plain-English explanations and complete technical descriptions. There may well be other methodological mistakes that have gone undetected -- in fact, I've found that PADE isn't remotely correct at all. I haven't been able to figure out why from the available description of the method.
The field does need more writers and speakers who are capable of explaining the complexities to a wider audience. I've had some success doing that with physics and I hope to get a chance to do that with baseball at some point. But it is so much easier to write for an assumed informed audience that I think many of us are just sucked into writing for the cognoscenti by sheer inertia.
That OPS is initially difficult to grasp is another good reason to bury it. Some version of RC/27 would not only be more accurate but much more transparent to the average fan. Call it HRA for Hitter's Run Average and every fan would instantly understand what it meant.
The big problem with that, of course, is that different people would be getting different numbers depending on which metric they used. I've been working for a while now on improving BaseRuns, which I (and many others) think is the best put-together metric conceptually; if we could come up with a version with more terms that was clearly more accurate than RC/27, EqA, Linear Weights, etc., we might shoot for universal acceptance.
I don't follow this logic at all. As far as I can tell, the minimum effect of GB happens when GB = FB + PU in either form of the metric. When that happens, the linear term, the squared term, and the interaction terms all become zero. When GB > FB + PU or GB < FB + PU the term becomes non-zero and you start to see GB loading on the metric. Your final equation does reflect the reality of the situation (GB rate is minimized at an unknown value) but your constant e is just an unknown portion of the overall constant a + b + e.
In general, I don't think there's any rationale for keeping a term as both linear and squared if the squared term is significant and leaving out the linear term improves the overall regression. In this case, the interactions of GB with itself and with K and BB rates appear to be so important that if you include them you don't need to include the term directly. That may make the seeming illogic of not having the term directly more palatable to consider.
The one problem I can see in general is that a pitcher with GB = FB + PU is not an average pitcher and yet he's the baseline that's determining the constant (i.e., he's contributing your unknown variable "e" to it). I would have begun by normalizing all the data, so that GB_FB = 0 meant a pitcher with an average rate. This would be the best solution to the problem you're worrying about, since by definition the effect of GB can be regarded as minimized for an average pitcher. Then you run the regressions and you convert the coefficients to useful ones by reversing the normalization.
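To make the "reverse the normalization" step concrete, here is a sketch for a single variable with a linear and a squared term. It assumes the normalization is just subtracting the league mean (no rescaling), it ignores the interaction terms, and the coefficients and mean are made up; `uncenter_quadratic` is a hypothetical name.

```python
def uncenter_quadratic(a_z, b_z, c_z, mean):
    """Convert a_z + b_z*z + c_z*z**2, with z = x - mean, into a + b*x + c*x**2.

    Follows from expanding (x - mean) and (x - mean)**2 and collecting terms.
    """
    a = a_z - b_z * mean + c_z * mean ** 2
    b = b_z - 2 * c_z * mean
    c = c_z
    return a, b, c

# Made-up fit on the centered variable: the constant a_z is now the
# prediction for a pitcher with an exactly average GB_FB (z = 0).
a_z, b_z, c_z, league_mean = 4.0, -0.5, 2.0, 0.10
a, b, c = uncenter_quadratic(a_z, b_z, c_z, league_mean)

# Both forms give the same prediction for any raw value of x.
x = 0.25
z = x - league_mean
centered_prediction = a_z + b_z * z + c_z * z ** 2
raw_prediction = a + b * x + c * x ** 2
```

The regression is identical either way. What changes is that the constant describes an average pitcher rather than one with GB = FB + PU.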
40 IP is actually lower than I think might be safe; that's about 170 BFP and the Y2Y correlations seem to start falling off more steeply below 200. But probably no big deal*.
You really should try removing the straight GB term (the rationale being that you've already got it squared and there's no logic that says it needs to be fully quadratic) and see what happens with the GB*BB one. I'm just personally curious because I've done so many of these multiple regressions and I've seen a lot of funky things happen when you take out one term.
*It's worth noting, though, that increasing the sample size with noisy data can give you worse (less significant) regression terms.
First of all, and this is a big we-need-to-start-over-and-run-the-numbers-again mistake (although I think ultimately we're only talking about tweaking the coefficients): I am fairly certain that the (Pete-Palmer, Total Baseball designed) Park Factors in the Lahman database do not have to be cut in half, because they are already designed to be straight multipliers. Compare them to the straight Run Indexes published each year in the Bill James Annual, or plow your way through the technical explanation at
Biggest unanswered question: what's the minimum BFP for inclusion? I've found no loss in year-to-year K/PA correlation down to 260. If you didn't go that low, you can increase your sample size.
My biggest disappointment is that you started with ERA. Granted, that's the stat we look at. But there's no good reason to ignore the (very accurately) quantifiable errors in RA caused by good or bad inherited runner support. You have that data here ("Fair RA," IIRC).
And you probably should have wrestled with R vs ER. Personally, I believe in keeping track of UER but doing it exactly the way you adjust for inherited runners -- the pitcher is credited with the average change in Run Expectancy caused by the error rather than the number of UER that actually end up scoring. (This actually only works for ROE; for errors leading only to base advances you need a "subsequently rendered moot" adjustment, so it does get tricky.) I bet there's a correlation of GB% to errors and hence UER ... you may have been better off regressing to Fair RA (adjusted only for inherited runners) with a separate term estimating ER/R. Or regressed to Fair RA and used a fixed ER/R, which is just using RA but scaling it to look like ERA.
Finally, I've never kept a term with p = .56 no matter how strongly I felt it deserved to stay. That is not trending towards significance and I think it's wishful thinking to expect it to get there with a bigger sample. Although I am at a loss to explain why it's not showing up. I would experiment with taking out the straight, non-squared GB term and see if that helps this one.
Not an artifact. Same pitchers, look at year-to-year changes in BB rate, and HR / Contact follows very mildly but with immense statistical significance. See below for the details.
When not using HR / FB (i.e., in the many cases where FB data is unavailable), you should always use HR / Contact, which is not only the most logical but also has a stronger year-to-year correlation than HR with any other denominator I've looked at.
Change in HR/Contact, adjusted for age and for any change in role, correlates to change in (BB-IBB)/(PA-IBB), r = .130, p = .000015 (n = the 1107 pitchers who faced 200+ BFP in consecutive seasons for the same team playing in the same park, 2002-2009).
(Without the adjustments, r = .126, p = .000026.)
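For anyone who wants to sanity-check p-values like these from just r and n, here is a rough sketch. It assumes a two-sided test and uses the normal approximation to the t distribution, which should be fine with n in the thousands. Because the r values above are rounded to three places, expect the result to land near the quoted p rather than match it exactly.

```python
import math

def correlation_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r from n pairs."""
    # Standard t statistic for testing r against zero, with n - 2 degrees of freedom.
    t = r * math.sqrt((n - 2) / (1 - r * r))
    # Normal approximation to the t distribution: P(|Z| > |t|) = erfc(|t| / sqrt(2)).
    return math.erfc(abs(t) / math.sqrt(2))

p_adjusted = correlation_p_value(0.130, 1107)
p_raw = correlation_p_value(0.126, 1107)
```

Both come out on the order of one or two in a hundred thousand, so a weak correlation can still be overwhelmingly significant at this sample size.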
Here's food for thought for version 2.0.
I'm sure you're aware of the positive correlation between BB% and HR rates and your methodology will capture that. However, the baseball c.w. includes a historical class of pitcher (Jenkins, Hunter, et al) who featured elevated HR rates in conjunction with low BB rates. They may be worth looking at.
Essentially, good control improves K and BB rates and depresses BABIP and perhaps HR/FB. But pounding the strike zone rather than nibbling (a difference of approach, not skill, perhaps) also improves K and BB rates but may well increase BABIP and HR/FB. Daisuke Matsuzaka looks like he is actually achieving lower BABIP by nibbling (whether this is sustainable is another question entirely) -- again, a correlation of walk rate to hardness of contact that's opposite the expected.
Another guy I can think of who appears to have a true BABIP skill is Jered Weaver, who gets a ridiculous BABIP on his FB given his swing-and-miss rate. I think that's a function of the deception in his delivery, where the movement on the pitch doesn't match the upper arm angle.
One thing a metric like SIERA will allow us to do is identify the consistent under- and over-performers better than past metrics, and then we can examine them with pitch/fx and the like. Those findings may never be included in the metric but would allow us to determine after a single year's over- or under-performance whether a pitcher might be one of the rare guys who has a true, non-SIERA measurable BABIP or HR/FB skill (or lack of same).
Good! Grabbing a chunk of the true BABIP skill via the K and BB factors in the regression makes the metric even better than I realized.
I remain somewhat skeptical about whether HR/FB really should be regressed essentially to the mean, but admittedly I've barely studied it, let alone studied it as much as I've studied BABIP.
This looks like a very nice advance in metrics that assume that pitchers do not have significant variations in hardness of contact allowed. That is an incorrect assumption*, but the variations are small enough at the MLB level to make such a metric very useful, and may help get a handle on those slight but real differences.
*I've lost count of the number of ways this can be demonstrated. Most obviously, there is a significant correlation between team BABIP allowed and the three true outcomes; staffs that are better according to the latter also allow a lower BABIP just as you'd expect. But here's perhaps the best bit of evidence yet:
Take all the pitchers with 200+ BFP in consecutive seasons for the same club in the same park since we've had UZR data (2002-2009). The change in BABIP is of course correlated to the change in team UZR Range + Error, with the correlation surprisingly weak (r = .19) because BABIP is just so damn noisy. Now, adjust the yearly change in K/BFP for age and change in role (starter vs. relief) and toss that into the regression. Surprise! The change in BABIP is also significantly (p = .02) correlated to the change in K rate. That is very unlikely to be caused by changes in FB/GB ratio that correlate to K rate changes and very likely to be what it looks like: you strike out more batters, they also hit the ball less hard (and the opposite).
It's also interesting that the change in BABIP with age (again, with change in UZR included in the regression) precisely mirrors the change in HR / Contact with age. In both cases, there is no change until age 28 or 29, then a worsening. (Contrast to K rate, which significantly improves to age 27, then declines at the same rate). The worsening of BABIP after age 29 is not significant (p = .25), but in this data set neither is the improvement in BB rate by young pitchers (p = .26), and we're pretty sure that's real.
(Some of this is in a thread at SoSH looking at the impact of team defense on pitching.)
Re Lester (and Bard), it comes down to this: common sense tells us that 2007 is much less relevant than usual. What you want is an objective, pure quant system that is just as smart as common sense.
I always assumed that PECOTA worked by finding the best match for both the size and *shape* of the previous three seasons or even the whole career. The Bard and Lester predictions make me 90% certain that the system starts by taking a 3-year weighted performance average and then looking for comps.
OK, this is demonstrably wrong.
They have Adrian Beltre projected as a +1 defender.
And David Ortiz as a +12 (in 118 games!).
And of course these wacky defensive projections are presumably factored into the pitching projections.
It appears as if they are projecting the starters to be about +15 defensively ... which is a run less than Beltre alone has averaged in his career according to UZR and FB.
Folks might try taking 0.35 off all the ERAs and adding 5 wins to the team total if they want to know what this would look like if BP had a PBP defensive metric in place already.
Those Breakout and Improve numbers are indeed a year old. Get the weighted mean spreadsheet for this year's numbers.
Lester has a 39% improve, so they expect him to regress even compared to his 3-year baseline, which includes a 2007 which he broke out from. When you factor in the improved defense, they are projecting a big decline, on the order of 1.00 of ERA. Who knows why?
OK, I've figured out the answer to that one: they think that the 7.7 K and 5.7 BB are Bard's baseline rates based on his last three seasons. Which they get by including his 7.08 ERA ( 5.6 K, 9.4 BB) season in A ball (mostly low-A) three years ago, which common sense would tell you couldn't possibly be relevant.
Are we really supposed to believe that PECOTA has a sizeable body of comps who were historically, unimaginably awful their first year in the pros and two years later went an entire month in MLB fanning 50%+ of all BFP? And therefore 2007 is actually relevant?
Sure they do.
Something tells me that PECOTA *starts with* the weighted 3-year average and does its comps based on that, when I always assumed it did the comps by matching each year separately. A smarter system would know that 2007 meant nothing, because after matching 2009 and 2008 it couldn't find anything remotely resembling it.
The Beckett VORP isn't the only thing that makes less than zero sense. How about this one:
Daniel Bard is projected to go from 11.5 K/9 to 7.7, and from 4.0 BB/9 to 5.7, but he has the highest "Improve" percentage on the team, at 51%, and the fourth lowest collapse. A 50% improve is supposed to mean the same performance, so why does his weighted mean projection show a collapse? And that's ignoring the fact that the defense behind him will be 0.50 of ERA better.
The biggest relevance I can think of for baseball is on subjective evaluation of players, especially fielding. It's not quite inattentive *blindness*, but it's the same phenomenon writ smaller -- the brain saving CPU cycles by not bothering to actually see, and instead substituting what it expects to see. The jumps we "see" an outfielder get on a ball are likely to be filled in by the brain rather than perceived accurately.
A related phenomenon is when the brain tries but is unable to perceive accurately and fills in with its best guess. We've always known that "late movement" on a pitch is a physical impossibility and hence an optical illusion. What pitch/fx has shown us is that fastball movement in general is an optical illusion -- any swing and miss looks to us like more movement than any contact. Hideki Okajima gets as much FB movement as -- and more rise than -- Jonathan Papelbon, for instance. When Dan Bard had trouble locating his FB in his first few outings and got it pounded, more than one observer decreed that his FB was straight and couldn't get MLB hitters out. He then proceeded to fan half the hitters he faced in July, showing awesome "late movement" -- which was, according to pitch/fx, identical to the "straight."
Pitchers who succeed by deception are exploiting the difference between the movement the hitter expects to see based on the general style of delivery and what the actual movement is.
What I heard from my classmates subsequently (or quite possibly Dan Simon himself; I can remember it either way :)) was that they tried the experiment with varying degrees of difference between the before and after subjects and got a nice, roughly linear relationship to the percentage of people who noticed. I do know that they tried it with a gender change and that some people were oblivious to it, but I can't recall whether the percentage was 10% or 25% or some other low number. I also believe that men were better at spotting swapped women and vice versa.
Earlier experiments by Simon show that who the other person is really matters. If the social situation is such that the other is encoded generically (a construction worker on campus versus a student), change blindness rates are very high.
The most famous change blindness experiment was done by Simon and Chabris just after Simon came to Harvard (a year or two before I took his introductory cog psych course). Watch the video at
and count the number of passes made by the team *dressed in white.* Did you notice anything unusual? If not (and I didn't, nor did 80% of my classmates), watch the video a second time.
Simon was amazingly cool -- he also told us all about Google when it was early in its beta test!
Oh. My. God!
I helped design this experiment as a member of Dan Simon's Cognitive Psych Lab class at Harvard in (IIRC) the spring of 1999. I had to drop the course (it was my 5th) after we had finished the design but before we got to the point of running it. It's great to actually see the results -- all I knew about them was from running into classmates subsequently.
(And when I say I helped design it, I actually made no specific contribution. Change blindness was Simon's cognitive psych specialty and the lab class involved each of us designing our own little experiment to test it. But half our time was spent on brainstorming a group experiment, and continually refining our ideas for the experimental design. It's quite possible that the resulting design was what Simon had in mind all along, but it felt like we were making it up, not him, because he's a brilliant teacher.)
It's a small world when this shows up here.
(And yes, the bunch of psych classes I took from 1998-2002 back at my alma mater have profoundly influenced the way I think about baseball. Someone who wants to pursue sabermetrics as a career or serious hobby could do worse than major in it.)
Truly interesting. I think their plan A is to sign Adrian Beltre and just have the most ridiculous run-prevention team in anyone's memory, simply because I think they're appropriately skeptical about Youkilis at 3B for a long stretch of years (this is exactly the point in his career where they would be thinking of moving him the other way had they never traded for Mike Lowell).
But trading Beckett to the Rangers for a prospect haul that they pass on to the Padres -- that's a great plan B.
I'd take Clay Buchholz over Porcello and probably Hanson and Anderson, too. And just to prove I'm not a fanboy, maybe over Lester. It's kind of strange that a guy can be the #1 pitching prospect in MLB, add a very good two-seamer and boost his G/F, develop his slider into a legitimate plus pitch, have a very solid and sometimes spectacular half-season in MLB (with a FB that averaged 93-4 and touched 97), and not even be a "just missed."
There is a further complication or two that makes it even more challenging for Anthopoulos and an even closer parallel to the Ultimatum Game.
There is a sound argument that he should do whatever he can to guarantee that Halladay does not end up with either the Red Sox or Yankees. His goal, after all, is to finish ahead of one of them sometime soon. He's only going to trade Halladay within the division if he gets overwhelmed relative to the value of the two picks, and it's unlikely that either the Sox or Yankees will make that kind of offer; they will both be content to let him go elsewhere, as they were with Santana.
That leaves the Phillies and Angels, and Halladay is very unlikely to sign an extension with the latter because he wants to play on the East Coast. In fact, he's stated that the Phillies are his ideal destination.
So it's basically the Phillies (who are reportedly trying to deal Blanton's last arb year to create salary space) or bust. Trading Halladay for Michael Taylor and Travis D'Arnaud would be a huge win -- you'd save Halladay's 2010 salary, save the signing bonuses on the two picks, improve your 2011 draft position, end up with two better players more quickly, and guarantee that Halladay does not pitch the next five years for a team you're trying to catch. (The Phillies would then let Cliff Lee walk a year from now and recover some of the value they gave up with those two draft picks.)
Getting two picks for Doc and having him sign with the Phillies would be clearly less desirable; getting just the two picks and having him sign with the Yankees or Sox would be a disaster.
Of course, as good as it is when examined rationally, this trade would give Ed from Scarborough an aneurysm.
Hell of a dilemma. But if I can explain the logic so succinctly here, why can't Anthopoulos explain it to Ed? (You'd leave out the "we're trying to get worse" part, of course.)
Mark Grace had a career .383 OBP and was a legitimate GG defender. Kotchman's also an excellent defender, but he's in danger of losing his job not because of a lack of power but because he has a career *.337 OBP*. That's an enormous gap in offensive value. James Loney is an average defender with a .354 career OBP. Again, not anywhere near the player.
I think you're actually proving my point here: folks will immediately lump all power-challenged 1B together without discriminating between bad, OK, and excellent OBP, and between OK and terrific defense. (I mean, really, James Loney is similar to Mark Grace the way Bill Gates is just like Oprah.) If Anthony Rizzo can hit 15 HR a year but put up a .380 OBP and play GG defense (and I think a lot of scouts feel that is a very realistic upside for him) he will be an All-Star caliber player exactly like Grace was.
The only thing that bugs me about an otherwise extraordinarily informative writeup is the assertion that Rizzo's HR power "needs" to show up soon simply because he's a 1B prospect. While HR power is the single most important attribute for a 1B, it's not *necessary* the way K rate is for a pitcher. If Rizzo's HR power never shows up, you're still essentially describing Mark Grace there. A 1B prospect who projected to be an average defender instead of Rizzo's +10 or so and had shown an extra +10 runs of projected power would, I think, be a four star prospect, simply because of that ultimately irrational bias that a 1B "needs" to have plus HR totals. But it's the whole package that matters, not the way the value is distributed.
In 2004 David McCarty converted himself in ST into exactly the sort of player you're talking about -- the last guy on the bench who could double as an extra LOOGY. Francona brought him into a game on April 9 with one out in the top of the 9th with the Sox down 8-5 and he gave up an inherited runner and one of his own. That more or less ended the experiment. He pitched the 9th inning of a blowout loss on June 12th and got the side in order, including a K of Jayson Werth. On the last day of the season, he pitched two 1-hit innings against the O's, fanning Palmeiro, Bigbie, and Newhan (all looking). If he had started with an outing like that, the story might have been completely different.
Joe, this is by and large an exemplary summary, with one glaring exception and one minor quibble.
And I have to give you credit for sticking with your guns re the supposed demise of David Ortiz. But the fact remains that after you called for his benching here and in SI, after you were so certain of his washed-up uselessness, he hit .281 / .383 / .579 in 201 PA, a line entirely predictable from his season to that point (I can say that because I did just that at the time, based on his hitting an almost identically valuable .285 / .364 / .616 in 173 PA from June 6 to the day the PED scandal broke).
Which means that his failure to hit in this series, like that of virtually all his teammates, is as meaningless as all the other small samples you correctly dismiss. They probably should start platooning him rather than Drew against tough LHP, but the notion that they desperately need to dump him (as opposed to looking to upgrade him, say, with a major trade for Prince Fielder), or that they should have PH Brian Anderson for him on Friday night, is just wack.
The minor quibble involves dismissing A-Rod's post-season performance -- indeed, his whole career in the clutch -- as random variation. A-Rod was a terrific post-season player through 2004 ALCS game 3, then was awful until this year, and now is terrific again. Meanwhile, his regular season "Clutch" (as measured by FanGraphs) was an average +.25 wins per year before and after (in his playoff years) and -1.06 in between. That doesn't strike me as random; that's bipolar, and I think you'd grow to be very old indeed before you replicated that pattern replaying his career in Diamond Mind. I think the mistake everyone makes re "clutchness" is thinking it's a trait variable rather than a state variable. A-Rod's neither a clutch player nor a choker in the long run (like nearly everyone else), but I honestly think he's a guy whose confidence swings to such extremes that his performance, one way or the other, is unlikely to be attributable to chance at any point in time. As long as he's feeling good about himself the Yankees are going to be a little tiny bit scarier.
You've got the six best NL teams ranked 2, 5, 6, 7, 8, and 9.
Last year, according to a properly adjusted EqA RS / RA, the six best NL teams finished 5, 10, 11, 12, 14, and 16.
To say that this initial Hit List catastrophically fails to adjust for league imbalance would be an understatement. I'm not even sure you're including the c. 2.2 win league difference you guys think exists (it was actually 9.3 last year).
I know this sounds like nitpicking, but it's a pet peeve of mine: none of these odds make a whit of sense mathematically. For instance, Viciedo has a 10% chance of being #1 (that's what 9-1 means), Poreda has a 7.7% chance (12-1 is 1 in 13), and Beckham has a 42.9% chance (4-3 is 3/7) ... so who has the remaining 39.4% chance of being the White Sox #1 prospect? Blagojevich?
(It's probably Beckham, I'd guess, which means the odds on him should have been 1-4, not 4-3).
I'm not surprised that KG made this mistake (it's pretty common in such lists, although the more usual error is to assign more than 100% of the probability), but frankly I'm a little disturbed that none of the more numerical types at the corporation caught it. Does anyone edit this stuff?
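For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion used above. It assumes the odds are quoted as "against-in favor" (so 9-1 means 1 chance in 10); the function name and the dictionary are illustrative only, not anything from the article.

```python
def implied_probability(against, in_favor):
    """Convert 'against-in_favor' odds (e.g. 9-1) to an implied probability."""
    return in_favor / (against + in_favor)

# Odds quoted in the comment above, written as (against, in favor).
quoted = {"Viciedo": (9, 1), "Poreda": (12, 1), "Beckham": (4, 3)}

probs = {name: implied_probability(*odds) for name, odds in quoted.items()}
total = sum(probs.values())
unassigned = 1.0 - total  # probability the listed odds leave unaccounted for

for name, p in probs.items():
    print(f"{name}: {p:.1%}")
print(f"assigned: {total:.1%}, unassigned: {unassigned:.1%}")
```

A coherent set of odds for one outcome should have implied probabilities that sum to 100%; here the three listed prospects cover only about three-fifths of it.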
A rotation with one of Beckett, Matsuzaka, and Lester, and the best four from Smoltz, Masterson, Buchholz, Penny, Wakefield, and Bowden might still be the best rotation in MLB after the Yankees and Rays. The Sox might indeed be "done" in the two-injury scenario only because the division is insanely strong, but they could still be the third best team in MLB. A two-injury scenario for the Yankees or Rays does not leave them among the three best teams in the game, IMHO.
So there is a significant difference in pitching depth.
The full, complex methodology would be smart enough to handle this and more.
For instance, you project the MLB 1B to hit .274 / .395 / .582 and the blocked AAA 1B to hit .259 / .341 / .508. In certain runs of the simulator, the MLB guy gets hurt and puts up something like a .207 / .360 / .352 through the end of June before being shut down. In those runs, the simulator calls up the kid from AAA and maybe in some of them he hits .288 / .356 / .567. Voila, extra accuracy! But without such a methodology, you are not factoring in the way having a Ryan Howard in AAA protects you from the unexpected collapse of a Jim Thome.
Even the simpler methodology I suggested would do a better job than simply averaging PT. If the Yankees suffer an injury to one of their top 5, the replacement will be whoever is pitching best among Hughes, Aceves, and Kennedy. So whoever gets more starts than projected from that group (at the expense of the others) will also be likely outperforming his PECOTA mean projection. This puts a team like the Yankees in a better position than one with just one or two extra viable rotation candidates.
I'm curious as to whether the model handles positional battles / depth correctly, or whether it could be improved, perhaps dramatically (project for next year).
Team A has one SS with a PECOTA projection of a 750 OPS. Team B has an identical player, but also a second guy with a 730 projection. You reasonably assign a 50 / 50 PT weight to the latter pair, even though (as you explain) in some universes the actual split will be 90 / 10 or 10 / 90 depending on who wins the job in ST.
I'm guessing that, in your model, team A ends up getting better projection from SS (750) than team B (740). In the real world, of course, the opposite is true, because often the team will pick the player having the better season (or likely destined to, based on physical condition). Significantly often, the guy who wins the job will outperform his PECOTA mean projection, and (perhaps more importantly) the winner is much less likely to grossly underperform. No matter who wins the job in ST, if he's at his 10% projection in mid-May, the other guy is likely to get a shot.
The proper way to do this would be with a season simulator that has both a real and random component for variation about a player's mean projection and which dynamically adjusts PT. The simpler way would be to just use the 60% (or 75% or whatever) PECOTA projection for players at deep positions.
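To illustrate the selection effect, here is a rough Monte Carlo sketch in Python. It is not how PECOTA or the playoff odds model works. It assumes each player's season OPS is normally distributed around his projection with a standard deviation of 60 points (a made-up figure), and that the team always ends up playing whoever is better that year, which overstates the real-world edge. The function name and all parameters are hypothetical.

```python
import random

def simulate_position(projections, sd=60.0, trials=100_000, seed=0):
    """Average OPS (in points) a team gets at a position when it plays
    whichever candidate turns out best in each simulated season.

    sd is an assumed spread around each mean projection, not a PECOTA value.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw one season for every candidate and keep the best one.
        total += max(rng.gauss(mean, sd) for mean in projections)
    return total / trials

team_a = simulate_position([750])        # one shortstop, no alternative
team_b = simulate_position([750, 730])   # same shortstop plus a 730 backup

print(f"Team A: {team_a:.0f}  Team B: {team_b:.0f}")
```

Under these assumptions Team B comes out ahead of Team A rather than behind it, which is the direction of the effect described above. The size of the gap depends entirely on the assumed spread and on how well the team can actually identify the better player.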
1) You might look up the word "stereotypical" in the dictionary. It doesn't mean "widely held."
2) And why should the politics of the person writing the introduction to a baseball book matter in the least? That's what none of the conservatives have been able to explain.
3) You really expect me to believe you would have been just as upset if Rush Limbaugh or Ann Coulter had backgrounds in sports commentary and had written the foreword?
4) The funniest thing about all of this is the folks excoriating Olbermann for being biased and hence unscientific and hence inappropriate to write for BP. This is a complaint, mind you, from supporters of an administration that not only opposed the "reality-based community" (their term), but mocked it.
Wow. Wow. And wow.
Echoing CubbyFan23: there is no possible way that any of the anti-Olbermann crowd's dislike of him can begin to compare with my revulsion and hatred for the policies of George W. Bush and for the job he did as President. (Olbermann's just talking, Bush did actual severe damage to the country and probably the planet). And yet if W. had penned this year's introduction, I would be curious as to what he had to say, as someone who loves the game. I'm actually kind of fond of Bush the baseball fan; it's the only aspect of him I find remotely palatable.
Of course, it's a characteristic of the conservative mind that it sees only forests, not trees, and is generally incapable of having a mixed emotional response to different aspects of the same thing (which is why conservatives misread liberal criticism of America as hatred). So it's actually not surprising in the least that the conservatives are incapable of feeling differently about Olbermann the baseball analyst and Olbermann the political analyst.
The Phillies had more hits, more doubles, more homers, more walks, more stolen bases, and fewer GDP's -- and somehow lost.
And when was the last time you saw a team fan three straight innings with a runner on 3B and 1 out -- and they were the only 3 K's the opposing pitcher had recorded to that point?
And, BTW, the Phillies were just 1-for-15 with RISP.
The reason why Lester's G/F jumped was because he added a new pitch, a sinking "1-seam fastball" (2-seamer variant) that is nearly as distinct from his 4-seamer as his cutter is. Compared to the 4-seamer, it breaks 4" in and 5" down (the cutter breaks 5-6" away and 3-4" down; the 4-seamer has been averaging 94-95, the cutter 90-91, the sinker 93).
His transformation into an ace is pretty much the result of that, increased strength and hence velocity, and the improved command that is very common for young pitchers.
Umm, he's better than Paul Byrd? He's the fourth starter on the team and he's pitching game 4. It's not rocket surgery.
I've been advocating for automated balls and strikes all my life -- there's a thread discussing it over at Sons of Sam Horn:
I've also been posting analyses of umpire performance in the ALCS thread. According to pitch/fx, Sam Holbrook missed as many as 35 calls Saturday night, 27 of them obviously, with the calls favoring the Rays 2 to 1. Considering how consistently Beckett was squeezed in comparison to Kazmir, it's hard to escape the conclusion that the Sox would have won the game with neutral umpiring.
There's an assumption that Youkilis is a defensive downgrade at 3B, but in 452 innings the last 3 seasons (admittedly a small sample) he's +16 runs per 150 games according to Fielding Bible Plus / Minus (this year he's even better); Lowell this year was +9. Nor does he look like a downgrade; he has much better range to his left and the rest is hard to say.
Wow, you really don't want to lose all your credibility within the first two paragraphs. The Sox victories were by 4, 8, 4, 3, 4, 6, 3, and 8 runs; the Rays were by 1 (in 11), 1, 3, 1, 2, 1, 1, 2 (in 14), 1, and 7. Yup, the Sox and Rays played seven games that were either decided by 1 run or went into extra innings, and the Rays won them all. The Rays bullpen was better in these games, but no bullpen deficit gives you a 0-7 record without you being more than a "wee bit unlucky."
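To put a rough number on that, here is a quick Python sketch of the chance of going 0-7 in such games if each one is treated as an independent coin flip with a fixed win probability. The win probabilities tried below are hypothetical, chosen only to show how large a per-game deficit would be needed; independence is itself a simplifying assumption.

```python
def prob_lose_all(p_win, games=7):
    """Probability of losing every one of `games` independent games,
    given a per-game win probability p_win."""
    return (1.0 - p_win) ** games

# Hypothetical per-game win probabilities for the team with the worse bullpen.
for p_win in (0.50, 0.45, 0.40):
    print(f"p_win={p_win:.2f}: chance of 0-7 = {prob_lose_all(p_win):.1%}")
```

Even if the bullpen gap dropped a team to a 40% chance in each close game, an 0-7 record would still come up less than 3% of the time.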
The slight advantage of D3 over D2 is almost certainly meaningless, especially considering that the W3 / D3 numbers are completely inaccurate, and that the adjustment for strength of schedule really should be made separately for actual, Pyth, and EqRS/RA Pyth win percentages, the raw versions of which can then be safely ignored.
(See http://sonsofsamhorn.net/index.php?showtopic=36962 for the problem with W3.)
Facts, a/k/a Fielding Bible Plus / Minus converted to Runs / 150 G.
Youkilis +7, Pedroia +12, Lowrie +24, Lowell +9, Bay -10, Ellsbury +10, Drew -5 = +47.
Teixeira +19, Kendrick -4, Aybar +10, Figgins +11, Anderson 0, Hunter -4 (he hasn't been good for years), Matthews -13 = +19.
Lowrie is almost certainly exaggerated by a SSS but, OTOH, the Sox OF numbers are massively hurt by Fenway (Bay and Manny both finished the season at -10; at the time of the trade, IIRC, it was something like -5 Bay, -20 Manny).
30 runs in a season is a pretty large edge.
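The totals are easy to double-check; this short Python snippet just adds up the per-player figures listed above (runs per 150 games). The dictionary names are only labels for the two lineups.

```python
# Fielding Bible Plus / Minus, converted to runs per 150 games, as listed above.
sox = {"Youkilis": 7, "Pedroia": 12, "Lowrie": 24, "Lowell": 9,
       "Bay": -10, "Ellsbury": 10, "Drew": -5}
angels = {"Teixeira": 19, "Kendrick": -4, "Aybar": 10, "Figgins": 11,
          "Anderson": 0, "Hunter": -4, "Matthews": -13}

sox_total = sum(sox.values())
angels_total = sum(angels.values())
edge = sox_total - angels_total  # defensive edge in runs per 150 games

print(f"Sox {sox_total:+d}, Angels {angels_total:+d}, edge {edge:+d}")
```

The difference works out to 28 runs, which is where the "30 runs" figure comes from after rounding.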
And re the Secret Sauce, a huge part of the Sox underachievement comes from a change in Papelbon's luck, which is reflected in the huge drop in WXRL and big drop in FRA, while his component RA has only gone from 2.20 to 2.46. It would be interesting to rejigger the Secret Sauce with a component RA substituted for WXRL, which can contain a significant component of "clutchiness" which is probably not predictive.
The covariance shows that good teams are likelier to outperform their Pyth than bad ones. Which would mean that part of Pyth differential is real and not random.
In fact, the correlation of Pyth differential to the previous year's differential has been statistically significant since 1959 (r = .06, p = .04). The historical timing of this effect is consistent with the thought that closer or bullpen quality affects Pyth differential in a real way, and this is confirmed by the data: if you exclude the worst underperformers (who could be expected to change their closer), the correlation strengthens, and the teams in the excluded group that did in fact change their closer did significantly better the next year than those who didn't. (Sorry I don't have the specific numbers on the last two assertions; they're buried in an Excel file from last December with 52 worksheets.)
The fascinating thing is that there was a marked *inverse* correlation from 1997-2002 that seems very unlikely to be random. (Without those six years, the correlation is much stronger, and in the last five years it's been r = .14, p = .08.) The reason I haven't published this stuff is that I've been wanting to figure that anomaly out. But the notion that all Pyth differential is "luck" is clearly wrong.
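For readers wondering how a correlation as small as r = .06 can be significant, here is a small Python sketch of the standard significance test for a Pearson correlation. It uses a normal approximation to the t statistic, which is reasonable for large samples. The sample sizes are my own rough assumptions (about 1,200 team-seasons since 1959 and about 150 for a five-year window), not figures from the spreadsheet.

```python
import math

def approx_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r from n pairs.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and treats t as standard normal,
    which is a large-sample approximation rather than the exact t test.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))

# Sample sizes below are assumed for illustration only.
print(f"r=.06, n=1200: p ~ {approx_p_value(0.06, 1200):.3f}")
print(f"r=.14, n=150:  p ~ {approx_p_value(0.14, 150):.3f}")
```

With samples of roughly those sizes, the quoted r and p pairs are in line with each other: a tiny correlation clears the .05 bar only because there are so many team-seasons, while the larger five-year correlation falls short because there are so few.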