This is a good point, and maybe finding out earlier (as a baseball player) that you aren't going to make any money is better than being misled for longer, while also suffering brain damage, in college football.
Really neat, Jeff.
I think another cool angle to take would be to compute this over different time periods, and see whether some clusters have gained or lost players, or whether there were other clusters in the past that have since gone "extinct".
In addition to what Russell said, two things: 1) there's a bunch of good people who are just starting, so check them out, and 2) I trust that Sam will find an excellent replacement for me.
Well, that's because I haven't started yet--they didn't have regular baseball coverage before. Check back tomorrow/Monday and you should find something.
Thanks, Jon. I'm a great admirer of your work, so this means a lot.
An excellent point, and I should have cited this, don't know how it slipped my mind. So there's precedent, even within baseball, for this kind of undertaking.
But such an undertaking would be very difficult and laborious, and there would be no need for it if only MLB would provide us the data to begin with...
I'm not disappearing altogether; hopefully I can induce you to come read my new weekly column over at FiveThirtyEight. And I will be back here once in a while.
"If the final ranking of the fallers should have been lower, and that of the risers should have been higher, then their average rankings should be less predictive of the outcome, not more."
This doesn't make any sense to me; could you expand? I'm happy to take your criticism seriously, but I can't understand what this sentence means. I would note that I took into account both final and average ranking, and in both cases, the trajectory of the ranking was significant. So, pretty much no matter how you slice it, trajectory impacts the lifetime <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=WARP" onmouseover="doTooltip(event, jpfl_getStat('WARP'))" onmouseout="hideTip()">WARP</a></span> of a prospect.
I would also note that I did look at the endpoint and slope. Both proved significant for improving predictions of lifetime WARP. It's hard for me to understand why the correlation would be meaningful, but I will try it.
OK, here it is with Final Ranking, instead of average ranking (by the way, the reason I used average over years is because it is more predictive of future <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=WARP" onmouseover="doTooltip(event, jpfl_getStat('WARP'))" onmouseout="hideTip()">WARP</a></span> than final ranking):
Final rankings within top 40, whose trajectory was up (improving their rankings the more time they spent on lists): median WARP 17.2, mean 20.96
Within top 40, trajectory down (going down the list the more time they spent): median 5.9, mean 15.09
Stayed the same: median 8.52, mean 20.09.
And the average final rankings are 20 for trajectory up, 26 for trajectory down, and 22 for trajectory staying the same, which is not enough of a difference to explain the lifetime WARP variation that you see.
That's what I was going for in the last table. All those groups have similar average rankings, but the guys that rose up to their rankings did significantly better than the guys that stayed (you can think of them as the "control").
Also, if trajectory wasn't important then it shouldn't improve the accuracy of the cross-validated model. But it did (marginally), suggesting that how a prospect traverses the lists is meaningful above and beyond their final position.
This question about inertia is the subject of the next article.
Could also look into the leaders vs. followers phenomenon, but you might run into problems because not all lists are released in the same order every year. Within a year, I expect there is some cross-influence between lists, too.
1. Tough to tell exactly who they are because so many players are clustered around 20 lifetime WARP, but I think they are Richard Hidalgo, Aubrey Huff, and maybe Bret Boone.
Good idea. I went back and looked at the data, and 2 years on the list is the optimal number, pretty much no matter what the age a prospect started getting ranked is.
There might be something to the foreign player vs. domestic player distinction, though. Maybe Sano was less prepared/needed more development than an equivalent, US-born player. Perhaps the prospect people can weigh in on that, I'm not enough of an expert to know.
I don't have that data on hand, but I would expect there to be a strong concordance, high school draftees spending more time on the lists. In the future, I will layer draft position, high school vs. college, and some other stuff on here and see how much it improves our predictions.
Individually? No, certainly not. In aggregate? Yes, I think it's possible. Bear in mind that the fans have access to PECOTA *and* all sorts of additional information, for example about the coaching, the ownership, the training staff, and so on. The wisdom of crowds is a real effect and has been demonstrated in many contexts.
I don't think it's probable, certainly. But I think it's possible, and worth checking, because why not?
That's exactly what I'm looking for. Maybe the White Sox fans know something PECOTA doesn't, and they will do better than we thought. Based on these results, we'd expect the Mariners, Braves, White Sox, and Mets to do better than PECOTA says, and the Rays, Yanks, Padres, Jays, and Angels to do worse (exempting the Giants for obvious reasons).
Fair enough. The selection bias is, in Rumsfeldian parlance, a known unknown, and you may believe that it is larger than I do, which is completely plausible. I certainly didn't mean to undersell it; perhaps I should have been more careful to clarify that this method measures a section of the fanbase, not the whole thing.
I also think that the stereotypical aspects of fanbases you mentioned might begin to emerge over a longer term study like this. In the near term, I think it's plausible that each fanbase is mostly concerned with how their team did recently and how it will do going forward. But in the long term, we might start to see consistent patterns, like Mets fans always being a little less happy than their W-L would suggest, and so on.
Which zone? The one at 2-0 or at 0-2? The one with Lucroy behind the plate, or the one with Pinto? The one that's for night games with close scores on the road with a low-strike happy umpire, or the one for day games in Milwaukee against a veteran pitcher?
What I'm getting at here is something I've argued before: the idea of there being any single thing that is "the strike zone" is incorrect. Better, at least in my humble opinion, to average over all of the zones that exist when showing the effect of someone like Lucroy. In this way, we avoid the problem of picking some arbitrary threshold that may or may not reflect the actual zone, which, as we know, moves, grows, shrinks, and changes shape depending on many factors.
The other thing is that this ought to affect more than just that fraction of pitches which paint the black. In fact, pitches well inside are sometimes called strikes, and I would bet that they are more often called so when a good framer is receiving than a poor one. So the batter can't rely on his framing knowledge only when the pitch is doubtful, he has to be thinking about it always.
I've never seen such a study. I can definitely buy into the idea that the ump would compensate for one good framer by being lax on the other framer. I think that would be a really cool line of inquiry, actually.
In my models, catcher framing (in aggregate) exercises a significantly larger effect on called strike probability than the umpire. Which is not to say that the effect of the umpire is negligible, but don't underestimate the impact of a good or bad receiver.
filing this under 'future article ideas'.
On these results, I suspect the effect would be small. The Brewers might be targeting that area more, but they can't force batters to swing. Lucroy can force them, by making the probability of a strike there much higher.
With that said, I wonder if perhaps the Brewers calibrated their down-and-away philosophy to take advantage of Lucroy's particularly excellent skill. Generally, I think looking at how location/pitch type choices are affected by the framer seems like a good area for more research, one that we have been talking about.
As to whether the strategy might influence Lucroy's numbers, I will leave that to the framing experts. My guess is that it might affect his cumulative numbers, by giving him more chances to show off his framing skill, but I don't think it would affect his rate numbers, like the one I cited above.
That's an excellent point, i.e. that some (but I suspect not all) of this effect is already incorporated into the existing model. An important caveat.
I like the LWTS idea, that seems like a good way to go.
The PITCHf/x data is helpful because it can diagnose recoveries (and, to a lesser extent, injuries). So, it can tell when a player who has suffered from a lingering injury is healed, or, sometimes when a player is suffering from a lingering injury that hasn't been announced yet (like with Jedd Gyorko).
The difference between models is statistically significant. The model which incorporates PITCHf/x drops the prediction error pretty substantially.
If you want to get into the nitty-gritty for why it's statistically significant: I did permutations where I randomly resampled the PITCHf/x predictor variables and reran the model many times with these random numbers. This gave me a distribution of prediction improvements which I would expect to see if the PITCHf/x numbers were pure noise. When I compared that distribution to the actual improvement of the model with the real (not permuted) PITCHf/x numbers, the real improvement exceeded the distribution of permutation improvements to a large and significant degree (p < .01). For a more academic background on what I did (which is called a permutation test), you can check out a paper like this:
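The permutation-test logic described above can be sketched in a few lines. This is only a toy illustration with fabricated data: the real analysis used the actual model and real PITCHf/x predictors, while here `prediction_improvement` is a stand-in that measures how much a simple linear fit on a candidate predictor reduces RMSE.

```python
import numpy as np

rng = np.random.default_rng(0)

def prediction_improvement(y, baseline_pred, predictor):
    """Toy stand-in for 'how much does adding this predictor cut error?'
    Fits a line to the baseline residuals and reports the RMSE reduction."""
    resid = y - baseline_pred
    coef = np.polyfit(predictor, resid, 1)            # [slope, intercept]
    adjusted = baseline_pred + np.polyval(coef, predictor)
    rmse = lambda pred: np.sqrt(np.mean((y - pred) ** 2))
    return rmse(baseline_pred) - rmse(adjusted)

# Fabricated data: outcomes partly driven by a PITCHf/x-style predictor
n = 200
x = rng.normal(size=n)                  # the candidate predictor
y = 0.5 * x + rng.normal(size=n)        # outcomes
baseline = np.zeros(n)                  # a naive baseline prediction

real_gain = prediction_improvement(y, baseline, x)

# Null distribution: shuffle the predictor, so any "gain" is pure chance
null_gains = np.array([
    prediction_improvement(y, baseline, rng.permutation(x))
    for _ in range(1000)
])

# One-sided permutation p-value: how often chance beats the real gain
p = (1 + np.sum(null_gains >= real_gain)) / (1 + len(null_gains))
print(f"real gain = {real_gain:.3f}, p = {p:.3f}")
```

The key point is that the null distribution is generated from the same model-fitting procedure, so it automatically accounts for the small improvement you always get from fitting extra parameters to noise.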
My concern is that the data is still relatively sparse, since there's only ~5 years of PITCHf/x, and several parameters (age, strike probability, injury days in two seasons). But I will give it a try. It ought to become more feasible as we gather more and more years of data.
it's true that players tend to see increasing strike probability as they age (working on the aging curve now), so that's a valid concern.
The way that it's dealt with is basically by letting the model sort it out, and what it ends up doing is taking an age-related decline as the baseline, and looking for changes more dramatic than the average year-to-year decline as symptoms of injury.
This is happening, for sure, possibly next week. I'm going to integrate the fastball frequency changes with the zone distance changes and come up with a list of possible over/underachievers for next year.
I will get in touch. Split-finger shouldn't matter much either way because it's not a very large portion of all of the pitches thrown.
Yes, definitely fantasy implications too, although I am no fantasy expert. As I mentioned in the article though, I think the best fantasy analysts (like our own crew here at BP) already know about how projection uncertainty can vary, and they already talk in those terms (for example, mentioning upside and downside risks, etc.). But certainly some analysts, fantasy-focused and otherwise, do not carefully take into account the spread around a projection, concentrating only on the mean or median outcome.
Thanks for reading.
Yeah, these are both very good points. It's especially interesting in that now, as you mentioned, some of the very rich teams seem to be consciously trying the "buy variance" strategy (in particular, the Yankees, Red Sox, and Dodgers). I suspect that this has to do partly with the saberification of baseball, and it will be interesting to see what tactics the small market teams, like the post-Friedman Rays, come up with to adjust to the situation.
There's so much blame to go around for the steroid era. The players are the most responsible, but let's not forget that nearly the entire apparatus of baseball ignored the problem, going all the way from the front offices to the commissioner himself. No one wanted to confront the issue until it was long past too late.
And even the writers *who are now voting for or against the HoF candidates* are partially responsible, for failing to report on what was happening. They, too, are complicit (some of them, anyway).
Not that I am condoning the steroid users; only saying that the era was the product of much more than one group's errors.
Well, it did happen in the last few years, but it didn't happen this year.
To your point, I do think that the BBWAA is modifying their voting preferences yet again. For example, this year there were 8.4 votes per ballot, comparable to last year and about two votes higher than the average during the 2000s. If that continues, the Hall will likely return to a more historically normal level of inductions per year.
To clarify, I meant in that sentence "obvious" as in "easily perceived", not that Smoltz's case was _obviously_ worthy of induction. The reasons behind John Smoltz's HoF case may be stupid--I am willing to buy arguments against him, like Ben's--but he WAS an obvious candidate for the 1st or 2nd ballot, according to the established patterns of the HoF voting (and many people predicted him as such, even before the ballots started getting tallied).
With respect, your statement is not true. We have an absolute standard to compare against--the players' true performances--and our desire is to come as close as possible to minimizing the distance between our projections and the actual performances, as they occurred. Distance here could be formalized by RMSE or some other measure, but regardless, it is not necessary for us to know other algorithms' projection accuracies to improve our own.
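To make that distance measure concrete, here is a minimal RMSE sketch. The projected and actual OPS figures are invented for illustration; the point is only that the comparison is against the players' true performances, not against other systems.

```python
import numpy as np

# Hypothetical projected vs. actual OPS for five players (invented numbers)
projected = np.array([0.750, 0.810, 0.690, 0.720, 0.800])
actual    = np.array([0.770, 0.780, 0.710, 0.650, 0.820])

# Root-mean-square error: the "distance" between projections and reality
rmse = np.sqrt(np.mean((projected - actual) ** 2))
print(f"RMSE = {rmse:.4f}")   # → RMSE = 0.0374
```

Minimizing this number is a well-defined goal on its own: no knowledge of competing systems' errors is needed to tell whether a change to PECOTA moved it down.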
Thank you for your recommendations regarding other ways PECOTA could be improved. Those are certainly significant and complex issues, each of which would require substantial reworking (and in some cases, for example identifying pitchers who serially under/overperform expected BABIPs, a better understanding of the underlying sabermetrics). I will note that working to improve one aspect of PECOTA does not preclude progress on these other, obviously worthwhile projects.
MGL, in lieu of a PA threshold, what would you suggest doing to handle the random variability problem?
Ah, I understand. I wasn't aware of MGL's results. I'll check on the magnitude of the effect (i.e. how the RMSEs differ between the flat and variable players).
Good points. I actually did do rank-based (i.e. percentile) comparisons after I discovered that all of the systems were overrating the league-average offense to a significant degree, but I didn't report them (will next time). I think your idea of weighting the tails to a greater degree is an important one, and sort of what I was trying to get at with the tail of the prediction error phenomenon. Guys who over- or underperform their listed OPS by huge amounts will usually go from somewhere in the middle to one extreme tail or the other, which could have a huge impact on, for example, your fantasy team.
Re: evo34, I certainly wouldn't recommend trying to win your fantasy league without looking up some projections first. That would be foolhardy. It sounds like you are looking for a solid comparison between projection algorithms, which I think is a very worthwhile endeavor, but as I noted, shouldn't be done by me. I'm more focused on whether and how to improve the performance of PECOTA. The point of this article, and the "little better than random" factoid at the end, was to point out that there *is* substantial room for improvement, and the most fruitful place I see to pursue that improvement is in terms of identifying some of the major breakouts that happen every year, as they contribute a disproportionate amount of forecasting error to the total.
I could if I knew who the flat players were going to be in advance. But I don't, not yet at least. Flat players are easy to identify *in retrospect*, but that's not very useful or valuable.
I have some ideas about how to identify the flat vs. variable players based on PECOTA's percentile projections, and I'll write up the results of that soon.
That's up next, as I gather more years of data. There's no obvious age pattern so far. I didn't check batted balls, but I will (and also other PITCHf/x information).
Ah, cool. Yes, everything stays about the same, but it re-orders the top 10 a bit so that it makes a little more sense (Giancarlo, McCutchen are higher). More significantly, it raises the correlation between Marginal Runs and TAv by .15, so that it's r=.87. (Not that TAv should necessarily be the gold standard, but I would expect any good offensive metric to agree pretty well with it.)
yep, I think you are totally correct (mad I didn't think of it myself). Should be easy to do via the attached spreadsheet, but I'll try to update it at some point as well.
Thanks. Very true about the different values of some strikes. The metric takes into account the coordinates of each pitch in attempting to determine its value (so a strike further away from the center is rated as worse to swing at, and a ball close to the edge of the zone is rated as better to swing at). With that said, this measure does NOT take into account individualized hitter "hot zones". That's very difficult to do, and probably best accomplished with multi-year data, as Jared Cross pointed out in a recent piece (http://www.hardballtimes.com/bringing-the-heat/).
Thank you. Yeah, in theory it would be possible to compute for anyone who saw a pitch, but I think it would get inaccurate for players with few pitches, because we don't know how well they'd do if they swung at the ball (which we have to estimate). Could always substitute league average.
"The "value" of a take is certainly not the run average value of the result. It is the difference between that and the run value of a swing (at that pitch on that count)."
That (the marginal value of a take, relative to a swing) is the goal; this article is a step towards that goal.
Thank you, Jon, and that sounds great, I am looking forward to reading it.
Jones seems to be another one of those guys (like Sandoval, who I discussed last time) who swings too much, but also swings much more at stuff in the zone than outside it. Also like Sandoval, he sees many fewer strikes than average, so the high baseline rate of swinging is to his detriment.
Re: the numbers being low, I should have noted in the article that I graphed the model's output for each hitter as though it was a 0-0 count, i.e. the first pitch of each at-bat. Hitters are much more conservative on this first pitch than average, hence the swing rates look low.
In the model itself, I included count as a factor, so we should be seeing for each hitter their response to strike probability, as opposed to their unique tendencies with regards to the count.
Hmm, I'll check on those numbers.
Yep, I linked to that piece in the body of the text. It is similar in spirit to this research, but different in terms of the methodological details (e.g. it has a uniform, but probabilistic, zone).
It is perhaps debatable, but the fairest comparison I could think of was to compare the Giants' performance against high-velocity fastballs with their performance against all fastballs, rather than against the league-average rates.
That is to say: everyone does worse against 95+mph heaters, but how much worse do they do relative to their normal performance? My conclusion being that while the Giants whiff more, the magnitude of their performance deficit against 95+mph stuff is less than what it is for most players. Getting in the middle of that narrative is the fact that the Giants tend to swing and miss more on average, for whatever reason (even though they are an above-average hitting team).
March through October, yes. I included spring training data in there, although it doesn't affect the trend if you leave it out.
Could be, but when I bring umpire id into the model, the distribution is still over-dispersed, suggesting that it's not just bad umpires screwing things up.
"If you define "accurate" as the de fact zone, such that the average umpire is accurate by definition, then you should find that accuracy or at least a deviation in accuracy from the de facto zone, has no effect on K rate or any other offensive component, on the average."
Yes, to be clear, I am using the de facto zone, or at least trying to do so, accounting for things like the expansion in certain ball-strike counts and pitch framing. So it is the latter case, where I think we should not expect to see a difference in K rate, but we actually do.
In the case of my model, I use the hitter's height as one of the inputs. This improves accuracy to a small degree, suggesting that the umpires are accounting for the hitter's unique strike zone.
You could build something similar in to a machine-called zone by inferring the position of the hitter's shoulders and knees from the video. The cameras run at 60 Hz, so they could adjust for the hitter's zone at the precise moment the pitch was released.
However, to be clear, I am not advocating for a machine-called zone, bhacking is. It's certainly interesting to ponder though, and I think, were pitch-calling-by-computer to be implemented, it might look something like the machine learning algorithm I used for this piece.
That's a great point. One of the things I'm planning to do to follow up is to use a continuous measure of accuracy which penalizes umpires less for the edge cases than for the glaringly obvious missed calls. Hopefully that should correct for this potential problem. In the meantime, I'll look and see whether that would explain what's happening in the "least-accurate" games.
Agreed, although I would note that in the model as I built it, if a particular ump had a different version of the zone, that would go in the category of "bad ump'ing" since he's calling it in a way which is inconsistent with other umpires.
I debated whether to include the umpire ID in the model, since players might be aware of the umpire's particular proclivities, but decided that it was too minor of an effect for them to adjust their strategy. That's something I should look into in more detail, though.
yep, that's what I think too (and what I will look into next). Maybe the plate discipline guys are generally less consistent as a result, since they have to adjust more to account for game-to-game variability in the called strike zone.
"Good article, I'd be interested to hear more about this: 'PITCHf/x is not perfect either—the system has a margin of error as well.'"
Thanks, and sure. I was referring to the random measurement error inherent in the system, i.e. the difference you would see in reported location if you somehow threw the exact same pitch twice. Sportvision (makers of PITCHf/x) claims that this is about half of an inch to an inch, and that claim has been verified independently by Alan Nathan (http://www.baseball-fever.com/showthread.php?84245-Pitch-f-x-accuracy). So that's pretty good, especially considering how technically challenging the problem is.
On top of this, there's some systematic error game-to-game, which is caused by e.g. miscalibration of the cameras, on the order of ~1 inch. That kind of calibration error should be mostly removed (or at least dramatically diminished) in this analysis, thanks to correction values given to me by resident experts (in all things) Dan Brooks and Harry Pavlidis (thanks guys!).
To your question, my bet is that PITCHf/x is at least as good, if not substantially better, at recognizing pitch locations as a well-trained human. And, hypothetically, if you were to use it for actual pitch calling, you could drive the measurement error down to a very low level (either with additional cameras or orthogonal data-gathering). Here's a nice piece from Ben Lindbergh on that very subject: http://grantland.com/features/ben-lindbergh-possibility-machines-replacing-umpires/
That was not a knock on Billy Butler, who should be lauded for the steal, as well as his baseball smarts and awareness (I literally did applaud him, sitting in my chair, as it happened). That was a knock on the Los Angeles Angels of Anaheim, who gave Billy Butler a chance to steal.
I guess you could say I'm bullying the Angels, but I strongly suspect that they don't care, and/or would admit that allowing Butler to steal was pretty boneheaded.
Finally, it was relevant to the story as an unusual happening which exemplifies the randomness of the postseason. Perhaps that connection was not clear. I certainly hope I did not offend either Billy Butler or any of the Angels.
Yeah, it's a neural net. But I prefer Harry Pavlidis' brain to the MLBAM algorithm.
Caveat: I am using the MLBAM classifications for this, and to really be accurate, I'd need to get Harry's classifications. It's probably not an issue for assigning fastball vs. non-fastball, but getting into specific types can be troublesome without better data.
But, with that said: changeups up (a lot), curveballs a bit, and a marked increase in sinkers, maybe supporting your point.
I will say that sinker frequency increasing is a league-wide trend I've noticed, so I'm not sure if it's specific to Oakland, but intriguing nonetheless.
Weirdly, eephuses increased three-fold to ~.1% in the second half. Probably a classification error, but it would be cool if it was real.
There's not a big correlation between overall offensive performance by TAv/OPS and fastball percentage. But it's a reasonable suggestion to look into in more detail. Parsing fastball % by count before and after seems like a good next step (as well as looking at swing rate).
Interesting idea, I wonder if looking at swing rates might be helpful.
Possible, but not supported by the data:
first half fastball percentage, leaguewide: 55.2%
second half fastball percentage, leaguewide: 55.5%
I looked at a few other teams and no, none of them saw drops nearly as dramatic as Oakland.
I don't have much experience with that, but I'll give it a try.
The trade is around pitch 250, so about 1/6th of the way across the graph.
It's possible that the Pirates are systematically getting pitched further away, but I don't see a lot of support for it in the data. It might just be that you're noticing higher distances for guys like McCutchen, Alvarez, Harrison, Marte, and Walker, but then again, those guys are all pretty good hitters, so I think higher zone distance is to be expected.
Hey Shaun, thank you! Glad you are enjoying.
I had the same idea recently about hot streaks. The barrier, as you say, is the stabilization rate of distance metrics. But, it seems like it might be possible somehow.
Excellent question. Between this year and last, V-Mart's zone distance against righties went up only marginally, from 1.172 to 1.177. Against LHP, on the other hand, it shot up from 1.1 to 1.17 feet, which is the better part of an inch. So much of the difference in zone distance comes from lefties.
I think you are onto something here. Initially when I looked at zone distance, I found that LHP and RHP for a given batter had fairly correlated zone distances, but probably there's more to it than that for individual hitters, especially when they have large pre-existing platoon splits.
Yes, the pitching changes are another significant issue with pace, one that I would also like to see fixed. Agreed that they are often obnoxious in how they interrupt the drama of the game. However, at-bat time has been rising with the time between pitches as well, even for SP, as Russell Carleton has observed here:
Yes, what Geoff said. I don't watch football, albeit not for that specific reason.
I would also note that there is some substantial activity between plays in football; substitutions are made, formations are being set, linebackers are running hither and thither, and Peyton Manning is barking "OMAHA" repeatedly for no apparent reason.
Of course, we disagree, but that's perfectly fine. I don't get drowsy at 10:15, so I don't mind the late end times much at all. I also couldn't hang out at a bar every time I want to watch a game, but that's just my life--yours may be different. I think lots of people watch baseball from home, though.
Thank you. You are quite right. Like you said, though, the impact should be minimal.
The only reason was a practical one: I couldn't figure out an easy way to calculate dWAR(P) without taking errors into account, so I just left them in. If anything, it would push the model to be less accurate (greater RMSE).
Yes, you are right that it is regressed (see here: http://www.baseballprospectus.com/article.php?articleid=11589). That's a good idea.
Since all teams play the same number of games (and roughly the same number of innings) over a full season, the difference between a rate stat (ERA) and a counting stat (earned runs allowed over the year) at the team level will be a simple multiplier which should be similar for all teams. In other words, it shouldn't make a difference.
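The "simple multiplier" point can be shown in two lines with made-up numbers. Since ERA = earned runs × 9 / IP, teams with (roughly) equal innings pitched all share the same conversion factor between the rate and the count:

```python
# Two hypothetical teams with identical innings pitched (invented numbers)
teams = {"A": {"era": 3.50, "ip": 1458.0},
         "B": {"era": 4.20, "ip": 1458.0}}

multipliers = {}
for name, t in teams.items():
    er = t["era"] * t["ip"] / 9          # counting stat derived from the rate stat
    multipliers[name] = er / t["era"]    # = ip / 9, identical for both teams

print(multipliers)   # both teams share the same multiplier, ip/9 = 162
```

Because the multiplier is (nearly) constant across teams, a regression on ERA and one on season earned runs differ only by a rescaling of the coefficients.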
I figured that errors are the easy to understand and calculate part of defense, and I wanted to show that even after we subtract them from the problem, defensive metrics still have value.
Dan Brooks mentioned that he wasn't clear how the model was built and tested, so to remedy that, the model was:
(team ERA in year i) ~ ((pitcher WAR(P) in year i) + (defensive WAR(P) in year i))
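As a sketch of how that specification might be fit: the article doesn't say which tool was used, and the numbers below are invented team-seasons, so this is just ordinary least squares on the stated formula, not the actual analysis.

```python
import numpy as np

# Invented team-season data: pitcher WARP, defensive WARP, and team ERA
pitcher_warp = np.array([15.0, 10.2, 18.4,  8.9, 12.5, 20.1,  9.5, 14.0])
defense_warp = np.array([ 3.1, -1.2,  2.4,  0.5, -0.8,  4.0,  1.1,  2.2])
team_era     = np.array([3.60, 4.30, 3.40, 4.50, 4.10, 3.20, 4.40, 3.80])

# OLS for: team_era ~ pitcher_warp + defense_warp (plus an intercept)
X = np.column_stack([np.ones_like(pitcher_warp), pitcher_warp, defense_warp])
coefs, *_ = np.linalg.lstsq(X, team_era, rcond=None)
intercept, b_pitch, b_def = coefs
print(f"ERA ≈ {intercept:.2f} + {b_pitch:.3f}*pWARP + {b_def:.3f}*dWARP")
```

Testing then amounts to checking whether the model including defensive WARP predicts held-out team ERA better than a model with pitcher WARP alone.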
To be clear, I'm only using regression to account for the greater uncertainty of defensive metrics. The use case you are talking about is perhaps to estimate a player's true talent level--for instance for making a projection. But, as you note, for the purposes of MVP discussions we do not care so much about a player's true talent level as what they actually did on the field. The problem is that "what they actually did on the field" is possibly more uncertain for a player's defensive contributions than for their offensive contributions; hence the attempted use of a regression factor on their defensive contributions.
Unless you think that measurement uncertainty varies by player (which is entirely possible, but beyond the scope of this article), it would not be appropriate to apply a different regression factor to each player. You would apply a different factor when trying to estimate true talent levels, since they clearly do differ by player, but as I mentioned, that is a slightly different problem.
"Are you using Audacity on the video files directly from MLB, or do you use some other method to record audio for analysis? If the former, what is the sampling rate and bitrate of the files you use?"
The former. I think that the sampling rate has been consistently 44100 Hz; this is the standard for most broadcasts, right? Bitrate is 1411 kbps.
"What was the highest peak frequency you saw? I tried to replicate this with Javier Baez's last home run (August 18, 9th inning, 8:36.67 into the condensed game) and got 2.8 kHz. I suspect some operator error, though."
Yep, that's the same as what I got. Wow, was that loud. That's the highest peak frequency I've seen, but that's also the hardest hit ball I've seen in some time, so maybe that's alright.
"How did you decide how large to make the window of analysis around each bat crack?"
This was arbitrary; I basically recorded about two seconds up and downstream of each hit, and then manually selected from that a little bit immediately around the spike of the hit (until the vibrations visibly died down). Eventually, I settled into a pretty consistent rhythm of getting about a tenth of a second on either side.
Thanks, Rocco--the feeling is mutual.
"I'm going to guess that borderline pitches with more audible pop are called strikes more often than similar pitches without it. I'd attribute this to the catcher's framing skill, but maybe umps are using the sound to help make their calls."
Great idea! I will for sure give that a try. And who knows, maybe part of catcher framing skill is creating the audible pop by moving the glove in certain ways (maybe snapping it shut augments the pop?).
"To Robert: Did you try to filter out the crowd noise in the background at all?"
Not really--if you examine the second picture in the piece, you can see that immediately prior to the bat crack, crowd noise is close to negligible, both to my ear and relative to the great volume of the crack. To clarify, I selected the tenth of a second or so immediately surrounding that peak, so whatever crowd noise there was would have to be in that region.
Ah, I misunderstood.
512 = 2^9
In seewave I think you'll need to specify the frequency of sampling (44100 Hz) and also the size of the Fourier Transform (512 is what I used). That should get you the same results. I will post my R code and the individual samples at some point, perhaps on my blog.
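For anyone trying to reproduce the analysis outside of Audacity or seewave, the core computation is a windowed FFT over the clip with the parameters given above (44100 Hz sampling, 512-point transform), then reading off the frequency of the largest spectral peak. Here's a minimal Python sketch of that idea (the author's actual code is in R; the function name and Hann windowing choice are mine):

```python
import numpy as np

def peak_frequency(samples, rate=44100, fft_size=512):
    """Frequency (Hz) of the largest peak in the averaged magnitude
    spectrum, using non-overlapping Hann-windowed FFTs of size fft_size."""
    spectrum = np.zeros(fft_size // 2 + 1)
    window = np.hanning(fft_size)
    for i in range(len(samples) // fft_size):
        chunk = samples[i * fft_size:(i + 1) * fft_size]
        spectrum += np.abs(np.fft.rfft(chunk * window))
    # Map FFT bins back to frequencies in Hz.
    freqs = np.fft.rfftfreq(fft_size, d=1.0 / rate)
    return freqs[int(np.argmax(spectrum))]
```

Note that a 512-point transform at 44100 Hz gives bins about 86 Hz wide, which bounds the precision of any peak-frequency estimate.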
"Next topic - does the MLB.tv compressed game audio ever pick up the sound of pitches hitting the catchers mitt?"
Yep! I am going to analyze that too. In the audio feed, it looks like a mini-bat crack (a bit longer and less loud), but the frequencies are very different.
Under "Analyze", I used "Plot Spectrum". I also used an R package, seewave, to redo the analysis and make sure I got the same result. It looks like the audio files are in the right spots, and when I just did the analysis from sound recorded from the website, I got the same frequencies I did before. Maybe you just recorded the wrong audio file? Groundouts do show a peak at ~500 Hz.
The speed of the pitch will act to increase or decrease the relative velocity of the collision. So assuming the hitter swings at some constant speed, a fastball will produce a more violent collision (with higher peak frequency) than a curve. I'm not sure about the magnitude of that effect, though.
With regard to the size of the bat, I do not know. Maybe Alan can weigh in on this.
Thanks, glad to hear that.
Yeah, it's definitely possible Sveum only meant that to apply to a few guys, or that he could only get through to a few guys. A couple things though: although Hosmer and Moustakas increased their swing heights, it wasn't by very much, especially compared to the change between years (2013 -> 2014). Granted, I don't know how much higher Sveum wanted them to swing, but since both are doing poorly relative to last year's performances, I would think Sveum would want them to swing a lot higher, rather than only a little bit higher.
Second, I think the failure to get Billy Butler, in particular, to buy in is a pretty substantial one. Butler has had a serious loss of swing height over the years that has manifested in tons of grounders. Moustakas and Hosmer and others are doing poorly, but Butler is undershooting his projected TAv by an immense 40 points, and, because he is Billy Butler, he can't contribute value in any other way. So I think that, whatever Sveum's list of people with swing height problems was, Butler had to be at the top of that list, and he hasn't changed his behavior at all.
Oh, that's a good one.
Totally plausible, I think. I'll take a look.
Thanks, man. I'm not planning on going anywhere any time soon.
Springer's pretty good, and obviously a power hitter, but he's also incredibly patient, with a zone distance of 1.02 and a swing distance of .745 (in the bottom 15 in MLB). I wrote about his hyper-patient approach a while back:
Singleton is also solid: zone distance 1.08, swing distance .82. I would expect both to be OK, although they also have top 10 swinging strike rates, which scares me a little.
Position is probably mostly irrelevant. Good hitters scare pitchers regardless of what position they play, and keep good swing discipline regardless of their position. However, it's plausible that catchers might develop more slowly or something like that--something to look into.
Good point: d'Arnaud's demotion and subsequent return happens around pitch ~325 in the above chart (roughly the middle of it). If anything, his zone distance has decreased since then--.99 before the demotion, .94 after. He is seeing much better results, though, which will maybe persuade pitchers to stop attacking the zone so vigorously.
Thanks (for both your compliments and feedback), you are dead-on about the color progressions.
You are absolutely right, Truthteller. There's a lot I'm not accounting for, like the pitcher, the location in the zone, pitch type, the ump, and so on. Really, for this first (overall) analysis, I just wanted to get a bird's eye view of swing rates by count and how they vary. In the future, I will be thinking about how to put all that stuff into a model to pull out how aggressive/patient individual hitters are.
Indeed, that was the idea--sorry if it ended up being unclear.
You are right, which is (part of) why I wasn't surprised that they did vary more (of course they would!). There's no way to get rid of this effect entirely, but I did weight the differences in swing rate by the frequency of each count, so that a 50% change on 3-0 is not as significant because 3-0 counts are rare overall.
However, another reason they vary more is because of the offsetting changes that I talked about--hitter increases his rate on 0-0 counts, but decreases it almost as much on 0-1+1-0 counts, so the total change gets diminished.
Actually, come to think of it, I suppose I could have gotten the 'expected' amount that count-adjusted swing rates would vary by random sampling, and then compared that to the actual differences. For next time.
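The weighting described above can be sketched in a few lines. This is a Python illustration (the author works in R), and the dictionary structure and function name are hypothetical, but it shows how a big swing-rate change in a rare count contributes little to the weighted total:

```python
def weighted_swing_change(rates_a, rates_b, count_freq):
    """Average absolute change in swing rate across counts, weighted by
    how often each count occurs (so rare counts like 3-0 matter less)."""
    total = sum(count_freq.values())
    return sum(
        abs(rates_a[c] - rates_b[c]) * (count_freq[c] / total)
        for c in count_freq
    )
```

With this weighting, a 50-percentage-point change on 3-0 counts that make up 5% of pitches moves the overall number by only 2.5 points.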
I cringed--not because of anything related to the play itself, or whether it would result in the winning run crossing the plate--but because it looked like Rizzo could have suffered a terrible injury doing that.
(Performance in Year Y) = (PECOTA Projected Performance) + (Zone Distance Trend in year Y-1)
By the way, minor note: I am using the actual distances from the center of the zone (based on the Pitchf/x px and pz coordinates) rather than the more commonly used Zone% statistic that shows up in, e.g., BP's Plate Discipline section.
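For concreteness, the distance-from-center calculation described above looks something like the following. This is a Python sketch, not the author's R code; the assumption that the zone center sits at px = 0 and at the midpoint of the top and bottom zone boundaries (with illustrative default boundaries) is mine:

```python
import math

def zone_distance(px, pz, sz_top=3.5, sz_bot=1.5):
    """Euclidean distance (feet) of a pitch from the assumed center of
    the strike zone. px is horizontal location and pz vertical location
    from Pitchf/x, both in feet; sz_top/sz_bot are the zone boundaries."""
    center_z = (sz_top + sz_bot) / 2.0
    return math.hypot(px, pz - center_z)
```

A pitch right down the middle gives 0; averaging this over all pitches seen (or all pitches swung at) gives the zone-distance and swing-distance stats used in the article.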
BP keeps a database of Pitchf/x data, which I started using recently. Before that, I used an R package called PitchRx, which works well. And still another source of data would be Darrell Zimmerman's database at: http://www.baseballheatmaps.com/pitch-fx-download/
But also, I'm happy to take suggestions if people can find interesting changes in approach after hitting coach hirings/firings. Magadan seems like a good candidate.
I looked into that in a limited, simple way, and although KC's pitchf/x readings are a little bit more downward biased this year according to my model (indicating a slight recalibration effect), it isn't nearly enough to explain the year-to-year difference in swing height for most of the Royals hitters.
Good point--I'll update the tables to be in inches.
Between years, e.g. 2012-2013, both distances have R² values of around .72. But there's actually a lot of variation even during a year, with pitchers trying to outsmart hitters and the hitters adjusting appropriately (I wrote about that, with reference to Yasiel Puig, Jose Abreu, and George Springer, here: http://www.baseballprospectus.com/article.php?articleid=23846). And sometimes trends during a year predict improvements the next year (wrote about that here: http://www.baseballprospectus.com/article.php?articleid=23550). So even though the stats are mostly stable, changes in them during a year can be useful for forecasting.
I have played around somewhat with the ratio, but it isn't as good a predictor of hitting ability (via TAv) or of future changes in hitting skill as the two distances separately. I think that's partially because different combinations of swing/zone distance can mean different things.
For example, while early Springer would have had an amazing swing/zone distance ratio (on par with some of the best hitters), he was clearly being too selective in his approach. That's an extreme example, but I do think that the relationship between zone and swing distance is a little more complex than I initially supposed.
Thank you. Re: auto-generating distance maps, we are currently working on it.
I can name names, just didn't want the piece to get too long. Some notables: Carl Crawford (he just went on the DL, actually), Ryan Howard, Matt Kemp, Peter Bourjos, Brett Lawrie. It would have called Franklin Gutierrez's absence (he was #1). Troy Tulowitzki is 20th, with ~20 predicted DL days. Basically, the people who've had the most injury problems in the last few years.
All R, all the time.
Not all Rockies hitters are being given the same treatment this year. Some aren't being avoided as much, despite the outstanding team results. I'm planning to write more about this soon.
Yeah, for sure. I did a piece on Albert Pujols a couple of weeks ago, and Jeter would make another interesting study.
The correlation in 2013 between TAv and zone distance change is only ~.1 (and non-significant). So, yeah, it's not necessarily true that a good hitter is going to be more and more avoided as he's having a good year.
"Any response to TroJim's objection that the prior year increases in zone trend are simply due to pitchers adjusting to hitters having good years?"
That's certainly not necessarily the case. Check out Albert Pujols's graph above: in his best year (and one of the best years anyone has ever had!), his zone distance was trending downward. You see that with plenty of hitters having good years, especially when BABIP-aided. With that said, Pujols is only anecdotal evidence, so I'll look at the league-wide trend and get back to you with some gory math (either in the comments or in an article).
It sounds like maybe you are objecting to my usage of the word 'breakout', which I am defining as "a significant upward deviation from PECOTA's forecast". That's fair--you may consider a breakout to mean something else (like a young player suddenly becoming great), and as I noted, plenty of the 'breakouts' are older players failing to get worse as we would expect. Still, I would say that there is a lot of utility in figuring out which players (e.g. David Ortiz) will outperform PECOTA. PECOTA is pretty accurate overall, and when it suggests an older player is going to take a big drop in performance, more often than not it is correct. Also, there are some examples of young players breaking out, like Chris Davis and Anthony Rendon.
Thanks, now fixed.
That's a cool idea, I hadn't thought of it. There are some technical issues that might need to be addressed, like how the PitchF/X technology has changed, as well as how the strike zone has varied, but I'll give it a shot.
I will note that I looked at a few other hitters, both good and bad, over the same time span as a sort of ad hoc baseline. The trends were as you might expect--Miguel Cabrera started seeing pitches further away from the zone after his wonderful couple of years, Matt Holliday was pretty steady, and so on.
There are 162 slightly different models, one per game number. As you might imagine, the weight on PECOTA decreases as the season goes on, and the weight on RS/G increases. As I said below, I'm going to look into this again and try to get a simple formulation for how much to weight PECOTA per game number. The intention here wasn't to maximize accuracy so much as to illustrate the overall trends.
I use a different set of weights for each game number.
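To illustrate the shape of that weighting, here is a deliberately simplified Python sketch. The linear decay toward ~game 100 is a hypothetical stand-in for the 162 separately fitted regressions described above, not the actual fitted weights:

```python
def blended_rs_per_game(pecota_rs, observed_rs, games_played,
                        crossover=100):
    """Illustrative blend of a preseason projection with observed RS/G.
    The weight on PECOTA falls linearly with games played, hitting zero
    around game `crossover` (a hypothetical shape; the article instead
    fits a separate regression for each game number)."""
    w_pecota = max(0.0, 1.0 - games_played / crossover)
    return w_pecota * pecota_rs + (1.0 - w_pecota) * observed_rs
```

Early in the season the blend leans almost entirely on PECOTA; by midseason the two inputs get roughly equal say.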
A lot of good questions and suggestions here, and in the comments above (as usual). I will look into some of these for the next article.
Yes, I ran it with the updated predictions.
See above. But I agree; the tendency to underpredict RS/G is a little weird. I'm fairly certain that the numbers are correct and consistent with my method, but perhaps I need to re-examine the method with some external checks as you suggest.
I think that this is a feature, not a bug. Why? Because teams generally scored fewer runs per game later in the season than earlier in the season, reducing the final RS/G number (or at least they did in the years I looked at [2012/2013]). So the model is systematically underpredicting the runs/game to account for that.
Yes. All of this is based on PECOTA's depth chart projections, which assume that player X will get N plate appearances with his current team. There's no accounting for what would happen if that player gets traded. That's probably a source of some inaccuracy, given trading deadline dynamics (good teams tend to buy, bad teams tend to sell).
Mathematically, a linear model has the variables in it (in this case, RS so far and PECOTA projected RS), and also an intercept. So if the intercept for a particular model is say -.1, and the RS and PECOTA numbers are each pointing towards 4.2 RS, the model might spit out 4.1, because of that intercept.
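The intercept arithmetic above can be made concrete with a toy model. The coefficients here are hypothetical (chosen only so the two inputs split the weight evenly), but the mechanism is exactly as described: both inputs point to 4.2, and the intercept pulls the output down to 4.1:

```python
def toy_linear_model(rs_so_far, pecota_rs, b1=0.5, b2=0.5,
                     intercept=-0.1):
    """Toy linear model with made-up coefficients, illustrating how the
    intercept shifts the prediction even when both inputs agree."""
    return b1 * rs_so_far + b2 * pecota_rs + intercept
```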
Generally, I would take the above numbers with a grain of salt. They are provided for illustrative purposes, rather than as definitive predictions. The point of this article was to show how quickly RS/RA stabilized, and to demonstrate that preseason predictions still carry some weight (and will until ~ game 100). If people are interested in maximally accurate predictions, maybe I can do a follow-up with some more sophisticated models.
(With that said, it's also possible I just made a mistake in entering the numbers in the table. I'll go back and check to make sure.)
I will definitely look at how high-ISO hitters do against different pitch types. League-wide, there doesn't seem to be any major difference in ISO between fastballs and non-fastballs, which is curious.
Thanks for the correction.
Well, I think Scutaro has always had a high contact rate, pretty much throughout his career, so I'd feel comfortable attributing at least some of his contact rate to his intrinsic ability.
With that said, I completely agree with the broader point you are making. Disentangling cause and effect is difficult with these kinds of correlations.
One way to do that, I hope, will be to look at the history of a batter's fastball rate as it evolves over time. For instance, say a certain hitter loses power dramatically because of a hidden injury. This hitter was traditionally thrown very few fastballs. Does his fastball rate spike as the league catches on to his now-reduced power? And does his contact rate increase correspondingly, and if so, by how much? I'm working on this kind of analysis now.
Thanks for the feedback.
In terms of the model, the equation is:
Velocity in full year = .97 * velocity in previous year + .56 * (April [this year] velocity - April [of previous year] velocity) + intercept (2.8, in this case)
So, the difference in April velocities is worth roughly 1/2 as much as the previous year's velocity.
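The equation above translates directly into code. This Python version just restates the comment's formula (the author's analysis is in R); the function name is mine, and the 2.8 intercept is specific to this particular fit:

```python
def project_full_year_velocity(prev_year, april_this, april_prev,
                               intercept=2.8):
    """Full-year velocity projection, per the comment above:
    .97 * previous-year velocity
    + .56 * (this April's velocity - last April's velocity)
    + intercept."""
    return 0.97 * prev_year + 0.56 * (april_this - april_prev) + intercept
```

So a pitcher who averaged 92.0 mph last year and is up 1.5 mph this April versus last April projects to roughly 92.9 mph for the full season.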
McCarthy juuuust missed the list: he's right after Dillon Gee, with about a .5 mph predicted increase. He's also shifted his pitch usage a lot, which is intriguing.
To clarify, though, I only included the 2013 velocity in the model. I could also bring in data from earlier pitcher seasons, but didn't for this particular analysis. I suspect it would prove largely redundant with the 2013 data (given how consistent velocity usually is year-to-year), but it might contribute a small increase in predictive power.
Thanks! I should have linked to your articles, they are nicely complementary.
That sounds like a job for an intern if ever there was one.
Yes, this is true, hence a large portion of the MLB population would technically be considered obese by some BMI charts. However, all things considered, I would expect that on average players with higher BMI are more likely to have higher body fat % than players with lower BMI.
Still, I agree it would be much better to have that data (% body fat) directly, rather than a noisy, biased correlate of it.
Yeah, nothing significant or clear showed up.
"Sorry if this is off-point."
Not at all, I'm glad it provokes thought. Indeed, that was sort of the point.
Interesting idea, I'll give it a try.
Good suggestions. I like the classifications you suggested with regards to entropy and success.
Thanks for the kind words. You are right about the change in player valuation post-free agency--I think that's worth looking into. I definitely will be integrating player quality & playing time into future analysis.
You are absolutely right--on the pages of this very site: http://www.baseballprospectus.com/article.php?articleid=20046. And I was wrong to cite it. But I still like Anthony Rizzo.
Good point. Hornsby fooled me with all the post-30 WAR, and I failed to notice that it was all concentrated in a few years. Eddie Collins is a much better example, since he played productively until age 40.
There is indeed a significant trend over time, that'll be in the next piece on this subject (hopefully).
That piece is one of my all-time favorite articles on the subject.
SaberTJ, I think that's a great point. Are you thinking of a pitcher pumping a fastball past a batter, noticing that the batter wasn't able to catch up with it, and then just continuing with that fastball? Seems plausible. Tough to look at without SwingF/X (if that is a thing that will ever exist), but maybe I can look at whether swinging strikes affect future pitches differently relative to called strikes.
Hey Scott, somehow Cingrani missed inclusion in the sample, even though he meets the 100 IP cutoff I used. I'll get back to you when I've got his data processed.
I understand Russell's desire to keep it simple, but I too wonder whether and how context might be an important factor.
Maybe taking into account context would help to separate the adaptive hitters, who are presumably better because they adjust to the pitcher and/or the game state (say, Mike Trout, Evan Longoria), from the no-discipline hitters, who are just wildly varying their tendencies in an effort to fix whatever ails them (Ryan Howard, Dan Uggla).
"You should either have a model with entropy and number of pitches and their interaction."
I'm still a little lost, because you said "either" but there was no "or". So, to recap, I fit the following models:
-pitch types + evenness + interaction
-logarithm of pitch types + logarithm of evenness
All had positive effects of entropy or its components on K/9.
It doesn't make a lot of sense to me to fit entropy and the number of pitch types in the same model (as I think you suggested), because the number of pitch types enters into the calculation of entropy, so the two are VERY highly correlated. That causes multicollinearity.
While evenness and number of pitch types are also (negatively) correlated, it is to a much smaller degree.
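To make the decomposition being argued about here explicit: assuming "entropy" means the standard Shannon entropy of a pitcher's pitch-type usage rates (the comments don't spell out the formula, so that's an assumption on my part), entropy factors exactly into evenness times the log of the number of pitch types, which is why fitting entropy alongside one of its components is redundant. A Python sketch:

```python
import math

def pitch_mix_entropy(usage):
    """Shannon entropy (in nats) of pitch-type usage rates,
    e.g. [0.6, 0.25, 0.15] for a three-pitch pitcher."""
    return -sum(p * math.log(p) for p in usage if p > 0)

def evenness(usage):
    """Pielou-style evenness: entropy divided by its maximum, log(k),
    where k is the number of pitch types used. Equals 1 for a perfectly
    balanced mix, so entropy = evenness * log(k)."""
    k = sum(1 for p in usage if p > 0)
    return pitch_mix_entropy(usage) / math.log(k) if k > 1 else 0.0
```

A pitcher throwing two pitches 50/50 and one throwing four pitches 25/25/25/25 both have evenness 1, but the four-pitch mix has twice the entropy (log 4 vs. log 2).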
For sure, that's a great idea.
I'm not sure I see why evenness and pitch types are necessarily related (besides the obvious), or why exactly that would result in an artifactual interaction term. But taking your word for it, I recalculated the model with just entropy, and the logarithm of evenness + the logarithm of pitch types. In each case, the effect of entropy and its components is to increase K rate. Hopefully that ameliorates your concerns.
My mistake! I'll get that fixed ASAP. It was an excellent article, by the way.
Re: spin--similar to fastball velocity, I used the average of the 100 fastest-spinning pitches' spin rate (from Pitch F/X data).
Hey Bill, author of the article here.
Yes, there are some interesting differences in the data between lefties and righties.
The prevailing patterns related to strikeout rate, velocity, and entropy are there in both subpopulations, though, which is why I chose to pool them. In other words, although LH and RH are apples and oranges with regards to some characteristics, they seem to be all apples with regards to how entropy increases strikeout rate.
In the future, I'm definitely going to split them up and examine each separately, also looking at batter handedness as a factor.