July 9, 2012
The Blind BABIP Test: Results and Revelations
On Friday, many of you took the blind BABIP test. I gave you 18 GIFs, in nine sets of two, each set comprising two batted balls. One was a hit. The other was an out. You guessed which was which, but you couldn’t see the outcome; the GIFs cut off at the frame just as contact was made, or just before contact was made. This was supposed to tell us something. I’ll get to the big result first: We’re the worst at this!
I tallied 82 full sets of answers, which is 738 individual guesses, of which 387 were correct. That is 52 percent correct. Closing our eyes and pointing would theoretically have earned us 369 correct answers. All the wisdom of the 82 of you was worth 18 extra correct answers. So that's the big thing first.
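For anyone who wants to check how thin that 18-guess surplus is, here is a minimal sketch that compares the tally to a coin flip. It assumes every guess is an independent 50/50 trial, which is not quite true (the same 82 people answered all nine sets, and guesses within a set were clearly correlated), so treat it as a rough sanity check rather than a proper analysis. The function name and variable names are mine, made up for illustration.

```python
from math import comb

RESPONDENTS = 82       # full sets of answers tallied above
SETS_PER_TEST = 9      # nine pairs of GIFs
CORRECT = 387          # correct guesses reported above


def two_sided_p_vs_coin_flip(correct: int, total: int) -> float:
    """Exact two-sided binomial p-value against a fair coin (p = 0.5)."""
    # Measure how far the observed count sits from the chance expectation,
    # then count every outcome at least that far away on either side.
    distance = abs(correct - total / 2)
    extreme = sum(
        comb(total, k)
        for k in range(total + 1)
        if abs(k - total / 2) >= distance
    )
    return extreme / 2 ** total


total = RESPONDENTS * SETS_PER_TEST   # individual guesses
accuracy = CORRECT / total            # share of guesses that were right
chance = total / 2                    # expected correct from blind pointing
p_value = two_sided_p_vs_coin_flip(CORRECT, total)

print(f"guesses: {total}, correct: {CORRECT} ({accuracy:.1%})")
print(f"expected by chance: {chance:.0f}, surplus: {CORRECT - chance:.0f}")
print(f"two-sided p-value vs. coin flip: {p_value:.3f}")
```

Under those assumptions the p-value lands well above the usual 0.05 cutoff, which is another way of saying that 18 extra correct answers out of 738 is about what a room full of coin flippers could manage.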
Here are the correct answers:
1. The hit is on the right (73 percent of you were correct)
It is helpful to see the results broken down like this. If I just tell you that you got 52 percent correct, you might conclude that your guesses were basically random. You might think that there was nothing significant that distinguished one pitch from another, to your educated-but-amateur eyes. But in fact, as a group, you were not random, and you did not guess with 52 percent accuracy on each set. We actually spotted certain features that we thought were significant, and in some sets those features perhaps were significant. In others, we were wildly misled.
Part of the challenge of this exercise was that a lot of hits in baseball aren’t hit as well as a lot of outs in baseball. But for the most part, in these sets, the hit was the better-hit ball. In six of the nine sets, the hit was also clearly the better-hit ball of the two. In two of the sets, there’s not a clearly superior hit. (In the third set, both balls were medium grounders, but Mauer got the hit when his bounced off Jake Peavy’s glove. Mauer went up the middle with his, so I’d say that is the better hit, but it’s not clearly the better hit. And in the eighth, Jamey Carroll’s out was a decently hit grounder up the middle that was fielded, while Plouffe’s single was a sort of soft liner to first base with some weird spin.) The only set with a result that didn’t reflect the quality of hit at all was the fourth set. Drew Butera doubled on a soft line drive to center field; here is a relevant screen image:
That close. Mauer, meanwhile, drove a line drive/fly ball to the warning track, where it was caught. It was one of the two or four hardest-hit balls in the game. So you have a pretty good excuse for missing that one. Of the seven sets where the better-struck ball is unambiguous, you all identified the better-struck ball 56 percent of the time. Still low. Better, but still low. The best-hit ball of the game was Denard Span’s double, in the second set. The guesses for that were way off.
I’m not going to display all the GIFs again, out of respect for your browsers. But I want to look at each set briefly, and a couple in a bit more depth.
1. Left was an infield pop-up. Right was a line-drive single up the middle.
2. Left was a pop-up to medium right field. Right was a fly ball off the wall in deep right-center.
The point of this exercise, of course, is to see whether we are able to evaluate things before we know the results. The same question could apply to this very piece that I am writing and you are reading. I know the results of your guesses. And I’m analyzing them. But is my analysis swayed by the success rate that I already know, and am I crafting narratives based on that number? If I say “clearly this shows that we overestimate the effect of visible movement,” or “we mistakenly see fastballs as more hittable than sliders even when the fastballs are outside the strike zone,” am I trying to fit an explanation into a result? Probably. Probably I’m doing that.
3. Left was a broken-bat 4-6-3 double play. Right was a groundball single off the pitcher’s glove.
4. Left was a line drive to deep right field. Right was a soft liner to center field.
5. Left was a hard line drive up the middle. Right was a groundout on a diving stop by the second baseman.
6. Left was a soft line drive past the second baseman. Right was a fly out to shallow center.
A few of your responses on these two:
SeeinRed: "Both of those looked like balls, but I have an easier time seeing the golf swing (on the left) as a hit than the jammed swing."
MikeyRuler: "real good slider low and away on the left so probably not much contact, pitch on (the right) seems to run back over the plate"
JM: "Both look to be good location although he missed the target on right. Better movement on left pitch too"
GH: "batter is on (the pitch on the left), expected it"
BT: "Right: badly missed his spot. Pitch on the left is down and away... a batter will get a hit on that only if the BABIP gods are angry"
This one was difficult because neither pitch looked particularly hittable. On the left, Brian Dozier swings at a slider that is either off the plate or at the very corner of the zone. On the right, Joe Mauer swings at a fastball that is inside. I suspect more people chose Mauer for the hit because Peavy missed his target by so much. Against Dozier, he executed his pitch perfectly. Against Mauer, he didn’t. But, crucially, Mauer doesn’t know what Peavy was trying to do. He doesn’t know it’s a “bad” pitch, and he doesn’t know Peavy missed his target. All he knows is that there is a fastball inside that he has to protect against. He might have even been looking for a fastball away, because he can go through the same decision-making process that Peavy and his catcher can. It might have been the perfect pitch. Mauer popped it up to shallow center field.
The other thing is that Dozier didn’t hit the ball particularly well. He hit a little inside-out liner that landed on the infield dirt. It was well-placed. You might say it was a good piece of hitting. You might say it was a good piece of pitching, spoiled by some fortunate location.
LouisPupu nailed this one:
7. Left was a hard line drive up the middle. Right was a routine groundout to shortstop.
8. Left was a grounder fielded by the pitcher. Right was a line drive past the first baseman.
9. Left was a pop-up to shallow center. Right was a line drive down the right-field line.
BT: "Doesn't have nearly as much drop as the pitch on the left, plus he missed his spot badly, leaving it up and out over the plate. The pitch on the left looks pretty nasty, actually."
Jeff: "This is probably the hardest. The pitch on the right is by far the better pitch IMO. I'm surprised the batter even swung at it. Just tough to get the barrel on that pitch, the pitch on the left is a prime candidate for a bloop single or grounder up the middle."
MikeyRuler: "Span looks on that pitch on the right especially since peavy just threw him the same pitch right before, good breaking ball on the left to morneau"
What’s interesting to me about this is that the pitches are very similar to those in the sixth set, the one almost everybody missed. On the left is a pretty good slider thrown low in the zone, or just out of it. On the right is a fastball that is supposed to be away but misses badly. In both cases, most of you picked the missed-location fastball as the hit. In this case, that was correct. The big difference, perhaps, is that in this case the fastball catches some of the plate. Span lined it down the line for a double. Morneau hit a weak pop-up to center field on his pitch.
I’m not prepared to draw any big conclusions from this. The simplest thing we can say is that this is hard. It's not all that hard to differentiate pitches, but it's hard to decide which variables are important, especially because the variables that are important will vary from pitch to pitch and batter to batter. But the original idea behind this was to do a simple test to see whether we are capable of evaluating the quality of a pitch without seeing the result. And, despite the 52 percent, I actually think the answer is "yes." For the most part, something like a consensus was reached. And for the most part, I believe what the consensus settled on. There was disagreement, but mostly I think there were good pitches and bad pitches and the group identified them.
As far as the test of whether we are capable of predicting the result of a pitch, the answer is a strong "no." When I imagine a BABIP-induced stretch of luck, I see lots of line drives landing in gloves, or dribblers trickling through. We see that type of bad luck clearly in the fourth set of GIFs. But that's the secondary bit of luck. The first bit of luck, from the pitcher's perspective, is whether the batter is going to do his job well. When the batter does his job well, the good pitch can still be hit. What we see in this sample is that sometimes the better pitch gets hit harder. Also, sometimes the ball that's hit harder gets caught. It's the duality of this noise that makes it hard to look simply at a pitcher's pitches, or simply at the rate of line drives he has allowed, and draw conclusions about the nature of his BABIP spike. We really have to look at both, while also acknowledging that good pitches and bad pitches can look fairly similar to our fallible eyes.