Pebble Hunting: The Blind BABIP Test: Results and Revelations

July 9, 2012

On Friday, many of you took the blind BABIP test. I gave you 18 GIFs, in nine sets of two, each set comprising two batted balls. One was a hit. The other was an out. You guessed which was which, but you couldn’t see the outcome; the GIFs cut off at the frame just as contact was made, or just before contact was made. This was supposed to tell us something. I’ll get to the big result first: We’re the worst at this!

I tallied 82 full sets of answers, which is 738 individual guesses, of which 387 were correct. That is 52 percent correct. Closing our eyes and pointing would theoretically have earned us 369 correct answers. All the wisdom of the 82 of you was worth 18 extra correct answers. So that's the big thing first.
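For anyone who wants to check that arithmetic, here is a rough sketch of how far 387 of 738 sits from coin-flipping. It treats every guess as an independent fair coin flip, which is an assumption of the sketch rather than something the test guarantees (82 people were all looking at the same nine sets), and the variable names are mine.

```python
from math import comb, sqrt

RESPONDENTS = 82   # full sets of answers tallied
SETS = 9           # pairs of GIFs per respondent
CORRECT = 387      # correct guesses across everyone

total_guesses = RESPONDENTS * SETS        # 82 * 9 = 738
expected_random = total_guesses // 2      # 369 if we all pointed blindly
accuracy = CORRECT / total_guesses

# Exact one-sided binomial tail: the chance of 387 or more correct
# if every one of the 738 guesses were an independent fair coin flip.
tail = sum(comb(total_guesses, k) for k in range(CORRECT, total_guesses + 1))
p_value = tail / 2 ** total_guesses

# Same comparison expressed as a z-score (normal approximation).
z = (CORRECT - expected_random) / sqrt(total_guesses * 0.25)

print(f"accuracy: {accuracy:.1%}")
print(f"extra correct answers over chance: {CORRECT - expected_random}")
print(f"z-score: {z:.2f}, one-sided p-value: {p_value:.3f}")
```

Under those assumptions the overall total is not far enough from 369 to rule out luck at the usual 5 percent threshold, which fits the "we're the worst at this" headline.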

Here are the correct answers:

1. The hit is on the right (73 percent of you were correct)
2. Right (30 percent)
3. Right (30 percent)
4. Right (37 percent)
5. Left (61 percent)
6. Left (22 percent)
7. Left (63 percent)
8. Right (76 percent)
9. Right (79 percent)

It is helpful to see the results broken down like this. If I just tell you that you got 52 percent correct, you might conclude that your guesses were basically random. You might think that there was nothing significant that distinguished one pitch from another, to your educated-but-amateur eyes. But in fact, as a group, you were not random, and you did not guess with 52 percent accuracy on each set. We actually spotted certain features that we thought were significant, and in some sets those features perhaps were significant. In others, we were wildly misled.
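One rough way to check the "not random" claim is to ask how far each set's split sits from 50-50, given 82 respondents. The sketch below uses the rounded percentages listed above and assumes each reader guessed independently within a set. Both are simplifications, and checking nine sets at once makes a stray lopsided result somewhat more likely than the single-set threshold implies.

```python
from math import sqrt

RESPONDENTS = 82

# Percent of respondents who picked the hit correctly, by set (rounded figures).
correct_share = {1: 73, 2: 30, 3: 30, 4: 37, 5: 61, 6: 22, 7: 63, 8: 76, 9: 79}

# Standard error, in percentage points, of one set's share if all 82
# respondents were flipping fair coins.
standard_error = 100 * sqrt(0.25 / RESPONDENTS)

# How many standard errors each set sits from an even split.
z_scores = {s: (share - 50) / standard_error for s, share in correct_share.items()}

# Sets whose split is more lopsided than chance usually produces (|z| > 1.96).
lopsided = [s for s, z in z_scores.items() if abs(z) > 1.96]

mean_share = sum(correct_share.values()) / len(correct_share)

for s, z in z_scores.items():
    print(f"set {s}: {correct_share[s]}% correct, z = {z:+.2f}")
print(f"average across sets: {mean_share:.1f}%")
print(f"sets beyond |z| > 1.96: {lopsided}")
```

The design choice here is to test each set on its own rather than the pooled total: the pooled number hides the fact that strong agreement in the right direction on some sets is cancelled by strong agreement in the wrong direction on others.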

Part of the challenge of this exercise was that a lot of hits in baseball aren’t hit as well as a lot of outs in baseball. But for the most part, in these sets, the hits corresponded to the better-hit balls. In six of the nine sets, the hit was also clearly the better-hit ball of the two. In two of the sets, there’s not a clearly superior hit. (In the third set, both balls were medium grounders, but Mauer got the hit when his bounced off Jake Peavy’s glove. Mauer went up the middle with his, so I’d say that is the better hit, but it’s not clearly the better hit. And in the eighth, Jamey Carroll’s out was a decently hit grounder up the middle that was fielded, while Plouffe’s single was a sort of soft liner to first base with some weird spin.) The only set with a result that didn’t reflect the quality of hit at all was the fourth set. Drew Butera doubled on a soft line drive to center field; here is a relevant screen image:

That close. Mauer, meanwhile, drove a line drive/fly ball to the warning track, where it was caught. It was one of the two or four hardest-hit balls in the game. So you have a pretty good excuse for missing that one. Of the seven sets where the better-struck ball is unambiguous, you all identified the better-struck ball 56 percent of the time. Still low. Better, but still low. The best-hit ball of the game was Denard Span’s double, in the second set. The guesses for that were way off.
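The 56 percent figure can be roughly reconstructed from the per-set percentages. In the sketch below, the split of sets is my reading of the text: the hit was the better-struck ball in sets 1, 2, 5, 6, 7, and 9, the out was the better-struck ball in set 4, and sets 3 and 8 are left out as ambiguous. The inputs are the rounded percentages, so this is an approximation and not the original tally.

```python
# Percent of respondents who picked the hit correctly, by set (rounded figures).
correct_share = {1: 73, 2: 30, 3: 30, 4: 37, 5: 61, 6: 22, 7: 63, 8: 76, 9: 79}

# Assumed grouping, based on the descriptions of each set.
HIT_WAS_BETTER_STRUCK = [1, 2, 5, 6, 7, 9]
OUT_WAS_BETTER_STRUCK = [4]   # Mauer's warning-track out vs. Butera's soft double

# When the hit was the better-struck ball, a correct guess found it.
picked_better = [correct_share[s] for s in HIT_WAS_BETTER_STRUCK]
# When the out was the better-struck ball, an incorrect guess found it.
picked_better += [100 - correct_share[s] for s in OUT_WAS_BETTER_STRUCK]

average = sum(picked_better) / len(picked_better)
print(f"better-struck ball identified in {average:.0f}% of guesses "
      f"across {len(picked_better)} sets")
```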

I’m not going to display all the GIFs again, out of respect for your browsers. But I want to look at each set briefly, and a couple in a bit more depth.

1. Left was an infield pop-up. Right was a line-drive single up the middle.
Most of you got this correct, which is good, because this is probably the simplest set in the piece. Both pitches are fastballs to righties. One tails back to the outside corner; one tails right over the middle of the plate and misses the catcher’s target. The one over the plate was the hit. It seems that those of you who got it wrong were watching Jamey Carroll and thought he looked late on it. That’s a good process, but Jamey Carroll wasn’t late on it. He hit it hard.

2. Left was a pop-up to medium right field. Right was a fly ball off the wall in deep right-center.
The pitch on the right was the best-hit ball of the game. It was a double to right-center that missed going out by perhaps a couple feet. Most of the guesses that came with explanations focused on the movement of the pitches. The fastball on the left was “flat,” while the slider had “biting movement.” This set is probably the best indication that what we are doing is very, very hard. One of the worst batted balls of the game, against the best hit of the game, and we whiffed.

The point of this exercise, of course, is to see whether we are able to evaluate things before we know the results. The same question could apply to this very piece that I am writing and you are reading. I know the results of your guesses. And I’m analyzing them. But is my analysis swayed by the success rate that I already know, and am I crafting narratives based on that number? If I say “clearly this shows that we overestimate the effect of visible movement,” or “we mistakenly see fastballs as more hittable than sliders even when the fastballs are outside the strike zone,” am I trying to fit an explanation into a result? Probably. Probably I’m doing that.

3. Left was a broken-bat 4-6-3 double play. Right was a groundball single off the pitcher’s glove.
Here we have two balls that weren’t hit very well, and one happened to be a hit while the other happened not to be. Both are fastballs to left-handed hitters. The pitch to Span (on the left) was almost perfectly in the middle of the strike zone. The pitch to Mauer was near the outside edge. It’s understandable why you all chose the pitch to Span as the one that surrendered a hit. This was very difficult to get correct.

4. Left was a line drive to deep right field. Right was a soft liner to center field.
Most of you correctly identified the pitch on the left as the one that was hit well. It was hit well, into an out. This is a very difficult set to get correct.

5. Left was a hard line drive up the middle. Right was a groundout on a diving stop by the second baseman.
This was the most divisive set in our group, which makes some sense, as both batted balls could have very easily been hits. Morneau hit a four-seam fastball that was on the outer half of the plate. Revere got a slider up in the zone. The pitches were actually in almost the same spot, except that one (the fastball) was supposed to be there and the slider was not supposed to be there. I’m surprised everybody didn’t pick Revere. I wonder whether breaking balls just look like tougher pitches than fastballs, even when the breaking balls aren’t very good.

6. Left was a soft line drive past the second baseman. Right was a fly out to shallow center.
This is the set everybody did the worst on.

A few of your responses on these two:

SeeinRed: "Both of those looked like balls, but I have an easier time seeing the golf swing (on the left) as a hit than the jammed swing."

MikeyRuler: "real good slider low and away on the left so probably not much contact, pitch on (the right) seems to run back over the plate"

JM: "Both look to be good location although he missed the target on right. Better movement on left pitch too"

GH: "batter is on (the pitch on the left), expected it"

BT: "Right: badly missed his spot. Pitch on the left is down and away… a batter will get a hit on that only if the BABIP gods are angry"

This one was difficult because neither pitch looked particularly hittable. On the left, Brian Dozier swings at a slider that is either off the plate or at the very corner of the zone. On the right, Joe Mauer swings at a fastball that is inside. I suspect more people chose Mauer for the hit because Peavy missed his target by so much. Against Dozier, he executed his pitch perfectly. Against Mauer, he didn’t. But, crucially, Mauer doesn’t know what Peavy was trying to do. He doesn’t know it’s a “bad” pitch, and he doesn’t know Peavy missed his target. All he knows is that there is a fastball inside that he has to protect against. He might have even been looking for a fastball away, because he can go through the same decision-making process that Peavy and his catcher can. It might have been the perfect pitch. Mauer popped it up to shallow center field.

The other thing is that Dozier didn’t hit the ball particularly well. He hit a little inside-out liner that landed on the infield dirt. It was well-placed. You might say it was a good piece of hitting. You might say it was a good piece of pitching, spoiled by some fortunate location.

LouisPupu nailed this one:

6L:hit, soft liner to right
6R:out, pop up to infield

Perfect.

7. Left was a hard line drive up the middle. Right was a routine groundout to shortstop.
Both pitches were fastballs. The one on the left missed its target. The one on the right did not miss its target. I think we’re all very big on pitches hitting their targets in this exercise.

8. Left was a grounder fielded by the pitcher. Right was a line drive past the first baseman.
Almost everybody got this one. The pitch on the right was three inches higher. But the pitch on the left almost perfectly bisected the strike zone, while the pitch on the right was about five inches further away from the hitter. Assuming location was the big determinant for you on this set, we can conclude that three vertical inches are more significant to you than five horizontal inches. Interesting.

9. Left was a popup to shallow center. Right was a line drive down the right-field line.
This was the easiest to pick; four in five of you got it right.

BT: "Doesn't have nearly as much drop as the pitch on the left, plus he missed his spot badly, leaving it up and out over the plate. The pitch on the left looks pretty nasty, actually."

Jeff: "This is probably the hardest. The pitch on the right is by far the better pitch IMO. I'm surprised the batter even swung at it. Just tough to get the barrel on that pitch, the pitch on the left is a prime candidate for a bloop single or grounder up the middle."

MikeyRuler: "Span looks on that pitch on the right especially since peavy just threw him the same pitch right before, good breaking ball on the left to morneau"

What’s interesting to me about this is that the pitches are very similar to those in the sixth set, the one almost everybody missed. On the left is a pretty good slider thrown low in the zone, or just out of it. On the right is a fastball that is supposed to be away but misses badly. In both cases, most of you picked the missed-location fastball as the hit. In this case, that was correct. The big difference, perhaps, is that in this case the fastball catches some of the plate. Span lined it down the line for a double. Morneau hit a weak pop-up to center field on his pitch.

***

I’m not prepared to draw any big conclusions from this. The simplest thing we can say is that this is hard. It's not all that hard to differentiate pitches, but it's hard to decide which variables are important, especially because the variables that are important will vary from pitch to pitch and batter to batter. But the original idea behind this was to do a simple test to see whether we are capable of evaluating the quality of a pitch without seeing the result. And, despite the 52 percent, I actually think the answer is "yes." For the most part, something like a consensus was reached. And for the most part, I believe what the consensus consented on. There was disagreement, but mostly I think there were good pitches and bad pitches and the group identified them.

As far as the test of whether we are capable of predicting the result of a pitch, the answer is a strong "no." When I imagine a BABIP-induced stretch of luck, I see lots of line drives landing in gloves, or dribblers trickling through. We see that type of bad luck clearly in the fourth set of GIFs. But that's the secondary bit of luck. The first bit of luck, from the pitcher's perspective, is whether the batter is going to do his job well. When the batter does his job well, the good pitch can still be hit. What we see in this sample is that sometimes the better pitch gets hit harder. Also, sometimes the ball that's hit harder gets caught. It's the duality of this noise that makes it hard to look simply at a pitcher's pitches, or simply at the rate of line drives he has allowed, and draw conclusions about the nature of his BABIP spike. We really have to look at both, while also acknowledging that good pitches and bad pitches can look fairly similar to our fallible eyes.

Thank you for reading

Sam Miller

saigonsam

7/09

So to sum it up, "We didn't learn anything we didn't know already".

It was still a fun exercise.

louispupu

7/09

Sam
Do you have the distribution of the score of the all 82 sets?
For example,
5% got the all 9 pairs right, 9% went 8 for 9, 11% went 7 for 9 etc....

If you do, would you mind sharing the results? Thanks.

lyricalkiller

7/09

I don't off the top of my head; if I get a chance I'll add 'em up. But three people got seven and nobody did better. Tons of 4s and 5s. Mostly 4s and 5s I think.

hotstatrat

7/09

I wonder if the respondents who focused on the pitch did better than those who focused on the batter's approach to the pitch - among those who did one or the other.

At 52 % - given the number of respondents, I am not totally convinced our answers weren't essentially random, however - or around 2 % better than random.

dethwurm

7/10

Bingo. If answers are random, you still won't see 50% on every individual set because sometimes guesses cluster randomly.

It would be fun to try it with like 800 people instead of 80 and see what the results are. Then compare with 800 scouts on the same set. Maybe KG can call in some favors to make it happen!

Still, it was a fun exercise. Great idea, Mr. Miller!

Oleoay

7/10

I wish I hadn't been too busy because I would've liked to participate in this. I think Hoot brings up a good point about whether people who focused on the hitter did better than people who focused on the pitch. The line in the article about "But, crucially, Mauer doesn’t know what Peavy was trying to do. He doesn’t know it’s a “bad” pitch, and he doesn’t know Peavy missed his target." touches on this idea.

I'm not sure how intensive it would be, but it might be interesting to run this kind of test again, but with the GIF cropped and frames removed to exclude as much as possible of the catcher and maybe even the pitcher.

And, as an aside, maybe we can not predict hits, but I think this kind of study can provide some good insight into the kind of biases we readers have.

Oleoay

7/10

Another way to slice this is to look at the results given and see how accurate readers were in projecting whether the ball was a popup/grounder/line drive... i.e. given a pitch, can readers predict whether the ball is "hit well" or not.

louispupu

7/12

An important point in this test is that it's a given that one of them is a hit and the other is not, making it so much easier to judge. I compared my prediction (you can find it in the comments under previous article) and the actual results. Under this condition, among the 18 pitches, 8 of my predictions were exactly what happens or quite close, 4 of them were somewhat correct, 6 of them were just wrong. I did the detailed prediction because I was rather confident on my ability to read the pitch and predict the result. However, if the 1 hit/1 out condition weren't a given, I believe my prediction would be much much worse.

What I'm trying to say is, the 1 hit/1 out hint, IMHO, made a huge difference in the test. To predict the result only from the pitch is extremely difficult, especially those pitches that caught some but not too much plate which could easily results in many different results, without being given some hints or others (i.e. the pitching sequence up to the pitch in question in the at-bat and results). Maybe we can repeat the test and take out the 1 hit/1 out hint and see how everybody does.

louispupu

7/09

One more request if it's possible, because No.4 is indeed the tricky one, could you maybe give the results without counting that one? Thanks.

lyricalkiller

7/09

54.4 percent excluding no. 4

louispupu

7/10

Sorry that I didn't make it clear, I meant without counting No.4, how many got all 8 right, how many were 7 for 8, how many were 6 for 8, etc. Thanks. I am not sure whether this distribution tells us anything but I am curious about it. Thanks.
