
Baseball is possessed of a rich and diverse collection of sounds. The shouting of the fans, their intermittent applause and jeers, and the crackling of the PA system all contribute to the cornucopia. Even limiting ourselves to the action on the field, baseball is aurally pleasing: the pulse of the ball pushing the air out of a glove, for instance.

First among all baseball sounds, without question, is the crack of the bat. Something about the whip striking the ball is downright electric. If you are like me, then even after watching so many thousands of baseball games, that crack still exerts a visceral and jolting effect on your nervous system. It is baseball’s leverage alarm: the contact could result in a routine groundout, or it could be a massive home run, but either way, the stakes just increased and you’d better pay attention to what happens next.

Yet the crack of the bat is itself diverse. Some balls are ripped with great force, and produce a crisp, single note. Others are walloped into the ground, off the bottom of the bat, and generate a dull thud. Some players routinely seem able to contact the ball with the kind of sound that is associated with hits, regardless of whether their screaming line drives find gloves.

I think many a sabermetrically-inclined fan of the game has wondered about measuring those sounds. Not that our ears are a bad guide, but human perception is subjective and can be biased. It would be interesting to know, for example, if your favorite player’s bat really does produce a special sound, or if you can tell the difference between a home run and a groundout based only on the crack.

Collecting sound data itself is not very difficult. To do so, I used my computer to directly record the sound feed from MLB.tv*. When you collect sound data, you can see a direct readout of it that looks like this:

[Figure: waveform of the recorded audio feed, with time on the x-axis and amplitude on the y-axis]

Time is passing on the x-axis, and the y-axis relates the amplitude of the sound that’s being recorded. Loud sounds produce more significant departures from the line at 0, which represents silence.

The first task was to see whether the crack of the bat could be at all distinguished by the computer from the surrounding sounds. That turns out to be trivial, for two reasons.

[Figure: waveform in which the bat crack appears as a loud, brief spike against the rest of the feed]

The first reason is that the sound of the bat is extremely loud relative to the rest of the television feed. The second reason is that the sound is also very short. The combination of these characteristics gives us that crisp, sharp sensation which is so pleasing to the ear (and attention-grabbing).
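As a rough illustration of those two properties, here is a minimal Python sketch that scans a recording in short frames and flags the loudest one, provided it towers over the typical frame. The function name, the 5 ms frame length, and the 10x loudness ratio are all assumptions made for the example, not settings used for this article.

```python
import math
import statistics


def find_crack(samples, rate, frame_ms=5.0, ratio=10.0):
    """Return the start time (seconds) of the loudest short frame, or None.

    samples: sequence of amplitudes (floats, silence = 0)
    rate:    samples per second
    A frame counts as a crack only if its RMS loudness is at least
    `ratio` times the median frame loudness (an assumed threshold).
    """
    frame_len = max(1, int(rate * frame_ms / 1000))
    rms = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    if not rms:
        return None  # recording shorter than one frame
    loudest = max(range(len(rms)), key=rms.__getitem__)
    typical = statistics.median(rms)
    if rms[loudest] == 0 or rms[loudest] < ratio * typical:
        return None  # nothing stands out from the background
    return loudest * frame_len / rate
```

Because the crack is both loud and brief, a short frame captures nearly all of its energy while the median frame reflects only background noise, which is why a simple ratio test like this can separate the two.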

Having now convinced myself that I could reliably identify the sound of the bat in the audio feed, I wanted to do a more detailed analysis of that sound. I first needed to gather a bunch of bat cracks. To do so, I used the condensed games feature on MLB.tv, which turned out to be ideal for this purpose. The condensed games strip out the announcers’ feed, leaving you with the sound of the game as it happens on the field. I collected several games’ worth of audio, saving individual audio files for each contact event, and noting the result of that contact in broad terms (fly out, groundout, home run, etc.).

The result of that work was a small sample (5-10) of each event variety. Before I get to the #GoryMath, let’s listen to that most glorious of sports sounds, the bat crack. For each event, I made a composite sound of that event by stacking all of the bat cracks on top of each other to produce a sort of ‘average’ sound. This, for example, is a composite home run, made from eight separate dingers:



[Audio: composite home run]

This is a composite groundout (n=9):



[Audio: composite groundout]

Here’s a composite line drive (n=10):



[Audio: composite line drive]

Those are three different kinds of batted balls that all sound relatively distinct to me, but we can make deeper distinctions than that. Consider the sound of a composite groundout (above), relative to the sound of some grounders which went for singles (n=6).



[Audio: composite groundball single]

Side by side, it’s easy to hear that the groundball singles were slightly higher-pitched.

On the other hand, contrast a bunch of fly outs (n=7) with the home runs.



[Audio: composite fly out]

I don’t hear much difference there at all, although your ears may vary.

With the exception of home runs and fly balls, most of those sounds seem quite distinct to me, which suggests that there are some real differences in acoustic characteristics between them.
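A note on how composites like these can be built. One way to stack bat cracks into an "average" sound, sketched in Python under some assumptions: each clip is a list of amplitude samples at the same sampling rate, the clips are lined up at their loudest sample, and they are then averaged point by point. The alignment rule and the function name are choices made for this illustration, not a description of the exact procedure behind the clips above.

```python
def composite(clips):
    """Average several clips after aligning them at their loudest sample.

    clips: list of equal-rate sample lists (floats). Returns a list
    covering only the span for which every aligned clip has data.
    """
    if not clips or any(len(c) == 0 for c in clips):
        raise ValueError("need at least one non-empty clip")
    # Index of the loudest sample (largest absolute amplitude) in each clip.
    peaks = [max(range(len(c)), key=lambda i: abs(c[i])) for c in clips]
    before = min(peaks)                                    # samples kept before the peak
    after = min(len(c) - p for c, p in zip(clips, peaks))  # samples kept from the peak on
    out = []
    for offset in range(-before, after):
        total = sum(c[p + offset] for c, p in zip(clips, peaks))
        out.append(total / len(clips))
    return out
```

Averaging this way tends to reinforce what the clips share (the crack itself) and to wash out what they do not (crowd noise that differs from clip to clip). The result is only as long as the overlap between the aligned clips.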

Sound is actually waves of air pressure. To collect data on it, one is really collecting a series of air pressure measurements (usually 44,100 per second). Loud sounds produce greater increases and then decreases in air pressure. The oscillations in air pressure (high then low then high and so on) make the waves which manifest in our perception as pitch: rapidly oscillating waves have higher pitches, while more slowly oscillating waves produce lower pitches.
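To make the link between oscillation and pitch concrete, here is a small Python sketch. It generates a tenth of a second of a pure tone as 44,100 pressure readings per second, then recovers the pitch from how often the signal crosses zero on its way up. The function names are invented for the example, and zero-crossing counting only works for clean single tones; a real bat crack is a mix of many frequencies, so this is a teaching toy rather than a way to analyze one.

```python
import math

RATE = 44100  # samples per second


def tone(freq_hz, seconds, rate=RATE):
    """Sample a pure sine tone: one amplitude reading every 1/rate seconds."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def pitch_from_crossings(samples, rate=RATE):
    """Estimate frequency from the spacing of upward zero crossings."""
    ups = [i + 1 for i, (a, b) in enumerate(zip(samples, samples[1:]))
           if a < 0 <= b]
    if len(ups) < 2:
        return 0.0  # not enough oscillation to measure
    cycles = len(ups) - 1  # full cycles between first and last crossing
    return cycles * rate / (ups[-1] - ups[0])


low = pitch_from_crossings(tone(500, 0.1))    # slower oscillation, lower pitch
high = pitch_from_crossings(tone(1000, 0.1))  # faster oscillation, higher pitch
print(round(low), round(high))
```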

Through a lot of math**, we can take a sound like a composite home run bat crack and decompose it into a set of frequencies, as well as the volume of each one of those frequencies. Appropriately, this kind of decomposition is called a frequency analysis, and it makes a graph that looks like this:
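For the curious, a bare-bones version of that decomposition fits in a few lines of Python. The sketch below is a textbook discrete Fourier transform, far slower than the fast algorithm real audio software uses, and it is applied to a made-up test signal (a loud 1 kHz tone plus a quieter 3 kHz tone) rather than to any recording from this article. The 512-sample window and the 44,100 Hz rate are assumptions chosen for the example.

```python
import cmath
import math

RATE = 44100   # samples per second (assumed)
N = 512        # samples in the analysis window (assumed)


def amplitude_spectrum(samples, rate=RATE):
    """Return [(frequency_hz, relative_amplitude), ...] up to half the rate."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid that completes
        # k full cycles over the window.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        spectrum.append((k * rate / n, abs(coeff) / n))
    return spectrum


# Synthetic test signal: loud 1 kHz tone plus a quieter 3 kHz tone.
signal = [1.0 * math.sin(2 * math.pi * 1000 * i / RATE)
          + 0.3 * math.sin(2 * math.pi * 3000 * i / RATE)
          for i in range(N)]

spectrum = amplitude_spectrum(signal)
peak_hz, peak_amp = max(spectrum, key=lambda pair: pair[1])
print(f"bin width: {RATE / N:.1f} Hz, loudest bin at {peak_hz:.1f} Hz")
```

With these settings each point on the frequency axis is about 86 Hz wide (44,100 / 512), so a peak can only be located to roughly that precision. The loudest bin for the test signal therefore lands near 1 kHz rather than exactly on it.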

[Figure: frequency spectrum of the composite home run]

On this graph, frequency is on the horizontal axis (in hertz), and the amplitude of that frequency is on the vertical axis. I don’t want to walk through all of this, because it’s not all that relevant (yet). I do want to note a few important points, however. First, the main peak in this analysis is right at 1 kHz, exactly where Dr. Alan Nathan, expert on the physics of baseball, said it would be (science!). For this and the following graphs, I put a faint green line at 1 kHz as a reference. Second, you can see that there are a number of other frequencies with significant volume in the data, including some other peaks.

For comparison, here is the same graph but for groundballs (both hits and outs):

[Figure: frequency spectrum of the composite groundball (hits and outs)]

You’ll note that the overall shape of the graph is the same, but the dominant peak at 1 kHz for home runs is much diminished for groundballs. Instead, the loudest sound is something at a lower frequency, around 500 hertz. Now let’s look at the frequency spectrum of 10 line drive singles:

[Figure: frequency spectrum of 10 line drive singles]

Here, the peak at 1 kHz is the second loudest, beaten by a peak (1.4 kHz) that is present but diminished in the other batted ball types. If we plot all the events together, we get the following.

[Figure: frequency spectra of all batted ball types plotted together]

There are many other small differences, but I don’t want to get drowned in minutiae; the point I’m trying to make is that different batted ball types produce different frequency spectra. I hope these differences are visually obvious, despite the complexities of the frequency spectra graphs.

A way of summarizing the differences, as I have already alluded to, is to look at the peak frequency of each batted ball type. Alan suggests that “When the relative ball-bat speed is higher, the collision time is shorter and peak frequency is higher.” So harder hits should produce higher frequencies. In addition, with regard to where on the bat the contact is made, Alan writes: “For impacts (away from) the sweet spot, the bat can more easily bend, resulting in longer collision times and lower frequencies.” In other words, when the contact is close to the sweet spot, frequencies should be higher, and when the contact is further away, the frequencies should be lower.
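The direction of that relationship (shorter collision, higher frequency) matches a general property of the Fourier transform. If a pulse x(t) has spectrum X(f), then squeezing the pulse in time by a factor a > 1 stretches its spectrum by the same factor:

\[
x(at) \;\longleftrightarrow\; \frac{1}{|a|}\, X\!\left(\frac{f}{a}\right)
\]

So a pulse that lasts half as long (a = 2) has its spectral peak at twice the frequency. Treating the bat-ball collision as a single pulse whose shape stays fixed while only its duration changes is a simplifying assumption made here for illustration; it is not a full model of the sound a bat produces.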

Alan’s predictions are borne out. Line drives have the highest peak frequency (~1.4 kHz), and are (by nature) struck hard. Home runs are second, with a peak at 1 kHz. For groundballs, peak frequency is at a much lower level (~500 hertz), implying worse contact.

If you calculate a peak frequency for each individual hit (instead of all of the hits of a given type put together), you can see that, while the data is messy and overlapping, the general relationship between peak frequency and hit type is there. Intriguingly, groundball outs show a lower peak frequency than groundball singles (just like we heard above), implying potentially worse contact.
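For anyone wanting to tabulate per-hit results like these, a minimal Python sketch follows. It assumes you already have one peak frequency per contact event, labeled by hit type, and it reports the count, median, and range for each type. The function names and the choice of the median as the summary statistic are assumptions made for this example.

```python
import statistics
from collections import defaultdict


def summarize_peaks(hits):
    """Group per-hit peak frequencies by hit type.

    hits: iterable of (hit_type, peak_hz) pairs, one per contact event.
    Returns {hit_type: (count, median_hz, lowest_hz, highest_hz)}.
    """
    by_type = defaultdict(list)
    for hit_type, peak_hz in hits:
        by_type[hit_type].append(peak_hz)
    return {
        hit_type: (len(peaks), statistics.median(peaks), min(peaks), max(peaks))
        for hit_type, peaks in by_type.items()
    }


def ranges_overlap(summary, type_a, type_b):
    """True if the lowest-to-highest ranges of two hit types overlap."""
    _, _, lo_a, hi_a = summary[type_a]
    _, _, lo_b, hi_b = summary[type_b]
    return lo_a <= hi_b and lo_b <= hi_a
```

With only five to ten events per category, the median is less sensitive to a single odd recording than the mean would be, and the overlap check is a blunt way of putting a number on "messy and overlapping."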

[Figure: peak frequency of individual hits, grouped by batted ball type]

You might expect home runs to have the best contact, but I’m not so sure that should be the case. Home runs might travel the farthest, but they don’t necessarily result from the most perfect bat-to-ball contact. Some are probably hit above the sweet spot on the bat, which gives them a more upward trajectory. Perusing the pages of the HomeRunTracker, one finds plenty of homers with speeds off the bat lower than that of a well-hit line drive, but which clear the wall due to lucky gusts or atmospheric effects.

This also brings me to an important caveat in this preliminary study. I’m capturing only a few events per category, and by coincidence, several of the nine home runs I captured were of the barely-clearing-the-wall variety. There were no Giancarlo Stanton or Jose Abreu epic bombs, but a few lucky, perhaps wind-aided dingers. The results might change when I get larger samples of each event.

Herein also lies a limitation of this form of analysis. Because audio files have to be captured and processed individually, there is a lot of manual work that goes into each event. That prevented me from getting more than five to ten events in each category. In other words, the need for manual annotation of each and every contact event is, for now, a significant barrier to getting large datasets put together.

Still, the prospects for analysis with this kind of data are broad and interesting. Because the sound off the bat is related to the bat/ball collision—specifically, where on the bat the collision occurred and how hard it was—we can begin to investigate questions about quality of contact.

Quality of bat/ball contact meets the rare dual criteria of being both obviously significant and severely understudied. We know it’s important in all facets of the game, but getting any further than that has proven difficult because of a lack of data. It’s easy to say that a certain hitter looks like he’s making solid contact, but much tougher to verify that (as well as determine whether that’s driving, for example, an elevated BABIP). Sound analysis may offer a way to examine questions like this (and many others, too).

In the spirit of the Sabermetrician’s Credo, I ought to note that I am not an expert in acoustics, and this research almost certainly has caveats and problems of which I am not yet aware. One example which I already stumbled upon—but haven’t had time or data to properly address—is the issue of microphone adjustments in different ballparks. Whether because of mic placement or audio feed processing or something else, there is definitely an effect of ballpark on the produced sounds (which led me to capture from five different games in different ballparks). This should be easy to adjust for in the future, but there are undoubtedly myriad additional difficulties in the data which I have not yet found.

Even so, I have been able to show that different hit types have different sonic signatures that correspond to the quality of the contact made. There are some interesting early indications that contacts which result in hits differ in terms of the sound from contacts which result in outs, which may provide a way to tell if a hitter is driving the ball with authority or just getting lucky. Sound analysis might offer a rare view into a moment of the utmost importance in baseball: that joyous fraction of a second in which the ball meets the bat.

Special thanks to Alan Nathan for his help and advice.

*I used the software Audacity.

**The Fourier Transform.

Michael
8/20
Fascinating work, Robert.
therealn0d
8/20
When I saw that first screencap I immediately recognized Audacity and thought instantly, "why didn't I think of this?" Makes me appreciate this awesome article even more. I can't wait to see what's next. Outstanding stuff, Robert.
rawagman
8/20
Would the size of bat used by the hitter or the speed of the pitch by the pitcher not play a big role in producing the sound of the batted ball?
therealn0d
8/20
Yes, among other things. I think that's what we want to find out.
nada012
8/20
The speed of the pitch will act to increase or decrease the relative velocity of the collision. So assuming the hitter swings at some constant speed, a fastball will produce a more violent collision (with higher peak frequency) than a curve. I'm not sure about the magnitude of that effect, though.

With regard to the size of the bat, I do not know. Maybe Alan can weigh in on this.
Plucky
8/20
I suspect you'd also want to know whether the batter is using ash or maple
a-nathan
8/20
The speed of the pitch plays a very small role in determining the speed of the batted ball (attendees of @saberseminar last weekend might recall my mentioning that in my talk). So I doubt there is much correlation between the frequency spectrum and the pitch speed.

I should point out that there is a bit of controversy regarding the origin of the lower-frequency sound for balls hit off the sweet spot. One theory is the one I told Rob, which has to do with the longer collision time. But another, and the one Adair subscribes to, is due to the excitation of low-frequency bending vibrations in the bat (see also my saberseminar talk, which I will shortly post at my web site). It might well be that it is a combination of both effects. Data such as those taken by Rob might help clarify this issue. Or it might be better to take data under more controlled conditions, as in a laboratory.
therealn0d
8/20
Would it be possible to translate acoustical environments to a "neutral" acoustical environment?
therealn0d
8/20
What function did you use to plot the frequencies? I'm not getting the same results on the same audio (well, not quite the same audio...I had to record it from the browser link). For instance, on the line drive audio I get peak amplitude at ~500 Hz.
nada012
8/20
Under "Analyze", I used "Plot Spectrum". I also used an R package, seewave, to redo the analysis and make sure I got the same result. It looks like the audio files are in the right spots, and when I just did the analysis from sound recorded from the website, I got the same frequencies I did before. Maybe you just recorded the wrong audio file? Groundouts do show a peak at ~500 Hz.
therealn0d
8/20
I double-checked before asking, and I did use the analyze-
therealn0d
8/20
plot spectrum and got what looks like the right results. I am also using the seewave package (the meanspec function). The results in R don't match. I can't even figure out how to get the f and a ranges like you get. I did a couple of Giancarlo Stanton HRs, but I don't trust that I'm doing it correctly. I'm having a lot of fun, but at the same time I'd like to get it right.
nada012
8/20
In seewave I think you'll need to specify the frequency of sampling (44100 Hz) and also the size of the Fourier Transform (512 is what I used). That should get you the same results. I will post my R code and the individual samples at some point, perhaps on my blog.
a-nathan
8/20
The size of the Fourier transform has to be a power of 2 if the usual algorithm is used.
nada012
8/20
512 = 2^9
Right?
a-nathan
8/20
Right! I didn't mean to suggest otherwise.
nada012
8/21
Ah, I misunderstood.
beeker99
8/20
You had me at "Fourier Transform" - an outstanding article, Robert! I eagerly wait to see further samples and what more data has to say. I wonder if the composition of the bat (IIRC, bats can be made from 6 different types of wood) has much impact on the sound?

Next topic - does the MLB.tv compressed game audio ever pick up the sound of pitches hitting the catcher's mitt?

And I, too, am kicking myself for not ever thinking of doing this.
nada012
8/20
Thank you!

"Next topic - does the MLB.tv compressed game audio ever pick up the sound of pitches hitting the catchers mitt?"

Yep! I am going to analyze that too. In the audio feed, it looks like a mini-bat crack (a bit longer and less loud), but the frequencies are very different.
walrus0909
8/21
If I remember right, that nice loud pop of ball in mitt is less a function of speed and more of lack of movement: if the catcher has to move his glove more, he's less likely to catch it in the right spot.

I'm going to guess that borderline pitches with more audible pop are called strikes more often than similar pitches without it. I'd attribute this to the catcher's framing skill, but maybe umps are using the sound to help make their calls.

To Robert: Did you try to filter out the crowd noise in the background at all?
therealn0d
8/21
I'm not presuming to speak on Robert's behalf, but I tried removing the crowd noise. The problem I had was that removing that noise also removes some of the signal of the bat crack. But, if you isolate just the signal from the crack, cutting away all the noise surrounding it, you really only hear the crack of the bat. Give it a try.
nada012
8/21
"I'm going to guess that borderline pitches with more audible pop are called strikes more often than similar pitches without it. I'd attribute this to the catcher's framing skill, but maybe umps are using the sound to help make their calls."

Great idea! I will for sure give that a try. And who knows, maybe part of catcher framing skill is creating the audible pop by moving the glove in certain ways (maybe snapping it shut augments the pop?).

"To Robert: Did you try to filter out the crowd noise in the background at all?"

Not really--if you examine the second picture in the piece, you can see that immediately prior to the bat crack, crowd noise is close to negligible, both to my ear and relative to the great volume of the crack. To clarify, I selected the tenth of a second or so immediately surrounding that peak, so whatever crowd noise there was would have to be in that region.
walrus0909
8/21
I just wonder how the recording of a recording compressed for transmission over the Internet affects the frequency content. But it's a fantastic article regardless.
therealn0d
8/21
I think I can answer this, but I'm afraid to try. I would imagine that there is a "signature" that gets preserved in a relative way. Speaking as someone who has been in a recording studio as a trained musician, I would...ask the engineer.
sfrischbp
8/20
There are a couple of acousticians who have looked at this, so you have a little more of a foundation to start from. This abstract reports similar frequency findings:
http://scitation.aip.org/content/asa/journal/jasa/109/5/10.1121/1.4744892

Robert Collier is another name to look into.
therealn0d
8/20
Thanks!
a-nathan
8/20
Other interesting links:

http://baseball.physics.illinois.edu/glanz-nytimes.htm (Jim Glanz of NYT, regarding Adair's article). See also http://baseball.physics.illinois.edu/glanz-crack.jpg

For some reason, the link to the full Adair article seems to be broken (although it wasn't last week when I told Rob about it). Maybe the site is temporarily down. Here it is:
http://acoustics.org/press/141st/adair.html.

Dan Russell is an acoustical physicist who has compared wood with aluminum:
http://www.acs.psu.edu/drussell/bats/crack-ping.html

therealn0d
8/20
Okay, I downloaded the .wav files and get the same results, but I still can't graph it like that.
yadenr
8/20
I can't wait for the day when BP player cards have their aural profiles as well.
bleaklewis
8/20
See that old kook in trouble with the curve could still really scout blind all along!
50cubs
8/21
Gus Lobel was right: It's a pure sound.
ddietz2004
8/21
Jose Abreu hit a home run this past Easter in Arlington. The sound off the bat was explosive. I encourage you to find it, but cover your ears.
therealn0d
8/21
I found it. I got a peak of 1.7 kHz. I think it was a pretty hard hit ball :-)
myshkin
8/21
A handful of questions:

* Are you using Audacity on the video files directly from MLB, or do you use some other method to record audio for analysis? If the former, what is the sampling rate and bitrate of the files you use? MLBAM provides six different condensed games for various platforms. I haven't looked at them all, but I've seen at least 24kHz with 22kb/s and 48kHz with 29kb/s. I don't think the quality differences will affect the analysis too much in the end, but I'm not certain.

* What was the highest peak frequency you saw? I tried to replicate this with Javier Baez's last home run (August 18, 9th inning, 8:36.67 into the condensed game) and got 2.8 kHz. I suspect some operator error, though.

* How did you decide how large to make the window of analysis around each bat crack?
nada012
8/21
"Are you using Audacity on the video files directly from MLB, or do you use some other method to record audio for analysis? If the former, what is the sampling rate and bitrate of the files you use?"

The former. I think that the sampling rate has been consistently 44100 Hz; this is the standard for most broadcasts, right? Bitrate is 1411kbps.

"What was the highest peak frequency you saw? I tried to replicate this with Javier Baez's last home run (August 18, 9th inning, 8:36.67 into the condensed game) and got 2.8 kHz. I suspect some operator error, though."

Yep, that's the same as what I got. Wow, was that loud. That's the highest peak frequency I've seen, but that's also the hardest hit ball I've seen in some time, so maybe that's alright.

"How did you decide how large to make the window of analysis around each bat crack?"

This was arbitrary; I basically recorded about two seconds up and downstream of each hit, and then manually selected from that a little bit immediately around the spike of the hit (until the vibrations visibly died down). Eventually, I settled into a pretty consistent rhythm of getting about a tenth of a second on either side.
Dutchleaguer
8/21
Love the post and the idea, and look forward to seeing the correlations when more data comes in.

I've always been perplexed by that little window of amplified sound in MLB broadcasts when the ball is delivered -- it's like they turn up certain mics for a second. It's cool, but the crowd sound is also amplified at the same time, so for a split second you get all this crowd noise along with the crack (or the sound of the ball hitting the catcher's glove, which is also cool).
rdemaro
8/21
Really interesting work, Robert. I enjoy how your brain works, sir.
nada012
8/21
Thanks, Rocco--the feeling is mutual.