With the Cardinals facing elimination, Game Six will be an all-hands-on-deck endeavor. Both managers are scouring their rosters for any potential advantage, and as part of that effort, they’ll probably be referring to historic batter-pitcher matchups. Should La Russa lean heavily on a player like Octavio Dotel, who has historically done well against Rangers hitters like Adrian Beltre and Michael Young? Or should he opt for the players with the best overall performance, regardless of what the matchups say?

Let’s say we want to predict the outcome of a particular batter-pitcher matchup. I’m going to lean heavily on True Average, which is scaled to look like batting average but captures a player’s total batting value (so a player gets a little credit for a walk and a bit more for a single, all the way up to a home run).

How would we predict the outcome if we knew nothing about how a particular batter-pitcher pair had fared against each other? First we’d want to know about the talent level of the batter and pitcher involved. In this case, we’ll use the previous three seasons’ TAv and TAv against, respectively. To find an expected TAv for the matchup from those values, we can use something called the log5 method to combine the two values into one value.
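For readers who want to see the mechanics, here is a minimal sketch of the log5 method in its odds-ratio form. It is an illustration, not necessarily the exact calculation used for this study: the function name `log5`, the sample inputs, and the .260 league-average rate (implied by the "twice the average TAv" remark later in the piece) are assumptions, and it treats TAv as if it were a rate between 0 and 1.

```python
def log5(batter_rate: float, pitcher_rate: float, league_rate: float) -> float:
    """Combine a batter's rate and a pitcher's rate-against into one expected rate.

    Odds-ratio form of log5: the matchup odds are the batter's odds times the
    pitcher's odds, divided by the league's odds. All inputs must lie strictly
    between 0 and 1.
    """
    for name, rate in (("batter_rate", batter_rate),
                       ("pitcher_rate", pitcher_rate),
                       ("league_rate", league_rate)):
        if not 0.0 < rate < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1, got {rate}")

    def odds(p: float) -> float:
        return p / (1.0 - p)

    matchup_odds = odds(batter_rate) * odds(pitcher_rate) / odds(league_rate)
    return matchup_odds / (1.0 + matchup_odds)


# Hypothetical inputs: a .300 TAv batter facing a pitcher who allows a .240 TAv,
# in a league where the average TAv is assumed to be .260.
expected = log5(0.300, 0.240, 0.260)
print(f"expected matchup TAv: {expected:.3f}")
```

One useful sanity check on the formula: against a pitcher who is exactly league average, the expected value is simply the batter's own rate.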

Once we have that expected value, we can also look at the TAv from that batter-pitcher matchup from all previous seasons. We can run this data from 1951 through 2011, giving us sixty years of data and over 16,000 data points to look at.

Using a technique known as ordinary least squares regression, we can see how well our expected TAv and our prior batter-pitcher matchup TAv predict future batter-pitcher matchup TAv. After controlling for whether the batter has the platoon advantage, what we find is that our log5 estimate of the outcome of a batter-pitcher matchup is 67 times more predictive than the batter’s past performance against that pitcher. Now, that’s slightly better for the batter-pitcher matchup data than we might have expected; there were on average 78 times as many PA for the log5 expectation as there were for the batter-pitcher matchup. (Since there are both batter PA and pitcher PA against used to generate the log5 expectation, I used what’s known as a harmonic mean to come up with the PA totals for the log5 expectation.)
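The harmonic mean mentioned in the parenthetical can be sketched in a few lines. The function name `effective_pa` and the plate appearance totals below are hypothetical, chosen only to show how the harmonic mean is pulled toward the smaller of the two samples; they are not figures from the study.

```python
from statistics import harmonic_mean


def effective_pa(batter_pa: int, pitcher_pa: int) -> float:
    """Harmonic mean of the batter's PA and the pitcher's PA against.

    The harmonic mean sits closer to the smaller input than the ordinary
    average does, so a thin sample on either side limits the total.
    """
    return harmonic_mean([batter_pa, pitcher_pa])


# Hypothetical three-year totals: 1,800 PA for the batter, 600 PA against the pitcher.
print(effective_pa(1800, 600))   # harmonic mean
print((1800 + 600) / 2)          # ordinary (arithmetic) mean, for comparison
```

With these made-up inputs the harmonic mean comes out well below the arithmetic mean, which is the point of using it: the log5 expectation is only as well supported as the smaller of its two samples.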

We can conclude that one plate appearance against a specific pitcher is slightly more predictive than a plate appearance against any pitcher at all (with 78 times the plate appearances buying only 67 times the predictive power, the edge works out to roughly 78/67, or a bit under 1.2 to one). But that effect is dwarfed by the number of plate appearances a batter makes against all pitchers, and historic trends are widening the gap. In the 1960s, the batters and pitchers with the most history against each other might have given us as many as 240 plate appearances from which to draw conclusions, and on average there would be 14 plate appearances between a batter and pitcher in previous seasons to draw from. Since 2000, however, the most plate appearances between a batter and pitcher in past seasons is only 148, and the average has fallen to just seven. In other words, the frequency of a batter seeing any particular pitcher has dropped by half. Expansion, interleague play, and free agency have combined to reduce the number of times any particular batter and pitcher have faced off in the past.

Consider one area in which improperly valuing matchup data could potentially still cost the Cardinals in the World Series. If the Cardinals are able to hold on tonight, they would face a Game 7, and one of their starting pitcher options would be bringing Edwin Jackson back on short rest. Jackson is the Cardinals pitcher with the most history against the Rangers (due to his time in the American League). He has been effective against Michael Young, holding him to a .192 TAv in 21 plate appearances, but otherwise the Rangers have largely teed off on him—Ian Kinsler owns a .372 TAv in 16 plate appearances, Endy Chavez has a .359 TAv in all of three plate appearances, and David Murphy has a .488 TAv in 10 plate appearances. None of that is significant enough that La Russa should consider it strongly in deciding which pitcher to use for Game 7. And if La Russa does bring in Jackson, Murphy’s history against Jackson shouldn’t factor into Washington’s decision-making process; his 10 plate appearances against Jackson are nowhere near enough to outweigh Murphy’s underwhelming career numbers.

But what about cases where a batter has really owned a pitcher in the past, just utterly demolished him? Let’s restrict ourselves to cases with a prior TAv of at least .520 against a pitcher, or twice the average TAv of .260. (By happy coincidence, that’s just about two standard deviations above the average, for those of you who care about such things.)

Historically, these extreme matchups have been more predictive of batter success than ordinary batter-pitcher matchups. But they are still dwarfed by the predictive power of our log5 expectation, by a factor of about 24. A manager is likely doing himself a favor if he puts a guy with that kind of extreme success in the lineup in place of a batter who’s otherwise reasonably close in ability. However, such cases are extremely rare, and even then, the whole of a batter’s historic performance (combined with knowledge of the platoon advantage) is still a much better gauge of how a batter will perform against a pitcher going forward.

It’s easy to read words like “harmonic mean” and “ordinary least squares” and dismiss these findings as something disconnected from what takes place on the field. But this study (and any other like it) is as much about those things as archaeology is about pickaxes and hammers. They’re just tools to expose the truth lurking within masses of data generated not inside some sterile sabermetric laboratory, but by the actions of the players themselves—in this case, an exhaustive record of batter-pitcher matchups over sixty years of baseball history. The data isn’t telling us that batters can’t pick up certain cues about a pitcher, or that a pitcher’s repertoire is equally suited to all batters. However, 10, 50, or even 100 plate appearances aren’t enough to tell us whether what we’re seeing is one player with a special edge against another, or simply a small-sample-size fluke, and there’s too much at stake for La Russa and Washington to let themselves be overly swayed by such statistics to the detriment of their teams.






Batter            Pitcher
Michael Cuddyer   John Danks
Mark Teahen       Mark Buehrle
Prince Fielder    Carlos Zambrano
Todd Helton       Livan Hernandez
Jim Thome         Justin Verlander
Carlos Lee        Carlos Zambrano
Derrek Lee        Javier Vazquez
Joe Mauer         Justin Verlander
Bobby Abreu       Javier Vazquez
Albert Pujols     Ryan Dempster



A version of this story originally appeared on ESPN Insider.

Thanks, nice points. While recognizing the limits of batter and pitcher specific history, what about season-long split tendencies?

In 2011, lefties hit Jackson hard, though as a Cardinal it was righties who took him long more.

As for the Rangers,

Kinsler, balanced, slightly better power against righties
Andrus, equal
Hamilton, better on base vs. rhp, better power vs. lhp
Young, good against both, though better vs. lhp
Cruz, much better against lhp
Beltre, much better against lhp
Murphy, platoon player against rhp
Napoli, equally solid, more power against rhp
Moreland, passable against rhp

If the Cardinals' option is Jackson, Lohse, Westbrook, or Carpenter (on short rest), who makes the most sense?

Lohse (the listed starter at the moment), being much more balanced and less homer-prone would seem a reasonable choice.

And I guess if it's a game you really have to win you have to think about Carpenter, even with the debacle that was his last short-rest start...
Much of Bill James' success, I always felt, was his approach. He'd briefly state his conclusion upfront, next show an example of how it worked that way or why it mattered on the actual baseball field, THEN show his work. With the Table up there, he'd show us why Mark Teahen is not a better choice to DH against Mark Buehrle than Billy Butler is.
He also understood, somewhat at least, the difference between building knowledge and assisting a decisionmaker. In the latter, no, large sample size is not always better than small sample size. If I'm a manager, I don't much care what Mark Teahen has done in his last 50 ABs against Buehrle. Players change. I much prefer his last 25 ABs. That's too few to get me up to a .90 confidence level? Well, welcome to my world, Sabre Boy. I make some decisions at .51 confidence level before moving on to my next task. You get me up to a .60 confidence level on some things, you've earned your pay for the week.
So Jim Leyland should be skipping Verlander's turn in the rotation when the Tigers play the Twins. Fascinating.
There is only one Twin on that list...
How predictive is extreme failure against a pitcher: 0 for 15? 0 for 25?

It may also be better to define extreme failure or success as historical performance as a proportion of the expected log5 TAv. Or better still, calculate p-values for historical performance given expected performance.
Would enjoy seeing a study of whether hitters tend to improve vs. a particular pitcher the more they have seen him. This effect does exist on a team level within season, but I have not seen a player-level study. Would be difficult to do a full-career experiment due to survivorship bias, but a 1-2 year study of # of PA vs. TAv would be interesting.
The other problem besides survivor bias is aging bias: the aging curves for pitchers and hitters look different, and if you don't control for that well you'll end up seeing an effect where there isn't one. I agree it would be interesting to investigate further.
I'm curious about the aging curves of pitchers vs. batters. I just generally assume that players' performance begins to decline at age 30, but I have seen a decent number of pitchers violate that belief recently (they are staying good later in their careers). Is there an article somewhere that discusses this, or has someone run these numbers? I would like to see it in either case.