Last week, I brought up the topic of catcher defense and why it has so piqued my interest. One of the main reasons for my fascination is that a catcher's defensive responsibilities break down into several distinct aspects, some more quantifiable than others, and the qualitative components seem to matter a heck of a lot more than what can currently be measured. Another interesting point is that, even with the recent advancements in defensive metrics, determining how a catcher contributes to a team with actual glove work, and not merely by virtue of playing the position, has eluded analysts for quite some time. Even more interesting is the nature of catcher defense itself: shortstops field grounders and right fielders shag fly balls, but the foul outs and dinky three-foot dribblers that serve as a catcher's equivalent of those balls in play are rarely the first things that come to mind.

It was great to hear the takes on defense from A.J. Hinch, Jason Jaramillo, and Gregg Zaun last time out, given their respective roles, but my goal today is to take a look at the other side of that coin, the analysts, and discuss similarities and differences between how certain components of catcher defense are accounted for in different attempts. Before developing any metric, it is important to understand what actually should be measured, as well as what others have done along those lines. While this literature review of sorts will not encompass every article penned on the subject, it should offer a solid foundation of the various components of catcher defense and how they would be converted to a runs-based or wins-based statistic. Perhaps the foundation will spur a discussion on specific components that can help us determine what should go into—and with what weighting—a metric that could instill confidence and become uniform in the field.

Catcher Overview

So what does a catcher do? Based on Keith Woolner's outline in Baseball Between the Numbers and the interviews with the aforementioned triumvirate of major-league catchers, backstop responsibility essentially breaks down into three main groups: helping the pitcher execute, controlling the running game, and catching the ball. Each of these three involves its own subset of responsibilities. For instance, helping the pitcher execute encompasses game-calling, making adjustments on certain hitters, keeping the pitcher's head in the game, serving as his psychologist and confidant, and anything else qualitative that impacts performance. Controlling the running game breaks down into nabbing would-be basestealers, deterring attempts through reputation, and the seldom-discussed throwing out of lead runners on bunt attempts. Catching the ball involves, as Zaun mentioned, being a good receiver, framing pitches to help the umpire make an accurate call, and blocking balls in the dirt to prevent wild pitches or passed balls.

With regard to the qualitative facets of catcher defense, before breaking down each of the components above, I want to point out that immeasurability should not be mistaken for unimportance. Just because we don't currently know how to convert framed pitches or adjustments behind the plate into linear-weights run values does not mean they should be completely ignored when discussing catcher defense. As I mentioned last year, it is still perfectly fine to craft metrics without these qualitative inputs, so long as we realize they might not be complete; in other words, being 65 percent of the way there is great, and better than nothing, but we shouldn't dismiss that which cannot be computed and assume we are 100 percent accurate.

Stolen Bases

Admit it—steals and caught stealings are the first two things you think about when the topic of catcher defense comes up. Heck, it happens to me all the time. Throwing runners out is sexy, relatively speaking, and it is increasingly common to derive opinions of a catcher's defense from his stolen-base percentage against. Mike Piazza was considered defensively handicapped because everyone, Rich Garces included, could steal a base safely with him behind the plate. Though Piazza may, in fact, have been a poor receiver, his inability to prevent steals alone did not render him ineffective. It was just one of many inputs.

To account for stolen bases, most analysts utilize some variation of the following: take the individual's numbers and compare them to the league's, broken down by pitcher handedness and prorated to his playing time. For example, if the league allows 0.10 steals per inning with a righty on the mound, and a catcher kneels down for 1,000 such innings, he would be expected to allow 100 steals. If he allowed only 65, then he allowed 35 fewer than the league average in this split. The same process is run for times caught stealing, and the run values for steals and caught stealings are applied to the net results. In the end, the calculation yields how many runs above or below the league average a catcher contributed or subtracted from his team in this component.
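That arithmetic can be sketched as follows; the run values (0.175 for a steal, 0.45 for a caught stealing) and the league rates are illustrative assumptions, not official figures:

```python
# Sketch of the stolen-base component, per the description above.
# Run values and league rates are assumed for illustration only.

SB_RUN_VALUE = 0.175  # assumed runs a steal adds for the offense
CS_RUN_VALUE = 0.45   # assumed runs a caught stealing costs the offense

def sb_runs_saved(innings, sb_allowed, cs, lg_sb_per_inning, lg_cs_per_inning):
    """Runs saved (positive) or cost (negative) versus the league-average
    catcher over the same innings, within one pitcher-handedness split."""
    expected_sb = lg_sb_per_inning * innings
    expected_cs = lg_cs_per_inning * innings
    return ((expected_sb - sb_allowed) * SB_RUN_VALUE
            + (cs - expected_cs) * CS_RUN_VALUE)

# The example above: league allows 0.10 SB/inning vs. righties, catcher
# works 1,000 such innings, allows 65 steals (35 fewer than expected).
runs = sb_runs_saved(1000, 65, 30, 0.10, 0.03)  # 35 * 0.175 = 6.125 runs saved
```

The same function would then be run separately for the lefty split, with the two results summed.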

I have seen a few attempts that do not break the inputs down by pitcher handedness, which I consider vital to this component. Lefties have an advantage over righties with runners on first base and thus should be expected to allow a lower rate of steals per inning. If a catcher receives pitches from an inordinate number of lefties, and a blanket league-wide percentage is applied, the results are likely to be skewed.

Reputation Runs

Admittedly, I haven't seen these used much save for Chuck Brownson's article at Beyond the Box Score, but the idea is that certain catchers might not receive as much benefit from throwing runners out because, given their reputations, they are not tested as often. Yadier Molina is a good example of this, as was Ivan Rodriguez back in the day. Both catchers were held in such high esteem for gunning down runners that fewer baserunners even attempted to steal. From a run-value standpoint, which aggregates like a counting stat, Molina could potentially pale in comparison to a catcher who throws out a good number of runners simply because that catcher faces such a high volume of attempts.

To deal with this, Brownson proposed a methodology that calculates reputation runs based on the value of a stolen base, the likelihood of the runner being caught, and the number of times runners have forgone the opportunity to steal because of said reputation. The calculation works very much like an expected-value formula: a catcher is expected to face a certain number of attempts, and any shortfall from that number is attributed to his reputation. As of August 5, 2009, when Brownson described his methodology, Molina had produced 4.4 runs above average based on his reputation, well above anyone else.
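As a sketch of that expected-value logic (the 0.175 stolen-base run value is Brownson's; the sample inputs are hypothetical):

```python
SB_RUN_VALUE = 0.175  # run value of a stolen base, per Brownson

def reputation_runs(expected_attempts, actual_attempts, cs_rate):
    """Brownson-style reputation runs: attempts forgone due to reputation,
    times the value of a steal, times the chance the runner would have
    been caught had he gone."""
    forgone_attempts = expected_attempts - actual_attempts
    return forgone_attempts * SB_RUN_VALUE * cs_rate

# Hypothetical: a catcher expected to face 70 attempts faces only 40,
# and throws out 45 percent of the runners who do go.
rep = reputation_runs(70, 40, 0.45)  # 30 * 0.175 * 0.45 = 2.3625 runs
```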

Passed Balls/Wild Pitches

Both of these events are lumped together most of the time due to Colin Wyers’ favorite subject other than run estimators—scorer bias. We might know how to explain the difference between a passed ball and a wild pitch, but if I had a nickel for every time I swore it was a wild pitch only to find out the pitch was scored a passed ball, I’d have $7.60. We can sit here and debate the merits of lumping the two events together, but because of the potential for errant classifications and similarities between both events, it makes sense to combine them. Of course, the usual caveats with balls getting by the catcher are at work here, such as a discount for catching knuckleballers. Why should Tim Wakefield’s personal catcher suffer when he has a much tougher job than catching, say, Dan Haren?

Anyway, the sum of these events is converted to runs added or subtracted in much the same fashion as stolen bases: the individual is compared to the league, and run values are applied to the net result. One difference, or at least one I would propose, would be to use pitches caught rather than innings behind the plate as the denominator, since the numerator consists of pitches as well. Combined with the stolen-base numbers and reputation runs, passed balls and wild pitches wrap up what is generally thought of first when conjuring up what catcher defense actually entails.
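A sketch of that proposal with pitches caught as the denominator; the league rate and the run value per passed ball or wild pitch here are assumptions for illustration:

```python
def pbwp_runs_saved(pitches_caught, pb_plus_wp, lg_rate_per_pitch,
                    run_value=0.28):
    """Runs saved versus average on passed balls plus wild pitches,
    prorated by pitches caught rather than innings. The 0.28 run value
    (roughly one base advanced) is an assumed figure."""
    expected = lg_rate_per_pitch * pitches_caught
    return (expected - pb_plus_wp) * run_value

# Hypothetical: 10,000 pitches caught, 40 PB+WP allowed, against a
# league rate of 0.005 per pitch (i.e., 50 expected).
saved = pbwp_runs_saved(10000, 40, 0.005)  # (50 - 40) * 0.28 = 2.8 runs
```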

Actual Defense

Catchers don't have to make plays as often as infielders, but there are foul balls in the air to catch, dribblers to gobble up, and sacrifice bunts to field. Curiously enough, a cursory scan of catcher-defense articles produced not one that included sacrifice bunts and throwing out lead runners. Perhaps the availability bias is at work in my mind after watching Molina put on a veritable clinic against the Phillies last week, throwing out lead runners on bunts seemingly at will, but think of the shifts in run expectancy. If the Twins have runners on first and second with one out and execute a successful sacrifice bunt, they have runners on second and third with two outs. If the opposing catcher instead nabs Jim Thome at third, the net result is the same set of baserunners and another out. Then again, if the catcher throws to third and the runner is safe, the change is even more substantial: bases loaded, still one out.
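To make those shifts concrete, here is a sketch using rough, era-typical run-expectancy values (illustrative numbers, not the actual chart):

```python
# Rough run-expectancy values for the Twins example, keyed by
# (bases occupied, outs). Figures are illustrative only.
RE = {
    ("first and second", 1): 0.94,
    ("second and third", 2): 0.59,
    ("first and second", 2): 0.45,
    ("bases loaded", 1): 1.57,
}

start = RE[("first and second", 1)]
bunt_succeeds = RE[("second and third", 2)] - start    # about -0.35 runs
lead_runner_out = RE[("first and second", 2)] - start  # about -0.49 runs
everyone_safe = RE[("bases loaded", 1)] - start        # about +0.63 runs
```

The roughly one-run swing between nabbing the lead runner and having everyone safe is why the play shouldn't be ignored.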

Accounting for lead-runner plays could be done using these shifts in run expectancy. In the end, it might even out, but it certainly shouldn't be ignored, nor should errors on foul balls and dribblers, as infrequent as they might be. In this article from Justin Inaz, who had the good fortune of sitting next to me and having to listen to me talk for three hours straight at our PNC Park event last year, different run values were proposed for throwing errors and fielding errors. From the author himself:

Here's my reasoning. Fielding errors are usually made on plays that would otherwise be outs. For example, a catcher not being able to handle a good throw to the plate results in an error that would otherwise have resulted in an out. Therefore, the run value of making one error above average is the value of the advancing runners (0.48 runs allowed) plus the value of the out (0.27 runs saved), or 0.75 runs total cost to the team.

On the other hand, many catcher throwing errors are made on stolen base attempts, with the runner usually ending up at third. These plays are typically scored as a stolen base plus an error. Therefore, the difference between making the error and not making the error is just the advancement of the runner(s) (~0.48 runs allowed); we can't assume that an out would have been made. It's true that some throwing errors would have resulted in outs (e.g. throwing away a ball on an easily-fielded bunt), but I tend to err on the side of being conservative with fielding statistics. I am, however, very much open to suggestions on how to better handle this issue.
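Inaz's two figures reduce to simple arithmetic; a minimal sketch using his stated values:

```python
ADVANCE_RUNS = 0.48  # value of the advancing runner(s), per Inaz
OUT_RUNS = 0.27      # value of the out a fielding error forfeits, per Inaz

# A fielding error costs the lost out plus the advance; a throwing error
# on a steal costs only the advance, since no out can be assumed.
fielding_error_cost = ADVANCE_RUNS + OUT_RUNS  # 0.75 runs
throwing_error_cost = ADVANCE_RUNS             # 0.48 runs
```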

Framing Pitches

Not much has been done on this subject, though I was able to find this gem of an article from Bill Letson, which describes a very interesting framework for computing the value of framed pitches. Just to be clear, framing pitches refers to the catcher holding his glove in place after receiving the ball, as well as catching the ball in a way that increases the likelihood of the pitch being called a strike. Sometimes it looks silly, but certain catchers look as though they have the art down pat, receiving the ball and inching it subtly back toward a corner, holding the glove there so the umpire gets a better sense of where the ball was "actually" thrown.

Summarizing Letson's methodology: strike zones are first normalized and divided into bins based on batter handedness and count. A LOESS regression is then run to estimate the probability of a pitch being called a strike (where "pitch" refers solely to those classified as a ball or a called strike). The actual result of each pitch is then compared to that probability: if the probability was 0.30 and the pitch was actually called a strike, the catcher is credited with 0.70 extra strikes.

Next, Letson astutely discussed umpire normalization. As someone who has worked with umpire data quite a bit, I was very pleased with this control. Essentially, umps have different strike zones. The range between the thinnest and thickest zones might not be as vast as you would think, but normalizing adds another level of accuracy, discounting or adding a premium to the extra strikes based on the zone of the ump behind the plate. If the ump for our hypothetical pitch above calls, on average, 0.05 extra strikes per pitch, then the pitch would add 0.65 extra strikes (0.70 from the actual-versus-expected comparison minus the 0.05 ump factor).
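The per-pitch credit, including the ump adjustment, can be sketched as follows (a sketch of the bookkeeping described above, not Letson's actual code):

```python
def extra_strikes(called_strike, p_strike, ump_extra_per_pitch=0.0):
    """Extra strikes credited to the catcher on one taken pitch: the
    actual call minus the modeled strike probability, minus the home
    plate ump's average extra strikes per pitch."""
    actual = 1.0 if called_strike else 0.0
    return actual - p_strike - ump_extra_per_pitch

# The hypothetical above: a 0.30-probability pitch called a strike,
# with an ump who averages 0.05 extra strikes per pitch.
credit = extra_strikes(True, 0.30, 0.05)  # 0.70 - 0.05 = 0.65 extra strikes
```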

I love Letson's idea and think it is a tremendous attempt at quantifying framed pitches, or at least the number of extra strikes not attributable to the ump or the pitcher, which is essentially a back-door definition of framing itself.

What’s Left?

The above components are certainly integral to crafting a metric measuring what catchers add or subtract behind the plate, but the qualitative aspects of calling a game and being a good receiver should not be forgotten. The problem is that I don't feel comfortable using a blanket estimate range and attributing some number in that range to these aspects. While Catcher ERA isn't persistent from year to year, that doesn't mean what actually occurred was fake; it just means it might not happen again next year.

Let’s open a discussion here on this topic. Keep in mind that what was presented here is more of a summary of others’ work to instill a solid foundation of what we’re working with, as opposed to my own attempt at a metric. How do you feel about how we are currently accounting for these components? How should the qualitative components be factored in? How important do you consider catcher defense to be given the massive positional adjustment of simply being a catcher? Hopefully we can get a good discussion going and hammer out some details or even add a few new ones into the fold.

In terms of the passed ball/wild pitch calculation, you mention a discount for knuckleballers, what I was wondering was, could the discount be extended to wild pitches/passed balls as a percentage of overall balls outside the strike zone thrown by a pitcher? So, if a catcher is catching a less accurate staff, one would assume there would be more passed balls/wild pitches. I wonder how well this correlation holds generally, and/or if it would be useful/productive to add into a passed ball/wild pitch metric.
Just a quick suggestion - based on the assumption that pitches with "movement" are theoretically harder for catchers to handle than pitches on a straight line trajectory, maybe a slight adjustment could be incorporated in the metric for average pitch movement? I imagine it would be fairly straightforward to tease this out of pitch/fx data. It would indicate not only the catcher's ability to handle these difficult pitches, but also the confidence in his pitchers to throw their nasty stuff.

In terms of stolen bases/reputation runs, Brownson used average SB opportunities per inning to determine reputation. I bet that there would be some value in using play-by-play data to determine the actual number of SB opportunities as well as the actual number of attempts. It seems to me that catchers on teams with poor pitching staffs would face more opportunities (more runners on base), therefore would face more SB attempts, and look like they are more desirable to run on.

Play-by-play data would also allow greater depth of analysis. Each stolen base opportunity can be categorized by factors such as what base is open, pitcher handedness, outs, inning, and run differential. Opportunities to steal second have different likelihoods and values than opportunities to steal third. Teams losing by four runs are less likely to attempt a steal of 2nd with no outs in the 8th.
I was thinking that you might want to include something that takes into account the pitcher for caught stealings. That way it might account for guys who take forever getting the ball to the plate or who like to throw a lot of off-speed pitches. I don't know if there has been any analysis on this before to find out if certain pitchers actually do give up more stolen bases, but it seems worth looking into.
There are actually up to THREE players involved in a SB/CS play, focusing for the time being on a steal of second, the most common steal:

1) The pitcher
2) The catcher
3) The shortstop/second baseman

(Actually, we could break this down to an even more granular level - for pickoff attempts, the first baseman is involved as well.)

A runner's probability of successfully stealing the base depends on a number of factors, primarily (for the defense, at least):

1) How much of a lead he's able to get.
2) How long it takes for the ball to get to the catcher.
3) How quickly the catcher can get the ball to the infielder.
4) Applying the tag properly.

Only ONE of those is within the catcher's control. It's really worth asking - why the presumption that the catcher is the defensive player with the most ability to affect the running game?
I'll recommend these articles.


Plus the followup in THT 2008 Annual on Google Books.

Max Marchi:

Chris Dial:

Davey Lopes does the analysis...


...but it looks like you need to reach first base before he'll share his data. Try not to get tasered.

Davey's stopwatch data is (I'm guessing) pitcher-handedness-independent. I'm assuming he starts the stopwatch when it becomes clear that the pitcher is going to the plate, and stops it when the catcher receives it. If you get data on the Lopes Time for every pitcher and control for it, you could probably do a pretty good job of assessing the catcher in terms of attempts/attempt opportunities and SBs/attempt.

Eric, one element of stolen base analysis that is really overlooked is deterrence (as well as pitcher handedness and quickness to the plate, but you mention that).

As you allude to, the set of base runners attempting a steal against Mike Piazza is far different than the set who would against Pudge (either one) or Yadier Molina. Obviously, all CS% aren't created equal.

So, I actually prefer to look at steals per 1,000 innings. Molina allows 27 steals per 1,000 (career 46% CS rate); Yogi Berra, with a similar percentage, allowed 56 steals per 1,000. Still, that doesn't account for steals off the pitcher. But it does help put deterrence in context.

Good stuff here though!

Also, it would be great to see a few things to help our understanding (not that they are all possible):
Caught Stealing% by pitch type (including pitch outs)
CS% by Handedness
CS% by time to the plate
CS% by runner type
For framing pitches, it might be better to define a strike zone for each catcher in the same way you would define zone for an umpire and compare catchers on the basis of the area of their strike zones. Giving credit on the basis of actual pitches that are called strikes (when they might have been called balls) introduces a bias by pitcher if the pitcher throws relatively many or few frameable pitches. The catcher strike zone area could also be corrected for the average strike zone area of the umpires he works with.
Has any analysis been done on play-calling? The catcher is most often responsible for deciding which base to throw to on balls in play with multiple runners on. It's sort of like pitching, where the catcher really only makes the decision where the ball should go; it's still up to the fielder to execute. I would imagine most catchers would fall in the middle, but there could also be a big difference between the worst and best catchers at doing this, and the run values for this ability could be significant, since it usually involves outs or bases advanced.
In addition to what Colin and Tango said, I'll throw in BIS/John Dewan's work on the topic of Catcher Defense in The Fielding Bible -- Volume II.

Dewan, similar to Tango's work, adjusts for the pitchers' career SB rates with other catchers up to that point of their career. You've gotta do that one way or another. In fact, Dewan estimated that pitchers have more "control" over the running game than catchers do (65/35 split). IIRC, there was some debate over the methodology used to arrive at the 65/35 number, but that doesn't mean it's entirely wrong.
I still don't understand why you need a "deterrence" adjustment factor, or whatever you want to call it. If a catcher allows no SB attempts because he can throw the ball 1000 miles an hour, then you simply compare him to the average catcher. If, in fact, runners run too much such that the average catcher saves his team some RE or WE, then the catcher against whom no one runs is a liability, right? Regardless, I don't see how there is any need to make any adjustments for catchers who deter the running game. We know the value of a SB and we know the value of the CS. So you simply take each catcher's total SB and CS, multiply each by their respective values and that is the value of the catcher's arm. What difference does it make how many SB attempts the catcher allows? That will be included in the calculations. If an average catcher costs his team 1 run per 150 games, then the catcher against whom no one runs is worth 1 run per 150 games, relative to the average catcher. Again, no need to do any separate calculations or adjustments based on whether a catcher "deters" runners or not, as long as you normalize every catcher to the average catcher, which everyone is going to do of course.
MGL is right in the grand scheme of things. If you wanted to break down his baserunning game to "profile" him, you can include his deterrance. But, in terms of his overall value or impact, it makes no difference.

This is no different than an outfielder where everyone runs on him, but he manages to throw out 20% of the runners and an outfielder where no one runs on him at all. Overall they are equivalent. The interesting thing is to profile them, but that's an aside.

Sorry, but I don't think that is necessarily the case.

Everybody runs on Johnny Damon whereas far fewer baserunners attempt to stretch against Ichiro Suzuki. Damon's weaker reputation results in more men at 1st-and-3d situations AND man at 1st, out-at-3d situations. Ichiro's stronger reputation is going to result in fewer of both of those situations, but also result in more men at 1st-and-2d situations as the lead runner stops with increased frequency.

The number of opportunities for these situations against the two OFs is going to differ and it seems to me we would need to know the volume because the potential runs gained and lost probably isn't going to match.
"Sorry, but I don't think that is necessarily the case."

It IS the case, and I am glad that Tango chimed in since I thought that I was the only mad one in a sane world.

drawbb, outfield arms get valued by the number of advances and the number of assists (the OF throwing out a runner on the bases), as compared to the average OF at that position. If no one runs on you, you get credit for no advances. If players run on you, you get credit when they don't advance a base, when you throw out a runner (more credit than a non-advance, obviously), and demerits when runners advance the extra base.

So everything is accounted for, without any special adjustment or calculation for "deterrence" (runners not running on you).

It is exactly the same with catchers. You get credit for throwing a runner out (and pick-offs), and you get demerits for allowing a stolen base (and throwing errors on a steal). And everything gets compared to the average catcher. There is no need to adjust for "deterrence." If no one runs on a catcher, nothing happens, just as if stolen base attempts were not allowed, like in Little League.

The way the "adjustment" occurs, if you even want to call it that, is by comparing everyone to the average catcher in the final step in the computations. The catcher that no one runs on gets exactly zero net runs, but if the average catcher has -2 net runs per season (IOW, all base runners combined generate net positive runs), then the catcher against whom no one runs gets +2 runs in credit. Interestingly, if base runners generate net negative runs (ran too much), which they probably did for many years up until the last few years, those catchers with great arms and zero net runs (before the league adjustment) would have to be credited with net NEGATIVE runs, a little bit of a logical anomaly.

The answer to that, by those good catchers that no one runs against, is that if runners want to generate net negative runs by running too often (and/or possibly at the wrong times), then these catchers need to actually "bluff" the runners a little by not showing such a great arm and encouraging them to run a little. If that is the optimal strategy for them, and they do not do that, then they indeed deserve to be charged with net negative runs even if they have great arms and no one runs against them. That is because in baseball, as in most sports, it is not only athletic talent (like a strong arm) which creates value, but good strategy as well.

So, I would like to hear from Eric (or others) and have them explain to Tango and me what this "deterrence adjustment" (in quantifying catcher value) is all about, as it makes no sense to me.

I didn't create them or tout them, or express feelings either way on the matter -- this was simply an article discussing what components have been discussed. That being said, it seems the idea is that someone who had an SB against rate of 15/30 would have a 50% rate, which looks worse than someone who had an SB against of 30/100, a 30% rate. But the former allowed fewer attempts. In essence, yours and Tango's posts are correct that it matters little, but this is one of the first things people discuss when you talk about catcher defense and it was worth noting due to that and that alone.
Thanks for the response, but I just don't see how your conclusion follows from the example I gave. You didn't address my point about the volume of varying opportunities perhaps creating unequal potential run values between Damon and Ichiro.

However, I think the real purpose of a deterrence argument is not between Damon and Ichiro, but between Ichiro and a guy with similar ability but a weaker reputation--again, due both to the disparity in advancement opportunities and unequal run values inherent to the baserunners' decisions and outcomes.

I don't know, maybe this is like the Monty Hall paradox where the answer seems counterintuitive and I'm totally open to that possibility here...but I'd like to see someone provide a simple example of numbers that prove Reputation Guy is not hurt by a comparison with Not-As-Widely-Publicized Guy and that a deterrence adjustment is not needed. Right now I don't see that, but I'd be grateful if someone can show me.

Right you just referenced the article. I was just wondering what the methodology or rational was. You seem to be familiar with it.

You say, "it matters little." It matters none, at least as far as quantifying the catchers' value. In your example, the 30/100 catcher probably allows zero net runs or so (assuming an overall 70% BE rate) and the 15/30 allows maybe +3 runs. The catcher who is 0/0 has zero net runs of course. And, as I said above, those numbers have to be further adjusted by the net runs allowed by the average catcher if we want to compare catchers to the average catcher although there is no great reason why we have to do that (sum the league to zero).
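For what it's worth, that bookkeeping can be sketched as follows; the run values are assumptions picked to give roughly a 70 percent break-even rate, and the point is that a zero-attempt catcher nets exactly zero before the league baseline is subtracted:

```python
SB_RUN_VALUE = 0.175  # assumed offensive value of a steal
CS_RUN_VALUE = 0.408  # assumed cost of a caught stealing (~70% break-even)

def arm_runs(sb_allowed, cs, lg_net_runs=0.0):
    """Net runs saved by a catcher's arm: caught-stealing runs minus
    stolen-base runs, then compared to the average catcher's net over
    the same span."""
    net = cs * CS_RUN_VALUE - sb_allowed * SB_RUN_VALUE
    return net - lg_net_runs

no_attempts = arm_runs(0, 0)   # exactly 0.0 before the league adjustment
break_even = arm_runs(70, 30)  # runners succeeding at 70% net roughly zero
```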
Nope. The guys who deter runners are going to have to throw out far better base stealers, on average, than those catchers against whom everyone runs. So if you fail to adjust CS rates for quality of runners against, you'll massively underrate the skill of the deterring catchers.
Well, I just read Chuck Brownson's article at BTB. Here is what he does:

"I then subtracted his actual stolen base attempts from the expected number and multiplied it times the run value of the SB (0.175) times the likelihood the runner would be caught stealing (CS%)."

That is completely wrong of course. For some strange reason he is assuming that all catchers "should" have the same SB attempts (per inning I guess) with their own CS% and then crediting or debiting them the difference between what they "should" have and what they do have. That is ridiculous of course. For example, if a catcher is 1/2 in 900 innings (around 100 games), he is going to assume that they "should have" been 35/70 (or so) rather than 1/2 and he is going to give them 12.25 "rep runs" or so, which makes no sense. None whatsoever. Where did he save 12 runs? Similarly if a catcher is 3/3 (0 CS%), he is going to assume 70/70 and dock him 10.5 runs or so. Again, makes no sense.

Eric, you should not have even mentioned "rep runs" in your article. You devote an entire paragraph to it, clearly implying that it is part of a catcher's value. You could have mentioned that some catchers are so good that no one runs against them and therefore, paradoxically, they derive little or no value from their good arms, but you didn't. In fact, you implied that these catchers have more value than is being captured by the traditional SB/CS numbers, which is not true. If you didn't mean to imply that, why would you have even mentioned it, let alone devote an entire paragraph to it, based on an obscure and incorrect methodology by someone whom I have never even heard of?
One half of Brownson's assertion makes all the sense in the world to me: The notion that Reputation Guy saved X number of runs based on the lower volume of attempts against him. If there existed a catcher so feared that he faced only 2 SB attempts in 900 IP compared to 70 SB attempts in the same time against a normal catcher, you'd better believe that is definitely worth something to Supercatcher's team.

Defense is an ongoing series of tradeoffs that the coaching staff must decide to make, i.e. guarding the line. If Supercatcher's reputation is that amazing, then guarding against the SB at the expense of opening more holes in the IF is a tradeoff his team doesn't need to make and they can instead focus their efforts elsewhere--which certainly has SOME corresponding run value. I simply can't imagine someone trying to argue that this doesn't make sense.
You have these three catchers facing 1000 runners on first base for the season:

IRod, 20 SB, 20 CS
Carter, 80 SB, 60 CS
Piazza, 140 SB, 100 CS

Which one is more valuable?

For the moment (just for the moment, for this one moment), presume that DP, Hit&Run rates, opening the hole between 1B and 2B, and taking extra bases on singles and doubles are non-factors. (Just for the moment.)
All right, Tango. I used the 2009 run expectancy chart from this site and made two assumptions: A) That each catcher faced a normal run environment and B) That the SB and CS occurred evenly with 0 outs, 1 out, and 2 out. To make the math easier on the latter point I used numbers divisible by 3 so that Pudge had 21-21, Carter 81-60, and Piazza 141-99. Hope you are OK with that small adjustment.

Pudge is credited with 5.19736 runs saved, Carter with 11.26511 runs saved, and Piazza with 17.33286 runs saved. The catcher who faced the most attempts and had the worst CS percentage nevertheless would appear to have a better arm than the other two catchers combined. That's exactly my concern from above, i.e. the non-linear run disparity of the successful and unsuccessful attempts combined with the higher volume produces a counterintuitive result.

This example seems to support the notion that a deterrence adjustment is necessary. My question for you is how can these numbers be reconciled with the opposite assertion and, if so, on what is that based?
I forgot to mention that for the sake of simplicity right now I treated every SB attempt as an attempt to steal 2d base.
As noted by the readers at The Book Blog, the value of the "reputation" comes not in shutting down the running game (which, as MGL has shown, is already handled by the SB and CS numbers), but in the other parts of the running game, like taking the extra bases on hits.

For example, if you have Superman behind the plate, and no one tries to steal, AND they also take a shorter lead off first base, this might prevent the runners from taking an extra base on singles and doubles, and might get them doubled-up more often.

Setting that particular point aside (perfectly valid, but is not really the particular trait being discussed here), I'd encourage the readers who are skeptical to actually work out the numbers specifically.

By the way, this is exactly the same situation with the baserunning numbers (baserunners). Dan Fox introduced it in the annual a few years ago, and Dan did it exactly like MGL says we should do the catchers and arm numbers. They all work identically.