
You’re at a restaurant with some friends. It’s been a fun meal, rehashing the good old days from Whassamatta U. Somewhere in there, you ordered a couple of pitchers of potent beverage, an order or two of cheese fries, Larry and Tom split an order of whatever that was, and Tom … the other Tom … wasn’t really hungry and didn’t order anything, but he did steal some of your chicken fingers. And now, the bill has come and everyone pulls out their credit cards.

Warning! Gory mathematical details—not to mention indigestion—ahead!

Parsing out the credit in a baseball game can feel similar. The ball dropped between the left fielder and the third baseman. Which one of them was to blame? Should we instead blame the pitcher for giving up a sinking line drive? Should we credit the batter with good woodwork? Is this yet another thing that the catcher is actually responsible for and that Jonathan Lucroy excels at?

Here and there, I’ve been trying to put together a system that will work in a low-information environment (i.e. just using play-by-play data) that would allow us a more fine-grained idea of who really made the events happen. Right now, stats like win probability added give all of the credit on a play to the batter and pitcher, when we know that’s not the case. So far, I’ve handed out credit for events that happen at the plate and examined how to understand different types of batted balls. Today, we’re going to look at things that happen in the field and on the bases to see if we can get the basics of a baseball game down.


Fielding

Fielding systems have always been tough to create. The fact that we have ball-location data and, with Statcast, all sorts of velocity and hangtime measurements, means that we can measure fielding with much greater accuracy. It might seem a little backward-looking to create a fielding system using only play-by-play data. Still, there is precedent: BP’s own FRAA uses only play-by-play data, and Sean Smith’s Total Zone remains in use at Baseball Reference, including for pre-DRS WAR calculations … and then on the other end of the popularity scale there’s the one I made years ago called OPA! (out probability added above average), which was based on Total Zone. But we want to create something that will work even when information is hard to come by, and I think there’s something to learn from those old measures. If all we know is that there was a groundball to third, how do we parse that out? Seems a shame to give all the credit to the pitcher when the third baseman at least did something. While we’re at it, the first baseman did something too.

Last time, we talked about how much control hitters and pitchers had over where the ball went on the field and came to the conclusion that there’s a good helping of luck, but that batter and pitcher share about equally in directing whether the ball goes to an area of the diamond where fielders generally are or they ain’t. If we know that a plate appearance ended in a 5–3 ground out, we know that it was most likely hit in an area where the third baseman usually goes. We know it wasn’t a single into left, so it’s not as likely that it was hit in the hole between short and third. Maybe it was and the third baseman made a great play. With just play-by-play data, we don’t know. But thanks to Retrosheet data from 1993–99, we can at least get some idea of where the ball was, because we have Project Scoresheet batted ball location data. We know that for groundballs that were fielded by the third baseman during this time, 43.6 percent of them were in the “5” zone, 25.0 percent were in the “56” zone, 9.7 percent were in the “56S” zone, and so on. In later years, we don’t have location data, so we’re going to play Schroedinger’s batted ball and assume that a groundball to the third baseman was 43.6 percent in the “5” zone, etc.

We can run the reverse. A ball hit to the “5” zone became an out 79.1 percent of the time, a single 8.2 percent of the time, an error 5.7 percent of the time, and so on. The batter and pitcher (and dumb luck) share the blame and credit for putting a groundball in that zone (or some probability that it was in that zone), and we can sum across the probabilities that it was in each zone and the probabilities of what usually happened to that sort of ball. The third baseman can only play with what’s hit toward him. But if he makes the play, he takes away whatever possibility there was of a single and turns it into an out. He should get credit for that.
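The summing step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actual code behind the numbers. The “5”, “56”, and “56S” zone shares and the out/single/error rates for zone “5” come from the text; every other number, the “other” buckets, and the function name are placeholders invented for the example.

```python
# Share of third-baseman-fielded grounders by Project Scoresheet zone.
# "5", "56" and "56S" are from the text; "other" is a placeholder that
# lumps together every zone the text does not list.
zone_given_fielder = {"5": 0.436, "56": 0.250, "56S": 0.097, "other": 0.217}

# What usually happened to a grounder hit to each zone. Only the out,
# single and error rates for zone "5" are from the text; the remaining
# rows are made-up numbers for illustration.
outcome_given_zone = {
    "5":     {"out": 0.791, "single": 0.082, "error": 0.057, "other": 0.070},
    "56":    {"out": 0.600, "single": 0.300, "error": 0.050, "other": 0.050},
    "56S":   {"out": 0.500, "single": 0.400, "error": 0.050, "other": 0.050},
    "other": {"out": 0.650, "single": 0.250, "error": 0.050, "other": 0.050},
}

def expected_outcomes(zone_probs, outcome_probs):
    """Weight each zone's outcome rates by the chance the ball was there."""
    totals = {}
    for zone, p_zone in zone_probs.items():
        for outcome, p_outcome in outcome_probs[zone].items():
            totals[outcome] = totals.get(outcome, 0.0) + p_zone * p_outcome
    return totals

baseline = expected_outcomes(zone_given_fielder, outcome_given_zone)

# If the third baseman completes the play, the gap between a certain out
# and the baseline out probability is what he added.
outs_added = 1.0 - baseline["out"]
```

The baseline is what the batter, pitcher, and luck handed the fielder; whatever the fielder does beyond that baseline is his to claim.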

Well, most of the credit. He got to the ball, fielded it (thereby at worst making it an infield hit, rather than an outfield hit), and didn’t drop it. He threw to first base. Now, if the first baseman dropped it, should we penalize the third baseman for that? I vote no. The third baseman gets credit for 99.8 percent of an out, and the first baseman gets credit for the final 0.2 percent. Or for completely undoing all the work that the third baseman did by committing an error.
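One way to read that split, sketched in Python. The 99.8/0.2 division is from the text. The function name, the idea of applying the split to a play’s value above expectation, and the 70 percent baseline are assumptions made for the example.

```python
def split_out_credit(play_value, fielder_share=0.998):
    """Divide the value of a completed out between the fielder who got to
    the ball and the teammate who caught the throw."""
    return {
        "third_baseman": play_value * fielder_share,
        "first_baseman": play_value * (1.0 - fielder_share),
    }

# Hypothetical play: an out was 70 percent likely before the ball was
# fielded, so completing it is worth 0.30 outs above expectation.
credit = split_out_credit(1.0 - 0.70)
```

The first baseman’s slice is tiny because catching a throw is nearly automatic, which is also why dropping one costs so much.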

We can do the same for line drives and fly balls and allocate credit that way. In that way, we give proper credit to the fielders and solve the usual “give the pitcher all the credit” problem.

Stolen Bases

This one is a little easier. I used a method similar to the one from before. I looked at steal attempts of second that were not part of a double steal. I found all runners who, in a season, attempted at least 10 steals, and all catchers and pitchers who had at least 10 steals attempted against them. I found the log odds ratio of their success (or in the case of the pitchers and catchers, their “aw shucks”) rate and placed all three, along with the league rate, into a logistic regression. I looked at the share of variance attributable to each actor (for the initiated, yes, it’s logit; I’m using the change in -2 log likelihood if the term is removed).

Actor      Percentage of credit/blame
Runner     45%
Pitcher    49%
Catcher    6%
League     < 1%

This actually squares nicely with what the stolen base components of DRA found: the pitcher is much more responsible for a stolen base than we ever really give him credit for. In fact, the pitcher seems to have about eight times as much to do with whether a runner is successful on a steal try as the catcher does (49 percent against 6 percent). Yet we always hear about catcher caught-stealing rates and never pitcher ones. In fact, for years, I justified being lukewarm on Mike Piazza’s Hall of Fame chances because of his abysmal CS rate. I guess I was wrong on that one.
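For anyone who wants to try the “change in -2 log likelihood” idea at home, here is a minimal sketch in Python using only NumPy. It is not the code behind the table above. The data are synthetic, the coefficients used to generate them are arbitrary, the league term is left out, and the function names are made up for the example.

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson.
    Returns (coefficients, -2 log likelihood)."""
    X = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        # Tiny ridge term keeps the solve stable; it does not change the fit.
        hessian = X.T @ (X * w[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    p = np.clip(1.0 / (1.0 + np.exp(-X @ beta)), 1e-12, 1.0 - 1e-12)
    neg2ll = -2.0 * np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return beta, neg2ll

def credit_shares(X, y, names):
    """Drop each term in turn, record how much -2LL gets worse, and
    express each term's change as a share of the total change."""
    _, full = fit_logit(X, y)
    deltas = {}
    for j, name in enumerate(names):
        _, reduced = fit_logit(np.delete(X, j, axis=1), y)
        deltas[name] = max(reduced - full, 0.0)
    total = sum(deltas.values())
    return {name: d / total for name, d in deltas.items()}

# Synthetic steal attempts: each column stands in for one actor's
# log odds of success. The generating weights are arbitrary.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))
true_logit = 0.8 + 1.0 * X[:, 0] + 1.0 * X[:, 1] + 0.3 * X[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

shares = credit_shares(X, y, ["runner", "pitcher", "catcher"])
```

Turning the raw changes into shares that add to 100 percent is a convenience for reading them side by side; the changes themselves do not have to partition the model’s fit that cleanly.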

Baserunning

Most baserunning metrics, aside from stolen bases, focus on five different situations. Arm metrics usually use these too.

  • Going from first to third on a single to the outfield

  • Going from first to home on a double

  • Going from second to home on a single

  • Tagging up from third on a fly ball to the outfield

  • Tagging up from second on a fly ball to the outfield

Who really controls whether the runner even tries for an extra base and who is to blame if he makes it?

Similar to the above, I looked for any of these five situations and found the runner involved, the outfielder involved, and the pitcher and batter who produced the ball in play. I found the rate at which each tended to produce “go” outcomes and “safe” outcomes. Data are from 2010–14.

I included a dummy variable for whether there were two outs, and for the players, I used their “go” and “success” rates overall for all five situations. It’s not perfect, but it will give us some idea of what’s going on.
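A rough sketch of how those inputs might be assembled, in Python. The play records, field names, and helper functions are all hypothetical; real play-by-play data would need its own parsing, and the same rate calculation would be repeated for fielders, pitchers, and batters.

```python
from collections import defaultdict

# Hypothetical records: one dict per baserunning opportunity.
plays = [
    {"runner": "A", "outs": 2, "went": True,  "safe": True},
    {"runner": "A", "outs": 0, "went": False, "safe": None},
    {"runner": "B", "outs": 1, "went": True,  "safe": False},
    {"runner": "B", "outs": 2, "went": True,  "safe": True},
]

def go_and_success_rates(plays, key):
    """Overall 'go' rate (attempts per chance) and 'success' rate
    (safe per attempt) for each player found under `key`."""
    counts = defaultdict(lambda: {"chances": 0, "go": 0, "safe": 0})
    for play in plays:
        c = counts[play[key]]
        c["chances"] += 1
        if play["went"]:
            c["go"] += 1
            if play["safe"]:
                c["safe"] += 1
    rates = {}
    for who, c in counts.items():
        rates[who] = {
            "go": c["go"] / c["chances"],
            # No attempts means no success rate to report.
            "success": c["safe"] / c["go"] if c["go"] else None,
        }
    return rates

def design_row(play, runner_rates):
    """One row of regression inputs: the two-out dummy plus the
    runner's overall go rate."""
    return {
        "two_outs": 1 if play["outs"] == 2 else 0,
        "runner_go_rate": runner_rates[play["runner"]]["go"],
    }

rates = go_and_success_rates(plays, "runner")
row = design_row(plays[0], rates)
```

Pooling all five situations into one rate per player keeps the samples from getting too thin, at the cost of treating a first-to-third chance and a tag-up as the same kind of decision.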

Actor      “Go” Credit/Blame    “Success” Credit/Blame
Runner     34%                  29%
Fielder    20%                  23%
Pitcher    21%                  24%
Batter     14%                  20%
League     12%                  4%

Going or staying is most closely associated with the runner’s proclivities, followed by the pitcher, rather than the fielder. It’s possible that certain types of pitchers give up deeper fly balls that are easier to run on. But all told, it’s interesting how closely bunched together all these actors are. When we move over to success vs. TOOTBLANing, we see much the same pattern.

Passed Balls/Wild Pitches

In theory, we should know who was to blame for a passed ball or a wild pitch simply by what it was ruled to be. The official scorer gets to decide. If the catcher was at fault, it’s a passed ball. If the pitcher, then a wild pitch.

Instead, I looked at both actors and checked to see which one seemed to be more responsible.

Actor      Percentage of credit/blame
Pitcher    94%
Catcher    6%

It seems that errant pitches are tied more to errant pitchers than to catchers who can’t block those pitches. From 2010 to 2014, there was an 83-17 split between wild pitches and passed balls, so official scorers are generally getting it right, although not quite: they charge catchers with 17 percent of these events, against the 6 percent of blame this analysis assigns them.

cmaczkow
6/03
I'm not knowledgeable enough in the math behind the various approaches to know the answer to this, but I'm curious: for situations like these, where you're trying to assess how much responsibility for the outcome goes to each participant in the play, are any of the concepts or models from advanced stats in football, basketball or hockey applicable? Each of those sports have to deal with this issue even more frequently (and usually with more variables involved) than baseball, and while I'm not sure anyone's found the Ultimate Truths just yet, there has been a lot of work done (and many different approaches used) that try to find the same type of information that you're looking for here.
pizzacutter
6/04
I don't honestly know enough about how handegg/orangeball do it to really borrow from them. With StatCast, we will have much better measures of where the ball landed and where the players were before the ball was hit (and their reaction time and speed, etc.). That's probably where the frontier is. This was specifically designed to work in a much lower-information environment.
Mark68
6/04
I don't see a role for the third base coach mentioned in your calculations. Many of us Angel fans are continually frustrated by the Angels' numerous baserunning outs, and many point the finger at the third base coach (Gary Disarcina) sending runners when the situation probably dictates otherwise. Baserunners tend to follow the dictates of their coaches, so if the coach is sending them home (or putting up the stop sign), for the most part, they will follow the coach's instructions.
therealn0d
6/04
That's awfully fine-grained. It could probably be done, but I'm not sure either that it's worth it or would even be accurate enough to be usable.
pizzacutter
6/04
Third base coaches, as a rule, are far too conservative in sending runners. http://www.baseballprospectus.com/article.php?articleid=10073