Believe it or not, most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.
You asked, he answered. Below is the first batch of responses to the questions BP readers submitted for sabermetrician Tom Tango. Questions have been lightly edited for clarity.
TOPIC #1: Lineups
Peter Hood asks:
It appears to be one of the axioms of sabermetrics that "lineup protection" is a myth, since studies have consistently been unable to establish its existence. And yet, anecdotal evidence (mainly from players, I suspect) seems to suggest that it does exist. I'm wondering if the sabermetric problem is possibly one of specification? Lineup protection (to me at least) is something that is critical only infrequently, existing primarily in key situations, and, as a consequence, it may be buried under the noise of routine ABs.
This is a sabermetric myth.
There are two issues to consider:
- Do players (pitchers and/or batters) behave differently based on who is on deck?
- Is the overall impact better or worse?
In The Book, we looked at this topic. Luckily for you, it was excerpted a few years back at The Hardball Times (please read that). The first takeaway is that yes, definitely the players respond differently. And really, when you are talking about human beings in different situations, the expectation is that they should respond differently. After all, they are not automatons, are they? And they respond on the surface as you'd expect: the pitcher is avoiding the unprotected batter, which results in more walks (and more strikeouts).
So, score a big one for conventional wisdom.
But, even though there is a different response pattern by the players involved, that does not by itself mean that it favors one side or the other. Indeed, the result of our study shows that when it comes to putting the ball in play, there was no significant impact.
So, score a wake-up call for conventional wisdom.
Subtopic: Batting order
Questions about your optimal lineup construction from The Book: why bat your fifth-best hitter third instead of your third- or fourth-best hitter? I am assuming it's to prevent the lineup from being too top-heavy? The other is: why in the AL do you bat your worst hitter ninth, while in the NL you bat the pitcher (your worst hitter) eighth? Is it just because pitchers are such bad hitters that it becomes optimal to bat them ninth? And how bad would the worst hitter on an AL team have to be for it to become optimal to bat him eighth and not ninth?
The Book proposes that you put your top three batters somewhere in the first, second, and fourth slots, with the low-power guy in the leadoff slot and the high-power guy in the cleanup slot. It also proposes that you put your two next-best batters somewhere in the third and fifth slots, with the high-power guy in the third slot. Those are not hard and fast rules, because you also need to consider the propensity of each batter to ground into double plays, the speed of the runners, the handedness of the batters, and so on.
Now, as to the main reason the third hitter is not so highly thought of: he comes to bat a disproportionate number of times with the bases empty and two outs. When that happens, the best way to score a run is to hit a home run. I've run models where I swap the traditional number-three hitter with the traditional number-two hitter, and you end up scoring more runs by making the swap. But we're only talking about a two-run gain over the course of 162 games.
Even doing something drastically incompetent, like putting the pitcher in the cleanup slot, costs you only 0.1 runs per game.
As for why it's better to bat the pitcher eighth: it's more beneficial to set up the top of the order than to give the pitcher fewer times to bat. But again, we're talking about a two-run gain over the course of 162 games.
Why is there so little gain (or at least, less than one might presume)? Because everyone eventually bats. It's like deferring your taxes: you can save only so much. If you swap your number-two and your number-six hitters, what happens? Well, that's a difference of 72 plate appearances. If your number-six hitter creates 90 runs per 700 PA and your number-two hitter creates 70 per 700 PA, the net effect is that you can gain 20 runs per 700 PA. And 20 divided by 700 times 72 is two runs.
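The deferred-taxes arithmetic above can be sketched in a few lines. This is just an illustration using the exact figures from the text:

```python
# Illustration of the batting-order swap arithmetic from the text:
# swapping the #2 and #6 hitters shifts about 72 plate appearances
# per season from the worse hitter to the better one.

pa_shifted = 72           # PA difference between the two slots over 162 games
runs_per_700_better = 90  # runs created per 700 PA, better hitter
runs_per_700_worse = 70   # runs created per 700 PA, worse hitter

# Net gain: the run-creation gap, prorated over the shifted PA.
gain = (runs_per_700_better - runs_per_700_worse) / 700 * pa_shifted
print(round(gain, 1))  # roughly two runs per season
```

This is why batting-order optimization buys so little: every swap only moves a few dozen plate appearances between hitters whose run-creation rates differ modestly.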
The best way to set up your batting order is to put it in the optimal order (which means you have to have different batting lineups based on pitcher handedness), and then tweak it based on the ego of the players, because human impact is more important than leveraging two runs.
TOPIC #2: Pitchers
Subtopic: Bullpen management
Mr. Tango… what are your thoughts on modern bullpen alignment and usage? Should "closers" be used more liberally in key spots in the middle innings if indeed they're the best guy in the bullpen?
There is one thing that has been constant between bullpen usage in the 1970s and today: the best relievers finish the close games. The question at hand is really when you can bring them in. If you focus on the idea that you can bring in the ace reliever in the seventh or eighth inning and have him not pitch in the ninth inning, you are going to be in uncharted waters.
I made a little chart a few years ago that showed when Sutter and Gossage were brought in (please read that). We see that only half the batters they faced were in the ninth inning. About 40 percent of the batters they faced were in the eighth inning or earlier. And the leverage, or the importance, of those batters was just as strong in the ninth inning as it was in the earlier innings. So, we can definitely find situations, very easily, to bring those relievers in earlier and have them pitch to the end of the game.
Against that, however, we have to weigh two things:
- Pitchers pitch much better as relievers than as starters. And so it would seem that the more you can limit the pitches per game, the stronger and more effective the pitcher can be. If the reliever knows he will pitch only one inning, he might be more effective. In The Book, I found extremely strong evidence for the starter vs. reliever theory (a one-run-per-game difference between pitching as a starter and pitching as a reliever). However, I could not find any evidence to support the short vs. medium relief outing theory when I focused on pitchers from 1999 to 2002. If I were to look at other time periods, in which pitchers were conditioned for both short and long outings, I might find it.
- Relievers love to know which inning they will enter. It's possible that the reason we find such sustained low ERAs among relievers in the current era (Rivera, Wagner, Percival, Hoffman, etc.) compared to those of past eras is that they have an optimal usage pattern.
So, while I will come down somewhere on the side of the new wisdom, I think there's still plenty to be said for conventional wisdom and sabermetric wisdom, as well as future wisdom. There's much more to learn here.
Subtopic: Starter-Bullpen Management
If there's an equivalent chance that the starter or a middle reliever gets the next out, who should pitch to the batter? What if the starter or reliever has a slightly better chance? Basically, do the numbers show an effective usage pattern for season-long maximization of your pitching staff?
This is a good question. You are basically asking what the value is of making sure your bullpen stays as deep as possible. So, if you have an effective .490 winning percentage starter and your reliever is a .495 winning percentage pitcher, should you necessarily bring in the reliever? If you do, you now deplete your bullpen by one bullet.
I don't know that I have a good answer here. In current practice, however, the decisions being made are not this close. It's more like a .440 winning percentage starter versus a .495 winning percentage reliever, so we're not close to the point of thinking that perhaps we should be saving our bullpen a little longer. One key finding in The Book is relevant here: a pitcher pitches worse each time he faces the same batter in the same game, which tilts these close decisions toward the fresh reliever.
Subtopic: Closer usage
As an add-on to this: what proportion of save situations ought to be pitched by the best reliever? I suppose I was thinking about closers by committee, and that, while there is the odd situation where someone else ought to close (three lefties due up, or a three-run lead facing the bottom of the order, or whatever), most of the time you would just want the best reliever who hasn't pitched yet out there. Is the real problem, then, that closers should be used much more frequently in key situations earlier in the game, in the same way that the Red Sox appear to be using Daniel Bard this year?
In The Book, I noted that almost 20 percent of batters faced by the ace relievers were in low-leverage (tune-up) situations. Basically, managers have a hard time knowing when to hold back the ace and when to bring him in when it actually counts. Continually waiting for a more and more important scenario that does not materialize will force a skipper to throw him out there when it doesn't matter.
The main problem is the over-reliance on using the ace in the ninth inning of a three-run save situation, and the under-reliance on using him in the eighth inning with runners on base. This was also excerpted, at Sports Illustrated (please read that).
This is the sixth answer I've written so far, and I apologize for the constant mentions of studies in The Book. I'm not pushing The Book; I encourage people to read it for free via Amazon's Look Inside feature, where substantial portions are available. You can get to that Amazon page via my blog.
Studies I have seen in the past state that "catcher's ERA" does not show any statistical significance. Nonetheless, we constantly hear about catchers' role in calling a game and helping the pitcher. Have there been any developments on this issue?
Ah, the old correlation problem. There is so much noise in a catcher's ERA that to try to isolate his specific skill in it is extremely problematic. You have to go in with the idea that of course a catcher will impact his pitcher. If you can't prove it, it just means that you haven't been able to find that needle because of the haystack problem. There were a couple of good studies in the recent Hardball Times Annuals on this topic, and I think more can still be uncovered.
TOPIC #3: DIPS
Subtopic: Observed spreads
In DIPS theory, does the variance of the rates at which pitchers give up doubles and triples look more like the variance of the rates of singles allowed or of home runs allowed?
I think the best discussion regarding DIPS took place over at my blog several years back, and it is captured in this pdf document (please read that). The spreads of singles, extra-base hits, and home runs are all in the same ballpark.
Subtopic: Pitcher Control
DIPS states that pitchers have no control over what happens after the baseball leaves their hand, yet statistically some pitchers are definitely "ground ball pitchers" and some are definitely "fly ball pitchers." Statistics also show that ground balls and fly balls turn into outs MUCH more often than line drives (and ground balls more so than fly balls), so how can DIPS be true?
Following on this question, do pitchers really have no control over BABIP? Are there really ground ball, fly ball, or line drive pitchers? Do year-to-year correlations show that pitchers fit into one category (or lean toward one category)?
Pitchers absolutely do have control over balls in play. The question on the table is the degree to which we can find this skill, given the amount of noise in the BABIP metric. We can very easily figure out if a pitcher is a groundball pitcher or not, about as easily as figuring out if someone is a strikeout pitcher or not.
If the choice is: should I put 100 percent weight on BABIP or 0 percent weight, then the answer is 0 percent if you have fewer than 150 starts, and 100 percent if you have more than 150 starts (more or less). Fortunately, we don't have to limit ourselves like that. It's like a pitcher's won-loss record: if you look at a single-season won-loss record and you have to decide whether to give it equal weight to all the other stats or no weight at all, choose no weight at all. But at the career level, you'd have to give it significant weight. Therefore, the correct way to use BABIP (or any metric, for that matter) is to give it a sliding scale of a weight, based on how much performance data you happen to have.
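The sliding-scale idea can be sketched as a simple regression toward the league mean: trust the observed BABIP in proportion to the sample behind it. This is a minimal illustration, not a formula from the text; the league average and the regression constant (the sample size at which the observation and the league average get equal weight) are assumed numbers.

```python
# Minimal sketch of sliding-scale weighting: regress an observed BABIP
# toward the league average, trusting the observation more as the
# sample grows. Both constants below are assumed for illustration.

LEAGUE_BABIP = 0.300
REGRESSION_BIP = 2000  # assumed: balls in play at which we weight 50/50

def estimated_babip(observed: float, balls_in_play: int) -> float:
    weight = balls_in_play / (balls_in_play + REGRESSION_BIP)
    return weight * observed + (1 - weight) * LEAGUE_BABIP

# A single season (~500 balls in play) moves the estimate only modestly...
print(estimated_babip(0.250, 500))
# ...while a long career (~6000 balls in play) mostly speaks for itself.
print(estimated_babip(0.250, 6000))
```

The same shape of estimator applies to any metric: only the regression constant changes, depending on how noisy the metric is per trial.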
Subtopic: Hitter Control
With hitters, the number of balls in play required for the metric to remove a good portion of the noise is much smaller than for pitchers. That's why we're happy with a season's worth of BABIP for hitters, but not for pitchers.
The reason is that to pitch in the majors at all, you need some skill at preventing hits on balls in play. So, when looking only at MLB pitchers, it's hard to separate their talents in this respect, because they are all good at it. It's like goalies in hockey: their save percentages are so close because they are all good. Hitters, by contrast, have many different ways to contribute (high walks, high HR, great fielding), so a batter can reach the majors without a strong BABIP skill. Consequently, our ability to distinguish between good and bad BABIP hitters is greater. Go look at BABIP among high school pitchers: you will see a far wider range there, which makes it easier to see who the good pitchers actually are (or at least to discern, within a team, which is the better pitcher).
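A rough sketch of why a compressed talent spread demands a bigger sample: treat the luck in observed BABIP as binomial noise and ask how many balls in play it takes for that noise to shrink down to the size of the talent spread. The talent standard deviations below are made-up illustrative values, not measurements from the text.

```python
# Signal vs. noise in BABIP. Binomial "luck" noise shrinks with sample
# size; the narrower the true talent spread, the longer it takes for
# the signal to emerge. Talent spreads here are assumed for illustration.
import math

LEAGUE_BABIP = 0.300

def noise_sd(balls_in_play: int) -> float:
    # Standard deviation of observed BABIP from random variation alone.
    p = LEAGUE_BABIP
    return math.sqrt(p * (1 - p) / balls_in_play)

def bip_needed(talent_sd: float) -> int:
    # Sample size at which random noise shrinks to the talent spread,
    # i.e., where the observation becomes roughly half signal.
    p = LEAGUE_BABIP
    return math.ceil(p * (1 - p) / talent_sd ** 2)

# Assumed: MLB pitchers' BABIP talent is tightly bunched (sd ~ .005),
# while MLB hitters' spread is about three times wider (sd ~ .015).
print(bip_needed(0.005))  # pitchers: many seasons' worth of balls in play
print(bip_needed(0.015))  # hitters: a couple of seasons at most
```

A threefold difference in talent spread means a ninefold difference in the sample required, which is why one season of hitter BABIP is usable while one season of pitcher BABIP is mostly noise.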
From a pitcher's point of view, it seems that unpredictable randomness or "luck" factors into his results and numbers much more than previously thought, especially since the pitcher is so reliant on things totally out of his control: run support, inherited runners stranded by the reliever, closer effectiveness, "fluke" hits, hard smashes right at fielders vs. a five-bouncer that finds a hole, etc. Thoughts?
I agree. Our job is to figure out how much random variation there is in a metric, relative to the number of trials (plate appearances, balls in play, etc.). What we are up against is that we link the performance to the pitcher's identity, when really, much of what happens should be linked to other factors. We say things like "the pitcher allowed a hit," when really what happened is "the pitcher was on the mound when his team allowed a hit." Put that way, the pitcher, the fielder, the park, the batter, etc., all played their part in that hit occurring (and luck, too, of course).
Subtopic: Home runs
Ever since Voros McCracken made his stunning breakthrough observation, one bizarre and seemingly irreconcilable implication of DIPS has nagged at me. That is, that pitchers are "responsible" for a home run, but a ball drilled to the warning track and caught is thrown into the same Magic Soup of "ordinary batted balls" as the dribbler to the mound and the infield popup. A red flag of "discontinuity" goes up for me there.
Put more starkly, in one park a deep drive would be the pitcher's "fault" because the fence is a few feet closer, while in another stadium, DIPS says that, but for Lady Luck, the same deep drive could just as easily have been a harmless swinging bunt, since it stayed in the park. Now, I realize that HR/FB normalizations are done in the more sophisticated defense-independent pitching stats (like xFIP). But how can an identical batted ball be purely the pitcher's "doing" in one park, but essentially the batter's "doing" in the other? I cannot get my head around this seemingly artificial construct.
But that's not true. If we were told that a ball was hit to the warning track, or that a ball was a pop-up 200 feet in the air and 150 feet from home plate, we would not treat them the same. But we are not told that in a pitcher's seasonal line. We are presuming that a pitcher doesn't have a disproportionate number of warning track doubles and infield flies when we look at seasonal lines. It’s a workable enough presumption, for the most part.
What you have to understand is that we're trying to categorize a pitcher's 600 contacted balls in a season into three categories: home runs, other hits, and outs. What DIPS is arguing is that you are (almost) better off dividing those contacted balls into just two categories: home runs and all other batted balls. The difference between a warning track hit and a warning track out, as it relates to the talent of the pitcher, is very small. Now, you can argue for other ways to classify batted balls: for instance, balls hit 300 feet and balls hit fewer than 300 feet, or balls that remain in the air for under three seconds and balls that remain in the air for more than three seconds. Each of those buckets will have much different run values and talent levels associated with them. It's a question of making up categories for balls, given whatever factual and (if any) subjective data we have for each batted ball.