Baseball ProGUESTus: Everything You Always Wanted to Know About the Times Through the Order Penalty

November 5, 2013

Most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers, and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.

Mitchel Lichtman, or MGL, has been doing sabermetric research and writing for almost 25 years. He is one of the authors of The Book: Playing the Percentages in Baseball. He has consulted for several major-league teams over the years and has occasionally made a fool of himself on radio and TV. He holds a B.A. from Cornell University and a J.D. from the University of Nevada. You can check him out on Twitter at @MitchelLichtman or on his blog at www.mglbaseball.org.

If, like many of us, you’re a prolific baseball blog reader, you’ve probably heard a lot lately about the “times through the order” penalty (TTOP). For those of you who have no idea what that is, here is a quote from page 187 of The Book: Playing the Percentages in Baseball: “As the game goes on, the hitter has a progressively greater advantage over the starting pitcher.” Essentially, the more times a batter faces a pitcher during a game, the better he does at the plate.

The way the TTOP is traditionally measured is by looking at a starting pitcher’s performance using, say, wOBA against, the first time through the batting order, the second time, and so on. (Like TAv, wOBA is an all-in-one offensive rate statistic, but on the OBP scale instead of the BA scale.) Theoretically, a starter’s wOBA should be about the same for batters 1-9, and then 10-18, etc., since the pitcher is obviously the same, and in most cases the batters are more or less the same (I don’t include pitchers batting or pinch hitters). You might even think that a pitcher improves as the game goes on, as he gets thoroughly warmed up—especially on a cold night—and gets a feel for all of his pitches, at least until he perhaps enters a decline phase due to fatigue, assuming he is allowed to stay in the game that long.

But that’s not what we see, as the last letter of the acronym TTOP implies. Here are some actual numbers from The Book (p. 186, Table 81) based on data from 1999-2002. The total sample is 469,721 PA between starting pitchers and starting lineups, not including IBB and bunts.

Times Through the Order	TBF	wOBA
1	163,900	.345
2	158,872	.354
3	124,603	.362
4	22,221	.354

As you can see, there is a significant and distinctive trend in the last column, at least through the third time through the order. Basically, batters get better and better from the first time facing a pitcher in a game to the second, and then again to the third, and then revert to “second time” levels by the time they have seen the pitcher for the fourth time. We’ll talk about that “fourth time” anomaly in a little while.

Another thing you can clearly see is that most pitchers make it through the order at least three times, which is actually something of a modern trend. In the past, starting pitchers pitched many more complete games, but they were also taken out earlier when they were getting shelled. It is also relatively rare for a pitcher today to face the order for the fourth time. That should not be surprising, since by the fourth trip through the lineup, pitch counts are usually elevated. On average, it takes a little over 100 pitches to get through the order exactly three times (the current average “pitches per PA” (P/PA) is around 3.8, and three trips is 27 PA).

As you might expect, the pool of pitchers is not exactly the same for each TTO group, at least starting with the third time (and neither is the pool of batters). Pitchers in group three are slightly better than those in groups one and two, and the pitchers in group four are quite a bit better. Balancing this out is the fact that the quality of the batters in each group also rises slightly. Because of the disparity between the pitcher and batter pools in each group, the expected wOBA in each group is actually a little different, as you can see from the table below.

TTO	Pitcher quality	Batter quality	Expected wOBA	Obs. wOBA
1	.349	.347	.353	.345
2	.349	.348	.353	.354
3	.348	.350	.354	.362
4+	.345	.351	.353	.354

The significant rise in observed wOBA from the first through the third times through the order is not a result of any large changes in the pitcher and batter pools in each group. For all intents and purposes, the expected wOBA is the same in all groups. Something else must be going on.

If you are wondering which group represents a pitcher’s norm, conveniently, the second time through the order is almost exactly what we would expect from the pitcher overall. That is illustrated in columns 4 and 5 in row 2 of the table above. In the second time through the order, the expected wOBA, based on the pitchers’ and batters’ overall full-season numbers, is .353, and the observed wOBA is .354, almost exactly the same.

In summary, we can say this: The first time facing the lineup, the starting pitcher has the advantage, as compared to his overall “true talent.” The second time, the battle between the pitcher and batter is roughly neutral. The third time through the order, the batter gains the advantage. The fourth time, the balance appears to be neutral again; however that may not be quite true, as we will see in a while.
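To put numbers on that summary, here is a minimal Python sketch that subtracts the expected wOBA from the observed wOBA for each time through the order, using only the figures in the table above. The names `EXPECTED`, `OBSERVED`, and `ttop_points` are illustrative and not anything from The Book.

```python
# Expected and observed wOBA by time through the order (TTO),
# copied from the 1999-2002 table above.
EXPECTED = {"1": 0.353, "2": 0.353, "3": 0.354, "4+": 0.353}
OBSERVED = {"1": 0.345, "2": 0.354, "3": 0.362, "4+": 0.354}


def ttop_points(expected, observed):
    """Observed minus expected wOBA, in points (thousandths of wOBA)."""
    return {
        tto: round((observed[tto] - expected[tto]) * 1000)
        for tto in expected
    }


gaps = ttop_points(EXPECTED, OBSERVED)
for tto, gap in gaps.items():
    # Negative favors the pitcher, positive favors the batter.
    print(f"TTO {tto}: {gap:+d} points vs. expected")
```

A negative gap means the pitcher is beating his expected level; a positive gap means the batters are.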

Now that we’ve gotten the groundwork out of the way, let’s look at some interesting data and ask and answer some equally interesting questions. All data is now from 2000-2012. Again, pitchers batting and pinch hitters are not included.

First, we’ll look at the same data that we presented in The Book, but for 2000-2012.

TTO	Pitcher quality	Batter quality	Obs. wOBA (adj.)
1	.346	.340	.340
2	.345	.340	.350
3	.343	.343	.359
4+	.339	.346	.359

We basically see the same pattern that we found in The Book: around an 8-10-point increase each time through the order until the fourth (and later), at which point it levels out. The observed wOBA is a little lower than in The Book for the first three TTO groups because of the way it is calculated (no sacrifice hits, whereas in The Book we removed all bunts). The pitcher and batter quality numbers do not have SH removed, which is why they are lower as well.

Now let’s focus in on the first inning. While the first inning usually contains only batters who are facing the starter for the first time, some crazy stuff is going on that we don’t see in the second or third innings when also facing the order for the first time. It has nothing to do with the quality of the batters faced. All the observed wOBA numbers you will see from now on (as well as in the previous table) are adjusted for the quality of the batters and pitchers faced.

*First Time* Through the Order	TBF	wOBA
Inning one	274,332	.336
All other innings	258,871	.344

There seems to be something about the first inning that gives the pitcher an eight-point wOBA advantage as compared to the first time through the order in the second or third inning. Again, we might have assumed the opposite—that hitters should have the advantage, as pitchers need some more time to acclimate themselves to the mound, find out which pitches are working for them, etc. On the other hand, hitters haven’t seen any real pitching since their last game, they may have been sitting on the bench for some time, and they probably haven’t seen that particular pitcher for a while, if ever.

What happens if we split the above sample into home and away?

First Time Through the Order	Home team batters wOBA	Road team batters wOBA
Inning one	.347	.324
All other innings	.351	.338

The first time through the order, the home team has only a four-point hitting disadvantage in the first inning, as opposed to the second or third inning, but the road team hits a whopping 14 points worse! Your guess as to why there is such a large discrepancy between the home and road team in the first inning is as good as mine. Maybe coming to the plate before playing the field is a disadvantage for the visiting hitters, similar to the DH or PH penalty. Maybe it takes the visiting starter or even the fielders more time to get used to the mound and the playing field (although the data suggests that it is a hitting problem and not a defensive one). What’s clear, however, is that the home field advantage is extremely large in the first inning, larger than in any other inning by a long shot.

What about by the second time through the order? Has this imbalance between the home and road teams disappeared or at least dissipated? Let’s look at all the TTO data split by home and road pitcher.

Times Through the Order	Road Pitcher wOBA against	Home Pitcher wOBA against
1 (inning 1)	.347	.324
1 (innings 2 and 3)	.351	.338
1 (all innings)	.349	.331
2	.355	.346
3	.364	.354
4	.362	.354
All	.356	.343

It does appear that by the time we get to the second time through the order, the imbalance is mostly gone. The difference between the home and road wOBA the first time through the order is 18 points. The second, third, and fourth times through the order, the differences are all around nine points. One of the things to take out of this is that the home team starting pitcher derives a large portion of his home field advantage from pitching in the first inning. Relievers are not so fortunate. If you’re a pitcher and you want to pump up your stats, start all your games at home, and after you’ve faced nine batters, get the heck out of Dodge!
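The home/road gaps quoted above are plain subtraction on the table, and a short Python sketch makes them easy to check. The structure and names below (`WOBA_AGAINST`, `home_edge_points`) are illustrative only.

```python
# wOBA against by time through the order, from the table above (2000-2012).
# Each value is (road pitcher, home pitcher).
WOBA_AGAINST = {
    "1 (all innings)": (0.349, 0.331),
    "2": (0.355, 0.346),
    "3": (0.364, 0.354),
    "4": (0.362, 0.354),
}


def home_edge_points(table):
    """Road-pitcher wOBA against minus home-pitcher wOBA against, in points."""
    return {
        tto: round((road - home) * 1000)
        for tto, (road, home) in table.items()
    }


edges = home_edge_points(WOBA_AGAINST)
for tto, edge in edges.items():
    print(f"TTO {tto}: home pitcher is {edge} points better")
```

The first-time gap comes out at roughly double the gap in any later trip through the order.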

Let’s briefly get back to that funky fourth time through the order, when it seems that the TTOP stops dead in its tracks. Does the batter’s advantage level off by the time he’s seen the pitcher for the fourth time? Actually, not as much as it appears.

A while ago I stumbled on something interesting about what happens when a starter lasts into the ninth inning or later. The starter’s team is probably winning, of course, but the margin of victory also tends to be large. In other words, in the very late innings, if it is a one- or two-run game—or even tied—the closer or other short reliever is likely to be on the mound rather than the starter. And when the game is not close, especially in a blowout, for some reason wOBA does not do well in reflecting the losing team’s approach at the plate. Consequently, wOBA in the ninth inning or later, with a starter in the game, is artificially low. If we remove the ninth inning and later from the “fourth time through the order” data, we see the wOBA rise accordingly.

The other thing that is relevant is the temperature of the game when the lineup bats for the fourth time. In night games it is much colder, and most major league games are played at night. Let’s look at the regular TTO numbers, but this time we’ll do two things: One, we’ll include only up to the eighth inning, and two, we’ll split the data into three groups: outdoor day and night games, and indoor games.

Times Through the Order (through 8 innings only)	wOBA	wOBA Day Games	wOBA Night Games	wOBA indoor games (or roof closed in SEA and MIL)
1	.340	.337	.343	.335
2	.350	.349	.351	.345
3	.359	.361	.359	.358
4+	.361	.364	.359	.366

Eliminating the ninth inning and later raises the wOBA the fourth time through the order by two points in all games combined. And as you can also see, in day games it rises a little more, while it stays flat in night games. In the indoor games, where temperature is not a factor, we actually see a fairly large increase from the third to the fourth times through the order—eight points. In day games, we see only a three-point jump. Maybe in the daytime the temperature decreases a little between the third and fourth times, or maybe the batters and umpires are tired and want to go home. Again, your guess is as good as mine in explaining the above patterns. Suffice it to say that once weather is removed, as well as the ninth inning and later, we do in fact see a steady TTOP all the way through to the fourth or later time through the order.
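As a rough check on those trip-to-trip jumps, the Python sketch below computes the change in wOBA between consecutive times through the order for each game setting, using the table above. The names are illustrative, and this is only arithmetic on the published numbers, not a re-analysis of the underlying data.

```python
# wOBA by time through the order (through eight innings only), from the table above.
# List positions are TTO 1, 2, 3, 4+.
WOBA_BY_SETTING = {
    "all games": [0.340, 0.350, 0.359, 0.361],
    "day": [0.337, 0.349, 0.361, 0.364],
    "night": [0.343, 0.351, 0.359, 0.359],
    "indoor": [0.335, 0.345, 0.358, 0.366],
}


def steps_in_points(series):
    """Change in wOBA between consecutive trips through the order, in points."""
    return [
        round((later - earlier) * 1000)
        for earlier, later in zip(series, series[1:])
    ]


STEPS = {name: steps_in_points(series) for name, series in WOBA_BY_SETTING.items()}
for name, steps in STEPS.items():
    print(f"{name}: 1->2 {steps[0]:+d}, 2->3 {steps[1]:+d}, 3->4+ {steps[2]:+d}")
```

Only the indoor games keep climbing at close to the earlier pace on the last step.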

What about the quality of the pitcher? Does that affect the penalty? Are good pitchers good at least partly because they don’t suffer as extreme a penalty, and vice versa for bad pitchers?

Times Through the Order	Good pitchers (<.320 wOBA against for that season)	Bad pitchers (>.340 wOBA against for that season)
1	.297	.365
2	.305	.376
3	.317	.386
4	.321	.387

Interestingly, the really good pitchers show a fairly modest penalty from the first to the second time through the order (eight points), while the bad pitchers pitch 11 points worse. However, from the second to the third time, the aces get 12 points worse and the poor pitchers, 10. These differences could easily be due to sampling error. In any case, it is clear that great pitchers are by no means immune to the dreaded TTOP. These are starters who are elite pitchers, on average a run per nine innings better than the typical pitcher, yet by the time they face the lineup for the fourth time, they are barely .3 runs per nine above average. By the third go-around, both groups of starting pitchers, the aces and the duds, lose about 20 points in wOBA as compared to their first go-around, and around 10-12 points as compared to their overall numbers.

During the fifth game of the World Series, several people wondered whether Jon Lester would not suffer from the typical TTOP. They used that speculation to partially defend John Farrell’s decision to let Lester hit in the top of the seventh inning and continue to pitch in the bottom of the seventh, even though he was facing the Cardinals lineup for the third time. By that time, if the TTOP was in effect, we would have expected Lester to be a slightly above-average starter rather than the roughly no. 2 starter that he normally is (notwithstanding any potential “hot hand” effects resulting from pitching a good game so far). The third time through the order, the typical penalty is around .35 runs per 9 innings compared to a starter’s overall RA9.

The evidence that the Farrell defenders gave for Lester possibly being immune to the penalty was that in his career he has not shown the typical TTOP. I looked at 2009-2012 (I don’t have the 2013 data handy), and here is what I found for Lester.

Times Through the Order	Lester’s wOBA against
1	.320
2	.327
3	.327
4	.356
Overall	.326

We are not dealing with tremendously large sample sizes in each group, of course, so we don’t expect these numbers to be especially reliable, and it is unlikely that they would exactly mimic the pattern of the average starting pitcher. That said, Lester does show a roughly typical penalty from the first to the second time, no penalty from the second to the third, and an exceedingly large jump from the third to the fourth (the number of TBF in the fourth group is only around 165). However, before we can put any stock in the predictive nature of a player’s own patterns or deviations from the league average, we must estimate how much to regress that data toward the league mean—the typical TTO penalties.

That’s the same thing we do for platoon splits, BABIP, or even overall performance itself, like FIP, ERA, or wOBA against, when creating projections or estimating true talent. As it turns out, a pitcher’s past deviations from the league average, in terms of their TTO penalties from the first to the fourth times through the lineup, are not very predictive, much like BABIP. When I computed year-to-year correlations for all pitchers with at least 100 TBF in each “times through the order” group per season (an average of around 220 TBF per group), I got “r” values of around .03 for around 500 data points. That means that it would take around 7,100 TBF or 1,650 innings pitched (roughly eight seasons for a full-time starter) before we would regress a pitcher’s own TTOP pattern 50 percent toward that of the average starter. So unless a pitcher had a long history of a significantly larger or smaller TTOP than the average starting pitcher, we can assume that he will lose around .35 runs per nine innings the third time through the order. Keep in mind that because of the relatively small samples we are dealing with, the 95 percent confidence interval around the .03 correlation is roughly -.06 to .12.
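The arithmetic behind those regression figures can be sketched in a few lines of Python. The article does not spell out its formulas, so the code assumes the common rule of thumb r = n / (n + k) for the regression constant and the Fisher z-transform for the confidence interval; both reproduce the quoted numbers closely. The 4.3 TBF per inning figure is also an assumption, implied by the article equating 7,100 TBF with 1,650 innings.

```python
import math

TBF_PER_INNING = 4.3  # assumption implied by 7,100 TBF ~ 1,650 IP


def regression_constant(r, n):
    """Sample size k at which we regress 50 percent toward the mean.

    Assumes r = n / (n + k), solved for k.
    """
    return n * (1 - r) / r


def weight_on_own_data(n, k):
    """Share of weight on the pitcher's own numbers after n TBF."""
    return n / (n + k)


def fisher_ci(r, n_pairs, z_crit=1.96):
    """Approximate 95 percent confidence interval for a correlation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n_pairs - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


k = regression_constant(r=0.03, n=220)  # average of ~220 TBF per group
innings = k / TBF_PER_INNING
low, high = fisher_ci(r=0.03, n_pairs=500)

print(f"k = {k:.0f} TBF, or about {innings:.0f} innings")
print(f"95% CI for r: {low:.2f} to {high:.2f}")
```

At n equal to k, `weight_on_own_data` returns one half, which is what regressing 50 percent toward the average starter means.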

I’m going to look at one more thing, and then I think you can truly say that you know everything about the now-famous (I hope) “times through the order” penalty. In that same World Series game, there was also some talk about the fact that Lester had thrown only 69 pitches after facing the lineup exactly twice, so maybe he wouldn’t suffer any third-time penalty—another attempt to justify Farrell’s decision to leave him in the game. After all, most starting pitchers won’t be fatigued after only 69 pitches. While that is true, the TTOP is not about fatigue. It is about familiarity. The more a batter sees a pitcher’s delivery and repertoire, the more likely he is to be successful against him. In fact, 69 pitches is not even a low number when it comes to facing the leadoff hitter for the third time. It takes an average starter about 68.4 pitches to get through the order two times (18 times 3.8, the average P/PA in MLB).
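The pitch-count arithmetic generalizes to any number of trips through the order. Here is a small Python sketch using the 3.8 P/PA average quoted above; the function and constant names are illustrative.

```python
PITCHES_PER_PA = 3.8  # average P/PA quoted in the article
LINEUP_SIZE = 9


def pitches_through_order(trips, p_per_pa=PITCHES_PER_PA):
    """Average pitches needed to face the full lineup `trips` times."""
    return trips * LINEUP_SIZE * p_per_pa


for trips in (1, 2, 3):
    print(f"{trips} time(s) through: {pitches_through_order(trips):.1f} pitches")
```

By this yardstick, 69 pitches after two trips is almost exactly average.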

That said, even though fatigue due to elevated pitch counts is likely not much of a factor in the TTOP, the more pitches a pitcher throws each time through the order, the more the opposing batters are able to acquaint themselves with the pitcher. How much does that affect the penalty?

I looked at that in two ways: First, I looked at the number of pitches thrown going into the second, third, and fourth times through the order. I split that up into two groups—a low pitch count and a high pitch count. Here are those results. The numbers in parentheses are the average number of pitches thrown going into that “time through the order.”

Times Through the Order	Low Pitch Count	High Pitch Count
1	.341	.340
2	.351 (28)	.349 (37)
3	.359 (59)	.359 (72)
4	.361 (78)	.360 (97)

We don’t see much difference there. In general, number of pitches thrown does not seem to be a factor in determining how much of a penalty a starter is going to suffer each time through the order.

The second, and better, way I examined this question was this: I looked only at individual batters in each group who had seen few or many pitches in their prior PA. For example, I looked at batters in their second time through the order who had seen fewer than three pitches in their first PA, and also batters who saw more than four pitches in their first PA. Those were my two groups. I did the same thing for each time through the order. Here are those results. The numbers in parentheses are the average number of pitches seen per PA so far in the game, for every batter in the group.

Times Through the Order	Low Pitch Count each Batter	High Pitch Count each Batter
1	.340	.340
2	.350 (1.9)	.365 (4.3)
3	.359 (2.2)	.361 (4.3)
4	.361 (2.3)	.353 (4.3)

Wow! If a batter has seen more than four pitches in his first PA, he hits 25 points better the second time around. That is a huge revelation, I think.

As with the previous table, batters who’ve seen fewer than three pitches (about two, on average) during their first PA still benefit by 10 points in their next PA. So the big advantage seems to come from seeing a lot of pitches, especially in the first PA. This advantage seems to disappear by the third time through the order: by then, the “high pitch” batter has only a two-point advantage over the “low pitch” batter, compared with a 15-point advantage the second time. The fourth-time numbers in the “high pitch” group probably suffer from sampling error, as the TBF are only around 3,300. In fact, if we combine the third and fourth times in the “high pitch” group, we still get a wOBA of .360. By the time batters get to the third time through the order, how many pitches they’ve seen is mostly irrelevant. But from the first to the second go-around, it seems to be huge.
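For anyone who wants to verify the point differences, this Python sketch recomputes them from the per-batter table above. The variable names are illustrative, and the fourth-time row should be read with the small-sample caveat in mind.

```python
FIRST_PA = 0.340  # first time through the order, same for both groups

# (low-pitch batters, high-pitch batters) by time through the order,
# from the per-batter table above.
BY_TTO = {2: (0.350, 0.365), 3: (0.359, 0.361), 4: (0.361, 0.353)}


def points(delta):
    """Convert a wOBA difference to points (thousandths)."""
    return round(delta * 1000)


# Improvement from the first PA to the second PA for each group.
gain_second_pa = {
    "low": points(BY_TTO[2][0] - FIRST_PA),
    "high": points(BY_TTO[2][1] - FIRST_PA),
}

# Advantage of the high-pitch group over the low-pitch group at each TTO.
high_minus_low = {tto: points(high - low) for tto, (low, high) in BY_TTO.items()}

print(gain_second_pa)
print(high_minus_low)
```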

Batters who are patient are indeed imparting a benefit to their team. But it is not what most people think. It is not in order to drive the starter out of the game early—against most starters, especially the poorer ones, that would actually be a bad thing for the batting team! The benefit is to the batter himself. The more pitches he sees, the better his next PA, at least from the first to the second time through the order.

Let’s recap what we learned today about the “times through the order” penalty.

The first time through the order, pitchers pitch better than they do overall. This “first time” effect is magnified in the first inning, especially for the home pitcher.
Starters get progressively worse as they face the lineup for the second, third, and fourth times. The fourth-time penalty gets masked in outdoor games, especially at night, and in the ninth and later innings.
A pitcher’s career “times through the order” patterns have almost no predictive value. We can assume that all starting pitchers have roughly the same “true talent” TTOP, regardless of what they have shown in the past.
Good and bad pitchers show around the same magnitude of TTOP. The third time through the order, all starters are expected to pitch around .35 runs per nine innings worse than they do overall.
Pitch count does not seem to have much of an effect on the TTOP. For example, going into the third time through the order, whether a pitcher has thrown 60 or 75 pitches doesn’t seem to matter much.
For an individual batter, the number of pitches seen makes a huge difference. The largest difference is from the first to the second time through the order. If a batter sees fewer than three pitches in his first PA, he hits 10 points better his second time at the plate. If he sees more than four pitches his first time up, he hits 25 points better on his second go-around!

As you can see, the “times through the order” penalty is a significant effect that should be incorporated into a manager’s decision about when to remove a starting pitcher. In fact, it would behoove managers and pitching coaches to be much more mindful of a starter’s “times through the order” than his pitch count. In an article I wrote two years ago about the benefit of “quick hooks,” I showed that a typical NL team could add from a half to a full win per season simply by removing a starting pitcher who is not an ace whenever he comes to bat in a high-leverage situation after pitching at least five innings, even if his replacement is a league-average reliever. Even in AL parks, where pitchers don’t bat, managers should be inclined to replace a pitcher, especially a fourth or fifth starter, as soon as he faces the order for the third time. These mediocre or worse starters are likely at or near replacement level by this time, even if they have been pitching well.

If you are watching a game and feel inclined to criticize or (less likely) praise your favorite manager, make sure that you don’t forget to consider everything you just learned about the “times through the order” penalty.


bozarowski

11/05

While I don't disagree that in most cases times through the order will lead to diminishing returns, I'm concerned about taking it as an absolute principle. This reminds me of early DIPS theory a bit in the sense that we're taking it as a bright line rule about pitcher performance when I suspect the reality on an individual scale is much different than the reality across all pitchers. It's no longer seriously controverted that some pitchers are better at inducing weak or inefficient contact (see Cain, Ford, Glavine, etc) thus 'defying' DIPS and I wouldn't at all be surprised to see that there are times through the order defiers as well. Anecdotal evidence suggests that a guy like Verlander often gets stronger as the game goes on. Further, I suspect that pitchers who have more MLB caliber pitches in their repertoire, who are smarter or who have a better catcher helping call a game for them would show a less pronounced times through the order effect than a two-pitch, 1.0 WAR sort.

I think, as well, this understates the risks of overtaxing a bullpen and the stacked effect that can build long term. The analysis feels more relevant to the playoff context (when relievers can pitch nearly every game thanks to off days) than it does to the regular season context where managers try to avoid using relievers more than two days in a row.


Schere

11/05

He addresses this rather specifically in the text.

In any case, Verlander's career splits (all this data are easily available on bbref):

1st time through the order: .629 OPS
2nd time through the order: .638
3rd time through the order: .706
4th time through the order: .666, in 1/5th the sample size


bozarowski

11/05

"For all intents and purposes, like BABIP, we can ignore a pitcher’s own historical TTOP when projecting any future penalties. In other words, we do expect Lester, or any other pitcher, to lose around .35 runs per game the third time through the order, regardless of what he has done in the past."

This is exactly the language I object to here. I don't agree that we can ignore all past production and pretend that every single pitcher is applicable to the same curve that overarchingly reflects pitchers. I say this as a sabr-loving, BP-subscribing, baseball nerd, but this to me feels strongly of ignoring the less tangible elements of pitching in favor of a broad, potentially more satisfying, conclusion. This feels just like the early days of DIPS - DIPS certainly has value and is overarchingly accurate, but can struggle on an individual level - it seems too extreme for me to accept as gospel anything that creates a hard and fast bright line rule that ignores any less mathematical analysis of player performance or game theory.


lichtman

11/05

Oh, and one more thing. I don't think that DIPS is like the TTOP. I don't think the pitcher has much control over the TTOP. I don't think it has much to do with him. I think it is almost entirely about batters simply getting used to the pitcher, and I don't think that a pitcher can do much about it. That makes it very different from DIPS.

In any case, your "opinion" doesn't matter. The math speaks for itself. If all you know is a pitcher's past times through the order numbers, the math tells us that we can't use that to predict the future. If you want to argue with the math, be my guest.


Behemoth

11/05

Given that the data seems to support the idea that this is due to increased familiarity with a pitcher's repertoire, would it be possible to look at the variety of stuff thrown by a pitcher? It would seem that pitchers like Darvish, who throw a lot of different pitches, might have a lower penalty as they go through the order multiple times.


lichtman

11/05

That is certainly possible. Joe's data below suggests that that might be true. It is on my list of things to do!


bozarowski

11/05

To act as though a manager should make every decision predicated on the basis of a basic guarantee that a pitcher will be 0.35 runs worse the third time through the order and that no other factors (in a single game window) including pitcher quality, catcher quality, lineup quality, temperature, type of weather, home or road game or actual pitcher performance in a given game matter is simply going too far. You're operating as though each pitcher performs at a baseline talent level every single game. The Verlander numbers Schere cites above are a perfect example - in a game when Verlander is 'on' (one where he's more likely to see the lineup a 4th time) he improves late in the game relative to a normal start's 3rd time through the line-up. There are countless examples of little things that would operate to impact pitcher performance on a game-to-game basis - the 'math' doesn't speak for itself. I mean fundamentally this 0.35 runs number takes as an objective truth that every single starting pitcher in MLB tires at exactly the same rate - a less fit pitcher should perform worse the third time through and a more fit one should perform better, it's simple human biology.


lichtman

11/05

Yes, I did address this with the correlations. Like DIPS, there may be pitchers who have their own unique "times penalties" (or not), but we can't tell from their past results, even for many seasons. That is what a very low correlation (.03) tells us, by definition - that a pitcher's past differences (between times through the order) have almost no predictive value. So you can't contradict that when the math is almost irrefutable.

"I think, as well, this understates the risks of overtaxing a bullpen and the stacked effect that can build long term."

Well, I'm not advocating anything. I'm simply giving and explaining the data. What a manager wants to do with that is up to him - not me. But I think that it would behoove managers to understand this phenomenon in order to make those decisions, don't you?


markpadden

11/11

The main problem I have with this article is that you assume that past TTOP is the only information one can use to predict future TTOP. At least that is what it sounds like when you proclaim, "We can assume that all starting pitchers have roughly the same “true talent” TTOP". No, we can't make this assumption. All we can state is that in this specific case, if one is limited to a certain data set to project a certain skill, that data set can be largely ignored. It does not say anything about whether there can be individual pitchers with truly abnormal, sustainable TTOP, or anything about how one might identify them. There are only 150 starting pitchers at any given time. There is no need to save data processing time by adopting generalizations, when plenty of intellectual throughput exists to evaluate each player individually. The same goes for DIPS, platoon splits, home splits, etc.


GoTribe06

11/05

I was thinking the same thing about there surely being outliers, whether it is pitchers with more pitches or pitchers who work with certain catchers that are better at strategically unveiling their talents at ideal moments (could this be another 50 runs Jose Molina was secretly worth?), but my takeaway was that even if there are outliers, you will never be able to accurately identify who they may be. A manager is better off using the aggregate data than trying to slice the data too finely.

And these principles need to be integrated into the manager's decisions (and plans). You obviously can't just remove all of your starters after two times through the order and stack another 350 innings on your bullpen (except in the playoffs), but there are great opportunities to optimize when you use short hooks (day games and in domes) and when you let your starter ride (night games early in the season).


Schere

11/05

This is really clearly presented and interesting, thanks. I commented on your wordpress about the Lester decision, and I'm happy to see your work here as well.


Kinanik

11/05

I wonder, do different teams systematically improve at different rates? I imagine with a good clubhouse atmosphere, manager, or coaching staff, a team can be better than average at adapting to a pitcher. A bad clubhouse atmosphere leads to less attention and slower adaptation. Could this be a sort of instrument to dig further into the effects of a manager or 'intangibles'?

lichtman

11/05

I have no idea. I imagine that this is a universal thing and probably largely subconscious, but then again, I am no cognitive psychologist. It would be interesting, as you say, to look at individual teams. Other than random fluctuations due to sample size issues, I am guessing that all teams have roughly the same patterns, both on offense and from the pitching side.

jroegele

11/05

Very thorough coverage - I love this topic too as you know MGL!

I'd calculated some related numbers for how batters fare in their second plate appearance, both based on total pitches faced in their first PA as well as the number of *unique pitch types* seen in their first PA.

This is what I found for 2011-Aug 2013, second PA performance per first PA unique pitches seen (not adjusted for pitcher/batter quality though):

1st PA Unique   UIBB%   K%     wOBA   BACON
1               6.9%    9.5%   .312   .284
2               7.1%    9.1%   .321   .289
3               7.2%    8.9%   .322   .290
4+              7.3%    8.7%   .328   .291

So basically everything gets better in terms of second PA performance the more unique pitch types you see in your first PA. You walk more, strike out less, better wOBA and better Batting Average on CONtact (including HRs).

Really enjoyed the article!

lichtman

11/05

Thanks. Your data is interesting. Let me just get it straight what you mean by number of unique pitches. I think I do. For example, if a batter sees 4 fastballs, that is considered 1 unique pitch? But if he sees one fastball and one curveball, that is 2 unique pitches, right?

I think you want to somehow separate number of unique pitches from number of pitches altogether. For all we know, you may be simply picking up the effect of number of pitches, period, and not unique pitches.

You want to do something like, one group are batters who saw 4+ pitches but at least 3 unique pitches and the other group are batters who saw 4+ pitches but they were all the same. Or something like that where we can separate the effects.
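
A minimal sketch of that split, in Python. Everything here is hypothetical: the function names, the pitch-type codes, and the record layout (a list of first-PA pitch types paired with a wOBA value for the second PA) are assumptions for illustration, not anything from jroegele's actual data or code. The idea is only to hold the pitch count roughly fixed (4+ pitches) while comparing "all the same type" against "3+ distinct types".

```python
from collections import defaultdict

def classify_first_pa(pitch_types, min_pitches=4, min_unique=3):
    """Assign a first PA to 'same', 'varied', or None (excluded).

    pitch_types: list of pitch-type labels seen in the first PA
    (hypothetical codes such as "FF" or "CU").
    """
    if len(pitch_types) < min_pitches:
        return None              # too few pitches, not in either group
    unique = len(set(pitch_types))
    if unique == 1:
        return "same"            # 4+ pitches, all one type
    if unique >= min_unique:
        return "varied"          # 4+ pitches, at least 3 distinct types
    return None                  # in-between cases are left out

def mean_second_pa_woba(records):
    """Average second-PA wOBA value per group.

    records: iterable of (first_pa_pitch_types, second_pa_woba_value).
    """
    totals = defaultdict(lambda: [0.0, 0])   # group -> [sum, count]
    for pitch_types, woba_value in records:
        group = classify_first_pa(pitch_types)
        if group is None:
            continue
        totals[group][0] += woba_value
        totals[group][1] += 1
    return {g: s / c for g, (s, c) in totals.items()}

# Made-up records, only to show the call shape (not real results).
toy = [
    (["FF", "FF", "FF", "FF"], 0.0),
    (["FF", "CU", "CH", "FF"], 0.9),
]
print(mean_second_pa_woba(toy))
```

As written, this does not control for batter or pitcher quality; each group's average talent would still need to be reported alongside the result.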

It is also always nice (somewhat mandatory actually) to at least report (if not control for) the quality of the pitchers and batters in each group. Once you start looking at number of unique pitches thrown or seen, you could easily have substantial differences in the quality of the batters and pitchers in each group. For example, when I was looking at base stealing, I established two groups, one was pitchers who allowed a lot of steal attempts and the other was pitchers who did not. To my surprise the latter group were much better pitchers, based on wOBA against (i.e., not even considering SB/CS against). If I had not controlled for pitcher quality in my research, I would have been in trouble.

Nice work!

newsense

11/05

In comparing the effect of number of pitches in the first at-bat, is it possible that the patient hitters have greater true talent, that they penalize themselves by taking too many pitches in the first at-bat but show their greater talent later on?

lichtman

11/05

Yes, that is possible. Interesting theory actually. The way to test that would be to see which batters, if any, take more than their share of pitches in their PA in general. I'd have to think about that, but that is an interesting thought! Thanks.

tbunns

11/05

In looking at the benefit of seeing pitches in the first at bat, did you look at the two (sort of) control groups - those taking an intentional walk in the first at bat & those taking a 4 pitch walk in the first at bat?

The IBB control is pretty clear - but there might not be enough data points. There should be enough for 4 pitch walks over a season or two. I'm not sure exactly what the numbers would be telling us for the 4 pitch walk, but it would be interesting to have.

lichtman

11/05

Good idea! I might try that. Sort of a controlled experiment in real life!

TangoTiger1

11/05

Wonderful research, great presentation.

lichtman

11/05

Thanks Tango. And thanks to you for discovering this TTOP. I think you did at least.

caminante1

11/05

One correction: Lester pitched in the 5th game and not the 6th game of the World Series.

bornyank1

11/05

Fixed.

mcbrown

11/05

MGL: great work. I thought fatigue would prove to be a bigger factor in the TTOP. I was wrong. It really is just about familiarity.

Nicely done.

lichtman

11/05

Yeah, I never thought it had much to do with fatigue, but I wouldn't rule it out completely. Clearly there is a point at which each pitcher gets fatigued and pitches worse than normal, even given the applicable TTOP. We found in The Book that there is probably no magic number for that (like 100 pitches). It is hard to research, I think, but there probably needs to be more research on fatigue anyway.

And I don't trust managers and pitching coaches to be able to figure that out in the middle of a game. They are too focused on and biased by results. For example, if we could somehow know when a pitcher was indeed tired and we allowed that pitcher to throw one more inning and he struck out the side (even tired or bad pitchers can pitch well, right?), I would be willing to bet my last dollar that a manager or pitching coach would think that he is just fine! And vice versa. If we could know that a pitcher after 103 pitches was NOT tired, and he were to give up 2 walks and a HR (that happens to non-fatigued and good pitchers, believe it or not), I would also bet my last dollar that managers and coaches would take him out and tell us that he was tired.

therealn0d

11/05

Let us not forget hitters that are so inclined, time permitting, to run into the video room and watch their at bats to see how they were pitched and whatever else they have time to observe.

gjhardy

11/05

Wasn't it LaRussa who started to play around with "splitting games" between two starting pitchers, with one pitching the first four innings and the next pitching the next three or four innings? This research would seem to vindicate the concept.

It would take a really geeked up front office to try to enforce such a pattern, though. And the only teams who might be willing to try to do something like that would be the desperate teams, the ones with less talent, and the results would probably not look good, even if the strategy produced better pitching numbers than expected.

Forty years from now my kids could be saying, "Heck, I remember when pitchers were 'real men' and were going six and even seven innings each start. They weren't being mollycoddled like today's four-inning wimps. Those were the days!"

whjohnson37

11/06

I believe the Astros minor league system employed “split level rotations” where two starters pitched in the same night at several levels (but not for the entire year). This was employed to give multiple guys looks as starters and to keep pitch counts down but it would be interesting to see how they fared. The Astros sent six minor league teams to the playoffs. Something went right!

LlarryA

11/05

Nice work, and it does help to explain some things, but we need to be sure that we don't get blinded as to its limitations on an individual game basis. While any manager should be aware of this phenomenon, and take it into account, there are many other indicators (and sure, some of them are entirely subjective), such as velocity and movement as fatigue indicators, that should also be taken into account. This is not a sledgehammer to be wielded indiscriminately.

On a given night, a particular pitcher may be performing so well relative to his 'true talent level', that even with the penalty, he is still better than any available reliever.

I would be interested to see further incorporation of jroegele's work involving pitch 'types' as well. PitchFX has some limitations in how well it distinguishes pitches, and we do start to risk small samples, but that may be the fundamental cause of this penalty. I also wonder how much of an effect there is if the individual batters are more or less familiar with the particular pitcher, though again, small sample sizes may mask any true results.

lichtman

11/06

"On a given night, a particular pitcher may be performing so well relative to his 'true talent level', that even with the penalty, he is still better than any available reliever."

And how would managers and pitching coaches be able to recognize that? I submit (quite confidently) that they are so results oriented that they can't and don't. They get almost everything else wrong (sorry, but that's true), why would they get this one right? If you don't believe that they "get almost everything else wrong," please listen to the ex-players and managers who are the color commentators on TV broadcasts. They constantly spew nonsense. These are the same guys that manage teams.

LlarryA

11/06

Oh, I don't know, a two-hit shutout, maybe?

See, now this is a beautiful example of what old-schoolers don't like about statheads. You've run some very interesting numbers, found and quantified an effect, and then stated it as an absolute rule and that any manager who doesn't slavishly obey it is wrong.

Never mind that this is an average built up over a larger sample, and therefore half of the performances are better than that. You state that aspects of this have "no predictive value" in specific instances, yet totally dismiss the idea that there may be other information available to the manager to be weighed as well. And then you bring in the TV guys, who while they may know about the other inputs, don't have in the booth what the real manager has in the dugout, so of course they are a good yardstick...

Great math, but pardon me for finding the conclusion a little less absolute on an individual game basis...

markpadden

11/07

Wish I could +10 this comment.

Behemoth

11/07

But that's the whole point, and it's precisely these sorts of cases that matter. No manager with half a brain is going to leave his fifth starter in for the fourth time through if he's got through five innings with a couple of strikeouts and walks and given up four runs.

Just because someone is pitching a two hit shutout, it doesn't necessarily mean that they are a) pitching above their normal talent level or b) a better bet than a quality set up man or closer to pitch the last inning or two. Most good relievers are better than pretty much every starter over one inning, especially if the starter has already gone seven innings and three times through the order.

LlarryA

11/07

"...it doesn't necessarily mean that they are a) pitching above their normal talent level or b) a better bet than a quality set up man or closer to pitch the last inning or two."

It also doesn't mean that they aren't. You're also making an assumption about the quality of relievers available at the given moment.

I agree that this is really cool work (in the aggregate), and helps explain some of what we see. This should be one club in a manager's bag, but he should not wield it to the exclusion of all others. He has other in-the-moment information that may help tell him if his starter is running on the good side of the mean today, or whether he expects his relievers to be running on the good or bad side of their norms right now.

Schere

11/08

hey, go do the work. If a guy has a 2-hit shutout through 18 batters, how does the third time through the lineup go?

Scott0801

11/06

I just want to say this is a super idea to include a guest article from another source once a week. Great job! Makes BP site even better.

misterjohnny

11/06

Has anyone looked at this from a batting perspective? Are some teams more adept at taking advantage of the 3rd time through the order? Are some players? Or is the sample size too small?

pft1957

11/07

This is pretty great stuff and I will bookmark it.

I would like to see if it holds true for the bottom of the order guys as well as the better hitters, but that's probably a pretty big project.

I also wonder if it holds true for RP'ers. A typical RP'er may only face the same hitter 3-4 times in a year, if that, so it would be much harder to test, perhaps impossible due to the smaller samples. I always wondered if Red Sox hitters' familiarity with Rivera over the years had anything to do with Rivera's relatively poor performance against them (albeit still elite performance).

lichtman

11/07

I have done some unpublished research looking at the effects of seeing pitchers in prior games. There appears to be a small effect such that the more times you have seen a pitcher in the past, the better you do in an upcoming game. As I said, the effect is very small and I didn't delve into it very deeply (for example, does it matter if you saw him last week or last year). A lot of good comments and questions above and a few horrible ones.

lichtman

11/08

I apologize for an error in the article. This sentence:

"That means that it would take around 30,000 TBF or 7,000 innings pitched (roughly 35 years) before we would regress a pitcher’s own TTOP pattern 50 percent toward that of the average starter."

Should be, "around 7,000 TBF or around 1600 IP," which is 8 years and not 35 years. So if we have 3 years for a pitcher, we would regress his own penalties around 73% toward the mean. Of course "the mean" could be different for different kinds of pitchers. For example, pitchers with many different pitches like a Felix, may have a lower TTOP than, say, a pitcher who throws mostly fastballs - I don't know. Plus, the 95% or 99% confidence interval around that .03 correlation can be as low as no correlation at all (it is highly unlikely to be a true negative correlation) and as high as .1 or .13 or so.
Thanks to Jared Cross for picking up that error (on The Book blog).
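
The 73% figure is consistent with the usual shrinkage form, where the fraction regressed toward the mean is k / (n + k) and k is the sample size at which regression is exactly 50 percent. A short Python sketch of that arithmetic follows; the function names are hypothetical, and the k / (n + k) form is an assumption on my part that happens to reproduce the numbers in the correction (k = 8 seasons, n = 3 seasons), not something the comment spells out.

```python
def regression_fraction(sample_size, half_point):
    """Fraction by which to regress an observed value toward the mean.

    half_point is the sample size at which regression is 50 percent
    (about 8 seasons, or roughly 7,000 TBF, per the correction above).
    Both arguments must be in the same units.
    """
    return half_point / (sample_size + half_point)

def regressed_estimate(observed, mean, sample_size, half_point):
    """Blend an observed value with the population mean."""
    w = regression_fraction(sample_size, half_point)
    return w * mean + (1 - w) * observed

# Three seasons of data against an 8-season half point: 8 / (3 + 8)
print(round(regression_fraction(3, 8), 3))  # 0.727
```

With exactly 8 seasons the fraction is 0.5, matching the "50 percent" statement, and it shrinks toward zero as the sample grows.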

bornyank1

11/09

Updated the article to correct this.

bachlaw

12/03

MGL,

For your wOBA data going through 2012, did you keep the same linear weights for each event as published in The Book or did you use adjusted ones for each year?

Thanks!

Jonathan
