

January 29, 2014

Throw the Flag

Challenges and the Replay Review System

by Dan Brooks and Russell A. Carleton


About that instant replay system that MLB put in place—we found a little problem with it. It started with us asking a pretty easy question. What is the best strategy for a manager to use in deciding when to throw “the flag” to challenge a call? We were sitting around talking about it, and the answer that we came up with is actually kinda scary: Managers should just throw that flag for any close play, the first time that they see one. When we say any close play, we mean just about anything that they have a smidgen of belief could be overturned by consulting a replay. And they shouldn’t fear throwing it even in the first inning, or throwing it to contest something that would give them only a trivial advantage.

If managers are truly doing it right (in the mathematical sense of the word), there will be a lot of replay challenges on plays where the audience will say “Yeah, it was close… but c’mon, it wasn’t that close.” Even if they’re not doing it right, there will still be plenty of those. This is entirely different from the putative goal of the system, which was, as Tony La Russa says, to go after “the dramatic miss, not all misses.”

This may seem counterintuitive, but managers should be losing challenges. A lot of them. So many of them, in fact, that the best managers in terms of maximizing “run production” gained from challenges will almost certainly be the worst managers in terms of challenges won percentage. It’s kind of like that old adage, “You miss 100 percent of the shots you never take.” Well, since there’s essentially no cost for missing, any time managers see a challenge opportunity, they should take it.

Warning! Gory Mathematical Details Ahead!
We tried to figure out the situation in which a manager would have the most to gain by challenging a play. We figured that it would be bases loaded, two out, and the batter hitting a ball into the outfield. The right fielder goes racing back and over, dives, and... did he make that catch or trap it? He's holding up the ball like he caught it, but the runners (who were running on contact because there were two outs) alertly keep going. Two have already scored when the second base umpire rules that the fielder did not catch the ball, and the runner from first is heading toward the plate. The right fielder gets up and gets the ball in to the cutoff man, who turns to throw home. The cutoff man is so distraught over what he feels is a blown call that he airmails the ball over the catcher, and it skips past the pitcher backing up the play. The batter, now at third, decides to break for home and complete a very odd inside-the-park grand slam, and he clearly makes it as the catcher drops the throw from the pitcher, who has finally managed to track down the ball. The fielding team's manager immediately runs out and challenges the call of a trap, pointing out that if the right fielder caught the ball, the inning is over and no one scores.

Then we tried to determine the most trivial application of using the challenge. The best that we came up with was a 2-0 count, when the pitch comes near to the batter and may have hit him. It’s the difference between ball three and essentially skipping right to ball four. The hitting team’s manager would prefer first base to a 3-0 count, but the difference isn’t that big. (We estimated it at some small fraction of a run.) From a strictly mathematical point of view, it’s important that we figure some of these things out. The reason is that while we don’t know exactly what’s going to happen over the first six innings of the game, we can make some reasonable assumptions about what’s going to happen and use the basic laws of probability combined with expected value to figure out the costs and benefits of different strategies.

The problem with the challenge system is that… well, there just aren’t a lot of close plays in MLB. Sure, everyone remembers the ones that are, but mostly in the same way that people pay attention to airplane crashes when the overwhelming majority of flights have no problems at all. The truth is that on most grounders, the batter is out by a couple steps. On most fly balls, there’s no question as to whether the fielder made the catch. Most home runs go several rows back over the fence. You might be familiar with a 2010 study in which ESPN researchers found that there were roughly 1.35 calls during the average game that were close enough to justify using replay.

Now, according to the new rules, manager-initiated challenges will essentially end after the end of the sixth inning, so we’re probably talking about an average of 0.9 plays that are close enough to need replay within the first six innings (⅔ of 1.35). And if we were managing a team, we’d challenge only the calls that didn’t benefit our team. Assuming that half would go our way and half would not, that’s 0.45 plays that are both replay eligible and that we would be interested in challenging expected over the course of six innings.

The ESPN study suggested that nearly two-thirds of the time, the umpires actually got the call right on the field. In 20 percent of cases, they got it wrong, and in 14 percent, the evidence was too close to call (presumably, the call on the field would stand). Of the 0.45 plays that we might challenge, only 20 percent would be overturned. Suddenly, we’re down to an expected value of 0.09 plays (20 percent of 0.45) that would be close enough to need replay, would benefit our team from being overruled, and that would theoretically be overturned by a challenge. Again, we don’t know exactly what will happen. The umpires might have an awful day at the office, completely messing up five different calls that all have major implications on the game, but you have to set your strategy based on a realistic expectation of what might happen, rather than fear of what could possibly happen.
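The chain of multiplications above can be written out as a short calculation. This is only a sketch of the arithmetic in the text: the function and parameter names are hypothetical, and the inputs are the rounded estimates quoted above (1.35 close calls per game, two-thirds of them within the first six innings, half of them going against us, 20 percent overturned).

```python
def expected_overturns(close_calls_per_game=1.35,
                       share_in_first_six=2 / 3,
                       share_against_us=0.5,
                       overturn_rate=0.20):
    """Expected calls per game that are close, fall within the challenge
    window, went against our team, and would be overturned on review."""
    in_window = close_calls_per_game * share_in_first_six   # about 0.9
    worth_challenging = in_window * share_against_us        # about 0.45
    return worth_challenging * overturn_rate                # about 0.09


print(round(expected_overturns(), 2))  # 0.09
```

Changing any one input (for example, a higher overturn rate) scales the result proportionally, which makes it easy to see how sensitive the 0.09 figure is to each assumption.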

We don’t know what sort of reviewable play might present itself during the course of a game, but as we established above, the most that a play could be worth would be four runs (give or take). That means that even if we somehow knew that any disputed plays that took place during the game would be one of these four-run humdingers, these types of challengeable plays happen so rarely that the greatest expected value that we can hope for is 0.36 runs (0.09 plays x 4 runs). That’s the expected value in the top of the first inning, right after the PA guy yells the names of the various players so that everyone can fill out their scorecard.

As the game goes on, the chances for a replay-inducing call go down because there are fewer plays left to be made. (We’ll talk about that more in a minute.) The rules do say that if a manager gets a challenge right, he gets another but would be allowed no more than two. But that would require two challengeable plays. Even if we just lazily double 0.36 runs, we still haven’t even reached three quarters of a run, and we’re asking for a confluence of events rarely seen outside of science fiction or an important Cubs game to happen twice within the first six innings of the same game. In expected value terms, the cost of challenging the first play of the game probably isn’t even half a run—at the absolute maximum.
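As a rough check on that ceiling, here is a minimal sketch that multiplies the 0.09 expected overturnable plays by the four-run maximum and then "lazily doubles" it, as described above. The constant and function names are hypothetical, and the doubling is deliberately generous, since it ignores how unlikely two such plays in one game would be.

```python
EXPECTED_OVERTURNS = 0.09   # expected overturnable plays per game (from the text)
MAX_PLAY_VALUE = 4.0        # runs at stake in the best-case scenario (from the text)


def holding_value_ceiling(n_challenges=1):
    """Upper bound, in runs, on the value of saving n challenges
    at the start of the game."""
    return n_challenges * EXPECTED_OVERTURNS * MAX_PLAY_VALUE


one = holding_value_ceiling(1)
two = holding_value_ceiling(2)
print(round(one, 2), round(two, 2), two < 0.75)
```

Even the doubled figure stays under three quarters of a run, which is the bound the argument relies on.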

And let’s talk about what plays are actually likely to be reviewed. A much more likely play would actually be a bang-bang play at first, or a trap/no trap call that might be the difference between a single or an out. Changing a single into an out is worth something around .70 runs (give or take). Even a safe/out call at home plate, where the stakes are a scored run and no extra out or an erased baserunner and an extra out, is worth around a run and a half. (Let’s assume a single with a runner on second and no out, and a play at the plate. The run expectancy matrix says that, in 2013, one out, runner on first is worth .49 runs, and a scored run with a runner on first and no outs is worth 1.82 runs, a net value of 1.33 runs.) Let’s set the value of the types of plays most likely to be reviewed at an even one run. That’s probably generous. That brings the expected value of being able to challenge a call later in the game down to a measly .09 runs.
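The play-at-the-plate example can be reproduced directly from the two run-expectancy figures quoted above. This is a sketch only: the variable and function names are hypothetical, and the one-run "typical play" value is the generous round number chosen in the text, not a measured quantity.

```python
# 2013 run-expectancy figures quoted in the text
SAFE_AT_HOME = 1.82   # run scores, runner on first, no outs (includes the run)
OUT_AT_HOME = 0.49    # runner thrown out: one out, runner on first

EXPECTED_OVERTURNS = 0.09   # expected overturnable plays per game
TYPICAL_PLAY_VALUE = 1.0    # runs; the text's generous round number


def swing(value_if_call_goes_our_way, value_if_not):
    """Runs gained by flipping a call from one outcome to the other."""
    return value_if_call_goes_our_way - value_if_not


print(round(swing(SAFE_AT_HOME, OUT_AT_HOME), 2))           # 1.33
print(round(EXPECTED_OVERTURNS * TYPICAL_PLAY_VALUE, 2))    # 0.09
```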

So let’s say a manager is thinking of challenging a call early in the game. He needs to ask himself two questions: How big is the possible reward if the call is overturned, and how likely is it that the call will be overturned? If it’s a bang-bang play at first (again, worth about .70 runs), and he believes that by challenging the call, he’s got a 50/50 shot of having the call overturned, then he inherently believes that making the challenge is worth .35 runs, and he should probably make it.

In the first inning, the expected value of holding on to that challenge is around .09 runs. Because we estimate that changing a ball to a strike is worth 0.17 runs, if a manager believes he has even a 50/50 shot of getting a call as meaningless as that one changed in his favor, the challenge is worth about .085 runs: essentially break-even in the first inning, and worth it at any point after that. (Note: ball/strike calls are not actually eligible for review; they’re just a convenient baseball play worth very few runs.) So, even for something as small as the 2-0 count, “Did he get hit by the pitch or not?” situation, if the manager is fairly sure that the batter did get hit by the pitch, he should immediately walk out and ask for a replay.
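The manager's two questions boil down to one comparison: the chance of an overturn times the runs at stake, against the value of keeping the challenge in his pocket. A minimal sketch of that rule follows, with hypothetical function names and the text's rounded figures (.70 runs for a bang-bang play at first, 0.17 for the near-trivial call, .09 for holding the challenge in the first inning).

```python
def challenge_value(p_overturn, play_value):
    """Expected runs gained by challenging."""
    return p_overturn * play_value


def should_challenge(p_overturn, play_value, holding_value=0.09):
    """True when challenging now is worth more than saving the challenge."""
    return challenge_value(p_overturn, play_value) > holding_value


def break_even_probability(play_value, holding_value=0.09):
    """Overturn probability at which challenging and holding are equal."""
    return holding_value / play_value


print(round(challenge_value(0.5, 0.70), 3), should_challenge(0.5, 0.70))
print(round(break_even_probability(0.70), 3))
print(round(break_even_probability(0.17), 3))
```

With these rounded inputs, the bang-bang play clears the bar easily, needing only about a 13 percent chance of success. The 0.17-run call sits almost exactly at break-even for a coin-flip challenge in the first inning (about a 53 percent chance needed), and the bar only drops as the game moves along.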

The Bottom Line
All of this shows that it is in a manager’s best interest to contest even small calls that don’t go his way, even where it’s not very likely that he’ll win the challenge. And that’s going to get annoying at some point. Maybe he shouldn’t take the first chance he sees (another, slightly better one might come along), but his threshold should be rather low and get even lower. As the game progresses, things tilt even further in favor of making a silly challenge. Consider a manager in the bottom of the sixth inning who still has a challenge remaining. His ability to challenge expires in a few minutes, so if anything even slightly controversial happens, he might as well try to get the umpires to take a second look. This will probably be the year when we are introduced to the term “the sixth inning, ‘why not?’ challenge.”
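To illustrate how the threshold falls as the sixth inning approaches, here is a rough sketch. It rests on an assumption that is not in the article: that close plays arrive evenly across the first six innings, so the value of holding a challenge shrinks in proportion to the innings remaining. The function names and the linear decay are hypothetical; the .09 and .70 figures are the rounded estimates from the text.

```python
def holding_value(innings_completed, start_of_game_value=0.09, window=6.0):
    """Value (in runs) of saving the challenge, ASSUMING close plays are
    spread evenly over the challenge window (an illustrative assumption)."""
    remaining = max(window - innings_completed, 0.0)
    return start_of_game_value * remaining / window


def min_confidence(play_value, innings_completed):
    """Smallest overturn probability that justifies challenging right now."""
    return min(holding_value(innings_completed) / play_value, 1.0)


# Bang-bang play at first (.70 runs) at different points in the game
for innings in (0, 3, 5.5):
    print(innings, round(min_confidence(0.70, innings), 3))
```

Under this assumption the required confidence drops toward zero by the end of the sixth, which is the "why not?" challenge in numerical form.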

This mathematical problem is exacerbated by the fact that unlike in the NFL, where making a bad challenge costs the team a very valuable timeout (which can be used to stop the clock, prevent a costly penalty, or better prepare the team for some later situation), a bad challenge in MLB costs essentially nothing. With no enduring damage to the team’s chances caused by making a bad challenge, the most costly scenario is one in which the challenge is never used.

We expect that at first, most managers will be scared to use their challenges at all. They might see a situation in the first inning that might call for a second look, but demur, figuring that another call might come along later in the game. Remember when we said that teams should plan strategies based on reasonable assumptions about the likelihoods of different outcomes rather than fear of what might happen? Most teams actually operate on the latter. After all, what will happen if a manager uses a challenge on a seemingly trivial call in the second inning, and a safe/out call of importance does come up in the fifth? He’ll be roasted for it in the media.

But realize that the new system fundamentally incentivizes challenges that will come off looking silly and petty. Eventually, someone will make one of these challenges and be mathematically justified in doing so. However, he’ll be seen as breaking some unwritten rule of decorum on the field, and his cleanup hitter will get a fastball in the ribs as a result. Plus, the umpires may grow to resent having to be constantly called out for things that 90 percent of the stadium already knows won’t be overturned.

To reiterate, even if the umpires will hate this ridiculously over-challenging manager, and even if the players will be frustrated with this over-challenging manager, and even if the media will roast this over-challenging manager, and even if you’ll detest this over-challenging manager for making a mockery of the challenge system, he’ll be mathematically doing the right thing in terms of helping his team win games. And it’s a bit strange to create a rule in which the optimal strategy will be universally detested. We should create rules that make people respect and admire the incredibly tough job that managers have, not ones that beg them to make fools of themselves for the sake of winning.

So if we may, here are a couple of small changes to the policy that would help MLB sidestep these issues. First, the idea of challenges, whether managers are throwing a little flag or not, is neat and works just fine for the NFL, and it’s a fun little strategic wrinkle to play with in the game of baseball. And yes, we look forward to more calls being made correctly. But the challenge system is incentivizing behavior that will make the game much less aesthetically pleasing. Instead, let the keys to the replay system rest with the umpires for the whole game, rather than just from the seventh inning onward. Umpires may use replay when they feel a call might need to be reviewed and use the same basic infrastructure. If you are worried that it will lead to managers coming out of the dugout to “suggest” a replay, then simply make suggesting a replay an offense similar to arguing balls and strikes, punishable by ejection. That gives you all the benefits of replay with none of the unfortunate inadvertent consequences.

If we’re going to keep the current challenge system, in which the power to challenge rests with the manager, we need some penalty for being wrong to dramatically change the threshold of what merits a challenge. If there were some penalty that potentially cost the team runs, then managers would want to be reasonably sure that they were going to win before they made a challenge. We suggest an out, either added to the current inning or added to the next one. Even this penalty would strongly favor managers challenging plays that they were convinced had been called wrong, correcting obvious mistakes, which was the intended point of the challenge system in the first place.
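To see how a penalty would move the threshold, consider a sketch in which a lost challenge costs the team some number of runs. The article does not put a run value on the proposed extra out, so `penalty_runs` below is a purely hypothetical parameter, and the values in the loop are placeholders chosen only to show the direction of the effect. The function name is also hypothetical.

```python
def break_even_confidence(play_value, holding_value, penalty_runs):
    """Smallest overturn probability p at which challenging beats holding.

    Challenging is worth p * play_value - (1 - p) * penalty_runs.
    Setting that equal to holding_value and solving for p gives
    p = (holding_value + penalty_runs) / (play_value + penalty_runs).
    """
    return (holding_value + penalty_runs) / (play_value + penalty_runs)


# Bang-bang play at first (.70 runs), first-inning holding value (.09 runs).
# The penalty sizes are hypothetical placeholders, not estimates.
for penalty in (0.0, 0.25, 0.5):
    print(penalty, round(break_even_confidence(0.70, 0.09, penalty), 3))
```

With no penalty the formula reduces to the holding value divided by the play value. Any positive penalty pushes the required confidence up, which is the behavior the proposal is after.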

Special thanks to Gabe Kapler, who had a simple question for Dan in preparation for something he was writing, and to GChat for enabling the conversation that led to this article.

Dan Brooks is an author of Baseball Prospectus. 
Russell A. Carleton is an author of Baseball Prospectus. 


28 comments have been left for this article.



Seems like making too trivial a challenge will risk ticking off the umpires, whom one may need to persuade to ask for replay review late in the game when the win probability sometimes is MUCH higher.

Jan 29, 2014 05:19 AM
rating: 1
BP staff member Russell A. Carleton
BP staff

My guess is that "Don't make stupid challenges" will become an unwritten rule in the game. I just wonder whether baseball is ready for the first time that someone gets beaned in response to a frivolous replay challenge.

Jan 29, 2014 07:04 AM

My guess is that some managers' strategy will be similar to that of some tennis players. Roger Federer, for example, doesn't (or at least didn't at the beginning) like the idea of challenging the official for a review. It was rare that he would actually use his challenges. He would still use them from time to time when he knew he had to, but wouldn't burn them as much as other players do.

Jan 29, 2014 10:39 AM
rating: 0

I agree there should be a penalty for a lost challenge, but an out opens up too many other questions. Which pitcher gets credit for the out and which batter gets charged with the out? How does the out affect the batting order? Could we have a 26 out perfect game? Just a few, but I am guessing there are a lot more.

Jan 29, 2014 05:38 AM
rating: 0
BP staff member Russell A. Carleton
BP staff

Another idea I had after we had gone to press was that perhaps a lost challenge could lead to the opposing manager being able to select a player from the other team's bench or bullpen who would be taken off the potential list of substitutes. Think they got the call wrong? Are you willing to bet being able to use Joe Nathan at the end of the game?

Jan 29, 2014 06:53 AM

How about we make it more simple. The penalty for a lost challenge is the forfeit of the challenge for the next game.

Jan 29, 2014 08:43 AM
rating: 3
BP staff member Dan Brooks
BP staff

Possible, although:
1) Odd in practice, I think, because most things in baseball do not explicitly carry over from game to game.
2) Hard to determine exactly what the break even point would be. It still would probably be worth it to make questionable challenges, just at a slightly reduced rate.
3) Doesn't really get to the heart of the issue, which is that the challenge system should fundamentally be there to correct umpire mistakes, and this suggestion means that some later mistake is allowed to stand simply because of a previous poor decision.

Jan 29, 2014 09:57 AM

1) Not explicitly but personnel moves for example, both pre-game and in-game, carry over to future games. Bullpen use is maybe the most glaring example.

2) I don't think the break-even will matter much in reality. This article makes great points and is based on sound logic, but most managers aren't evaluating in-game decisions in this way. They're going to challenge when it "feels right". It's not like the numbers have stopped Don Mattingly from ordering so many ridiculous sac bunts.

3) I totally agree that's an inherent problem with the challenge system. But working with the challenge system, if we're going to try to introduce a penalty as a deterrent then this seems more simple and to the point. It would directly affect managers who choose to use their challenges in situations where it is not clear that an umpire's call was wrong and may affect their willingness to use the challenge when there's only marginal potential gain. Again, not that they'll consider the numbers involved. But rather only that it will feel less "right" to make the challenge.

Jan 29, 2014 11:09 AM
rating: 0

Wouldn't that just make the challenge worth 0.18 runs? That still seems like a low threshold.

Jan 29, 2014 10:34 AM
rating: 0

Agree, but I don't think it'll matter. See my reply above.

Jan 29, 2014 11:10 AM
rating: 0

Great read. I agree that the lack of a disincentive means fans, players, everyone will be subjected to trivial/petty reviews on occasion, but I think an out is too strong of a penalty.

My suggestion would be: if the team in the field challenges a call and is wrong, the batter at the plate has a ball added to his count, and if the team at the plate wrongly challenges a call, then a strike is added to the count. Or add a ball and take away a strike, or something of that nature.

As you've shown, this penalty would be worth around double the expected runs of holding on to the challenge. It's coincidence, but that's a nice round number. I doubt baseball would ever do something like this, because won't you please think of the records, but the risk/reward seems more in line with the game than penalizing a full out.

Jan 29, 2014 05:48 AM
rating: 2
BP staff member Dan Brooks
BP staff

Sure, I think a ball/strike penalty is fine too.

Jan 29, 2014 08:03 AM

One problem is that you're also penalizing/rewarding the pitcher/batter statistically for getting an easier walk/strikeout. It also matters when in the count it occurs. Ball one matters less than ball four.

Jan 29, 2014 08:39 AM
rating: 1

Great piece and interesting conclusion. However, I think risk aversion will solve this problem. No manager is going to want to challenge an insignificant play in an early inning, lest he find himself answering questions after the game about why he blew his challenge in the 1st when the game was decided by an obviously blown call against his team in the 6th.

Jan 29, 2014 07:30 AM
rating: 1

could make for a lot of challenges in the sixth inning

Jan 29, 2014 09:40 AM
rating: 0
BP staff member Ben Lindbergh
BP staff

If anyone is interested in hearing more about this subject, Sam and I spoke to Dan about it on the podcast yesterday.

Jan 29, 2014 07:39 AM

I had thought that the rules allow a team employee to sit just outside the dugout with a TV and let the manager know if he should challenge. This will obviously have an effect on the frequency with which challenges are upheld.

Jan 29, 2014 07:51 AM
rating: 0

I think that's true, and should take care of this problem.

Jan 29, 2014 07:58 AM
rating: 0
BP staff member Dan Brooks
BP staff

I'm not sure this is true. Even in football, where there are many assistants with access to the video feed, there are still plenty of "maybe/maybe not" challenges.

Video access will help, but there will be plenty of challenges where it won't be clear cut.

Jan 29, 2014 08:01 AM

I think most baseball cases are more clear-cut - for example, the vast majority of who-got-to-the-bag-first cases are resolved easily with slow-motion footage. Much the same goes for an awful lot of disputed catches, fair/foul calls and most home runs. That must make up a pretty large percentage of potentially disputed calls. So it's less of an issue, then, if a manager challenges anything that appears to be in doubt.

Jan 30, 2014 02:03 AM
rating: 0

I hope this system has so many flaws it is immediately dumped. Either commit to making the correct call every time (robot umpires) or don't, or perhaps do as today, where boundary line calls are all reviewable. Replacing one human error/judgment (umpires) with another (manager's decision to challenge) isn't an improvement, adds to the technocratic nature of the game and wastes everyone's time.

Jan 29, 2014 08:38 AM
rating: 2

some of the problems could be minimized by giving managers a fixed number of challenges that they can use over the course of a season, say 20 or 30. of course you might get a run of challenges later in the season, but if you want to minimize frivolous challenges this could do it.

Jan 29, 2014 11:17 AM
rating: 0

I have a particular aversion - and I'm not sure why - to suggestions that may or may not be reasonable (or even excellent), but will never be implemented. No penalty that alters the game in any way shape or form will be considered by MLB, whether that is right or wrong.

I think this discussion is much ado about nothing. The math and the strategy are interesting, yet very few managers are going to make marginal challenges, for various reasons. And if some of them do - so what? It will take 30 seconds or a minute to resolve (especially if it is obvious that the manager is wrong). A month or two into the season, no one will be paying much attention to the challenges, especially the gratuitous ones (if there are any). It will give people an excuse to go get a beer, like a mound visit.

And if it becomes a problem, MLB will change the rules. I assume that the present system is somewhat experimental anyway, no?

Jan 29, 2014 14:50 PM
rating: 2
BP staff member Dan Brooks
BP staff

I understand the point that managers won't do this (and so the entire discussion is academic), but about the penalties part of the comment: there are already plenty of penalties that alter the course of the game. You can't bat out of order, for example, without being charged an out. You can't interfere with a fielder without being called out (see the World Series!). I don't see why it would be so absurd to say if a manager lost a challenge he would be given an out.

Jan 30, 2014 06:22 AM

I like the ball-strike penalty more than any of the others I've read so far. But I still think it's an unwanted game-change. Is there a penalty that does not explicitly impact game stats? For instance, a blown challenge eliminates mound visits by the catcher, pitching coach, or manager unless the pitcher is replaced?

Or, rather than penalizing a blown challenge, why not reward a manager who keeps the flag in his pocket? Something to entice a manager from making a frivolous challenge "just because."

Jan 29, 2014 16:25 PM
rating: 0

I read the article and listened to the applicable podcast. One thing that wasn't discussed is I would expect to see increasing use of the challenge as you approach the end of the sixth inning (i.e., a manager is more likely to use the challenge on a frivolous situation as the "time to expiry" approaches). The willingness to use the challenge should increase as the time to expiry approaches. I would think some stock option mathematics would come in handy to calculate when to optimally exercise the challenge.

While challengeable plays may be expected to occur evenly over the first six innings (and probably more often as the number of runners increases), the "life" of the challenge decreases over those same six innings, i.e., it is "dead" as the last out is recorded in the sixth. Managers have every incentive to use it or lose it. That incentive increases dramatically as you approach the end of the sixth inning.

Jan 29, 2014 21:51 PM
rating: 0
BP staff member Dan Brooks
BP staff

We mention this in the first paragraph after "The Bottom Line". =)

Jan 30, 2014 06:22 AM
Pat Folz

One non-statistically-altering (and, some might argue, not actually game-altering :p) idea would be to eject the manager for a failed challenge. I really like that, actually -- it would presumably cut waaay down on frivolous challenges, and the ones that are made would be real tests of managerial (and umperial) intestinal fortitude.

Though really, MLB (and the NFL for that matter) should just do like college football and have some umpire or league official review every play and buzz the umps on the field if s/he decides it should be reversed. Challenges are just a dumb minigame that slows down the real game, and really no one should be strategizing about getting the call right.

Jan 30, 2014 11:56 AM
rating: 1