Baseball must be toasting this week’s sports pages over glasses of vodka and schadenfreude. Last Friday, NBA referee Tim Donaghy was implicated in a betting scandal. On Wednesday, Tour de France leader Michael Rasmussen, under heavy suspicion of doping, was kicked out of the race by his own team. And on Thursday, Michael Vick was scrambling away from reporters in a federal courthouse, rather than opposing linebackers on the field.

Of these scandals–and for that matter, Barry Bonds’ “scandalous” pursuit of the home run record–it’s the NBA’s that will have the farthest-reaching impact. Vick’s problems are strictly off the field, and the average NFL fan is probably more concerned about how he’ll impact their fantasy draft than what he did to those poor animals. Rasmussen (apparently) and Bonds (allegedly) cheated, but at least they did so in an effort to help themselves win. Only in the NBA is the integrity of the games themselves threatened.

Could something similar happen in baseball? It’s no accident that the betting scandal happened in basketball, a sport in which the referees have a relatively large discretionary impact on the outcome of the contest (about one-quarter of all points in the NBA are scored on free throws). A gambling expert interviewed by ESPN estimated that an NBA official could influence the outcome in his favor as often as 75 percent of the time. At that figure, the incentives to cheat become compelling.

The good news is that this number is surely much lower in baseball. If we located baseball on a continuum of sports relative to the impact that judges or officials have on the outcome of the contest, it would place closer to golf than to figure skating on the spectrum. Nevertheless, it is intriguing to figure out exactly what the relevant percentage is. If you bribed the home plate umpire to ensure that the Dodgers beat the Padres, how often could he facilitate that outcome? Would it be often enough to make corruption attractive?


Before we answer that question, let’s consider exactly what sort of return on investment an unscrupulous bettor would require in order to justify offering a bribe to an official. A Vegas sportsbook will generally lay odds of -110 to a team that it expects to win the game 50 percent of the time. What that means is that you’re betting $110 to win $100, and that if the house has laid its odds right, it can expect to make an average of $5 (4.55 percent) on your wager. (You might be able to get away with less vig if you placed your wager with an illegal bookie, but on the other hand, you are taking some risks–the bookie might not pay you out, or he might not take further action from you, or he might get busted by the feds and rat on you.)

Given a vigorish of 4.55 percent, you need to win your bet about 52.4 percent of the time in order to break even, and a little better than that to justify putting your money at risk. A well-bankrolled handicapper could probably get away with a long-term winning percentage of 53 percent, and anyone who can beat the line 54 or 55 percent of the time will find himself living pretty large.
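The vig and break-even figures follow directly from the -110 price. Here is a minimal sketch of that arithmetic; the function names are illustrative, and the code handles only negative (favorite-style) American odds.

```python
def break_even_probability(american_odds: int) -> float:
    """Win probability at which a bet at negative American odds breaks even."""
    risk = -american_odds   # dollars staked to win 100, e.g. 110 at -110
    win = 100.0
    return risk / (risk + win)


def house_edge(american_odds: int, true_win_prob: float = 0.5) -> float:
    """Bettor's expected loss per dollar staked, given the true win probability."""
    risk = -american_odds
    expected_profit = true_win_prob * 100.0 - (1.0 - true_win_prob) * risk
    return -expected_profit / risk


print(round(break_even_probability(-110), 4))  # 0.5238
print(round(house_edge(-110), 4))              # 0.0455
```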

Of course, this is no ordinary bet. For one thing, you have to bribe the official. Imagine that you have $100,000 to spend, and that the ref will do his best to manipulate the outcome for you if you pay him $20,000, leaving you $80,000 to gamble. At that rate, you would have to win your bet 65.5 percent of the time in order to cover both the bribe and the vig. If you can get away with a $10,000 bribe, then your corresponding winning percentage goes down to 58.2 percent. Some might object that you could threaten the official rather than bribe him, but it costs money to hire a hit man, too.
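The bribe-adjusted thresholds can be reproduced by treating the bribe as a sunk cost that the bet has to earn back. This sketch assumes the whole remaining bankroll is staked on a single -110 wager; the function name and parameters are illustrative.

```python
def break_even_with_bribe(bankroll: float, bribe: float,
                          risk: float = 110.0, win: float = 100.0) -> float:
    """Win probability needed to recover both the bribe and the vig."""
    stake = bankroll - bribe        # what is left to wager
    payout = stake * win / risk     # profit if the bet wins
    # Solve p * payout - (1 - p) * stake - bribe = 0 for p.
    return (stake + bribe) / (stake + payout)


print(round(break_even_with_bribe(100_000, 20_000), 3))  # 0.655
print(round(break_even_with_bribe(100_000, 10_000), 3))  # 0.582
```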


The other major cost is the risk of detection, and herein lies a real dilemma for our legitimate businessman. The larger he bets, the more heat he’s going to bring upon himself. Very large bets may in fact move the line, which will inherently trigger a higher level of scrutiny.

In addition, the more the referee or umpire tries to manipulate the outcome of the game in his favor, the more likely he is to get caught. In any sport, there are a certain number of judgment calls on close plays that will trigger almost no scrutiny, at least over the short run. Beyond that, there are certain marginal plays on which a corrupt official can probably get away with a “bad” call, provided that he doesn’t press his luck too often. Finally, there is a set of miscalls that would be so egregious that detection would be almost instantaneous. If the umpire called every single pitch that Jake Peavy threw a ball, the Dodgers would almost certainly win the game, but just as certainly, he’d be led off in handcuffs, like one of those pedophiles on To Catch a Predator.

Although there is no way to quantify these percentages exactly, I suspect that a quasi-rational goon (think Tony Soprano) would require at least the 58.2 percent winning percentage represented by the 10 percent bribe to consider going forward, and perhaps quite a bit more than that once he evaluates his opportunity costs; the mob has lots of interesting things it can do with its money, after all. Could a corrupt home plate umpire–but one who is reasonably concerned about detection–shift the odds that far in his favor?

I can think of four distinct ways that the home plate umpire can manipulate the outcome of the game. The first is in the way that he calls balls and strikes. The second is in safe/out calls on close plays at home plate. The third would be by ejecting the starting pitcher and other key personnel. And the fourth would be by calling balks. There are some oddball scenarios in addition to these–for example, if the home plate umpire is also the crew chief, he has discretion over calling the game on account of rain–but they are too minor to worry about. We are assuming, by the way, that the home plate umpire is not in cahoots with any other official, player, or manager.

Balls and Strikes

The impact of turning a ball into a strike–or the other way around–depends primarily on the count. As a rule, the more advanced the count, the larger the impact of the umpire’s call; a call on a full count is seven times more important than one on 0-0 in determining the outcome of the game.

We can estimate the impact of a ball-strike call by referencing the expected run scoring value of the plate appearance from any particular count forward. For example, using a linear weights formula and data published by Tom Tippett, a plate appearance that reaches a 2-0 count has an expected value of .200 runs, while a plate appearance on a 1-1 count has an expected value of .115 runs. The difference between these two outcomes is .085 runs. In other words, if you turn a called strike on a 1-0 count into a ball (meaning that the count will be 2-0 rather than 1-1), the offense gains .085 runs in expectation. The complete set of such outcomes is listed in the table below.

[Table: change in expected runs from a ball-strike call, by count]

This data is pretty interesting on its own merits. It suggests, for example, that the first pitch is comparatively unimportant, meaning that the pitcher can usually get away with ‘establishing’ a pitch for later in the at-bat. Conversely, two-strike counts are especially important, including the 0-2 count. How a pitcher handles two-strike counts is one of those ‘pitchability’ skills that might deserve a longer look.

Getting back to the task at hand, we need two additional pieces of information to estimate the umpire’s impact in calling balls and strikes. The first is the distribution of called pitches between different counts. Conveniently enough, Dan Fox provided that very data in yesterday’s column, which I have reproduced here. The key thing to notice is that the counts on which the umpire has the largest impact come up less frequently. The 0-0 count alone represents more than a third of called pitches (in part because it’s a count on which batters take frequently), but it has the smallest impact on run scoring.

[Table: distribution of called pitches by count]

If we take a weighted average of the impact of a ball-strike call across the distribution of called pitches by count, we come up with .106 runs. In other words, calling a pitch a ball rather than a strike contributes an average of .106 runs to the offense’s cause, or vice versa.

The other question is how many pitches the umpire might reasonably have discretion to call either way, without triggering undue suspicion. Dan provides an answer to this question, too–he suggests that there are about 15 pitches per game that might be called a strike by the most conservative umpire, but a ball by the most liberal umpire, or 7.5 pitches per side.

We need to be careful, however, because some of the close strikes would have been called strikes anyway by an average umpire, and likewise with the close balls. Essentially, “close pitches” represents a spectrum of pitches ranging from those that are almost always called strikes (let’s say 98 percent of the time), to those that are almost never called strikes (two percent). The average of these values is 50 percent, and so, while the umpire might see 7.5 close pitches on each side, in only half of these cases (3.75 per team) will he actually alter the outcome. Thus, we estimate that the umpire contributes about .397 (3.75 x .106) runs to the home team’s run expectation, while deducting the same amount from the visiting team’s.

Assuming that we start with two teams, the Dodgers and the Padres, that each have an expectation of scoring exactly five runs (this means that the game is a pick ’em), what we wind up with is the Dodgers winning the game 57.6 percent of the time. This is determined by running the Dodgers’ and Padres’ new run expectations through the pythagenpat formula.
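For readers who want to check the arithmetic, here is a minimal sketch of the pythagenpat step. The exponent constant of 0.287 is an assumption (it is one commonly cited value; the text does not say which one was used), so the result may differ from the quoted figure in the last digit.

```python
def pythagenpat(runs_for: float, runs_against: float, k: float = 0.287) -> float:
    """Expected winning percentage; the exponent scales with total runs per game."""
    exponent = (runs_for + runs_against) ** k   # k = 0.287 is an assumed constant
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


swing = 3.75 * 0.106            # flipped calls per side times runs per flipped call
dodgers_runs = 5.0 + swing
padres_runs = 5.0 - swing
win_prob = pythagenpat(dodgers_runs, padres_runs)
print(round(win_prob, 3))
```

With two evenly matched teams the function returns exactly .500, so everything above that is the umpire's contribution.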


It might be argued that even this degree of corruption will put the umpire at significant risk of detection. After all, what we have him doing is calling a very tight strike zone when the Dodgers are at bat, and a very wide strike zone when the Padres are at bat. Since umpires are evaluated more by how consistent their strike zone is than by how large it is in the abstract, it’s doubtful that the umpire could get away with this for very long. Depending on whether the close calls came during important at-bats, the umpiring might be the lead story in the game. If the umpire had two or three such games in a row, he might well face review by the league, if not worse. Thus, this value should be thought of as something of an upper bound, which might be sustainable for one game, but probably not in the long run.

Close Plays at Home Plate

Baserunning plays at home are high-impact moments. Sean Smith estimates that a baserunner kill at home plate is worth about .85 runs. On the other hand, these plays do not come up very often. According to research conducted by Bil Burke, BP’s Director of R&D, there were only 261 outfield assists at home in the entire league in 2006, or slightly fewer than nine per team.

It is reasonable to assume that most of these plays are fairly close. However, there is a difference between “fairly close” and “too close to call.” We will guess that half these plays boil down to judgment calls in which the umpire does not risk detection by altering his call. Thus, when the Dodgers are at bat, there is approximately one baserunner kill per 37 games that our umpire can turn from an out into a run; this increases their run scoring expectation by .023 runs. Conversely, when the Padres are at bat, we will assume that there are an equivalent number of plays in which the baserunner would ordinarily be ruled safe, but instead is called out. Thus, we subtract .023 runs from the Padres’ run scoring expectation. The Dodgers’ chances of victory have increased by another 0.4 percentage points, to a total of 58.0 percent.
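The one-in-37-games and .023-run figures come from a short chain of divisions, sketched here. The constant names are illustrative; the 30-team league size is consistent with the "slightly fewer than nine per team" figure, and the 50 percent judgment-call share is the guess stated above.

```python
ASSISTS_AT_HOME_2006 = 261   # outfield assists at home plate, league-wide
TEAMS = 30
GAMES_PER_TEAM = 162
RUN_VALUE_OF_KILL = 0.85     # runs per baserunner kill at home
JUDGMENT_SHARE = 0.5         # guessed share of plays that could go either way

plays_per_team_game = ASSISTS_AT_HOME_2006 / TEAMS / GAMES_PER_TEAM
flippable_per_game = plays_per_team_game * JUDGMENT_SHARE
games_per_flip = 1.0 / flippable_per_game
run_swing_per_game = flippable_per_game * RUN_VALUE_OF_KILL

print(round(games_per_flip))          # 37
print(round(run_swing_per_game, 3))   # 0.023
```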


Ejection of the Pitcher

We will confine our discussion of ejections to ejection of the pitcher for the visiting team, who will be angered by having his strike zone compacted. Certainly, the visiting manager is also a candidate for ejection, but the impact of this is surely fairly minor (consider that some managers intentionally get themselves thrown out of ballgames to motivate their team). In addition, one or more of the visiting hitters might be thrown out for arguing balls and strikes, but the impact of any one hitter on a baseball game is reasonably small.

Ejections of the pitcher for arguing balls and strikes are really quite rare; there have been only four thus far this season. Now, we have to be talking about optimal conditions for ejection, because we have an umpire that will deliberately be calling an inconsistent strike zone. Nevertheless, I would guess that the umpire would find some reasonable excuse to throw the pitcher out of the ballgame not more often than 1 in 20 times. He would have to be somewhat cautious, since ejecting the pitcher more or less ensures that he will get on SportsCenter, which is exactly what he doesn’t want under these circumstances.

Estimating the impact of an ejection is not obvious. We will assume that the pitcher would have pitched 2.0 more innings if he had not been ejected–remember, it is not always the starting pitcher who gets ejected, and the ejection will not always take place within the first couple of innings (in fact, it is more likely to come later in the game, after the inconsistency of the strike zone has been “established”). In addition, we will assume that the pitcher will be replaced by a reliever who has an ERA a full run worse than his. This too is just a guess–isn’t it a blessing in disguise when Sidney Ponson gets ejected? An ejection, then, is estimated as contributing .222 runs to the home team’s bottom line each time that it occurs, which is once per 20 ballgames. This adds another tenth of a percentage point to the home team’s win expectation.
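The ejection estimate is simply an ERA gap prorated over the innings lost. A minimal sketch, with variable names chosen for illustration and every input taken from the guesses in the paragraph above:

```python
INNINGS_LOST = 2.0       # innings the ejected pitcher would otherwise have thrown
ERA_GAP = 1.0            # replacement is one run per nine innings worse
EJECTION_RATE = 1 / 20   # ejections per game the umpire risks

runs_per_ejection = ERA_GAP * INNINGS_LOST / 9.0
runs_per_game = runs_per_ejection * EJECTION_RATE

print(round(runs_per_ejection, 3))   # 0.222
print(round(runs_per_game, 3))       # 0.011
```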



Balks

The run-scoring impact of a balk is about .25 runs–slightly more than a successful stolen base. Balks are extremely rare in today’s game; an entire pitching staff will usually have about three balk calls against it over the course of an entire season. In addition, not all balks are called by the home plate umpire. I am not sure of the exact distribution, but we will assume that two-thirds of balk calls originate at home plate, meaning that the home plate ump calls two balks per 162 team games.

It is hard to say how much wiggle room a corrupt umpire would have in calling balks. The balk rules, while complicated, are not nearly as open to interpretation as the typical bleacher bum, who lustily boos the umpire each time the pitcher steps off the rubber and scratches himself, might assume. In addition, precisely because balk calls are so unusual, they might not be worth their mileage in terms of the risk of detection (sort of the equivalent of surrender at the blackjack table). We will assume that the home plate umpire calls half as many balks on the home pitcher as he would ordinarily, and three times as many balks on the visiting pitcher. This might sound like a lot, but considering the low values that we are starting with, the overall impact is fairly trivial, contributing another tenth of a percentage point to the home team’s win expectation.
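To see just how small the balk effect is, the assumptions above can be turned into runs per game. This is a sketch under those stated assumptions; the variable names are illustrative.

```python
BALK_RUN_VALUE = 0.25
BASELINE_BALKS_PER_GAME = 2 / 162   # home-plate balk calls per staff per game
VISITING_MULTIPLIER = 3.0           # balks called on the visiting pitcher
HOME_MULTIPLIER = 0.5               # balks called on the home pitcher

# Extra runs handed to the home offense by the additional balk calls.
home_offense_gain = (VISITING_MULTIPLIER - 1.0) * BASELINE_BALKS_PER_GAME * BALK_RUN_VALUE
# Runs taken from the visiting offense by the balk calls that are withheld.
visiting_offense_loss = (1.0 - HOME_MULTIPLIER) * BASELINE_BALKS_PER_GAME * BALK_RUN_VALUE

print(round(home_offense_gain, 3))       # 0.006
print(round(visiting_offense_loss, 3))   # 0.002
```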



Our conclusion is that a corrupt home plate umpire could engineer a victory for the Dodgers an extra 8.2 percent of the time, for an overall winning percentage of 58.2 percent; the vast majority of the difference is in the way that he calls balls and strikes. This assumes that the umpire has some concern about anomalous calls that might trigger attention from league (or law enforcement) officials. Over time, as the level of heat on the umpire increases, he would need to be more and more careful, and his ability to manipulate the outcome of the contest would decrease. If, for example, the umpire decided to call a tight strike zone when the home team was at bat, but a normal strike zone when the visiting team was at bat (as opposed to a wide one), the home team’s winning percentage would be reduced to 54.3 percent.
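Adding up the four components gives the run expectations behind that overall figure. The sketch below tallies them and runs the totals through pythagenpat; the 0.287 exponent constant is an assumption, so the winning percentage it prints lands near, but not necessarily exactly on, the 58.2 percent quoted above.

```python
def pythagenpat(runs_for: float, runs_against: float, k: float = 0.287) -> float:
    """Expected winning percentage; k = 0.287 is an assumed exponent constant."""
    exponent = (runs_for + runs_against) ** k
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


strike_zone = 3.75 * 0.106                    # flipped ball-strike calls per side
plays_at_home = 0.85 * (261 / 30 / 2) / 162   # flipped safe/out calls at the plate
ejection = (1.0 * 2.0 / 9.0) / 20             # visiting pitcher ejected 1 game in 20
balk_base = 0.25 * 2 / 162                    # ordinary balk cost per staff per game

# Visiting pitcher draws three times the usual balks: two extra baselines.
dodgers = 5.0 + strike_zone + plays_at_home + ejection + 2.0 * balk_base
# Home pitcher draws half the usual balks: half a baseline removed.
padres = 5.0 - strike_zone - plays_at_home - 0.5 * balk_base

print(round(dodgers, 3), round(padres, 3))    # 5.438 4.578
print(round(pythagenpat(dodgers, padres), 3))
```

The two run totals match the 5.438 and 4.578 figures used in the postscript.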

Coincidentally, our upper bound estimate of the umpire’s ability to rig the ballgame (58.2 percent) is a dead-on match for the lower bound estimate of the savvy mobster’s minimum required winning percentage. If the mobster had to spend 10 percent of his bankroll to bribe the umpire, he would just break even at this winning percentage, before considering his risk of detection, not to mention other contingencies, like the umpire reneging on his deal. Moreover, even this breakeven percentage would not be sustainable for long, because the degree of scrutiny would rise continuously, increasing the mobster’s required rate of return at the same time it decreases the umpire’s ability to rig the game. To put it bluntly, even if the mobster were hellbent on fixing a sporting contest, he would be better off picking another sport.

Postscript: What about the Over/Under?

There is speculation that Donaghy was attempting to manipulate over-under totals, rather than point spreads or money lines. The theory is that Donaghy called a tight game with a lot of fouls on both clubs, leading to more free throws and more possessions, and a concomitant increase in point scoring. This is potentially a superior method to tipping the game to one team or the other, because it enables a referee to still call a “fair” game, thereby reducing the risk of detection.

This strategy has its analog in baseball; the umpire could call a tight strike zone or a liberal one, impacting the run scoring totals while maintaining his consistency from inning to inning and game to game. However, our umpire would still encounter a few problems. If he always called a tight strike zone or a loose one, he could probably go on more or less permanently without detection; word would get around the league that the umpire had changed his strike zone, and the players would adjust accordingly. The problem is that word would get around to Las Vegas too, and the oddsmakers would adjust their lines to reflect the new reality before their punters had the chance to. Unlike in the NBA, where referee assignments are kept secret until the last possible moment, in baseball it is often possible to figure out who the home plate umpire will be days ahead of time. (Indeed, the NBA’s policy of not disclosing its officiating assignments is probably counterproductive from the standpoint of reducing corruption. All the savvy mobster has to do is learn of the referee assignments ahead of time, and he has a profitable bet against the total, provided that the referee is toward one end or the other of the foul-calling spectrum.)

Alternatively, the umpire could shift between calling a tight strike zone in some games, and a loose one in others. Although Vegas could no longer arbitrage the differences out, the umpire would reassume some risk of detection. Baseball players have photographic memories for the strike zone, and word would get around the league that something funny was going on with this umpire. In certain ways, in fact, this pattern might be more detectable than if the umpire’s strike zone were inconsistent within particular ballgames. The latter would simply seem like bad umpiring, while the former would seem almost deliberately inconsistent.

Detection could also come from the bookmaker side. Over-under bets are generally treated with more suspicion than money line bets, as they are less attractive to recreational bettors, particularly in a sport like baseball. To make consistently large bets against the total in a relatively illiquid market, especially in association with one home plate umpire, is something that would quickly be detected.

Moreover, I am not certain that the mathematics work in favor of the corrupt umpire. We estimated that, by making calls that tended to favor the offense when the Dodgers were at bat, the umpire could increase their run scoring from 5.000 runs per game to 5.438. Presumably, he could do the same thing for the Padres. Thus, you’d have two teams who were expected to score a total of 10.876 runs per game against a probable over-under line of 10.

There has been relatively little work done on run scoring distributions (e.g. how often does a team that averages 5.438 runs per game score exactly 4 runs?). Thus, there is no established way of estimating how often this over-under bet should come in. However, I pulled up an old data set, and looked at the run scoring distributions of all teams that scored between 5.338 and 5.538 runs per game. I then combined two such distributions to estimate the total number of runs scored in the game, assuming that they were independent of one another, with the exception that the two teams could not score the same number of runs (since there are no ties in baseball).

This method shows that, excluding pushes, a bet on the over would win 54.8 percent of the time, which represents a materially lower return on investment than the prospective bet on the Dodgers. Alternatively, the umpire could try to lower the scoring of both teams to 4.578 runs per game (our estimate of runs scored for the Padres), and have his syndicator place a bet on the under. I have this bet doing a bit worse still, winning only 53.0 percent of the time.

Although I am not 100 percent confident in the mathematics behind this result, it appears that over/under lines in baseball are a bit less susceptible to corruption than money lines, at least for two evenly matched clubs. The intuition is that the won-loss outcome is a zero-sum game, while the over-under is not. Everything that you do to increase the Dodgers’ chances of winning necessarily reduces the Padres’ chances. However, increasing the Dodgers’ run scoring has no direct effect on the Padres’ run scoring; you have to work twice as hard. Interestingly, this property does not carry over to basketball, because having a quick whistle increases the number of possessions for both clubs. This correlation is what may have emboldened the mafia to have Donaghy do their bidding; sabermetrics and organized crime make for dangerous bedfellows.