May 16, 2014
Front Office Expansion
The High Value of the Video Replay Analyst
I had the honor of having lunch with Dan Brooks last week, and as we ate our sandwiches the conversation turned to my thesis. I mentioned that if I were running a team, I would strongly consider hiring dozens of new front office employees, because their salaries are so cheap compared to those of players that even if only a few of them ended up making substantive contributions, they would more than earn their collective keep.
I don’t remember the exact words that came out of Dan’s mouth when I asked him what he thought of the idea of a team’s hypothetically hiring 100 baseball operations employees tomorrow, but it was something along the lines of: “What would you have them all do?” Though I didn’t think it detracted from the general point, it was an important question to which I didn’t have a good answer. And from a team’s standpoint, that uncertainty might be the biggest obstacle to the kind of dramatic front office expansion for which I would advocate.
This got me thinking about how one might conceive of the work that junior-level baseball operations employees do in the most concrete way possible in order to make the idea of altering teams’ front offices more accessible. This train of thought fortuitously led me to the newest (and thus least structurally entrenched) category of front office personnel: video replay analysts.
Before we dive into how the values of replay analysts could be derived, I should say that I cannot offer concrete results and that all the specific numbers I calculated are based on educated guesses about the variables. But in modeling the additive benefits of hiring replay analysts as well as the possible heterogeneity in ability among them, it becomes clear that even the relatively straightforward (though certainly not simple) task of recommending that a play be challenged or not is important enough that we should conceive of replay analysts’ values in a different way.
Defining the Terms
The first component of defining a replay analyst’s value is the probability that he or she will make the right recommendation about a questionable call. This is not just the number of successes a team has relative to the number of challenges. Presumably, there are substantially more calls that analysts seriously consider advising their managers to challenge than they actually recommend, and the decisions not to recommend challenges are not always correct.
As of this writing, challenges have a 47 percent success rate so far this year, but some proportion of the 53 percent that stood were probably incorrectly withheld, and that doesn’t include borderline calls that weren’t challenged. I’d probably guess that the league-average accuracy for deciding whether or not to challenge questionable calls is somewhere around 70 percent.
The other factor in estimating the value of replay analysts is the potential impact of challenging calls on the outcome of the game. The great Russell Carleton recently calculated that the theretofore-average run value of a call reversal was more than two-thirds of a run’s worth of run expectancy. Of course, not every call leads to a reversal, and not every manager uses his challenges every game, but that’s a substantial swing for a correctly recommended challenge, and a major opportunity cost for getting it wrong.
Intuitively insane as it sounds, given Russell’s numbers and the degree to which a single close play can affect a team’s chances of winning, I wouldn’t be surprised if the potential aggregate impact of teams’ replay analysts’ recommendations on questionable calls were somewhere in the neighborhood of five points of win probability (i.e., 0.05 WPA) per team in the average game. I’ll default to 0.03 WPA in my later calculations because it seems more subjectively reasonable, but I consider that to be a conservative estimate.
Modeling Employment Expansion
But let’s make it simpler and assume that a single replay analyst is capable of coming to a fully informed decision by himself or herself. In that case, adding additional replay watchers would mean having the group vote on whether or not to challenge each play rather than letting one person decide on his or her own.
What difference would that actually make? If we make two simplifying and probably somewhat inaccurate assumptions (that all of the serious candidates for an open replay analyst job are generally interchangeable in their abilities to identify challenge-worthy plays, and that the probability of any individual analyst making the right decision on a given play is unrelated to the probability of any of his peers calling it correctly), the number of wins W a team would gain per year by hiring x replay analysts could be modeled by:
where x is an odd integer, I is the average impact that the decisions about whether or not to challenge plays have on a game, p is the probability that an individual analyst will make the correct recommendation about whether or not to challenge a given play, and N is a negative constant to adjust for the basic league-wide boost teams get from the ability to challenge calls. The sum of the fancy-looking series of letters and exclamation points after the sigma represents the odds that a majority of the analysts would come to the correct conclusion about whether or not to review the play. And W(x) represents the number of wins a team would get out of its replay analyst(s) over the course of a season if it hired x of them.
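The formula described above can be written out as follows. This is a reconstruction from the variable definitions, not a quotation: the factor of 162 (games per season) is an assumption, chosen because it reproduces the 0.4-win figure quoted for going from one analyst to three.

```latex
W(x) = 162 \cdot I \cdot \sum_{k=(x+1)/2}^{x} \frac{x!}{k!\,(x-k)!}\; p^{k}\,(1-p)^{x-k} \;+\; N
```

The sum is the binomial probability that at least (x+1)/2 of the x analysts vote correctly. For x = 1 it reduces to p itself.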
The market value V_{M} of the boost that the xth replay analyst would provide to his or her team could thus be given by:
where C is the market price of a win, which I estimate to have been approximately $7 million in 2013. The in-a-vacuum value V_{T} of the xth replay analyst to his or her team could be given by:
where U(w) is the utility (in terms of revenue or otherwise) that team T receives from winning w games, expressed in monetary terms. A team should thus continue to hire additional analysts at salary S so long as both V_{M} ≥ S and V_{T} ≥ S.
Going back to my previous estimates, say that the expected value of the probability of an additional hire’s making the correct recommendation about whether or not to challenge a call is 70 percent (i.e., p = 0.7) and the impact of call challenging on an average game’s outcome is three percent (i.e., I = 0.03); let’s also say that it would cost $30,000 a year to hire a replay analyst who does not do anything else for the team (i.e., S = $30,000).
Imagine that a team with one replay analyst is considering hiring two more. Adding a second and third replay analyst would lead to an expected boost of 0.4 wins in the standings per year for just $60,000—nearly 50 times what teams paid for that kind of production on the free agent market last year. In other words, putting a dollar toward expanding a team’s replay analysis department seems like a significantly better investment than spending it on a free agent player.
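The one-to-three-analyst comparison can be checked with a short script. This is a minimal sketch, not the author's own calculation: it assumes a 162-game season and independent, equally skilled analysts voting by simple majority, and the function and variable names are made up for illustration.

```python
from math import comb

GAMES_PER_SEASON = 162  # assumption: regular-season length


def majority_correct(x: int, p: float) -> float:
    """Probability that a majority of x independent analysts is right,
    when each one is right with probability p."""
    if x < 1 or x % 2 == 0:
        raise ValueError("x must be a positive odd integer")
    votes_needed = (x + 1) // 2
    return sum(
        comb(x, k) * p**k * (1 - p) ** (x - k)
        for k in range(votes_needed, x + 1)
    )


def wins_from_analysts(x: int, p: float, impact: float, n_const: float = 0.0) -> float:
    """Season wins attributed to x analysts; n_const plays the role of N
    and cancels whenever two staffing levels are compared."""
    return GAMES_PER_SEASON * impact * majority_correct(x, p) + n_const


# Estimates taken from the text: p = 0.7, I = 0.03, C = $7 million, S = $30,000.
p, impact = 0.7, 0.03
price_per_win = 7_000_000
salary = 30_000

extra_wins = wins_from_analysts(3, p, impact) - wins_from_analysts(1, p, impact)
market_value = extra_wins * price_per_win
return_multiple = market_value / (2 * salary)  # two additional hires

print(f"extra wins per season: {extra_wins:.3f}")
print(f"market value: ${market_value:,.0f}")
print(f"return relative to free-agent prices: {return_multiple:.1f}x")
```

Under these assumptions the majority-vote accuracy rises from 0.700 with one analyst to 0.784 with three, which is where the roughly 0.4 extra wins come from.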
The Value of Replay Analysis Skill
So what would that look like for replay analysts? The variable of interest here is p, the probability that an analyst will make the right call on a given play. If we introduce some variation in p such that p_{q} equals the probability that replay analyst q will be correct in his or her recommendations, we can model how much he or she is worth.
Let’s say a team is considering hiring only one of two applicants, a and b. Applicant a will get the questionable calls right 80 percent of the time (i.e., p_{a} = 0.8), while b will get them right 75 percent of the time (i.e., p_{b} = 0.75). The difference in the team’s projected standings with a and with b can be given by:
At I = 0.03, the difference between a and b is worth just shy of a quarter of a win per year, with a 2013 market value of $1.7 million.
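As a check on the arithmetic, and assuming a 162-game season with one analyst deciding alone (so the majority-vote term is just p and the constant N cancels), the comparison works out as follows, using I = 0.03 from the earlier estimates:

```latex
W_a - W_b = 162 \cdot I \cdot (p_a - p_b)
          = 162 \times 0.03 \times (0.80 - 0.75)
          \approx 0.243 \text{ wins}

0.243 \text{ wins} \times \$7\text{ million per win} \approx \$1.7 \text{ million}
```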
In a vacuum, the number of wins above replacement analyst (let’s call it WARA) that a team could get from hiring replay analyst q if every team hires one and only one can be given by:
where p_{r} represents the probability that the 31st-best candidate (i.e., the best replay analyst who couldn’t find a job) would get each call correct. A one-percentage-point difference between p_{q} and p_{r} works out to about one-twentieth of a win per year, or $341,760 in 2013 market value. So if the best replay analyst in baseball made $100,000 a year to be only three percentage points more accurate than a replacement-level replay analyst, he or she would give his or her team around 10 times as much for its money as it would get from a free agent.
I don’t have a good answer to the overarching problem of how to identify the best replay analysts. The best I’ve got is holding hours-long interviews with enough replay tests for statistically significant differences between the candidates’ abilities to call plays correctly to appear. But if by some method a GM (or whoever is in charge of the hiring process) thinks he or she has identified an elite replay analyst, he or she should hang the expense in bringing him or her aboard.
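The WARA figures can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the author's formula: it takes WARA to be 162 games times I times the accuracy gap, uses a round $7 million per win, and picks p_r = 0.70 purely for illustration (only the gap matters). The function name is hypothetical.

```python
GAMES_PER_SEASON = 162  # assumption: regular-season length


def wara(p_q: float, p_r: float, impact: float = 0.03) -> float:
    """Wins above replacement analyst for a lone analyst with accuracy p_q,
    measured against a replacement-level accuracy p_r."""
    return GAMES_PER_SEASON * impact * (p_q - p_r)


price_per_win = 7_000_000  # round figure; the text says "approximately $7 million"

one_point_wins = wara(0.71, 0.70)    # one percentage point better than replacement
three_point_wins = wara(0.73, 0.70)  # three percentage points better
three_point_value = three_point_wins * price_per_win
return_multiple = three_point_value / 100_000  # versus a $100,000 salary

print(f"WARA at +1 point: {one_point_wins:.4f}")
print(f"WARA at +3 points: {three_point_wins:.4f}")
print(f"value at +3 points: ${three_point_value:,.0f} ({return_multiple:.1f}x salary)")
```

With the round $7 million price, one point of accuracy comes out a little under the $341,760 quoted above, which suggests the price per win behind that figure was slightly above $7 million.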
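To get a feel for how long such a test would have to be, here is a rough sample-size sketch. It is an illustration only: it assumes each candidate is graded on an independent set of plays, uses the standard normal-approximation formula for comparing two proportions, and picks a two-sided 5 percent significance level and 80 percent power as conventional defaults. None of these choices come from the article, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def plays_needed(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of test plays per candidate needed to distinguish
    accuracies p1 and p2 with a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Candidates a and b from the earlier example: 80 percent vs. 75 percent accuracy.
n = plays_needed(0.80, 0.75)
print(f"test plays needed per candidate: {n}")
```

Under these assumptions, separating an 80 percent analyst from a 75 percent one takes on the order of a thousand test plays per candidate, so "hours-long" may be an understatement.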
What This Means
But that wasn’t the main purpose of this endeavor (I wasn’t expecting this thought experiment to yield the results that it did). The broader point is that, in an industry in which teams pay millions of dollars to make themselves marginally better, it doesn’t take much to make it worth spending more on employees. And even with a task as potentially solitary and easily understood from the outside as calling for replay challenges, a team that invests more in its front office will likely find itself getting much more bang for its buck than the rest of the league.
Lewie Pollis is an author of Baseball Prospectus. Follow @lewsonfirst
3 comments have been left for this article.

This is extremely interesting, but I see one important and possibly fatal flaw in your methodology: the assumption that I, the average impact of a challenge decision, is a constant. In a strict sense, if you choose to define it as a constant, well, it's a constant, but there are a number of reasons why that's probably not the way to go. The main one is that your "constant" is almost certainly a function of how frequently calls are challenged, with a point of diminishing returns setting in as the number of challenges approaches the number of calls that should be challenged. I should be considered a dependent variable that depends on success of a challenge. That, in turn, depends on the frequency of challenges. Take away the two or three (or maybe it's five or ten; it doesn't change the argument, although it moves the point of diminishing returns) times during a game when the umps clearly screwed up, and subsequent challenges become less likely to have any impact on the game, because the challenges are less likely to succeed.
This analysis presumes that there is no penalty to a team in making a challenge that is not upheld on the field. That's basically the way the rules are right now, and it's hard to see how they could be changed without drastically impacting the way baseball is played. I accordingly don't think the case for adding replay analysts is as strong as you make it (acknowledging that you say to take it with a grain of salt anyway). I do think your point about the value to a team of spending more bucks on staff work is good, but I'd spend them elsewhere; my personal favorites would be on reliable assessments of "makeup" and injury risk. But there's room for argument on those as well, of course.
I think you might have misunderstood my point in defining I—I might not have explained it very clearly. I didn't mean that to stand for the impact of a single challenge on the game but the entire collection of plausibly challengeable calls. As I said: (emphasis added)
"Of course, not every call leads to a reversal, and not every manager uses his challenges every game, but that’s a substantial swing for a correctly recommended challenge—and a major opportunity cost for getting it wrong."
I meant that to express that the impact of a replay analyst is not just in the at most three calls a game he/she does recommend challenging but also in the questionable calls he/she does not think it's worth challenging. That's why I think I is high enough that my estimates for the value of additional/superior replay analysts have economic significance.
No, I understood that. It's the "questionable calls he/she does not think it's worth challenging" that are the reason why I isn't a constant. As the number of calls available to be challenged goes up (because more review analysts have more time to notice them), the number where the challenge will be successful does not rise with it, because umpires do their job right the large majority of the time. Yet, since there is no penalty for an unsuccessful challenge comparable to the loss of a time out that happens with an unsuccessful challenge in the NFL, there's no disincentive to make challenges that are unlikely to succeed. As a result, it becomes possible, among other things, to make challenges in lower-leverage situations, potentially diluting I. There is room for this to happen since not every manager uses all of his challenges, as you point out.