Last week in Federal court, former St. Louis Cardinals executive Chris Correa was indicted on, and pleaded guilty to, charges that he improperly accessed the Houston Astros’ database, Ground Control, on multiple occasions. Before we go any further in this article, let’s get something out of the way. What Mr. Correa or anyone else involved in the case did or did not do is a matter for the FBI to investigate and the courts to adjudicate and I will leave that in their hands. Correa is quoted in the article as saying that he “trespassed repeatedly” and that he accepts responsibility for the case. Everyone else, not surprisingly, has largely declined to say much else.
(And yes, for those of you who are longtime readers of the site, let’s just get this one out in the open. There is a well-known pipeline of people who have previously written here at Baseball Prospectus and who now work for the Astros. I very specifically did not ask them anything about the case, mostly because I knew that the answer would be “No comment.” Everything in this article is either a matter of public record or my own conjecture.)
According to the Associated Press:
Among information Correa viewed was 118 pages of confidential material, such as notes of trade discussions the Astros were having with other teams, summaries of evaluations written by scouts of amateur players the Astros were considering drafting; the not-yet-completed amateur draft board for 2014 and summaries of college hitters and pitchers that the Astros regarded as top performers.
Leaving aside the whodunit details of how Correa gained access to the Astros’ database or what he did with the information, there were a couple of sentences in the reporting that I found endlessly fascinating.
U.S. Attorney Kenneth Magidson said the hacking cost the Astros about $1.7 million, taking into account how Correa used the Astros’ data to draft players.
“It has to do with the talent that was on the record that they were able to have access to, that they wouldn’t have otherwise had access to,” he told reporters. “They were watching what the Astros were doing.”
Ladies and gentlemen, the Federal government has finally gone too far, reaching even into Sabermetrics! It’s the sort of overreach that makes me so angry that I want to occupy a remote wildlife refuge in Oregon. That’ll show ‘em!
Whether the Federal government wants to admit it or not, they’ve accidentally tried to answer one of the more perplexing questions about baseball. How do we place a value on what a front office does? We have really good models for the value of each of the 25 guys wearing the funny pajamas and hats, but what about the contributions of the scout who found him, the minor league pitching coordinator who taught him the wipeout slider, and the analyst who made the crazy algorithm that identified him as a potential breakout candidate?
Ground Control, in addition to being a tribute to the recently departed David Bowie, has been described by Astros General Manager Jeff Luhnow as “the repository of all our baseball knowledge.” And not just that Jose Altuve’s batting average was .313 last year, but also all the juicy stuff that no one else is supposed to see. In some sense, it’s like the Astros’ collective private diary. A place where they can spill their innermost secrets and not worry about who’s reading it. You can read it and find out whom they have a crush on. It’s supposed to be secret, but someone found the key and read it.
How can we put a price on unfettered, uncensored access to all of an organization’s most private thoughts?
Warning! Gory Mathematical Details Ahead!
Let’s for a moment take a look at the number provided by the Federal government of $1.7 million. According to the quote above, that was the “cost” to the Astros based on Correa’s alleged use of the ill-gotten data just in the draft. He may have accessed other areas, but the Feds didn’t include that in the price of admission. Presumably, it means that the Federal government believes that Correa accessed files related to Houston’s preparation for the draft, including scouting reports and preference lists. In theory, there are two ways that Correa could have used the information. One would be that he used the information from the Astros’ database to strategize on which players to take. For example, if the Cardinals knew that the Astros were interested in Player X, and the Cardinals were trying to decide between taking him now or waiting another round, the Cardinals could have pulled the trigger then, thus depriving the Astros of the guy they really wanted. That doesn’t seem to be an efficient strategy, because the Cardinals still wouldn’t know whether any of the other 28 teams were interested in Player X. (Surely if the guy is any good, at least one of the other teams would have an interest.)
But the other way is that Correa could have used any information he accessed to simply look at someone else’s homework. The Cardinals might have been interested in Smith and/or Jones, and of course, they sent their own scouts out to look at Smith and Jones. But now, Correa could have looked at what the Astros brain trust thought of them. Maybe they had info that the Cardinals didn’t get. By doing that, they were able to grab a guy whom they had rated too low or perhaps avoid a guy whom they had missed something bad on.
Now, it’s not clear from the media coverage what the alleged use was. But let’s play some of these scenarios out. Correa would have had full access to the analogous Cardinals database of scouting information, and so the information from Ground Control would have been a second opinion. That suggests that there was enough data in the Astros’ records that the Cardinals didn’t have that it was worth $1.7 million extra. You have to figure that the Cardinals and Astros would have had a good amount of the same information. They both have scouts and cross-checkers and all that, so there wasn’t that much extra to be gleaned. Let’s say that for players scouted by both the Astros and the Cardinals, 90 percent of what the two teams knew was the same, and so the Cardinals would have gotten the benefit of only the extra 10 percent of information. If that 10 percent was worth $1.7 million, then the entire Astros database related to the draft would be worth ten times that, or $17 million.
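For readers who like to poke at the assumptions, here is a minimal sketch of that back-of-the-envelope scaling. The function name and the alternative overlap values are my own illustrations, not anything from the court record; the only inputs taken from the case are the $1.7 million figure and the 90 percent overlap guess made above.

```python
# Back-of-the-envelope scaling, under the assumptions stated in the text:
# the prosecutor's $1.7 million is treated as the value of only the share
# of draft information the Cardinals did NOT already have.

PROSECUTOR_FIGURE = 1_700_000  # dollars, from the U.S. Attorney's statement


def implied_database_value(marginal_value, unique_share):
    """Scale the value of the 'new' information up to the whole draft database.

    unique_share is the fraction of the Astros' draft information that the
    Cardinals lacked (a guess, not a known quantity).
    """
    if not 0 < unique_share <= 1:
        raise ValueError("unique_share must be in (0, 1]")
    return marginal_value / unique_share


if __name__ == "__main__":
    # 0.10 matches the 90 percent overlap guess; the other values are
    # hypothetical alternatives to show how sensitive the answer is.
    for unique_share in (0.05, 0.10, 0.20):
        total = implied_database_value(PROSECUTOR_FIGURE, unique_share)
        print(f"unique share {unique_share:.0%}: about ${total:,.0f}")
```

The 10 percent case reproduces the $17 million figure, and the neighboring guesses stay in the high seven to eight digit range, which is the point: the order of magnitude is more trustworthy than the exact number.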
I’d encourage the reader to pause for a moment. The calculations that I just did involve a lot of estimation. Not unreasonable estimation (says the guy who did it), but still not something where I’ll fight to the death for these exact numbers. I think in this case, the more important piece of information is the order of magnitude that we’re talking about. Given our information, it’s likely that the information on just the Rule 4 Draft portion of the database has an overall value that has eight digits in it. That right there is pretty powerful information. Ten to the seventh power.
Now, consider that there’s probably information in there about international free agents, minor leaguers from all the other teams (aka potential trade targets), evaluations of current major leaguers, advance scouting reports, strategic gambits, and a few recipes for chocolate cake. If we make the assumption that all of those “sections” of the database have similar value, then we have a database that’s worth a lot of money. If we consider that the effective job of a front office is to gather knowledge and act on it, and that knowledge is worth eight or nine figures, then a front office writ large is a very valuable thing.
But let’s calculate that the other way around. The Federal prosecutor may have put the value at $1.7 million, based on "the Astros’ scouting budget and the number of players included in the database," but that doesn’t mean that he’s a good Sabermetrician. If we assume that being able to access the repository of a team’s knowledge has added value, we have to ask “compared to what?” We assume that a great deal of the information that would have been available in Ground Control would have been known to Correa anyway. If we’re asking what a front office or a database is worth, we need to know whether the baseline is the general public, or another front office, or whatever the waiver wire/Triple-A/bench equivalent is in front officers, or perhaps the baseline is zero.
Asking what would happen if some random fan were given access to a database like Ground Control (even magically granting her/him the knowledge she/he would need to use it), and how much it would expand her/his mind, is a very different question from asking how much it would benefit someone who was already on “the inside.” In this case though, the alleged accessor was already an accomplished front office type.
Still, we know that teams are actually really bad at figuring out which amateur players in the draft are going to become competent major leaguers. Mike Trout was taken with the 25th pick in the draft his year. Clayton Kershaw went seventh overall and was the sixth pitcher taken, after Luke Hochevar, Greg Reynolds, Brad Lincoln, Brandon Morrow, and Andrew Miller. It looks like there’s a lot of room for improvement for everyone. It’s not just the funny anecdotes that tell the story. Systematically (i.e., when you do the proper #GoryMath), teams are really bad at this stuff. Could peeking at someone else’s homework help… perhaps to the tune of $1.7 million?
First of all, it’s worthless to use the “$7 million per win” reflex here. That’s the price of wins on the free agent market, and in the draft, by definition, we’re specifically talking about gaining control over a player’s cost-controlled years, and cost-controlled years, on average, check in at about half the price per win of free agent years (call it $3.5 million). At that rate, the $1.7 million value that the U.S. Attorney placed on the information that he alleges Mr. Correa improperly viewed suggests a value of about half a win. If the extra information helps a team to draft one slightly better player than they otherwise would have gotten, with slightly better defined as a tenth of a win in each of the six seasons that he’s cost-controlled, then that’s slightly more than half a win. Or put another way, if it gives a team a 10 percent greater chance of taking the guy who will actually produce five wins over the course of his first six years rather than the guy who’s a dud, then yeah, that’s about $1.7 million. It’s impossible to get a specific read on the matter, but it actually wouldn’t take much to hit the number suggested.
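The arithmetic in that paragraph is simple enough to write down. This is a rough sketch under the assumptions used in the text: $7 million per win on the free agent market, with cost-controlled wins valued at half of that. The constant and function names are mine, chosen for illustration only.

```python
# Rough conversion between dollars and wins for cost-controlled players.
# Both constants are the article's working assumptions, not official figures.
FREE_AGENT_DOLLARS_PER_WIN = 7_000_000
COST_CONTROLLED_DISCOUNT = 0.5  # cost-controlled wins at about half price
DOLLARS_PER_WIN = FREE_AGENT_DOLLARS_PER_WIN * COST_CONTROLLED_DISCOUNT


def wins_implied(dollars):
    """How many cost-controlled wins a dollar figure represents."""
    return dollars / DOLLARS_PER_WIN


def dollars_implied(wins):
    """Dollar value of a given number of cost-controlled wins."""
    return wins * DOLLARS_PER_WIN


if __name__ == "__main__":
    # The prosecutor's figure, expressed in wins.
    print(f"$1.7 million is about {wins_implied(1_700_000):.2f} wins")

    # Scenario A: a player who is a tenth of a win better in each of
    # his six cost-controlled seasons.
    scenario_a_wins = 0.1 * 6
    print(f"Scenario A: {scenario_a_wins:.1f} wins, "
          f"about ${dollars_implied(scenario_a_wins):,.0f}")

    # Scenario B: a 10 percent better chance of landing a five-win player
    # instead of a dud (expected value, dud assumed to be worth zero).
    scenario_b_wins = 0.10 * 5
    print(f"Scenario B: {scenario_b_wins:.1f} expected wins, "
          f"about ${dollars_implied(scenario_b_wins):,.0f}")
```

Both scenarios land at or a little above the prosecutor's number, which is why it wouldn't take much of an edge to justify it.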
Knowledge is Power (and Money)
I’m sure some of you were hoping that I would value a front office at exactly $73,490,872.39. Sorry to disappoint. But this case gives us a rather unique way to think about the value of the sum total of the collected baseball knowledge of an organization (and by extension, the people who gather it). And while we don’t know the exact amount, a little reasoning puts us in the neighborhood of something in the eight to nine digit range, when you define “information” broadly. Consider for a moment that this number is around the same order of magnitude as an MLB team’s player payroll.
There are probably a few wrinkles on this question that might make for some interesting theory. Mr. Correa is only alleged to have hacked into the Astros. What if he could have done the same for the 28 other teams? What would that level of information (perhaps we might call this the “God mode” scenario) have been worth? Perhaps there’s a point of diminishing returns on the scouting side. What would you do with 30 sets of scouting reports on the same player? At the same time, a team would always be able to figure out exactly what the best deal they could get in a trade would be, and they’d have plenty of extra information about the prospects in each farm system.
Fun to think about on a cold January day, eh?
We still don’t know how to divide up the credit among the various actors in the front office. The number of people who could theoretically lay claim to some of that value is rather large. There are the decision-makers, the scouts, video guys, the coordinators, the analysts, the programmers, and anyone else who lent a hand. But it’s pretty clear that the folks who wear the polo shirts with the team logo are about as valuable as the guys who wear the hats with the team logo. The problem is that while the players produce their value out in the open (they charge admission!), the front office guys do it in secret, behind firewalls.