June 24, 2014
Is it Really Harder to Scout in New England?
The Cape Cod League is the premier summer baseball league for college players. A good summer on The Cape might just make you a million dollars at draft time. I’m told there’s also a local professional team in the New England area that has had some recent success, so good for them. And yet, in scouting circles, New England is seen as something of a desert wasteland. The standard explanation is that sure, there are athletes good enough to play professional baseball in New England. The problem is that players in Stars Hollow, Connecticut just don’t get the reps that they do in Georgia, because there’s a lot more baseball weather (read: time that it isn’t snowing) in the South.
The geography of where baseball players come from is a fascinating topic (and makes for a great map!). Matt Swartz recently noted that counties with warmer weather (and bigger incomes) were more likely to produce major leaguers. New England actually turns out rather well on the income distribution, with Connecticut, Massachusetts, and New Hampshire ranking fourth, fifth, and sixth, respectively, among the 50 states in median income, so it must be the cold and snow that’s holding the region back from producing MLB talent. Or is it?
Is it really that hard to scout in New England? A few weeks ago, I studied how well teams were doing when it came to properly evaluating prospects for the MLB draft. The answer was that teams weren’t doing as well as we might think. The links between signing bonuses and draft positions and basic outcomes like whether the draftee made it to the majors or produced five career WAR were actually only moderate. I choose to interpret that as “Prospecting is hard” rather than “Teams are doing a bad job.” But I got to wondering whether New England’s reputation is actually well-earned. Do teams have a harder time scouting cold climes than warmer ones? Is there something else at work here?
Warning! Gory Mathematical Details Ahead!
I ran a couple of different analyses. In one, I ran a correlation between a player’s standardized bonus and his career WAR total (to date) among those who had made it to MLB. I also ran logistic regressions predicting whether he met each of the two other milestones (reaching the majors and producing five career WAR), with signing bonus as a predictor. In my previous work, I used signing bonus as a proxy for how highly a team thought of a player. I found that the correlation was stronger for some categories (first-round picks, college players) than others (anything after the first round, high school players). In theory, a high correlation shows that teams (in general) are good at assessing players. Low correlations mean that teams are paying money and have no idea what they’re getting for it. That could work out in their favor (getting a really good player for a $10k bonus) or against them (Brien Taylor), but it’s the sign of an inefficient market.
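For the curious, here is a minimal sketch of that first analysis in Python. The numbers are made up for illustration, and since the article doesn't spell out how bonuses were standardized, the sketch assumes plain z-scores within a single draft class. The function names are mine, not from any particular package.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(values):
    """Convert raw values to z-scores (mean 0, standard deviation 1)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical draftees who reached MLB: signing bonus and career WAR to date.
bonuses = [10_000, 150_000, 400_000, 1_200_000, 3_000_000]
career_war = [0.3, -0.2, 1.5, 4.0, 2.1]

r = pearson(standardize(bonuses), career_war)
print(f"bonus vs. career WAR: r = {r:.2f}")
```

One caveat: within a single group, z-scoring doesn't change the correlation at all. Standardizing only matters when draft classes with different bonus scales are pooled together, which is presumably why it was done.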
This time, I split things up geographically and focused on where draftees were from. Whether it was from high school or college, Baseball Reference kindly provided the state in which the player’s school was located. This is convenient because teams often assign scouts to specific states or, depending on the size of the state and how baseball-rich the area is, clusters of states. If it’s true that New England is harder to scout because the weather is worse and the competition is more uneven, then we should see teams guessing more on players from New England than from other areas, like the all-baseball, all-the-time state of Florida.
Here, similar to my original article, I present the value of the correlation between signing bonus and career WAR. For the logistic regressions, I took Nagelkerke’s R-squared for each model and then took its square root, to bring it to the same scale as the correlation. (If you aren’t super-initiated, just nod your head and know that “higher is better.”)
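If you are super-initiated, the square-root trick can be sketched like this. It is a toy version under stated assumptions: one predictor (a standardized bonus), a hand-rolled Newton-Raphson fit instead of a stats package, and invented data. Nagelkerke's R-squared itself is the standard formula, the Cox-Snell value divided by its maximum possible value.

```python
from math import exp, log, sqrt


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y=1) = sigmoid(a + b*x) by Newton-Raphson; returns (a, b)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p          # gradient for the intercept
            g1 += (y - p) * x    # gradient for the slope
            h00 += w             # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0:
            break
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def log_likelihood(probs, ys):
    return sum(log(p) if y == 1 else log(1.0 - p) for p, y in zip(probs, ys))


def nagelkerke_r2(xs, ys):
    """Cox-Snell R-squared rescaled so that its maximum is 1."""
    n = len(ys)
    a, b = fit_logistic(xs, ys)
    ll_model = log_likelihood([sigmoid(a + b * x) for x in xs], ys)
    base_rate = sum(ys) / n
    ll_null = log_likelihood([base_rate] * n, ys)  # intercept-only model
    cox_snell = 1.0 - exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical data: standardized bonus and whether the player reached MLB.
z_bonus = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
reached_mlb = [0, 0, 1, 0, 0, 1, 1, 1]

r2 = nagelkerke_r2(z_bonus, reached_mlb)
pseudo_r = sqrt(r2)  # the "same scale as a correlation" number
print(f"Nagelkerke R^2 = {r2:.3f}, square root = {pseudo_r:.3f}")
```

The square root is not a true correlation coefficient (it has no sign, for one thing), but it sits on the same 0-to-1 scale, which is all the comparison needs.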
I also present the total number of players from each region who signed during those years, and the percentage of them who appeared in an MLB game. Finally, I used a logistic regression to create an expected rate of MLB appearance. We would expect a player who got a $3 million signing bonus to be more likely to get to the bigs than a guy who got 10 grand. I looked to see how many major leaguers each region should have produced (if their signing bonuses are any indication) and what percentage of that number actually showed up.
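The "expected major leaguers" calculation can be sketched as follows. This assumes each player already has a predicted probability of reaching MLB from a bonus-based logistic regression. The region labels, probabilities, and outcomes below are invented for illustration.

```python
from collections import defaultdict


def yield_vs_expectation(players):
    """players: iterable of (region, predicted_probability, reached_mlb).

    Expected major leaguers for a region is the sum of its players'
    predicted probabilities; actual is the count who really got there.
    """
    expected = defaultdict(float)
    actual = defaultdict(int)
    for region, prob, reached in players:
        expected[region] += prob
        actual[region] += int(reached)
    return {
        region: {
            "expected": expected[region],
            "actual": actual[region],
            "pct_of_expected": 100.0 * actual[region] / expected[region],
        }
        for region in expected
    }


# Hypothetical players: (region, model probability of reaching MLB, did he?)
players = [
    ("Region A", 0.50, True),
    ("Region A", 0.25, False),
    ("Region A", 0.25, False),
    ("Region B", 0.50, True),
    ("Region B", 0.50, True),
]

summary = yield_vs_expectation(players)
for region, row in summary.items():
    print(region, row)
```

A region at 100 percent produced exactly as many big leaguers as its bonuses implied. Below 100 means teams paid for more than they got.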
** - There were no players from New England drafted from 2003-2008 who put up more than five WAR.
The worst results in measures of how efficient the market is (that is, how good teams are at pricing eventual performance) came from New England and from the Mountain West (where MLB teams actually seem to have it backward). Those two regions also produced the fewest draftees and the lowest ratio of major leaguers to draft picks, as well as the lowest yield of major leaguers when adjusting for expectations (read: signing bonuses). Teams didn’t find a lot that was interesting in these areas, and when they did, they had almost no idea how to price it and it usually ended up disappointing them.
Missing Persons Report
We can interpret the lagging New England numbers in a couple of different ways. It may very well be that good players from New England high schools choose to go to SEC and Pac-10 colleges that are perceived as better places to hone their craft, and then they get drafted from there (and so my model lists them as being from North Carolina or California). That can turn into a spiral where those programs really do become better programs because they get all the good talent, and that could depress the number of players drafted out of New England schools. There are plenty of those transplanted players in the database, by the way, but that doesn’t explain the whole problem.
Why is it that the ones who are left behind—the high school players from Boston or the college kids from Boston College—are so poorly priced? Certainly, if a team is interested in a player from Vermont, they send a scout or two to go see him play in the same way that they would send a scout to Texas. In theory, they would evaluate both on the same criteria, and do the same interviews. Why are teams so much worse at guessing what the Vermont kid will become?
One answer that we can rule out is that because there’s a talent drain from New England high schools into colleges in other areas, the players who do get drafted are more likely to be high-risk high schoolers. In my original article, I found that high school players really are riskier bets, in that the market does a poor job of figuring out what they will become, worse than it does with college draftees. However, while 31.2 percent of all draftees from 2003-2008 were high school students, only 25.8 percent of New Englanders were drafted out of high school, so that explanation doesn’t hold up. We also saw that players drafted after the first round were, as a group, mis-priced. Maybe teams see New England as a nice place to find a fifth rounder? That wasn’t the case either, as 22.6 percent of the New Englanders chosen were first-round picks (compared to 15.4 percent of all picks; I counted supplemental picks as first-rounders).
Maybe it’s just the fact that because scouts don’t have as many chances to get good looks, they’re going on less information. Less information always means more risk. Maybe it’s because the talent drain means that the opposing hitters/pitchers that the player is going up against aren’t as good, and so the scout doesn’t get a chance to see what he can do against “real” competition as easily. I suppose that’s what the showcase circuit is for, but even that’s an even smaller sample.
I spoke to a few of the scouting folks here at Baseball Prospectus (I know, they’re Capulet and I’m Montague…and that’s how it actually works, people. We’re supposed to hate each other, but we actually stand on each other’s balconies and make out all the time) and several of them chimed in with theories. Some mentioned a couple of specific cases of draftees who turned into busts. In a sample size of 31, that can go a long way toward messing up a correlation. Ryan Parker had an interesting theory about how the Cape League might actually be to blame. If a scout has New England as his territory, should he work the high school circuit or just park on the Cape and see all kinds of fun college kids? In that way, teams get fewer looks at the high school talent.
Chris Mellen pointed out that New England is made up of states with relatively low populations, so it might be that there’s not a lot of talent to begin with, simply because of raw numbers. Al Skorupa observed that because New England states tend to have higher incomes and higher concentrations of college graduates, high school students are more likely to become college students—not because it will increase their chances at MLB, but because they are more likely to be the children of college graduates who want them to graduate as well. Of course, they go to schools where the baseball team will get some coverage.
A lot of the potential explanations came down to “bad weather, which means fewer reps, which means talent that’s more raw, which means bigger risks for major league teams.” All of these theories make sense, and maybe one or two of them are actually true. Or maybe this is a fluky thing that just kinda happened; not everything has an explanation. But if this thing does, it might point to some sort of inefficiency (nay, opportunity!) in how scouts approach the northeast corner of the country. Right now, I have to confess that it’s not entirely clear to me what that opportunity is. So I leave you with a mystery. Where are the missing New Englanders?