Manager of the Year is stupid. Manager of the Year voting is stupid. Given the former, it's not clear that the latter matters in the least, but indulge me.
Despite the language in the above paragraph, I'm not a 2002 stathead, though I certainly was once upon a time. I can't pretend today that the semi-tangible, semi-measurable aspects of managing a baseball team that fans love to talk about are the most important aspects, because it is highly likely that they are not. Computers and front-office nerds alike (hold the jokes, HOLD THE JOKES) can do an excellent job deciding when to bunt (never), when to substitute a relief pitcher (as often as possible), and how to construct a batting order (Barry Bonds leading off!), yet we've heard more in recent years about the possibility of a player-manager (Paul Konerko) than nerd-managers. (And no, Joe Maddon does not count—he was briefly paid to play the game of baseball, after all, and I'm talking about putting Paul DePodesta or Ben Lindbergh in the dugout, not an ex-minor-leaguer who happens to wear glasses and listen more carefully to his team's analytics department than most dugout men do.) This absence of nerditry would suggest that baseball teams making seven- and eight-figure bets on their personnel and leadership decisions value significantly the immeasurable side of managing that includes dealing with personalities, keeping an eye on low-level health issues, and even actual coaching. Sure, teams can be subject to biases and path-dependency just as anyone else can, and the size of the gamble doesn't mean the play isn't stupid (heyyyyy Wall Street), but we can't go off half-cocked on these teams and demand firing Dusty Baker every time he bunts, either.
With that 2012 Stathead's Apology out of the way, I would suggest that if we're going to persist in the fool's errand of ranking managers for a yearly award, as was done recently with the election of Bob Melvin to the position of Manager of the Year in the American League, then we ought to examine what are clearly the core criteria on which the voters vote, to see if they might have logical holes. But really, I should use the singular: that criterion is simply, "Which manager's team was most surprisingly good?" That is, switching to the second person like Hubie Brown, if you have your general manager hand you a $180 million behemoth with six All-Stars and you massage those egos and keep those players healthy all the way to a 96-win season, you'll get a handful of third-place votes. But get lucky with your run differential and win a bunch of games via bleeps and blonkers and take a team that everyone wrote off to 92 wins and the wild card and you're a mortal lock for hardware. (Unless, as was the case this year in the American League, there were two surprise teams. Then they'll split the vote and whoever is nicer wins by a hair. Always be nice to the media.)
Before we jump whole-heartedly into the criticism of this method, though, I want to note: the funny thing about expectations and how we make them and where our biases might be is that I can think of exactly two situations in which preseason expectations matter. First, if you like gambling on season lines at your local legal sports book, you are presented rather directly with the expectations question and you put your money where your brain is.
The only other place where expectations really matter is in the issue at hand: end-of-the-year manager awards. I guess here I should expect some quibbles that the awards don't mean anything. To these quibbles I have two responses:
(a) shut up; or, phrased in legal terms: let's stipulate that the awards really do matter;
(b) winners can probably sell the trophy or plaque or whatever for a few bucks and thus the award is not totally valueless or meaningless.
What, by the way, were the 2012 expectations for the two managers who wound up being the only serious candidates for AL Manager of the Year? The A's and Orioles both began the year under 10 percent in our Playoff Odds, though we should note where PECOTA's expectations differed from those in the mainstream. The A's, for instance, were not projected by PECOTA just before the year started to be a seriously awful, 100-loss-type team, but I certainly saw that figure tossed around liberally on A's-Twitter eight months ago. Sure, the optimists saw that Jarrod Parker had a good chance to be just as good as Trevor Cahill and that a starting-quality outfielder (Josh Reddick) is always worth more than a relief pitcher (Andrew Bailey) and that Yoenis Cespedes can jump on a lot of boxes and put sunscreen on sharks or whatever he got up to in that video I never watched, but I suspect that the pessimists better reflected the national mood on the A's after Billy Beane traded away two top(pish) young starters as well as a talented closer. (I realize that I'm assuming the existence of a national mood on the A's. Go with it.)
It is less clear to me what the Orioles were expected to do. Their division was seen as a powerhouse, but Dan Duquette did things like sign Wei-Yin Chen and get his scouts banned from South Korea, both of which are more win-now moves than the A's trading Gio Gonzalez to Washington looked to be. I guess the short answer is: the Orioles were expected to contend for a solid fourth place finish.
To the meat of the thing!
My first, albeit lesser, objection to expectations-based award-voting is that it's weirdly arrogant. The implicit assumption of anyone voting this way is clear: his or her projections are so good at pegging how well players will perform over the course of a year and thus how well the team will perform that upward deviations from that projection must be attributable to the manager waving his wand around. No way could it just be that our beliefs about Jarrod Parker's present skill level were wrong!
I also think it's hard to avoid what I'll call "movement bias" even though I'm sure someone else has already named it something better. What I mean by this is the phenomenon of teams that sign a free agent or make a trade being rated differently for having made that move than if the team simply employed the player in the first place. If Josh Hamilton and Zack Greinke sign with the Astros this offseason, there's no projection system that would posit that that's enough to make them a playoff team. If you add (e.g.) 10 WARP to a team with a 60-win core, you get a 70-win team, but you don't get a playoff team, as I suspect many fans and writers would hope/believe based on the motion on the roster. This is not something I brilliantly made up just now, of course, and it's also not something that I can prove exists, but I have a hunch that it tickles many of you in a particular brain-spot that indicates that you also have a feeling that fans and analysts and writers are fooled by this type of bias.
With those two paragraphs said, I don't even think that the setting of expectations in itself is the biggest problem with this method of manager award voting. After all, we have more or less rigorous methods available to us to create team projections, just as we have more or less rigorous methods available to us to value individual offensive performance. It's not really our fault if award-voters ignore those metrics when they choose the MVP, and it wouldn't be our fault if they did the same for Manager of the Year. That is, if "actual wins less expectations" were a valid method of voting for the award, we could actually calculate a statistic that measures that in a reasonable fashion using actual numbers and math.
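As a minimal sketch of what that statistic might look like, assuming we already have a preseason projected win total for each team, the calculation is just a subtraction and a sort. The team names and numbers below are hypothetical placeholders, not actual PECOTA output, and the function names are mine:

```python
from typing import NamedTuple


class TeamSeason(NamedTuple):
    team: str              # hypothetical label, not a real club
    projected_wins: float  # preseason expectation from some projection system
    actual_wins: int       # final regular-season win total


def wins_over_expectation(season: TeamSeason) -> float:
    """Actual wins less preseason expectations."""
    return season.actual_wins - season.projected_wins


def rank_by_surprise(seasons: list[TeamSeason]) -> list[tuple[str, float]]:
    """Sort teams from most to least surprisingly good."""
    scored = [(s.team, wins_over_expectation(s)) for s in seasons]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up numbers for illustration only.
seasons = [
    TeamSeason("Behemoths", projected_wins=93.0, actual_wins=96),
    TeamSeason("Write-offs", projected_wins=74.0, actual_wins=92),
    TeamSeason("Also-rans", projected_wins=80.0, actual_wins=71),
]

for team, surprise in rank_by_surprise(seasons):
    print(f"{team}: {surprise:+.1f}")
```

The arithmetic is the easy part.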
Where such a method fails, however, regardless of how much rigor (or at least whatever semblance of rigor we're working with in baseball analysis) we apply, is in the attribution of exceeded expectations to the manager. Back to the A's: the players performed better than we thought they would, exceeding preseason expectations, and, by our adjusted standings, the Pale Pachyderms also finished with a better record than they "should" have. There are ways in which the manager could theoretically influence each of these types of overperformance:
Straight-up coaching that makes players' skills better.
Player usage could help players actually perform better than projections think they will (e.g. by platooning a player who has not previously platooned).
Relief pitching could be structured in such a way that the team loses more blowouts (giving up runs in games that are already lost, thus hurting the Pythagenpat record without affecting the wins-and-losses bottom line) and wins more close games.
Position-player substitution strategies could be managed in the same way.
Motivation, calming and focusing players, or any number of other generally stated psychological factors about which our on-site staff of brain scientists is far better equipped to speak than I. Note that psychology is often closely scrutinized in the context of clutch performance (and thus potentially has a heavy impact on a team outperforming its basic runs scored and allowed), but it's perfectly applicable to overall performance as well. If happy players hit better, then being happy in the third inning of a five-run game will help just as much as being happy in the ninth inning of a knotted-up contest.
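To make the blowout point concrete, here is a rough sketch of a Pythagenpat calculation. It assumes the commonly cited form in which the exponent is runs per game raised to roughly 0.287; the exact constant varies by source and may not match what feeds our adjusted standings. The run totals are invented for illustration:

```python
def pythagenpat_win_pct(runs_scored, runs_allowed, games, power=0.287):
    """Expected winning percentage from runs scored and allowed.

    `power` is an assumed constant; published versions differ slightly.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    exponent = runs_per_game ** power
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def pythagenpat_wins(runs_scored, runs_allowed, games=162):
    """Expected wins over a season of `games` games."""
    return games * pythagenpat_win_pct(runs_scored, runs_allowed, games)


# Two hypothetical teams that score the same number of runs.
# The second gives up 40 extra runs, all in games that were already lost.
tight = pythagenpat_wins(700, 620)
leaky = pythagenpat_wins(700, 660)

print(f"tight bullpen usage: {tight:.1f} expected wins")
print(f"mop-up runs allowed: {leaky:.1f} expected wins")
```

If both teams finished with the same actual record, the second would appear to have beaten its run differential by several more wins, even though those extra runs never cost it a game.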
Each of these has significant measurement problems, however. For instance: Haha, measuring coaching. Haha, measuring motivation. Josh Reddick hit above his 70th percentile PECOTA. Coco Crisp nearly hit his 80th. Yoenis Cespedes hit just shy of his 90th. Jonny Gomes and Brandon Moss hit 20 and 40 points of TAv above their 90th. Did Bob Melvin build that?
Looking at pitcher- or position-player-usage in terms of leverage is dubious for reasons pointed out by Colin Wyers, or at least related to such reasons. If a key pinch-hit puts a game out of reach in the sixth or a shutdown reliever extinguishes a run-scoring situation before the inning is late enough for the win-expectancy to swing significantly, then the leverage will never get high and the manager won't get "credit" for using his best players in the tight situations. This relates to Colin's argument about win expectancy failing to untangle the creation (or extinction) of leverage — if a manager can push buttons that ensure that his team wins four-run games instead of two-, he'll push those buttons even though four-run wins don't create saves and they don't create high-leverage situations.
Even looking at personnel deployment in general is hard. Not-insignificant portions of player-usage decisions are determined by factors one would not have to consider while playing RBI Baseball. Relievers can't pitch every day. Most position players can't take every at-bat available to them. Contemporaneously, we often know from managers' pre- and post-game statements which relievers and which bench players are and are not available to perform certain tasks. At the end of the season, however, when we're totaling up the contributions of a manager's decision-making to the team's final winning percentage, this knowledge would require significant amounts of work to recreate and the problem is not susceptible to algorithmic solution.
And as to clutch! Well, we have a hard enough time distinguishing luck from skill in overall performance without chopping our sample into tenths and introducing psychology into the equation — the theory and general knowledge around brain science and questions of anxiety and motivation and adrenaline might be well-developed, but gosh, good luck getting data from major league baseball players that we can fit into those models.
Let me add one more confounding issue about choosing managers based on expectations: some amount of our ideas about how teams will do is based on who their manager is. When Baseball Prospectus puts together the Depth Charts that form a significant part of our process for creating playoff odds, Jason Martinez and everyone else involved know who the teams' managers are and what their tendencies might be. The innings distribution of certain pitchers can be selected based on how a manager uses his pitchers, for instance.
This is perhaps even more true when we don't use math-based projections but instead simply think about teams in general ways. "Dusty Baker always uses his veteran bench players well," we think when we ponder the latest collection of mediocrity he's been handed. When a team that has been managed by Manager X for three years unexpectedly breaks out in Year Four, are we really that happy to attribute the breakout to the manager? Our preseason expectations incorporated three years of manager data on X!
So where are we? The expectations model of Manager of the Year voting is flawed:
(a) from the outset because it's either based on ideas about the precision of projections that simply aren't true or maybe based not so much on projections as on general and vague ideas about how teams will perform;
(b) in implementation because performance above expectations is uncritically attributed to the manager when it could be caused by any of 10 million other things, some of which are in the general vicinity of the manager's control and many of which are not.
Congratulations to Bob Melvin!