October 12, 2005
Lies, Damned Lies
A Mulligan on Guzman
You folks have long memories. I've already gotten several sarcastic comments about this article, which made the claim that Cristian Guzman was the best free-agent signing of the past winter. Although it will be nice to receive that Christmas card from the Bowdens, this obviously doesn't rank among the highlights of my analytical career. At least the Phillies almost made the playoffs this year.
I do try to be accountable when I get an aberrant result like this one, to see if there is anything wrong with the method that produced it. Guzman, certainly, had a worse season than almost anyone might have anticipated. His .219/.260/.314 batting line was a dead match for PECOTA's .226/.260/.315 10th percentile projection. The real problem, however, was in the field, where he lost a Kevin Garnett-sized step, going from a +14 FRAA player to a -11. Nobody was claiming that Guzman would have a breakout season--on the contrary, PECOTA gave him a 31% collapse rate. But a premium glove at shortstop is worth something, and his deal looked comparatively better than the likes of
To be frank, however, that result also looked pretty aberrant before the season even began. Coming into the year, Guzman was a flag-bearer for what might be called the "sub-premium" group of players--those players who are notionally better than replacement level, but no better than average in a good season. (If Guzman were a beer, he'd be Miller High Life.) A lot of Guzman's fellow sub-premium players fared well in my free-agent analysis--Corey Koskie, David Eckstein, Damian Miller, and so forth. It would seem that either the market is undervaluing these players, or my analysis is overvaluing them.
I decided to re-run my analysis with a couple of new wrinkles:
The pattern here is non-linear. Teams are willing to pay a premium for star talent, above and beyond the number of wins that they can be expected to contribute; signing one 6-win player is more costly than signing two 3-win players. The equation here, so that you can play along at home, is:
Salary = (WARP^2 * $212,730) + (WARP * $402,530)
This new formula results in a substantially more sensible salary projection for Cristian Guzman, who now rates as only a $2.8 million "bargain" for the life of his deal, rather than the $17 million implied by the original analysis. But does this non-linearity reflect rational behavior on the part of baseball clubs? Or are they overpaying for superstar talent?
I suspect that it's probably the former, for a couple of reasons. Firstly, the distribution of talent in baseball is highly uneven: a relatively small group of players are responsible for providing a relatively large portion of value. We can get some sense of this by looking at a Lorenz Curve that compares the percentage of value, as measured by VORP, against the percentage of playing time for all position players in 2005 (players with negative VORPs have their scores treated as zeroes).
The top 10% of players were responsible for about 25% of all VORP in 2005, and the top 25% of players were responsible for about 50% of all VORP. Of course, we wouldn't expect talent to be evenly distributed. But the distribution here is particularly skewed; there are a lot of David Ecksteins, and not very many Albert Pujolses. Lorenz Curves are traditionally used to evaluate income distribution within an economy; if baseball were an economy, it would be the United States, and not Sweden.
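For readers who want to build a curve like this themselves, here is a minimal sketch in Python. It is not the code behind the chart: the function name, the sample numbers, and the choice to rank players by value per unit of playing time (best first) are all assumptions made for illustration. The one rule taken directly from the text is that negative VORPs are treated as zeroes.

```python
def lorenz_points(players):
    """Return cumulative (playing-time share, value share) points, best players first.

    `players` is a list of (playing_time, vorp) pairs.
    """
    # Negative VORPs are treated as zeroes, as described in the text.
    clamped = [(pt, max(vorp, 0.0)) for pt, vorp in players if pt > 0]
    # Assumption: rank players by value per unit of playing time, best first.
    clamped.sort(key=lambda p: p[1] / p[0], reverse=True)

    total_pt = sum(pt for pt, _ in clamped)
    total_value = sum(v for _, v in clamped)
    if total_pt == 0 or total_value == 0:
        return [(0.0, 0.0)]

    points = [(0.0, 0.0)]
    cum_pt = cum_value = 0.0
    for pt, v in clamped:
        cum_pt += pt
        cum_value += v
        points.append((cum_pt / total_pt, cum_value / total_value))
    return points


# Made-up (plate appearances, VORP) pairs, purely for illustration.
sample = [(600, 80.0), (550, 30.0), (500, 10.0), (450, 0.0), (400, -5.0)]
for time_share, value_share in lorenz_points(sample):
    print(f"{time_share:6.1%} of playing time -> {value_share:6.1%} of value")
```

The further the resulting points bow away from the diagonal, the more concentrated the value is in a small share of the playing time.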
The second reason is that, while salaries are non-linear with respect to marginal wins, so too are revenues. There is a very substantial economic benefit that comes from making the playoffs, and most teams enter the baseball season with one of two goals--either be competitive enough to reach the post-season (a 90-win club), or save costs and write off the season as a rebuilding year (a 65-win club).
It's easy enough to see what happens when we put these two things together. Teams have every incentive to try to reach the playoffs. Reaching the playoffs requires the presence of a number of above-average players, and probably a couple of superstars. But there are relatively few above-average players to go around, and even fewer superstars. Ergo, top players demand and receive a premium. (It's also worth referencing Michael Wolverton's work in our 2002 annual. Michael examined the relationship between wins above replacement and pennants added and discovered that it is not linear; a player who contributes one 10-win season will provide more than twice as many pennants as he would with two 5-win seasons.)
Conversely, sub-premium players like Guzman become less valuable. It's very hard to punt on a position and still expect to reach the playoffs, because that's just one more really good player that you'll have to find to make up for his lack of contribution. Even Billy Beane would find it hard to stomach the presence of a truly replacement level player in his lineup. The A's have had a few Scott Hattebergs--players who perform slightly below average and are paid well below average--and a few mistakes like Terrence Long, but they won't call up an underqualified rookie from Sacramento, pay him the league minimum, play him every day, and sit around patiently while he posts a .235 EqA. That would turn Beane's job from being difficult into outright impossible.
Put differently, there is a hidden cost to fielding players who are substantially below championship caliber if you have aspirations of being a championship-caliber ballclub. A championship-caliber team generally needs to get about 35 wins out of its position players, accounting for both their offensive and defensive contributions; the average position player WARP for the 24 playoff clubs since 2003 has been 33.1. We could get there by fielding a uniformly above-average lineup like this one:
Pos     WARP   Salary
C       4.0    $ 5,013,800
1B      4.0    $ 5,013,800
2B      4.0    $ 5,013,800
3B      4.0    $ 5,013,800
SS      4.0    $ 5,013,800
LF      4.0    $ 5,013,800
CF      4.0    $ 5,013,800
RF      4.0    $ 5,013,800
DH      3.0    $ 3,122,160
Total   35.0   $43,232,560
That lineup would cost us about $43.2 million to put on the field, according to the market data that I presented above.
Now, suppose that our second baseman is a free agent; we decide that he'll be too expensive, and that we're going to play a replacement level player there instead, making up his contribution elsewhere on the field. If we take the four wins we were expecting out of him and distribute them evenly among the other eight positions, we wind up with:
Pos     WARP   Salary
C       4.5    $ 6,119,168
1B      4.5    $ 6,119,168
2B      0.0    $         0
3B      4.5    $ 6,119,168
SS      4.5    $ 6,119,168
LF      4.5    $ 6,119,168
CF      4.5    $ 6,119,168
RF      4.5    $ 6,119,168
DH      3.5    $ 4,014,798
Total   35.0   $46,848,970
That lineup will be no better and no worse than the one that we fielded previously, but it will be about $3.6 million more expensive, since we have to move farther up the salary curve at the other positions to make up for his loss.
Alternatively, and more realistically, we could make up for our replacement-level second baseman by signing a superstar, 8-win third baseman:
Pos     WARP   Salary
C       4.0    $ 5,013,800
1B      4.0    $ 5,013,800
2B      0.0    $         0
3B      8.0    $16,834,960
SS      4.0    $ 5,013,800
LF      4.0    $ 5,013,800
CF      4.0    $ 5,013,800
RF      4.0    $ 5,013,800
DH      3.0    $ 3,122,160
Total   35.0   $50,039,920
But that is even worse; now we're paying nearly $7 million more for the same talent. Call it opportunity cost, scarcity, or the Tony Womack Effect, but that 'free' guy we're playing at second base comes at a fairly substantial price.
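If you want to check the three lineups yourself, the sketch below reprices them with the salary equation given earlier. The function and variable names are my own, chosen for illustration; the coefficients and the WARP figures come straight from the text and tables.

```python
# Coefficients from the salary equation in the text.
QUADRATIC_COEF = 212_730
LINEAR_COEF = 402_530


def projected_salary(warp):
    """Salary = (WARP^2 * $212,730) + (WARP * $402,530)."""
    return warp ** 2 * QUADRATIC_COEF + warp * LINEAR_COEF


def lineup_cost(warps):
    """Total projected salary for a lineup given each player's WARP."""
    return sum(projected_salary(w) for w in warps)


# Position order: C, 1B, 2B, 3B, SS, LF, CF, RF, DH.
balanced = [4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 3.0]
spread = [4.5, 4.5, 0.0, 4.5, 4.5, 4.5, 4.5, 4.5, 3.5]
superstar = [4.0, 4.0, 0.0, 8.0, 4.0, 4.0, 4.0, 4.0, 3.0]

baseline = lineup_cost(balanced)
for name, warps in [("balanced", balanced), ("spread", spread), ("superstar", superstar)]:
    cost = lineup_cost(warps)
    print(f"{name:10s} WARP {sum(warps):4.1f}  cost ${cost:,.0f}  extra ${cost - baseline:,.0f}")
```

All three lineups add up to the same 35 wins; only the shape of the distribution changes, and the squared term in the equation is what makes the lopsided versions cost more.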
Even Guzman, of course, wasn't projected to be a replacement level player, merely a below average one. Still, the effect exists to some extent any time that we lock in a 'sub-premium' player, and the problems are compounded if we sign him to a multi-year deal, as the Nats did with Guzman. Too many baseball clubs are fixated on cost certainty, when the vast majority of them are capital-rich enough that they don't really need to be, and too many statheads go gaga when a team locks up its young players with multi-year contract extensions, when these extensions don't usually represent a material discount below market price.
I prefer the concept of flexibility: avoiding long-term commitments at enough positions that you can properly take advantage of arbitrage opportunities in the market. Let's look for example at this winter's Phillies, who aren't very flexible. They might well have both the cashflow and the desire to upgrade their offense, but they don't really have the position at which to do it:
Ultimately, we may need a more flexible understanding of replacement level. I've felt for some time that we've set the bar a bit too low if the goal is not describing how baseball teams win games but, rather, how they win championships. One option might be to use the salary equation that I described above; instead of talking about a player's WARP or VORP, we'd talk about his MORP--Market Value Over Replacement Player. A player would get some credit for being better than replacement level in the literal sense of the term--as bad as Angel Berroa has been, he's probably better than a guy you could find on the waiver wire. But he'd receive relatively more credit for being above average, and extra credit still for being an elite player that facilitates championships. The 2005 AL shortstops would look like this:
Player       WARP   MORP
Peralta      9.2    $21,708,743
Jeter        8.8    $20,016,075
Tejada       7.6    $15,346,513
Lugo         7.0    $13,241,480
Young        6.2    $10,673,027
Uribe        4.5    $ 6,119,168
Crosby       4.2    $ 5,443,183
Cabrera      4.1    $ 5,226,364
Guillen      3.3    $ 3,644,979
Bartlett     2.5    $ 2,335,888
Renteria     1.8    $ 1,413,799
Berroa       1.8    $ 1,413,799
Adams        0.9    $   534,588
Betancourt   0.7    $   386,009
That, to me, is a more accurate rendering of the players' relative values. Salary considerations aside, I don't think you'd trade Derek Jeter for Juan Uribe plus Cabrera, which WARP suggests would be a fair swap. And I don't think you'd trade Jeter for a team full of Cristian Guzmans.
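The Jeter-for-Uribe-plus-Cabrera comparison can be worked through in a few lines. This is only a sketch: it assumes MORP is simply the salary equation from earlier applied to a player's WARP, which is how the figures in the table above appear to be computed, and the names in the code are illustrative.

```python
def morp(warp):
    """Market value over replacement, using the article's salary equation."""
    return warp ** 2 * 212_730 + warp * 402_530


# 2005 WARP figures taken from the table above.
warp_2005 = {"Jeter": 8.8, "Uribe": 4.5, "Cabrera": 4.1}

jeter_warp = warp_2005["Jeter"]
pair_warp = warp_2005["Uribe"] + warp_2005["Cabrera"]

jeter_morp = morp(jeter_warp)
pair_morp = morp(warp_2005["Uribe"]) + morp(warp_2005["Cabrera"])

print(f"WARP: Jeter {jeter_warp:.1f} vs Uribe + Cabrera {pair_warp:.1f}")
print(f"MORP: Jeter ${jeter_morp:,.0f} vs Uribe + Cabrera ${pair_morp:,.0f}")
```

By WARP the two sides are close to even, while by MORP the pair falls well short of Jeter alone, which is the distinction the table is meant to draw.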