January 20, 2010
Ahead in the Count
Revising Player Contract Valuation, Part 2
The second of a three-part series.
One particular area of concern in my attempt to develop a new, more accurate formula for MORP is how to price wins above replacement level for superstars, players who add several wins above replacement level. The current alternative to MORP, found at FanGraphs.com, lists a player's value using its win statistic, WAR. FanGraphs estimates the average price in dollars per WAR that teams pay on the free-agent market and multiplies that price by the player's WAR, so the resulting value is linear in WAR. While I have a number of concerns about that methodology, this particular choice seems worth exploring.
As it currently stands, MORP is not linear with respect to WARP. In other words, if you take a player with 4.0 WARP, like Jason Bay, his MORP is more than twice as high as that of a player with 2.0 WARP, like Johnny Damon. This made some sense at the time the study was done, since WARP previously used a much lower replacement level. Thus, players who produced a WARP below 3.0 using the old replacement level were all probably paid close to the major-league minimum salary, as they were probably all pretty close to replacement level. For players between 3.0 and 6.0 WARP using the old replacement level, teams would start paying more for wins as WARP increased, since these players were actually above replacement level.
However, with a higher replacement level, this makes less sense as an assumption and is worth testing. Nate Silver's original formula for MORP logically had salary at the league minimum when a player's WARP was 0.0, as we see in blue below, but it also showed salary accelerating as WARP increases. If we imagine that the red points are sample data points of salary and WARP for individual free agents, it makes sense to fit them with the curved shape we see below. A straight line drawn through these points would not accurately reflect the relationship. Instead, the blue parabola best fits the data:
However, Clay Davenport recently shifted replacement level upward to better reflect the players available at the league-minimum salary. Now, let's see what happens to the same exact distribution when we shift replacement level upward and lower all those WARP levels by two wins.
If we continue the natural assumption that zero wins above replacement is worth the league-minimum salary, suddenly, the best fit is a straight line. To properly estimate whether the relationship between wins and the dollar value of a player's wins is linear, we need to dig back into some theory and see what kind of assumptions would make that true and whether they hold in this case.
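The effect of the shift can be sketched with made-up numbers. The snippet below is only an illustration under stated assumptions: the league minimum, the price per win, the five sample points, and the helper names `fit_slope` and `sse` are all hypothetical, and the parabola is assumed to have the simple form minimum plus a constant times WARP squared. It pins salary at the league minimum for 0.0 WARP and compares how well a line and a parabola fit the same salaries before and after every WARP is lowered by two wins.

```python
# Synthetic illustration only: these are made-up points, not real salary data.
LEAGUE_MIN = 0.4   # hypothetical league-minimum salary, in $ millions
PRICE = 4.0        # hypothetical $ millions per win above the NEW replacement level
SHIFT = 2.0        # wins by which replacement level is raised


def fit_slope(xs, ys, power):
    """Least-squares c for the model y = LEAGUE_MIN + c * x**power.

    The intercept is pinned at the league minimum, so only c is fitted.
    """
    num = sum((x ** power) * (y - LEAGUE_MIN) for x, y in zip(xs, ys))
    den = sum((x ** power) ** 2 for x in xs)
    return num / den


def sse(xs, ys, power):
    """Sum of squared errors of the fitted pinned-intercept model."""
    c = fit_slope(xs, ys, power)
    return sum((y - (LEAGUE_MIN + c * x ** power)) ** 2 for x, y in zip(xs, ys))


# Salaries generated as exactly linear in wins above the new replacement level.
new_warp = [0.5, 1.0, 2.0, 3.0, 4.0]
salary = [LEAGUE_MIN + PRICE * w for w in new_warp]
old_warp = [w + SHIFT for w in new_warp]   # same players under the old, lower baseline

old_line, old_curve = sse(old_warp, salary, 1), sse(old_warp, salary, 2)
new_line, new_curve = sse(new_warp, salary, 1), sse(new_warp, salary, 2)

print(f"old baseline: line SSE={old_line:.2f}, parabola SSE={old_curve:.2f}")
print(f"new baseline: line SSE={new_line:.2f}, parabola SSE={new_curve:.2f}")
```

With these invented points, the parabola has the smaller error under the old baseline and the line has the smaller error under the new one. That says nothing about real free-agent salaries; it only shows that where replacement level sits can change which shape looks right.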
In a brief e-mail exchange with Silver on the topic, Nate said that one reason that it may still be worth keeping MORP as a non-linear estimator is that we are not estimating what teams should pay for wins but what they do pay for wins. We are stuck with the reality that we cannot determine how many extra fans are put in the seats by adding a player with a certain win value, and we are therefore stuck with using the inferred price of free agents that teams have shown us.
However, we are trying to approximate how much teams pay for that production. If we wanted to simply approximate how much a free agent would sign for, even if he was not necessarily worth the money, we would include a term in the equation to give the player extra MORP for high RBI totals. After all, some GMs certainly place a premium on this and the player will earn more as a result. A formula that simply adds extra dollars to the MORP value when a player has more RBI is hardly the production valuation that we are looking for from MORP.
Think of the implications: If we want to use MORP to evaluate signings, we cannot simply say, "Jason Bay got a great deal. Look at all those RBI that brought his MORP up! Clearly Omar Minaya overpays for RBI, just like Baseball Prospectus told him to!" After all, Bay's placement in the potent 2009 Red Sox lineup led to a lot of RBI that we do not want to pretend are part of his value, since he would not have gotten the same total had he remained with the Pirates.
Similarly, we are also not going to include a dummy variable in the equation for "Was this a veteran relief pitcher signed by Ed Wade?" that adds a certain number of dollars to MORP when the answer is "yes." We are figuring out what teams pay for production, so we should be using a player's WARP total as the measure of output. Likewise, we should not implement a system that adds extra $/WARP for higher WARP unless there is an actual basis for doing so in economic theory.
If baseball free agents were in a typical, perfectly competitive market like those you see in the first chapter of your introductory economics textbook, the price of wins would have to be linear, with every win costing the same. Basic economic theory of perfectly competitive markets would say that anything other than the same price for all wins would create arbitrage opportunities where teams could perpetually trade their way to the top of the league. This makes sense in something like a stock market. Say the dollar price of two shares of Microsoft was more than twice the dollar price of one share of Microsoft. People would simply buy up all the single shares of Microsoft, and then sell them in packets of two at a profit. The end result is that the demand for single Microsoft shares would go up, driving that price up, and the supply of pairs of Microsoft shares would go up, driving that price down.
Of course, strawman economists, who consider it their job to think really hard about the first chapter of the econ textbook, are always wrong. That first chapter is really supposed to be a baseline from which we can see what would happen as we change each assumption. In the case of baseball free agents, there are two main reasons why the baseline's assumptions don't apply. First, these markets aren't thick enough for teams to sign, trade, and swap out players as easily and quickly as investors can swap shares of Microsoft. There are only so many teams, and there are limits to making this kind of move in general. Second, you can't employ 60 Garret Andersons on a team and suddenly become the best team in baseball. There are only 25 roster spots, and only so many players can realistically get enough playing time to realize their true value. Somebody needs to be playing above average for the team to make the playoffs.
The first issue of "thick enough" markets does not matter as much here. Because the Mets could have simply signed Jon Garland and Johnny Damon, both two-win players, rather than just Jason Bay, one four-win player, it does not make sense to say that the MORP of a four-win player is any more than the combined MORP of a pair of two-win players. Since the money it would take to get four wins of production is better estimated through the price of two two-win players if there are spots for both, the price of four wins should be the combined price of Damon and Garland. Of course, that assumes that the Mets have spots in their rotation and lineup for both.
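The comparison between one four-win player and two two-win players can be checked with a few lines of arithmetic. This is a sketch with hypothetical pricing functions (`linear_value`, `convex_value`) and invented dollar figures, and it ignores the league-minimum floor for simplicity. It computes how much more a single star costs than a set of equal players with the same total WARP.

```python
# Hypothetical pricing functions; the dollar figures are illustrative, not estimates.
def linear_value(warp, price_per_win=4.0):
    """Value in $ millions if every win costs the same."""
    return price_per_win * warp


def convex_value(warp, coeff=1.0):
    """Value in $ millions if the price per win rises with WARP."""
    return coeff * warp ** 2


def premium_for_one_star(value_fn, star_warp=4.0, pieces=2):
    """Extra cost of one star over `pieces` equal players with the same total WARP."""
    return value_fn(star_warp) - pieces * value_fn(star_warp / pieces)


print(premium_for_one_star(linear_value))   # 0.0: no premium under linear pricing
print(premium_for_one_star(convex_value))   # 8.0: the star costs more than the pair
```

Under the linear function the premium is zero, which is the claim in the paragraph above. Any positive premium under a curved function has to be justified by something the pair cannot provide, such as a scarce roster spot.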
The issue of there being spots for both is curious. In the initial background study toward the general project of developing MORP, I decided to look at how many roster spots teams could realistically upgrade from a replacement player. I looked through each team's roster at the beginning of the 2009-10 offseason and attempted to determine how many places a team would have been left with expected replacement-level talent without making a trade or free-agent signing.
The concept of what a replacement player means is hazy, but there certainly does seem to be a talent level that is common among teams who replace players. I used something roughly equal to that definition to do this analysis. This definition, for clarity, involves about 20 runs below average over a full season for position players with positional adjustments considered. For starting pitchers, it involves approximately a 5.50 FRA in a neutral league. For relief pitchers, it is approximately a 4.50 FRA in a neutral league. FRA, or Fair Run Average, estimates what a pitcher's Run Average would have been if he had neutral performance on inherited/bequeathed runners. These are only ballpark figures, as I only have a few projection systems to work with, but these levels do make sense, as there seem to be about three to six players who would have been slotted as starters at each position with no off-season moves. All of them were about 20 runs below average, and more than half of teams had a fifth starter who could be expected to put up a FRA near 5.50. Among relief pitchers, it's tougher to say. Although the average team employs six or seven relievers, not all of them get high-leverage innings. Looking through how teams have historically used relievers, a good estimate is that each team uses about four relievers in high-leverage situations regularly over the course of the season. Therefore, if a team had three relievers projected to have FRAs below 4.50, I considered that one opening in the bullpen where a new reliever could be signed and get high-leverage innings.
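The counting rule described above might be sketched as follows. The three cutoffs and the five-starter, four-reliever slots come from the ballpark figures in the text. The function name `count_openings`, the treatment of players exactly at a cutoff, the handling of unfilled rotation slots, and the sample projections are all assumptions made for illustration, not the procedure actually used in the study.

```python
# Thresholds taken from the ballpark figures in the text; everything else is assumed.
POSITION_RUNS_CUTOFF = -20.0   # runs vs. average over a full season, position-adjusted
STARTER_FRA_CUTOFF = 5.50
RELIEVER_FRA_CUTOFF = 4.50
ROTATION_SPOTS = 5
HIGH_LEVERAGE_SPOTS = 4


def count_openings(position_runs, starter_fras, reliever_fras):
    """Count roster spots projected at or below replacement level.

    position_runs: projected runs vs. average for each starting position player
    starter_fras:  projected FRA for each starting pitcher on the roster
    reliever_fras: projected FRA for each reliever on the roster
    """
    lineup = sum(1 for runs in position_runs if runs <= POSITION_RUNS_CUTOFF)

    # Only the best five starters are slotted into the rotation; lower FRA is better.
    rotation = sorted(starter_fras)[:ROTATION_SPOTS]
    rotation_open = sum(1 for fra in rotation if fra >= STARTER_FRA_CUTOFF)
    rotation_open += ROTATION_SPOTS - len(rotation)   # unfilled slots count as openings

    # Each reliever projected under the cutoff fills one high-leverage spot.
    good_relievers = sum(1 for fra in reliever_fras if fra < RELIEVER_FRA_CUTOFF)
    bullpen_open = max(0, HIGH_LEVERAGE_SPOTS - good_relievers)

    return {"lineup": lineup, "rotation": rotation_open, "bullpen": bullpen_open}


# Made-up projections for a hypothetical team.
example = count_openings(
    position_runs=[15, 5, 0, -3, -8, -12, -20, -22],
    starter_fras=[3.9, 4.3, 4.8, 5.1, 5.6, 6.0],
    reliever_fras=[3.2, 3.8, 4.4, 4.7, 5.0, 5.2],
)
print(example)
```

The hypothetical team has three relievers projected under 4.50, so the sketch reports one bullpen opening, matching the rule stated in the text.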
The analysis is relatively straightforward after that. In the final part of the series on Friday, I'll check how many spots the average team had, with special emphasis on teams that typically spend on free agents and therefore set the market. With these definitions and the understanding developed above, I'll look through each position on the diamond (and DH in the American League) as of the beginning of November, and the top five projected starters and top four relievers at this time. If most teams have a few replacement-level openings at the beginning of the offseason, then we will know it is OK to go forward with a linear relationship between dollar value and WARP.