Baseball Prospectus home
  
  

Matt Swartz
759 comments | 83 total rating | 0.11 average rating
Matt Swartz
(24824)
Comment rating: -

I don't think you correctly interpreted my point, so I'll try to make it clearer. I don't think they will spend past rationality. I think they will be rational-- that is my point. As a result, I think they will spend as much as the expected increase in revenue. The expected increase in revenue, however, is NOT fixed, and the expected cost of talent is not either. The expected increase in revenue is based largely on the probability that spending puts a team in a playoff spot, something that has been shown many times here and in other places. Since teams have been more tightly packed, that probability will go up, which will make the expected revenue of spending go up for teams on the margin, causing them to bid more on wins. You say that teams with 86+ wins will now have the option of spending more but you think they can choose to stand still. But that assumes it is equally easy to reach 86 wins. No one can realistically reach 86 wins without free agent talent, something I've shown before (look through archives for: Service Time Contracts and Wins, Part I and II), which means that if there are more bidders for that talent, the cost of that talent will go up. In other words, THE OPTION TO STAY PUT AT 86 WINS IS NOT AVAILABLE IF PLAYERS COST MORE. Another example: if there were no longer a luxury tax or draft pick compensation, the Yankees would spend more on the players that mid-market teams have been able to acquire in their pursuit of 86 wins-- making this cost more. The assumptions you're making assume constant prices of labor, a common assumption in perfectly competitive markets with many buyers and sellers of labor-- that has nothing to do with baseball. The odds of winning a playoff series are not so different from 50%, so they won't really treat that as different than before.

 
Matt Swartz
(24824)
Comment rating: -

A fair point. Generally teams spend more than the direct playoff revenue. That extra revenue will be part of what the players get too, though, in the sense that they will be paid for the marginal revenue of wins. Whenever you add revenue associated with winning, the players will get that. So the net result here may be closer to neutral for owners, except for the bidding issue I mentioned earlier in the comments. So maybe it's not $50MM vs. $25MM but $50MM vs. $40MM. Something like that. Good call.

 
Matt Swartz
(24824)
Comment rating: -

And regardless, it seems that you are convinced in general that the players will be the beneficiaries. You're just not sold on my belief that the owners may actually keep less money. But you would agree that if these playoffs generated $25 million more in revenue, player salaries would go up by at least around $25 million, right?

 
Matt Swartz
(24824)
Comment rating: -

That's not what I'm saying. What I'm saying is that in situations where they are willing to spend $10 million, they used to only have to spend $8 million to win the auction, but now someone will bid them up to $9 million. They will still make money, but the increased competition makes it harder to get a player for much less than their marginal revenue. Not only that, when other teams spend less in general, there is less need to keep spending to ensure a playoff spot, but this may make it harder to continue to do so.

 
Matt Swartz
(24824)
Comment rating: -

I played around with this one, but I think it's still better to ensure the longer series. The expected numbers of games in 3-, 5-, and 7-game series are 2.5, 4.125, and 5.8125. So your expected number of playoff games as a wild card is 6.74 if you're in a wild card matchup and 8.48 games if you get a first-round bye directly. Definitely an interesting idea, though.
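The figures in this comment can be reproduced with a short sketch. It assumes every game is an independent coin flip (50% for each team), that the wild card path is a best-of-3 followed by a best-of-5 and two best-of-7 series, and that a bye skips only the best-of-3. The function names are illustrative, not anything from the site.

```python
from math import comb

def expected_series_length(best_of, p=0.5):
    """Expected number of games in a best-of-N series when one team
    wins each game independently with probability p."""
    wins_needed = best_of // 2 + 1
    total = 0.0
    for games in range(wins_needed, best_of + 1):
        losses = games - wins_needed
        # The series winner takes the final game and wins_needed - 1
        # of the games before it; either team can be the winner.
        ways = comb(games - 1, wins_needed - 1)
        prob = ways * (p**wins_needed * (1 - p)**losses
                       + (1 - p)**wins_needed * p**losses)
        total += games * prob
    return total

def expected_playoff_games(rounds):
    """Expected games played across successive rounds, assuming a 50%
    chance of advancing past each round (coin-flip assumption)."""
    total = 0.0
    reach = 1.0  # probability of reaching the current round
    for best_of in rounds:
        total += reach * expected_series_length(best_of)
        reach *= 0.5
    return total

wild_card_games = expected_playoff_games([3, 5, 7, 7])
bye_games = expected_playoff_games([5, 7, 7])
print(round(wild_card_games, 2), round(bye_games, 2))
```

Under those assumptions the two totals round to the 6.74 and 8.48 quoted above.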

 
Matt Swartz
(24824)
Comment rating: -

They don't necessarily spend up to their marginal revenue right now. Other teams bid them up as high as they're willing to go. But as more teams bid, they're more likely to spend closer to the marginal revenue and end up with the same chance of making the playoffs. The laws you're talking about implicitly assume that they would make the same amount of money if they sat on their hands. They wouldn't. Teams on the margin will spend more, and compete in bidding more. Also, the wild card/division winner issue in the AL East comes into play for years with only two good teams in that division. Being the better of those two teams didn't use to matter, but when your playoff odds as a wild card are cut in half relative to now, winning that division matters more. I don't know if you can assume owners will make more on regular season games proportionally just because they matter more. Maybe attendance goes up a little, but I doubt it would be commensurate with the playoff gains.

 
Matt Swartz
(24824)
Comment rating: -

I'm not saying that going to 100% playoff participation will raise the marginal revenue. I'm saying that going from 27% to 33% will raise the marginal revenue, and specifically given how the league is currently set up. You're right that I'm saying that there will be more revenue in the collective pot, but that owners will end up with less of it. It follows logically from the economic concept that you're talking about. I'll give you a perfect example. Suppose that MLB says that any team that doesn't make the playoffs for five years in a row loses its revenue sharing. Let's say this makes fans excited and increases ticket sales. But as a result of the larger increase in marginal revenue per win to the Pirates, they now spend more. The revenue goes up, but the Pirates spend more money than the expected revenue increase. Contrived, yes, but it makes the same point. In this case, the odds of one game making the difference go up from about 27% to about 43%. As a result, the marginal revenue of a win is higher to a lot of teams, and they bid up salaries. It follows logically from rationally acting owners who simply have higher estimates of expected marginal revenue from spending across the board. The key here is understanding the auction nature of this market.

 
Matt Swartz
(24824)
Comment rating: -

The point I tried to make in last September's article was that this is about what PECOTA says, not what the authors say. I also really wouldn't talk about the Red Sox overcoming their sabermetric biases-- the other finding of that article (other than the fact that PECOTA overrated sabermetrically inclined teams) was that sabermetrically inclined teams did better per dollar than less inclined teams.

 
Matt Swartz
(24824)
Comment rating: -

Every year this question comes up, and it's not a ridiculous one at first glance. However, you're missing the context of projecting stats of individuals, versus generic stats of leaders. We are NOT forecasting only two people to hit 100 RBI. If you asked for a projection of the number of people who will have 100 RBI, we can give you an estimate of 30 guys or whatever, but that doesn't mean we can tell you which thirty guys will do it-- because what we can tell you is that there are about 80 guys who are going to be at 90 +/- 20 given their true talent level. It's a matter of guessing what each player will do on average, rather than who will be the luckiest. If you give a pitcher with a 3.00 ERA a total of 33 starts, and randomize which games he pitches in, he'll win about 16 games on average. But sometimes he'll get 20, other times he'll get 12. We would be foolish to just start guessing which pitcher gets the best run support, just like we'd be foolish to start guessing which hitter has the most broken bat hits with men on third base. But of course somebody will get 20 wins most years. Somebody will get that lucky. Ryan Howard is the most likely to lead the league in RBI if you have to pick somebody, but he's only got maybe an 18% chance of doing so. Pujols has maybe a 16% chance, Braun has maybe a 14% chance, etc. We'd be foolish to sit here and say that since Howard has the highest odds of leading the league in RBI that he'd have the 130 RBI necessary to lead the league. There's a good chance he ends up with 90 RBI, and we weight all the outcomes.
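The 33-start example can be illustrated with a simple binomial sketch. It assumes, purely for illustration, that each start is an independent trial with a win probability of 16/33, chosen so the average matches the "about 16 wins" in the comment. Real pitcher wins are not independent coin flips, so the spread here is only indicative.

```python
from math import comb

STARTS = 33
WIN_PROB = 16 / 33  # assumed per-start win chance, chosen to average 16 wins

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k, n, p):
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

def prob_at_most(k, n, p):
    return sum(binomial_pmf(i, n, p) for i in range(0, k + 1))

expected_wins = STARTS * WIN_PROB
p_20_or_more = prob_at_least(20, STARTS, WIN_PROB)
p_12_or_fewer = prob_at_most(12, STARTS, WIN_PROB)

print(f"expected wins: {expected_wins:.1f}")
print(f"P(20+ wins):   {p_20_or_more:.3f}")
print(f"P(<=12 wins):  {p_12_or_fewer:.3f}")
```

With many similar pitchers in the league, a modest per-pitcher chance of 20 wins is enough for somebody to get there in most seasons, even though no individual projection should say 20.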

Apr 01, 2011 3:42 PM on
 
Matt Swartz
(24824)
Comment rating: -

That's a good idea. I think the two categories you mention are important, but the unquantifiable intangibles include real skills too. I think things like coaching, synergy, etc., are all part of that. Once the new WARP statistic is available for all years, I'll be able to run that a little more easily (and I plan to). I can also revisit things like last year's No Turnover Standings, in which I follow the same sum-of-all-WARPs methodology you're talking about.

 
Matt Swartz
(24824)
Comment rating: -

I kept switching between putting Francisco Liriano and Jered Weaver actually and ended up picking Weaver.

Mar 31, 2011 2:53 PM on Staff Picks for 2011
 
Matt Swartz
(24824)
Comment rating: -

Yeah, I know, I just liked the alliteration. I guess I sort of meant the Maryland residents in the middle of the two. Bethesda is where I lived and, at least anecdotally, it seemed split down the middle.

Mar 17, 2011 4:12 AM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

The missing link here is that I've actually researched free agent performance in multi-year deals, and found that it declines pretty steeply after the first year: http://www.baseballprospectus.com/article.php?articleid=10883 And that's particularly true for older free agents. There are plenty of exceptions of course, but statistically the Nats are basically getting a bargain in the early years and a net loss in the late years. But the Nats would have been able to add someone who is in the "bargain years" of his multiyear deal when they will have the most use for a bargain. There isn't a lack of elite free agents in any offseason. The Nats just as easily could have signed the best OF or 1B in 2013 and not paid the premium of a current superstar that won't always be.

Mar 16, 2011 6:13 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

I don't underestimate the damage done to Washington baseball fans. I lived there, I saw it. It was nowhere near the psychological damage that Philadelphians saw when they had a decade of being called a small market team. I do not think that the Nationals will need to do any more than the Phillies needed to do, which is win. And I'm not saying they will survive on Strasburg and Harper alone-- quite the contrary, I explained in the article that teams need free agent talent to supplement the product of the minor league system as it comes to fruition. But Werth is going to be a couple years older then, and it makes more sense to spend on the biggest free agent of 2013 (or whatever year looks like they're ready to compete) than to spend on Werth now and have less money to spend on players in their prime when Strasburg, Harper, and Zimmerman are all contributing.

Mar 15, 2011 4:37 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

The draft choices provided by Dunn belonged to the Nats regardless of what they did with Werth. I don't think that the Nats are going to draw much better this year with Werth than they did in the non-Strasburg games of 2010, do you? And I also don't think that an actual winner on the field in 2013 would be ignored due to fans being turned off in 2011, particularly if they added a top free agent in the 2012-13 offseason, do you?

Mar 15, 2011 4:07 AM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

Obviously just an anecdote, but I saw a lot of Orioles gear in Bethesda. I suspect if they were good, they'd have a much larger following, particularly if they sustained it.

Mar 15, 2011 4:05 AM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

Yup, I think that's the key here. It's not quite that simple because we need to factor in ticket prices and their effect, but the downside of fans distrusting your willingness to win just isn't THAT much lower than what the Nats and O's have been getting by spending on pretending like they're trying to win.

Mar 14, 2011 3:45 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

I'm sympathetic to your point, but I think that "showing the fans some wins" just isn't going well. These teams are selling 22K tickets per game, just 3K more than the Marlins and Pirates. I think that it's just not worth outbidding a team that is getting a combination of "showing the fans some wins" and actual playoff revenue.

Mar 14, 2011 3:41 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

I agree-- it's a lot of Maryland counties that are on the fence. I lived in Bethesda and it really seemed like an area that could go either way. I felt like I had a seat on the front lines, and neither army would get their act together!

Mar 14, 2011 3:39 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

Yeah, I wrote an article actually about when one-year deals work here (http://www.baseballprospectus.com/article.php?articleid=9869), and that actually does condone moves like signing and trading Tejada at times. At the same time, I still would like to see the money spent elsewhere in most situations. Especially with teams projected to lose 90 games or more, I think resources should be managed differently. It's a smaller loss, but why not use the $10 million that a Lee-type player costs towards international free agents and towards a free agent a couple years down the line?

Mar 14, 2011 3:36 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

I agree with the issue of having a city of transplants, as it is somewhat unique to baseball. Since baseball is a game that often is shared between fathers and sons, it is probably more likely to affect a transplant city where the younger residents share team loyalties with transplanted parents. Still, Strasburg-mania dulled those concerns, since I realized how many people wanted to be Nationals fans if they had a reason to be.

Mar 14, 2011 3:34 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

What the Nats got for Dunn and Willingham is irrelevant to this situation because they would have gotten those players regardless of whether they signed Werth. Similarly, the fact that they have Strasburg already shouldn't affect it either. By signing Werth, they lost the pick and that is a cost associated with it. However, you are correct that the Nationals have not been run as poorly as the Orioles have during 2005-2011, and that a lot of their under-.500 struggles are a result of the mismanagement of the Expos franchise. The only reason I highlighted their collective records was to discuss how unique the market was in lacking an above average team for an extended period of time. I agree that using free agency along with the draft is a smart strategy, but in a way that enhances the Nationals when they are going to be good. Free agents from other teams generally derive most of their value in the first year of their contracts. This is especially important for a team that is as well positioned to improve over time as the Nationals are.

Mar 14, 2011 3:32 PM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

Right-- which I think is evidence of how much the area wants a winner! There are Caps jerseys all over the place, despite the fact that hockey on the whole is not that popular in America. People are rabid about the Redskins for the first two weeks of the season too. It's just an untapped market for a winner!

Mar 14, 2011 5:37 AM on Battle for the Beltway
 
Matt Swartz
(24824)
Comment rating: -

Was this what you had in mind? http://www.baseballprospectus.com/article.php?articleid=11890 The systematic bias I found was that teams who use sabermetrics more tended to be overrated.

Feb 24, 2011 6:52 PM on
 
Matt Swartz
(24824)
Comment rating: -

This is a common question. If you asked the question of "how many pitchers will get more than 15 wins this season on average?", the answer is that probably 5-10 will. However, there is no individual pitcher with better than a 50/50 shot of topping 15 wins. The pitchers that win 20 games are basically always among the top pitchers in the game who were ALSO lucky with run support and health in a given season. For example, batting average generally has a standard deviation of about 20 points for a given season, which means that if you get 10 guys who are supposed to be around .310, one of them is probably going to hit .335 or so because of luck alone even without improving their talent level.
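A rough check of the batting average example, assuming (as a simplification not stated in the comment) that each hitter's season average is normally distributed around a .310 true talent with a 20-point standard deviation, and that the ten hitters are independent:

```python
from statistics import NormalDist

TRUE_TALENT = 0.310   # from the comment
SEASON_SD = 0.020     # "about 20 points"
THRESHOLD = 0.335
HITTERS = 10

# Chance that one given hitter finishes at .335 or better on luck alone.
p_single = 1 - NormalDist(TRUE_TALENT, SEASON_SD).cdf(THRESHOLD)

# Chance that at least one of the ten does (independence assumed).
p_any = 1 - (1 - p_single) ** HITTERS

print(f"one hitter at {THRESHOLD:.3f}+: {p_single:.3f}")
print(f"at least one of {HITTERS}:   {p_any:.3f}")
```

Each individual hitter is unlikely to reach .335, yet the group is more likely than not to contain one who does, which is the distinction between projecting individuals and projecting leaders.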

 
Matt Swartz
(24824)
Comment rating: -

That could be something we'll work on soon too. I agree that it's important to be forthright about when we do not come out on top. Colin has worked very hard on improving PECOTA in the last several months, and he's definitely playing with tests of various versions of PECOTA at least internally. I do think multi-year looks at projection systems are an important step to take as well. Thanks.

Jan 29, 2011 9:12 AM on Testing SIERA
 
Matt Swartz
(24824)
Comment rating: -

I don't think we'll integrate the two in a formulaic way. Mike Fast and I email and tweet to exchange ideas sometimes, and it has definitely helped my understanding of how the two can be integrated to learn about pitching. I suspect that the data won't be used in SIERA but both will be used to inform our understanding of pitchers.

Jan 29, 2011 9:10 AM on Testing SIERA
 
Matt Swartz
(24824)
Comment rating: -

Really cool idea. I'm going to have to look into this. Thanks!

Jan 29, 2011 9:09 AM on Testing SIERA
 
Matt Swartz
(24824)
Comment rating: -

My understanding is that this is definitely the plan for the player cards this year. Doing it on old cards was apparently very tricky.

Jan 29, 2011 9:09 AM on Testing SIERA
 
Matt Swartz
(24824)
Comment rating: -

Yeah, I agree that FIP isn't park-adjusted and that causes it to be short-changed. In the article I explained that, which is why I tested FIP against unadjusted ERA for pitchers who did not switch teams, and checked SIERA, xFIP, etc. against park-adjusted ERA for the same pitchers who did not switch teams. (That's in table 4). I agree RA would be ideal instead of ERA, for non-fantasy purposes. At some point, we might derive a SIRA regression, or something like that, so as to sort out what that should look like. I guess that the original goal was just to move closer to estimating something familiar and to justify its use for fantasy purposes if people are interested. It's definitely something on my radar. Thanks.

Jan 29, 2011 9:08 AM on Testing SIERA
 
Matt Swartz
(24824)
Comment rating: -

In this article, I didn't actually compute any of the estimators using formulas. In the article, I explained that to avoid last year's mix-up, I just used what was available at BP and FG. I did use old SIERAs from 2003-04, before BP stats had access to the batted ball data. If there was a bbFIP posted somewhere that I could merge with the rest of the data, that would be easier. If I try to compute it from an original set of batted ball data, I can include bbFIP, but it's tricky without that, since my coding skills are still a work in progress.

Jan 29, 2011 9:04 AM on Testing SIERA
 
Matt Swartz
(24824)
Comment rating: -

Most of this I looked at in my previous articles, actually. Great minds think alike and all. Crowd seems to have almost no effect. If it did, some teams would have consistent home-field advantages from year to year. This is not true. Also, teams that went from old stadiums to new stadiums would see jumps in HFA, but instead they see small declines if anything. Travel period seems to have a weird effect that is not very clear. There is no difference between the first and second series in a row at home or on the road. There is a higher HFA in the 2nd game of a three-game series, with lower HFA in the 1st and 3rd games of the series. This seems like it could be related to sleep deficits counteracting the effects of getting used to stadiums, but the jury is still out on it. On the other hand, travel length is very relevant. The shorter the distance teams travel, the smaller the home-field advantage. Umpiring I'll need to defer to others on. If you want more details, click on the link in the first sentence of this article, which is the recent article with links to all five articles in my 5-part series last year.

 
Matt Swartz
(24824)
Comment rating: -

FYI: When recreating MORP, I used the actual pick surrendered. The $8/5/14MM I talk about for Surrender 1st/Surrender 2nd/Re-sign Type A is just an approximation of the output. I literally went through each and every draft the last few years and wrote down which picks were surrendered. It was boring!

 
Matt Swartz
(24824)
Comment rating: -

MORP is $5MM/WARP. So 2.47 WARP is $12.35MM. I'm not sure where you got the other number. (FYI the glossary needs to be updated from 2007 numbers which were based on 2007 salaries and on old WARP with a lower replacement level.) The 2.47 WARP would be right in line with a pitcher who throws about a 4.25 ERA over 180 innings, so that's where we're in agreement. If you recalculate with MORP done correctly, I think you'll realize where your mistake is. But Carl Pavano, a very comparable pitcher, is going to do a lot better than 2y/$17MM on the FA market right now. If you get in a bidding war with another team for Joe Blanton, you should go as far as you'd go for Carl Pavano. The first thing I said after the table was: "It is true that Blanton is probably the worst pitcher on this list, because it was designed to figure out who had outpitched him over the last two years". So of course he's the worst one in that list. Go through and find a list of pitchers who made $5MM after reaching six years of service time in the last few years. They all have major injuries or ERAs around 5. That's simply not Blanton at all.

Dec 20, 2010 1:57 PM on A Pitch for Joe Blanton
 
Matt Swartz
(24824)
Comment rating: -

Click on the "Statistics" tab on the top, click on "Pitcher Season - Rates" which takes you here: http://www.baseballprospectus.com/statistics/sortable/index.php?cid=112182 You can also click on "Pitcher Team - Rates" if you want pitchers who switched teams mid-season to come up separately for two different teams (i.e. Roy Oswalt as an Astro and as a Phillie on separate lines)

Dec 20, 2010 11:53 AM on A Pitch for Joe Blanton
 
Matt Swartz
(24824)
Comment rating: -

That's a very specific cohort. It includes Joe Blanton and Johan Santana. Take away the service time minimum and you include Jason Hammel. Replace the 350-400 IP limit with 300-450 and you get Chris Carpenter.

Dec 20, 2010 4:43 AM on A Pitch for Joe Blanton
 
Matt Swartz
(24824)
Comment rating: -

I think what he is saying is that any variable that has any slight persistence at all will do that. That's true, I think. But we're talking about a hypothetical: suppose a pitcher gives up an infinite number of fly balls in 2010 and an infinite number in 2011, all against the same competition; then the correlation of his HR/FB in those two years would be 1.0. That's technically true. The sample size issue that Tango is talking about with HR/FB is cancelled out somewhat by the fact that HR/FB is lower than BABIP. HR/ofFB is around .130 on average. BABIP is around .300. About 28% of BIP are ofFB. So if you take a pitcher with N BIPs and .28*N ofFBs, then your standard deviation due to randomness is:
For BABIP: sqrt{(.300)*(1-.300)/N} = .46/sqrt(N)
For HR/ofFB: sqrt{(.130)*(1-.130)/(.28N)} = .64/sqrt(N)
So there is a somewhat larger standard deviation due to randomness, about 1.39 times as large for HR/ofFB. But the correlation for BABIP is .122 vs. .075 for HR/ofFB, which is 63% larger. So we're still looking at BABIP having higher variance in skill level (***assuming I did all that correctly***)
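The arithmetic in this comment can be checked with a few lines. The rates (.300, .130, 28%) and the correlations (.122, .075) come from the comment; the sample size N is an arbitrary illustrative value, since it cancels out of the ratio.

```python
from math import sqrt

def binomial_rate_sd(rate, trials):
    """Standard deviation of an observed rate due to randomness alone."""
    return sqrt(rate * (1 - rate) / trials)

N = 500  # hypothetical number of balls in play; cancels in the ratio

babip_sd = binomial_rate_sd(0.300, N)
hr_per_offb_sd = binomial_rate_sd(0.130, 0.28 * N)

sd_ratio = hr_per_offb_sd / babip_sd   # noise in HR/ofFB relative to BABIP
corr_ratio = 0.122 / 0.075             # year-to-year correlations from the comment

print(f"BABIP SD * sqrt(N):   {babip_sd * sqrt(N):.2f}")
print(f"HR/ofFB SD * sqrt(N): {hr_per_offb_sd * sqrt(N):.2f}")
print(f"SD ratio:             {sd_ratio:.2f}")
print(f"correlation ratio:    {corr_ratio:.2f}")
```

The noise ratio (about 1.39) is smaller than the correlation ratio (about 1.63), which is the basis for the comment's tentative conclusion.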

 
Matt Swartz
(24824)
Comment rating: -

Well, the proof mostly comes at the beginning. The correlations of BABIP and HR/FB are positive and small. They are also significantly negatively correlated with K%, and BABIP is significantly correlated with GB%. So we do see evidence of that small amount of control in that respect. Pitchers who miss bats also tend to induce weaker contact.

 
Matt Swartz
(24824)
Comment rating: -

Yeah, it's probably very unlikely that many pitchers can really control where hard hit contact is made. If you're going to have that much control, you should probably aim to get the guy to make weak contact or miss. But a few pitchers probably do have the skill and intelligence to get that job done, which is why we do see somewhat positive correlations for a lot of the numbers above, even the ones close to zero.

 
Matt Swartz
(24824)
Comment rating: -

I don't want my discussion of FIP to be mischaracterized. When I say "FIP assumes no BABIP control," I do not mean "Tom Tango assumes no BABIP control." I'm ONLY saying that his model is designed to neutralize BABIP, and one of the advantages SIERA has is that it neutralizes BABIP luck without neutralizing the whole thing. One of the model assumptions of SIERA is that all we know about BABIP can be explained by K, BB, and GB%. That's not a 100% true assumption either. But it's a feature of the model.

 
Matt Swartz
(24824)
Comment rating: -

I think that any detailed statistical analysis of one particular game needs to have a big asterisk on it. That said, I think that pitchers (particularly ones like Moyer) are great at having plans for games. They are really good at mixing pitches and finding hitters' weaknesses. As a result, they can often have good BABIP in a fleeting moment, but those strategies don't tend to be persistent. It's tough to keep getting the same hitters out if you are not striking them out or getting them to hit ground balls. Moyer will surprise a lot of hitters with a fastball down the middle on an 0-2 count, and then they'll hit a pop-up when they suddenly realize that it is actually a fastball. I don't think that Moyer will get the same guy to do that again, so I wouldn't say his BABIP is repeatable (as evidenced by his ever so slightly good but not amazing career BABIP), but he's definitely able to make a good pitch to get an out in an individual game. One thing I have also noticed with Moyer is that when his location is off, his BABIP is terrible, and no pitcher can always have their location on. It's a fleeting thing-- even Cliff Lee and Greg Maddux have games where the ball isn't going where they want it to be within the strike zone. So I would say it's possible to have a good game with mediocre K/BB, but it's probably rare, and the times it does happen, it's probably when the pitcher gets ahead in the count a lot.

 
Matt Swartz
(24824)
Comment rating: -

Line drive rate for major league pitchers has almost no persistence at all. Something along the lines of <.01 year-to-year correlation. I'm sure that if you included a bunch of A-level pitchers in the major leagues, they'd allow a lot of line drives, but among MLB pitchers who can keep their jobs (i.e. maintain at least a K% of 10% and do at least something else well), their line drive rate is not persistent at all.

 
Matt Swartz
(24824)
Comment rating: -

I think so. They should be getting overhauled soon, and I've asked to make sure SIERA is on them. In my spreadsheets, I find it helpful to have SIERA year by year for pitchers so I'm sure readers would too!

 
Matt Swartz
(24824)
Comment rating: -

Statistics tab-- Pitcher-Team-Rates or Pitcher-Season-Rates. Here's the link to Pitcher Team Rates: http://www.baseballprospectus.com/statistics/sortable/index.php?cid=224511

 
Matt Swartz
(24824)
Comment rating: -

I didn't actually run this through peer review. A lot of the time our writers email each other about articles and talk them out, but short of a little discussion with Eric, I felt comfortable presenting these as facts. The conclusions are pretty evident, so I wasn't too worried about interpreting them incorrectly here. There's been some discussion of some of the peripheral issues on Tom Tango's blog but nothing really about the content at this point.

 
Matt Swartz
(24824)
Comment rating: -

Oh, that's specifically because fly balls and pop-ups are outs more often than ground balls. About 23-24% of ground balls go for hits, while about 17-18% of fly balls and about 2% of pop-ups do. FB pitchers have fewer hits on balls in play, but those hits do tend to go for extra bases, and they also tend to allow HRs too.

 
Matt Swartz
(24824)
Comment rating: -

They each command the strike zone. Very few walks and plenty of strikeouts. You can be a slow baserunner as a hitter if you have power. You make up for the skill elsewhere. This is a similar thing.

 
Matt Swartz
(24824)
Comment rating: -

Extreme fly ball pitchers are going to struggle with giving up a lot of home runs. They may be solo home runs more often due to the infrequent hits on balls in play, but they will still be home runs. Pitchers who give up a lot of pop-ups are bound to do a little better than pitchers who give up only long flies. Regardless, the pitcher would need to have great K/BB numbers to offset the home run issue. There definitely could be a few. I think guys like Ted Lilly, Scott Baker, Jered Weaver all get away with high fly ball rates.

 
Matt Swartz
(24824)
Comment rating: -

The p-values being low is good. It means that there is virtually no chance that these variables would be so far from zero if those variables had nothing to do with ERA. Specifically, it's very unlikely that the effect of GB% on BABIP is just linear. The curve that you see in the tables (and hopefully the graphs...they're working on it) is significant enough.

 
Matt Swartz
(24824)
Comment rating: -

I'm actually discussing this in the next article which should be out tomorrow. There is actually somewhat of a skill with inducing pop-ups, probably related to movement.

 
Matt Swartz
(24824)
Comment rating: -

Yes, there is a benefit to being a high GB% pitcher that xFIP doesn't calculate. xFIP assumes BABIP skill is equal for all pitchers so it misses pitcher differences in that skill. xFIP's strength is that it knows the direct effect on runs of HR, BB, and K more precisely than SIERA can. The tables aren't a takedown of xFIP. They're just a way of highlighting the effect of ground balls that I have found. The standard deviation of pitcher BABIP skill is about .007, meaning that it's probably about 0.15-0.20 runs per nine innings for the average pitcher. SIERA picks up on this skill pretty well-- and specifically does so because it has a GB^2 term in the equation. The point of the article is to explain why that term came up as it did, and how helpful it is to understanding pitching.

 
Matt Swartz
(24824)
Comment rating: -

What I'm basically saying is that a 1% chance of such an extreme three-year period happening by random chance is still a chance, and that I'd bet this is just a fluke. I'm not saying I don't trust the numbers-- I calculated them myself. I'm saying that there's still a chance that it was a random fluctuation, and that I'm guessing that's what it will end up being. I'm sure that there is some hitter who is hitting over .500 in Tuesday day games, despite how improbable that is, and I don't think he's actually an over .500 hitter. But 1% of hitters will be 2.5 standard deviations above their true talent level in Tuesday day games, which would yield false positives. That's my best guess about what's happening here, but I'm very unsure. I don't think HFA and R/G are correlated at all, given that the numbers have barely moved for 60 years. Also, the R/G has fallen since 10 years ago but risen since 20 years ago, so we're not in a unique period in terms of R/G.

 
Matt Swartz
(24824)
Comment rating: -

Z-score analysis on one-year samples requires a 56% Home-Field Advantage (up from 54%) to show up as statistically significant. There simply are not enough games in one season to confirm a switch to 55-55.5% Home-Field Advantage either way, so you will always discover no effect. It seems that you also may have used the standard deviation of the data, which is not the way to analyze this at all. We don't need sample standard deviation-- this is a binary variable. The standard deviation is sqrt(p*(1-p)/n) in any set of n games with a HFA of p. I think that using 3 years is going to be the best way to look at this, since we're asking about the last 3 years, and it's worth looking at 4 or 5 (like I did in the article) just to check if I'm cherry-picking data. You also would have found a significance in the 8-year period of 2003-2010 had you used the correct standard deviation formula rather than the sample standard deviation.
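A minimal sketch of the calculation described above, using the sqrt(p*(1-p)/n) formula. The 2,430 games per season (30 teams, 162 games each) and the two-sided 95% cutoff of 1.96 are assumptions for illustration, not figures taken from the article.

```python
import math

def binomial_se(p, n):
    """Standard error of an observed win rate over n games when the true rate is p."""
    return math.sqrt(p * (1 - p) / n)

BASELINE_HFA = 0.54       # long-run home winning percentage cited above
GAMES_PER_SEASON = 2430   # assumption: 30 teams x 162 games / 2

# One season: how high must the observed HFA be to clear a two-sided 95% test?
se_one = binomial_se(BASELINE_HFA, GAMES_PER_SEASON)
threshold_one = BASELINE_HFA + 1.96 * se_one

# Three seasons pooled: z-score of an observed 55.5% against the 54% baseline.
se_three = binomial_se(BASELINE_HFA, 3 * GAMES_PER_SEASON)
z_three = (0.555 - BASELINE_HFA) / se_three

print(f"one-season SE: {se_one:.4f}, 95% threshold: {threshold_one:.3f}")
print(f"three-season SE: {se_three:.4f}, z for 55.5%: {z_three:.2f}")
```

Under these assumptions the one-season threshold lands near 56%, which is why a single season cannot confirm a shift of a point or so, while pooling three seasons shrinks the standard error enough for 55.5% to stand out.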

 
Matt Swartz
(24824)
Comment rating: -

Hmm...I know that I tried to look at whether hitters' or pitchers' parks would trend towards larger or smaller HFAs, but found virtually no difference though maybe a small tendency towards pitchers' parks. I'm not sure that I even thought about checking extreme vs. neutral parks...worth looking into also, thanks.

 
Matt Swartz
(24824)
Comment rating: -

You know, I was looking for this effect a year or so ago, and it seemed like rosters were just built very poorly for their parks. But maybe that is starting to change more. That's probably just as likely as the quirky stadium effect, come to think of it. Worth looking at-- I just need to think about methodology because that's a tricky question to ask the data.

 
Matt Swartz
(24824)
Comment rating: -

Maybe, but this seemed to be an improvement across the board, so I doubt it. It seems like a lot to be explained by cheating anyway-- even if every team stole signs and swapped out balls every single game they played, would it really change the outcome of 1.5% of games? Now suppose that only half the teams did it and only half the time-- still a lot of cheating-- would that flip 6% of those games from losses to wins?
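The arithmetic behind the 6% figure can be sketched as follows. The variable names are illustrative, and the reading that "half the teams, half the time" covers a quarter of all games is an assumption about the scenario.

```python
# Share of all games that would have to flip (the 1.5% figure from the comment).
overall_flip_rate = 0.015

# Hypothetical scenario from the comment: half the teams cheat, half the time.
share_of_teams = 0.5
share_of_games = 0.5
games_affected = share_of_teams * share_of_games  # a quarter of all games

# Flip rate needed within the affected games to produce the overall effect.
required_flip_rate = overall_flip_rate / games_affected
print(f"{required_flip_rate:.1%} of affected games would have to flip")
```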

 
Matt Swartz
(24824)
Comment rating: -

Yeah 1978-80 at 55.2% was as close as it got, but no three-year period had 55.5%, so it's slightly unprecedented. And it is significantly different from the 1950-2010 average at the 99% confidence level. It's just that it still could be a coincidence, and I think that's still the most likely explanation. I think after 2011 and 2012, we'll have a better sense of this.

 
Matt Swartz
(24824)
Comment rating: -

Yeah, I agree that it's not a very relevant correlation given the sample size. I only thought to look at it because of what I learned about the ratio of triples/doubles for home teams, and how stadium quirks would seem to exacerbate those types of difficulties on the road.

 
Matt Swartz
(24824)
Comment rating: -

Okay, so firstly, I'm 100% sure that you are right that the better the performance of a player's teammates, the more likely a similar performance will lead to an MVP Award. I'm also 100% sure that the luckier your teammates were, the better their numbers were. However, I'm not asking that question. What we do know is that luck between teammates should not be correlated at all. If we define luck as we generally do-- randomness-- then we should not expect a correlation between teammates' luck. If Albert Pujols flips a coin that comes up heads and then gives the coin to Matt Holliday, his coin should still have a 50/50 shot of coming up heads. If Pujols' true talent level is a .400 OBP guy and Holliday is a .380 OBP guy, then Pujols getting on base more than 40% of the time should not affect Holliday's likelihood of getting on base more than 38% of the time (excluding things like IBBs).
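A small simulation can illustrate the coin-flip point. This is only a sketch: the 600 plate appearances per season, the 2,000 simulated seasons, and the fixed seed are assumptions, and each plate appearance is treated as an independent draw at the stated true-talent OBP.

```python
import random

def simulate_luck(true_obp, plate_appearances, rng):
    """Return observed OBP minus true OBP for one simulated season."""
    times_on_base = sum(rng.random() < true_obp for _ in range(plate_appearances))
    return times_on_base / plate_appearances - true_obp

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
SEASONS = 2000          # assumption: number of simulated seasons
PA = 600                # assumption: plate appearances per season

pujols_luck = [simulate_luck(0.400, PA, rng) for _ in range(SEASONS)]
holliday_luck = [simulate_luck(0.380, PA, rng) for _ in range(SEASONS)]

luck_correlation = pearson(pujols_luck, holliday_luck)
print(f"correlation of teammates' luck: {luck_correlation:+.3f}")
```

Because the two players' outcomes are drawn independently, the correlation should hover near zero, differing from it only by sampling noise.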

 
Matt Swartz
(24824)
Comment rating: -

I'm not sure I understand what I would change. Is your belief that the luck is correlated between players on a team? If so, I wouldn't call that luck. Maybe you mean that the things that tend to cause unsustainable jumps in performance are often common among players on a team? Maybe so, but it would be tough to quantify that I think, but I'm open to suggestions. Thanks.

 
Matt Swartz
(24824)
Comment rating: -

I'm actually inclined to include Bonds, because it's not that surprising that one player every couple decades will exceed their competition by such a large margin. I think you want to include outliers here. Though for the curious, the WARP change would now be from 5.0, 7.5, 6.1 WARP, and .299, .327, .310 TAv. So it would still be about the same with a little more of a pre-MVP to MVP jump and a softer MVP to post-MVP decline. Bonds' 2005 injuries caused a 10.2-win drop in production that was atypical, but his strong 2000 performance made the average improvement in MVP season look smaller. Still, I think we should include Bonds. Sometimes MVPs are won by Ruths and Bonds and sometimes by Vottos and Hamiltons.

 
Matt Swartz
(24824)
Comment rating: -

D'oh. Fortunately, that mistake was made copying from Excel into Word, so the averages were done using Santana's correct statistics. His 2003-05 numbers were: 3.4, 8.6, 5.9 WARP; 3.07, 2.61, 2.87 ERA; 110.3, 228, 231.7 IP. Thanks.

 
Matt Swartz
(24824)
Comment rating: -

Unless the market turned in the other direction. The average value of a win from a player with more than six years of service in 2009 and 2010 is already built into MORP, so the fact that outfielders were particularly cheap and other players were more expensive might not have been obvious. I'm not sure that there was conclusive information about which direction the market would go, and the fact that prices fell doesn't necessarily imply that the process was bad. Keep in mind also that Ibanez's performance might seem to have been a shock to us, but for people inside an individual baseball organization, they are privy to a lot more information. Given the familiarity of the Phillies with the Mariners' system, it wouldn't be surprising if they felt they had enough information to know he would beat his PECOTA projection. That he did beat it isn't necessarily conclusive about the process any more than knowing that prices fell is conclusive about the process in the opposite direction, but it's another thing to take into account. The net effect is that Ibanez produced more per dollar than most dollars spent on free agents over the last two years, even if you imagine the entire contract being spent on the previous two years.

Nov 19, 2010 10:58 AM on Philadelphia Phillies
 
Matt Swartz
(24824)
Comment rating: -

Sorry-- the glossary definition is old. The descriptions came here: http://www.baseballprospectus.com/article.php?articleid=10629 and here: http://www.baseballprospectus.com/article.php?articleid=10642 The numbers for valuing draft picks were later put at $13MM for signing your own Type A, $8MM for surrendering your first round pick to sign someone else's, and $5MM for surrendering your second round pick to sign someone else's. Otherwise, the numbers in that article should be correct.

Nov 19, 2010 10:26 AM on Philadelphia Phillies
 
Matt Swartz
(24824)
Comment rating: -

Ibanez had 4.8 WARP in 2009 and 3.6 WARP in 2010. That is worth about $41.5MM right there. He cost them $31.5MM for the three-year deal plus a draft pick worth about $8MM. The deal is already about a wash if he is replacement level this year. Most multi-year deals are net gains early on and net losses later on.
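The "about a wash" conclusion can be checked with the figures quoted above. This is a sketch only: the dollars-per-win number it prints is simply implied by dividing the stated value by the stated WARP, and is not an official MORP rate.

```python
# Figures quoted in the comment above.
warp_2009, warp_2010 = 4.8, 3.6
production_value_mm = 41.5  # stated value of those two seasons, in $MM
contract_cost_mm = 31.5     # three-year deal
draft_pick_cost_mm = 8.0    # surrendered draft pick

total_warp = warp_2009 + warp_2010
implied_dollars_per_win = production_value_mm / total_warp  # derived, not quoted
total_cost_mm = contract_cost_mm + draft_pick_cost_mm
surplus_mm = production_value_mm - total_cost_mm

print(f"implied $/win: {implied_dollars_per_win:.2f}MM")
print(f"total cost: {total_cost_mm:.1f}MM, surplus so far: {surplus_mm:.1f}MM")
```

With two years of production already covering the contract plus the pick, a replacement-level third season adds nothing and costs nothing beyond what is already counted.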

Nov 19, 2010 6:53 AM on Philadelphia Phillies
 
Matt Swartz
(24824)
Comment rating: -

Wow, I didn't even think of relating that article to this. That is definitely worth checking- thanks!

 
Matt Swartz
(24824)
Comment rating: -

This is definitely vindication of that method, and even more so because the Hit List tries to adjust for strength of schedule. The Orioles do have to play in the AL East every year, so even if 3rd Order Record overestimates their future performance, it more correctly states their value.

 
Matt Swartz
(24824)
Comment rating: -

SIERA does not account for league differences, so that's part of the reason why I'm inclined to say Lee is better than Lincecum. At the same time, Lincecum facing pitchers 2x per game probably wouldn't have an overwhelming impact. Some of the league difference in skill level comes from BABIP differences both on offense and defense, so I'm inclined to say that maybe a quarter of a run difference in skill level might be reflected in peripherals, but that could be off. Even still, saying Lee is better than Lincecum is not a statement I could make with 100% certainty, so perhaps I should dial that back a bit. In my opinion, I'd certainly pick Lee over Lincecum in terms of skill level but both of them are fantastic. Tonight should be a treat!

 
Matt Swartz
(24824)
Comment rating: -

I discussed Blanton's 2010 season at length here: http://www.baseballprospectus.com/article.php?articleid=11954 Firstly, I would strongly advise you not to use line drive rate to measure pitcher performance. The year-to-year correlation of a pitcher's line drive rate relative to his team is 0.003. Strikeout rate is the most persistent pitcher statistic, which is why it was not surprising to see Blanton maintain most of his strikeout rate this year. Also worth noting is that Blanton had a 3.48 ERA in the second half with a 20% K-rate, correcting some of the flaws he had in early 2010 with being hard hit. My argument for Blanton not being the kind of pitcher to pitch behind his SIERA is not simply the career history of doing so. It is also that he is a very normal pitcher. He doesn't have a knuckleball or anything that confuses these things. He doesn't have a bizarre pop-up inducing tendency. He doesn't have major splits within bases empty and men on. He doesn't throw very fast or very slow. He throws an average fastball at 89-90mph, mixing in primarily a changeup and a slider with an occasional curve or cutter. He's got years of matching peripherals and now has been able to induce more whiffs than he did in Oakland. He's a pretty classic case of being hit a little bit harder and more balls falling in, but all in a non-repeatable way. And it hasn't been repeating since the All-Star Break, when I started chirping that he was going to improve. (Now cue jinxed Blanton having a small-sample size implosion in 4.5 hours...)

 
Matt Swartz
(24824)
Comment rating: -

Bumgarner had a 3.88 SIERA this year, and Blanton had a 4.01 SIERA this year. Blanton walked 5% of hitters, Bumgarner walked 5% of hitters. Blanton struck out 18% of hitters, Bumgarner struck out 18% of hitters. Bumgarner had a 46% ground ball rate to Blanton's 43% ground ball rate. Their talent level is very similar right now. The point of SIERA is to isolate skill level from luck, and I have generally explained the good luck for pitchers with ERAs below their SIERAs and bad luck for pitchers above their SIERAs. In cases where I did not believe that was appropriate, I highlighted pitchers like Matt Cain who seems to have a unique skill at beating his SIERA. I'm not convinced Bumgarner has that. He will probably improve going forward, but that doesn't make him a decisively better pitcher in today's game. Both are very good 4th starters, so "one of the best fourth starters" would be equating his skill level with Bumgarner's, generally acknowledged as a good fourth starter.

 
Matt Swartz
(24824)
Comment rating: -

Check Blanton's SIERA the past two years (3.92, 4.05) and let me know if you think there is something about Blanton's pitching that makes him unable to match his SIERA. I don't see any reason to think he's something other than a 4.00-4.10 ERA pitcher when it comes to true talent level. Keep in mind his ERA last year was right on with his SIERA, as it has been in years past. Bumgarner is a great fourth starter as well. Both are leagues better than the majority of fourth starters in the league. How many other teams have four starters with talent levels of 4.10 or better?

 
Matt Swartz
(24824)
Comment rating: -

The indented paragraph is the re-printed summary of what I wrote for each pitcher's first start. The analysis of the season/career of the pitchers still holds, and it seems silly to rewrite the exact same thing, though maybe I'll eliminate the references to LDS opponents going forward if it's confusing.

 
Matt Swartz
(24824)
Comment rating: -

I meant that he gets two games allotted to him. If the series ends first, it won't happen of course. The Rangers still need at least 2 wins from non-Lee pitchers.

 
Matt Swartz
(24824)
Comment rating: -

In the sense that it affects sequencing, sure. Tom Glavine notoriously bested estimators like SIERA in his career because of how he sequenced-- specifically, attacking hitters with bases empty and pitching around them more with men on. The result was more of his opponents' OBP came when men were on (and SLG was relatively more important), and more of his opponents' SLG came when bases were empty (and OBP was relatively more important). His UBB rate went up from 6.2 to 8.9 percent with men on, while his HR rate went down from 2.3 to 1.5 percent. As a result, you get things like:
2005: ERA 3.53, SIERA 4.72
2006: ERA 3.82, SIERA 4.37
2007: ERA 4.45, SIERA 5.20
That's not quite what most people mean by mental makeup, but it's mental in the sense that he was an excellent strategist with good control. In terms of "bearing down" which is what most people mean by mental makeup, I think that it could work the other way-- certain pitchers with anxiety issues may perform worse with men on and have ERAs worse than their SIERAs, though I suspect people with that tendency might not have buckled down when they needed to pitch in front of scouts the first time anyway. I also think that people who don't focus with bases empty probably would have trouble making the big leagues too-- so they might have a tendency to have ERAs lower than their SIERAs but both would be too high in the first place. I think people imagine pitchers and hitters out there with intense concentration at crucial moments in the game, but I tend to think that's just when fans concentrate really hard and notice the concentration of a pitcher or hitter that was concentrating hard the whole time.

 
Matt Swartz
(24824)
Comment rating: -

SIERA does not factor in run support at all. It focuses on what a pitcher's ERA should be if he had neutral luck and played in a neutral park with average defense. To your question about best rotations in the playoffs, the Phillies' rotation is better than the Giants' in terms of SIERA, with three of the top six SIERAs in the playoffs (among starters). Even if you adjust for Cain constantly being able to beat his SIERA, it would still be in the Phillies' favor as Sanchez is a good deal behind Oswalt and Hamels. Keep in mind Blanton as a 4th starter is much better than Kendrick-- his SIERA this year was actually just 4.01. Still, I think it's probably safe to say that the Giants have the second best rotation in the playoffs, though the Braves and Rangers both have pretty solid ones too.

 
Matt Swartz
(24824)
Comment rating: -

I'm guessing that the park adjustments would vary a lot from pitcher to pitcher, though, because different pitchers throw in different regions with different angles more than others. A guy with a sweeping slider might be particularly tough to evaluate. On top of that, my home-field advantage studies found that the strike zone was where large home/road differentials existed, with BBs and Ks being big areas of home-field advantage. I would think adjusting pitcher to pitcher could be tricky. All in all, I think doing it without a scientific calculation like pitchF/X is going to be problematic. I think it's a great approach to baseball and the results above show it does have some value even with all its problems, but the granularity of noisy information makes it tough to really approach scientifically and learn something about.

Oct 01, 2010 9:18 AM on Pitch Data and Walks
 
Matt Swartz
(24824)
Comment rating: -

Hmm...I will look into this tonight when I have access to my data again. The first-pitch strikes was about the year before though-- the high walk year-- so the pitchers who had thrown more first pitch strikes but had higher walk rates anyway improved their walk rates whether or not they had more first pitch strikes the second (lower BB) year.

Oct 01, 2010 7:36 AM on Pitch Data and Walks
 
Matt Swartz
(24824)
Comment rating: -

You completely misquoted him. DL asked Burns that question while following up with a comment on whether it was polarizing. It's clear from Burns' response that he meant that it mirrors society a lot (which is his whole point anyway), but that the polarization of society is not quite mirrored in baseball. Separately-- The issue of subsidizing stadiums is whether it generates enough money to give the government more money to spend, so you're not focusing on the right issue there. Also, the value of seats is not a moral issue, and shouldn't be thrown in either. A price ceiling on tickets would just create a black market where scalpers would be the beneficiaries of baseball teams' and players' hard work.

Sep 29, 2010 2:31 PM on Ken Burns
 
Matt Swartz
(24824)
Comment rating: -

If you read J.C. Bradbury's article on aging, he found that strikeouts peak so early that having a normal aging curve wouldn't have been appropriate. The correlation between SO/PA and the other variables is very high, which is why I needed a regression in the first place, to see which effect was driving it. They aren't collinear or anything, just highly correlated, so it's appropriate to run the regression with both K% and the correlated BIS variables like swinging strike rate.

 
Matt Swartz
(24824)
Comment rating: -

Weird, I somehow mistyped the tables! The one labeled Swing% is O-Swing%, the one labeled O-Swing% is Z-Swing%, and the one labeled Z-Swing% is Swing%. O-Swing% was weakly statistically significant.

 
Matt Swartz
(24824)
Comment rating: -

I think the last correlation you found is a result of selection bias. There are pitchers who are good at getting ground balls, pitchers who are good at striking guys out, pitchers who are good at both, and pitchers who are good at neither. Halladay is good at both. Hamels is good at Ks. Hudson is good at GBs. The guy who is good at neither is in AAA. Hence you get negative correlation. I'm not sure that high strikes are the best explanation, because when I ran my regression in the spring in the "Why SIERA Doesn't Throw the BABIP Out with the Bathwater" article, I found that even controlling for ground ball rate, high-K pitchers had lower BABIPs. Controlling for Ks, high ground ball pitchers had higher BABIPs. I think both are distinct effects. One is about the types of batted balls most likely to go for hits (GBs vs. FBs) and the other is about batters' decisions with how hard they swing, etc.
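The selection-bias argument can be illustrated with a toy simulation. Everything here is hypothetical: the two skills are drawn as independent standard normal values, and the rule that a pitcher stays in the majors only if he is above average at one skill or the other is an assumption standing in for "the guy who is good at neither is in AAA."

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable
POOL_SIZE = 20000       # assumption: size of the hypothetical pitcher pool

# Hypothetical, independent ground-ball and strikeout skills (in SD units).
pool = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(POOL_SIZE)]

# Assumed selection rule: good at ground balls, strikeouts, or both.
majors = [(gb, k) for gb, k in pool if gb > 0 or k > 0]

corr_pool = pearson([gb for gb, _ in pool], [k for _, k in pool])
corr_majors = pearson([gb for gb, _ in majors], [k for _, k in majors])

print(f"whole pool:   {corr_pool:+.3f}")
print(f"majors only:  {corr_majors:+.3f}")
```

Even though the skills are unrelated in the full pool, dropping the pitchers who are below average at both leaves a clearly negative correlation among the survivors.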

 
Matt Swartz
(24824)
Comment rating: -

I definitely have found the BABIP vs. SO% inverse correlation and written about it some before. I would suspect that is stronger than the swinging strike vs. BABIP correlation, but it's worth checking at some point when I can merge all that data in one set. Good idea, thanks.

 
Matt Swartz
(24824)
Comment rating: -

It's true that we don't have the balls in play data, but it's pretty clear Ted Williams was lucky that year because he hit .344 the year before and .356 the year after. So either he (a) became ridiculously more talented than he already was for one year and lost that talent immediately afterwards, (b) faced ridiculously poor pitching/defense that one year in such a way that his Three True Outcomes numbers didn't move, (c) figured out something amazing about pitchers that one year that did not affect his Three True Outcomes numbers at all and which pitchers immediately fixed the next year, or (d) was a little lucky that year. So with a little bit of a cop-out, I'm gonna say he was lucky, but yeah, no batted ball data means no E-BABIP.

 
Matt Swartz
(24824)
Comment rating: -

Well, the .367 BABIP is at least partly luck. His minor league BABIPs weren't that high, but were all pretty consistently above average. His K and BB numbers certainly don't indicate he controls the strike zone all that well, and he only has a few HR, so this isn't a guy who really drives the ball. What he does have is a slight downward plane to his swing which is giving him a 50%-ish ground ball rate and few infield flies. That's why he's above average at BABIP in terms of skill level, especially when mixed with his speed. I'd say he's probably a .315-BABIP kind of guy, but I haven't pushed him through the model yet to see what it would say.

 
Matt Swartz
(24824)
Comment rating: -

With Young, I actually had him projected at .316 BABIP for 2009, so .313 seems about spot on. I'd need to look at his numbers more carefully to say what has changed, though, and it would take some time to calculate his E-BABIP for 2011. I had to plug the other guys in individually since I didn't have a whole spreadsheet put together. I'd guess around the same area though. His line drive BABIP was certainly high in previous years, which usually suggests regression though. I'm guessing that's a lot of what happened.

 
Matt Swartz
(24824)
Comment rating: -

With random variance formula-- it is highest when p is closest to .5, which means ground balls (.456) will have a HIGHER random variance than line drives (.189). It really depends on the relative noise to skill. For instance, home run rate is further from .5 than BABIP is for hitters, but it has a much higher year-to-year correlation because the size of the home run skill across major leaguers varies so much more than the variance of the BABIP skill.

 
Matt Swartz
(24824)
Comment rating: -

I'm not really sure it has to do with sample size, because it's still the same sample of balls in play. In fact, the standard deviation is lower for a low percentage outcome. If ground balls are 45%, then their standard deviation over 500 balls in play would be 2.22%, but with line drives at 19%, their standard deviation is 1.75%. It's about the relative skill level standard deviation being much smaller. There are guys who hit 55% ground balls regularly, but there isn't anyone who hits 29% line drives regularly. I wouldn't even think it's measurement error because that should actually make unadjusted line drive numbers more persistent because the bias is more persistent. When it comes to line drives for pitchers, they look a small bit persistent only before you look at line drive rate relative to team and then persistence falls to 0. The only reason line drives appear to happen more often is that scorer bias is persistent. The reason I think skill level with line drives is smaller has to do with swings. If you're good at putting your bat on the ball, you'll have more batted balls, but I'm not sure you center it that much more regularly. However, if you swing with an uppercut or a downward plane, that is going to regularly come off the bat at a different trajectory.
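A short sketch of the noise-versus-skill comparison. The binomial standard deviations use the 45% and 19% rates over 500 balls in play from the comment; the skill spreads (GB_SKILL_SD, LD_SKILL_SD) are hypothetical values chosen only to show the mechanism, and the persistence formula assumes the same true skill in both years with independent noise.

```python
import math

def binomial_sd(p, n):
    """Random (binomial) standard deviation of an observed rate over n trials."""
    return math.sqrt(p * (1 - p) / n)

def expected_persistence(skill_sd, noise_sd):
    """Share of observed variance due to skill; approximates year-to-year correlation."""
    return skill_sd ** 2 / (skill_sd ** 2 + noise_sd ** 2)

BALLS_IN_PLAY = 500
gb_noise = binomial_sd(0.45, BALLS_IN_PLAY)
ld_noise = binomial_sd(0.19, BALLS_IN_PLAY)

# Hypothetical spreads of true talent across hitters (assumptions, not measured).
GB_SKILL_SD = 0.05
LD_SKILL_SD = 0.01

gb_persistence = expected_persistence(GB_SKILL_SD, gb_noise)
ld_persistence = expected_persistence(LD_SKILL_SD, ld_noise)

print(f"GB noise SD: {gb_noise:.2%}, LD noise SD: {ld_noise:.2%}")
print(f"GB persistence: {gb_persistence:.2f}, LD persistence: {ld_persistence:.2f}")
```

Line drives carry less random noise than ground balls, yet they come out less persistent here because the assumed spread in skill is so much narrower relative to that noise.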

 
Matt Swartz
(24824)
Comment rating: -

This is somewhat new for him this year. His career rate of ground balls going to the outfield is just 16.2%, near the league average of 15.3%. In 2010, it jumped to 20.0%. Is he hitting ground balls a lot differently this year perhaps?

 
Matt Swartz
(24824)
Comment rating: -

It is random from the pitcher's point of view, not the hitter's. Hitters with more power regularly hit the balls hit to the outfield over the wall, but pitchers don't. After adjusting for park, the correlation between a pitcher's percentage of outfield fly balls hit for home runs one year with the same percentage the next year is only .08. It's almost entirely random.

 
Matt Swartz
(24824)
Comment rating: -

Oops. Yes, you're right. Thanks.

 
Matt Swartz
(24824)
Comment rating: -

This is a very intriguing hypothesis, and I don't think the CERA studies have much to say on whether catchers are better matches for certain pitchers, just that catchers don't tend to be repeatedly much better or worse than each other. That doesn't really comment on catchers sucking at calling pitches for a certain pitcher. There is abundant evidence that Shields is pitching worse when Jaso is catching. K% down 2%, BB% up 1.5%, XBH% up about 5.5% while singles are only up like 1% or so, definitely suggesting that he is being hit harder and throwing worse with Jaso behind the plate. My question is why? Is Jaso calling games very differently? Is it possible Jaso is just getting more playing time against better hitting teams? If it's game-calling, what is Jaso doing wrong? That's the question I'd like to see answered. I don't have the PitchF/X chops to answer a question like that, but it's a very good question. If he's calling pitches predictably, that could be an obvious effect. Keep in mind that would be another example of high BABIPs not being repeatable but still not being attributable to simply bad luck. There would be an action that was bad-- poor playcalling-- that would be reversed by a team that was paying attention to what they were doing, and therefore there would be low/no correlation in BABIP the next year.

 
Matt Swartz
(24824)
Comment rating: -

Could be, but statistically that gap is just too wide to be very plausible-- two full runs! Also, keep in mind that Hudson still has been lucky and has a WAY more extreme rate of ground balls, making an extreme ground ball BABIP more credible. What type of sink is he putting on the ball to get 2/3 of contact on the ground? Is it making more choppers and fewer one hoppers through the hole? Regardless, Buchholz may be maximizing the value of the Red Sox defense but it won't cut his ERA in half versus what his peripherals suggest, in my opinion.

 
Matt Swartz
(24824)
Comment rating: -

More or less, I think I agree with what you are saying but provided that we define DIPS somewhat loosely. I think that pitchers have SOME control over BABIP, but I tend to think that you can deduce their BABIP tendencies better by looking at their strikeout and fly ball tendencies. But otherwise, yes, that would be how I think about it. I do not think current defensive metrics are good enough to pinpoint the defensive adjustment, but I think that if you approximate the runs above or below average and spread that by innings pitched or something, you can get a good ballpark number. But I do think that the extremes with defense are going to be +/- 0.50 runs or so, and the extreme with parks are going to be about the same, so even a perfect defense in a huge stadium would have less than a run difference between ERA expected and SIERA for a pitcher. For the vast majority of pitchers, the effects will be far smaller than a run. The Red Sox have pretty good defense but a hitter's park, same with the Phillies, so those effects might cancel out. Or at least they won't explain a large portion of the SIERA-ERA difference. For Santana, the park effect is going to be relatively small and I guess the defense is pretty average, so you can probably figure that a 4.18 SIERA in CitiField playing for the Mets defense would maybe be at 4.00ish? Maybe 3.90ish? Certainly not 2.98. Cahill I guess gets a double bonus for park and defense, but that would move him from 4.28 SIERA to maybe 3.5-3.6 expected ERA? Certainly not 2.72 or anything close to it-- there's just too many ground balls being hit at infielders even for the A's. I do think it would be ideal to come up with a way to translate SIERAs into "expected ERAs" or something like that, but defense is not so exact a science that we can trust it perfectly when we get numbers.
I think when Colin Wyers published his runs above average collectively for teams, we can take that and the percentage park factors that Clay does, and then make a noisy adjustment. But that would really just be a guessing game at this stage. Good question, thanks. I'm really liking everyone's questions on this article.

 
Matt Swartz
(24824)
Comment rating: -

Sequencing could definitely be an issue, and I agree that's partly a catcher issue. The league average pop-up rate is about 7.5% of all balls in play. The league average BAs with bases empty and runners on base are .253 and .266, respectively, with OPS of .715 and .755. 37.8% of two-strike counts have ended in Ks so far in 2010.

 
Matt Swartz
(24824)
Comment rating: -

Definitely not a stupid question! I think you just gave me about four article ideas in three sentences! The answer is probably a great question for a scout, but the logical starting point is-- why aren't his swinging strikes happening on two-strike counts as often as other players? Is it a matter of learning to pitch? Is it a matter of having a few good pitches without an obvious putaway pitch? Or the opposite? I don't know. Worth hearing what scouts and regular viewers of Red Sox games have to say. Thanks for the fantastic question.

 
Matt Swartz
(24824)
Comment rating: -

I'd guess it's probably related to just missing on talent level of players, which is certainly a flaw of sabermetrics. As far as OBP and K's, I think that sabermetrics has very carefully approximated how well those things contribute to wins, but might be missing on the OBP and K tendencies of the players themselves.

 
Matt Swartz
(24824)
Comment rating: -

I think with most of these things, the only reason that there would be an issue is if teams that were less sabermetrically inclined were more likely to have the factors you have mentioned. From your wording, I think you realized this, but it should be highlighted. While PECOTA may be conservative on prospects, teams that were sabermetrically inclined would have to be less likely to develop high-ceiling prospects in the first place. Otherwise, it would not affect the correlation. Similarly, good medical staff would need to not only be negatively correlated with good sabermetric staff, but the difference would have to not be reflected in historical injury trends, which PECOTA does adjust for. Synergy would also have to be better understood and better utilized by less sabermetrically inclined teams, which I'm not so sure is true. It's possible though. Defense is included in PECOTA projections in that pitchers on teams that are more likely to have good defense will have lower ERAs (and RAs). The issue would have to be that teams that are sabermetrically inclined would have to be less likely to pick up defensive players that only look good by the numbers-- entirely possible, and a solid plausible example of where the effect above might be seen. As far as the chance of this being random error, it's possible. The correlation is .27, which isn't massive. But considering how many other random factors there are, I would never have expected to get a correlation even that high.

 
Matt Swartz
(24824)
Comment rating: -

This would only affect the correlation between being sabermetric and being good, not the overprojection issue, so it's only a limited effect. I'm still not quite sure that there are obvious examples of teams that were overrated as far as sabermetric usage. Boston really does seem like a good candidate for most sabermetric, you know? The teams that were ranked the highest are, in many cases, also the ones known to have sabermetricians on staff. It's a fair point and it certainly suggests that these conclusions should be seen as qualitative for the most part, but it seems unlikely that it would change the results much. If you have some teams in mind that were possibly badly rated, though, please let me know.

 
Matt Swartz
(24824)
Comment rating: -

Wow, this is a great post. I need to think about these questions. I don't have immediate answers, but this should definitely be filed as things to look at more deeply in the future. The managers question is awesome. The bullpen question is somewhat possible too, as the correlation between overprojection and pythagorean record is .22, lower than the overprojection vs. real record correlation of .27, though not by too much. The rest of those are good ones to think about. Thanks.

 
Matt Swartz
(24824)
Comment rating: -

Checked out the correlation from PECOTA and 1st order Pythagorean Record. Still .22, but definitely less than the .27. Good food for thought. I'm not sure people were biased by PECOTA's preferences. I think that it's a tough leap to make that PECOTA liking a team means they are sabermetric. I don't know that people would assume that a high PECOTA projected win total must have come from a secret saber genius at the helm. I could be wrong though. I didn't even fill out a survey because I was afraid I would bias the results.

 
Matt Swartz
(24824)
Comment rating: -

There are some holes in these arguments that I think you're missing. Hear me out...
1) Potential bias ONLY affects the rating of how much saber teams help you, and does NOT affect the primary conclusion of the article at all-- in fact, it would have the opposite bias. It seems like my colleagues were very careful to rate teams highly in cases where they were not the cream of the crop. Although there could be some effect, I'd like to hear some team names first to even address that weaker side-conclusion that the saber teams were better.
2) Huntington was hired by the Pirates before the 2007-08 offseason, meaning 3 of the 5 seasons in question were with a sabermetrically leaning GM. My colleagues ranked them 10th. The Mariners were 5th even though there were only 2 years of Zduriencik at the helm, but they have been so extremely highlighted in the media for being saber leaning that it's not a totally ridiculous ranking. Overall, the Mariners did not significantly differ from their PECOTA projections at all (1.7 games overall) which is pretty much coming entirely from 2010, when PECOTA did, in fact, overrate them.
3) Opponent difficulty is adjusted for in the PECOTA standings and obviously real life standings are affected by it as well. Despite your focus on the AL West, there is actually a POSITIVE correlation between opponent difficulty and sabermetric leanings. Look at the saber-tastic AL East and contrast it with the saber-lacking NL East. The top 6 teams in sabermetric ranking were all in the AL.
4) I'd be interested in this too, and that's a good idea for future research if I want to really get my hands dirty. A starting point could be some of my articles on "The Cost of OPP" and free agent PECOTA bias.
5) This is a pretty minimal effect, I'm guessing. The Indians are the premier example of this, overprojected by 10 games a year, so about 50 games overall. Cliff Lee and CC Sabathia were only worth 6.5 games to the Phillies and Brewers combined. The Yankees add a lot of talent and are actually listed as more sabermetric. Also, there is a slightly positive correlation between being competitive and being saber-leaning, so there is going to be a counter-effect whereby saber teams add talent midseason anyway.
I do appreciate the criticism here, but there are major holes in a lot of these arguments. If you think the authors are wrong about the saber usage of a team, tell me which team it is. A couple teams won't change the conclusion but it is certainly worth hearing.

 
Matt Swartz
(24824)
Comment rating: -

No, because each season is only one observation among an infinite number of paths a team could take in a given year. There is no absolute truth in one season or even five seasons of data. There is still a margin of error.

 
Matt Swartz
(24824)
Comment rating: -

Thanks for looking into the regressions. The issue is that there are too few observations (just 30 teams!) to really interpret a multiple regression in any meaningful way. The correlation was .27 even ignoring the payroll or payroll rank issue, but that's not significant in a regression because of the number of variables. Including raw payroll is a problem because of the little outlier in the Bronx that has such a high deviation from average in both payroll and wins. More than 30 observations would remove issues like that. I did run a couple regressions to get a feel for the data, but didn't report them because of the sample size issue. Nonetheless, thanks for looking into the data.
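To put the sample-size point in numbers, here is a rough sketch of the standard t test for a single correlation coefficient, using the .27 and the 30 teams mentioned above. This is a simpler check than the multiple regressions being discussed, and the function name is made up for illustration.

```python
import math


def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Figures from the comment: a correlation of .27 across 30 teams.
t = correlation_t_stat(0.27, 30)

# Two-sided 5% critical value of Student's t with 30 - 2 = 28 degrees of freedom.
T_CRIT_28_DF = 2.048

print(f"t = {t:.2f}, clears the 5% bar: {abs(t) > T_CRIT_28_DF}")
```

With only 30 observations the statistic lands well short of the usual 5% threshold, and adding more explanatory variables to a regression would only use up more degrees of freedom.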

 
Matt Swartz
(24824)
Comment rating: -

They are all valuable in different ways, and there are also three versions of WAR/WARP. Two main distinctions are important:
1) Measuring outcomes vs. measuring skills
2) Measuring cumulative performance vs. measuring rate of performance
FanGraphs WAR uses FIP, which is like SIERA in that it measures a skill, but unlike SIERA in that it measures how well you did cumulatively instead of as a rate of performance like SIERA (which approximates earned runs per nine innings). BP WARP and Baseball-Reference WAR measure outcomes as a cumulative performance, while ERA+ measures outcomes as a rate of performance. WARP and WAR measure outcomes relative to a replacement player, while ERA+ measures outcomes relative to average. Hope that helps!

Aug 30, 2010 3:54 PM on Part 5
 
Matt Swartz
(24824)
Comment rating: -

A typical newspaper boxscore doesn't record batted ball types, and even FanGraphs boxscores just lump pop ups in with fly balls. That's okay because fly balls and pop ups are always added together in the equation, so you can call all pop ups fly balls and the equation still works. Any game summary on Gameday will include pop ups too.

Aug 30, 2010 3:47 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Okay, Eric's awesome league splits by situation spreadsheet he sent me actually can help see how true this is:

Situation                                     LHB BABIP   RHB BABIP
Bases empty                                   .298        .293
Runners on 2nd & 3rd, with 1st base open      .291        .302
Runner on 2nd, with 1st & 3rd both open       .293        .292
Runner on 3rd, with 1st & 2nd both open       .305        .302
Runner on 1st, with 2nd & 3rd both open       .319        .306
Runners on 1st & 2nd, with 3rd base open      .295        .291
Runners on 1st & 3rd, with 2nd base open      .316        .305
Bases loaded                                  .303        .293

So it's definitely somewhat true from looking at these, I think. Worth noting above is that the BABIP with RISP vs. bases empty is only +.0115 for LHB and only +.0041 for RHB, compared to the men on vs. bases empty .0198 for LHB and .0059 for RHB, so maybe my 27 shiftees up there are doing better with men on 1st especially, somewhat better with other runners on, but not as much of a jump. That makes sense. I don't have their numbers by situation though.
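For anyone who wants to eyeball the lefty/righty gap by situation, here is a quick sketch that takes the splits listed above and ranks the situations by LHB minus RHB BABIP. The shortened situation labels are mine, and the gaps are not weighted by plate appearances, so they won't reproduce the aggregate RISP and men-on figures quoted above.

```python
# BABIP in points (thousandths) as (LHB, RHB), copied from the splits above.
SPLITS = {
    "Bases empty": (298, 293),
    "2nd & 3rd": (291, 302),
    "2nd only": (293, 292),
    "3rd only": (305, 302),
    "1st only": (319, 306),
    "1st & 2nd": (295, 291),
    "1st & 3rd": (316, 305),
    "Bases loaded": (303, 293),
}


def lhb_minus_rhb(splits):
    """Return the LHB - RHB BABIP gap, in points, for each situation."""
    return {situation: lhb - rhb for situation, (lhb, rhb) in splits.items()}


gaps = lhb_minus_rhb(SPLITS)

# Print the situations from most lefty-friendly to most righty-friendly.
for situation, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{situation:<13} {gap:+d}")
```

The largest gap favoring lefties shows up with a runner on first only, and the one situation that clearly favors righties is second and third with first base open.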

 
Matt Swartz
(24824)
Comment rating: -

BTW studes linked to the Greenhouse work below I'm seeing now.

 
Matt Swartz
(24824)
Comment rating: -

I'm not at home so I don't have my copy of The Book to look through with me, but did they control for hitter skill? Could that explain the difference? Because lefties are better hitters overall than righties (I think I read maybe on THT a few years ago that had to do with more righties being 2B, 3B, SS, and C, rather than 1B and OF), maybe they face more same-handed pitching with runners on when managers bring in relievers specifically to get them out. Maybe the league as a whole faces a smaller advantage. Eric sent me all the BABIP numbers above and they matched up with the league numbers on B-R, so it should be accurate. And we do see a 1.5-point jump in BABIP for lefties more than righties with men on. I'm not sure what that means for wOBA, but it's certainly some fraction of that 20. I'm curious what you find. Actually, maybe righties hit better with men on 2nd and 3rd but not first because the 3B and SS can't play as far back? Maybe that makes the overall "runners on" numbers closer but the "first base" numbers much more advantageous to righties. That would make sense in explaining why the shift causes a wider gap because even with runners on 2nd and/or 3rd and first base open, it would still severely limit the shift.

 
Matt Swartz
(24824)
Comment rating: -

Yeah, I've been told that a couple times today, though I really only was focused on the 1993-2010 era. I guess it's a matter of relative use. The righties above in the control group were similar to the lefties in the test group, except for alignment.

 
Matt Swartz
(24824)
Comment rating: -

Yeah, that's why I looked at all lefties and all righties to check the BABIP difference. There was one but it was only 1.5 points difference.

 
Matt Swartz
(24824)
Comment rating: -

LOL. I'd like to thank the academy...

 
Matt Swartz
(24824)
Comment rating: -

Sure, teams could do it anyway, but once you have your first baseman out of position, you need to think about moving the second baseman over to cover some ground, and the overall result is just less optimal-- it's not like the difference in BABIP is 100 points. It's 20 points-- just enough to see the suboptimality of defensive alignments with runners on when there are goals other than minimizing the chance of getting a hit.

 
Matt Swartz
(24824)
Comment rating: -

I think punching ground balls the other way does counteract the effectiveness, but I've read that it's very hard to put breaking balls on the ground the other way so maybe that limits the ability of hitters to do this. Also, forcing hitters to hit the ball the other way might lower their SLG even as it could raise their AVG. I think the overall effectiveness of the shift is somewhat unclear, though I do think this article would suggest it at least works a little. Jeremy Greenhouse has done some interesting work at BaseballAnalysts.com on this issue, might be worth checking out (though I didn't save the link...you can just click on his name on the sidebar and you can see all his articles).

 
Matt Swartz
(24824)
Comment rating: -

Darwinism explains a wide variety of topics in economics-- both biology and economics inform each other well. I had a friend who did both and was able to utilize both fields to understand them both better. However, the survival aspect of Darwinism generally links to the entry/exit of firms condition in economics, which is not something that is relevant in Major League Baseball. Baseball teams do not go out of business, are not forced to dissolve when better competition emerges like restaurants are. However, the general point that teams from bigger markets stand to gain more from investing in winning is true, regardless of whether you frame it as they stand to lose more by not winning. In any case, the general point of the article holds. As long as the return on investment is vastly different in large and small markets, the large market teams are likely to outbid the small market teams unless the cost of investment is lower for the small market teams.

 
Matt Swartz
(24824)
Comment rating: -

I agree with your point that the decision to spend money on a team is one of investment, not of charity, and based on building a customer relationship. However, I was assuming that the value of building customer relationships in New York City is higher both because of the sheer number of customers and because of their wealth, so the price of talent-- which builds that fan relationship-- is driven up by the big-city and/or playoff-bubble franchises with the most return on that investment. So I'm guessing that not participating in the auctions for players is probably rational for a small-market non-playoff-bubble team. I think you and I agree, basically, in principle though and maybe my language of "for the sake of making fans happy" was a bit of an exaggeration of the general fan sentiment of calling their owners "cheap."

 
Matt Swartz
(24824)
Comment rating: -

This is all true, but it comes down to asking what the goal is. Do you want the Pirates to neglect being less terrible for the next few years so that they can be championship contenders after that, or do you want the Pirates to put a team on the field that is going to give the Cardinals and Reds a run for their money in pennant-deciding games? I'm pretty sure baseball might care more about the latter than the former. But if not, sure, go ahead and subsidize development cost too. That's why I said "a number of ways" in the article-- I was hedging on that one!

 
Matt Swartz
(24824)
Comment rating: -

Excuse me? I also fail to understand your point, Peter. I'm not sure what payrolls having a "complete lock on team income" means. I think it's also pretty clear that you seem to have come to the same conclusion as everyone else-- that building a winner is more profitable in large markets-- and then you seem to think I don't think that, despite the fact that the crux of my argument was precisely that...Oh, and I do own The Wealth of Nations. It was required text in the intro courses to my Ph.D. in Economics. I'm not sure I read anything in there about anything you said though. Why don't you try to see what you disagree with in my article, and we can discuss?

 
Matt Swartz
(24824)
Comment rating: -

Yeah, something exactly like that. MLB contributes towards payroll at the low end of the payroll distribution and subsidizes it, while taxing at the high end. The "number of ways to implement" was just to say that anything that involves lowering the marginal cost below the actual cost of the salaries for low salaries is good. The reason it's okay to be effectively paying everybody $15 million is that a lot of these teams are receiving that money now anyway but not raising salary. You get the money from taxing the high end of the curve and from things like MLBAM. There could also be a phase-in payroll subsidy that gradually becomes a payroll tax, too.

 
Matt Swartz
(24824)
Comment rating: -

Well, a 1-win player at the beginning of the season increases your playoff odds by about 6% so it would only be $1.8MM in expected value (6% of $30MM). However, there are two other factors. Firstly, there is also a dollar value of winning games independent of the playoff effect, and that probably makes up half (maybe?) of the value. Not only that, but the $30MM as Gennaro describes in the interview is for an average market-size team (he cites Cincinnati). Keep in mind that the team that offers that free agent the most is probably going to have a higher than average value for him-- often free agent dollars are spent most by teams from New York, Los Angeles, Philadelphia, Chicago-- all bigger markets than Cincinnati and probably with a greater than $30MM value if I understand him correctly. I'm just playing with the numbers, but for an average team, the $30MM sounds reasonable.
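The arithmetic above can be sketched in a few lines. The 6% and $30MM come straight from the comment; the 50% share is the tentative "half (maybe?)" guess, and the constant names are just for illustration.

```python
PLAYOFF_VALUE_MM = 30.0        # value of a playoff berth for an average-market team
PLAYOFF_ODDS_PER_WIN = 0.06    # playoff odds added by one win at the start of the season
NON_PLAYOFF_SHARE = 0.5        # guessed share of a win's value unrelated to the playoffs

# Expected playoff revenue from adding one win.
playoff_ev_mm = PLAYOFF_ODDS_PER_WIN * PLAYOFF_VALUE_MM

# If the playoff piece is only half of the total, the whole win is worth twice that.
total_value_mm = playoff_ev_mm / (1 - NON_PLAYOFF_SHARE)

print(f"playoff piece: ${playoff_ev_mm:.1f}MM, total per win: ${total_value_mm:.1f}MM")
```

A bigger-market bidder would plug in a playoff value above $30MM, which pushes both numbers up.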

 
Matt Swartz
(24824)
Comment rating: -

I do not have inside knowledge of packages but I explained my reasoning in several articles, including this one, why the value is likely much higher. There were rumors that their asking price was through the roof, perhaps explaining why they did not complete a trade. I moved to DC a year ago, so I didn't bus down, and I wasn't killing any kegs; I had a pretty good hot dog, though. I was treated civilly by fans that I treated civilly as well. The vast majority of fans in any city treat you well enough in their stadium if you behave yourself, and that's certainly true for DC and Philly.

Aug 07, 2010 3:49 PM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

Fair enough, but how exactly would I go about proving that? I'd need to try and capture a market that was different every summer and every winter based on a bunch of different factors. I could see why it might be hard to trade position players whose value comes from their place on the defensive spectrum, but probably not 1B/DH types.

Aug 06, 2010 6:17 PM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

I don't like the return for Oswalt + Berkman, but I just think the demand wasn't there right now because they were expensive, Berkman was overperforming, and Oswalt had a no trade clause. I think the economy and interest rates made it a bad seller's market, but also made holding expensive players really tough on sellers. I did think Happ wasn't the best match for them.

Aug 06, 2010 6:15 PM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

You aren't linking the PECOTA projections. You're linking the staff predictions. Mine, incidentally, had Cincinnati in 2nd, so that's not really a fair analogy. As I mentioned above, the Padres clearly tried something innovative that worked. And they still aren't a guarantee to make the playoffs, and they were able to spend some money midseason to correct things when they realized they were better than they thought. It's not a matter of giving up. You still play the games. You just allocate resources differently-- as in, you invest in 2012 when your odds of competing are three times as high, even if they are not zero in 2010. Also, Dunn is a free agent after 2010 so his contract for 2011 would have to be a new deal, and wouldn't be a bargain. So, considering they'd be paying full price for him, wouldn't it make more sense to do so when they already know they're competitive (or to trade for a first baseman midseason if they are suddenly in the thick of the race in July 2011)? Allocating resources is a big deal. I'm not saying the Brewers and Nats won't be good this year. I was wrong about the Padres this year. I'm saying that teams that have a lower percent chance of competing shouldn't spend money on winning now when that money could be used towards later. Similarly, they shouldn't hoard talent now at the expense of trading for talent that can help them later. Washington is a pretty big and wealthy city, and the Nats are going to be competitive if they are run wisely. A sustained period of success could come to them if they build the groundwork beneath them.

Aug 06, 2010 2:10 PM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

I'll say this. I'm a Phillies fan myself, and I watched through the late 90s and felt the same way you describe until 2008, when I was at (both parts of) Game 5, and realized every loss endured while the team was bad was worth it if it made that championship happen. I'd stay open to the possibility of changing your mind. That said, I never had the experience my cousin did: http://sportsillustrated.cnn.com/vault/article/magazine/MAG1161294/index.htm (btw, I'm the "number-crunching cousin" in that article).

Aug 06, 2010 12:11 PM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

I was trying to reply to a few of his posts at once, and he mentioned a couple times that he didn't like my snarky comment about the Nats fans at Saturday's game. Sorry that wasn't clearer.

Aug 06, 2010 11:50 AM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

I understand and very much appreciate you clarifying. I shouldn't make unnecessary snarky remarks like the Dunn comment. If you were sitting in my section that day, you would probably agree those fans knew far less about baseball than you certainly do.

Aug 06, 2010 11:28 AM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

BTW if you go to the first link in the article about buyers and sellers, I make a pretty detailed case on how to determine where the buy/sell line should be, and how large the range of prices which would work is. The teams with 1% or less change in championship odds are clearly sellers and the teams with 3% or more change are clearly buyers, and the 1-3% range in the middle is pretty small and those teams clearly are incentivized strongly enough to make major moves by either price.
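As a rough sketch of that rule of thumb, here is the 1% / 3% cutoff written out. The function name and the "either" label for the middle range are mine, and the sample inputs are hypothetical rather than taken from the article.

```python
SELLER_CUTOFF = 0.01   # 1% or less change in championship odds: clearly a seller
BUYER_CUTOFF = 0.03    # 3% or more change in championship odds: clearly a buyer


def classify(change_in_championship_odds: float) -> str:
    """Label a team by how much an acquisition would move its championship odds."""
    if change_in_championship_odds <= SELLER_CUTOFF:
        return "seller"
    if change_in_championship_odds >= BUYER_CUTOFF:
        return "buyer"
    return "either"  # the narrow 1-3% band, where the going price decides it


# Hypothetical teams, for illustration only.
for change in (0.005, 0.02, 0.04):
    print(f"{change:.1%} -> {classify(change)}")
```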

Aug 06, 2010 11:27 AM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

The Padres clearly were doing something smart here and were the exception rather than the rule. They still aren't guaranteed to contend, and were able to make midseason improvements anyway rather than dipping into the free agent market. The Padres also played in a division where the Dodgers had not spent any money to maintain their performance, the Rockies were good but not necessarily considered a 90-win team at the time (though I had them very high), and the Giants and Diamondbacks weren't clear contenders. With the Phillies and Braves already looking pretty intimidating for 2011, it's not wise for the Nats to try to beat them now. The Phillies have a bunch of 30-33 year olds on deals that expire soon, and the Braves don't spend much money, leaving the Nats a real window if they aim for winning in a couple years, especially with having such elite players already in the fold and under team control. I just don't think the relative value is high enough now even if there is a 5-10% chance of making the playoffs in 2011.

Aug 06, 2010 10:35 AM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

Actually this is pretty close to what I was going to say for part of my reply, though maybe including the snark about Nats' fans was unnecessary. I generally try not to do that. Anyway, it's not really supply and demand in an intro micro sense. It's an auction market. The price is going to be determined by the next-highest bidder. There were a few 1B/DH types on the market, and more teams that could have used the boost, so the price was going to be determined by how much the next-highest bid would have been-- which is about value for two pennant chases, not one. And that team would likely have at least as good a chance as the Nats in 2011, so they would have more value for Dunn as well. There is not going to be a mathematically rigorous way to talk about these market effects. It's certainly more true with pitchers, though I do think the Johan Santana return is probably a great example of what happens when you trade with one season left. I've seen a few examples over the years where the return was starkly different but I'd need to think about this before I could even devise a method of coming up with a rigorous way to analyze it. After all, establishing prospect rankings would be very tough to do in a non-controversial way.
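The "next-highest bidder" idea is easy to sketch as a toy second-price auction. The team names and valuations below are entirely hypothetical, and real trade talks are messier than this, so treat it as an illustration of the pricing logic only.

```python
def auction_price(valuations: dict[str, float]) -> tuple[str, float]:
    """Return the winning bidder and the price set by the next-highest valuation."""
    ranked = sorted(valuations.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    _, runner_up_value = ranked[1]
    return winner, runner_up_value


# Hypothetical valuations of a 1B/DH type, in arbitrary "prospect value" units.
valuations = {"Team A": 12.0, "Team B": 9.5, "Team C": 7.0}

winner, price = auction_price(valuations)
print(f"{winner} wins and pays about {price}")
```

Note that the winner's own valuation never enters the price. If a second contender values the player for two pennant chases, the price rises even when the eventual winner only cares about one.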

Aug 06, 2010 10:30 AM on The 2010 Trade Deadline
 
Matt Swartz
(24824)
Comment rating: -

Agreed. This is the reason that I didn't totally write off the wins Saunders generates-- it's an evaluation of the market value of the package with an agnostic view about which parts of the package are most important.

 
Matt Swartz
(24824)
Comment rating: -

The problem with that logic is related to my articles on MORP where I determined whether it should be linear. What you're saying would be right if 4-win players were worth more than 2 2-win players, but as I showed in those articles, this is not the case. So, the expected number of wins from players is sufficient here. A high-prob low-ceiling guy can be worth as much as a low-prob high-ceiling guy as long as they are sufficiently high/low probabilities.
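Here is a small sketch of why expected wins are enough under a linear valuation. The probabilities, win totals, and the dollars-per-win figure are all hypothetical numbers picked to make the two prospects come out even.

```python
DOLLARS_PER_WIN_MM = 5.0  # hypothetical linear price of a win, in $MM


def expected_wins(probability: float, wins_if_he_makes_it: float) -> float:
    """Expected wins from a prospect who either pans out or contributes nothing."""
    return probability * wins_if_he_makes_it


# High-probability, low-ceiling guy vs. low-probability, high-ceiling guy.
safe_wins = expected_wins(0.75, 2.0)
risky_wins = expected_wins(0.25, 6.0)

# With a linear price per win, equal expected wins means equal expected value.
safe_value_mm = safe_wins * DOLLARS_PER_WIN_MM
risky_value_mm = risky_wins * DOLLARS_PER_WIN_MM

print(safe_wins, risky_wins, safe_value_mm, risky_value_mm)
```

If 4-win players were worth more than two 2-win players, the value function would be convex and the high-ceiling prospect would come out ahead, which is the case the MORP articles argue against.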

 
Matt Swartz
(24824)
Comment rating: -

Thanks. I agree that most teams are going to be victims of the winner's curse. The solution to the winner's curse is to value the players at a lower level than you estimate them to be, figuring that if you do acquire them, they are probably not as valuable as you thought they were. My belief is that the D'backs didn't do that here, or at least not as much as they should have.
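A quick simulation shows the effect. This is only a sketch under made-up assumptions: every team gets an unbiased but noisy estimate of the same true value, and the team with the highest estimate wins the player. The true value, noise level, and number of bidders are hypothetical.

```python
import random


def average_winner_overestimate(true_value=10.0, noise_sd=2.0, bidders=5,
                                trials=10_000, seed=0):
    """Average amount by which the highest of several noisy estimates exceeds the truth."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each bidder's estimate is right on average, but noisy.
        estimates = [rng.gauss(true_value, noise_sd) for _ in range(bidders)]
        total += max(estimates) - true_value
    return total / trials


overbid = average_winner_overestimate()
solo = average_winner_overestimate(bidders=1)
print(f"five bidders: {overbid:.2f} too high on average; one bidder: {solo:.2f}")
```

Nobody's estimate is biased on its own, yet the winning estimate is too high on average, which is why shading your valuation downward before bidding makes sense.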

 
Matt Swartz
(24824)
Comment rating: -

I guess it's possible splits could provide a clue about aging, though it would actually work the other way in your example, because a player could be platooned once they get below replacement level against same-handed pitching, so they'd lose less value in that sense.

Jul 23, 2010 12:28 PM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

Splits always need to be regressed back to the average split, because they are such a small sample size. I think that a good estimation is that you need 1000 PA on each side to even consider a player's true split skill level to be 50% his split and 50% the league average split. I'm not sure about the 1000 but it's something like that. Regardless, that doesn't matter much because Howard is better at RHP and worse at LHP. It doesn't seem to affect Howard's overall performance in high leverage PA, so it shouldn't make much difference.
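One common way to write that kind of regression is a weight of PA / (PA + 1000), which gives exactly 50/50 at 1000 PA. That functional form is my assumption for this sketch, the 1000 is the rough guess from above, and the split sizes in the example are hypothetical.

```python
def regress_split(observed_split: float, league_split: float, pa: int,
                  pa_for_half_weight: int = 1000) -> float:
    """Shrink an observed platoon split toward the league-average split."""
    weight = pa / (pa + pa_for_half_weight)  # 0 with no data, 0.5 at 1000 PA
    return weight * observed_split + (1 - weight) * league_split


# Hypothetical hitter: a .100 observed split vs. a .020 league-average split.
for pa in (0, 250, 1000, 4000):
    print(pa, round(regress_split(0.100, 0.020, pa), 3))
```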

Jul 23, 2010 11:36 AM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

You're right in a sense. I checked into this issue when I was preparing to create MORP in the offseason, and I found that going into the offseason, teams generally had replacement level players at about 4-6 starting spots, and almost always at least 2 spots. But that's as of November. Teams certainly have fewer "vacancies" now. However, most teams are still pretty close to replacement level in the 4th and 5th SP spots, which is why it's always easier to move a pitcher at the trade deadline than a position player. The Efficient Market Hypothesis economist in me wants to say that teams can trade away players in such a way to add to their roster, but these are high-friction markets, particularly in July, so that may be very unlikely. It's a good point that you make. Some teams are probably not going to be able to upgrade substantially even if it's an interesting question to consider how much they'd gain from an upgrade of 4.5 wins. Thanks for highlighting this.

Jul 23, 2010 11:33 AM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

I think that a big part of the huge boost in expected revenue upon making the playoffs is that there is a good chance even the worst team in the playoffs advances or even plays in the World Series. I'm not sure that the biggest payoff is the first round, per se, but I could be wrong. I guess there would be something more to be said for some of the teams with 20-30% chances of making the playoffs becoming buyers if I'm underestimating the gain from making but being eliminated the first round.

Jul 23, 2010 10:44 AM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

Adam Dunn is underrated and Ryan Howard is overrated. Hopefully, some team is smart enough to offer the Nationals a good haul on Dunn. I think that you're probably underestimating Dunn's relative OBP advantage and underestimating Howard's relative SLG advantage, but the reality is that their offense is very similar in value. The difference in skill is really that Ryan Howard has average defensive skill and the jury is still out on Dunn's defensive skill. Dunn has been better at first this year, but Howard's had a couple average years in a row at first since he lost weight and worked on his agility. As I wrote at the time, Ryan Howard's contract is going to depend on how well he ages and how much salary inflation baseball sees. He's worth the money now, but if he ages like Mo Vaughn instead of Jim Thome, that's going to be what determines if the contract is a disaster. If the economy sputters and salaries hold, though, it could be anyway.

Jul 23, 2010 8:20 AM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

I was basing things largely on Championship Odds Added, so the Red Sox at 4.0% were a little higher than the White Sox at 3.6%. In all honesty, I was updating this throughout the past week, and the White Sox number was growing and the Red Sox number was shrinking. If I did the analysis all in one shot, I probably would have called the White Sox a strong buy. Both are certainly up for debate. I still think the Twins should hold because of the low championship odds, or at least no more than a weak buy. They have a good franchise and a good window to win.

Jul 23, 2010 8:16 AM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

I assumed they were pretty good, but even if they are 5-10% off, the relative change in playoff odds added and championship odds added should be about right.

Jul 23, 2010 7:47 AM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

John, the Twins did have a huge jump in playoff odds added by bringing somebody in, but because their odds are not all that great in the first place, they have a relatively small gain in championship odds. I based most of my decision about what to do on championship odds. The Twins are a good enough franchise that they should be looking at that too, I think. Why sacrifice a championship later for a division title now? Elm, the frame of the article was that adding a player halfway through the season could have almost the same impact as adding him at the beginning. Except for circumstances like the Padres, the full season is going to help more than a little more than a third of the season in expectation. The point is that stars add about 2/3 the value for 1/3 the season for a lot of contenders.

Jul 23, 2010 7:46 AM on Buyers and Sellers
 
Matt Swartz
(24824)
Comment rating: -

My best guess is that the majority of selections were made before the last two weeks, and that sample is very unlikely to affect the 48 players selected. The performance of those players being LOWER in the last two weeks of the first half than in first two weeks of the second half would mean I am likely understating the improvement and fly ball spike if anything. If it helps, the last two weeks of the first half saw a 2.5% lower (FB+PU)% and a 2.6% lower HR/FB than the cumulative first half for participants, which presumably indicates that the drop in performance coming from regression to the mean was already occurring. My feeling was that the extra sample size far outweighed the concerns about the bias, though I agree a negative HR Derby effect would probably be flawed and I'd need to rethink the methodology; zero or positive effect seems unlikely to suffer from that issue to me.

 
Matt Swartz
(24824)
Comment rating: -

Sometimes it is best to just look at odds of winning a baseball game. How much does having a solid starter increase your odds of winning a baseball game? Answer that question, compute the percentages, and then move the Secret Sauce stuff around at the margin to bump things up or down by a couple percentage points. Being better at baseball than one's opponents doesn't need a study to confirm it. Nate found 3 interesting things that shined above the noise, but that doesn't mean that the same things that make one team have a more than 50% chance of winning a game won't be important, too. You still need to assume the better team wins a baseball game more often than not.

Jul 15, 2010 3:46 PM on Resetting The Races
 
Matt Swartz
(24824)
Comment rating: -

Yup, it was a mistake on my part. It was higher post-Moneyball but modestly: .078 to .084.

 
Matt Swartz
(24824)
Comment rating: -

Hmm...I'll find my spreadsheet tonight and report back. Thanks for finding this.

 
Matt Swartz
(24824)
Comment rating: -

No, just saying that the difference between OBP and AVG for all-star starters was larger in this era. In the case of a .300 hitter, they would probably have smaller differences in OBP and AVG. Think of it like a .270 hitting power hitter generally had a .372 OBP in the Early Era and a .419 OBP in the Internet Era.

 
Matt Swartz
(24824)
Comment rating: -

I'm looking into this, and it definitely might be worth writing a blog post about, so look for that. Thanks.

Jul 13, 2010 3:01 PM on Surrendering Lee
 
Matt Swartz
(24824)
Comment rating: -

I would go for guys to make them competitive in a couple years if I were them. They don't seem likely to make up the gap in 2011 because they are giving up Fielder, but some prospects and some good moves could make them good in 2012 and 2013. It would probably involve trading some other pieces.

Jul 09, 2010 12:51 PM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

Yeah, my feeling is this. The Brewers are probably a .500 team next year. Becoming a 91-win team would cost about $50MM beyond whatever raises players are already due in arbitration and already written contracts. Sure, they could get lucky, but why risk it? That's not the point where you make a dent. A team that is maybe five wins from contention should drop some money on free agents, but I don't think the Brewers can cover the win gap with free agents without losing money in the process. It may be that I'm underrating them, but I don't have much faith in their non-Gallardo pitching.

Jul 09, 2010 11:42 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

My point was that all teams need young talent to produce. The Red Sox have done fabulously with young talent-- read through my "No Turnover Standings" articles. I'm curious why you think the Brewers are going to be competitive. Look at the Brewers' roster, contracts, and minor league prospects they have now. Sticking with Fielder, Braun, Gallardo, and the other talent they have is not going to make them a competitive team before 2011 in all likelihood (certainly not 2010), and retaining Fielder is going to cost enough money that it would be difficult to do so afterwards. They are going to need young talent to get better-- replacing Fielder's uniqueness has a marketing argument, but replacing his wins is something that they should be attempting to do by trade.

Jul 09, 2010 7:55 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

The Red Sox spend $170MM, which would buy about 32 WARP, which would only put them at 72-90. The Yankees spend $215MM, which could buy 41 WARP, putting them at 81-81, meaning a minimal level of cost-controlled talent could push them over the edge. The Brewers probably should trade some other parts too, but players like Braun and Gallardo are locked down long enough that they are still assets in a couple years. Some other players would go, too, though.
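The payroll arithmetic above can be sketched in Python. This is only a rough illustration: the replacement-level win total (40) and the cost per WARP (about $5.3MM) are assumptions backed out of the two examples in the comment (72 - 32 = 40, 81 - 41 = 40, and 170/32 and 215/41 both come to a little over 5), not figures stated in the comment itself.

```python
# Assumptions backed out of the comment's two examples, not stated values.
REPLACEMENT_WINS = 40        # wins for a team made up of replacement-level players
DOLLARS_PER_WARP_MM = 5.3    # free-agent cost of one WARP, in $MM

def wins_from_payroll(payroll_mm):
    """Wins a team could buy on the free-agent market alone."""
    warp = payroll_mm / DOLLARS_PER_WARP_MM
    return REPLACEMENT_WINS + warp

for team, payroll_mm in [("Red Sox", 170), ("Yankees", 215)]:
    wins = round(wins_from_payroll(payroll_mm))
    print(f"{team}: {wins}-{162 - wins}")
```

Under those assumptions the sketch reproduces the 72-90 and 81-81 records quoted above, which is the point: payroll alone does not get either team to contention without cost-controlled talent.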

Jul 09, 2010 4:14 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

It's a good question. In theory, I think it has to do with whether there is at least one more seller than the number of buyers, at least one more buyer than the number of sellers, or an equal number. If there are 2 more buyers than sellers vs. if there are 3 more buyers than sellers, that should not change much in the bargaining (unless their replacement level is very different). The obvious comparable player potentially on the market is Adam Dunn. I would think three of the teams I mentioned might create the necessary scarcity to drive up the price. I like your question, though. I think the Rays might be buyers, though, at least according to reports.

Jul 08, 2010 10:39 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

There is certainly more roster flexibility in the offseason, so that's why you might not see the in-season bump for position players like you do for starting pitchers (there is always a spot for an ace). At the same time, the Brewers do not need empirical tests here, because they can simply start taking offers and create some urgency. If bidders do not realize the extra value Fielder provides, that would obviously be different, but I suspect teams do. I would like to look at this empirically-- I've found market inefficiencies before, and without savvy investors being able to make their own teams to compete in the MLB easily, you really can see market inefficiencies last in baseball unlike other businesses where competitors can emerge more easily.

Jul 08, 2010 10:13 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

So, what type of difference do you think the Brewers would have in attendance if they made the playoffs for a couple years in a row a couple years down the line? It seems like getting good sure spiked their attendance around 2007 and on, especially when they made the playoffs in 2008. Obviously this is going to mix in with the Fielder effect, but certainly the Brewers fanbase seems to respond to winning. Attendance is down this year even though payroll went up. I do agree that there probably is an effect of perceived effort by ownership, but a couple good prospects might make people think they are putting in the effort when they come to fruition even if it angers people now.

Jul 08, 2010 10:10 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

Definitely seems true about playoff hangovers, and I'm pretty sure they extend well beyond one year. Fielder's body type certainly isn't one that suggests rapid decline, but I think that would probably be priced into his contract at this point. Most teams with money are somewhat afraid of this, at least most teams without huge 1Bs.

Jul 08, 2010 10:00 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

Agreed that it's a big hit financially now. No debate that it would hurt. But I also watched the playoff chase and the 2008 NLDS and it seems like that fanbase is rabid for a winner, no? I suspect it would be a huge financial hit, but so would losing him to free agency. I just think a pennant could dwarf the lost revenue for 2010-11. If they are going to retain him because he has such a unique value to them, that's one thing I don't have the hypothetical data required to study, but I can't imagine a half decade of 80-win seasons is better than a couple of 75-win years before a couple playoff berths.

Jul 08, 2010 8:41 AM on Trading The Prince
 
Matt Swartz
(24824)
Comment rating: -

Thank you very much for pointing that out. I love Bill James and subscribe, but missed that article. The second game HFA spike is definitely ambiguous, and it's caused some controversy before. There appears to be some psychological basis for it, as Russell Carleton tells me that sleep stuff supports that effect (you can tell by my diction who the expert is on this). Of course, this past decade provided stronger evidence than decades past. In SABR's Baseball Research Journal, I expanded this to look at 1970-2009 or something like that, and found a mildly significant effect. I'm inclined to think it's true, but I recognize smart people stand on both sides of the debate. Thank you for pointing out James' article on HFA. One of my favorite authors and one of my favorite topics, gotta love it.

 
Matt Swartz
(24824)
Comment rating: -

It can have an effect for teams on the margin, unsure whether they want to trade veterans. There are a bunch of veterans practically guaranteed to be traded at the deadline every year, a million veterans who won't ever be traded at the deadline, and then a few guys on the margin. I think it can have an effect on the margin because the incentive is there, as I have mathematically highlighted above.

 
Matt Swartz
(24824)
Comment rating: -

Yes, it's 13.18 wins out of the top pick and 2.32 out of the 30th pick using Sky's formula with the proper adjustments (rWAR to WARP, fraction of WARP out of first six seasons). I do think that it's subtle but I have to think that if there was a clear cut #1 pick again, the Orioles would be thinking long and hard about calling up a superstar prospect that might make up the difference between themselves and the Pirates, and I could really see the incentive for the Pirates to trade some veterans like Octavio Dotel just to do it. Once something like that happens, it's more dangerous. I'm not saying they are tanking yet, but when the incentive is clearly there, I don't see why the league should wait. Teams eventually do pick up on a lot of things like this, especially when there is a precedent in another sport or two.

 
Matt Swartz
(24824)
Comment rating: -

The gigantic differences between SIERA and ERA are only going to come from luck. I mean, how big is a run park factor for a given stadium? Maybe a hair more than half a run at the extremes? And defense probably can do something similar to that, I think, though I'd need to work out the numbers in more detail. The point is that if David Price has a 2.44 ERA and a 3.97 SIERA but plays in front of good defense in a pitcher's park, his defense- and park-adjustments to his ERA indicate he might regress to something like 3.5 instead of 3.97, but 2.44 isn't a realistic adjustment for park and defense unless you played in a ginormous stadium and with 8 superheroes playing defense for you.

 
Matt Swartz
(24824)
Comment rating: -

The table has three columns. The first is the pick number, the second is the value added of the pick above the subsequent pick, and the third is the average record of that pick. So the average first pick is worth 3.90 wins more than the second pick, and the worst team wins 5 fewer games (58-104 vs. 63-99) than the second worst team. The average second pick is worth 1.73 wins more than the average third pick, and that team wins 1 more game (63-99 vs. 64-98). The average third pick is worth 1.04 wins more than the average 4th pick and that team wins two games fewer (64-98 vs. 66-96), and so on.

 
Matt Swartz
(24824)
Comment rating: -

After playing with some numbers, I realized how arbitrary it would be to be specific. I've made some approximations of the breakdown of the marginal value of a win before (last July's Roy Halladay articles, based on some updates to earlier Nate Silver work), but I just didn't see a way to come up with a precise way to do it. But regardless of the way you do it, the win now is worth less than a large fraction of a win later, which is really what the table above shows-- that it's a larger fraction of a win that you gain for losing a game now. So the marginal value is going to create this incentive regardless of how exactly I break down the numbers. Play with some numbers, and you'll see that just about any realistic numbers you pick show the incentive for certainly the bottom four or five teams to lose, but that picking any ones on my part would have drawn away from the general point. I'm not talking about the Mariners trading away Cliff Lee for an elite prospect to help themselves. I'm talking about the Orioles trading away Kevin Millwood and Ty Wigginton because they are risking winning too many games by having them on board. It would certainly put a slightly higher price tag on them, because they would be more valuable to the Orioles. The arbitration argument is easily correctable-- a fixed number of years based on either age or draft year or signing year where you have them under control. If the Nats had Strasburg from the time he signed until eight years after they signed him regardless of when they called him up, they'd have called him up sooner?

 
Matt Swartz
(24824)
Comment rating: -

I love this idea. I wonder if the league would ever implement it. Maybe you could have the same types of rules with playoff rosters where players need to be on your roster on August 31 to play in this series. That would certainly be interesting!

 
Matt Swartz
(24824)
Comment rating: -

Right, you still want to lower those incentives. There are a number of ways to do it, but the current system is incentivizing teams to lose a little too much.

 
Matt Swartz
(24824)
Comment rating: -

No, my point is that the lottery would lower the incentive to tank and that this incentive is starting to become more obvious.

 
Matt Swartz
(24824)
Comment rating: -

Oh I agree with you. The ground that needs to be made up in evaluating defense is really large, and you don't want to conflate the two issues. The reality is that over a full season for a starting pitcher, BABIP variance is made up by 75% luck, 13% defense, and 12% pitcher skill. SIERA picks up on the 12% pitcher skill, and removes the other 88%. In half a season, even more of BABIP variance is luck. You can tweak SIERA by some small fraction of a run to come up with an approximate defensive effect just like you should make a park adjustment to add runs for pitchers pitching in hitters' parks, but SIERA as defense- and park-neutral is going to get you most of the way.

Jul 02, 2010 7:31 AM on SIERA Darlings
 
Matt Swartz
(24824)
Comment rating: -

So your entire point is that since teams haven't been caught tanking yet (even though we see tons of talent traded away at the deadline every year), there is not a problem, and that since basketball teams still have some incentive to tank, you don't realize how much worse it would be if they had more incentive to tank. And you also seem to think that even though I have an actual table laid out for you showing that on average higher picks do better, you don't believe this is an issue because sometimes lower picks do better than higher picks and that you are unable to figure out the number one pick eleven months from now. There is less incentive to tank when the payoff to tanking is lower. If that is not evident to you, I don't know what I can do to show it to you. If each game you lost now gained you 0.8 games in the future when you might be competitive, that would be more of an incentive than if there was a lottery and that was lowered to an expected gain of 0.2 games. You also seem to think that since sabermetrics has already existed for many years in front offices, that means that they won't realize the value of higher draft picks in the future? Teams are learning things all the time. They're getting better at forecasting talent, too. This is about preempting teams from making decisions that appear to be in their best interest. Also, "you can win with the lowest payroll" is laughable. The correlation is high between winning percentage and payroll, and getting higher. Simply noting some counterexamples of smart teams does not change the fact that winning with a low payroll is hard. Having good talent evaluators helps too of course. That does not change the fact that having a higher draft pick helps in addition to having money and good talent evaluators. It's like you have decided that a few examples of other ways that teams have been successful changes anything about a backwards incentive structure.
And you conclude with "if it ain't broke, don't fix it" when I conclude with being proactive rather than reactive. I like mine better. I've shown the incentive to lose, and now suggested a way to counteract it. That the NBA still has some problems with tanking does not change the fact that the race to the bottom is less valuable than before.

 
Matt Swartz
(24824)
Comment rating: -

Maybe. It could be the other way though-- experienced pitcher is also older so cooling off may have worse effects. Tough to study but an interesting thought, thanks.

 
Matt Swartz
(24824)
Comment rating: -

Meh, the Jays got a lot of potential talent in the Halladay deal. It ain't fun for this weekend, but the Jays' time to shine is probably in a few years when Halladay costs a lot more than the talent they got for him and those guys are producing on the field.

 
Matt Swartz
(24824)
Comment rating: -

The Milwaukee game was a neutral site. As I explained, home-field advantage comes largely from familiarity and distance. Neither team had any more familiarity with Milwaukee, both were staying in hotels, and there probably was not a huge difference with Chicago being closer to Milwaukee than Houston, but perhaps there is some. Regardless, I think the unfairness came from the fact that the Astros didn't get an advantage rather than any claim the Cubs may have gotten some.

 
Matt Swartz
(24824)
Comment rating: -

It is absolutely ridiculous, agreed. I wouldn't negative rate you. But in all fairness, the interleague schedule is particularly unfair to the Phillies this year: http://www.thegoodphight.com/2010/6/25/1536636/for-the-phillies-the-injustice-of David S. Cohen of The Good Phight found that the Phillies had the hardest interleague opponents, with a winning percentage of .545, compared with the Braves' .527 and the Mets' .500. So the reality is that the .24 win bonus that the Phillies are getting here puts them somewhere between the Mets and Braves in terms of unfair scheduling. I think it's pretty clear that it's unfair to the Blue Jays, and that a fairer solution might be neutral territory and actually making equitable interleague schedules in the first place.

 
Matt Swartz
(24824)
Comment rating: -

Someone better come up with a way for the Phillies to relay signs better in their own park, because their home-road differential during the last 6 years is the smallest in the majors.

 
Matt Swartz
(24824)
Comment rating: -

In general I agree that managers have incentives to play by a set of suboptimal rules to keep their jobs. However, walking the go ahead run into scoring position isn't really "by the book," is it? It seems like that would be the opposite situation. Also, Manuel probably has more job security than any manager you can imagine, so this might be less risky than other managers sticking their necks out on a strategic move.

Jun 21, 2010 11:37 AM on Walking Justin Morneau
 
Matt Swartz
(24824)
Comment rating: -

There are a number of other likelihoods that I did not account for, all of which were small and had some other effects. The effect of a poor bunt only moved the probability a couple of percent if it happened, and a million other outcomes were also possible. Regardless of how you model it, you're going to come up with a situation where Danys Baez would need to be roughly halfway between a regular pitcher and a home run derby pitcher for it to make sense.

Jun 21, 2010 9:08 AM on Walking Justin Morneau
 
Matt Swartz
(24824)
Comment rating: -

Of course there are a million little possibilities that factor in, with the single possibly ending up better than a walk or the double not scoring Mauer or whatever, but this was a rough sketch. They basically would need to think that this was almost a home run derby for Morneau to make this move credible.

Jun 20, 2010 6:47 PM on Walking Justin Morneau
 
Matt Swartz
(24824)
Comment rating: -

If they IBB him and Rauch bunts, they have a 49% chance of winning. If he hits a single or draws a regular walk, it's about a 49% chance anyway. So you just need the odds of an RBI hit (which gives them a 13% chance), weighted by that probability, plus the odds of an out (which gives them a 64% chance), weighted by that probability, to equal 49%.

0.49 = (p) * (0.13) + (1-p) * (0.64)
p = 0.29

So a mix of a 29% chance of ending up with a 13% chance of winning and a 71% chance of ending up with a 64% chance of winning is, on average, as good as a 49% chance of winning.
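For anyone who wants to plug in different win probabilities, here is a minimal Python sketch of the same break-even calculation. The function and argument names are made up for illustration; the three inputs are the 49%, 13%, and 64% figures from the comment.

```python
def break_even_probability(p_win_baseline, p_win_if_hit, p_win_if_out):
    """Solve baseline = p * hit + (1 - p) * out for p.

    p is the chance of an RBI hit at which pitching to the batter
    is exactly as good as the intentional walk.
    """
    return (p_win_if_out - p_win_baseline) / (p_win_if_out - p_win_if_hit)

p = break_even_probability(0.49, 0.13, 0.64)
print(round(p, 2))  # 0.29
```

If the true chance of an RBI hit is below that break-even value, pitching to the batter comes out ahead under this simplified two-outcome model.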

Jun 20, 2010 6:46 PM on Walking Justin Morneau
 
Matt Swartz
(24824)
Comment rating: -

I'm not sure you read the article very carefully. I looked at the BABIP in detail to determine how much of it was flukiness. Given the rate of ground balls reaching the outfield was flukishly low but the ground ball rate was not, given that the number of balls that reached the outfield in the air was the same as historically, I determined he wasn't seeing much of a change in his batted ball profile. I also discussed the normal range of BABIP that can be attributed to skill, which Ubaldo's numbers fall out of. Given that the drop in BABIP is a function of 40% of line drives being caught and 89% of ground balls being fielded by infielders, I think I have looked at it carefully enough like you would suggest.

 
Matt Swartz
(24824)
Comment rating: -

Bob Gibson did not play in the era of batted ball statistics, so we do not have SIERA on him. I'm suspecting that his SIERA would exceed his ERA, but mostly because his record ERA was by definition lower than all his other ERAs and I assume that he must have had some good luck. Or, to put it another way, what are the odds that Gibson had bad luck that year? Really, really low, right? Therefore, I assume he probably had some good luck, especially because it was his best year. I doubt there are many pitchers that have ever had bad luck during their career year in terms of ERA.

 
Matt Swartz
(24824)
Comment rating: -

SIERA is determined based on the current run-scoring environment that has been pretty similar over the last 17 years. That could change, at which point we would change SIERA. The only difference between OTSgamer's list and the true list is that the "P" term is Popouts, not putouts.

 
Matt Swartz
(24824)
Comment rating: -

Of course, you're right that this is a limit. I was asked a similar question in the first No Turnover Standings article, and I responded with the following: "Playing time is not factored into it. There were 81*30 total wins/teams to go around, and I just added up WARP3's. If you want to think of it as the production of the amateur scouting department, with an idea that players could be traded at fair value but the No Turnover Standings are a reference point, that works too."

 
Matt Swartz
(24824)
Comment rating: -

I think it's just a matter of me explaining the method more clearly. The Braves' record of 89-73 means that they were 8 wins per year above average when you add everything up-- 4 above average in the 1990s draftees, 6 above average in the 2000s draftees, 1 below average in the 2000s signees, 1 below average in the 2000s Asian professionals, and average at everything else. I'm sorry I wasn't clearer, but it should be that way for every team, give or take rounding adjustments.

 
Matt Swartz
(24824)
Comment rating: -

Of course. I definitely want to look a little more into draft pick number, because I'm sure that's driving a lot of this. It's not really comparing apples to oranges in the sense that I'm still asking what teams produced and getting a good answer, but obviously some teams have advantages in terms of the draft. This just shows the resulting differences in acquiring amateur talent.

 
Matt Swartz
(24824)
Comment rating: -

Yup. Please see the last article for details. The Rays actually do an incredible job of acquiring other people's minor league talent. Think Ben Zobrist, J.P. Howell, Matt Garza, Jason Bartlett-- all these players were acquired in trades the Rays won.

 
Matt Swartz
(24824)
Comment rating: -

I would look at SIERA over the past three years, obviously weighting right now a lot more to get a defense-neutral version of ERA and then just adjust by time. Look at how the team's FIP compares to its ERA, and figure that the difference between SIERA and ERA should be similar to the difference of the team FIP and the team ERA (though maybe shrink this difference a little bit). The fewer innings you are looking at, the more you want to regress the whole ERA back to the mean, though. For fantasy baseball, I would imagine it is probably more important to rank pitchers ordinally rather than giving a cardinal measure of ERA though? In that case, SIERAs with a FIP/ERA adjustment might do the trick pretty well. Comparing SIERA and PECOTA can probably do a good job of giving you extra information too.

Jun 02, 2010 5:27 PM on Sneaky SIERA
 
Matt Swartz
(24824)
Comment rating: -

Ah-- yes, league average defense.

Jun 02, 2010 3:49 PM on Sneaky SIERA
 
Matt Swartz
(24824)
Comment rating: -

Well, it tries to predict ERA going forward as much as FIP does, only it incorporates a few other factors to make it less biased against certain pitchers. It tells you what ERA typically is for pitchers with certain GB, BB, and K rates, taking into account the interactions, etc. It also incorporates BABIP skill that pitchers do have. So FIP would assume that pitchers have the same distribution of BIP skill, while SIERA assumes that pitchers with high K% and low GB% will have lower BABIPs.

Jun 02, 2010 2:42 PM on Sneaky SIERA
 
Matt Swartz
(24824)
Comment rating: -

They are in the four reports-- the Pitcher Standard Report, the Pitcher Rates Report, Pitcher Team-Standard Report, and the Pitcher Team-Rates Report. You can also get SIERAs if you do a customized stats report. I have a couple links in my favorites because I like certain things all together in one report that I can make a .csv out of.

Jun 02, 2010 1:56 PM on Sneaky SIERA
 
Matt Swartz
(24824)
Comment rating: -

Certainly interesting, but not really enough to explain things. We do see a spike in HR/FB for the D'backs offensively from 10.4 to 11.9%. However, the spike defensively is 13.0 to 18.7%, which is just way larger. If it were only year-to-year changes in park effects, that jump could not be as large. Good point, though.

Jun 02, 2010 1:03 PM on Sneaky SIERA
 
Matt Swartz
(24824)
Comment rating: -

This is true, but I think most of what the article is doing is actually explaining this fact, rather than calling everything luck. That said, there IS such a thing as luck, and you CAN actually pinpoint it. There is a counterrevolution in sabermetrics claiming that too much is being called "luck", which is a good thing to highlight, but it's being taken too far. We KNOW that a binomial variable like BABIP should have a standard deviation of .019 for a whole season among pitchers who throw 150 IP. That means that when we observe an actual standard deviation, net of team, of .021, we CAN see that MOST of BABIP fluctuation is luck. Of course it's important to explain all the variance you can, but dismissive statements saying that you can't call anything luck are as dogmatic as saying that everything you can't explain is luck.
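A quick Python sketch of that comparison, for anyone who wants to check the numbers. The .300 BABIP and 580 balls in play are illustrative assumptions (not figures from the comment), chosen to show how a binomial standard deviation near .019 arises. The variance-share step also assumes luck is independent of the other sources of variation.

```python
import math

def binomial_sd(p, n):
    """Standard deviation of an observed rate over n binomial trials."""
    return math.sqrt(p * (1 - p) / n)

def luck_share(sd_luck, sd_observed):
    """Fraction of observed variance explained by binomial luck alone."""
    return (sd_luck / sd_observed) ** 2

# Assumed inputs: a .300 BABIP over 580 balls in play.
print(round(binomial_sd(0.300, 580), 3))    # 0.019

# Figures from the comment: .019 expected from luck vs. .021 observed.
print(round(luck_share(0.019, 0.021), 2))   # 0.82
```

On those two standard deviations alone, roughly four-fifths of the observed spread is what pure chance would produce.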

Jun 02, 2010 1:01 PM on Sneaky SIERA
 
Matt Swartz
(24824)
Comment rating: -

So that's a really small effect, and also somewhat in the wrong direction. Dan Haren has surrendered 5 of his 16 HR in the 1st inning, and 8 in the first three innings. Edwin Jackson has surrendered 5 of his 10 HR in the first three innings, while Kennedy and Rodrigo Lopez each have 5 of 11 in the first three innings. We have a stat called Fair Run Average (FRA) which takes into account what RA should be adjusting for inherited runners. Haren is at 5.16 RA with 5.22 FRA, Jackson has a 6.03 RA with a 5.96 FRA, Kennedy has a 3.50 RA with a 3.35 FRA, and Lopez has a 5.20 RA with a 5.07 FRA. These types of things are small effects compared to the larger issues cited above relating to BABIP, HR/FB, and difference in performance with men on.

Jun 02, 2010 12:58 PM on Sneaky SIERA
 
Matt Swartz
(24824)
Comment rating: -

I mean, the t-test definitely makes more sense to run if the data has a unimodal nonskewed distribution, because that will at least look normal enough to make a t-test relevant. The thing is that the data is so incredibly different that really any t-test will either tell you "there is enough data assuming that the samples aren't biased" or "there is not enough data". The only thing a t-test can tell you is if you're ready for a t-test, because it would be a problem that the sample size was too small if the t-test failed to reject the null. Failed to reject the null is all you could say...certainly not enough to say you accept the null. $/WARP or WARP/$ seems like a judgment call but it would probably be based on what the data was shaped like. I guess looking at changes in total WARP vs. % WARP change is useful too, but it won't grab the injury effect for mediocre players (because going from a 1.0 WARP to a 0.0 WARP injured player isn't that much of a change, but going from a 5.0 WARP to a 2.5 WARP player will seem much worse). I think that could actually be a good way to look at things, but I just think it's tricky with playing time. Maybe absolute changes could be good for looking at rate stats? I'd need to think about that. Absolute changes in counting stats relative to a replacement level just could be really biased, especially for two-year deals often given to players not that far above replacement level. You did look at three-year deals at least though. I see a little more what you're doing now, but be very careful not to read too much into t-tests. The real question is (a) if we have biased samples, rather than (b) if we have enough data. The "is the difference large enough" question is obviously not legitimate. The question is really about the sample size being large enough to confirm the difference matters, because the difference in performance is huge.

May 20, 2010 1:34 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

How can you run t-tests?? What did you do for a standard deviation?? I don't think sample standard deviation makes any sense at all in this case, and I don't think you can make the necessary assumptions about normality or whatever else would be required to even assume a sample standard deviation works. Also, the means for the three-year deals are very different so any changes in expected average wins is particularly useless as a measure. The third year new signings lost half of their value...how can you compare that to the re-signings who lost only 20% of their value? I don't see how you can run t-tests really at all on that. Next, there are 33 new signings of 3-year deals, not 26. So right there I'm not sure what you did either. Also, where are you getting 0.9 WAR/(Salary/3)? Maybe 0.09? Is that what you meant? Why did you use wins per dollar instead of dollars per win? I can't imagine wins per dollar had a normal distribution either, so you're really confusing a number of things I think. The three-year re-signed had a 3.6 WARP3/salary in the third year, and the newly signed had a 13.9 WARP3/salary in the third year. That's different by nearly 400%. If whatever test you're running doesn't find that significant, you're not running a useful test. It's possible that whatever tests you're trying to run, even with the standard deviations that you could even use to make sure, just require a larger sample size. Someone on another website checked my thesis for 1990-2009 and found that the results were on a similar scale but slightly smaller for this period of time. So if your post is a way of saying "we need more data," that might be a good place to start.

May 19, 2010 2:34 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

In all fairness, his late development was because A.J. Burnett broke his wrist, and he needed a couple surgeries to fix it and he got DFA'd. He was a top prospect before that. Also, he plays a corner but he's capable of playing center. He's very athletic, and probably doesn't fit the "corner OF uh oh" profile. I don't think there's any reason to assume he'd be overpaid as a FA. It seems like he's the type of player who can push them to 90 wins in 2011 at least.

May 18, 2010 3:11 PM on Werth The Funds
 
Matt Swartz
(24824)
Comment rating: -

The Phillies raised payroll after Cliff Lee left. I think they traded Cliff Lee to restock the farm system. They had in mind that they were going to increase payroll by about 10%, and they thought they could restock the farm by increasing it other ways. Keep in mind that Benny Looper, the Phillies' Asst GM, was the Asst GM of the Mariners when those three prospects were brought into the Mariners' system. I think the Phillies specifically wanted those guys. Regardless, they raised payroll by about 10% and are likely to keep raising payroll by about that much, I think.

May 18, 2010 2:06 PM on Werth The Funds
 
Matt Swartz
(24824)
Comment rating: -

AND Olmedo Saenz too. He wasn't eligible for free agency when he re-signed either. Again, doesn't change the results but worth mentioning.

May 17, 2010 6:47 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

NOTE: Jason LaRue shouldn't be there in the two-year deals. Just realized he was a few days short of free agency when he signed the two-year deal in the off-season of 2005. Doesn't really change much analysis-wise, but should be changed in the dataset.

May 17, 2010 6:42 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

I have a hard time believing the hometown discounts are 50%, especially because the real difference in the $/win performance from these players is a result of performance late in the deal, not early. If it were only hometown discounts, you would expect the WARP3 values to decline at the same pace for both re-signed players and OPP, but instead you see the real production difference late in the deal. That smells like something far more than a half-off discount given to the hometown team.

May 17, 2010 3:41 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

Very good point. I got rid of contracts that covered arbitration years in this round of analysis to make sure that the risk-aversion of that sort wasn't mixing in with the results, but I actually think that there could be an element of exactly what you're talking about. Players don't want to move their families, etc. On the other hand, that probably isn't the whole issue, because otherwise pitchers must be far more risk-averse individuals than hitters are!

May 17, 2010 1:06 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

Interesting. I didn't think about the four-year deals being biased that much, but the breakdown of pitchers and non-pitchers is perfect. It's the same result after all! I think this only says that pitchers are overpaid if the replacement level is perfect for both. If the replacement level for pitchers is overestimated or if the replacement level for hitters is underestimated, it may be fine. One thing that this makes me think is that maybe there is a rational basis for teams over-drafting pitchers. They seem to be such bad bets based on Sky Andrecheck's draft pick analysis, consistently producing less than hitters drafted in similar rounds. Hitters on the free agent market being better bets than pitchers might make me reevaluate this. I guess it would depend on vacancy availability to sign hitters, i.e., how many lineup spots are replacement level for a given team vs. how many rotation spots are?

May 17, 2010 1:04 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

The players who signed deals with the hometown clubs still may not have given hometown discounts. Certainly this magnitude of hometown discount would appear unlikely. Economically, the lack of testing the market shouldn't be an issue on its own, because the player and team should both have a good approximation of how high bidding wars go, and the player should want his hometown team to match this approximation. It seems unlikely that player agents would perpetually undervalue the effect of a bidding war. If they expected a bidding war, they would typically force the hometown team to bid up for it. Really, I think this comes down to information asymmetry. That's the "why" in my opinion at least. The evidence that pitchers show a more extreme effect probably adds to this case.

May 17, 2010 12:39 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

I definitely agree the opportunity cost of re-signing teams is larger because of the supplemental sandwich pick, which effectively costs them 2 picks for signing a player instead of 1. I'm not sure that would explain the difference though, given the magnitude. The extra pick is probably worth about $6MM per contract. Also, if there were perfect competition between teams, the hometown teams would simply be outbid because of the extra "tax" of the sandwich pick.

May 17, 2010 12:37 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

I agree a longer time period is a good idea. I don't have service time information that goes back all that far, so it makes it somewhat difficult to discern which extensions cover arbitration years and which extensions cover free-agency eligible years. If I could get my hands on this data, that would be great. It took a long, long time to aggregate all this data already, though. I agree that more would be ideal to see this effect through time.

May 17, 2010 12:35 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

I like the idea to look into hitters and pitchers separately! I forgot about that. I had started with a few splits into this issue in the first article on this topic a couple months ago, but I should have checked it again when it came to money. Thanks for breaking it down. I do agree that Howard isn't the best example of a player who is likely to age differently than others. I think the only reason that he would be an example of the "information asymmetry" issue is because of the supposed workout plan. However, Howard was the big extension contract du jour a couple weeks ago, and I mentioned it in the article as a reason. I hate hitching my analyst wagon to the "Howard deal is good" position simply because I said "it might not really be that bad." I'm still not thrilled about it as a fan of the team. Still, the comment about not needing any tie-in to Howard isn't really relevant. The Howard article was a great way to re-broadcast this research, which got buried without much comment until Howard was signed. I'm a practical guy, and I knew the signing had a strong enough relationship to my result that it was worth re-stating clearly. Chipper, on the other hand, is an injury-ridden superstar. Clearly that makes him a better candidate to be a bargain based on information asymmetry. This is all the more relevant because it's the Braves, who are by all reports a tremendous scouting team that does its due diligence on these types of issues. If the Braves think Chipper's unhealthy after his deal is up and don't re-sign him, I sure don't want him.

May 17, 2010 12:33 PM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

Tango, it appears you've selected on the dependent variable. Put it this way-- if you remove the OPP guys who failed to produce after the deal was signed, you're eliminating a lot of data that could be the exact evidence of what we are looking for. Why is the fact that Adam Eaton and Jason Schmidt were disastrous three-year deals a lack of evidence that the Rangers and Giants knew something that the Phillies and Dodgers didn't get???? The point here is that the money is a proxy for how good they were before the deal. Eventually, I can dig up the WARP in previous years (if anybody has added to the data sets I provided above, please let me know), but the money they were signed for is the real value. I'm in agreement that this work isn't done, but the cost per win is really strong evidence that these groups are dissimilar for the reasons mentioned. What we have here is evidence that it cost more to get wins from OPP than your own re-signed guys. Regardless of how good or bad the players were beforehand, the players allowed to reach public bidding were real bad contracts on average.

May 17, 2010 10:07 AM on The Cost of OPP
 
Matt Swartz
(24824)
Comment rating: -

Mariano Rivera averaged being worth about $9MM per year from 2002-09. That just seems low to me. I can't break down WARP by relievers and starters, I don't think. Maybe there's a way to do it in the sortable stats. The relievers' salaries might be misleading. What's the average age of a reliever vs. a starter adjusted for leverage of innings? This could make the cumulative salary vs. cumulative wins comparison biased if different. I'm not saying either methodology is right for relievers. I'm just saying it's a hard topic, and probably one where there are improvements to be made.

 
Matt Swartz
(24824)
Comment rating: -

I don't like the use of FIP for relievers. I think it's inconsistent to attribute the BABIP luck that hitters get completely to themselves, but to leave it out for pitchers. That's really my issue. The chaining thing I'm not sold on yet, but I'm not sure it's wrong either. I'd need to look at usage patterns in a lot of detail. Colin's criticism about the creation of leverage seems concerning. It also seems like replacement level relievers get so many high leverage innings that I'm not sure that the absence of a player would always bump up the leverage of all other players one by one. It also seems a little bit inconsistent with the concept of replacement level, but maybe that is resolvable. I'm not saying that I like it or don't like it, just that the results of such low win values give me pause. The WPA values are so high for good closers, and it seems like chaining is the reason that the fWAR values aren't. It's just something to think about harder, and an outcome that says FanGraphs thinks 30 teams overpay for relief help even though WPA seems to imply that they might be fairly priced...it's just something I'd want to devote a lot of time to before forming an opinion.

 
Matt Swartz
(24824)
Comment rating: -

Little short-- he had 1.3 WARP3, which gives him a MORP of $6.8 million, so his $14.3 million cost, leaves him at -$7.5 million value, which fell short of the list.

 
Matt Swartz
(24824)
Comment rating: -

My understanding is that they are dealt with in a similar way as starters. The player's Run Average is adjusted for defense, park, and opponents, and compared with what replacement level players would do with similar defense, park, and opponents. Runs are then converted into wins. WXRL is done differently as I understand it, but I kept WARP3 for relievers to compare apples to apples. Approximating win values for relievers is really tricky. I'm not sure that there is a great way out there to do it. FanGraphs uses chaining and FIP, but the win values come out very low, and there are a lot of assumptions built into that which I'm not totally comfortable with.

 
Matt Swartz
(24824)
Comment rating: -

Yes. I approximated signing bonuses by draft slot. It doesn't change the numbers much, but it does play a role.

 
Matt Swartz
(24824)
Comment rating: -

Okay, it seems like giving out my spreadsheet won't be an issue. I don't have all the data you ask for but I do have names, ages, deal lengths, and WARPs. I'll try to put something together soon, maybe a google doc.

 
Matt Swartz
(24824)
Comment rating: -

The majority of this was from a proprietary data set that somebody gave me, and a lot of it was pieced together from that and other sources. I'm not sure I'm even allowed to give it away like that.

 
Matt Swartz
(24824)
Comment rating: -

I agree the 26-31 two-years are a useless sample in isolation actually. I should have said something about that, but I forgot when I was forming the tables. I decided to include them because they are part of the whole sample and I didn't want to be removing players from the sample. But Andruw Jones -2.0 WARP the first year of his Dodgers' deal is such a matzoh ball hanging out there that any way I did the ages, some group was going to be stuck with 31-year olds and they were going to age badly. And Abe Nunez was pretty ugly at 30 years old with -1.6 WARP the first of his two years w/ the Phillies too. In future articles, I'm going to make a point to avoid using percentages alone. I've gone back and forth using percentages and total WARPs, but I think I should use both.

 
Matt Swartz
(24824)
Comment rating: -

In this article: http://www.baseballprospectus.com/article.php?articleid=10505, I looked for this. I got all the PECOTA projections for each group of players. The re-signed players underperformed their PECOTAs by 13% and the newly signed players underperformed their PECOTAs by 15%. That's basically the same. Sean Smith did a similar study, without breaking it out by subgroup, that showed CHONE over-projected free agents by about 15% too. So it seems like they are performing similarly on aggregate the first year. I think that's selection bias-- I discuss this in that article. I agree that I don't have perfect matched pairs. I think that's why it's important to aggregate to get some sense of these things.

 
Matt Swartz
(24824)
Comment rating: -

Thanks to all those asking for WARP by year totals. I had forgotten about this. I'll do them in the comments, but I can post an unfiltered post if it's not clear.

AVERAGE WARP EACH YEAR FOR EACH DATA SET (N = SAMPLE SIZE IN PARENTHESES):

TWO-YEAR DEALS

26-31 YEARS OLD
Re-signed: 1.64, 2.48 (N=9)
Newly signed: -0.34, -0.18 (N=5)

32-34 YEARS OLD
Re-signed: 0.37, 0.53 (N=12)
Newly signed: 0.28, 0.33 (N=11)

35-37 YEARS OLD
Re-signed: 0.80, 1.31 (N=13)
Newly signed: 0.73, 0.44 (N=7)

38-47 YEARS OLD
Re-signed: 1.00, 1.80 (N=5)
Newly signed: 1.16, 0.44 (N=9)

THREE-YEAR DEALS

27-30 YEARS OLD
Re-signed: 1.99, 2.11, 1.59 (N=9)
Newly signed: 1.13, 1.43, 0.73 (N=12)

31-34 YEARS OLD
Re-signed: 4.13, 3.73, 3.68 (N=4)
Newly signed: 1.17, 1.35, 0.29 (N=15)

35-38 YEARS OLD
Re-signed: 0.40, 0.31, 0.29 (N=2)
Newly signed: 1.96, 0.49, 0.48 (N=3)

FOUR-YEAR DEALS

27-30 YEARS OLD
Re-signed: 1.47, 4.17, 2.87, 1.03 (N=3)
Newly signed: 1.50, 1.85, 1.77, 1.57 (N=6)

31-34 YEARS OLD
Re-signed: 2.27, 2.03, 2.53, 2.17 (N=3)
Newly signed: 3.89, 2.87, 2.23, 2.86 (N=7)

So, all in all, it doesn't seem like there is any real pattern to the overall quality of the players signed. The sample sizes obviously get pretty small, but on the aggregate, they were statistically significant, and nearly every one of the individual samples seems to point in the same direction. I'll continue to look for different ways to splice this data set and please make suggestions if you have any. Thanks for these comments.

 
Matt Swartz
(24824)
Comment rating: -

There is also a negative linear term in there for regular strikeouts that will always dominate. You would need to have over 100% strikeouts to have the positive squared term have a bigger effect than the negative linear term. Think of it this way: ERAs can't go below 0. The positive squared term slows the run prevention down when there weren't many runners on in the first place.

May 09, 2010 5:15 PM on Part 5
 
Matt Swartz
(24824)
Comment rating: -

Oops-- the $24.8 is right, but I typed in $34.1 for MORP instead of $38.1. The formula for Doc would be ($4.9MM/WARP)*(7.7 WARP) + league min 400K = $38.1.
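As a quick check of that arithmetic, here is a minimal sketch of the formula exactly as it is written in this comment. The function and constant names are made up for illustration; the $4.9MM per WARP rate and the $400K league minimum are the figures quoted above, and this is only the linear formula from the comment, not necessarily the full MORP model.

```python
# Figures quoted in the comment, expressed in millions of dollars ($MM).
DOLLARS_PER_WARP_MM = 4.9   # $4.9MM per WARP
LEAGUE_MIN_MM = 0.4         # league minimum salary, $400K


def morp_mm(warp):
    """Hypothetical helper: MORP in $MM using the linear formula from the comment."""
    return DOLLARS_PER_WARP_MM * warp + LEAGUE_MIN_MM


def net_value_mm(warp, salary_mm):
    """Hypothetical helper: MORP minus what the player cost, in $MM."""
    return morp_mm(warp) - salary_mm


# Doc's 7.7 WARP season from the comment.
print(round(morp_mm(7.7), 1))  # 38.1
```

The same two helpers reproduce the 1.3 WARP3 example elsewhere in these comments: a MORP of about $6.8MM, and a net value of about -$7.5MM against a $14.3MM cost.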

 
Matt Swartz
(24824)
Comment rating: -

Actually-- it wasn't $1.88 billion for free agents. It was $1.69 billion for 383 wins, plus the draft pick compensation, which made it $1.63 billion for 337 wins (subtracted out bonuses and the discounted win values of the draft picks last). That would leave about $810MM for arb-level players, so that would be almost exactly $2.0MM/win for them.
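The revised $/win figure can be reproduced with a few lines. This is only a sketch of the arithmetic: the $1.69 billion comes from this comment, while the roughly $2.5 billion of league salary above the minimum and the 403 WARP3 for arb-level players are the approximate figures from the comment being corrected, which appears elsewhere on this page. The variable names are made up for illustration.

```python
# All dollar figures in millions ($MM); these are rough figures from the comments.
total_above_min_mm = 2500   # league salary above the league minimum (~$2.5 billion)
free_agent_mm = 1690        # corrected free agent spending ($1.69 billion)
arb_level_warp3 = 403       # WARP3 produced by arb-level players

# Whatever is not spent on free agents above the minimum goes to arb-level players.
arb_level_mm = total_above_min_mm - free_agent_mm
arb_dollars_per_win = arb_level_mm / arb_level_warp3

print(arb_level_mm)                   # 810
print(round(arb_dollars_per_win, 1))  # 2.0
```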

 
Matt Swartz
(24824)
Comment rating: -

Blake was acquired in July 2008 at the end of his contract with the Indians. He was granted free agency after the year, but re-signed with the Dodgers. I agree the trade market could shine some light on how players are valued, but that is not something that could be modeled in the free agent market. I've seen some models of trade valuation, but there is room for improvement. Mid-season trades have different methods of valuation because MORP isn't linear through the year. Wins are more valuable at the beginning to teams who don't find themselves at pennant races in the end, and more valuable at the end to teams who find themselves in pennant races in the end. If teams knew how close their races would be, MORP would be uniform through the season, but since they don't, it's not. Trade value gets really complicated for that reason.

 
Matt Swartz
(24824)
Comment rating: -

I doubt he's doctoring the ball. I'm pretty sure that he could make the batters miss sometimes if he did-- he gets swinging strikes at about half the league pace. The ball just seems to be going right at the Nats' fielders no matter how well hitters square it up.

May 07, 2010 11:03 AM on Livan La Vida Loca
 
Matt Swartz
(24824)
Comment rating: -

Well, that's mostly because he got paid $14MM less. Rivera had quite a hole to cover there. I think Rivera outperformed his contract last year, but not as much as Counsell. The reason Counsell comes out looking so good is that he was a borderline replacement level player who suddenly had his SLG jump 100 points while his AVG spiked 60 points. When you get an above average middle infielder for free, that's really good. Rivera was paid like an elite closer and performed like a super-elite closer.

May 07, 2010 11:01 AM on Most Net Valuable Player
 
Matt Swartz
(24824)
Comment rating: -

I really like this idea. I'm going to see if I can put something together. If it turns out that it's just a bunch of obvious names, maybe I'll make it an unfiltered post or something. Thanks!

May 07, 2010 10:58 AM on Most Net Valuable Player
 
Matt Swartz
(24824)
Comment rating: -

Ah, I can clarify then-- The salary cap would be to make players affordable for small-market teams. Small market teams clearly have an avenue to cheap talent already, though.

 
Matt Swartz
(24824)
Comment rating: -

The WARP3 numbers are somewhat of a methodological difference from fWAR. WARP3 has a lower replacement level than fWAR (~1250 WARP3 vs. 1000 fWAR, 49-win vs. 40-win replacement level). That's probably a judgment call in that range. I'm not sure how big the replacement level error bars should be. As far as the free agents, I don't have my numbers in front of me but I'm pretty sure it was 383 WARP3 for about $1.88 billion, but that could be +/- .03 billion. The net (MORP-Salary) sum total was 0 by definition. Using my service-time contracts and wins articles, I can see it was about 403 WARP3 for arb-level guys, and min-salary guys had 445 WARP3. I think the total salary compensation for the league was something like $2.8 billion, which would be $2.5 billion above the league minimum. So it's about $630MM for arb-level players above the league minimum; so about $1.6MM/win. There are three main reasons MORP might seem high. 1) Draft pick compensation-- it was 12% of all free agent compensation in 2009, I think; 16% and 15% for '07 and '08, if I remember correctly. That number is based on discounting future wins at 8% per year, so it could be higher or lower depending on that. 2) Using all years of a contract, rather than just the first years. The market moves throughout the off-season and before. No one knows the exact marginal revenue, but they try to pin it years in advance, not just the off-season in advance. So when a two-year contract is signed in Feb '08, it matters for average salaries in '09 even if it wasn't in the 5-month window before '09. I had an analogy to BLS employment numbers in my previous article, and how you can't look at wages only of new job-finders to find average wage levels. 3) Using production vs. projection. Projected free agent production is about 15% higher than actual production, according to PECOTA and CHONE (CHONE is according to Sean Smith/Rally, I only did the PECOTA test and confirmed the same bias). 
I had an article a couple weeks ago about how players who reach free agency are a biased sample of all players, so they shouldn't really be up to par with computer projections that don't know the information contained in the economic decisions of their teams.

 
Matt Swartz
(24824)
Comment rating: -

No, not at all. A lesson from this would be that the National League teams have been able to avoid paying for better talent in the draft by mimicking each other's behavior, and have managed to win their share of World Series anyway. Unless it is hurting their bottom line that fans in their cities know the American League competition is superior, the lesson very well might be that teams could do well to simply draft only the more affordable players and only change that strategy if teams in their division were doing anything different. If paying more for these draftees only encourages your rivals to do the same, it might not be beneficial to do so. Of course, if the talent level got so much worse that NL teams could not win World Series, that would be a different story.

 
Matt Swartz
(24824)
Comment rating: -

Thanks. I found Steve's articles. Very interesting stuff-- the difference in methodology (other than WARP vs. WSAB) is that he looks at what minor league systems developed the players and I looked at who drafted the players. I guess this somewhat reflects something about my thinking at the time that I had not realized-- I was basically looking for who identified talent well enough as a springboard for who helped themselves the most from that starting point, while Steve was clearly asking who was capable of developing talent, rather than identifying it. I wouldn't have expected the results to come out so different, but they really did for 2007, which was the only season of overlap (he did 1946-2007, I only did 2007-2009). The teams are more or less in the same places, but with about half a dozen teams swapped in different orders. The NL East totally changed shape compared to what I did. Interestingly, the Mariners were not the dominant superpower in "value production" as they were with "no turnover." There must have been a ton of players who were drafted by the Mariners but traded away before being developed. This could be because the Mariners aren't given full credit for David Ortiz in Steve's work, and Ichiro clearly doesn't count based on his methodology. I'm guessing there are other players I can't dig up right now that might be causing this big difference. It's interesting to look at the differences in outcomes for a subtle change in what question is being asked. Thanks for pointing out this article to me.

 
Matt Swartz
(24824)
Comment rating: -

The claims that the economy is getting worse are not really accurate. The economy is growing again (relative to where it was several months ago). Unemployment lags the economy by about six months typically. Firms don't start making hiring decisions right away. So, much of the awful labor market is a result of how bad the economy still was six months ago. It's not a consolation for anyone out of work, but it does mean that the labor market should improve. Next, there are a lot of jobs to make up. Unemployment is high in large part because of the cumulative number of jobs that have been lost. The other thing is that March finally saw significant job gains, which is good, but also very well might make the unemployment "rate" look artificially high as people who were not even looking for work now start looking, which makes them count as unemployed instead of out of the labor market altogether. The recession is over, meaning that the economy has stopped getting worse and is growing again, but because it has a lot of catching up to do and because there is a major lag in labor markets, a lot of people don't feel it yet. Personal consumption expenditures actually don't lag the economy like the labor market does, and personal consumption expenditures are increasing in recent months and are higher than last year. This would certainly indicate the demand for baseball tickets might be increasing right now. That does not mean April attendance would rise-- something Neil explains well a few comments up the page-- but it should indicate future attendance would go up. The problem, of course, is that attendance is really just quantity of tickets purchased. As a number of readers have pointed out above, the question of prices is another matter, and teams may be raising prices enough that the revenue may be higher even as the attendance might be lower.
What might actually be a major issue is something Shawn has written about a few times recently-- the strongest substitutes for watching a baseball game in the stadium are various forms of media. If those substitutes are becoming superior, this makes the demand for tickets go down even as the demand for baseball goes up.

 
Matt Swartz
(24824)
Comment rating: -

The slotting undoubtedly makes it easier to coordinate, but it is perfectly reasonable that coordination could occur without any signalling by the league. Teams could simply observe each other's behavior and react accordingly. That is exactly how teams were able to suppress salaries during free agency when players also could not select their teams. In fact, it does not seem like a coincidence that this type of coordination between teams would occur around the time that free agent salaries really began to escalate. That would probably strengthen the case if I had to guess. Thanks for the comment. It gives me a much fuller understanding of the situation. I'll definitely look for Treder's work on this topic.

 
Matt Swartz
(24824)
Comment rating: -

That could be the case. It could also be that they simply find it worthwhile to outbid NL teams for some free agents and are willing to pay more. I would guess that they are just slightly less efficient in recent years by chance, but it could be more complicated than that.

 
Matt Swartz
(24824)
Comment rating: -

I will try to track down this research. Thank you for pointing it out to me.

 
Matt Swartz
(24824)
Comment rating: -

Thanks! I really appreciate this, and I'm glad you enjoyed the article so much.

 
Matt Swartz
(24824)
Comment rating: -

A big part of the question is whether those guys would be available at the same price. I don't think they will, but I'm guessing that the price difference would be worth the difference in production. I could be underestimating salary growth, like many of us did during the last economic expansion.

Apr 28, 2010 11:39 AM on The Howard Extension
 
Matt Swartz
(24824)
Comment rating: -

You said that "the Phillies are competent" was not a given. That's almost calling him incompetent, at least. I guess you're probably right that it wasn't exactly the same comment, but it's how I read it. I also didn't know who Abner Doon was, to be honest, but a quick google search seems to imply he's not a bright guy, right? Regardless, I think we agree that Amaro can't be given the benefit of the doubt automatically, but perhaps we also could agree that it's worth considering he is smarter than his moves make him seem at first blush. You are correct that trading Lee was not part of trading for Halladay. They just happened concurrently. Polanco was already signed at that point, so really I meant that they could have just non-tendered Joe Blanton and Chad Durbin, and not signed Baez and Contreras. Those guys cost about $14MM this year. My point was just that was totally okay. I am also skeptical that he necessarily knew what he was doing with Ibanez, but it certainly seems plausible that he did. I also agree that Amaro inherited a stockpile of resources, but keep in mind that he was seemingly a very involved Assistant GM who may be responsible for such good fortune during the Gillick years. Again, just asking for some hesitation at blasting Amaro. I've gotten burned on doing so a few times.

Apr 28, 2010 10:41 AM on The Howard Extension
 
Matt Swartz
(24824)
Comment rating: -

Agreed. I'm not saying that they were actually good moves given the information he had. I said it's certainly plausible. Ibanez isn't as slow as other players who are 15 runs below average at defense, and the Phillies seem to be very good at positioning their defense well. I suspect that it was a reach that he would improve 15 runs, and we'll probably see a regression, but it's possible that they thought he had coachable flaws given how poor his routes or positioning were or something like that. However, I do think they knew Burrell was going to tank. He had injury problems that he played through for the last several years. I think that the sheer wrongness of my blasting the decision not to offer Burrell arbitration was rather humbling for me. However, I agree with you the jury is still out and there are major question marks surrounding many of the moves.

Apr 28, 2010 10:32 AM on The Howard Extension
 
Matt Swartz
(24824)
Comment rating: -

I'm as skeptical as the next phan, but this statement is just ridiculously unfounded. His major moves include not offering Burrell arbitration, which led to a massive attack by a million analysts, myself among the loudest, after which Burrell got WAY less than he would have in arbitration and played at replacement level afterwards. He signed Ibanez to replace him, a move that was lampooned, right before Ibanez's defense appears to have improved by like 15 runs and during which Ibanez produced about 2/3 of his contract value during the first year of the deal. Even a disastrous 2nd and 3rd year would leave the contract looking justifiable ex post. He signed Moyer, which seems bad but not disastrous considering he's only really paying him to provide 1-1.5 wins. He brought in Chan Ho Park when no one else would and squeezed a ridiculously good year of relief out of him (albeit after a failed attempt to use him as a starter). He also signed a large number of players to multi-year deals during the 2008-09 offseason, well below what similar players were getting paid, likely taking advantage of the good will towards the team that winning a WS brings. Then he cut ties with sunk costs like Geoff Jenkins. In the middle of the year, he leveraged the Jays and Indians until one of them gave him their ace at far below the typical trade value for players of that stature. This past offseason, he signed Polanco (an average player) to a contract that you would typically pay a below average player, signed Blanton to a three-year deal below his market value, and made a controversial trade of Lee and prospects for Halladay and prospects that looks questionable, but keep in mind that we haven't actually seen what happens with those prospects yet. Then he signed Halladay to a below market extension. Does he do everything right? No way. But to call his moves incompetent or idiotic, or to say he's made far more poor moves than good ones, is just untrue.
The jury is still out, and he's proven me wrong a couple of times already in my skepticism. I have an open mind for sure.

Apr 28, 2010 9:15 AM on The Howard Extension
 
Matt Swartz
(24824)
Comment rating: -

Great questions. Thank you for these. I think that I have taken these into account implicitly, but they hit at the heart of the economics of baseball. I'll do my best to answer them well, but please ask me to clarify if I am unclear. The marginal revenue of a win is highest for teams on the playoff bubble. Of course, there is no way to know exactly how many games you and your division rivals will win. One win for the 2007 Mets would be worth tens of millions of dollars in expectation, and one win for the 2007 Phillies would be worth only a few hundred thousand in all likelihood. Since the free agent market is actually an auction, the prices reflect the second highest bidder's valuation of the player's services. Mostly, this would mean that they reflect the value of a player's services to a team on the bubble. This actually means that 1 win is worth itself in the standings (ignoring playoff implications) PLUS a 5% increase in the probability of making the playoffs. A 4-win player represents four wins in the standings plus about a 20% increase in the probability of making the playoffs. Thus, this is pretty much exactly what is being paid for by teams. So the price of a win is basically a competitive team's estimate of the long-term revenue implications of one win in the standings and 5% higher playoff odds (and naturally, all the variance in possible outcomes upon making the playoffs). Thus, Howard's contract takes into account the possibility that if he takes a Phillies team that would otherwise be three games back in the division and makes them win their division by one, and they happen to win the World Series that year, then he will have been worth like $100MM (to throw out a number) to them. It takes into account the possibility that he takes a team that would have won the division by one game and makes them win by five games in a year that they lose in the first round, and thus makes the team $1MM or $2MM. It takes into account all of these things.
The only thing it really assumes is that the Phillies are pretty much competitive going into each season for the contract. Now, if the Phillies turn into a bad team, the contract will be a bad one if they are stuck with him. Of course, they can trade Howard at that point. They would likely have to eat money, but that's fine, because back-loaded deals effectively entail paying for early-years' performance later on, so eating money in a trade would be exactly that. It also takes into account the variety of ways that Howard's performance may change so that if he turns into the best player in baseball or replacement level, trades in all scenarios are priced in.

 
Matt Swartz
(24824)
Comment rating: -

Well, he was worth about 4.5 WARP on average the last three years. For the sake of picking numbers, I projected he'll start the deal a 4.0-win player and end it a 2.8-win player, so maybe 17 wins over that time period. If I had to guess, he'll probably end up being worth only $110MM over that period after taking into account the two picks he cost the team. I would put huge error bars on all of that, so please keep in mind that I'm just picking for fun.
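For anyone who wants to see where "maybe 17 wins" comes from, here is a rough sketch. It assumes a five-year deal and an even, straight-line decline from 4.0 to 2.8 wins; the comment itself gives only the two endpoints and the approximate total, so both the contract length and the shape of the decline are assumptions here, and the function name is made up for illustration.

```python
def linear_decline(start_warp, end_warp, years):
    """Evenly spaced yearly WARP values from start to end (an assumed shape)."""
    step = (end_warp - start_warp) / (years - 1)
    return [start_warp + step * i for i in range(years)]


# Assumption: five seasons, declining in a straight line from 4.0 to 2.8 WARP.
yearly_warp = linear_decline(4.0, 2.8, 5)
total_warp = sum(yearly_warp)

print([round(w, 1) for w in yearly_warp])  # [4.0, 3.7, 3.4, 3.1, 2.8]
print(round(total_warp, 1))                # 17.0
```

A different aging path between the same endpoints would move the total a little in either direction, which is part of why the error bars are so wide.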

 
Matt Swartz
(24824)
Comment rating: -

Firstly, Adrian Gonzalez is only 2.5 years younger than Howard. Secondly, I'd bet he gets more than $25MM/year. He would get $25MM/year if he went on the market right now, but I'm guessing he breaks $30MM/year. Both teams gave up leverage. The myth that future price uncertainty isn't already included in current prices continues.

 
Matt Swartz
(24824)
Comment rating: -

That's a pretty bad mis-characterization of my Werth section. As I said, the reason I believe it's not a tradeoff is that they are more likely to make money if they spend at the playoff bubble rather than intentionally falling short of it. As to the Cliff Lee claim, that was clearly not just to save money. They spent far more than his $9MM in salary in subsequent moves after the trade. That trade was about the value they placed on Seattle's prospects, guys that their own Assistant General Manager drafted when he was Assistant General Manager of the Mariners. The Phillies decided on a budget based on what they felt gave them a competitive team, and they chose to trade Lee and spend the money elsewhere because they thought the prospects were worth the difference between Lee's production and the combination of Blanton's, Durbin's, Contreras' and Baez's production. As I wrote at the time, that was a questionable move, but subsequent actions clearly revealed that it was not just to save the $9MM. Otherwise they wouldn't have spent more than that weeks later.

 
Matt Swartz
(24824)
Comment rating: -

I can't remember who wrote what. I just remember a lot of comments questioning the timing and asking how well second basemen age. Sorry.

 
Matt Swartz
(24824)
Comment rating: -

Every time a new huge and long deal comes out, people look at the contract and assume there will be little inflation and determine it's a disaster. With the exception of the last couple years, this is overwhelmingly a mistake when it is done. Assuming a lack of salary growth in a rapidly expanding industry seems far more likely to be a presumptuous assumption. I hardly think it's fair to say I'm taking MORP too seriously or misapplying it considering I only introduced it five days ago. I have a pretty good sense of what it represents, which is the opportunity cost of a win on average. That seems to me like an excellent way of assessing whether a deal was smart or not. This article took a lot of time to write, and hardly represents some sort of contrarian cooking the books. I have been hammering the point about the differing aging curves when you compare re-signed players and newly signed players for weeks. It was my exact response to the Mauer signing as well, which also represents a player where there is a lot of inside information that the team has on the player's health. What I tried to do was look for what kinds of scenarios would make the deal worthwhile and what kinds of scenarios would not. The majority of the sabermetric community appears uninterested in considering the variance in performance and the variance in inflation, preferring to simply arrive at a sure answer. Instead, I attempted to look at a range of possibilities objectively without the dogmatic claim to predict the future with certainty simply because I have a weighted mean. I looked at the distribution instead of the weighted mean, and discovered that the contract was probably not a good one, but hardly the disaster that sloppy analysis suggests.

 
Matt Swartz
(24824)
Comment rating: -

The .261/.348/.529 is (I think) in games that were started by LHP, since Baseball-Reference.com has 3232 PA combined in "vs. LH starter" and "vs. RH starter", which is Howard's total PA. So that line appears to include him mashing RHRP after the LHSP had departed. I guess the implication is that you should play him in those games anyway, because you'll have used up a bench player in the 6th or 7th inning when you replace the RHB who started against the LHSP.

Apr 27, 2010 2:19 PM on The Howard Extension
 
Matt Swartz
(24824)
Comment rating: -

Oops, I responded to this post above as part of my response to the other post. But the point I made is that it's not a perfect correlation but the oppo-mashers on that chart did age better than the pull-mashers on that chart on average.

 
Matt Swartz
(24824)
Comment rating: -

I like that worse. That would probably hurt their ability to spend money in 2012 and 2013, while they could simply continue with back-loaded contracts to whoever they sign in 2014-16. Money is worth more now. By paying him more in the less productive end of his contract and less in the more productive end, the Phillies have effectively borrowed money from Ryan Howard.
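
To make the "borrowed money" point concrete, here is a small present-value sketch. The two payment schedules and the 5% discount rate are hypothetical numbers picked so that both schedules have the same nominal total; they are not the actual contract terms.

```python
def present_value(payments, rate):
    """Discount a list of end-of-year payments back to today."""
    return sum(p / (1 + rate) ** t for t, p in enumerate(payments, start=1))


# Hypothetical schedules in $MM with the same nominal total.
flat = [23.0] * 5
back_loaded = [20.0, 20.0, 25.0, 25.0, 25.0]
RATE = 0.05  # assumed annual discount rate

pv_flat = present_value(flat, RATE)
pv_back_loaded = present_value(back_loaded, RATE)
print(f"flat: {pv_flat:.2f}  back-loaded: {pv_back_loaded:.2f}")
```

With any positive discount rate, the back-loaded schedule costs the team less in today's dollars than the flat one, which is the sense in which the team is borrowing from the player.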

 
Matt Swartz
(24824)
Comment rating: -

I'm not cherry-picking. Look through the list of players and their aging curves. It's not a perfect match, but the more extreme pull hitters declined more rapidly on average. Thome's WARP numbers would absolutely be satisfactory for Howard. Just 2005 looked bad. He was pretty sick in the other years. One bad year would not make the deal useless if he had a couple very good years or one spectacular year. There are a lot of paths that this could take. That's my point. I'm not saying it's a good deal. I'm painting the scenarios in which it works out well. There are a number of scenarios that aren't far-fetched.

 
Matt Swartz
(24824)
Comment rating: -

That would definitely change things, but I think the luxury tax threshold is $162MM in 2009, up from $155MM in 2008. That would be well beyond where the Phillies are headed.

 
Matt Swartz
(24824)
Comment rating: -

Agreed. If anyone is playing the "I know your business better than you" card here, it's not me.

 
Matt Swartz
(24824)
Comment rating: -

I have no inside information. I doubt Howard contributes too much to inflation. I think the net effect on the market is one less 1B, which creates scarcity, but one less team who needs a 1B, which negates that effect. I think it's a clue into what at least one team is thinking the market will look like.

 
Matt Swartz
(24824)
Comment rating: -

I'm not quite sure what Black-Scholes would have added. It's been a decade since I actually did Black-Scholes, so forgive me if I'm misremembering, but isn't what I did equivalent to Black-Scholes with no time lag and a uniform distribution instead of a normal one? I doubt the added value would be worthwhile, given that these are only approximations anyway.

 
Matt Swartz
(24824)
Comment rating: -

I discussed this more in the MORP series and it took some time so I won't go through the details. I largely based my estimates on a model developed by Sky Andrecheck at BaseballAnalysts.com. It is linked in the previous article. He does a terrific job of approximating career production by draft picks, and I simply rescaled it to look only at the first six years of contracts, arbitration money, and rescaling to fit WARP3 instead of WAR. There was a good deal of guessing that had to be done to determine which deals came at the cost of picks, but I was able to infer the answer pretty clearly from the outcomes in most cases. There were some judgment calls, but not major enough to change the $/WARP figure much.

 
Matt Swartz
(24824)
Comment rating: -

Did you read the article or did you just show up to hate? As that section explains in detail, it's a sign that the Phillies are planning on spending money during that time period. That doesn't even mean it's smart for them at all. I like it when Starbucks buys comfortable chairs, as their customer, but that's separate from whether I find it wise for them to do so.

 
Matt Swartz
(24824)
Comment rating: -

I'm not interested in the personal attack, so I'll ignore that part of the comment. As to the content of your post, I don't even know what to respond if you don't think that teams think winning and revenue are correlated, and don't consider whether they think they can be competitive when they set a budget. It's so far divorced from any anecdotal evidence about how owners of baseball teams spend money at all that it only reveals an inability to see budgets being part of any larger picture at all.

 
Matt Swartz
(24824)
Comment rating: -

"Budgets" come about because firms first determine the cost function for different levels of revenue, and then select the level of costs that maximizes profits. Do they sit there with cost curves and their Econ 101 book? Of course not. But they behave in such a way that mimics this process. This is a very important fact about economic decision-making. The best analogy I've ever heard is to consider a billiards player. What they are actually doing in determining how to hit the ball is solving a ridiculously complicated series of differential equations. But they don't pull out Matlab and a Physics professor every time. They simply have arrived at a process of reaching the same solution by learning over time how to succeed at billiards. Firms behave in a similar way. They don't take derivatives to solve every problem, but they seem to follow these principles. Owners typically know that the profits of an 81-win team and the profits of a 71-win team are similar, but the profits of a 91-win team are much higher than that of an 81-win team. Thus, when they feel they can only become a .500 team, they rarely spend money even in large markets like Washington. On the other hand, even smaller markets like Minnesota will spend money when they are on the sweet spot of the revenue curve. They do this by telling General Managers what their budget should be. The result is that people think only in terms of the budgets handed down to the GMs and think this is all that's going on.

 
Matt Swartz
(24824)
Comment rating: -

It's a mixture of all of the above. But there is a lot of revenue that comes in the form of merchandise, subsequent years' ticket sales, and TV contracts as a result.

 
Matt Swartz
(24824)
Comment rating: -

1) It does depend on the team that would have signed him. It could really be anywhere from about $18MM-$27MM depending on who signs him. My guess is that it would probably be a late first round pick, so $23MM seems about right. 2) If the Phillies don't pick up his 2017 option, it means that they didn't want to pay him $23MM; if they offer him arbitration, it means they will have to give him a raise on his $25MM. If the Phillies do pick up his 2017 option, they have essentially valued him at least as much as $13MM for 2017 ($23MM salary minus $10MM buyout), but offering him arbitration would require a raise on the $23MM figure, so it's also unlikely. 3) The cost of paying for lost wins in the form of dollars and picks is what I am counting here. I think we agree with each other on this point-- I determined the dollar cost of those picks and included that in the price.

 
Matt Swartz
(24824)
Comment rating: -

Aging curves vary wildly from player to player. I think an analytic argument of aging curves should look at the selection bias of aging curves for re-signed players versus newly signed players when they are so dramatically different. The crux of my argument is whether the Phillies are right or wrong about his aging curve, and I give reason to believe that ignoring the fact that they watch him stretch and work out every day is dismissing very important qualitative information that is clearly correlated with quantitative trends.

 
Matt Swartz
(24824)
Comment rating: -

As I wrote above, splits are fleeting. I'm not sure if you watched Howard last night but his "0 for 3" included a 400 foot fly ball to dead CF. The "late inning reliever" argument is overplayed, since teams only have so many LOOGYs. Greg Dobbs and Matt Stairs got a lot of PA vs. RHP in recent years because of this type of maneuvering. The "get one reliever and you're in the clear" argument is just not going to work. He still has a very high SLG vs. LHP, which is important in high leverage situations.

 
Matt Swartz
(24824)
Comment rating: -

Pujols is going to get so much more than this that it's not even funny. If the economy improves and baseball keeps growing like it has been, I could absolutely see $40MM/year. I think the effect of one deal on the other is minimal, however. The opportunity cost of signing Pujols remains filling your team with a couple of 3-4 win players each at half his cost. Pujols' salary will depend on what teams are willing to pay him. Howard going off the market doesn't change what teams are willing to pay him.

 
Matt Swartz
(24824)
Comment rating: -

Thanks. I don't know if hitting more HR to the opposite field is the reason. The best case I've made in the past is that hitters who are pull-happy rely on reflexes to hit HR, while hitters who are oppo-happy rely on strength. Reflexes decline more rapidly than strength, I think. It's just a trend I noticed. The larger factor, like you mention, is health. The best argument for Howard aging well has got to be that the Phillies know his health better than anyone, and they have one of the best medical staffs in the league (they won last year's Dick Martin Award from BP).

 
Matt Swartz
(24824)
Comment rating: -

I agree that the market will depend on scarcity. The market always depends on scarcity. The reason that I unraveled a series of possible inflation alternatives is precisely because I agree with you that the market conditions are not set in stone and a number of possibilities should be considered.

 
Matt Swartz
(24824)
Comment rating: -

My article used OPS to analyze his performance against LHP, which is already a quick and dirty measure. Your response is to use batting average? Ryan Howard is a low OBP, high SLG hitter against LHP historically. He's done better in his career against LHP than the average LHB. And most importantly, SPLITS STATS NEED TO BE REGRESSED TO THE MEAN IF YOU WANT TO PREDICT FUTURE PERFORMANCE. There's been massive amounts of research done on this topic. Also, if you want to get into "HR vs. LHP" argument, at least look at the rest of the league. Howard LED THE NATIONAL LEAGUE IN HR VS. LHP in both 2006 and 2007, and was third in 2008. It's a weak statistic, but at least compare him to other people. Nobody gets 650 PA against LHP.
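
For readers unfamiliar with what regressing a split to the mean looks like in practice, here is a minimal sketch using the common weighted-average form. Every number below (the observed OPS, the PA count, the comparison mean, and the `stabilizer` constant) is a made-up illustration, not a figure from the article or from Howard's record.

```python
def regress_to_mean(observed, sample_size, population_mean, stabilizer):
    """Shrink an observed rate toward the population mean.

    `stabilizer` is the sample size at which the observation and the
    population mean get equal weight; its value here is an assumption.
    """
    weight = sample_size / (sample_size + stabilizer)
    return weight * observed + (1 - weight) * population_mean


# Hypothetical inputs, for illustration only.
observed_ops_vs_lhp = 0.750
pa_vs_lhp = 900
mean_ops_lhb_vs_lhp = 0.700
stabilizer_pa = 1000

estimate = regress_to_mean(observed_ops_vs_lhp, pa_vs_lhp,
                           mean_ops_lhb_vs_lhp, stabilizer_pa)
print(round(estimate, 3))
```

The smaller the sample against LHP, the more the projection leans on the population mean rather than on the raw split.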

 
Matt Swartz
(24824)
Comment rating: -

The "how much talent is in the FA class" argument is somewhat misguided as well. What is lost in that analysis is that players on the FA market are removed from their old teams. Yes, that year's FA class will potentially also include Albert Pujols, Prince Fielder, and Adrian Gonzalez, but the Cardinals, Brewers, and Padres (or whoever Gonzalez is playing for at the time) will now all be without a first baseman in these scenarios. I'm not saying that the Phillies are forced to sign him. I'm saying that the Phillies have the alternative of paying approximately the same price for wins as everyone else will at that time, so if they plan on being competitive, Howard may represent as good an option as any at the time. Under higher salary inflation, it could turn out to be an efficient way for the Phillies to get their wins. Anything near ten percent inflation with a modest rate of decline for Howard will make the cost of Howard's wins added lower than the alternatives on the market.

 
Matt Swartz
(24824)
Comment rating: -

Market scarcity is already priced into MORP. The whole point of market values is that they reflect scarcity. MORP says the current cost of a win on the free agency market is $5MM including the opportunity cost of draft pick compensation. It could stagnate, making this deal a disaster, or it could inflate, making it decent, but scarcity is not a new concept that needs to be introduced. It's the basis of market values. The Phillies don't use advanced stats to evaluate players as near as I can tell. They seem to be vaguely aware of them, but they are not using them in their analysis. Somehow, the groupthink on this has led to the conclusion that "if the Phillies don't use new school stats, they must focus on old school stats." That's ridiculous. The Phillies are a team heavily invested in scouting and health, which is the only reason that they've kept their heads above water while much of the league does better data analysis. I do not see any evidence this was based on RBI. I see evidence that the Phillies were focused on his workout routine and improved defense, and determined that he was unlikely to age poorly. If he maintains his value throughout this contract, it's a good deal. The question is about how much he declines, not about the stats which will decline. Also, there is a HUGE fallacy in the "wait for the market to develop" argument which I cannot believe has gotten this far. The market could go in EITHER direction. If the market gets better, Howard requires more money to re-sign; if the market gets worse, he requires less. This is about taking on a risk, which is generally smart to do in baseball given the shape of the marginal revenue curve. Put it this way-- if "waiting for the market to develop" was always a good idea for teams to do, it would always be a bad idea for players to do. Scott Boras would be in the poor house. His off-season strategy is tied up with hoping the market gets more expensive over the course of the off-season.
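
A rough sketch of how much the stagnate-versus-inflate question matters, starting from the $5MM-per-win figure above. The win path, the two-year gap before the deal starts, and the inflation rates tried are all assumptions made for the example, and `market_value` is a hypothetical helper, not part of MORP.

```python
def market_value(win_path, price_per_win_now, inflation, years_until_start):
    """Market cost (in $MM) of buying `win_path` at an inflating price per win."""
    total = 0.0
    for i, wins in enumerate(win_path):
        price = price_per_win_now * (1 + inflation) ** (years_until_start + i)
        total += wins * price
    return total


WIN_PATH = [4.0, 3.7, 3.4, 3.1, 2.8]  # assumed decline, not from this comment
PRICE_NOW = 5.0                       # $MM per win, the MORP figure cited above
YEARS_UNTIL_START = 2                 # assumed gap before the deal begins

values = {g: market_value(WIN_PATH, PRICE_NOW, g, YEARS_UNTIL_START)
          for g in (0.0, 0.05, 0.10)}
for g, v in values.items():
    print(f"inflation {g:.0%}: ${v:.1f}MM")
```

The same projected wins are worth very different amounts depending on the inflation scenario, which is why a range of scenarios is more informative than a single point estimate.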

 
Matt Swartz
(24824)
Comment rating: -

Yes, the opportunity cost of letting Howard go in 2011 is one of my main arguments as to why I think the value was higher than I'd have paid. If the next CBA eliminates this compensation before the 2012 draft, I think that the deal would look much better and probably wouldn't be over market value at all. Utley is around until 2013, and the Phillies will probably extend him after that. I think it's fairly reasonable that Howard will be able to assume the Phillies will put up some OBP in the 3-hole.

 
Matt Swartz
(24824)
Comment rating: -

I think you are mixing up the definition of positive analysis. Positive = "what is", Normative = "what should be". I think you're suggesting I do a normative analysis and implying this is a positive analysis, but please tell me if I'm misinterpreting. This is a study about opportunity cost. In that sense, positive analysis is somewhat normative. What I am saying is "here is what teams are paying for wins" and therefore "good deals likely involve paying less than this." The reality is that without a team's actual books and projections of attendance and revenue under various scenarios, I cannot determine whether teams should be spending money on players in the way they do. All I can say is what the opportunity cost of a win is around the league, and therefore I can determine the point at which teams should choose to spend their money elsewhere.

 
Matt Swartz
(24824)
Comment rating: -

PECOTA is a reference point. Never ever take PECOTA as the end of the story. If you look at the aging tables above, you will see that re-signed players do not age as PECOTA predicts, while players who sign contracts with new teams do exhibit this pattern. Additionally, I discussed the comparables that formed that projection, explaining why Howard's non-PECOTA characteristics have more in common with his more gracefully aging comparables. The fact that Howard hits better with runners on is something that is common among the majority of players whom defenses shift against: http://www.thegoodphight.com/2009/1/29/741980/there-is-clutch-or-the-cas I wrote that article last year detailing this process. Howard hits more HR on the road, for one thing, and the stadium isn't going anywhere before 2017 so he still gets to reap its benefits anyway. Joe Mauer was probably a great deal, though it really depends on his health. Still, it will probably look better than Howard's contract. Alex Rodriguez's will probably look worse if you check his PECOTA, but I'm guessing he will age more gracefully than that as well. Put it this way: Mauer was a bargain, but that wasn't the opportunity cost of signing Howard. Mauer was off the market. Other deals for this period of time are not yet signed. I don't think the contract is a great one, and I did say in the article it was a little high. I would have come in closer to $110MM or so. Still, just talking about PECOTA when I already looked at the PECOTA and decided where it was off is going backwards. Say why you think Howard will age like Cecil Fielder rather than Jim Thome. That would be a more compelling case.

 
Matt Swartz
(24824)
Comment rating: -

This is a good idea. I definitely might do it in the near future. I would need to think about methodology a little bit, but I think this would be a good thing to do. Thanks.

 
Matt Swartz
(24824)
Comment rating: -

This is an approximation. The average draft pick that makes the majors debuts three years after being drafted. They play six years before reaching free agency, so the average WARP accrues about six years later. Changing it to 1/6 of the WARP for 4, 5, 6, 7, 8, and 9 years later probably won't change things much since 8% was a pretty noisy estimate in the first place. It's not precise because it's such a judgment call anyway. The goal was to get a more reasonable estimate of the opportunity cost of surrendering or forgoing a draft pick.
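
A quick check of the claim that spreading the WARP over years 4 through 9 barely changes the answer, assuming the 8% figure is an annual discount rate (my reading; the comment does not spell this out). `discount_factor` is a name made up for the example.

```python
def discount_factor(rate, years):
    """Value today of one unit received `years` from now."""
    return 1 / (1 + rate) ** years


RATE = 0.08  # assumed to be an annual discount rate

# All WARP treated as arriving six years after the draft.
lump = discount_factor(RATE, 6)

# One sixth of the WARP arriving in each of years 4 through 9.
spread = sum(discount_factor(RATE, y) for y in range(4, 10)) / 6

print(f"lump at year 6: {lump:.3f}  spread over years 4-9: {spread:.3f}")
```

Under this assumption the two discount factors differ by only a few percent, small next to the noise in the 8% estimate itself.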

Apr 23, 2010 12:55 PM on Reintroducing MORP
 
Matt Swartz
(24824)
Comment rating: -

Yeah, you basically have it right there. It's true that pitchers do have control over ball-in-play outcomes, but the amount that they do control is relatively small and what they do control is correlated with strikeout, walk, and ground ball rates. Since SIERA is a regression analysis, it actually picks up on the portion of BABIP control that is correlated with those DIPS stats.

 
Matt Swartz
(24824)
Comment rating: -

I think you might want to go back and check my series on Expected BABIP that I wrote a few weeks ago. It isn't really about explaining away performance, but there is a lot of luck involved and it can be a clue. In Jay Bruce's case, his infield pop up rate just wasn't high enough to justify such a low BABIP last year. His line drive rate was low, but those things tend to bounce back pretty quickly. Expected BABIP sees him as a .278 BABIP caliber hitter, which is below average but not extremely so. He loses points for having a swing that produces a lot of balls in the air, but his ability to hit for power negates a lot of that problem so that he's not so far below average. The kind of BABIP he had last year is the kind that you'd expect from a guy who has an uppercut but no power, not one with power. Bruce hits the ball too hard. In Hamels case, BABIP provided a clue that I analyzed a lot more thoroughly where it was just clear people weren't hitting the ball harder off of him. They were swinging and missing as much, fouling as much, hitting grounders/liners/flies/popups as much, and hitting the ball out of the infield just as much. BABIP provided the clue in that instance. Of course, it can't be the only tool but in the case of Hamels and even Bruce it's a pretty great way to get to the root of what changed.

 
Matt Swartz
(24824)
Comment rating: -

You're more than willing to accept that people are using statistical samples to make inferences, but then you seem to want to assume that all people who do so are suffering from confirmation bias. I'm just saying that people making inferences from statistical samples is a relevant aspect here. Unless you're unwilling to admit that some people are not inherently biased before they meet other people, then you shouldn't have a problem accepting this. People don't need to have statistics degrees to be statistically discriminating. I'm just talking about a general trend. It's hard to believe people put zero weight on any experiences they have ever. You're also totally missing the point of free entry. It's not that there are an infinite number of companies that need to consider entering. It's just that there need to be no barriers to entry for any individual with a good idea. Very few industries are going to fail to meet this description. Baseball has a government sponsored monopoly and is therefore unique.

 
Matt Swartz
(24824)
Comment rating: -

I agree that employer prejudice is a weak claim, which is why I focused on customer discrimination and statistical discrimination as more plausible in the article. Collusion can occur far more easily though when the set of firms (teams) is fixed in size without free entry. That certainly played a role in pre-Jackie Robinson baseball segregation. It's just that implicit collusion is harder to pull off from a baseline of integration. Certainly Americans can be harder to sign because of other obligations, but they are also subject to a draft which holds their available bidders down. That last argument might not be strong enough, especially because the discussion is focused on offers to existing African American players in general, rather than teenaged two-sport athletes.

 
Matt Swartz
(24824)
Comment rating: -

You lost me with your first sentence. If you don't think rational behavior drives some of baseball economics after reading my columns, I just don't know what I can add in a comment that will change your mind. I think it's pretty obvious that the Indians had lower marginal values for CC, Lee, and VMart than their trading partners did and so there were gains from trade available where they had a higher relative benefit for future wins than present wins. Your example about Dye, Sheffield, and McCutchen isn't clear enough yet so you're going to need to elaborate. I don't think the existence of some players who risk saying no to contracts has anything to do with this issue. I'm also not getting the "quit whining" example. If Jermaine Dye were facing discrimination (which, again, I don't think is the case for him), that's millions of dollars that he doesn't get. That's enough for a generation of Dye to live off of, and so it's relevant to him. It might not seem like much when you have a lot of money until you realize that there is a generation of his family who will either be set for life or not based on a couple million dollars. That they like their jobs does not mean that they deserve to be ripped off, especially if it's inequitable.

 
Matt Swartz
(24824)
Comment rating: -

Again, individual players have a variety of factors influencing their contracts. A player could have a questionable ticking time bomb injury that shows up during physicals that makes his value inexplicably low. It's tough to say definitively in one case. Looking at larger samples is just better.

 
Matt Swartz
(24824)
Comment rating: -

I will probably try to come up with a good way to look at this empirically at some point, but it's a tricky concept so I'd need to solidify my approach to contracts in general first. It would be dangerous to try to find a one-to-one comparison like J.D. Drew because there are so many factors that go into an individual player's perceived value that you don't know when something else is going on. As far as the Branch Rickey aspect, I think that it was a risk that he was willing to take that others weren't. I think he probably thought it would be good for business, so in that sense, I'm somewhat in tune with the "market story" versus the "morality story." At the same time, it was obviously a risk that many owners would be on the fence about, and he went with the moral side and can be commended as such. Still, I think customer perception was a big part of the story, so I think Robinson's behavior was going to be crucial in any market story. I guess I'm saying that employer discrimination would be hard to maintain and it was only maintained as long as it was because of the lack of free entry, but that the customer discrimination persisted which is why even in 1947 only one owner took a risk on Robinson. I'm not sure there was enough information then for statistical discrimination to be playing a role. I think Robinson might have helped any statistical discrimination by being so calm and focused in the face of prejudice. If he created a bad stereotype, it would be harder for other players. Given the pressure he was under, it's tough to imagine many people being able to do so.

 
Matt Swartz
(24824)
Comment rating: -

So why is it unreasonable to think that teams recognized Bradley's .999 OPS was an outlier? J.D. Drew has performed up to his contract overall, hasn't he? I don't think this is a good comparison. I also again fail to see where I'm misinterpreting Jackie Robinson breaking the color barrier. You mentioned lobbying above, but that's not related to what I was discussing. Mentioning various other aspects of the negotiations was not relevant to get my point across either. What I said is true, as far as I know. If I'm stating something inaccurate, please tell me.

 
Matt Swartz
(24824)
Comment rating: -

Identifying the market inefficiency has always led to correcting it. If I have discussed a market inefficiency here that is researched further and identified, perhaps that will happen. I can't say. Hudson's value could be low for any number of reasons and it's not hard to see why people would take a shot at Milton Bradley anyway. Keep in mind though, as I mentioned earlier in the comments, that Bradley's last "valuation" on the marketplace was "=Carlos Silva". Not exactly getting away with his off-the-field issues right now.

 
Matt Swartz
(24824)
Comment rating: -

I certainly might do some empirical research on this topic, but I did link to J.C. Bradbury's blog post which has links and summaries of a dozen different studies on the topic over the years. My point in the article, however, is that you need both theoretical and empirical research to talk about these issues because of other correlations. For instance, 57% of starting shortstops in the league right now were born in Latin American countries, compared to about 20% overall in the league. WARP and WAR treat positional adjustments differently, and if one of them is wrong, it will affect how we perceive Hispanic discrimination. The economic theory on this is not just a toy, and I was able to take it very far on this. Once you have theoretical modeling, you can do empirical testing better. I'm also not really sure how the existence of lobbying for Black players in the MLB discredits anything I wrote in the article. Obviously lobbying has an effect, but my point that Rickey took a chance and that Robinson had to endure customer discrimination and do his best to change it remains true either way.

 
Matt Swartz
(24824)
Comment rating: -

What you describe certainly sounds unfair. I also am very doubtful that the wage gap-- which I think is something like 30% overall and 19% after you correct for education and experience-- can be explained by connections. That seems like way too large of an effect. I'm also pretty sure that Blacks earn less even controlling for parental income, but I haven't really tried to do that research since I was an undergraduate and I could be misremembering so please don't take my word on this.

 
Matt Swartz
(24824)
Comment rating: -

I appreciate the comment, but please remember the distinction I made between discrimination and prejudice. If one makes inferences based on observed statistical tendencies, that is statistical discrimination but it is based on information even if it's a biased sample. The best example of statistical discrimination is the one I gave in the article about women's labor market exit rates. I do think that the market corrects issues when the effect is real. On the field production is very measurable, so I think the market corrects for these issues. I have always enjoyed the joke about the economist and the $100 bill on the street, but it's important to ask yourself if you have ever seen a $100 bill on the street. Probably not, I'm guessing. The dominance of the NL over the AL after the color barrier being broken fits right into what I'm saying in my article. The "magic of the marketplace" is due to free entry. As I explained in the article, free entry of new firms is not possible in baseball with finite teams, so it's possible to sustain a barrier like the color barrier. The bench filler issues that you bring up are a good example of where off-the-field value can push players over and under replacement level. The closer you are to replacement level, the more likely perceived off-the-field value can affect your ability to get a job. Please don't misunderstand my point about statistical discrimination. I am not justifying anything. I'm identifying the issue. If we can understand that the human mind categorizes and we can identify how it can categorize based on observations, we can work on solutions. Without solutions, it's just finger pointing. I don't quite understand why Hudson hasn't gotten more attention on the free agent market, but there are plenty of individual players who seem to be undervalued by the market (e.g. why did Cliff Lee twice get traded for lower value prospects than his contract and value would suggest?). 
That Hudson is Black can't be assumed to be the reason.

 
Matt Swartz
(24824)
Comment rating: -

Milton Bradley is very uniquely talented, and even if his value is affected by off-the-field issues, he still would get quite a contract. Regardless, he was traded for no real value this past offseason because of off-the-field issues primarily, so the last record of his value is low.

 
Matt Swartz
(24824)
Comment rating: -

Correct, Dukes is seemingly a terrible person and there is no issue with that affecting his actual and perceived value. I mentioned Myers as someone who has a domestic violence precedent, and who has at least showed up with a black eye one other time about which his story changed. Who knows why, but Myers is a questionable player but he's worth about one win and is compensated as such. Dukes is probably worth about one win and is not. There are plenty of players who get a mulligan, but I suspect that it might be correlated with race.

 
Matt Swartz
(24824)
Comment rating: -

I don't think it's medical information in the way you're thinking. Players take physicals in all of these scenarios in general, so it's not really that. When I say medical issues, I mean just knowing how the player's body holds up day-to-day. Something like knowing how well a reliever's arm bounces back is something the player's old team just knows better.

 
Matt Swartz
(24824)
Comment rating: -

I didn't use depth charts. I actually just used PECOTA spreadsheets. I'm sure that they overestimate IP on average but probably not for players who got multiyear deals since those players were less likely to be on the borderline between being in the minors and majors the first year. If PECOTA runs a little hot, that's another issue, but I don't think the CHONE system was running all that hot when it happened and Sean Smith didn't seem to see that as the issue in the article I linked.

 
Matt Swartz
(24824)
Comment rating: -

I did look at age originally but it didn't seem to be the source of the bias, so I dropped that from the analysis. Players did age worse as they were older, but that just didn't seem to be what was going on at all.

 
Matt Swartz
(24824)
Comment rating: -

VORP does account for league adjustments so that should not be an issue. I doubt that clubhouse chemistry or a change in time zones is the explanation specifically because the players who change teams are the ones who do worse and worse over the rest of the contract. The players who re-sign tend to underperform their VORP the first year, but then they don't really decline much at all. The players signing with new teams underperform their projections and then decline even further than that. That's why I think it has more to do with medical information as well as simple workout tendencies. For instance, I think the Phillies knew that Chase Utley was a workaholic and probably would have a longer peak than other second basemen so they signed him for longer.

 
Matt Swartz
(24824)
Comment rating: -

In a one-sided test, 56% under-performances is significant at the 94.8% level (t-stat=1.63, with t=(93/165-.50)/sqrt(.50*.50/165)). The argument is backed up by theoretical evidence though, which is very important here. There is a reason why it SHOULD be a biased sample. The additional evidence of the difference between aging of re-signed and newly-signed players is also evidence of this kind of bias. Also, this isn't 1/2 a win over the whole contract. It's just the first year. It gets worse than that for newly-signed players and better than that for re-signed players.
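
For readers who want to check that arithmetic, here is a minimal sketch of the one-sided test using only the figures quoted above (93 under-performances out of 165, null proportion .50). It uses the normal approximation for the tail probability, which is an assumption of this sketch rather than something stated in the comment, and the function name is purely illustrative.

```python
import math

def one_sided_proportion_test(successes, n, p0=0.5):
    """Test statistic and upper-tail p-value for H0: p = p0 (normal approximation)."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper-tail probability of a standard normal, written with erfc.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

z, p_value = one_sided_proportion_test(93, 165)
print(f"test statistic = {z:.2f}, one-sided p-value = {p_value:.3f}")
```

The statistic matches the 1.63 quoted above, and the tail probability of roughly 5% lines up with the "significant at about the 95% level" reading.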

 
Matt Swartz
(24824)
Comment rating: -

Correct. It is percent of total WARP for the whole deal in each year. It shows that players who re-signed maintained their value well but players signing with new teams did not.

 
Matt Swartz
(24824)
Comment rating: -

I really don't think my cousin Mark knows who Javier Vazquez is. I'm bearish on Pettitte and somewhat bearish on Burnett. I think CC is probably a good bet too, but I think that Javier Vazquez has been the victim of bad luck and bad defense at the wrong times, and I think that the NL/AL league difference-- while huge-- is exaggerated to the point that people act like NL numbers are AAA numbers. Vazquez had a 2.68 SIERA last year to CC Sabathia's 3.70. I'm having a hard time thinking that the league difference is that big.

Apr 06, 2010 6:27 AM on Staff Picks for 2010
 
Matt Swartz
(24824)
Comment rating: -

Also, your insult that I was referring to, which I'm repeating here, is the implication that I have not thought about the point of this exercise which you called "academic." You again repeated that I was not applying logic. I think it should be abundantly clear that the whole strength of this exercise was in applying the logic. If you're not seeing it, that's a shame.

Mar 25, 2010 11:31 AM on Predicting BABIP, Part 2
 
Matt Swartz
(24824)
Comment rating: -

As I explained in the first article, the reason to regress BABIP separately is that K% and HR% are very reliable, and so you want to pin down the effects of the unreliable portion of batting average. The reason you won't do as well doing (outfield hits)/(outfield fly ball) as you will doing (outfield non-HR hits)/(outfield non-HR fly ball) is that hitters with different home run skills are going to have vastly different skills with respect to the former ratio but similar skills with respect to the latter ratio. Applying the same regression to all hitters on the first ratio will cause a huge problem. The point of discussing hitter BABIP was mentioned in the BABIP Roundtable we did a couple months ago, and it's well worth reviewing if you're still skeptical. The key is that you want to regress the statistics that represent more luck and less skill more than you want to regress the statistics that represent less luck and more skill. Home run rates have less luck than line drive BABIP. But infield hit rates have less luck than outfield fly ball BABIP. By breaking each skill down, you take the deviations from average that represent skill and regress those to the mean less than you regress the deviations from average that represent luck.

Mar 25, 2010 11:30 AM on Predicting BABIP, Part 2
 
Matt Swartz
(24824)
Comment rating: -

Last year, there was a .323/.298 BABIP difference at home vs. on the road for Colorado hitters, and a .311/.297 discrepancy for their pitchers, so there is clearly some park effect, probably around .020 points. I'm not convinced that it's necessarily outfield fly ball BABIP that is getting the biggest effect, but probably some of it. Fowler's outfield fly ball BABIP probably is being regressed causing his overall BABIP to go down about .003-ish more than it should be regressed. The humidor may make balls sink quicker, causing more hits, or it could be that there is more space in the outfield than outfielders can cover. There are a lot of possible effects. I don't think defensive usage is going to move it much. How often do teams really rest an outfielder that would have failed to get to a ball that game and replace him with someone who would get to the same ball? A lot of stars need to line up for that to happen.

Mar 25, 2010 11:24 AM on Predicting BABIP, Part 3
 
Matt Swartz
(24824)
Comment rating: -

Definitely a good idea for an article. Thanks! As far as Kinsler himself goes, it's definitely his swing which is clearly a bit of an uppercut. He only had 32.1% ground balls last year (league average = 45%) and about 13.4% pop ups (league average = 7.8%). That seems pretty close to his old rates too. Numbers like that actually would imply a .265 BABIP or so I think, if not for his power and speed which bring him up to .279.

Mar 25, 2010 11:15 AM on Predicting BABIP, Part 3
 
Matt Swartz
(24824)
Comment rating: -

Tons, actually. Ryan Howard, for example, had a BABIP in the majors of over .350 through 2006, and even higher in the minor leagues. Then the shift came, and now he's projected for about .313, which is where he has been recently. Fortunately, that is already built into the projections for guys who have been shifted against. A young pull-happy lefty slugger might need his BABIP regressed as teams notice his pull-happy ways though, beyond even what E-BABIP would regress.

Mar 25, 2010 11:02 AM on Predicting BABIP, Part 3
 
Matt Swartz
(24824)
Comment rating: -

Hmm! Very interesting. I hadn't even thought about that fact. I would probably scale back what I said a little bit then. Texas' LD-BABIP last year was .708, and the league average was .729, so I'd bet you should cut short the "extra line drive BABIP" I gave Jones by .021, which would change his BABIP maybe .004. So maybe, he should have been around .255 last year instead of .260.
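
A rough sketch of how a .021 gap in line drive BABIP could turn into roughly .004 of overall BABIP. The line drive share used here (19% of balls in play, the league-average figure quoted in another comment on this page) is an assumption of this sketch. Jones's actual line drive share is not given above.

```python
LEAGUE_LD_BABIP = 0.729   # league-average BABIP on line drives, from the comment
TEXAS_LD_BABIP = 0.708    # Texas team BABIP on line drives, from the comment
ASSUMED_LD_SHARE = 0.19   # assumption: line drives as a share of balls in play

# Gap on line drives only, then scaled by how often a ball in play is a line drive.
ld_gap = LEAGUE_LD_BABIP - TEXAS_LD_BABIP
overall_adjustment = ld_gap * ASSUMED_LD_SHARE

print(f"line drive gap: {ld_gap:.3f}")
print(f"overall BABIP adjustment: {overall_adjustment:.3f}")
```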

Mar 25, 2010 10:58 AM on Predicting BABIP, Part 3
 
Matt Swartz
(24824)
Comment rating: -

Definitely true. I think this would be more pronounced for hitters who are going to Boston. So, chalk up Adrian Beltre and Mike Cameron for a couple extra points for example. There also could be some over-regressing to the mean for some other hitters though. I think it would explain some small trends in BABIP certainly, but maybe we're talking about .005 or so for most guys.

Mar 25, 2010 10:55 AM on Predicting BABIP, Part 3
 
Matt Swartz
(24824)
Comment rating: -

The Colorado team BABIP on fly balls looks pretty normal, but the team BABIP on line drives seems pretty high. That could easily be one of Colin's classic press box heights or other scorer bias issues, though, and there could totally be more outfield fly balls that would fall in for hits anyway-- the size of the park certainly would suggest that. Clearly there are some park effects. However, Fowler is getting the same park effects in Coors in previous years as this year. If you just throw team=Rockies in as a dummy variable, you get about .009 of BABIP, but similarly for throwing it into old numbers. I would think maybe you've touched on something worth about .005 of BABIP or so for Fowler, but probably not enough to explain the .317 vs. .345 discrepancy. Very interesting thought, though. Thanks!

Mar 25, 2010 10:53 AM on Predicting BABIP, Part 3
 
Matt Swartz
(24824)
Comment rating: -

Now I'm not getting the link to work at all. Try this: http://spreadsheets.google.com/pub?key=tMC8mqXiVL6gP9bnd74jrEw&single=true&gid=0&output=html

 
Matt Swartz
(24824)
Comment rating: -

I meant bracket, not quotation. So there's a bracket, a slash, and the letter A at the end of the hyperlink. Delete those, and that should work.

 
Matt Swartz
(24824)
Comment rating: -

There's a quotation, a slash, and an A at the end. Somehow even typing that into comments cuts it off. Just delete those three characters. I've emailed about this, so hopefully it can be corrected. Thanks and sorry!

 
Matt Swartz
(24824)
Comment rating: -

Looks like there is an extra three characters "

 
Matt Swartz
(24824)
Comment rating: -

From yesterday's article: "They also were closer to actual BABIP than PECOTA was 57 percent of the time, and closer to actual BABIP than CHONE was 60 percent of the time. That might not seem like much, but those fractions are significant at the 95-percent and 99-percent level, respectively. In other words, there is less than a five-percent chance that there would have been that large of a difference between my BABIP model and PECOTA if they were equally as good, and less than a one-percent chance that the model would have beaten CHONE as badly as it did just by chance." Implicit in this is that if you add in the same number of HR and SO that PECOTA predicts, but change the number of hits and outs on balls in play to E-BABIP, you will have a superior prediction of AVG. If you're going to make flat out accusations and hurl insults, you should at least do a brief scan of the articles to make sure I didn't address this. I also talked about it in last year's "BABIP Superstars" article and last year's "You Can Beat PECOTA Without a Computer Model" articles.

 
Matt Swartz
(24824)
Comment rating: -

I think you mean Baseball-Reference, don't you? I can't find it on Fangraphs. I can't get any normal output that I could automatically merge, and I haven't been able to get reliable massive datasets in a form that wouldn't involve typing every number in for each individual player. In truth, I stink at programming, so I'm reliant on excellent help from Eric Seidman and others for these things. I'm not sure if people here are comfortable replicating the B-R L/C/R numbers reliably without muddy data.

 
Matt Swartz
(24824)
Comment rating: -

How can you honestly read these two articles and actually type your Step 1? Do you realize that balls in play being hits more often will necessarily imply higher batting averages? Of course players who have high BABIPs will have higher AVG and higher OBP. Studying BABIP as a subset of performance clearly has value, as I've explained in both of these articles, because it is more complicated to project. The point of projection is to say that Michael Young and Joe Mauer will not only have "highish batting averages" but to pinpoint how high. To the extent that you can increase the accuracy of how high you think it will be, projecting BABIP has value. Do you believe projection is a useful concept at all?

 
Matt Swartz
(24824)
Comment rating: -

All fly balls are included, but the use of HR/FB is mostly a proxy variable for power. It's not implicitly subtracted at all. The more frequently hitters hit 400 foot fly balls, the more often they hit 350 foot fly balls so the more ground outfielders have to cover, and the fewer balls they catch. There's also another reason, though. Say you have 100 balls in play and that there are 40 ground balls, 20 line drives, 30 outfield fly balls, and 10 pop ups. If none of those are home runs, then line drives only make up 20% of your balls in play. However, if you're Prince Fielder and 10 of every 30 outfield fly balls leave the yard, then 20 of every 90 balls in play are line drives-- 22% instead of 20%, which would drive BABIP up by about .010. So there's two effects there.
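
The second effect can be checked with a few lines of arithmetic. This is a sketch using the 100-batted-ball example above. The BABIP values for line drives (.730) and for everything else (.190) are taken from another comment on this page and are used here only as illustrative assumptions.

```python
def line_drive_share(gb, ld, of_fb, pu, hr):
    """Line drives as a share of balls in play, with home runs removed from play."""
    balls_in_play = gb + ld + of_fb + pu - hr
    return ld / balls_in_play

no_power = line_drive_share(gb=40, ld=20, of_fb=30, pu=10, hr=0)
slugger = line_drive_share(gb=40, ld=20, of_fb=30, pu=10, hr=10)

# Illustrative assumption: line drives fall for hits far more often than other balls.
LD_BABIP, OTHER_BABIP = 0.730, 0.190
babip_bump = (slugger - no_power) * (LD_BABIP - OTHER_BABIP)

print(f"LD share without HR: {no_power:.1%}, with 10 HR: {slugger:.1%}")
print(f"approximate BABIP bump: {babip_bump:.3f}")
```

Under those assumed rates the bump comes out a little above .010, in line with the rough figure in the comment.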

 
Matt Swartz
(24824)
Comment rating: -

My R^2 for the 3 years regression was 0.31, not 0.20. The reason that you are getting weird P-stats for GB%, FB%, and LD% is multicollinearity. They should all add up to 100% because Fangraphs includes Pop-ups and Outfield Flyballs in "Fly Balls", so one should be removed from the regression. One of them would actually have been removed automatically if not for, I'm guessing, the fact that Fangraphs' numbers are rounded, so they might have added up to 99.9% or 100.1% for some people. Pick one, take it out, and leave the other two in. Rerun the regression. Your R^2 is probably a little higher because you used Fangraphs' default PA which is higher than 300. That makes your R^2 more exact for your dataset. Running mine on the same data set by restricting it to 500 PA gives me an R^2 of .38, for example. You're actually running almost the exact same information as I am, minus GB-BABIP, OFFB-BABIP, and CONTACT%, so you're going to get most of the way there anyway. It sounds like you did. Standard Deviations of 18.5 points sound almost perfect, as they should, and naturally that means 65% should fall within 18.5. Running the regression with an intercept of 0 was an unnecessary sidebar. Setting the intercept equal to 0 means requiring BABIP to be 0 whenever every variable is 0. But someone who theoretically has 0% GB, 0% FB, 0% LD, 0% IFH, 0% IFFB, and 0% HR shouldn't even have a BABIP. And doing so while dropping one of the three collinear variables (GB, FB, and LD) would be arbitrarily assuming BABIP for someone with 100% of that batted ball type is 0.
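
To illustrate the collinearity point, here is a small self-contained sketch. The hitters and their batted ball rates are made up for illustration, and the helper names are hypothetical. It shows that an intercept plus three rates that always sum to 100% carries only three columns' worth of information, that dropping one rate fixes this, and that a single rounding error is enough to hide the problem from the software.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot: this column adds no new information
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design(rows, keep=(0, 1, 2)):
    """Design matrix: an intercept column plus the selected rate columns."""
    return [[1] + [row[i] for i in keep] for row in rows]

# Hypothetical hitters: (GB, FB, LD) in tenths of a percent; every row sums to 1000.
hitters = [(450, 360, 190), (320, 480, 200), (500, 300, 200),
           (400, 380, 220), (440, 330, 230)]

full_rank = matrix_rank(design(hitters))                  # 4 columns, GB+FB+LD = 100%
dropped_rank = matrix_rank(design(hitters, keep=(0, 1)))  # LD left out
rounded = hitters[:-1] + [(440, 330, 229)]                # one row now sums to 99.9%
rounded_rank = matrix_rank(design(rounded))

print(full_rank, dropped_rank, rounded_rank)
```

With exact sums the four-column matrix has rank 3, so one coefficient cannot be identified. With the rounded row the rank is technically 4, which is why a regression package may keep all three rates and report unstable estimates instead of dropping one.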

 
Matt Swartz
(24824)
Comment rating: -

I'm not taking them out of the analysis. I regress on GB/BIP, PU/BIP, LD/BIP. That implicitly assumes OFFB/BIP is included. You can't include four things that add up to 100% in a regression because of multicollinearity. Instead, you include the three other things. The reason ground balls and line drives have positive coefficients is that the more of them you get, holding popups constant, the fewer outfield fly balls are hit. And ground balls and line drives have higher BABIP than outfield fly balls. Similarly, popups are negative because, holding line drives and ground balls constant, more popups and fewer outfield fly balls means a lower BABIP.

 
Matt Swartz
(24824)
Comment rating: -

Thanks for the compliment :-) It's a good question. I would guess it's more useful at higher levels than lower levels, and you would probably need to come out with some sort of Minor League Equivalencies for each metric to convert it. Also, since ballparks are so different across the minors, you couldn't get away without doing park effects as I've attempted to do. Once you figured out the MLEs and PFs, you could probably do well to convert infield hit rate in the minors to the majors considering the 3B are better, convert outfield fly ball BABIP to reflect better fielding OF. The real question is how much line drive rate would carry over. I would guess that although major league pitchers don't have much difference in their LD% abilities, there is probably a lot more difference between the LD% of MLB pitchers and pitchers in high-A ball. If you converted all these things, you really might be able to do something like that assuming you were careful. At a more qualitative level, you could probably do a lot. What is the groundball/flyball ratio of this guy in AAA-- does it indicate a downward plane to his swing? Does pop-up rate offer any clues? How often is he reaching on infield hits? How much power does he have? What's his contact rate? How well does he get ground balls to the outfield and fly balls to fall in the outfield compared to other hitters with similar skills? Is this indicative of a good ability to spread the ball around? Once you know all this, you can probably pin down where a hitter is. A speedy ground ball hitter who spreads the ball around, makes good contact, and rarely pops up might be a .330 BABIP kind of guy. A power hitter who rarely hits ground balls but spreads the ball around well might be a .285 guy. A power hitter who pulls everything but hits everything very hard and rarely pops up might be a .315 guy. Things like that are very informative.

 
Matt Swartz
(24824)
Comment rating: -

Thanks for the question. There are several reasons for this. The primary reason is that ground ball hitters are good at BABIP, but not at EqA or any other measure of overall offense. Take Juan Pierre. He's a below average hitter with above average BABIP. The reason is largely related to ground ball rates. Another large factor is that pitchers do not control their rate of home runs per outfield fly ball. So pitchers who allow more fly balls naturally allow more home runs. However, not all hitters that have high fly ball rates are going to have high home run rates, because there are huge differences in the rate of home runs per fly ball for hitters. My article last week on SIERA and BABIP actually showed that pitchers who allow more fly balls have lower BABIPs, which is partly why the "cost" of a fly ball isn't as high for a metric like SIERA as it is for other metrics that are based on run estimation of individual outcomes. Since pitchers who allow more fly balls also allow more home runs, a fly ball is a bad thing for run prevention in both metrics, but it's not as bad because SIERA implicitly assumes that their BABIPs also go down a little. I guess the real difference is home runs. BABIP is part of hitter skill, but certainly not all.

Mar 24, 2010 12:33 PM on Predicting BABIP, Part 2
 
Matt Swartz
(24824)
Comment rating: -

It definitely is. THT's xBABIP uses this statistic which they call "spray". Fortunately, using historical BABIP rates seems to be picking up this effect for me, but if I could get my hands on that data, I would like to see if high historical BABIPs on grounders and fly balls are more likely to be maintained by hitters who spread the ball to all fields.

Mar 24, 2010 11:23 AM on Predicting BABIP, Part 2
 
Matt Swartz
(24824)
Comment rating: -

I'm not sure if Marc still uses that model, and what for, but it's been debunked as a predictor of next year's BABIP. The correlation between BABIP in year-1 and BABIP in year-2 is higher than the correlation between (LD% + .120) in year-1 and BABIP in year-2. The reason it predicts same year BABIP pretty well is that BABIP on line drives is about .730 and on everything else it's about .190. The difference between the best and worst at line drive BABIP is only about .680-.780 in skill level and the difference between the best and worst at non-LD BABIP is probably only .140-.280. So if you weight those two groups, you can always get close IF YOU KNOW THE % OF LINE DRIVES. The real problem is that line drive rate has only a .37 year-to-year correlation, making it relatively less accurate than knowing a player's speed, power, contact, and GB, OFFB, and PU rates.
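
The weighting described above can be written out as a short sketch. It uses the two league-wide rates quoted in the comment (.730 on line drives, .190 on everything else) and treats them as fixed for every hitter, which is a simplifying assumption of this sketch.

```python
LD_BABIP = 0.730       # BABIP on line drives, from the comment
NON_LD_BABIP = 0.190   # BABIP on all other balls in play, from the comment

def same_year_babip(ld_share):
    """Weighted average of the two groups, given the line drive share of balls in play."""
    return ld_share * LD_BABIP + (1 - ld_share) * NON_LD_BABIP

for ld_pct in (16, 19, 22, 25):
    print(f"LD% = {ld_pct}%  ->  estimated BABIP = {same_year_babip(ld_pct / 100):.3f}")
```

Each extra point of line drive rate moves the estimate by about .005, so knowing the same-year line drive rate explains a lot. The hard part, as noted above, is that the rate itself does not carry over well from year to year.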

Mar 24, 2010 11:07 AM on Predicting BABIP, Part 2
 
Matt Swartz
(24824)
Comment rating: -

Okay, I'll pick a few that might be harder to guess: Casey Kotchman, Nick Punto, Martin Prado, Lyle Overbay. Take a look at those and see if you can guess whether I think PECOTA is high or low on them. They'll be in the article tomorrow along with 6-7 others IIRC.

Mar 24, 2010 10:55 AM on Predicting BABIP, Part 1
 
Matt Swartz
(24824)
Comment rating: -

Thanks! That will be at the end of tomorrow's article. Tomorrow's article is going to talk about E-BABIP vs. PECOTA-BABIP. Then I'll link to the E-BABIP spreadsheet at the end.

Mar 24, 2010 10:35 AM on Predicting BABIP, Part 2
 
Matt Swartz
(24824)
Comment rating: -

Thanks. The thing is that line drive rate DOES have a high correlation with BABIP that year, but line drive rate is not very persistent. So certainly a guy who has 19% line drives (league average) might have a BABIP of .300 (league average, holding everything else average), while if the same guy had 24% line drives, his BABIP would be .325. However, a guy with 24% line drives one year is probably going to have a 21% line drive rate the next year, which might correspond with a BABIP of .310. So 5% more line drives in 2009 indicates only .010 points more of BABIP in 2010.
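
The numbers in that example can be turned into a small regression-to-the-mean sketch. Everything here is backed out of the figures in the comment (19% and .300 as league average, 24% and .325, 21% the following year). The retained fraction of 0.4 is implied by those figures rather than stated, and the function name is only illustrative.

```python
LEAGUE_LD = 19.0        # league-average line drive rate, in percent
LEAGUE_BABIP = 0.300    # league-average BABIP

# Same-year relationship: 24% line drives goes with a .325 BABIP.
babip_per_ld_point = (0.325 - LEAGUE_BABIP) / (24.0 - LEAGUE_LD)

# Persistence: a 24% hitter is expected at 21% next year, so 2 of 5 points carry over.
retained = (21.0 - LEAGUE_LD) / (24.0 - LEAGUE_LD)

def next_year_babip(ld_pct):
    """Regress this year's line drive rate toward the mean, then convert to BABIP."""
    expected_ld = LEAGUE_LD + retained * (ld_pct - LEAGUE_LD)
    return LEAGUE_BABIP + babip_per_ld_point * (expected_ld - LEAGUE_LD)

print(f"retained fraction: {retained:.2f}")
print(f"next-year BABIP for a 24% line drive hitter: {next_year_babip(24.0):.3f}")
```

The implied retained fraction of 0.4 is close to the .37 year-to-year correlation for line drive rate quoted in another comment on this page.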

 
Matt Swartz
(24824)
Comment rating: -

Check out the SIERA and BABIP article I wrote last Wednesday for a more detailed answer, but pretty much I think that pitcher BABIP is almost entirely explained by K, BB, and GB skills.

 
Matt Swartz
(24824)
Comment rating: -

I think 4-5 wins with 10.5% BB is exactly where he's been lately. I guess you're probably guessing his HR-rate falls. That's certainly possible but Thome seemed to maintain his pretty well. I think there are different muscles and reflexes involved in hitting oppo-HR, and those just tend to be muscles and reflexes that age better. We really know so little about aging. In this article (http://baseballprospectus.com/article.php?articleid=9055), I talked about how sluggers who hit opposite field HR seemed to age better, at least based on my puny sample size of Howard's 2009 PECOTA Card comparables.

Mar 23, 2010 1:04 PM on There Goes My Hero
 
Matt Swartz
(24824)
Comment rating: -

xBABIP is a post-dictor/estimator, rather than a predictor of BABIP, which is a useful concept certainly (SIERA is a post-dictor/estimator of ERA), but serves a different purpose. More specifically, it uses same year variables to predict that year's BABIP. E-BABIP predicts future BABIP based on batted ball breakdown. A lot of the original xBABIP model has been changed due to suggestions they got at the time. For instance, they realized that using home run rate to model power was better than pitches/XBH, mostly because it was not putting doubles and triples directly in the set of independent variables when it's supposed to be the dependent variable. The speed score is supposed to represent ability to get hits on infield ground balls or draw the infield in. However, to post-dict BABIP, you can't just include infield hits because obviously that's including some singles in the independent variables already. (That would be like including RBI doubles in an ERA estimator-- of course your ERA is higher, but really that's not what you're measuring.) However, to predict BABIP, it's okay to use historical infield hit rate, which is what I use (+ reaching on errors). Then you're simply asking how well the past predicts the future. I did use handedness in my regressions originally, but it did not do more than add or subtract .001 from overall BABIP and wasn't significant. The reason is that any of the ways that handedness would affect BABIP would have already affected GB-BABIP and OF-FB-BABIP in previous years. xBABIP was trimmed down a lot and adjusted later. The things like discipline had been introduced, and things like home run rate and pop-up rate were things I was already using in older models, and the lefty*(gb/fb) variable is really just modeling infield hit rate too. The real add-on, and data that I'd love to get my hands on and incorporate meaningfully, is that they introduced "spray" which basically measures how well the hitter sprays the ball across the field.
That would be useful to improve this too. I think the main contribution of xBABIP is proving that a wider distribution of batted balls across the field plays a major role in BABIP.

Mar 23, 2010 11:11 AM on Predicting BABIP, Part 1
 
Matt Swartz
(24824)
Comment rating: -

Clearly that's wrong. What is true is that 3.5% of all first pitches are hits. Since batters swing at about 1/2 of pitches, and about 1/2 of pitches are over the plate, I'd assume that he could mean either "7% of all swings at first pitches are hits, while 93% are strikes or outs." or he could mean "7% of all pitches over the plate are put into hits, while 93% are strikes or outs."
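
Both readings come from the same back-of-the-envelope division. The sketch below uses only the round figures in the comment (3.5% of first pitches become hits, about half are swung at, about half are over the plate). For the second reading it also assumes every hit comes on a pitch over the plate, which is a simplification.

```python
HIT_RATE_ALL_FIRST_PITCHES = 0.035  # share of all first pitches that become hits
SWING_RATE = 0.5                    # rough share of pitches swung at
IN_ZONE_RATE = 0.5                  # rough share of pitches over the plate

# Hits only happen on swings, so divide by the share of pitches swung at.
hits_per_swing = HIT_RATE_ALL_FIRST_PITCHES / SWING_RATE

# Simplifying assumption: all hits come on pitches over the plate.
hits_per_in_zone_pitch = HIT_RATE_ALL_FIRST_PITCHES / IN_ZONE_RATE

print(f"hits per first-pitch swing: {hits_per_swing:.0%}")
print(f"hits per first pitch over the plate: {hits_per_in_zone_pitch:.0%}")
```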

Mar 23, 2010 10:45 AM on Predicting BABIP, Part 1
 
Matt Swartz
(24824)
Comment rating: -

Yeah, in part 3, there will be a google doc linked with all the E-BABIP projections. Thanks!

Mar 23, 2010 10:39 AM on Predicting BABIP, Part 1
 
Matt Swartz
(24824)
Comment rating: -

Part two should be tomorrow. Thanks!

Mar 23, 2010 10:38 AM on Predicting BABIP, Part 1
 
Matt Swartz
(24824)
Comment rating: -

The walk and strikeout rates are different for Ortiz and Vaughn, too. None of these are perfect matches.
Howard: 13% BB/PA, 33% SO/AB
Thome: 17% BB/PA, 30% SO/AB
McGriff: 13% BB/PA, 22% SO/AB
Ortiz: 13% BB/PA, 21% SO/AB
Vaughn: 11% BB/PA, 26% SO/AB
Howard obviously isn't a perfect match for any of them, with worse K's than all, but his batted ball profile is more similar to Thome and McGriff than Ortiz and Vaughn. If you throw out McGriff because of contact rate, you have to throw out Ortiz. Latest PECOTA cards up today have McGriff as his 1st comparable, but doesn't list Ortiz til 8th. Thome is 4th and Vaughn isn't on there. None of these people are the exact same player, but Thome's batted ball profile is the one that looks most like Howard. Eye, patience, and contact skill are all important traits that could suggest one aging curve or another, but batted ball profiles demonstrate physical skills and muscles used too, which also suggest other aging curves. My suspicion is a batted ball PECOTA would map Howard and Thome even more closely than Howard and McGriff, and far more than Howard and Vaughn or Ortiz.

Mar 23, 2010 7:43 AM on There Goes My Hero
 
Matt Swartz
(24824)
Comment rating: -

Pujols is worth far more than Howard certainly, but let's not make the mistake of mixing up Howard being overrated and Howard being bad. David Ortiz is way down on Howard's comparables this year, and Mo Vaughn isn't listed in his Top Ten. McGriff and Thome are far better comparables. Ortiz and Vaughn are/were dead pull hitters. Ortiz is a poor fielder who is very flyball happy. Howard hits a lot of line drives in all directions and pulls almost all his ground balls which he hits very hard, and he hits most of his home runs the other way. He also fields his position at an average level, outside of having a noodle of a throwing arm. Howard isn't going to lead the league in HR when he's 40, but he's still going to be a wrecking machine in his early 30s like Thome was. Thome is the model here. He's shaped like Howard, hits an absurd number of his home runs the other way, strikes out a lot, and hits rocket hard ground balls (all of which are also pulled, unlike his flies and liners). Thome also is like Howard in having average fielding range without looking graceful in the process. Howard's early 30s likely will look more like McGriff and Thome than Ortiz and Vaughn.

Mar 22, 2010 9:17 PM on There Goes My Hero
 
Matt Swartz
(24824)
Comment rating: -

Greg Maddux is the obvious example of a pitcher who had a low BABIP because he didn't do that. I apologize if that wasn't clearer.

 
Matt Swartz
(24824)
Comment rating: -

Haha, no, must be my dad.

 
Matt Swartz
(24824)
Comment rating: -

Sadly, there was not a database for this at all. I asked around, and people didn't seem to have any knowledge of one, so I ultimately put guys in by hand, using Baseball-Reference. The good thing was that it was pretty easy to do 2007 and 2008 with vlookup in Excel, since there was pretty good overlap. It was 2009 that took forever.

 
Matt Swartz
(24824)
Comment rating: -

I don't think I'm going to be able to pull this one off. The salary totals I have are accumulated from bizofbaseball.com and copied off their list. I don't think I'd be able to put the rest of it together.

 
Matt Swartz
(24824)
Comment rating: -

Let me know if this works: http://spreadsheets.google.com/pub?key=t0E2MpKtKqTRknQQauA6mow&single=true&gid=0&output=html This should be the 2009 draftee/signee players with WARP3 listed, team drafted/signed as an amateur, team played for. This took a bit of work to format nicely enough to be readable, so I don't have time to do 2007 and 2008, but this should paint a pretty interesting picture of this season.

 
Matt Swartz
(24824)
Comment rating: -

Playing time is not factored into it. There were 81*30 total wins/teams to go around, and I just added up WARP3's. If you want to think of it as the production of the amateur scouting department, with an idea that players could be traded at fair value but the No Turnover Standings are a reference point, that works too.

 
Matt Swartz
(24824)
Comment rating: -

In his defense, the NL West standings went up after the rest of the article, and also checking out the 2007-08 standings, you'll find that the Reds were fantastically bad at drafting/signing amateurs.

 
Matt Swartz
(24824)
Comment rating: -

Hahaha-- 11th and 12th for Carpenter and Wainwright, respectively, actually. I considered making it a Top 12 to show where they were, but I was afraid that I was letting the pitchforkists win if I did that! :)

Mar 15, 2010 10:56 AM on SIERA in Stat Reports
 
Matt Swartz
(24824)
Comment rating: -

Hmm..definitely interesting questions. I think it definitely decreases the benefit of the SOMA plan, especially since the effect is still getting smaller over time. I would think the technology aspect of being able to see a reliever in advance certainly would lower their effectiveness too. At the same time, the specialization of relievers is probably making them more effective as technology improves too-- teams can learn what type of pitches hitters struggle with more as they have better access to technology (pitch F/X, etc.). They can't teach a right-handed starter to just randomly learn to throw a slider with his left hand against Ryan Howard, but they can certainly bring in a lefty slider-thrower.

 
Matt Swartz
(24824)
Comment rating: -

Three stars is not #3. There are 19 NL pitchers that are four stars or five stars there, but only 16 NL teams. He's somewhere between the 20th and 37th most valuable fantasy starter in the NL according to that article, and that's accounting for the fact that he plays in a hitter's park in the NL East, which is easier than the NL Central though maybe not easier than the NL West, and obviously for a team that can get him W's.

 
Matt Swartz
(24824)
Comment rating: -

Blanton is projected by PECOTA to have a 4.00 ERA. Only 14 of 30 teams have two starters who are projected by PECOTA to be below 4.00. Only 4 teams have three starters projected below 4.00, and only one team has four starters projected below 4.00. In fact, 5 teams don't even have any starters projected below 4.00. So Blanton would be about the #1 for 5 teams, the #2 for 11 teams, the #3 for 25 teams, the #4 for 5 teams, and the #5 for 1 team. I don't see how anybody could think he was a marginal #3. I think what you might be missing is that strikeout rate is pretty much the most persistent stat in the league. When it changes, it usually doesn't regress back very far, if at all.

 
Matt Swartz
(24824)
Comment rating: -

Rollins has earned over 4.4, 7.3, 5.9, and 2.5 WARP3 in his first four years of the deal, only one of which would have been an arbitration year under team control. Even last year, he was worth more than $8MM. I think he's already produced twice what his whole deal cost easily, and will probably triple his return by the end of 2011 easily. I'll have more exact calculations on all this stuff later on, but the Rollins ridiculous bargain deal is a big part of the secret as to how the Phillies won the last three titles.

Mar 04, 2010 9:46 AM on NL East
 
Matt Swartz
(24824)
Comment rating: -

Although Maddux is an example of a pitcher who played in front of great defense, you do make a very smart observation here. I don't have the link but google for an article by Tom Tippett who I believe now works for the Red Sox who showed that there is a tendency for pitchers to be particularly good at suppressing BABIP in their peak years. My belief is that this is not the same as saying they suppress their line drive rate more in their peak years. In fact, my general belief on this topic might be categorized well by saying "Matt thinks that pitchers do control their BABIP but they just don't control their line drive rate which is the primary determinant of their BABIP so it's tough to tell BABIP skill apart."

 
Matt Swartz
(24824)
Comment rating: -

TheRealNeal-- It's really not laziness to say that it's luck, because you can statistically calculate the standard deviation of luck of a binomial variable as sqrt((p)*(1-p)/n). So a pitcher who allows 500 balls in play in a .300 BABIP league will have a standard deviation of sqrt((.300)*(1-.300)/500), which is .020 points. So in any given year, about 1/3 of pitchers will have BABIPs below .280 or above .320 even without having special BABIP skills.
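
The arithmetic in that comment can be checked with a short sketch. The function name is made up for illustration, and the "about 1/3" share assumes the binomial is close enough to a normal distribution at 500 balls in play, which is an approximation rather than something the comment states.

```python
import math

def babip_luck_sd(league_babip: float, balls_in_play: int) -> float:
    """Standard deviation of observed BABIP from binomial luck alone."""
    return math.sqrt(league_babip * (1.0 - league_babip) / balls_in_play)

# Figures taken from the comment: .300 league BABIP, 500 balls in play.
sd = babip_luck_sd(0.300, 500)
low, high = 0.300 - sd, 0.300 + sd

# Share of a normal distribution lying more than one standard deviation
# from its mean (normal approximation to the binomial, an assumption).
share_outside = 1.0 - math.erf(1.0 / math.sqrt(2.0))

print(f"sd={sd:.3f}  one-sd band=({low:.3f}, {high:.3f})  outside={share_outside:.2f}")
```

With fewer balls in play the band widens, which is why a half-season BABIP says even less about skill than a full-season one.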

 
Matt Swartz
(24824)
Comment rating: -

It's true that pitchers who can't strike hitters out well enough to stay at the big league level will not manage league average BABIPs, which was a large point I was trying to make in the article. It's also true that groundball pitchers will tend to have BABIPs around .310, as will pitchers who struggle to strike hitters out. There's also such a thing as bad defenses, which play a role here.

 
Matt Swartz
(24824)
Comment rating: -

Sorry it took so long for me to reply-- I was out of town this weekend... Certainly there ARE pitchers that have some control over BABIP-- both myself and others have discovered that groundball pitchers have higher BABIPs because groundballs have higher BABIP than fly balls. Also, knuckleballers as you mention are BABIP-prevention wizards at the MLB level. Further, strikeout pitchers have lower BABIPs. The thing is that all of these things make the skill level range +/- .010 with respect to league average (except knuckleballers). The pitchers you have above are a mixture of pitchers who pitched in front of good defenses, strikeout wizards, perhaps some luck, but BY FAR the most common thread you see there is pitchers who played in low BABIP eras overall. Most of the recorded 1950s had BABIPs around .275. The 1960s were around .269-.281. The 1970s ranged from .272 to .287. The 1980s were mostly in the .280s, and the same goes for the early 1990s. It wasn't until the "modern era" of 1993-now that BABIPs have averaged around .300. So looking through the list, most of the pitchers there did play during the era when those were normal. But a lot of those guys are knuckleballers or guys that played in front of great defenses (Maddux, Glavine). You also see great strikeout pitchers like Johan Santana on there.

 
Matt Swartz
(24824)
Comment rating: -

They definitely do, but less than people think.

 
Matt Swartz
(24824)
Comment rating: -

I'm getting -0.32 same-year correlation between GB% and LD% and -0.05 for LD% and (FB+PU)% with BP data. The problem is that GB%+LD%+(FB+PU)%=100%, so the non-line drives need to go into either the GB category or into the FB or PU category. Looking at correlations with following-year line drive rates, we see that the correlation almost disappears: just .08 for (FB+PU)-first-year vs. LD-second-year, and -.14 for GB-first-year vs. LD-second-year. So maybe there is some element of more non-line drives being turned into grounders. I'm not sure if that's park effects or something I'm not thinking of, but it doesn't seem to be much evidence of a big deal. I like what you did there, and I thought about that once too, but it's really the issue that the non-line-drives need to go somewhere. I'm not sure why Fangraphs' stats said +0.16, but that could be an indication that non-line-drives could be turning into GB more than FB.

 
Matt Swartz
(24824)
Comment rating: -

@Mooser: They control what percent of non-line drives are GB and FB. The way I think about it is that when a pitcher is lucky on "linedrivelessness" they get extra GB and FB instead, and when they are unlucky with lots of line drives being hit, they get that subtracted from GB and FB. Assuming the luck is even up or down, that's why we used (GB-FB-PU)/PA as our estimator. A few extra line drives hurt both evenly, so neither GB/PA nor (FB+PU)/PA alone would be a great stat to use. Let me know if this helps.

 
Matt Swartz
(24824)
Comment rating: -

There apparently isn't any correlation between LD% and having a high GB% or a low FB%. It seems that 1% extra GB skill correlates with 1% less FB+PU skill. Lowe and Webb (and mostly Lowe) happen to have low line drive rates thus far and Pineiro seems to be on the high line drive side of things, but I think that if I had to pick an over/under LD% for all three of them in 2010, I'd go with 19%.

 
Matt Swartz
(24824)
Comment rating: -

Again, there are agreed upon standards that meet logical rules. Being able to predict next-year ERA well is good because it highlights skills that pitchers control well. There's different standards that are agreed upon for different types of statistics. Repeatability and team-level correlations are important.

 
Matt Swartz
(24824)
Comment rating: -

Greensox: Year-to-year correlation-- that's how to prove it. It's pretty standard. If the pitcher who did it before is more likely to do it again than the pitcher who didn't do it before, it's probably a skill. For K%, it's about .75. For BB%, it's about .65. For GB%, it's about .75. For LD%, it's about .01.
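
As a rough illustration of the year-to-year test described there, here is a minimal sketch that computes a Pearson correlation between the same pitchers' rates in consecutive seasons. The five strikeout rates are invented for the example and are not BP data, so the resulting number means nothing beyond showing the mechanics.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical strikeouts per plate appearance for the same five pitchers
# in two consecutive seasons (made-up values, same pitcher order).
k_rate_year1 = [0.15, 0.18, 0.22, 0.25, 0.12]
k_rate_year2 = [0.16, 0.17, 0.23, 0.24, 0.14]

r = pearson(k_rate_year1, k_rate_year2)
print(f"year-to-year correlation: {r:.2f}")
```

A rate that is mostly skill keeps pitchers in roughly the same order from one season to the next, so the correlation stays high; a rate that is mostly luck reshuffles them and the correlation falls toward zero.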

 
Matt Swartz
(24824)
Comment rating: -

Thanks. I really liked the description in the first post. I guess the question is really what the distribution is near the center of the bat. I'm guessing it's not that steep for pitchers and that's probably what's going on. I think your characterization of correctly identifying pitches is great too. I think that's what's going on. Thanks again.

 
Matt Swartz
(24824)
Comment rating: -

HR rate isn't really a pitcher skill, but flyball skill is, so that's why HR has something like a .15 correlation year-to-year. FB has a .70 correlation I think, but HR/FB is like .07 I think. Liners aren't just hard contact. They are hard contact centered. That's the key.

 
Matt Swartz
(24824)
Comment rating: -

Not really going to change the methodology, but I sure wouldn't expect it to be gospel outside of MLB pitchers. I imagine it's pretty good at minors and worse at low minors, but I'm not sure.

 
Matt Swartz
(24824)
Comment rating: -

I want to learn more about LIPS. I don't have a good enough sense of it, and I haven't seen it talked about much. Reading through some of Gassko's stuff though, it looks very well done and he seems to be coming to a lot of good conclusions. I still like what we're doing with SIERA and the interactions and the quadratic terms, but LIPS seems to be understanding the line drive issue well. I think if you want to do a projection, do a projection. Looking at next-year ERA is a way to show you are representing the relationship between ERA and skills.

 
Matt Swartz
(24824)
Comment rating: -

I doubt the 1.159 vs. 1.162 is statistically significant, but I think it's enough to think that as SIERA gets more data, it should be able to get there we think. It also seems to be slightly winning in various tests, and doing better in a linear regression to estimate next-year ERA rather than just doing it directly. But the point here is that we've got something that should be at least as good from a totally different angle-- meaning the combination should be great-- and we have clear room to improve once pitchers get up on the mound and throw us a few more frames to put in the data, and once we incorporate all the great suggestions we got through this process.

Feb 24, 2010 10:14 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

Yes, GB profile makes the unearned run portion of RA go up naturally. But xFIP and FIP both talk about fly ball percentage, which is going to be directly related to ground ball percentage, so the bias applies to all of them. We did run some tests of what "SIRA" might look like and it was pretty similar with small diminished value of ground balls. Since this was a smaller issue, we left it out here, but we very well might make a "SIRA" later, just like xFIP should have varying constants based on fly ball rate and FIP should have varying constants based on home run rate. Note that xFIP and FIP do NOT at all assume constant batted ball profiles, just constant 1B/2B/3B/out distributions.

Feb 24, 2010 10:11 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

Yeah, if people were more familiar with that scale, that would have been better. Even Fangraphs converted tRA into tERA recently though, just because of that problem. There is also FIP with an RA adjustment to the constant out there somewhere too, but people prefer FIP as ERA-scaled.

Feb 24, 2010 6:27 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

This is a great description, thanks! I think SIERA will inform PECOTA more later, rather than just being used as a checkpoint which I know it was.

Feb 24, 2010 6:25 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

I don't think we were able to get the formula for tRA* but the point is that tRA is largely dependent on line drive rate, which has a 0.007 intra-class correlation. So unless tRA* reflects that, it's bound to have some problems too. I heard Graham say once that (LD)/(LD+FB+PU) had a 0.4 high correlation, but that's because FB+PU has a high correlation. The point is that LD/(LD+FB+PU+GB) has a 0.0 correlation, so trying to credit pitchers with line drive rates different than their team line drive rates is going to be a problem. If you want to make a regressed projection rather than a formula based on actual statistics, though, you might as well use PECOTA and other projection systems.

Feb 24, 2010 6:24 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

Should be right after the PECOTA cards, I think. It'll be on the Statistics page, and we might be able to put together a nice hypothetical SIERA Calculator where you can just plug things in.

Feb 24, 2010 6:20 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

I mean, is a squared term really something that is all that complicated? Figuring out how to establish the run value to cook up the 3.20-ish moving constant or which is the right HR/FB number to make xHR seems less straightforward than multiplying a number by itself to me, but that's probably about the most subjective difference I can think of. I appreciate the encouragement, though. Thanks. If you look at my RMSE posts above, you might be encouraged a bit more too.

Feb 24, 2010 6:19 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

Root Mean Square Error:
SIERA: 1.159
xFIP: 1.162
average of SIERA and xFIP: 1.153
So it is better. Also, regress ERA_park_2 on SIERA and you get:
1.51 + 0.658*SIERA
That gets you a RMSE of 1.1276.
If you regress ERA_park_2 on xFIP, you get:
1.38 + 0.679*xFIP
That gets you a RMSE of 1.1437.
Regressing on both basically says that xFIP is insignificant:
1.59 + 0.745*SIERA - .103*xFIP
with a RMSE of 1.1259, so not much else is added there. Regressing on correlated things like this doesn't get much of anywhere in some cases. Brian Cartwright apparently found below that SIERA does a little better at next-next-year and next-next-next-year ERA too, though I can't verify that other than to say Brian has nothing at stake and is very smart, and that I like his research supporting SIERA :)
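
For readers who want to see how those figures are used, here is a minimal sketch of an RMSE calculation plus the three fitted lines quoted in the comment. The function names and the three sample pitchers are hypothetical; only the intercepts and slopes come from the comment, and nothing here reproduces the actual test data.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

# Fitted lines quoted in the comment (target: next-year park-adjusted ERA).
def next_era_from_siera(siera):
    return 1.51 + 0.658 * siera

def next_era_from_xfip(xfip):
    return 1.38 + 0.679 * xfip

def next_era_from_both(siera, xfip):
    return 1.59 + 0.745 * siera - 0.103 * xfip

# Hypothetical pitchers, purely for illustration.
siera_values = [3.50, 4.00, 4.50]
next_year_era = [3.60, 4.40, 4.30]

raw_error = rmse(siera_values, next_year_era)
fitted_error = rmse([next_era_from_siera(s) for s in siera_values], next_year_era)
print(f"raw SIERA RMSE: {raw_error:.3f}  fitted-line RMSE: {fitted_error:.3f}")
```

The slopes below 1 pull every estimate toward the league average, which is the regression to the mean that the fitted versions add on top of the raw estimators.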

Feb 24, 2010 6:02 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

Definitely a great in-season estimator, especially early on when ERA is so noisy. One of the things that Nate Silver said about QERA was that it stabilized quickly-- SIERA will have the exact same benefit. The point of SIERA is to talk about how a collection of skills will produce an ERA. A July 1st ERA is full of noise, but SIERA will have skills only in there with less noise.

Feb 23, 2010 6:52 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

Good stuff, Brian. We should just have used your tests!

Feb 23, 2010 6:49 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

The most important thing to do is probably look for pitchers who have big discrepancies between their SIERA and their ERA, and then see if projections reflect that gap. Obviously, SIERA is going to regress to the mean like any statistic, but it won't be as volatile as ERA. Pretty soon we will have the SIERA numbers all up online, and the first thing you want to do as a fantasy player is look for gaps in SIERA and ERA. Some of the biggest ones that I remember seeing among people good enough to draft as starting pitchers were Ricky Nolasco and Cole Hamels. I don't really do fantasy baseball myself much, but I would guess that Nolasco is probably one of those guys you could nab late in the draft and then have him put up an ERA under 4 when no one expected it. Hamels is one of those players who might slip back in the draft but probably is someone you want to get before he's being drafted. I think two players that are great according to SIERA and maybe just very good according to other metrics are Justin Verlander and Javier Vazquez. This will be a more fun conversation when we get the numbers all up, I think! I remember only a handful of guys off the top of my head whose SIERAs taught me something, but there's probably more mixed in where the 2009 ERA and projected 2010 ERA won't look as much like the 2009 SIERA as it should, and that will be your green light/red light.

Feb 23, 2010 6:00 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

I took out option years because that would definitely be a biased sample.

 
Matt Swartz
(24824)
Comment rating: -

This is an interesting idea. Thanks!

 
Matt Swartz
(24824)
Comment rating: -

It was 39 of 71 individuals who improved in the second year of two-year deals, and 26 of 39 of those re-signed two year deals. The biggest jump was Mussina from 1.1 to 5.0 in 07 and 08. It didn't look like outliers to me at all.

 
Matt Swartz
(24824)
Comment rating: -

@philly: The confusion is that xFIP is apparently much better than we thought because it is tricky to code and we did it wrong the first time. The difference between SIERA and FIP/tRA/QERA is still very large, as it was before, but xFIP and SIERA are both close in a lot of categories, enough that for this quick correction we didn't expand on the minor differences.

Feb 23, 2010 3:31 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

@thegeneral13: Really? I don't agree that SIERA is more complex than xFIP-- remember that this whole problem came about because Eric and I were unable to properly code xFIP. Yes, the coefficients are round numbers, but given that no one calculates either one in their head, I don't think Markov chains (xFIP) are necessarily easier to understand than regression (SIERA) at all. Both involve strong assumptions that aren't always true. xFIP assumes that individual events are of the same value and not more likely to occur in one context than another, and SIERA assumes that correlations between other variables like BABIP and DIPS variables are going to be persistent and that measurement error in batted ball rates isn't so large as to affect the regressors. The two estimators come at things from very different angles. If they are both equally good at predicting ERA, I don't think it would be smart to throw either out. Put it this way-- if I told you that a pitcher's xFIP was 4.00, your best guess is that his ERA in a neutral environment would be 4.00. If I then told you that his SIERA was 3.50, you should bump that down. If I had told you that his SIERA was 4.50, you should bump that up. Similarly, if I told you that a pitcher's SIERA was 4.00, then you should predict his ERA would be 4.00 in a luck-neutral environment. If I then tell you his xFIP is 4.50, bump your guess up, and if I tell you instead that his xFIP is 3.50, bump your guess down. They both contain information that is different, and it would be dangerous to throw one out because the coefficients aren't round numbers.

Feb 23, 2010 3:16 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

@Flynnbot It's an indication of what ERA should have been last year in a neutral world that is evidenced by its ability to predict the next year's ERA. Luck isn't persistent, but strikeout, walk, and ground ball rates are, so the combination of these three skills will give you a good estimate of ERA. xFIP is very strong as well. @royalsnightly: Please read the linked article. We DID put it up against tRA and tRA did no better than FIP and distinctly worse than xFIP and SIERA, well modeled metrics of pitcher skill. It is a very flawed metric that fluctuates largely with line drive rate-- which has an intra-class correlation of 0.007, maybe the lowest ICC I've ever seen. tRA is not a big dog at all. Use xFIP and SIERA if you want to measure skill. It's silly to name tRA as a gold standard when it was not tested anywhere but here and did not outperform even FIP with more inputs. To those suggesting an ubermetric: That's definitely a good idea. Maybe we will incorporate that into future testing, but for now, I think that it would be great to use both. To those asking for more detailed tables of subsets of pitchers: The differences were small enough that we thought it was best to just publish more general ideas. Due to the construction of SIERA, it made sense that it did better with extreme ground ball rates in either direction, and moderate strikeout rates, but statistically these differences would not have been significant. We look forward to continuing to improve SIERA. It's really just SIERA 1.0 right now, and it's already up there with xFIP as the clear best luck-/defense-neutral ERA estimators. We understand the frustration with this, but we want to stress that any good metric like this should be continually tested, and would have been regardless of this finding about xFIP's comparable strength.

Feb 23, 2010 2:33 PM on Barry's World
 
Matt Swartz
(24824)
Comment rating: -

Good question. I avoided this by just aggregating WARP3 for all players in a group. Otherwise, the only issue is 0.0 total, because even -0.5 and +0.7 would be summed as 0.2 and the percents could be -0.5/0.2 = -250% and 0.7/0.2 = +350%. Clearly, you've hit on why I aggregated rather than trying to average the outlier percent scores.
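
The problem with per-player percentages can be seen in a small sketch. The first player uses the -0.5 and +0.7 figures from the comment; the other two players and all the names are invented for illustration, and this is only one reading of how the group totals were formed.

```python
# Each tuple is (first-year WARP3, second-year WARP3) for one player.
players = [(-0.5, 0.7), (3.0, 2.0), (1.0, 1.5)]

# Per-player second-year share: unstable when a player's total is near zero.
per_player_share = [second / (first + second) for first, second in players]
mean_of_shares = sum(per_player_share) / len(per_player_share)

# Aggregate first, then take one share for the whole group.
group_second = sum(second for _, second in players)
group_total = sum(first + second for first, second in players)
aggregate_share = group_second / group_total

print([f"{share:.0%}" for share in per_player_share])
print(f"mean of shares: {mean_of_shares:.0%}  aggregate share: {aggregate_share:.0%}")
```

One player with a near-zero total drags the average of the percentages far from anything meaningful, while the aggregated share barely notices him.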

 
Matt Swartz
(24824)
Comment rating: -

I think that the research on aging says that particularly good players decline more slowly but tend to peak closer to 31, but not such that they are better at 36 than 35 as seems to be the case here. Certainly there is selective sampling on aging, but there is something else mixed in too that seems related to re-signings in some way especially.

 
Matt Swartz
(24824)
Comment rating: -

Definitely possible, but the pitcher improvement was pretty modest: 52% of performance in the second year. The real gain was hitters who provided 64% of their value in the second year.

 
Matt Swartz
(24824)
Comment rating: -

Well...the rule is supposed to apply to GMs having knowledge of players not contained in the numbers, so as much as that rule doesn't apply to the Pirates, maybe...

 
Matt Swartz
(24824)
Comment rating: -

I'm not sure the details of contract year performance, but I don't think that this difference is enough to explain this jump. This is certainly a valid question to ask though. I just don't have the data on it to answer it thoroughly, so all I can say is that my understanding of this research is that the contract effect is more subtle and largely related to age at the time players reach six years of service time.

 
Matt Swartz
(24824)
Comment rating: -

The difference actually seems statistically significant. 39 of 71 two-year deals showed improvement instead of decline. The confidence interval on improvement from that is (43%, 67%), and certainly I would guess that less than 43% of this age group of players improve overall. This is more dramatic for re-signed two-year deals, of which 26 of 39 saw improvement. That's a (52%, 81%) confidence interval, and certainly fewer than 52% of players in that group improve overall. Thanks for highlighting this issue, though.
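
The comment does not say how the intervals were computed. A 95% normal-approximation (Wald) interval happens to reproduce both quoted ranges, so the sketch below uses that method as an assumption; the function name is made up for the example.

```python
import math

def wald_interval(successes: int, trials: int, z: float = 1.96):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / trials
    half_width = z * math.sqrt(p * (1.0 - p) / trials)
    return p - half_width, p + half_width

# Counts taken from the comment.
all_low, all_high = wald_interval(39, 71)       # all two-year deals
resign_low, resign_high = wald_interval(26, 39) # re-signed two-year deals

print(f"all two-year deals: ({all_low:.1%}, {all_high:.1%})")
print(f"re-signed deals:    ({resign_low:.1%}, {resign_high:.1%})")
```

With samples this small an exact or Wilson interval would give slightly different endpoints, so the bounds are best read as approximate.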

 
Matt Swartz
(24824)
Comment rating: -

Very possible. I don't know if the playing for a contract explanation covers all of this, especially the divide between re-signs and new signings, but it definitely could explain some. A valid point for sure.

 
Matt Swartz
(24824)
Comment rating: -

Richie, Ben's explanation here is correct. Think of it this way. If a coin comes up heads more than 50% of the time, it's bound to regress back to a 50/50 chance the next time it is flipped, but it's not any more likely to come up heads or tails the next time after that. Ben, those sound like some papers I ought to have a look at! Thanks for letting me know about the literature. That's definitely relevant stuff for me to read.

 
Matt Swartz
(24824)
Comment rating: -

You're entitled, Dr. Carleton ;-)

 
Matt Swartz
(24824)
Comment rating: -

Hi! Okay, firstly I will say I loved this article. Next, I thought about exactly what JDSussman mentions, but I don't think it's an issue. Ultimately, when a player is non-tendered, they are not banned from baseball. They can receive their market rate. If they can receive their market rate, that means that if they are approximately a quad-A player, they can get a minor league deal and still will get a chance to play if they can sneak above replacement level. For players who reach six years of service time, they will face a similar situation, where once they are a free agent they can get paid accordingly to their quality, and if they are above replacement level, they can play. What is possible is that the investment of playing a player below replacement level in hopes that he will learn something starts to become less and less valuable as he gets closer to six years service time. That could be playing a role, but probably not a very large one like the one we see in the graph above. There could also be team bias factoring in somehow where the team that drafted the player is the only one who thinks he can perform above replacement level, but I don't know if that's much of an issue. It certainly is less of an issue with the latest CBA allowing teams to sign their own free agents. Basically, I really doubt it's a big issue, even if I think it's possible I'm missing something. I guess the most obvious thing to check is if there are a pair of modes for number of years of service time.

 
Matt Swartz
(24824)
Comment rating: -

Justin, what we've been doing is finding out individual pitchers' SIERAs and weighting them by the IP, so it would be like getting Skill-Interactive Earned Runs for each pitcher, adding them up, and dividing them by team innings (and multiplying by 9). You want to get the interactions to be relevant for the relevant pitchers.
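
The weighting described there can be sketched as follows. The function name and the two-pitcher staff are hypothetical, and innings are assumed to be true decimals (200.0), not the box-score notation where .1 means a third of an inning.

```python
def team_siera(pitchers):
    """Innings-weighted team SIERA.

    pitchers: list of (siera, innings_pitched) pairs, one per pitcher.
    """
    total_innings = sum(innings for _, innings in pitchers)
    # Convert each pitcher's SIERA into "skill-interactive earned runs".
    skill_runs = sum(siera * innings / 9.0 for siera, innings in pitchers)
    # Scale the summed runs back to a per-nine-innings rate.
    return 9.0 * skill_runs / total_innings

# Hypothetical staff: a 3.50 SIERA over 200 IP and a 4.50 SIERA over 100 IP.
staff = [(3.50, 200.0), (4.50, 100.0)]
print(f"team SIERA: {team_siera(staff):.2f}")
```

Computing each pitcher's SIERA first and then weighting keeps the interaction and squared terms tied to the individual pitcher, which plugging team-level rates into the formula would not do.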

Feb 12, 2010 9:43 PM on Part 5
 
Matt Swartz
(24824)
Comment rating: -

I put that in the comments of the last article. With the updated formula though, it's basically a tie between FIP and SIERA (1.108 for FIP and 1.107 for SIERA, with QERA at 1.185, xFIP at 1.226, and tRA at 1.307). Keep in mind, though, that SIERA was a regression on SAME-YEAR park-adjusted ERA, so it is not biased on those coefficients. Using this, the new coefficients and park factors have the final tally at: SIERA at 1.158, FIP at 1.198, QERA at 1.258, xFIP at 1.331, and tRA at 1.202.

Feb 12, 2010 9:41 PM on Part 5
 
Matt Swartz
(24824)
Comment rating: -

lopkhan00: With all due respect, I have absolutely no idea how you could come to this conclusion from this article. The 4th article showed that ERA doesn't do nearly as well at predicting future ERA as any of the estimators, and in this article, ERA was the baseline to check the other estimators against. I can't follow your thought process at all, but please expand if I'm missing something.

Feb 12, 2010 9:37 PM on Part 5
 
Matt Swartz
(24824)
Comment rating: -

My browser isn't letting me reply to individual comments, so I will reply in bulk to the first three I see-- @NathanJM: I think that like most DIPS metrics, it will do particularly badly for pitchers who should not be in the majors or should be on the DL. These pitchers probably would be likely to have higher line-drive rates and higher HR/FB. After all, if I was in the majors, I would have higher LD% and HR/FB. I think FIP will do better at these high extremes because it will punish the injured or unqualified pitcher for the HR/FB instead of treating it as luck like SIERA would. @seanpotter: PU is for popups, so in the FanGraphs data, you can just use their FB because that is outfield flies plus infield flies. Just replace our (FB+PU) using BP's stats with FanGraphs' FB stat. @MHaywood1025: looking at 2009, hitters had a .297 BABIP with none on and .304 with men on first, which means that there were about 274 more hits with men on first than bases empty. There were 3,494 double plays with men on first. So while the effect you mention is true, it's a much smaller effect than the double play effect.

Feb 12, 2010 11:05 AM on Part 5
 
Matt Swartz
(24824)
Comment rating: -

We did discuss it already, but that's okay. I don't mind reiterating once rather than making everyone read 200 comments... We developed the coefficients by regressing on SAME-year ERA. So the tests on NEXT-year ERA are legitimately different. Also, to check, we ran the regression on the 2003-08 ERA for same-year data and then used those coefficients to compute 2009 SIERA stats and then checked same-year ERA and it finished similarly. Thanks for asking again, actually, because this question should be highlighted rather than obscured and have other people miss it.

Feb 11, 2010 8:28 PM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

It helped check them and may be more involved next year.

Feb 11, 2010 11:33 AM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

Yes, that's a typo, sorry!

Feb 11, 2010 11:33 AM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

We didn't have the code for tRA* so we couldn't test against it. We also don't have a SIERA* to test against it. We might as well compare it to PECOTA or another projection system if we're going to start regressing components. The goal is really to answer the question of how a collection of skills leads to keeping ERA down. The main thing though is that tRA as a metric has a major problem in that it is affected so much by line-drive rate. Since we found that LD/Batted Ball had an ICC of 0.007-- which is pretty much the closest thing to zero I can remember seeing in sabermetrics-- it doesn't make sense to use it. I heard its inventor say once that LD/Ball in Air had a high correlation but that's because Fly Ball/Batted Ball has a high correlation and it's picking that up. If you picked (Pitcher's Birthdate)/(Pitcher's Birthdate + Fly Ball%) and checked its correlation, it would be high as well. It's just that you can't then call birthdate significant any more than you can call line drive rate significant.

Feb 11, 2010 10:38 AM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

Sure. If that helps, I'll put it here in the comments--

Next-year ERA for:
          03-04   04-05   05-06   06-07   07-08   08-09
SIERA     1.107   1.141   1.179   1.186   1.107   1.248
QERA      1.237   1.237   1.219   1.277   1.206   1.316
xFIP      1.284   1.403   1.211   1.404   1.287   1.311
FIP       1.120   1.230   1.298   1.236   1.170   1.283
tRA       1.162   1.202   1.273   1.216   1.171   1.307
ERA_pk    1.391   1.388   1.488   1.429   1.390   1.493

As you can see, it's ahead every time and offers a solid improvement if you compare the difference between the other estimators and regular ERA_pk to the difference between the other estimators and SIERA.

Feb 11, 2010 10:23 AM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

Well, FIP can do well with same year because it treats HR/FB as skill rather than luck. Since that's not really the case, it seems like it won't be helpful. If you want to predict same-year ERA using luck-based numbers as skills, I'd go ahead and look at the actual ERA, you know?

Feb 11, 2010 10:18 AM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

We have below average walk rates. If you mean more than one standard deviation below average, we did that test too, and were better in that as well-- but since we didn't have a squared term on BB, we thought it was too much clutter to include it. We certainly were not biased in the tests we reported.

Feb 11, 2010 10:17 AM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

If you don't like it being on the other side of the decimal place, multiply everything by 1000. The difference in between these is large and very significant. Compare the difference in other metrics and simply using ERA to the difference in SIERA and other metrics, and you'll clearly see it's a BIG step forward. I should note that if you do these tests separately for EACH YEAR from 2003-09, SIERA is ahead of the same estimators EVERY time. This is a large difference even if we're dealing with ERAs which are necessarily going to require some decimals.

Feb 11, 2010 10:15 AM on Part 4
 
Matt Swartz
(24824)
Comment rating: -

Hmm...I would really like that approach only it doesn't do a good job of projecting next-year ERA that well. Why would the error term being correlated across years for pitchers matter as much for what we're doing? I agree that's definitely true, especially because it includes team defense, but still I'm not sure that is a big deal. Thanks for your comments and approach. It's interesting to see how you frame it.

Feb 10, 2010 11:07 PM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

That actually does bump the GB*BB term to -10 and up to weak significance (p=.07), but why would you take out the linear GB term? It's effectively equivalent to limiting the minimum effect of GB% to exactly where GB=FB+PU. Think of it as a regression showing an equation of: SIERA = a + (b + c*SO_PA)^2 + d*BB_PA + (e + f*GB_FB)^2 + g*GB_FB*SO_PA + h*GB_FB*BB_PA. This way the effect of GB_FB is minimized at a value determined by where f*GB_FB = -e, which can move, rather than where GB_FB = 0. It's a more general assumption to leave it in there even if it cuts the GB*BB term in half and makes it appear insignificant.
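
One way to read the squared-term argument is sketched below: expanding (e + f*GB_FB)^2 gives a constant, a linear term 2*e*f*GB_FB, and a squared term, so dropping the linear term is the same as forcing e = 0. The values of e and f are hypothetical and are not SIERA coefficients, and GB_FB is assumed here to stand for (GB-FB-PU)/PA, so that zero means GB = FB+PU.

```python
def gb_term(gb_fb: float, e: float, f: float) -> float:
    """Contribution of the squared ground-ball term (e + f*GB_FB)^2."""
    return (e + f * gb_fb) ** 2

def expanded_coefficients(e: float, f: float):
    """Constant, linear, and squared coefficients after expanding the square."""
    return e * e, 2.0 * e * f, f * f

def minimizing_gb_fb(e: float, f: float) -> float:
    """GB_FB value at which the squared term reaches its minimum of zero."""
    return -e / f

# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
e, f = 0.5, -2.0
print(expanded_coefficients(e, f))   # linear coefficient is 2*e*f
print(minimizing_gb_fb(e, f))        # the minimum moves with e
print(minimizing_gb_fb(0.0, f))      # no linear term: minimum pinned at zero
```

Keeping the linear term lets the data decide where the ground-ball effect bottoms out instead of fixing it at an even GB/FB split.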

Feb 10, 2010 10:34 PM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

I see your point, but it really wouldn't change a single SIERA by .10 and it's a matter of preference. The reason I don't agree is that I think that the effect is real but close to -4.0. So the type II error of rejecting anything less than -15 is very, very high. It's a matter of intuition in this case. Especially given that the variance in GB*SO is high enough that the regression said it was positive. I don't know what else it would be correlated with that would get in the way. I doubt it though. If you think about what the implications are of high walk rates and high ground ball rates, you'd think it adds a few double plays a year to have both skills, which is exactly what this type of coefficient around -4.0 would suggest.

Feb 10, 2010 10:27 PM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

There probably is, but it's probably canceled out by the grounders and super-grounders, another benefit of the (GB-FB-PU) term where GB includes balls chopped into the ground in front of the plate as well as balls that one-hop between the SS and 3B. But if the pop-ups happen more often than the choppers, it's an indicator that the pitcher is throwing the ball on a trajectory that generates upwards spin, and therefore is home run prone. The key is that pop-ups/batted ball is correlated with fly ball rate, which is correlated with home run rate. I don't know about foul balls and swinging strikes as a percent of strikes, but I suspect it would be interesting to look at. If anybody has, Russell Carleton has-- he's the foul ball expert.

Feb 10, 2010 10:21 PM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

Maybe GB/FB changes correlate with (BB-IBB)/(PA-IBB) changes? I could certainly see two skills' deterioration being correlated. I guess many pairs of skills generally would as people aged.

Feb 10, 2010 6:17 PM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

Eric and I can check into the Lahman database park factor issue. I'm not sure about this yet. We used 40 IP as a minimum. We checked RA by the same method (though not FRA) and got basically the same coefficients with the intercept being about 0.4-0.5 higher, so since people are familiar with ERA this is easiest to do. Fair RA is an intriguing idea, though. The reason we kept the GB*BB term with p=.56 is that (a) we don't think the effect is bigger than something around -4, and it would take 20 years of batted ball data for it to be significant, and (b) the exclusion of it, while re-running the regression and generating new SIERAs, would not change anybody's SIERA by 0.10. It's just too small of a difference to make a fuss about.

Feb 10, 2010 5:33 PM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

FB on fangraphs includes pop-ups and outfield fly balls but NOT line drives, so that seems to be muddying things a little. I get .75 for GB% year-to-year correlation for 2005-08 data on pitchers with at least 30 IP both years. GB/FB gives me .78.

Feb 10, 2010 1:15 PM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

We definitely will be keeping an eye on the BB*GB term. The problem is really that we suspect this term has an effect but that even a perfect term that accurately captures the effect probably would not be statistically significant because we only have 7 years of data. We tested it on individual years and sets of years and the coefficient jumped all over the place from much more negative even to more positive. The -4.027 number is probably a pretty good approximation, though. If we left it out and re-ran the regression, it would move the SIERAs by no more than .10 runs, which is pretty much the magnitude that you would expect the term to be. It's important, but it's not going to show as significant in this sample size. Thanks for the suggestion though. Definitely was an important thing to check.

Feb 10, 2010 12:09 PM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

We didn't include HBP. I just did a little re-checking and I remembered correctly that it doesn't seem like it would have changed all that much. It might be a small improvement, though, and we might look into it in the future as we get more data, but it was probably too small of a factor to consider. This is a good point, though, and worth checking as we get more years of data, especially if HBP are very persistent, which I suspect they are at least somewhat.

Feb 10, 2010 12:05 PM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

Yeah, it was just easier to approximate that half the games were at home. I can't imagine that there would be enough noise there to affect the results. I guess also since we were running a regression with park-adjusted ERA as the dependent variable, it was less important to be precise with park-adjusted ERA because the coefficients would be unbiased even if the park-adjustment was noisy. Noisy independent variables would bias the coefficients towards zero, though, so I think that would have been a bigger issue.
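The contrast between a noisy dependent variable and a noisy independent variable can be checked with a small simulation. This is only a sketch on synthetic data, with an assumed true slope of 2.0 and assumed noise levels. None of it comes from the SIERA data set.

```python
import random

random.seed(0)
n = 50_000
true_slope = 2.0  # assumed value for the synthetic example

# Clean predictor and a response with a little intrinsic noise.
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [true_slope * xi + random.gauss(0.0, 0.5) for xi in x]


def ols_slope(xs, ys):
    """Slope of a one-variable least-squares fit of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


# Extra measurement noise on the dependent variable only.
noisy_y = [yi + random.gauss(0.0, 1.0) for yi in y]
# Extra measurement noise on the independent variable only.
noisy_x = [xi + random.gauss(0.0, 1.0) for xi in x]

slope_noisy_y = ols_slope(x, noisy_y)  # stays close to the true slope
slope_noisy_x = ols_slope(noisy_x, y)  # shrinks toward zero (attenuation)
```

With unit-variance signal and unit-variance noise on the predictor, the fitted slope is expected to shrink by roughly half, while the noisy-response fit stays near 2.0 and only becomes less precise.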

Feb 10, 2010 11:58 AM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

Thanks. We did play around with IBB a little bit, but some of the problem is that it is difficult to differentiate between IBB where the pitcher gives up after getting into a 2-0 or 3-1 count and direct IBB from the first pitch, and then to separate even further the difference between those IBB and just pitching around people. There certainly was some indication that IBB led to fewer runs, particularly with respect to the ground ball term, but at this sample size we figured it was probably best not to do something that could be construed as data mining. We also felt that the gains from distinguishing between BB & IBB seemed negligible anyway. That is a good point, though. Thanks for highlighting it.

Feb 10, 2010 10:08 AM on Part 3
 
Matt Swartz
(24824)
Comment rating: -

Eric M. Van-- I'm not sure that there is a correlation between BB% and HR%. I'm finding only -0.03 in my data set. There seems to be a correlation between doubles and walks, and between doubles and home runs, but not between walks and home runs. I do think SIERA should do a better job of accounting for these kinds of correlations, especially when they affect run scoring differently as a set of skills rather than as a sum of their parts, but I'm just not sure that BB% and HR% are really correlated in general based on the data I'm looking at (2003-09 pitchers here).

Feb 10, 2010 8:54 AM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

I think PECOTA does use information on groundout/flyout ratio, but I'm not totally sure. I think SIERA will be more involved next year in the PECOTA process but I'm not sure really where it fits in. I do think that batted ball statistics that have only been collected properly since 2003 would help all projection systems immensely, but that it might be tough to incorporate some of that accumulated knowledge into a model like PECOTA without throwing out 50 years of other data it uses effectively too.

Feb 10, 2010 8:47 AM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

Yeah, maybe we shouldn't have used the word unfoil. It was supposed to be somewhat of a play on first-inner-outer-last, but really we should have stuck with unravel. I think it'll be clearer in today's (Wednesday's) article, though we probably used the word again IIRC. Regress is pretty standard as a verb at this point, I think, but it may not be perfectly used all the time within sabermetrics.

Feb 10, 2010 8:44 AM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

Hmm...there must be something wrong with your data source. GB/FB, GB/Batted Ball, FB/Batted Ball-- these all have correlations of something like .70-.80 year-to-year. I'm pretty sure they are more persistent than even strikeout rates for pitchers. The thing is that you cannot use data before 2003. It's possible that you are looking at Groundout/Flyout data, but even that should be reliable. HR/9, on the other hand, has a year-to-year correlation of something like .2 and that breaks down when you net out team effects and do HR/outfield flyball.

Feb 10, 2010 8:40 AM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

Thanks. I think you are typing the formula in wrong though. I'm getting Aardsma 3.41, Accardo 5.20, and Adams 4.17.

Feb 09, 2010 1:47 PM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

This will be in tomorrow's article in more detail. The cubic thing is limited only because there just isn't enough data to finely tune it quite that much. Nothing comes out significant when I try that. Thanks for the question.

Feb 09, 2010 1:23 PM on Part 2
 
Matt Swartz
(24824)
Comment rating: -

Thank you for pointing this out. The correlation of BABIP and Defense Independent Pitching Statistic is something I've discussed before. It's small but it's there. The benefit of using regression to do this is that it picks up this effect. Pitchers with higher K-rates have lower BABIPs, and both the extra K/PA and the fewer H/BIP lower ERA, but the regression will pick up both effects. The only thing that SIERA will leave out is BABIP effects that are uncorrelated with ground ball, strikeout, and walk rates, which are very small effects. That's why this is more of an ERA estimator based on skills than based on DIPS.

Feb 09, 2010 5:29 AM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Stay tuned then! It comes from a regression formula. Tuesday we'll spell out the methodology and Wednesday we'll spell out the regression itself and the data used.

Feb 08, 2010 10:11 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

It predicts park-adjusted ERA the following year best, and I have a large doubt that starters are systematically paired with certain weather and umpires, so that should even out when predicting the following year's ERA. It also does better than other HR/FB-luck-neutral estimators in same-year ERA so it covers all the bases. Although there could be parks that favor lefties and righties, I'm not sure how this would correlate with K%, BB%, and GB% enough to affect this.

Feb 08, 2010 7:10 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Absolutely. It uses all the same information, so it will stabilize just as quickly. Thanks for pointing this out.

Feb 08, 2010 3:37 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

As far as the diminishing run prevention effect of strikeouts, it does really matter where the BB and GB numbers are because those determine the number of base-runners and the double play ability to remove those base-runners. We did test the second order term for BB-rate, which we'll explain in more detail on Wednesday, but it kept coming up as insignificant so we left it out of the equation.

Feb 08, 2010 3:23 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Quick note that we didn't mention in the formula-- for pitchers who give up MORE fly balls and pop ups than ground balls, the ((GB-FB-PU)/PA)^2 term would be positive, but we basically made it negative in that case. So that term should be negative or positive depending on the sign of (GB-FB-PU)/PA.
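One way to sketch that sign handling in code, using hypothetical function names and made-up batted-ball counts (not real pitcher lines):

```python
def net_ground_ball_rate(gb: int, fb: int, pu: int, pa: int) -> float:
    """(GB - FB - PU) / PA, positive for ground-ball pitchers."""
    return (gb - fb - pu) / pa


def signed_square(value: float) -> float:
    """Square the value but keep the sign of the input."""
    return value * abs(value)


# Hypothetical ground-ball pitcher: more grounders than flies plus pop-ups.
gb_pitcher_term = signed_square(net_ground_ball_rate(gb=300, fb=200, pu=50, pa=500))

# Hypothetical fly-ball pitcher: the squared term comes out negative.
fb_pitcher_term = signed_square(net_ground_ball_rate(gb=200, fb=250, pu=50, pa=500))
```

Multiplying by the absolute value instead of squaring directly keeps the magnitude of the squared term and carries over the sign of (GB-FB-PU)/PA.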

Feb 08, 2010 2:14 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Are you asking how well this would match up with linear weights? If so, that's not directly in subsequent articles, but I think it would probably match it reasonably well, at least as more data is collected on batted ball numbers over the next few years and the coefficients are refined. Some of the strength in the estimation might come from situational pitching, which probably wouldn't show up quite as much in linear weights as I understand it, but certainly the magnitude of the interaction terms at the end might work out pretty well at least. It's a good question, but I'm not sure yet.

Feb 08, 2010 2:00 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Oops-- you're right. It should be higher/higher and lower/lower rather than the way we did it. Thanks for pointing that out. It shouldn't obscure the error though-- looking at BABIP-neutral statistics on a per out (or per 3 outs) basis isn't really BABIP-neutral at all.

Feb 08, 2010 1:26 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

The 2010 Annual will list 2009 SIERA and will compute 2010 SIERA according to the projection. Pretty soon, the 2003-09 SIERAs will be available on the Statistics Reports and the 2009 SIERAs will do very well at helping predict 2010 ERA, at least net of park effects.

Feb 08, 2010 12:29 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

SIERA helped find some of the mistakes in the first round of PECOTAs but it wasn't early enough to actually build it in to 2010 PECOTA. It definitely could be part of the process more next year, though I'm not quite sure about that.

Feb 08, 2010 12:15 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Definitely could be part of it. Glavine's career BABIP is .286, while Wakefield's is .281. SIERA and FIP do about as well when it comes to Wakefield. The thing about Glavine's BABIP partly is that he played in front of good defense, so that's not all the effect. It definitely would explain some of it, though.

Feb 08, 2010 12:11 PM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

It happens bit by bit, so it's tough to pick an exact number. I know that SIERA tested particularly well for pitchers in the 6.5-9.0 range of K/9 while doing about as well as everything else for K/9 above that.

Feb 08, 2010 11:54 AM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Haha-- unfortunately, Glavine baffles SIERA as well, at least for 2003-2008 where there are actually batted ball statistics recorded. His SIERAs look similar to his other ERA estimators, all ahead of his ERA (about 4.9 versus 4.2 for those six years). The thing about Glavine was that he was far superior to his peers at pitching to the situation. I think that pitchers with really high ground ball rates may be particularly good at pitching to the situation, but at least from 2003-09, Glavine is pretty average there so he is not a puzzle SIERA can foil.

Feb 08, 2010 11:52 AM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Exactly right. We also let them take on more realistic values because we unfoiled the regression. For instance, in Nate's formula: QERA = (a + b*BB% + c*K% + d*GB%)^2, it's essential that b is positive and that c & d are negative, but that requires that b*c is negative when it is actually positive in nearly every regression we ran, and it also requires that d^2 is positive, when it actually should be negative. Thanks for highlighting this point, though. I think it makes the metric more transparent.
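To see why the squared form ties the signs together, it helps to multiply it out. The sketch below uses made-up coefficients (a, b, c, d are hypothetical, not Nate's fitted QERA values) and simply lists the coefficient each term picks up once (a + b*BB% + c*K% + d*GB%)^2 is expanded.

```python
def expanded_coefficients(a: float, b: float, c: float, d: float) -> dict:
    """Coefficients obtained by multiplying out (a + b*BB + c*K + d*GB)**2."""
    return {
        "const": a * a,
        "BB": 2 * a * b,
        "K": 2 * a * c,
        "GB": 2 * a * d,
        "BB^2": b * b,
        "K^2": c * c,
        "GB^2": d * d,
        "BB*K": 2 * b * c,
        "BB*GB": 2 * b * d,
        "K*GB": 2 * c * d,
    }


# Hypothetical values with the required signs: b positive, c and d negative.
terms = expanded_coefficients(a=2.0, b=1.0, c=-1.5, d=-0.5)
bb_k_coefficient = terms["BB*K"]   # forced negative whenever b > 0 and c < 0
gb_squared_coefficient = terms["GB^2"]  # forced positive for any non-zero d
```

A regression with separate squared and interaction terms is free to give those two coefficients whatever sign the data supports.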

Feb 08, 2010 11:31 AM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

Very good question. We did run regressions on various subsets of data and this term was basically always positive. It fits with the general theme pretty well too. Think of it this way: when do you want a strikeout most? With runners on base. The more K's you get, the fewer runners are on base though, so it tends to get gradually less effective. In Wednesday's article, this will be fleshed out a little more, but the methodology behind a lot of the quadratic and interaction terms entails asking "which pitchers need this most?" That's why GB rate is more important for pitchers who allow more base-runners, and why marginal improvements in K-rate are less important the higher the K-rate goes.

Feb 08, 2010 10:48 AM on Part 1
 
Matt Swartz
(24824)
Comment rating: -

That's not true. More risk added brings more risk to the projection. Obviously +/- 20 wins is extreme, but +/- 5 wins is actually impossible, since the binomial distribution implies a larger standard deviation than that from randomness alone over 162 games. I was exaggerating to make a point. If it makes you feel better, I could say that I would rather have a team that was 81 wins +/- 12 wins than 81 +/- 7 wins.
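A quick back-of-the-envelope check of the binomial point, assuming a true .500 team whose 162 games are independent with a constant win probability (a simplification introduced here for illustration):

```python
import math


def binomial_win_sd(games: int = 162, win_prob: float = 0.5) -> float:
    """Standard deviation of the win total if every game is an independent coin flip."""
    return math.sqrt(games * win_prob * (1.0 - win_prob))


# Randomness alone for a .500 team over a full season: about 6.4 wins.
sd_wins = binomial_win_sd()
```

Under that assumption, pure chance already produces a spread of roughly six wins, so a projection cannot be tighter than that.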

 
Matt Swartz
(24824)
Comment rating: -

I can't tell if you are arguing against me or Fangraphs here. I agree that using projections to determine linearity is unwise, because of the Winner's Curse, but that's not really what I'm doing. I'm explaining that the only reason why linearity would not be a valid economic assumption is if there were capacity constraints on rosters that were binding-- in other words, if teams typically filled all of their vacancies with players above replacement level. They overwhelmingly do not, and therefore any team overpaying for stars beyond a linear price is just overpaying, just as Ed Wade does for veteran relievers and other GMs do for RBIs. In the coming weeks, I'm going to actually pin down a formula for MORP based on previous seasons' outcomes.

 
Matt Swartz
(24824)
Comment rating: -

Oops! Forgot to write that one in. I had him listed in my spreadsheet as a medium-size free agent because it was only $36 million guaranteed. Thanks for pointing that out.

 
Matt Swartz
(24824)
Comment rating: -

This is all exactly right, but I'm going to asterisk it at the end here. Almost all outfielders and almost all free agents are not worth it for the latter half of their deals, and almost all free agents are worth it for the first half. The Cardinals are getting a great deal now and a bad deal later. Overall, I think they'll come out even. A lot of the people you mentioned pushed their team into contention or to a World Championship in Manny's case during the first half of the deal. Most of them also sunk their teams financially later. It's a tradeoff. Thanks for the great examples. I think it illustrates the danger of the back end of deals very well.

 
Matt Swartz
(24824)
Comment rating: -

I'm not sure why you think that the actual free agent market is paying players non-linearly. From my preliminary work on this subject, I have yet to see evidence of that at all, and looking at Fangraphs' values certainly seems to lend some strength to the argument. Nate's old MORP formula was based on a previous version of WARP that was not using the correct replacement level leading to that result. I think you're missing a couple other things here. One is that you claim that when "you'd be looking at which set of players is most likely to provide you with the greatest number of wins" (between two 2-win players and one 4-win player), the 4-win player is higher-- that means that either he's not a 4-win player or the other 2 are not 2-win players, by definition. The point is that in expectation, they would add up the same. The number of bidders does not have a huge impact here as long as there are two bidders with similar values-- this is NOT the typical supply and demand framework at all! There is one team per player-- it's an auction, so the shortage of labor demand does not matter. Further, lots of mid-market competitive teams sign big-ticket free agents. The example here is Matt Holliday and the Cardinals. That's a great example right there. If he adds wins, he'll do it for whoever. Maybe the Marlins won't bid on free agents, but among competitive mid-market and large-market teams, it's really a matter of the number of free agents they sign, not how their win totals are distributed. Also, there is never really anybody with a 0% chance of getting injured and no business can ever write off an outlier that is greater than 0%. Even if it's 5%, that's built into the probability distribution. Ultimately, there's no real reason in these counterpoints that you're making to refute what I'm saying. I think it should be linear and I think it pretty much is, but I'll get back to you on that one as I get more results.

 
Matt Swartz
(24824)
Comment rating: -

At being a shortstop, he is 14-27 runs above average. I'm fully aware he's not an average hitter. I meant defensively he's 14 to 27 runs above average.

 
Matt Swartz
(24824)
Comment rating: -

The PA in each row are those at the position. The other statistics are aggregate. So it's not 40 RBI in 102 PA. It's 40 RBI in all his 377 PA, with 102 of those 377 PA being in CF.

 
Matt Swartz
(24824)
Comment rating: -

True. But is risk bad? Not necessarily, as long as it's built into the price. I'd rather have an 81 +/- 20 win-team than an 81 +/- 5 win-team.

 
Matt Swartz
(24824)
Comment rating: -

This is a very good point, but the marginal effects of success in the playoffs are so small and unpredictable compared to other effects that I don't think it's necessarily that big of a deal. Certainly the back of the roster is less important in the playoffs, but with in-season injuries, selecting the back of the roster isn't all that clear. I think you highlight another good reason to limit my reliever-openings to 4, though. The 5th, 6th, and 7th relievers aren't really getting much time in the playoffs.

 
Matt Swartz
(24824)
Comment rating: -

Well, I kind of like the Cards' move fine. It's risky, but risk can be good if it has good upside, which this does. Holliday is much better than Carlos Lee though-- Lee never topped an EqA of .298 and Holliday has been consistently in the .320 range for a few years. It's a good point that you're making, but Holliday is still better than Lee. I do think Holliday is going to be worth less than his contract at the end of the deal, but that's pretty much true with all free agents. That's my general point about long-term deals-- you generally pay equal dollars per year for declining value. You end up with a bargain early and an overvalued asset later. That's fine. You get the same issue at a restaurant. You pay for the meal afterward but you still ate.

 
Matt Swartz
(24824)
Comment rating: -

I definitely have been discussing articles with Eric a lot recently, since we're discussing some similar topics. I think the risk thing is important but tricky. First you need to decide if risk is good or bad for an individual team (and it depends) and next you need to decide if 2 players worth 2 wins are riskier than 1 player worth 4 wins. The values concentrate differently, in all likelihood, with more of a possibility of getting 0 wins if you go all in on one player, but you have to deal with twice as many bloopers and lineouts with two guys as opposed to one, so it's not clear that one is less risky. Your point about Bay is true-- Tommy Bennett did an excellent article on this at the time. I do think that changes things, but it depends on the team's cash flows how to space out the money. It's certainly worth less in the future if it's the same dollar figure, though.

 
Matt Swartz
(24824)
Comment rating: -

Well, I don't know about the concern about blocking players being a major factor. There are always mid-season trades if need be, and they had enough openings that concerns about blocking one player could be avoided by just filling another position. There is always room for more pitching, in general, too-- someone is bound to get injured on your team, or on another team to which you could trade another starter. Also, with respect to the issue of making 4 free agents work-- there are plenty of 1.5-win players available. Look at the list of free agents from this past offseason. Picking a set of four guys for them to sign was easy, and I could have selected a totally different set of four. Adding another bidder plays some role, but it's not going to make a difference on average. We're talking about rough numbers here.

 
Matt Swartz
(24824)
Comment rating: -

Why is diversifying good for a baseball team? Remember Eric Seidman's article-- risk is good when you're less than likely to make the playoffs and bad when you're more than likely to make the playoffs. We were discussing these pieces together for a while, and I think that's pretty much the important conclusion to realize. However, remember there is risk baked into summing over four 1.5-win players too, with so many more bloopers and lineouts that could happen-- risk doesn't cancel out over a sample size of four, it adds up. There is just more of a concentration of risk on one side that is much steeper for one player because his injury could doom you. For a team like the Cards, it was probably about even either way I think. They weren't a foregone conclusion to make the playoffs after signing him, nor would they be after signing four 1.5-win players, but it certainly makes them the favorite and puts their odds maybe over 50%.

 
Matt Swartz
(24824)
Comment rating: -

It certainly would be an interesting study, but tough to pull off with trustworthy methodology. My understanding is that the marginal dollar gain from making vs. missing the playoffs is so huge that "star power" is probably trivial in comparison. Thanks for your thoughts.

 
Matt Swartz
(24824)
Comment rating: -

I think there might actually be a lower cost to using a good reliever on the road, because you don't have to use a reliever in the bottom of the 9th. Using your 2nd and 3rd best relievers when you are losing in the bottom of the 7th and 8th innings isn't wasting resources as much as using your 2nd and 3rd best relievers when you are losing in the top of the 7th and 8th innings, because in the event that you are still losing afterwards, the top of the 9th inning also requires another reliever to be sent to the mound by home managers, but not by road managers if they are losing in the bottom of the 9th. Even if managers have a better chance of winning at home and want to hold back less, they have a higher probability of needing to use a reliever in the 9th so they need to hold back more. Really interesting article, all in all. I'm not sure how I would have approached a lot of this stuff, and I think you probably used the best methodology given the circumstances. Looking at managers' qualities piecemeal is undoubtedly the way to actually evaluate their decision-making performance.

 
Matt Swartz
(24824)
Comment rating: -

I don't know who this Schwartz fellow is, but he should ask the Swartz fellow why he said he didn't believe that the behavior in this experiment was irrational ;-) In all seriousness, the point I was trying to make but maybe wasn't clear is that I'm not sure GMs are irrationally risk averse, but the economic assumptions required to generate a rational perfectly competitive market do not hold in baseball due to lack of free entry of competing businesses, so they may be. Thanks for your thoughts.

Jan 25, 2010 9:43 PM on Analyzing RoboPitcher
 
Matt Swartz
(24824)
Comment rating: -

I would agree that some low end free agents are more likely to go to other teams. This is for a couple reasons. One is that "replacement level" isn't always true in practice and teams with particularly bad alternatives that are worse than replacement are more likely to sign a mediocre player. Basically, let's say that we have a guy who is worth 0.5 wins and the market prices him around $3MM. For a team with a bad alternative who is actually going to produce -2.0 wins, that's a pretty good deal. Of course, if there is a replacement level player worth 0.0 wins, that's a better option so you'll probably observe that more often. That's part of the reason you see a whole host of minor league deals. The other issue is that for bad teams especially, there is an indirect cost of calling up a bad player because you use up his service time. If you have a guy you expect to be worth 3 wins a year starting next year but is only worth 0.5 wins now, why call him up now? You can get 2.5 wins extra from his 6 cost-controlled years by signing a replacement level player instead. The benefit of blocking a developing player who would theoretically be above replacement now but is likely to be further above replacement in the future is valuable.

 
Matt Swartz
(24824)
Comment rating: -

This is a very good point. I guess the concentration of the Red Sox and Yankees in the AL East does cause that kind of issue. I guess that would affect the market, but given that the Cardinals signed Matt Holliday and the Mets signed Jason Bay, and neither are top 2-3 revenue clubs, I suspect that maybe the marginal revenue isn't as distinct among top teams even if total revenue is (certainly they must be highly correlated though). There's also another aspect that I'm not approaching here that you remind me of, which is externalities-- I mean, the Yankees and Red Sox really exert externalities on each other's playoff probability when they sign free agents and in those cases the marginal revenue might get higher because of what it does to the competition too. There are definitely a lot of complicated issues here, but I think that the main one is the roster construction issue of number of openings and probably these results are enough to ensure it's safe to use a linear $/WARP framework.

 
Matt Swartz
(24824)
Comment rating: -

I agree that optimal roster discussion is probably a good task, though it's probably still in its infancy. Have a look at recent Eric Seidman articles. He and I have had a few discussions recently about certain free agents and risk, and I really like what he's been able to do in applying his ideas to Pineiro and Garland, and other free agents. Basically-- risk isn't always bad in baseball. The average team wants risk. It's only teams who are very good who want to lower the standard deviation of their win total distribution. Let's say the Mets expect to win 83 games next year. They want the standard deviation of their win total to be higher, because it entails a higher chance of reaching the playoffs. Similarly, the Cardinals are almost a shoo-in and probably could expect 92 wins, say, in a weak division. They want their standard deviation as low as possible. Thus, Seidman explains that the Cardinals would be a better match for Garland and the Mets would have been a better match for Pineiro (though he signed with the Angels, which makes sense for the same reason).
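The 83-win versus 92-win comparison can be sketched with a normal approximation. The 90-win playoff bar and the two standard deviations below are assumptions made up for illustration, and the function name is hypothetical.

```python
import math


def prob_at_least(threshold: float, mean_wins: float, sd_wins: float) -> float:
    """P(wins >= threshold) under a normal approximation to the win total."""
    z = (threshold - mean_wins) / sd_wins
    return 0.5 * math.erfc(z / math.sqrt(2.0))


PLAYOFF_WINS = 90  # assumed bar for reaching the playoffs

# An 83-win team is below the bar, so a wider distribution helps.
mets_low_risk = prob_at_least(PLAYOFF_WINS, mean_wins=83, sd_wins=7)
mets_high_risk = prob_at_least(PLAYOFF_WINS, mean_wins=83, sd_wins=12)

# A 92-win team is above the bar, so a narrower distribution helps.
cards_low_risk = prob_at_least(PLAYOFF_WINS, mean_wins=92, sd_wins=7)
cards_high_risk = prob_at_least(PLAYOFF_WINS, mean_wins=92, sd_wins=12)
```

Whichever side of the bar the expected win total falls on decides whether extra variance raises or lowers the chance of clearing it.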

 
Matt Swartz
(24824)
Comment rating: -

No offense, but I don't think you know what you're talking about when you say "I don't think this analysis is economic." I'm a PhD economist, so I'm pretty sure it's economic and I'm pretty sure that counts. If you don't agree with economic theory as it applies to this problem, then say that, but try to avoid telling me I'm not using economic analysis. I'm also confident that the past decade of work at Baseball Prospectus, including much of the pioneering work by Silver, has shown that there is little ability to control the probability of success once you reach the playoffs, with notable exceptions like the Secret Sauce formula that you can find on the stat pages. Of course, teams put different subjective weights on these different outcomes, but regardless, that would not really change the fact that signing a superstar and signing several average players are both viable choices even for teams who have higher aspirations. I'll say that again for clarity-- the whole point of this article is that whether achieving the 100th win or the 60th win, teams who sign superstars will generally have the ability to add those wins by filling in several average players as well. There are very few situations where a team has almost no replacement level players in important roles. As far as your 71st win versus 89th win tradeoff, I addressed that in the last article, this one, and again in the comments-- the point is that the vast majority of teams who sign free agents are the ones with the highest value for a win. The reason is that those teams generally value the win more. You can figure that out from their behavior, which is both intuitively obvious and basic economic logic. No one is questioning Silver's marginal value of a win-- which isn't a subjective argument either, but an objective one with noise mixed in-- but rather approaching how the market works given that. 
Since teams who sign free agents are generally competitive and generally set the market, there is no conflict in those thoughts, what I'm saying, and economic theory.

 
Matt Swartz
(24824)
Comment rating: -

Oh, it's a very good point, but just make sure to separate indirect costs that vary by player and indirect costs that are the same for all players because you buy 25 plane tickets, 25 uniforms, 25 sets of bats, etc., regardless. I definitely appreciate the comment.

 
Matt Swartz
(24824)
Comment rating: -

It could be that optimism plays a role, but that would still mean those GMs are overvaluing wins. That said, I think it's more likely that somebody thinks a pitcher who has a 4.50 career ERA can be spun into a pitcher with a 3.50 ERA with the right pitching coach than it is for a team to think a pitcher who has a 3.50 career ERA can be spun into a pitcher with a 2.50 ERA. Same idea-- a GM seems less likely to think a .325 hitter could be turned into a .365 hitter than he is to think a .285 hitter could be turned into a .325 hitter. I like the transaction cost idea a lot, but I think short of health care costs, it's really hard to see other stuff playing a huge role (i.e. you get 25 uniforms and 25 plane tickets regardless of whether a schlub is wearing the uniform in the airplane seat or not). Health, maybe, but I don't know-- how much does Dr. Andrews charge?

 
Matt Swartz
(24824)
Comment rating: -

It could also be explained by teams who sign stars overvaluing their worth. The difference is this test I've run where I checked how many openings teams have. If teams who sign superstars had several replacement level players who they neglected to replace, then the market shouldn't be nonlinear. The point is that if the market with respect to WARP is nonlinear but shouldn't be, then we can't call that premium the market value of those wins any more than we can consider RBI to be worth more. If there was a way to replace a 4-win slugger with a 4-win player who has fewer RBI at a cheaper price, the cheaper price is the market value of 4-win players. If 4-win players can be eschewed in favor of pairs of 2.0-win players at a cheaper combined price, then the cheaper combined price represents the price of 4 wins as well.

 
Matt Swartz
(24824)
Comment rating: -

I'm pretty sure you get it based on this, but I'm not quite sure about the distinction between those two things. The point is to replace a player, typically one who is replacement level at the same position (or at least by shifting players around the diamond and bumping someone replacement level out). Sure, sometimes it won't be a replacement level player, but teams with the worst alternatives are typically the ones most inclined to sign free agents. I'm glad you were able to follow without taking econ. I always strive to explain the econ well enough that it makes sense as common sense rather than just empty theory :-)

 
Matt Swartz
(24824)
Comment rating: -

I am not up to checking the regression yet, but based on my results in this article and the next, the point is that what teams are paying for is wins. The analogy with stars of restaurants is not the same because you can't go to a one-star restaurant and a two-star restaurant and feel as satisfied as you would going to one three-star restaurant. If b2 turned out to be significant, then that would indicate a market inefficiency or something missing like arbitration compensation or replacement level too low, etc. But if adjusting for arbitration and proper replacement level, I still find b2 is statistically significantly positive, then that means something is wrong with the market that can be exploited.
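As a rough illustration of the kind of check being discussed, here is a minimal sketch. It assumes b2 is the coefficient on a squared WARP term in a regression of salary on WARP, which is a guess at the commenter's specification, and every number in it is made up. It only recovers point estimates; a real test of whether b2 is statistically significant would also need standard errors and actual contract data.

```python
def fit_quadratic(warp, salary):
    """Least-squares fit of salary = b0 + b1*WARP + b2*WARP**2 (pure Python)."""
    # Build the 3x3 normal equations (X'X) b = X'y for columns [1, w, w^2].
    rows = [[1.0, w, w * w] for w in warp]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    y = [sum(r[i] * s for r, s in zip(rows, salary)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    m = [a[i] + [y[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution to recover [b0, b1, b2].
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        b[i] = (m[i][3] - sum(m[i][j] * b[j] for j in range(i + 1, 3))) / m[i][i]
    return b


# Hypothetical data: WARP totals and salaries in $MM (not real contracts).
warp = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0]
linear_salary = [4.5 * w for w in warp]                # market pays for wins only
convex_salary = [4.5 * w + 0.5 * w * w for w in warp]  # extra premium for stars

b_lin = fit_quadratic(warp, linear_salary)
b_cvx = fit_quadratic(warp, convex_salary)
```

In the linear case the fitted b2 comes out at essentially zero, while in the convex case it picks up the premium built into the made-up salaries. A persistently positive b2 in real data, after the adjustments mentioned above, would be the signal of something exploitable.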

 
Matt Swartz
(24824)
Comment rating: -

I read through that discussion between you and Tango the other day, but decided to hold off on commenting until I published all three articles. I guess the short of it is that I think that you were right about the kind of tests that need to be run and he was right that it'll come out linear even if he didn't have the right reasoning down. Basically, Friday you'll see just how many replacement level players in key roster spots (i.e. lineup + rotation + top four bullpen guys) are lined up as of early November. It's a lot, even for good teams. So, while I agree that there is absolutely an economic argument for a premium due to concentrating talent, there are just so many ways to add wins that signing a superstar isn't necessary. Think of the best free agents signed this year-- Jason Bay and Matt Holliday-- and look at how many holes were in each roster as of November. I do think that current estimates of replacement level-- 40 to 50 wins for a whole team-- look a little low unless you count displaced talent. Specifically, the first example I'll give Friday is the Phillies and Greg Dobbs. He's not quite 20 runs below average even if he had to hit off lefties, but the effective pinch-hitting changes by making the Phillies use a worse PH vs. RHP make it about 20 runs below average. So I think displacing people is part of the story. Also, remember that calling up talent from the farm costs service time, so I don't know if that's really a preferable solution in many cases. With respect to what MORP is trying to represent, it's a good question. I'm basically assuming that teams are setting marginal revenue to marginal cost on average, but that there is some noise in the estimates of win values of players. I would say that marginal revenue is really equal to marginal wins x marginal dollars per win. Further, I'm assuming that marginal dollars per win is probably negligibly different between the top few teams but that marginal wins is really where there is more variance. 
From there, I'm basically assuming a first price sealed bid auction. Due to the winner's curse and the fact that it's a first price auction, I assume teams probably do shade their bids down some but that those are negligible differences. Really, I think that on average, a model of ($/WARP constant)*(WARP) pretty much explains the market and can give you MORP but that teams can do things like the Phillies did with Polanco (check my earlier article on this if you haven't seen it) by adjusting the replacement level of the (N+1)th team where there are N free agents. But in general, I do think that the model as is explains it pretty well. I'm not sure I like monopsony because I think of the 30 teams as being the labor demand side rather than 1 MLB, so that would put it at an oligopsony, and while I do think that the draft is an example of implicit collusion of the repeated prisoner's dilemma ilk, I'm not sure that the free agent market does that, mostly because teams from different divisions bid, and so I'm pretty sure they bid based on the expected marginal revenue (conditional on winning, winner's curse thing). Thinking more about it, I do see the possibility that there is still some fraction of marginal revenue product that the players do not capture in expectation, but I think in that case MORP still answers the question "What do teams pay for this number of wins on the free agent market on average?" so I think it's a valuable tool even without looking at actual marginal revenue. I definitely liked your discussion with Tango, though. I think you saw something he missed, but I think the results make it such that it doesn't matter.
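The ($/WARP constant)*(WARP) model above can be sketched in a couple of lines. This is only an illustration: the function name `morp` and the $4.5MM per WARP rate are placeholders chosen for the example, not estimates from the articles.

```python
def morp(warp, dollars_per_warp):
    """Linear market value in $MM: a constant price per win times wins."""
    return dollars_per_warp * warp


RATE_MM_PER_WARP = 4.5  # hypothetical rate, for illustration only

one_star = morp(4.0, RATE_MM_PER_WARP)
two_average = morp(2.0, RATE_MM_PER_WARP) + morp(2.0, RATE_MM_PER_WARP)
```

Under a linear price, one 4-win player and two 2-win players cost the same in total, which is why a team with enough replacement level roster spots to fill does not need to pay a premium for concentrated talent.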

 
Matt Swartz
(24824)
Comment rating: -

Well, Bonifacio is probably replacement level for 2010 even if his upside is higher. He certainly is the same quality of player as I was thinking of, but unless other players of his caliber were replaced on other teams, we wouldn't have counted him. I think it may be clearer Friday, but please ask questions if it's not.

 
Matt Swartz
(24824)
Comment rating: -

@archilochusColubris: The number of replacement level players will outstrip the number of free agents signed, you'll see on Friday. A lot of teams don't sign or trade to improve on replacement level production-- and end up getting it as a result. Think about how many teams have 5th starters with ERA's well into the 5s. The idea is that as we get a sense of who is replaced, we get a sense of what replacement level is. The example you gave of 6 low-WARP players vs. 3 high-WARP players actually will illustrate the point of exactly how low and high WARP of free agents are. There are only a small handful of free agents that will get you more than 4.0 WARP reliably, while you can certainly see an abundance of guys in the 1.5-2.0 range. I guess it's possible that a team could try to sign 3 guys with 4.0+ WARP, but that's really rare and not common enough to affect the market price, mainly because you'd really need two bidders to push up the price. Maybe the Yankees signed Sabathia, Pettitte, Burnett, and Teixeira last year, but that's a rare year even for them, and they probably did not have people competing with them to sign several of them rather than individual teams competing to sign one or two of them (e.g. the Red Sox were probably not also trying to sign three of these guys, probably just Teix and a pitcher at most). The scarcity is reflected in the already higher price-- it's just that the scarcity comes in linearly. The distinction about how far along the win curve teams are, on average, is an important one. The average team does not sign free agents-- the team that values the player the most does. Thus, teams that sign free agents are the ones that are likely to be the most competitive. For them, the marginal value of a win is already high since most of them are already in the 85-95 win range. For your MR measurement, the problem is that we do not have counterfactuals. 
In other words, even ideally we could only know what the Phillies' revenue was last year given that they won 93 games-- we do not know what their revenue would have been had they not signed Ibanez and won, say, 89 games. Any attempt to solve this empirically would be muddled by the fact that the Phillies clearly thought 89-73 was unpalatable compared to 93-69, so even if we had the revenue of other teams who went 89-73 or even the Phillies in previous years where they won 89 games, those numbers likely understate the difference between 89-73 and 93-69 because in instances where the team won 89 games, they likely didn't have as huge a gap (and didn't want to spend the money), and in instances where a team won 93 games, they likely had a larger gap. In reality, assuming the very rich individuals and companies who own baseball teams have at least a decent sense of their marginal revenue, we can use the Axiom of Revealed Preference to impute a decent approximation of marginal revenues. Let me know if I left anything out, and thanks for giving such an incredible amount of thought to this. It's a very challenging job to redo MORP, and I encourage thoughtful comments like these to help guide me along and challenge my assumptions (and change them where necessary!). @JDSussman: I would say that this is estimating what teams should pay for wins, conditional on the fact that the market is producing the correct price on average. For instance, if Omar Minaya overestimates the win-value of Jason Bay and overpays but Jack Zduriencik sneaks in and gets Chone Figgins at low $/WARP, that would not be an issue in the model as they would cancel out. Where the issue would lie is if winning actually doesn't add much to revenue but Steinbrenner just lets Cashman spend to a certain budget because he can afford it-- rather than because he can make money by allowing him to do so competently-- in which case MORP won't know that this is happening. 
It assumes that, on average, teams are paying for wins at their marginal benefit.

 
Matt Swartz
(24824)
Comment rating: -

Sunpar, thanks for your clarifying question. Although Bonifacio probably is replacement level, that wasn't quite what I meant. I actually meant literally players being replaced. In Friday's article, you might see what I mean a little more clearly, but some examples would be that the Phillies had Greg Dobbs as the next-best in-house option at 3B but chose to replace him with Placido Polanco, the Red Sox had Jed Lowrie in the organization as the next-best option at SS but chose to replace him by signing Marco Scutaro, the Rays had Dioner Navarro as the next-best option to trading for Kelly Shoppach at C, etc. Friday's article will definitely show you more about what I meant. Thanks for clarifying.

 
Matt Swartz
(24824)
Comment rating: -

It's definitely not going to be ready yet for the PECOTA cards. I'm not sure of a timeline yet, but I think the cards are coming out much sooner than I could ever get this done.

 
Matt Swartz
(24824)
Comment rating: -

Albert Pujols is double-amazing. He did two ridiculously amazing things if you want to break it down. Firstly, he went from a no-name 13th round draft pick into a mega-superstar. Secondly, he went from being an elite young player to maybe the best player in baseball history when all is said and done, instead of regressing like maybe 90% of other elite young stars who make that jump. Put it this way. Albert Pujols signed his current deal in February 2004. From 2001-2003, his first three years of baseball, he hit .334/.416/.613 playing in 158.3 games per season. Now, if you want to figure any normal regression to the mean, no one on the planet would have projected Pujols to do what he did from 2004-2009, which is actually to get ridiculously better: he hit .334/.435/.636 playing in 154 games per season. Not only that, the latter performance actually was during an era with lower hitting totals so his OPS+ went from 165 to 175. He didn't regress to the mean, he regressed to deity. Not even Scott Boras would have had the chutzpah to suggest that Pujols was going to play 160 games with an OPS+ of 188 six years later. They priced his free agent years, which were 3-8 years away, at $16MM. That's ridiculous. If you knew a guy would play 160 games at first base with a 188 OPS+ and plus defense and he offered his services for one year-- and this was guaranteed-- we're talking $40-50MM that teams would pay for that if it were guaranteed. 13th round to all-star in a couple years...that's tough to believe. But getting ridiculously better? Wow.

 
Matt Swartz
(24824)
Comment rating: -

I wonder if some kind of Heckman 2-step kind of thing would work. I haven't done one in years, but I think this is the kind of selection problem that the Heckman correction is supposed to address?

 
Matt Swartz
(24824)
Comment rating: -

This is definitely true, and I might be exploring this at some point. Thanks!

 
Matt Swartz
(24824)
Comment rating: -

Very interesting! I really like this suggestion, and I'm going to + this comment :)

 
Matt Swartz
(24824)
Comment rating: -

Yeah, this is true but this would actually account for more variation in NM WARP3 since those are the drafted players. The rule that Pujols broke here is how much better he got after he signed his long-term deal, and how much more expensive wins got too.

 
Matt Swartz
(24824)
Comment rating: -

Fair point and certainly noted. I'm a Phillies fan so I have an inherent frustration with Ed Wade hardwired. He certainly can't be blamed for the lack of minor league talent. Of course, I suspect that trading Berkman, Oswalt, and Lee might have worked if he kicked a little money in to the players to waive their no-trade clauses. That generally works in some cases, but it's not clear if they could. All in all, the Astros are far enough away that they should be trying to get any value in the future and trading guys like Hunter Pence would certainly help that goal. Definitely not his fault that they're in this mess, but it's likely to be his fault if they can't get out of it.

 
Matt Swartz
(24824)
Comment rating: -

Oops, I think I meant to make the cutoff 2.5 or just meant to say "one of them" and wrote "none of them." Either way, I did miss the Burnett 2.4 in my article. You're right about Millwood and Washburn, but in reality, both of those were BABIP luck mostly. Of course there are exceptions, but the general rule is that you get most of the value from free agent deals up front. Especially coupled with the more recent article, this seems particularly important to note.

 
Matt Swartz
(24824)
Comment rating: -

Thanks. These are all good ideas and things I'm working on trying to do.

 
Matt Swartz
(24824)
Comment rating: -

Let's look at this another way: .300/30/100 This is a lot of information. If I tell you that I have a hitter I would trade to your team who hit .300/30/100 last year, that's a lot of information. That's not a bad baseball player. What we don't know is whether he walks a lot or doubles a lot, or if he's a hack whose only extra-base hits are home runs. Whether that guy hits .300/.400/.580 or .300/.330/.490 is relevant, and I think that's why you all subscribe to BP instead of relying on your local beat writers for statistical analysis. You know better. However, Eric and I have done some research that lets us know even better than that. There are holes in some metrics that are biased towards some players and away from others, and using the wrong ones is the difference between knowing what everybody else already knows and knowing what smart front offices do to be better teams. You subscribe to BP because you want to know better. There are lots of great baseball writers out there. It's our national pastime. You subscribe to BP because you want to know better, and these are not cosmetic changes to statistics. I think Kevin is dead on when he says we're going to rock your socks off. If you don't want to read Eric's and my statistical analysis of why SIERA is the best estimator of its kind, that's fine-- but when Christina waxes poetic using SIERA, you'll know she's not only entertaining the hell out of you-- she's also giving you nothing but the best information, communicated clearly. If you do the math, we charge something like three cents per article. You don't need to read every one to get the bang for your buck that you need to make your subscription worthwhile. It's to be expected that some people will subscribe to BP for Christina and Steven and others will subscribe for Eric and me. That's what a bundled product is rather than charging per article. As a unit, we try to be accurate with our math and clever with our words. 
I'm better at being accurate with my math, but you subscribe to BP because snark using W-L and RBI isn't good enough. The benefit of using OBP over AVG probably wasn't obvious at first either. We want to continue to lead the industry in baseball analysis, and especially with Joe leaving, we can't rely on momentum to keep your readership.

 
Matt Swartz
(24824)
Comment rating: -

The word model is a verb. It's what models do. They model things. Using model as a verb is not new to BP nor to science nor to supermodels. There seems to be a lot of anger directed at new statistics, largely because the information of their existence arrived at the same time that Joe departed. Naturally, no one thinks that Joe leaving BP makes it better, but the metrics aren't replacing Joe. That's just something that is being added by some of the remaining and new part-time writers at the same time that another writer departs. That said, there seems to be a lot of movement towards a feeling that current statistics are "good enough." The Mariners went 61-101 in 2008 and 85-77 in 2009 under new GM Jack Zduriencik. If the Mariners thought that the current metrics available were good enough in 2008, they would not have seen that kind of improvement. There's a lot to be learned about statistics and it's imperative that the site that writes the very best articles about baseball be staffed with the very best statistics.

 
Matt Swartz
(24824)
Comment rating: -

If that's what you got from my article, I see why you didn't like it, although I can't quite see why you'd be insulted or hurt. No one needs to get value from every article to get value from the site, and we clearly have more statistically inclined and less statistically inclined readers as well as other subsets of readers within each group. However, I would suggest you read the article a little more closely because the conclusion is specifically that no team got enough wins from free agent eligible talent alone, nor from cost-controlled talent alone, in 2009 to make the playoffs. Given the previous article on free agents declining during even the second year of their contracts, the idea that you rebuild by scouting and development and not by signing free agents for the sake of gradual improvement seems relevant and seems like a lesson many teams have yet to learn.

 
Matt Swartz
(24824)
Comment rating: -

Definitely true. I wrote about this back in July: http://baseballprospectus.com/article.php?articleid=9263 They do and should value the player more in July. The reason is that the odds of a player worth X wins making the difference between making the playoffs and not are low in April because one team can sink or run away with it, but the later in the season, the more likely one win makes a difference for some teams. Imagine if teams could have traded pitchers to Minnesota or Detroit last October for game 163. You wouldn't exactly prorate the player's value to value them for one game. Whether it's a good idea to sign players to flip them later was something I looked at here: http://baseballprospectus.com/article.php?articleid=9869 Basic result was that you need to have at least an outside shot and it needs to be a flippable player like a pitcher rather than a 3B for which you don't know if there will be a team in a pennant race with a need for him.

Jan 04, 2010 7:42 PM on The Culture Club
 
Matt Swartz
(24824)
Comment rating: -

I considered this, and I'm certainly going to explore some historical analysis on this type of stuff as I dig up the data. I think that it's dangerous to draw conclusions about the current market from what it looked like in the past. The point here is that this year all eight playoff teams had contributions from both NM and AM contract status players. It may not be possible to say that will be necessary for 2010 or anything like that. It's pretty much descriptive, but the likelihood is that the data is pointing at something accurate about how to build a winning team, even if you can't say for sure or carry it forward.

 
Matt Swartz
(24824)
Comment rating: -

I definitely like the suggestions from several commenters to go back and check this out for historical years. I'm going to work on gathering this data as well as I can, but it's much harder to get service time data. I also have my eye on looking at how much money was spent on each category of contract too, and I appreciate that suggestion as well. Most people would like to see historical information so that we can draw conclusions about changes in what it takes to build a winner. I guess given the description of the A's success in Moneyball, that's certainly worth checking. I do think, though, that what this current snapshot shows is that you clearly can't win by just throwing money at free agents without a solid homegrown group (White Sox, Cubs) and you can't build a winner without spending money on free agents (Marlins, Rays). At this stage the winners all seem to have decent showings in both NM and AM contract status players, and that's certainly worth knowing. It goes back to how to build a winner, and the answer is that players drop off as they go further into their free agent contracts but you need a number of guys in those contracts who are still producing.

 
Matt Swartz
(24824)
Comment rating: -

Rollins had more than six years of service time and was thus an AM contract. He had the ability to be a free agent and happened to be the victim of an excellent buy low strategy from the Phillies. His numbers exploded the second he got his contract and he could have earned a lot more had he waited.

 
Matt Swartz
(24824)
Comment rating: -

I looked at the percentages and basically found that winners tended to have more of their wins from AM contracts, but that this was mostly because there was more variation in WARP3 totals from AM contracts than wins from NM contracts.

 
Matt Swartz
(24824)
Comment rating: -

I wouldn't say it's balance per se. Check out the Yankees! But definitely you need something from NM and something from AM to amount to anything. The Yankees' AM WARP3 total was still less than 10 teams in the league and 2 in the AL East. You need something from both. The Braves are probably in a good position-- that would definitely be my conclusion. Checking out the projections from the players they have in different contract statuses over the next couple years really should paint that picture more clearly. I think the Brewers are an example of a team that needs a homegrown core to win because even getting 12 wins from auction-market contracts, which was about average for the league, didn't make them contenders.

 
Matt Swartz
(24824)
Comment rating: -

Thanks. I will work on making a chart for the next time I visit this topic. This probably will be part of a series of articles on the same topic.

 
Matt Swartz
(24824)
Comment rating: -

Here is Nate Silver's article from a few years ago: http://www.baseballprospectus.com/article.php?articleid=4464 Note that the catcher's aging curve not only peaks a year sooner, but it also starts to peak even earlier, meaning that the length of time where a catcher is a major league quality catcher is much shorter.

 
Matt Swartz
(24824)
Comment rating: -

It's not a terrible answer, but perhaps it was not blunt enough. I'll be clearer. You found a couple of examples for two large market teams who developed good catchers who lasted a long time. If you tried, I'm sure that you could find quite a number of other catchers who aged well, and I could sit there and try to dig up several catchers who did not age well for each one catcher who did if I really had the time. The fact that catchers generally do not age well is a widely accepted fact and my conclusion was hardly a revelation. That you can think of a couple counterexamples is hardly going to change the fact that catching provides wear and tear on the body. Your counterargument amounts to the following: "I heard of two guys who aged well who also played catcher. Therefore, it's impossible to say that catching accelerates the effects of aging." Is this your conclusion? No, your conclusion is that you think I didn't prove something that is already proven when I merely gave examples-- that's true but I'm still right, and the goal of this article wasn't to provide a scientific study but to show through repeated examples how unlikely it is that a rebuilding team helps itself by signing free agents. Do you disagree with me? I suppose you probably could find a counterexample but I'd still be right. Listen: the plural of anecdote is not data. Posada's first contract before he signed as a free agent merely highlights the fact that you were unable to cook up better examples easily. If you could not come up with the minority of catchers who did age well when we provide you with a search tool and comparable players in each PECOTA card, you're not trying. However, catching is grueling and its effects make it harder to age gracefully. That's not a conclusion I can claim credit for, because it's obvious.

 
Matt Swartz
(24824)
Comment rating: -

I never thought that free agents were universally bad deals. I thought that they are generally most valuable in their first season, and thus, they are of limited use to rebuilding teams.

 
Matt Swartz
(24824)
Comment rating: -

I'm not sure that I would say these signings are useless-- these players provide good value early in contracts. I'm simply expressing how much they make up for this by being bad value later in the contracts, and therefore concluding they are detrimental to rebuilding teams.

 
Matt Swartz
(24824)
Comment rating: -

Posada signed with a year remaining of arbitration, so this is not relevant. Varitek was a free agent, and did provide good value in the first year of the deal. The latter three years of the deal were somewhat of a wash with 1.3 WARP3 in 2006, 4.2 in 2007, and 2.0 in 2008, given the AAV of $10MM for the deal. This is maybe a slight bargain, but it certainly would be questionable for a non-contending team to pay for his excellent 2005 just to get fair value for 2006-2008. Few catchers do provide value and the fact that you may have found a borderline exception and an irrelevant case is hardly enough to dismiss the range of players cited here. Ausmus, Torrealba, and Ramon Hernandez would all have been ill-advised signings for rebuilding teams. There will always be exceptions, but it certainly is a valid statement to say that it is usually a bad decision for a rebuilding team to sign a free agent catcher in the hopes that he will provide value later in the contract.

 
Matt Swartz
(24824)
Comment rating: -

I left out Lowell and Linebrink mentioned above, as well as a few others. There are a number of examples of bad signings that I had to leave out to make it flow.

 
Matt Swartz
(24824)
Comment rating: -

You are absolutely correct that there is a moral hazard problem with GMs and owners, and this is widely accepted. However, this article focused on teams that are out of contention and owners should be aware of this and be against wasting money on inefficient resources.

 
Matt Swartz
(24824)
Comment rating: -

I can't tell if this is a joke. I hope so. He's been a defensive disaster and he didn't exactly make them contenders.

 
Matt Swartz
(24824)
Comment rating: -

I will try to do so, but it is tough to find complete data on free agents prior to 2006. MLBTradeRumors.com does a very good job at consolidating all this information, and it was hard to get information from before they existed. I certainly will be expanding this area of research going forward, though.

 
Matt Swartz
(24824)
Comment rating: -

Colin was explaining in this article how economics is not just basic supply/demand models you see in intro courses. "Valuation" is economics. Prices are always relative, unless you strongly crave green-tinted pictures of ex-presidents. It is fully accepted that labor markets, particularly those for specialized skills, will not follow basic supply/demand principles without some adjustments, but you don't need to throw out economics in evaluating things. You can just go beyond the first 20 pages of the econ textbook. There may not be any markets where that framework is enough. Of course businesses compare and mimic other business practices in determining prices of goods and labor. People don't actually sit there and take derivatives and set them equal to zero any more than billiards players do differential equations on each turn. They mimic what has worked, and what has worked best will look like they were doing calculus over time.

 
Matt Swartz
(24824)
Comment rating: -

Yeah, I guess that was what I was thinking. Perhaps I'm too bullish about the Twins, but they really do look like a real competitor for 2010, more so than the Tigers, who are behaving like sellers, and the White Sox, who have more holes than the Twins in my opinion.

 
Matt Swartz
(24824)
Comment rating: -

I certainly agree there is some effect. I estimated $1.35MM/win, so signing a mid level pitcher who is worth two wins for each of two years would be $5.4MM regardless of actual playoff impact. That's a lot of season tickets I think. I suspect that's the same source of revenue we're thinking about? I think that should cover it, but I don't have exact information on season ticket sales and all I'm doing is reporting my inflation adjusted estimates of previous work when I give the $1.35MM/win estimate.
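The arithmetic behind the $5.4MM figure can be written out as a small sketch. The function and constant names are made up for illustration; the only input taken from the comment is the $1.35MM/win estimate.

```python
REVENUE_PER_WIN_MM = 1.35  # $MM per win, the estimate quoted above


def non_playoff_revenue(wins_per_year, years, per_win_mm=REVENUE_PER_WIN_MM):
    """Revenue in $MM from added wins, ignoring any playoff effect."""
    return wins_per_year * years * per_win_mm


# A mid-level pitcher worth two wins in each of two years.
two_by_two = non_playoff_revenue(wins_per_year=2, years=2)
```

Here `two_by_two` is four wins at $1.35MM each, which is the $5.4MM in the comment.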

 
Matt Swartz
(24824)
Comment rating: -

I thoroughly agree with the sentiment but not the conclusion. It's exactly the reason that these teams should be giving small contracts and minor league deals to replacement level pitchers. There's no reason for them to waste $15MM giving Marquis money for what some minor league veteran might do for $1MM.

 
Matt Swartz
(24824)
Comment rating: -

Thanks for these details. You make some good points. I guess if they can trade him for more value than the draft pick, that would be worth noting. I imagine that it would probably even out in terms of overall value, and I think an extra selection from a country of amateur talent might give them more selection than an equivalent player from a contender's minor league team. Then again, if they want someone major-league ready, this may not be true. The Orioles may contend in 2011, but they probably won't. I guess it depends on how much happens with all those high upside arms that aren't there yet. The thing is that the second year of a deal is almost always poor value compared to the first. This is bound to be especially true for injury prone pitchers because the attrition rate is higher. I think if the Orioles look like they could make a run in 2011 next off-season, there will probably be comparable players available. I see more validity to violating this strategy if it's a shortstop with a very low attrition rate who looks like he could be good for many years and your club has no shortstop available. There is always an opportunity cost of spending money. The Orioles could spend that money on the draft, for example. In fact, they are the perfect example of a team that should spend money on the draft. If you check out my very first article published at BP at the beginning of the competition, it's about the repeated prisoner's dilemma aspect of the draft. Teams in the AL East aren't going to convince the Yankees and Red Sox to implicitly mimic their strategies of not spending on expensive draftees requiring above-slot bonuses. The Orioles could do themselves a favor by going above slot for guys without really affecting the behavior of their competitors the way that the Nationals risk doing when they go over slot. The O's could also spend money in Latin America, on scouts, etc. 
There's just no reason to think of teams spending money as "budgets" and "left in the bank"-- check out Joe Sheehan's discussion of baseball teams as investors rather than risk-averse loss minimizers today.

 
Matt Swartz
(24824)
Comment rating: -

I'm not sure which part of the article you are referring to. I see the trade as a net positive for the Phillies in 2010 probably. Keep in mind, though, that Drabek and Taylor are both nearly major league ready and both may see time in the big leagues in 2010. I'm not sure that it's as trivially positive as you'd think.

 
Matt Swartz
(24824)
Comment rating: -

That's true-- I used the approximations above based on Nate Silver's model, readjusted upwards for inflation and to match the current $/win estimate that I felt made sense. It was an increase of about $1.35MM/win with about $50MM+ generated by reaching the playoffs. That generated the $5.25MM/win that Sky Andrecheck had found in his recent article, and was proportional to the average $0.75MM/win and $30MM playoff estimate that Nate had found several years ago.

 
Matt Swartz
(24824)
Comment rating: -

I think it comes down to the cost of promoting players. If you have no other option than to promote your young catcher who you have the rights to for only six years, there is a hidden value to spending $1MM on an older catcher in that you retain the services of the young catcher when he is better (i.e. if you expect him to be better during the 2011-2016 time frame than 2010-2015, because you think he will be better in 2016 than in 2010).

 
Matt Swartz
(24824)
Comment rating: -

I suppose that bad teams may get some gate boost in the first month or two after spending $100 million, but I don't see any reason to assume it's more than a drop in the bucket compared to the cost. The teams that spend $100 million on players are typically the ones that are willing to pay for an albatross late in the contract because it's rare that those players are producing reasonably on a per-dollar basis towards the end of the contract. I think that's why we're having trouble thinking of examples. I'm not saying going from 65 to 70 wins is useless. I'm saying it's generally not worth the price because you have to outbid teams who would go from 85 to 90 wins. The picks thing-- it's dangerous. You don't necessarily get picks later. It depends on how they perform and how the rules change over time.

 
Matt Swartz
(24824)
Comment rating: -

I addressed this a little in my response to the previous post, but basically my understanding is that this value is small compared to the value of winning games. I would need to look into whether there has been a lot of research towards this, but I believe I have heard that the conclusions were that winning is the real money-maker.

 
Matt Swartz
(24824)
Comment rating: -

I understand where you're coming from here, but I still don't think it's wise. The Orioles' fan base has certainly suffered the last 12 years, and becoming a little better certainly feels better. I'm just not sure that it sticks throughout the season. The value of making the playoffs is just so much higher than the value of having a more attractive player, and the value of that really wanes quickly. I have not done research myself on the marketing value of players, but I believe I have heard that the value of actually winning more games-- and specifically, of making the playoffs-- dwarfs the value of giving the fan base a bump from a 65 win team to a 70 win team. Basically, I don't think the Orioles are going to see a boom in revenue until they make the playoffs. When they do it next, whenever that is, I suspect that there is a lot of money that will flow in and reinforce their position as a team in a pretty large market. The presence of the Nationals, I believe, helps hardcore fans by taking away a lot of easy revenue and forces them to be better to get fans. I tend to think that the marginal revenue from a win is larger when you have local competition even if the total revenue is lower. I don't have proof of that, though, it's just something I'm suspicious about. Regardless, as someone who followed the Phillies in the mid-to-late 1990s, I certainly am sympathetic to fans who want a few more wins. I just think it pales in comparison to making the playoffs and I think the Orioles set themselves back in pursuing the target of giving the fans something to come to the park for in September and hopefully October.

 
Matt Swartz
(24824)
Comment rating: -

It certainly is possible, because it entails basically getting a premium worth $5MM or whatever the compensation is for signing that player. I guess it's just a step backwards to hopefully take a step forwards, and it's probably not wise in many situations. This is a good point, though-- thanks for highlighting it.

 
Matt Swartz
(24824)
Comment rating: -

I think someone wrote an article explaining this once...

 
Matt Swartz
(24824)
Comment rating: -

Thanks. I did mean re-signings for multi-year deals. I was thinking more along the lines of teams like the Astros who spend money but don't win. If they re-upped a 10-and-5 guy, they wouldn't get any value from trading him. I didn't realize that part had been removed from the CBA. Thanks for that-- that really does expand the validity of the type of signing I'm talking about. I guess it makes sense given the Derek Lowe rumors, since he's not pulling any trade-demand claims.

 
Matt Swartz
(24824)
Comment rating: -

True, Milwaukee is small. I meant in terms of having the incentive to spend on free agents which has more to do with marginal revenue added from a win than it does from total revenue added. The general point is that Milwaukee has the incentive to spend on free agents as demonstrated by the fact that they do when they feel they're in competition.

 
Matt Swartz
(24824)
Comment rating: -

Sorry, here it is: http://baseballprospectus.com/article.php?articleid=4618

 
Matt Swartz
(24824)
Comment rating: -

I think you're missing his point. In general, career performance is more indicative of skill than single year. However, some statistics are indicative of skill changes. He is noting that Blanton's peripherals improved drastically, both over his 2008 performance with the Phillies and more than his league-adjusted performance with the Athletics before that. That is far more likely to persist than other changes like a one-year BABIP fluke. Changes in GB%, BB%, and K% are more indicative of skill set changes for both hitters and pitchers. For BABIP, it is more useful to use multi-year averages. It comes down to comparing the variance in skill levels across the league to the variance due to luck as measured by P*(1-P)/N. There is more variance in TTO skills, and thus Blanton's performance is more likely to persist than other pitchers who have career years.

Dec 20, 2009 10:18 PM on Floridians Breathe Easy
 
Matt Swartz
(24824)
Comment rating: -

Peter, thanks for your comments. The fact is that this is all expected value analysis. You simply cannot predict anything with certainty. Even if you knew every player's exact talent level and exact number of days that would be spent on the disabled list, you would still be off the average team's win total by about 6 or 7 games. However, doing expected value analysis is the only way I see to establish a guess as a starting point, unless you can somehow unravel the trillion different ways that the next year could play out. As far as what it would take to sign Lee, the answer is about market value, which is enough to do the analysis. Check out my article from last month 'How to Make up a Good Trade Rumor.' In there, I explain that you can expect that if a team retains their own player, the surplus value will be approximately the two picks the team would have gotten for him if he left due to the bidding process. The Jays only got so much from the Phillies precisely because Halladay's willingness to effectively pay $36MM to live in Philadelphia for the next four seasons made him more valuable to extend than the picks would be worth.

 
Matt Swartz
(24824)
Comment rating: -

I've used a few sources on approximating Lee's value, and based on his huge change in peripherals starting in 2008, it is certainly a reasonable approximation to put him at 5.25 wins above replacement. I've seen higher and lower ones. The reason $31.67MM/year seems high is that you are used to seeing contracts that are over longer periods of time, typically a couple years into a decline phase. The deal Halladay might expect on the open market is probably 5/$130MM or 6/$145MM, but even though the salary escalates, most of his value comes early in the deal when he is younger. A team paying Halladay $140MM over 6 years is probably paying him $95MM for the first 3 years and $45MM for the last three years even if they are paying more of it out later in the deal. Also keep in mind that was assuming no draft pick compensation, which would have made it closer to 3/$86MM, as part of a 6/$136MM deal.

 
Matt Swartz
(24824)
Comment rating: -

All good points. Victor Wang's article linked above did a terrific job at establishing rough distributions. He got win values per year for the first six years for sets of prospects with certain grades in different years, and reported the percentage that were in certain ranges. Lately, I've been thinking variance is good for baseball teams who aren't the Yankees and maybe the Red Sox. Think about it. Two teams have expected win totals of 85 wins. Do you want to be the one who has a 100% chance of winning 85, or a 50/50 shot at winning either 95 or 75? One way you make the playoffs less than 1% of the time and the other way about 50%. Of course, if your expected win value is 95 wins then maybe you want less variance. Thanks for the post though. This is the type of topic that should get raised to extend this expected value trade analysis and usually people take it as given. I tend to think the overall variance ends up similar since draft pick compensation takes up part of Cliff Lee's value and those picks have more variance than the prospects dealt, but it's all a good point.
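The 85-win comparison can be sketched in a few lines. This is only a toy illustration: the 90-win playoff cutoff, the function names, and the outcome lists are assumptions made for the example, not figures from the comment.

```python
# Each team is described as a list of (win_total, probability) pairs.
# PLAYOFF_CUTOFF is a hypothetical threshold chosen only for illustration.
PLAYOFF_CUTOFF = 90

def expected_wins(outcomes):
    """Probability-weighted average win total."""
    return sum(wins * prob for wins, prob in outcomes)

def playoff_chance(outcomes, cutoff=PLAYOFF_CUTOFF):
    """Total probability of the outcomes that reach the cutoff."""
    return sum(prob for wins, prob in outcomes if wins >= cutoff)

steady_team = [(85, 1.0)]                # always wins exactly 85
volatile_team = [(95, 0.5), (75, 0.5)]   # coin flip between 95 and 75

for name, team in (("steady", steady_team), ("volatile", volatile_team)):
    print(name, expected_wins(team), playoff_chance(team))
```

Both teams have the same expected win total, but only the high-variance team ever clears the assumed cutoff. A real season adds noise around each outcome, so the steady team's chance would be small rather than exactly zero.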

 
Matt Swartz
(24824)
Comment rating: -

Was this reply just the words "I disagree" without any coherent reason? Why do you think Cliff Lee is worth less than those players? And why on earth are "3 prospects" treated like some sort of currency that you can just count up like dollars? Different prospects have different values. Blanton is someone I believe they could have gotten fair value for if they waited-- about 2-2.5 wins by most estimates. Durbin is replacement level, and the "3 prospects" are not Top 100 prospects and are probably worth 3.8 wins combined. Even if they could have gotten any one of those middling prospects for Blanton, they would have gotten better value. If the market valued Blanton so much lower due to some sort of injury likelihood or something I'm missing, then it would be even more likely Lee was better. The reason 3/$90MM seems extreme to you is that you need to think about the details of deals. You would think that someone like Halladay would get 5/$125-130MM or something on the market, and thus a lower AAV. The thing is that most free agents have accelerating salaries and decelerating value. If Halladay had only sold his first three years of free agency, they would have been valued similarly to how Sabathia's and Santana's were based on their trajectories at the time, which would be about 3/$90MM.

 
Matt Swartz
(24824)
Comment rating: -

This is all expected value, as this type of analysis always is. The point is to value it. It's like casinos as a business: theoretically, the house could lose, but the point is to play the odds. Wang and I both expressed this as historical tendencies indicating probabilities indicating expected value. The Phillies undersold on Lee in the sense that they could have gotten prospects for him that were more likely to succeed than the ones that they got.

 
Matt Swartz
(24824)
Comment rating: -

If the Jays had traded Halladay to another team, he would not have been worth as much. He claimed even before the trade that he wanted to go to Philadelphia all along, largely due to their Spring Training facilities being nearly in his backyard. He would not have signed this extension with, say, the Angels. Had he, they would have surrendered prospects worth their value to him which would have been the same 4.4 wins as stated above (assuming the $6MM Jays subsidy), plus 2 more for draft picks to reach 6.4 wins, because that was his value to other teams. The Phillies' value from adding Halladay as described above was 11.1 wins. Thus, there were 4.7 wins of surplus value. The Phillies only got 1.3 of these 4.7 wins. It is just as easy to say "The Jays could have gotten more" as it is to say "The Phillies could have surrendered less." The 3rd order wins that we have are a way of approximating the value of performance as measured by plate appearance outcomes. It is not something that can be extended forward without considering other forms of luck. The Braves, quite simply, had an 8.1% HR/FB this past season, which is not something that they can expect going forward. Pitchers fail to show persistence in this statistic and the Braves play in a pretty neutral stadium. 3rd order wins are a way of eliminating luck on run distribution among games, hit distribution among innings, and opponent difficulty. There are other forms of luck, including BABIP and HR/FB luck for example. Had the Braves had neutral home-run luck, their 3rd order record would have looked more like their regular record. Breaking down the player by player projections this season, the Phillies are very likely to come out ahead. The decline of Ibanez may be on its way, but projecting Werth and Howard to decline considerably is just ridiculous. Werth only had a "career year" because he got more playing time. 
Howard has had the same season plus or minus some BABIP luck for about three years running and this season he got in shape and improved his defense. They will be 31 and 30 this year. That's not a time to fall off a cliff.

 
Matt Swartz
(24824)
Comment rating: -

UZR is one of the best fielding metrics available. Total Runs Saved (based on +/-) is another excellent metric. Fielding metrics remain in their infancy and there remains a lot of noise surrounding them. Looking at UZR over a longer time span gives more insight. It is also possible that defensive skill is very streaky. Regardless, I always prefer to throw a few metrics up there to see what sticks when it comes to defense.

 
Matt Swartz
(24824)
Comment rating: -

Dan, I think the reason that HFA is bigger in other sports is that the team that is "supposed to win" wins more often in other sports. The Nats will take 1 game out of 3 from the Yankees on average, but there's not really a point in watching the Lions try to play the Colts. It's certainly not 1 in 3. Baseball games are frequently decided by a dozen or two pitches on the borderline, a few bloop hits, or a home run just inside the foul pole. My theory is that's why HFA is smaller in baseball-- more luck, less other stuff. Russell, I loved this article. I hope the new Ken Burns inning follows Burns' own Burns-esque nine innings as well as this followed up on mine. Great insight, great results.

Dec 15, 2009 9:02 PM on Home-Field Advantage
 
Matt Swartz
(24824)
Comment rating: -

Yeah, I think the strategic lying is really what's going on that gets in the way. There are also all types of anti-tampering laws with respect to trades that get in the way too, I'd think. There also are a lot of externalities with respect to teams in the same division that really get in the way, because the Yankees getting better makes the Red Sox lose a couple more games and makes them need to win more games to catch up at the same time. Definitely a lot of moving parts! The general structure of the auction holds but I think it would be tough to apply strict rules beyond the basics of auctions. I think. I may do something about adding a pitch later but I was talking about it with another economist in my office and realized how difficult it would be with the current data out there, especially if stable DIPS ERAs aren't allowed to be used to limit the variance in ERA. My friend in my office sent me a paper that I've been meaning to look at that may help, so I may contact you down the line about that. Thanks for the offer.

 
Matt Swartz
(24824)
Comment rating: -

Infield hit rate is incredibly stable, far more than outfield hits per ground ball. Infield hit rate would not be an indicator for ISO drop. Infield hit rate is per ground ball. Spikes in ground ball rate can be, but definitely not infield hit rate.

 
Matt Swartz
(24824)
Comment rating: -

I'm skeptical that you could turn many first basemen into catchers. The whole defensive spectrum seems thrown off by the existence of catchers. Part of the problem is that catchers' bodies deteriorate so much with each game played that it's dangerous to have a good hitter at catcher in some ways. It's also difficult to quantify their defensive skill on top of that.

 
Matt Swartz
(24824)
Comment rating: -

I don't think Feliz was free talent. He will probably want a major league contract for at least a couple million dollars. But I do think Feliz will be about more than 20 runs below average this year. He really lost his power this year and I don't see it coming back. I agree with Carl that the Phillies know way more than they let on, but I'm sure they have some blind spots. I just don't think they are as fundamental as believing that Polanco makes productive outs.

 
Matt Swartz
(24824)
Comment rating: -

Yeah, logically, there should be no automatic way to get a bargain. Everyone always remembers the most recent market but Scott Boras has made a lot of money based on prices going up over the course of the offseason. In reality, it just depends on the individual offseason and if it were predictable, prices would adjust in advance such that they wouldn't be different.

 
Matt Swartz
(24824)
Comment rating: -

I knew that Greinke was minimizing his ERA more than his FIP and said so in the article in multiple places. The article merely asked what the difference would be if he were pitching to minimize his FIP.

Nov 27, 2009 6:09 AM on Zack Greinke and FIP
 
Matt Swartz
(24824)
Comment rating: -

The jump from 80 to 81 wins in talent doesn't really hit the thick part of the distribution, so it's only something like 2-3% rather than 6%. But being a 90-win team in the AL East probably increases their probability of making the playoffs by 4% or even more. For the in-season acquisition, keep in mind that the price moves with the value. If there are more buyers than sellers, the cost could pretty much break even. My Roy Halladay articles pretty much showed that if you trade for a star at mid-season, he can be worth 2/3 of his revenue value with 1/3 of the season left, and for a guy like Halladay with his salary, the value before the season should be about equal to the value in season. And then that's basically saying the price before the season should be equal to the value in season, or at least close to it.

 
Matt Swartz
(24824)
Comment rating: -

The marginal value of a win is still positive when you jump from a range of 74-98 up to 77-101. Keep in mind that the middle parts of those ranges are more plausible. If you increase your talent level by one win, you probably increase your chance of making the playoffs by 6% if you were an 86-win team. If you increase your talent level by three wins to an 89-win team, you probably increase your chance of making the playoffs by about 18%. If you increase your talent level by six wins to a 92-win team, you probably increase your chance of making the playoffs by about 36%. I am rounding these numbers, but the point is that every time you add one win worth of talent, you increase your chance of making the playoffs by about 6%.
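A rough way to see where a figure like 6% per win could come from is to treat the actual win total as normally distributed around the talent level. The sketch below is an illustration under stated assumptions, not the model behind the comment: the normal shape and the 90-win playoff cutoff are hypothetical, and the 6.4-game standard deviation is the figure quoted in a related comment in this thread.

```python
import math

SD_WINS = 6.4          # spread of actual wins around true talent (quoted in this thread)
PLAYOFF_CUTOFF = 90.0  # hypothetical win total needed to reach the playoffs

def playoff_prob(talent_wins, cutoff=PLAYOFF_CUTOFF, sd=SD_WINS):
    """P(actual wins >= cutoff) if wins ~ Normal(talent_wins, sd)."""
    z = (cutoff - talent_wins) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

def marginal_gain(talent_wins, cutoff=PLAYOFF_CUTOFF, sd=SD_WINS):
    """Increase in playoff probability from one extra win of talent."""
    return playoff_prob(talent_wins + 1, cutoff, sd) - playoff_prob(talent_wins, cutoff, sd)

for talent in (80, 86, 89, 92):
    print(talent, round(playoff_prob(talent), 3), round(marginal_gain(talent), 3))
```

Under these assumptions the gain per win peaks at roughly six percentage points near the cutoff and shrinks for teams well below it. A different cutoff shifts where the peak falls.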

 
Matt Swartz
(24824)
Comment rating: -

It's an interesting idea. The question is whether you can evaluate trades after the fact. Is the Victor Zambrano for Scott Kazmir trade worse because he became good? Better because he struggled after he was good? It's tough to say. I think evaluating the trades at the time has flaws too-- maybe a team has a bead on a certain player that everyone else is underrating. It's tough to decide how best to do it. You can certainly look at a GM's aggregate sum of trades retrospectively and decide whether he makes good trades in general, I guess.

 
Matt Swartz
(24824)
Comment rating: -

A three-win player means that on average he would be worth three wins. It takes into account the probability of becoming a replacement level player. The price of two three-win players and of one six-win player should be equivalent because of the response I wrote above.

 
Matt Swartz
(24824)
Comment rating: -

The standard deviation of win total versus true talent level is about 6.4 games, so with the season not yet started, there's just not going to be much difference in the marginal return to wins when you add the first three wins versus the second three wins. For example, if you have an 86-win team (based on talent level) that you could turn into an 89-win team with a three-win player and a 92-win team with a six-win player, you would think that going from 89 to 92 is more valuable than 86 to 89. The problem is that you're really considering taking a team that could win anywhere from 74-98 games to a team that could win 77-101 games or 80-104 games. The confidence interval is just too big to be informative.
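The ranges quoted above are close to two standard deviations on either side of the talent level. A minimal sketch, assuming a symmetric band of two standard deviations (the band width and the helper names are assumptions made for illustration):

```python
def win_range(talent_wins, sd=6.4, num_sd=2.0):
    """Plausible range of win totals: talent plus or minus num_sd standard deviations."""
    half_width = num_sd * sd
    return talent_wins - half_width, talent_wins + half_width

def overlap_fraction(range_a, range_b):
    """Share of range_a that is also covered by range_b."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return max(high - low, 0.0) / (range_a[1] - range_a[0])

base = win_range(86)
plus_three = win_range(89)
plus_six = win_range(92)

print(base, plus_three, plus_six)
print(overlap_fraction(base, plus_three), overlap_fraction(base, plus_six))
```

The bands for 86-, 89-, and 92-win talent levels overlap heavily, which is the sense in which the interval is too wide to separate the first three wins from the second three.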

 
Matt Swartz
(24824)
Comment rating: -

Thanks. It's a tough question. It should come down to the marginal value of a win-- how many more fans does a win add than before? Chances are that goes down when the economy gets worse, so in that case, it would drive down the payroll that maximizes profit. For the Tigers, their attendance may be high but how are their ticket prices? Do they send out a lot of e-deals where they offer people specials? In modern stadiums, I would guess that it's usually smart to price tickets in a way that you sell out most games, but that's very different when you're selling out with expensive prices or low prices with lots of emailed deals that lower the price further for games with little appeal. The dead weight like Dontrelle Willis would not necessarily matter per se, but if the team is trying to avoid losing money rather than maximizing profit without concern for risk, then the story changes. They might prefer to play it safe and lower payroll even if the dead weight doesn't change the expected effect of other spending decisions. It would pretty much come down to how the team is equipped to deal with losses.

 
Matt Swartz
(24824)
Comment rating: -

What you're missing is that in the middle of the offseason, you can trade a position player frequently if he has value. And if you don't have a valid trade alternative, what are you doing making the trade in the first place? We could evaluate the idea of the Marlins acquiring a shortstop, but they wouldn't be in that trade market unless they had a plan for Hanley Ramirez. The point of this is to analyze trades that are filling holes that currently exist or holes that will exist when you trade players. Your point is well taken for in-season trades, and that's precisely why I said above that these rules do not hold for in-season trades because a roster is not as flexible, but you can certainly trade a position player or move a 1B or LF to DH in the AL to make a trade work. The "roster spots have value" hypothesis only makes sense for situations where you cannot trade players who have value, which is usually not true.

 
Matt Swartz
(24824)
Comment rating: -

Mglick0718-- The idea isn't that it's either/or. The idea is that the values are equivalent, i.e. if someone said "I'll trade you this unspectacular third starter for one year and I'll cover his contract, or alternatively Roy Halladay for one year but you have to pay him" then you are indifferent.

 
Matt Swartz
(24824)
Comment rating: -

Frequently there are two spots where there is a replacement level player. Generally, there is some spot on the diamond where you know you would get replacement level production if you did nothing all off-season and there usually aren't 5 starters that are considered above replacement level on any team, certainly before the off-season starts. My point is that two players who are worth three wins should each be half as expensive as one six win player-- if not, then your replacement level is too low. I would actually suspect that fewer six win players reach free agency than three win players by a very large margin.

 
Matt Swartz
(24824)
Comment rating: -

You're mistaking payrolls set based on profitability for payrolls set arbitrarily. I would suspect that in most circumstances, the owner gives the GM a budget based on their discussion of what kind of win totals are possible with different budget levels, and then decides how much that is worth. The effect would be the same.

 
Matt Swartz
(24824)
Comment rating: -

Certainly there are trades where mistakes are made, but in general these rules are more or less true. If somebody makes a mistake or if a GM tries to save his job with a move, then there are exceptions, but there aren't great examples of this clearly being violated. In general, it is true.

 
Matt Swartz
(24824)
Comment rating: -

You clearly don't watch the Phillies. Hamels struggled with his call up-- 5.98 ERA in his first 11 starts...with about a .320 BABIP but without the walks. Hamels has been playing with the curve ball for years. It hasn't been good enough to throw as often as the fastball and change up. We're back to the character attacks I see.

Nov 13, 2009 9:49 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

I thought about this one for a while and checked with a few other economists who are better at econometrics. I'm not quite sure if Heckit would work. There is the selection bias of selecting into throwing new pitches only if it works out for the better. But there's also an issue with biased error terms where pitchers like Hamels introduce new pitches thanks to bad luck. Certainly, the average pitcher who feels he needs to introduce a new pitch is more likely to have bad luck than average. I'm not sure if biased error terms matter for Heckit? I think Heckit also requires normality, right? I guess looking at a binomial variable like OOBP rather than ERA could guarantee that? There's a lot of things to consider, and I'm not quite sure yet. It's an interesting thought.

Nov 12, 2009 6:19 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

I agree with all of this, but I did say that Hamels was extraordinarily lucky in 2008 three times in this article, and several more times in the original article linked at the top. I characterized him as a 3.65 ERA type pitcher who had about .60 swings of good luck and bad luck due to BABIP fluctuation.

Nov 11, 2009 5:53 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

No problem, thanks for the clarification. What I would say is that pitcher BABIP (at least for starting pitchers) would have a natural standard deviation of about .019 if luck were the only factor-- and that much is clear. We know that team defense would cause about .010 of standard deviation. Because of this, we know that the standard deviation of individual pitchers is probably very minimal. Those two factors would combine to explain about .021 of standard deviation. For pitchers in 2005-2008 with more than 500 balls in play, the standard deviation was about .021. That means that there is almost no variance with respect to pitchers. It would certainly be less than .005. In other words, nearly all pitchers should be between .290 and .310 with respect to their natural skill level and probably between .295 and .305.
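The reasoning above rests on independent sources of variation adding in variance, not in standard deviation. A short sketch of that arithmetic, using the figures quoted in the comment (the function names are assumptions for illustration, and independence of the sources is assumed):

```python
import math

def combined_sd(*sds):
    """Standard deviation of a sum of independent effects."""
    return math.sqrt(sum(sd * sd for sd in sds))

def residual_sd(observed_sd, *known_sds):
    """Spread left over after removing known independent sources (floored at zero)."""
    leftover_var = observed_sd ** 2 - sum(sd * sd for sd in known_sds)
    return math.sqrt(max(leftover_var, 0.0))

LUCK_SD = 0.019      # binomial luck alone
DEFENSE_SD = 0.010   # team defense
OBSERVED_SD = 0.021  # starters with 500+ balls in play, 2005-2008

print(round(combined_sd(LUCK_SD, DEFENSE_SD), 4))            # luck plus defense
print(residual_sd(OBSERVED_SD, LUCK_SD, DEFENSE_SD))         # what is left for pitcher skill
print(round(combined_sd(LUCK_SD, DEFENSE_SD, 0.005), 4))     # total if skill SD were .005
```

Luck and defense together already account for roughly the whole observed spread, so little or nothing is left over for pitcher skill.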

Nov 11, 2009 12:24 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

When you subset his innings into little 60-87 inning subsets, changes of 0.6 K/9 or BB/9 represent a difference of about 5 strikeouts or walks. Just 5. Claiming he struck out five more hitters on bad teams and five fewer hitters on good teams simply does not prove a thing when he struck out about the same number of hitters. Those are ridiculously small sample sizes that you're looking at. It's the type of thing that no one would ever look for unless they were trying to make a post hoc rationalization. It's also clumping together data by team, when obviously many of those players on good teams are bad and many of the players on bad teams are good. It's a noisy measure where you are taking one or two really good games that happened to come against a good opponent in 2008 and another that happened to come against a bad opponent in 2009 and drawing conclusions. He happened to have a particularly good game against the Cardinals and another against the Cubs in 2008. He happened to dominate the Giants and Orioles once in 2009. Any pitcher who has a 2.34 BB/9 and 7.03 K/9 against good teams does worry people. No one has "magic" against good teams. Anybody could have looked at the fact that he performed better against good teams than bad teams in 2008 and said "that's not a long term trend." Again, why if he got so much worse did it not affect his overall strikeout and walk totals, the rate of hitters hitting the ball for line drives, the rate of hitters hitting the ball to the outfield, or the rate of hitters hitting the ball over the fence? A dip in quality of opponent of such a ridiculously small magnitude would not explain the 35 singles that came with no change in batted ball distribution that you need to explain.
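The "just 5" figure is a straightforward unit conversion from a per-nine-innings rate to a raw count. A minimal sketch (the function name is an assumption for illustration):

```python
def rate_gap_to_events(rate_gap_per_9, innings):
    """Convert a difference in a per-9-innings rate into a raw event count."""
    return rate_gap_per_9 * innings / 9.0

# A 0.6 K/9 or BB/9 gap across the 60-87 inning subsets mentioned above
for innings in (60, 75, 87):
    print(innings, round(rate_gap_to_events(0.6, innings), 1))
```

Across subsets of that size, a 0.6 gap works out to somewhere between four and six events.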

Nov 11, 2009 11:35 AM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

I'm sorry if I came off as dismissive. There are a lot of bad theories out there about Hamels. I've heard many of them in recent weeks that are not based on facts or are trying to explain the conclusion backwards, by looking for something that changed and deciding it was causal. The way that I should have phrased my comment was that it is important to analyze BABIP frontwards. This involves asking if certain things are correlated with higher BABIP for the league in general, such as I did in the article I linked in my previous comment. Looking at an outcome and trying to explain why it might be an exception to a rule by finding other idiosyncrasies is always dangerous. The other very important thing to realize is that even if pitchers did control BABIP despite the lack of persistence, the limits of the sample size mean that you can only really say with confidence that a starting pitcher's BABIP should have been within .040 points of his actual BABIP. That is the amount that luck should count. So, based on this, we can conclude that Hamels' BABIP should have been somewhere in between .281 and .361 this year. It is certainly not unreasonable to think it should have been right at .300. If you want to conclude that his true BABIP skill is different than .300, you need to look for what kinds of players do have BABIPs that are different from .300, not find a player who has an abnormal BABIP and look for characteristics of him. For example, strikeout pitchers have slightly lower BABIP, but we're talking about .292 versus .308 for really good and really poor strikeout pitchers. Ground ball pitchers have slightly higher BABIP, but we're talking about a similar 8 point range. Knuckleballers can have hugely different BABIP, and usually very low BABIP, so if you see a knuckleballer, it's best to look at comparables before making assumptions about his BABIP. I hope that helps explain how to approach this kind of question.

Nov 10, 2009 10:35 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

I addressed movement, location, and velocity in the article, and none seem to be issues at all. It's silly to try to make up more and more obscure theories. We know that the standard deviation of a binomial random variable is sqrt(p*(1-p)/n). That's a fact. Therefore, there will always be 20% of pitchers that differ from their true BABIP skill by .030 points even if pitchers did control BABIP. It's not hard to believe that those 20% of pitchers would exist. I've done a lot of work on how much pitchers do control BABIP. Check out this article (and note that Hamels, if anything, would fit the bill of a pitcher who should have an ever so slightly lower than .300 BABIP rather than higher): http://baseballprospectus.com/article.php?articleid=9595
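
As a quick illustration of the sqrt(p*(1-p)/n) formula, the sketch below evaluates it for a true BABIP of .300 at a few sample sizes. The function name and the sample sizes are chosen for illustration only.

```python
import math

def babip_sd(true_babip, balls_in_play):
    """Standard deviation of an observed BABIP under a binomial model."""
    return math.sqrt(true_babip * (1.0 - true_babip) / balls_in_play)

# Hypothetical sample sizes: a partial season and two full-season totals.
for n in (200, 500, 600):
    print(n, round(babip_sd(0.300, n), 3))
```

The spread shrinks only with the square root of the number of balls in play, so even a full season leaves roughly .020 of noise.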

Nov 10, 2009 8:33 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

You are making a mistake in treating changing percentages of pitches thrown as some sort of holy grail of improved pitching. This is a classic example of people not understanding decision theory and looking only at summary data. Sometimes pitchers develop better pitches. When they do, they will change what they are doing and likely improve. Sometimes pitchers attempt to develop a new pitch but it isn't very good. They don't throw it, and they stay the same. As researchers looking at Fangraphs, we do not have the access to know how every pitcher would have done if he changed his repertoire. We only see that those that made the decision to do so were successful. That's the problem. You are assuming that everybody who changes their repertoire will improve, when only those pitchers who will improve by changing their repertoire are the ones that actually do. It's a common mistake to look at data and assume a counterfactual can be inferred when it cannot. Also, recall some excellent recent articles by Eric Seidman on perceived velocity. Hamels' changeup is more effective when he has thrown a fastball on a previous pitch, so the value of throwing a fastball to Hamels won't be recognized in the weighted runs per pitch Fangraphs metric even if Hamels had neutral luck.
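
The selection effect described above can be sketched with a small simulation. Everything here is hypothetical: each pitcher gets a made-up "potential gain" from adding a pitch, drawn from a distribution centered on zero, and only pitchers with a positive gain actually change their repertoire.

```python
import random

def simulate_adoption(num_pitchers=10_000, seed=0):
    """Compare the average effect for everyone vs. for those who switch."""
    rng = random.Random(seed)
    # Hypothetical effect each pitcher would see if he added the pitch.
    potential_gains = [rng.gauss(0.0, 1.0) for _ in range(num_pitchers)]
    # Only pitchers whose new pitch actually helps choose to throw it.
    adopters = [g for g in potential_gains if g > 0]
    mean_all = sum(potential_gains) / len(potential_gains)
    mean_adopters = sum(adopters) / len(adopters)
    return mean_all, mean_adopters

mean_all, mean_adopters = simulate_adoption()
print(f"average effect if everyone switched: {mean_all:+.2f}")
print(f"average effect among those who switched: {mean_adopters:+.2f}")
```

The pitchers who switched look much better off than the population would on average, even though in this toy setup a new pitch does nothing for the typical pitcher.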

Nov 10, 2009 4:58 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

Yes, Lincecum is better than Hamels. Of course having four good pitches is better than two, and having two good pitches is better than zero. Every pitcher in the league can be blamed for not being Tim Lincecum and every hitter in the league can be blamed for not being Albert Pujols by this logic. Also, I find it presumptuous to call Hamels being upset "hissy fits." He is emotional and demonstrative on the mound. If he had a lower pitched voice, this would be viewed as righteous rage. If he screamed with joy when he did things well, it would be viewed as passionate, fire-in-his-belly glory. These are all post hoc characterizations.

Nov 10, 2009 4:58 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

Lincecum is another example of a pitcher who is better than Hamels. His back to back dominant seasons came with better peripherals than Hamels. That he changed his pitch selection is not good in and of itself.

Nov 10, 2009 4:56 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

Sabathia's ERA bounces around a good bit too, as does his BABIP. It's not as noticeable because Sabathia is a better pitcher than Hamels, so his bad years are still above average.

Nov 10, 2009 4:56 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

Fangraphs' pitching run weights are about PA outcomes. That they went down is simply parroting the fact that he gave up more runs. His rate of giving up runs per fastball and per changeup both went down by similar amounts. Also, that he threw 3.2% fewer changeups is only 100 pitches different. It could not explain 6% more hits on balls in play. That's basically 100 more pitches that were fastballs instead of changeups somehow explaining 35 hits. Even if hitters batted 1.000 on his fastball and .000 on his changeup, the rate of pitches put into play would make this ridiculously not true. While I'm aware that Hamels' O-Swing% went down and his Z-Swing% went up, the fact that his O-Contact% went up and his Z-Contact% went down by roughly the same amounts implies that hitters' rate of putting pitches out of the strike zone into play and rate of putting pitches in the strike zone into play were about the same both years. It could be a clue if he was whiffing fewer hitters, but that is simply not true. Further, you haven't answered the question I posed above to another writer which would debunk this as well. If Hamels was more predictable in 2009, why did he not strike out fewer hitters? Why did he not give up more home runs or extra base hits? Why did he not allow more pitches to be hit to the outfield? Why did he not allow more line drives? Why would being more predictable only change the rate at which ground balls went between fielders and at which looping flies fell in front of outfielders versus in their gloves? That does not add up. If he was more predictable, hitters wouldn't have whiffed at more pitches and they would have hit more line drives and more home runs. Of course Hamels would be less predictable with a great third pitch-- he'd be even less predictable with a great fourth pitch and fifth pitch too. If he threw twenty superb pitches, he'd be less hittable too. But then his peripherals would be better than they are. It shouldn't affect only singles. It would at least increase his rate of surrendering doubles.
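
The 100-pitch argument above can be checked with back-of-the-envelope arithmetic. The sketch below uses the extreme case from the comment (1.000 on fastballs, .000 on changeups) and an assumed in-play rate of 20% of pitches. That rate is a placeholder for illustration, not a figure from the comment.

```python
def max_extra_hits(swapped_pitches, in_play_rate, average_gap=1.0):
    """Upper bound on added hits from throwing fastballs instead of changeups.

    average_gap is the difference in average on balls in play between the
    two pitches; 1.0 is the extreme case (1.000 vs. .000).
    """
    return swapped_pitches * in_play_rate * average_gap

# Assumed: about 20% of pitches are put into play (hypothetical rate).
bound = max_extra_hits(swapped_pitches=100, in_play_rate=0.20)
print(bound)
```

Under that assumption, even the most extreme split between the two pitches falls well short of the 35 extra hits that need explaining.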

Nov 10, 2009 4:55 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

I'm not sure that much can be learned from this. It's possible that hitters are guessing better the second time around versus the first time around, but it's unclear why that would be the catalyst for such a huge change. The real reason that Hamels declined was an increase in the rate of hitters getting singles, not extra base hits. Since his K and BB numbers didn't really change as he went through the lineup, I'm not sure much can be gleaned from the fact that his extra-base hits, while not really any more numerous this time, were distributed more towards the second and third times through the lineup and less towards the first time through. They did not go up overall, meaning it's probably not the problem.

Nov 10, 2009 4:54 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

The point I'm making is that the difference between runners on and bases empty, this year versus last year, is clearly not the explanation for the BABIP difference. His BABIP with runners on was exactly .004 higher both years, less than the league average OPS jump with runners on.

Nov 09, 2009 5:59 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

He did not actually-- check the third to last comment in the article.

Nov 09, 2009 5:14 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

It could explain a small part. Team BABIP went from .295 to .300 so that might explain some of it. But the BABIP for everyone but Cole Hamels was actually the opposite: .297 in 2009 without Cole and .301 in 2008 without Cole.

Nov 09, 2009 5:13 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

Sorry-- I was trying to say that a third (or actually 32%) will be either more than a standard deviation above or more than a standard deviation below. Thanks for clarifying.

Nov 09, 2009 12:57 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

You're arguing with a straw man. I very clearly said in my article that I do not believe any pitcher should keep doing the same thing. Not even Tim Lincecum. Any pitcher who does not work at getting better will get worse, and I would suggest that any pitcher work on the game-theoretic strategies involved in being less predictable so that hitters do not eventually increase his currently stable line drive rate to a higher level. However, I should remind you that the standard deviation of a binomial random variable like BABIP with 500-600 observations is going to be around .020. That's a statistical fact. That means that even if pitchers do vary in their BABIP skill level, a third of pitchers will have BABIPs that are at least .020 different from their true skill level. Given that Hamels did not give up any more extra base hits, did not allow any more balls to be hit to the outfield, and struck out and walked the same percent of batters as last year, what would you propose the difference is? And remember that you need to explain why Hamels' flaws are only causing more balls to be hit in front of outfielders and not more balls to be hit over their heads, and why they have not caused the balls to change direction and be pulled more often. Remember that your answer cannot use changes in velocity, movement, or location because those have all been the same.
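
The "a third of pitchers" figure follows from the share of outcomes that land more than one standard deviation from the mean. A minimal sketch, assuming the normal approximation to the binomial is adequate at 500-600 balls in play:

```python
import math

def share_beyond(num_sd):
    """Two-sided normal tail probability: P(|Z| > num_sd)."""
    return 1.0 - math.erf(num_sd / math.sqrt(2.0))

# Share of pitchers expected to miss their true BABIP by more than one
# standard deviation (about .020 at the sample sizes discussed above).
print(round(share_beyond(1.0), 3))
```

This gives a bit under a third, in line with the comment.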

Nov 09, 2009 12:29 PM on Cole Being Cole?
 
Matt Swartz
(24824)
Comment rating: -

The argument that Hamels can't contribute with his current pitch selections is ridicul