Baseball Prospectus home

studes
151 comments | 52 total rating | 0.34 average rating
studes
(280)
Comment rating: 1

Yeah, this article felt like preaching to the choir some more. Then again, I didn't see the Tweets.

 
studes
(280)
Comment rating: 0

Duh! thanks.

Apr 25, 2013 10:04 AM on Pitcher BABIP and Age
 
studes
(280)
Comment rating: 0

From looking at the graph, ages 34 plus could be .292, a three-point difference. Agree about the survivor bias. Just saying that the graph didn't automatically make your argument for you, IMO.

Apr 25, 2013 9:37 AM on Pitcher BABIP and Age
 
studes
(280)
Comment rating: 0

I believe you, but here's how I did my math. Let's say a full-season pitcher incurs about 600 balls in play in a year. 5% of 600 is 30 outs that become hits, true? Apply half a run (difference of hit vs. out) to each ball, that's 15 runs. 15/200 innings times nine innings is 0.675 runs--I conservatively said half a run. What's wrong with my math?

Apr 25, 2013 9:34 AM on Pitcher BABIP and Age
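
The arithmetic in the comment above can be checked with a short sketch. The inputs (600 balls in play, half a run per hit-versus-out swap, 200 innings) come from the comment; the function name and the second call, which reads "five points of BABIP" as .005 rather than 5%, are assumptions added for comparison.

```python
def era_change(bip, babip_delta, runs_per_hit=0.5, innings=200.0):
    """Approximate ERA change from a shift in BABIP over a season."""
    extra_hits = bip * babip_delta          # outs that become hits
    extra_runs = extra_hits * runs_per_hit  # run value of those hits
    return extra_runs / innings * 9.0       # scale to runs per nine innings

as_stated = era_change(600, 0.05)     # 5% of balls in play, as in the comment
five_points = era_change(600, 0.005)  # assumption: "five points" means .005

print(round(as_stated, 4), round(five_points, 4))
```

Under the .005 reading the effect is a tenth as large as the 5% version.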
 
studes
(280)
Comment rating: 1

Okay, thanks. But five points of BABIP is something--could be half a run in ERA over a full season, right? And I understand variation, but this is a big sample.

Apr 25, 2013 7:30 AM on Pitcher BABIP and Age
 
studes
(280)
Comment rating: 1

I don't understand. Doesn't the graph show a downward trend? I know survivor bias and all that, but doesn't the basic graph exhibit Ryan's trend?

Apr 25, 2013 6:12 AM on Pitcher BABIP and Age
 
studes
(280)
Comment rating: 0

What's your charge? The only part I didn't really follow was Proof #2. I think you created an xBABIP based on where the ball was hit and compared that to actual BABIP, or something like that. But did you try to account for a skill in which a pitcher induces *where* the ball was hit? I guess I just don't understand how that analysis was pulled together. Also, what's a logged odds ratio? I understand logs and I understand odds ratios, but what are you doing when you put them together? Perhaps there's an article somewhere where you explain it? BTW, fantastic job. That said, whenever I read one of these studies that debunk DIPS theory, I'm still always struck that pitcher BABIP is dang hard to predict. Which, to me, is all that DIPS theory is.
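
For readers with the same question, here is a minimal sketch of one common statistical reading of a "logged odds ratio": the natural log of the ratio of two odds. Whether the article used exactly this form is not stated in the comment; the function names and sample rates below are hypothetical.

```python
import math

def odds(p):
    """Convert a probability (e.g. a BABIP) to odds."""
    return p / (1.0 - p)

def log_odds_ratio(p_a, p_b):
    """Natural log of the odds ratio between two rates."""
    return math.log(odds(p_a) / odds(p_b))

# Hypothetical rates: a pitcher's BABIP versus a league-average BABIP.
print(log_odds_ratio(0.280, 0.300))  # negative: lower odds of a hit than average
print(log_odds_ratio(0.300, 0.300))  # zero: identical rates
```

Taking the log makes the measure symmetric around zero, so swapping the two rates only flips the sign.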

 
studes
(280)
Comment rating: 1

Looks like a really nice job, Russell, but I'll need several days to digest it to be sure. ;)

 
studes
(280)
Comment rating: 0

Just re-read the entire THT Annual piece. A couple of guys who stood out across multiple approaches were Frank Crosetti and Bobby Tolan. Mort Cooper deserves a mention, as does Gene Moore. Best teammates ever!

 
studes
(280)
Comment rating: 1

In the 2010 THT Annual, I did something similar (not in terms of the math!) in which I looked at players whose teams had most outperformed expectations. I did this several ways:
- Their teams' Pythagorean variances over their careers (Ruben Sierra is a great clubhouse presence by that measure)
- Their teams improved the most compared to the year before the player joined the team (All sorts of biases, but Dennis Cook), and
- Their teammates most outperformed their projected WSAB totals (previous three years regressed). Smokey Joe Wood and Mort Cooper were the men.
I called it Luck, but who knows? Maybe Smokey Joe Wood was a great clubhouse presence.
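
As a rough illustration of the first approach, here is a sketch of a Pythagorean variance: actual wins minus the wins expected from runs scored and allowed. It assumes the classic exponent of 2; the THT Annual piece may have used a different exponent or method, and the team totals are hypothetical.

```python
def pythagorean_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins from runs scored and allowed (exponent 2 assumed)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

def pythagorean_variance(wins, runs_scored, runs_allowed, games):
    """Actual wins minus Pythagorean expected wins."""
    return wins - pythagorean_wins(runs_scored, runs_allowed, games)

# Hypothetical season: 90 wins on 750 runs scored and 700 allowed in 162 games.
print(round(pythagorean_variance(90, 750, 700, 162), 1))
```

A positive result means the team won more games than its run totals would suggest.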

 
studes
(280)
Comment rating: 0

Nice job, Ben. You definitely paid better attention than I did.

 
studes
(280)
Comment rating: 0

Yes, allowing more than ten votes makes sense. Given the current logjam, I also like loosening the 15-year and minimum voting rules.

 
studes
(280)
Comment rating: 0

Thanks for adding this, Joe. In case folks missed it, here is the link for the petition: https://www.change.org/petitions/change-the-baseball-hall-of-fame-voting-process

 
studes
(280)
Comment rating: 0

Never mind, I see your answer below. I like the distinction between multiplicative vs. additive park factors for runs vs. components, respectively. Makes intuitive sense, though part of me still wonders if additive is best for runs as well.

 
studes
(280)
Comment rating: 0

Thanks, Colin. Perfect. And great point about multiplying park factors to negative RAA. Have you thought much about additive park factors?
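
The point about negative RAA can be seen in a small sketch. It assumes a run park factor above 1 for a hitter's park and takes lgRPA to mean league runs per plate appearance; both adjustment functions are hypothetical illustrations, not the article's formulas.

```python
def multiplicative_adjust(raa, park_factor):
    """Divide runs above average by the park factor."""
    return raa / park_factor

def additive_adjust(raa, pa, lg_rpa, park_factor):
    """Subtract the runs the park is assumed to add over the player's PA."""
    park_runs = pa * lg_rpa * (park_factor - 1.0)
    return raa - park_runs

# Hypothetical below-average hitter in a hitter's park (factor 1.10).
raa, pa, lg_rpa, pf = -10.0, 600, 0.12, 1.10
print(round(multiplicative_adjust(raa, pf), 2))        # moves toward zero
print(round(additive_adjust(raa, pa, lg_rpa, pf), 2))  # moves further below zero
```

Dividing a negative total by a factor above 1 makes the hitter look better for playing in a hitter's park, which is the wrong direction; the additive form does not have that problem.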

 
studes
(280)
Comment rating: 0

Great article, Colin. If you don't mind, a couple of math questions. Why introduce lgRPA the way you do? Why not just divide by parkadjust? Mathematically, isn't it the same thing? By absolute linear weight runs, do you mean runs based on zero instead of average? If you were to use linear weights centered on average, would you approach the math differently?

 
studes
(280)
Comment rating: 1

Man, Russell, you're into purging your old ghosts, aren't you? Don't be too hard on yourself! You also might be interested in Bill James' recent column about the old bullpen by committee controversy with the Red Sox (it's on his subscription site). In the meantime, I have a simple test: will managers use their closers in the ninth inning of tied games? I don't see any reason for them not to--it's keeping with the one-inning closer role. And yet a tied game in the ninth is higher leverage than a two- or three-run lead. Last time I looked (a couple of years ago) few of them did. To me, that one change would go a long way toward finding a balance between the sabermetric bliss and the real world.

 
studes
(280)
Comment rating: 0

Phew. Nicely done, Colin. But who, exactly, are your people???

 
studes
(280)
Comment rating: 2

BIS addressed the bias claim in the latest Fielding Bible. They don't believe it's been an issue for several years. I agree the batted ball data isn't perfect, and we all should continue to point that out as appropriate. We can even throw in some snark if that's our writing style. But persistent trends can be spotted and interpreted in the data. We can have a reasonable level of confidence saying that so-and-so tends to hit more line drives. We can't be 100% confident, but so what? As long as we interpret the data correctly, we've gained something. 30 years ago, we had nothing. I can't believe how spoiled people are! ;)

Jul 18, 2012 10:46 AM on Getting Shifty Again
 
studes
(280)
Comment rating: 2

Oh yeah. Just had to say something, ya know. Someone's got to speak up for the poor, oppressed batted ball data. There's some good stuff in there!

Jul 18, 2012 7:18 AM on Getting Shifty Again
 
studes
(280)
Comment rating: 1

"We still don’t have good batted ball data, of course..." IMO, what we have is pretty good and worthwhile for a lot of analytic purposes. It's just not clear at what level the usefulness of the data breaks down.

Jul 18, 2012 7:04 AM on Getting Shifty Again
 
studes
(280)
Comment rating: 0

Nice one, Russell. I must say that you rarely came across all that cocky when you were writing. And, IMO, the people who were the most one-sided about DIPS were the ones who hadn't studied it closely.

Jul 09, 2012 10:42 AM on Hire Joe Morgan
 
studes
(280)
Comment rating: 10

Pat, the first 100 pages are a description of the system and the last two-thirds are basically a Win Shares encyclopedia. In between those pages are some excellent essays, in which Bill talks about things he discovered while creating the system. As someone who spent a lot of time understanding Win Shares, I have to say that I learned a lot by doing so. And give Bill credit here. He systematically described his system in detail, so that others could read it, analyze it, critique it, whatever. In fact, I still understand Win Shares better than I do WAR or WARP, because Bill is so good at explaining things. When Colin says "One of the most instructive failures we can look at is the madness of King James the Bill" he is way out of line, IMO. Bill started changing his system almost immediately after he published it. Bill's thoughts and work are always evolving, and he's to be complimented for it, not denigrated.

Jun 14, 2012 10:38 AM on The Madness of King Bill
 
studes
(280)
Comment rating: 0

Right, bhalpern. That series blows away every other LCS, even though it lasted only five games.

 
studes
(280)
Comment rating: 4

The 2011 ALCS is 17th among all ALCS's, out of 42. 2004 is 12th. 2003 is first.

 
studes
(280)
Comment rating: 0

My recollection is that Guy doesn't buy the fundamental premise of WPA--assessing impact in linear "real time." I'm pretty sure you and he see eye-to-eye in that regard.

 
studes
(280)
Comment rating: 0

You're welcome. I appreciate the lengthy quote, particularly coming from you. :) Just to detract from the warm feelings a little, while I don't disagree at all that WPA overrates relievers vs. starting pitchers, I do think there's value in using WPA/LI to assess reliever usage patterns.

 
studes
(280)
Comment rating: 2

Hey Colin, this is nicely put: "What the win expectancy model is truly capturing is not how much a play contributes to team wins, but how well an event predicts the outcome of the game itself." A fine distinction, but one that's worth chewing on.

 
studes
(280)
Comment rating: 0

Congrats, Joe!

 
studes
(280)
Comment rating: 0

Ah, okay. Speed is better than angle. I missed that. Why is that, though? From the graphs, it would appear that angle is as important as speed. Or is it that speed is more predictable than angle? I guess I love all the data and analysis, but I'm not sure what the take-away is.

 
studes
(280)
Comment rating: 0

So, to summarize what I think is your main point (if I may), we can better predict a pitcher's future (or "true") BABIP by looking at the underlying batted ball characteristics (speed and angle off bat) of his previous batted balls (as opposed to looking at previous BABIP). This is exactly what people who have played with batted ball data (such as ground balls and line drives) have found, but you're looking at the underlying physical data instead of "observed" batted ball data. Have you shown that your approach is better than the batted ball approach? I'm not doubting it, just wondering.

 
studes
(280)
Comment rating: 0

Thanks, Mike. I think I understand that. :) It was basically a way of "normalizing" speed off bat, unless I miss your point.

 
studes
(280)
Comment rating: 2

By the way, thanks for referring to Brian's article from the THT Annual. This general finding (that GB pitchers allow more line drives) is in agreement with David Gassko's analysis from 2006: http://www.hardballtimes.com/main/article/the-truth-about-the-grounder/

 
studes
(280)
Comment rating: 0

Interesting stuff again, Mike. I need to digest this, but there's something I just don't get. I think you say you're looking at a subset of batted balls: "It is also interesting to look at the batted ball speed in the plane inclined by 12 degrees above the horizontal, the launch plane for which the ball is most likely to become a hit." In the data, you show an .800 BA on those balls. But in the hSOB graph for 12-degree balls, BA only reaches .800 over 100 mph. So I don't get what we're looking at here.

 
studes
(280)
Comment rating: 0

Interesting, Mike. From a physics perspective, does it make intuitive sense that the pitcher would have more influence on the speed of a batted ball, since he initiates the pitch and the batter reacts? I wonder what the correlation between pitch speed and hSOB is? Also, the next question that would occur to me is whether the batter/pitcher interaction has an impact on Batting Average. That is, batting average may not follow the line graph published above on a batter/pitcher specific basis. I assume that's something you'll touch on in the next piece?

 
studes
(280)
Comment rating: 1

By the way, I should have said that I agree with your idea of just using hues of a single color in things like heat maps.

 
studes
(280)
Comment rating: 0

Colors are tough. Personally, I stick to ROYGBIV as much as possible and avoid lighter/darker hues. In the boxes above, the blue boxes stood out most to me, so my first inclination was to think that they were the highest/best zones for batters. Turns out they were worst. I'm partially color blind, so I may be a bad interpreter of this sort of thing.

 
studes
(280)
Comment rating: 0

Awesome, Mike. Sidenote: I was just saying yesterday that I don't trust heat maps, and you've nicely articulated why.

 
studes
(280)
Comment rating: 0

BTW, now that I've wasted your afternoon with many comments, I should say nice job, Derek. Regressing against pitchers with similar flyball rates is an elegant solution.

Sep 09, 2011 10:52 AM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

I need an editor. This... FB/BIP "stabilizes" in about 100 BIP, while IF/BIP "stabilizes" in 288 BIP. I assume this is almost entirely due to the fact that there are many less FB than IF in a group of BIP. should say this... FB/BIP "stabilizes" in about 100 BIP, while IF/BIP "stabilizes" in 288 BIP. I assume this is almost entirely due to the fact that there are many less IF than FB in a group of BIP.

Sep 09, 2011 8:17 AM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

I'm going to attempt to recap what I think I found here. Sorry for taking up the space, but I'd be interested if you think I got it right. FB/BIP "stabilizes" in about 100 BIP, while IF/BIP "stabilizes" in 288 BIP. I assume this is almost entirely due to the fact that there are many less FB than IF in a group of BIP. 0.07*FB/BIP "stabilizes" against IF/BIP more quickly than IF/BIP itself does (216 BIP). I assume it has something to do with the relative frequency of FB and the tight relationship between FB and IF. If you take anything that has a strong relationship with another thing--and that second thing happens a lot more often than the first--you will naturally have an equation that "stabilizes" more quickly. This is probably natural mathematics. IF/FB takes longer to stabilize (414 flyballs). I assume this is due to two things. One is again the low rate of infield flies, but exacerbated by the fact that we don't know the pitcher's flyball rate. The rate at which flyballs are actually infield flies will partially depend on how many flyballs that pitcher gives up per ball in play. I wonder how quickly IF/FB would stabilize if you included the pitcher's flyball tendencies? So, if you want to predict future flyballs, you need to base your calculations off something that includes "information" about his contact rate and his flyball rate. Due to the correlation between infield flies and all flies, previous IF rate does that, but previous FB rate stabilizes more quickly because there are a lot more of them. Does this make sense to people?

Sep 09, 2011 8:13 AM on The Infield Fly Rules
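
One way to see why a rarer event takes longer to "stabilize" is the standard reliability approximation for a binomial rate. The sketch below treats "stabilizes" as the sample size at which half the observed variance is true talent, which is an assumption about the term's meaning here; the means and talent spreads are hypothetical, not taken from the article.

```python
def stabilization_point(mean_rate, talent_sd):
    """Trials needed for reliability of 0.5 under a binomial noise model.

    Reliability at n trials is approximated as
        var_true / (var_true + p * (1 - p) / n),
    which equals 0.5 when n = p * (1 - p) / var_true.
    """
    noise_per_trial = mean_rate * (1.0 - mean_rate)
    return noise_per_trial / talent_sd ** 2

# Hypothetical inputs: flyballs are common, infield flies are rare.
fb_point = stabilization_point(mean_rate=0.35, talent_sd=0.05)
if_point = stabilization_point(mean_rate=0.035, talent_sd=0.01)
print(round(fb_point), round(if_point))
```

With these made-up inputs the rarer event needs several times as many balls in play, because its spread in true talent is small relative to the per-trial noise.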
 
studes
(280)
Comment rating: 0

You know, I think THT started this whole infield fly thing when we requested infield fly data from BIS back in 2005. We thought it was a new angle that would be interesting. Guess that's why I have such a strong interest in it. They mark infield flies based on distance from home plate. The latest I heard is 140 to 150 feet.

Sep 09, 2011 7:09 AM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Okay, I think I see. By using a linear formula that isn't based on zero, you're approximating what might be a curvilinear relationship. So, how quickly does this formula stabilize? Is it an improvement over the straight 7%?

Sep 09, 2011 4:27 AM on The Infield Fly Rules
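
A sketch of the idea: with a straight 7% rule, IF/FB is the same for every pitcher, while a line with a nonzero intercept implies an IF/FB rate that changes with the pitcher's flyball rate. The intercept and slope below are hypothetical, chosen only to show the shape.

```python
def if_rate_flat(fb_rate, share=0.07):
    """Infield flies per ball in play as a fixed share of flyballs."""
    return share * fb_rate

def if_rate_linear(fb_rate, intercept=-0.01, slope=0.10):
    """Infield flies per ball in play from a line not forced through zero."""
    return intercept + slope * fb_rate

for fb_rate in (0.25, 0.35, 0.50):
    flat = if_rate_flat(fb_rate) / fb_rate      # implied IF/FB, constant
    linear = if_rate_linear(fb_rate) / fb_rate  # implied IF/FB, rises with FB rate
    print(fb_rate, round(flat, 3), round(linear, 3))
```

With a negative intercept, the implied IF/FB climbs as the flyball rate climbs, which is the curved relationship a zero-based 7% rule cannot capture.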
 
studes
(280)
Comment rating: 0

That is, they don't use "outfield grass," which obviously can't be used in some ballparks anyway.

Sep 09, 2011 4:20 AM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

I'm confused. Derek said that any fly caught by an infielder is an infield fly. You're saying that MLBAM uses distance instead? That's exactly what BIS does too (and they don't just use the infield parameters).

Sep 09, 2011 4:11 AM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Got it. Thanks. It makes sense to me that IF% and FB% would be correlated, since IFs are included in FBs. At least, I think it makes sense. :) So, could you improve your model by varying the 7%, according to the pitcher's FB%? Or would that not be worth it?

Sep 08, 2011 7:48 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

I get confused when people start flipping off r-squareds as to what is being correlated with what, and how that relates to other correlations. For instance, you say that your r-squared of .68 is "much more significant." But than what? The .21? But those were two different regressions--one regressing IF/FB rate vs FB/Contact rate and the other comparing the IF/Contact rate in two halves. Why compare them?

Sep 08, 2011 7:40 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

OK, I think we weren't quite addressing each other. That helps clarify. Thanks.

Sep 08, 2011 7:37 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Oh, okay. Thanks for the clarification. I thought you were referring to the HR/OF vs. HR/Contact analysis Colin had a couple of months ago. I left several confused comments in that one, too.

Sep 08, 2011 7:36 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Last comment, I promise. When you say this... "Yes, it does go up. The graph shows that relationship." ...are you saying that you didn't use a standard 7% of flyballs in your first table? You increased it as the pitcher's flyball rate increased?

Sep 08, 2011 6:41 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

By the way, I am just totally confused about these points. What does HR/FB have to do with things? Are you referring to Colin's previous analysis, which I also didn't get? Should I just give up on this stuff?

Sep 08, 2011 6:32 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

I guess when you throw off r-squareds like that, I don't know what you're saying. I often have that problem with BPro articles (and some of my own too!). What is the point of that correlation? Also, is correlating something the only point? How about having a better handle on what makes pitchers unique and interesting?

Sep 08, 2011 6:20 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Why would you say that, Mike? Cataloging flies according to the person who caught them is dependent on the range of the infielders in question. Why would that make more "baseball sense" than where the ball actually lands?

Sep 08, 2011 6:17 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Wow. That is significant. I'm having problems, then, reconciling that fact with the notion of just using straight flyball rates as a proxy for infield fly rates. Wouldn't the rate (the 7% you used in your first table) go up as the overall flyball rate goes up? I have to admit that I'm still a fan of IF/OF, though I can't adequately explain why. It feels like it captures a nuance that IF/Contact misses.

Sep 08, 2011 3:20 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Thanks, Derek. That's what I thought. One observation is that BIS is more rigorous about identifying infield flies. It doesn't depend on who caught the ball--it's based on where the ball lands. I know Colin has issues with how well they measure it, but I'd have more faith in the BIS classifications being meaningful than MLBAM's. Don't know if that would affect your analysis at all--just an observation.

Sep 08, 2011 3:12 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Quick question #2: other studies have found that IF/FB goes up as FB% goes up. It appears that your graph might support that theory too. Any thoughts?

Sep 08, 2011 2:20 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Apologies for not knowing the answer to this, but how is infield fly defined? I believe you use the MLBAM version, which is the same as Pitchf/x? Is that right? How do they define it?

Sep 08, 2011 2:16 PM on The Infield Fly Rules
 
studes
(280)
Comment rating: 0

Wow, Mike. I'll have to spend some time on this one. Just wondering, do you read the old "Plunk Biggio" blog? I think it's now called Plunk Everyone. I thought he might have inspired you.

 
studes
(280)
Comment rating: 1

I'm looking forward to the movie, too, but dreading the old arguments. Oy. Never thought of Steven Collins as a character actor. Hasn't he almost always been a leading man type? Good choice for Art Howe.

Aug 05, 2011 5:52 AM on Moping About Moneyball
 
studes
(280)
Comment rating: 0

I can see that. It's just the change in language that threw me for a loop, I guess.

Aug 04, 2011 11:23 AM on Why When You Go Matters
 
studes
(280)
Comment rating: 0

Thanks Jeremy. A stolen base breakeven table for all inning/base/outs/score cells would be way cool. I'll work on it! (though it's probably been done somewhere already). I don't mean to be nitpicky, but I had interpreted 1% to be WPA/Pre-attempt WE. I'm not sure putting a percentage to .01 WPA should be done.

Aug 04, 2011 11:00 AM on Why When You Go Matters
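
A breakeven table like the one described could be built cell by cell from win expectancy. A minimal sketch, assuming three win expectancy values per cell (before the attempt, after a success, after a caught stealing); the numbers are hypothetical and not taken from any published table. The last lines show the difference between a raw WPA and WPA as a fraction of pre-attempt win expectancy.

```python
def breakeven_rate(we_before, we_success, we_caught):
    """Success rate at which attempting a steal leaves win expectancy unchanged.

    Solves p * we_success + (1 - p) * we_caught = we_before for p.
    """
    return (we_before - we_caught) / (we_success - we_caught)

# Hypothetical game state.
we_before, we_success, we_caught = 0.60, 0.64, 0.52
print(round(breakeven_rate(we_before, we_success, we_caught), 3))

wpa = we_success - we_before     # raw WPA of a successful steal
relative_gain = wpa / we_before  # WPA divided by pre-attempt WE
print(round(wpa, 3), round(relative_gain, 3))
```

Running the function over every inning/base/outs/score cell of a win expectancy table would fill in the full breakeven table.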
 
studes
(280)
Comment rating: 0

Sorry. Leverage Index.

Aug 04, 2011 7:21 AM on Why When You Go Matters
 
studes
(280)
Comment rating: 0

Nice job, Jeremy. Can you talk a bit more about what drives the breakeven rate? Is it related to LI? Score? Other things? Also, what do you mean when you say Gardner has averaged 1% WPA for each stolen base attempt? Also, are the BPro win expectancy tables still based on actual data? Probably doesn't matter for your specific comparisons, but I was just wondering.

Aug 04, 2011 6:49 AM on Why When You Go Matters
 
studes
(280)
Comment rating: 1

Yes to this: "...we have now conditioned large swaths of people to have a Pavlovian response to predictions based on a limited number of observations - 'small sample size.'"

Jul 25, 2011 1:35 PM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 3

Well, you may not like the tone of Colin's article. It may not match that found in academic articles. But I still have no idea why this means that sabermetricians are hypocritical for criticizing the Murray Chass's of the world. BTW, Colin is definitely saying that SIERA is "bad" in some ways. Look again at his statements about multicollinearity or why using HR/FB is bad (which I still don't get). Mike is saying that an approach that relies too much on regression is, indeed, "bad."

Jul 24, 2011 7:28 PM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 4

But that's a different point. Now you're saying that people are slinging mud; before you were saying that people should work together to find a "unified alternative." I have no problem with people openly disagreeing with each other. In fact, I think it's healthy. But I agree with you that it's to no one's credit if people start getting personal and slinging mud. I also agree that there is a little bit of mud slinging here. Colin can be harsh in his assessment at times. But it's to his credit that he doesn't hold back in his assessments and, as far as I know, doesn't let things get personal.

Jul 24, 2011 7:20 PM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 1

Okay, but why does that reflect poorly on sabermetrics? BPro is just one site--there are plenty of other sabermetric places on the web to talk about SIERA if you'd like.

Jul 24, 2011 3:15 PM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 8

I have just the opposite reaction--in fact, I'm glad that people disagree about these types of issues. Being willing to discuss what's right and/or wrong about the current "baseball way of thinking" (whether it's in the mainstream media or just among us baseball nerds) is exactly what sabermetrics is supposed to do. I see no hypocrisy at all.

Jul 24, 2011 1:36 PM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 1

Sorry to keep asking questions, but this also confuses me: "...there is a reason that this sort of analysis can be unsatisfactory: it assumes that a fly ball that wasn’t a home run is equally predictive of future home runs as a fly ball that is a home run." But isn't this just as true of HR/CON? In fact, couldn't you turn this argument around and say that this is why HR/FB is preferable to HR/CON? Because fly balls are more predictive of future home runs than, say, ground balls?

Jul 24, 2011 5:36 AM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 2

Third question: By using first half and second half results, you're typically looking at pitchers who played in the same environment. Won't that skew the results toward the HR/CON figure? Part of the reason to look at flyball rate is to take the park (mostly) out of the equation. A good test might be to ask whether HR/FB is more predictive of second half home runs than HR/CON?

Jul 24, 2011 5:27 AM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 2

Also, you say this: "Fly balls on balls in play are a much poorer predictor of future home runs than home runs on contact, with an r-squared of only .014." But your table shows an r-squared of .023 for FB/CON. What's the diff between .023 and .014? And aren't those abysmally low r-squared figures? I'm used to getting r-squareds in the .20 to .30 range.

Jul 24, 2011 5:23 AM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 1

Colin, I don't understand how your graph of HR rates supports what you're saying. Aren't you graphing actual HR rates vs. predicted HR rates based on flyball rates? Of course the actual HR rate is going to be wider--that's the nature of real life vs. projection, right? Are you implying that the distribution of expected home runs based on contact rate is wider than that for fly ball rate?

Jul 24, 2011 5:17 AM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 3

I thought you said "pinochles"

Jul 23, 2011 1:01 PM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 0

Mike, thanks so much. I always enjoyed working with you too. I agree 100% that sometimes there is a need to have projection systems, even if we don't totally understand them. There are a few things you're willing to take on faith. Not many, just a few.

Jul 23, 2011 11:15 AM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 5

Hey Colin, nice job (and nice job in the comments, Mike). I won't comment on the "personal" side of this issue cause I don't really understand it. But you've certainly given me something to think about, particularly the contention that HR/Con is better than HR/OF (or HR/FB). That is counterintuitive to me, and I'm going to have to read your explanation several more times to understand it. To me, xFIP is a useful stat that tells you something about a pitcher, but (IIRC) I resisted putting it on the THT stats page because I didn't see it as a "reference" stat. That was silly of me, I guess. Many readers asked to see it, so we eventually added it. Similarly, we used to run "ERA-FIP" at THT, which was also something requested by readers. I was kind of uncomfortable with that too and when we got a request to run ERA-xFIP too, I refused. I thought it put too much emphasis onto a single number and calculation. Time has shown that I'm in the minority in that position. I guess I am someone who is uncomfortable with the quest for a complex stat that explains everything. I am leery of issues like multicollinearity and other things I can't pronounce or understand. I intuitively won't trust a stat I can't understand. I make only two exceptions: projection systems and total-win based systems. So I wish you guys success on your own statistical quests but please: don't try to do too much. Keep it simple.

Jul 23, 2011 9:09 AM on Lost in the SIERA Madre
 
studes
(280)
Comment rating: 0

Okay, I thought you were interpreting that graph. I admit that I don't understand your methodology. I mean, I understand your description as far as it goes, but I don't understand how that translates into specific numbers on your graphs. Probably just me. ;)

Jun 23, 2011 1:52 PM on Checks and Balances
 
studes
(280)
Comment rating: 0

Okay, so I guess I don't quite understand this: "Basically, I compared how players who faced each other in consecutive years fared in those confrontations and in their matchups against everyone else. If they did better against the same players, that means their environment was different.* If they did worse against everyone else than they did against the same players, that means everyone else got better." How does this paragraph translate into the numbers on the graph above it?

Jun 23, 2011 9:47 AM on Checks and Balances
 
studes
(280)
Comment rating: 0

That said, I'm still digesting the methodology.

Jun 23, 2011 9:40 AM on Checks and Balances
 
studes
(280)
Comment rating: 0

Jeremy, did you regress the stats of the first year? If not, I believe you have a selection bias.

Jun 23, 2011 9:33 AM on Checks and Balances
 
studes
(280)
Comment rating: 0

Oy. I wouldn't foist the Cook books on my worst enemy. They're unreadable and do a wretched job of "laying out the basics of baseball analysis." Cook was a terrible writer, his methodologies overwrought and many of his findings flawed. Just because he published before James, Palmer, et al., doesn't make his books noteworthy in other ways.

May 04, 2011 6:14 AM on The GM Starter Pack
 
studes
(280)
Comment rating: 3

Good stuff, Mike. The graph breaking out different years is pretty interesting. Worth staring at. So, is the correlation between temperature and fastball speed 100% real (occurs on the pitcher level) or might it also impact the PITCHf/x recordings? BTW, I hope that's good news for Kerry Wood.

 
studes
(280)
Comment rating: 1

Just want to add my own three thumbs up, Mike. I keep the third thumb in reserve for articles like this.

Feb 16, 2011 2:00 PM on The Real Strike Zone
 
studes
(280)
Comment rating: 0

Thanks, Colin. So, picking on the 100-200 bin, when you use SIERA to predict future ERA, 66% of the results will be within 2.1 runs of ERA and 95% will be within 4.2 (that's taking the RMSE on both sides). When you use ERA to predict ERA, 66% of the results will be within 2.5 runs of ERA and 95% will be within 5.0 runs. Is that right? If so, it's obviously an improvement, but I think somewhere we should acknowledge that ERA is just plain tough to predict, regardless of what measure you use. Looked at that way, I don't think there's much to choose between SIERA, xFIP and FIP. Just use whatever you're comfortable using.

 
studes
(280)
Comment rating: 0

Congrats, Sky, and welcome to BPro. I'm also someone who feels that, once you reach a certain threshold, ERA is just as good a predictor as anything. Happy to see you peg the threshold. I'm pretty comfortable saying that, after a couple of years as a starting pitcher, actual performance is as good an indicator as these other metrics. Technical question: how good are any of them anyway? At 100 innings, it appears that the average RMSE is 1.2 points of ERA. Is RMSE similar to standard deviation? Can you say that 66% of all estimates fall within one RMSE, or something similar to that? So the best any of these estimates might get is 66% within plus or minus 2.4 ERA? If so, it makes the difference between the RMSE of these measures kind of trivial.
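
On the technical question, a sketch under a stated assumption: if forecast errors are unbiased and roughly normal, RMSE plays the role of a standard deviation, and the share of errors within k RMSE follows from the normal distribution. Real ERA forecast errors may not be normal, so these shares are only a guide.

```python
import math

def share_within(k):
    """Share of zero-mean normal errors within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))

rmse = 1.2  # runs of ERA, the figure quoted in the comment
for k in (1, 2):
    print(f"within +/- {k * rmse:.1f} runs: {share_within(k):.1%}")
```

Under that assumption, about 68% of estimates fall within plus or minus one RMSE (1.2 runs here, a band 2.4 runs wide), and about 95% fall within plus or minus two RMSE.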

 
studes
(280)
Comment rating: 0
 
studes
(280)
Comment rating: 0

Here it is. I forgot that it was inspired by a BPro article. http://www.hardballtimes.com/main/article/long-live-baseball-analysis/

 
studes
(280)
Comment rating: 0

Yes, that is my reaction, too. The point would be that closers have more value in low-scoring environments.

 
studes
(280)
Comment rating: 1

Excellent article, Colin. I think that's the best job of articulating the issue I've seen--not the issue of "after the game, everything is the same," but the issue of "diminishing leverage for those who come later in the game." Different things, I think. Anyway, I'm someone who is okay with the way WE works. Guess I'm an "in the moment" kind of guy. But I wonder if there is some way to develop another system that accounts for "game impact" in the moment but also incorporates the idea of "creating leverage." I've done a few things along those lines, such as writing about giving weights to batting and pitching events based on how close each game turned out to be, but that's probably just the tip of the iceberg.

 
studes
(280)
Comment rating: 0

Okay, I understand that perspective. Thanks.

 
studes
(280)
Comment rating: 0

Well, if I'm understanding your point (always a questionable issue due to my lack of powers of comprehension), you're saying that the "disagreement" between UZR and DRS--that left by the partial correlation--is a bad thing. I don't know that that's true, because the two systems have different components and also measure them differently. If your point is that those other components and measurements make the systems more different than people might realize on the surface, then you've raised a good point. I think a lot of folks are already aware of how different the systems are, but it's good to reinforce the point. But perhaps the bigger reason for my reaction is that this sentence raises my hackles... "For a time we stopped doing science when it comes to fielding analysis, and instead have been doing baseball alchemy..." ...because it's condescending and I think it diminishes the terrific work that MGL and BIS continue to put into their systems, to better understand the dynamics of fielding and improve the work.

 
studes
(280)
Comment rating: 0

I did mention runs, forgot that. But it is an issue to the extent that the two systems interpret runs differently. DRS includes "home runs saved," for instance, which Dewan gives a lot of weight to (1.4 runs). I don't believe UZR has that. I don't believe that the differences in the two systems are as straightforward as you and Colin imply. Things like pickoffs, bunts, "home-run saving catches" etc. are all handled differently in the two systems, and those are only the things I know about. Park adjustments, yadda yadda. This sentence... "2) realistically, expected DER likely accounts for the lion's share of any overall difference there is." ...is a hypothesis, as far as I know, and I'm skeptical of it.

 
studes
(280)
Comment rating: 0

I'm not worried about the units. I guess I don't understand your point. The two fruit salads don't have the same fruits. You seem to want to pin your findings on bad data, or inconsistent misinterpretation of the data, but I think the systems are too different to conclusively say that.

 
studes
(280)
Comment rating: 0

Isn't DRS Defensive Runs Saved? If so, that is very different from DER, isn't it? For example, I think it includes catcher ERA, runners picked off and other things not related to DER. Plus, it's denominated in runs, which would make it different from DER. I don't know about UZR, but this feels like apples and oranges to me.

 
studes
(280)
Comment rating: 0

You made me go back and look at Win Shares fielding stats, because James' system is not unlike your new FRAA system. Win Shares agrees with you in 2010--it calculates that Jeter was more than 30 assists less than average. I know Win Shares are anathema to some (and for some good reasons), but it's interesting to me that his fielding metrics may still be relevant.

 
studes
(280)
Comment rating: 0

Burr, that's what I did/do with Win Shares at THT. Here's an example: http://www.hardballtimes.com/thtstats/main/index.php?view=winshares&linesToDisplay=50&orderBy=total&direction=DESC&season_filter[]=2008&pos_filter[]=All&Submit=Submit "WSP" refers to Win Shares Percentage, where .500 is average. It's a Win Shares rate stat. Yes, sometimes players are better than 1.000.

Nov 04, 2010 5:57 PM on Replacing Replacement
 
studes
(280)
Comment rating: 0

Yup, that's my point. If we want the replacement model to reflect economics and if we use your parameters, then we can only apply it to players with more than six years of experience, right?

Nov 04, 2010 5:53 PM on Replacing Replacement
 
studes
(280)
Comment rating: 0

Colin, sorry. I should have posted the link. The approach is still the same, but I have changed the percentages. In particular, I found that the old criticism of Win Shares is true--it undervalues starting pitchers--so their replacement level is lower. I think I wrote about the updated values in the Annuals and not on the web. But the basic idea is the same: starters against next level of bench players. I didn't feel beholden to the "freely available talent" approach.

Nov 04, 2010 5:50 PM on Replacing Replacement
 
studes
(280)
Comment rating: 0

By the way, the average major league player paid the major league minimum actually performs at least as well as, if not better than, bench level. We've discussed this before and I don't mean to shout out Colin here.

Nov 04, 2010 12:00 PM on Replacing Replacement
 
studes
(280)
Comment rating: 0

Tango, you're again insisting that replacement player is only an economic concept. I'm saying it's not. It's an important adjustment to any metric that compares players to average, when those players have significantly different amounts of playing time.

Nov 04, 2010 11:48 AM on Replacing Replacement
 
studes
(280)
Comment rating: 0

I agree that chaining is an important factor, and one that ought to be discussed. That's why I'm not convinced that the "26th man" approach is best. However, if you want to compare players with different amounts of playing time with just one number, then you will be limited by using an average baseline. Yes, you can make the comparison without it, if you throw in other stats and do the math. But why make people do that? It's like saying this: "Hey, here's a stat, but you can't use it the way it is. Here's some other numbers that will help you. Do the math yourself." Why not make it usable in the first place?

Nov 04, 2010 11:46 AM on Replacing Replacement
 
studes
(280)
Comment rating: 0

Replacement level is not just an economic concept, and I think that limiting it that way misses the bigger point. If you want to compare players with significantly different amounts of playing time, then you DEFINITELY need a baseline that is different than average.

Nov 04, 2010 10:56 AM on Replacing Replacement
 
studes
(280)
Comment rating: 0

But replacement level tells you as much as average level does, and more. Targeting 2 WAR over a full season may be roughly equal to average--just adjust from there. On the other hand, you lose something with the average baseline, as Colin explains. It gives no credit to a player who plays an entire season at an average level WHEN COMPARED TO a player who was above average but had only one plate appearance. That player actually helped his team reach the postseason more than the first one did. If players all played the same amount of time, then I'd agree with you. But they don't.

Nov 03, 2010 6:18 PM on Replacing Replacement
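
The playing-time argument above can be sketched with a little arithmetic. This is only an illustration: the 600 plate appearance season and the per-PA rate for the one-PA player are assumptions, and the 2-win gap between average and replacement follows the rough "2 WAR over a full season" figure mentioned in the comment.

```python
FULL_SEASON_PA = 600          # assumed length of a full season, in plate appearances
REPLACEMENT_GAP_WINS = 2.0    # assumed wins separating average from replacement per full season

def wins_above_average(rate_per_pa, pa):
    """Wins above average: per-PA rate above average times playing time."""
    return rate_per_pa * pa

def wins_above_replacement(rate_per_pa, pa):
    """Add back the credit an average player earns just by filling playing time."""
    return wins_above_average(rate_per_pa, pa) + REPLACEMENT_GAP_WINS * pa / FULL_SEASON_PA

# An exactly average regular versus an above-average player with one plate appearance.
regular = (0.0, 600)
cameo = (0.01, 1)   # hypothetical rate, illustration only

print("above average:    ", wins_above_average(*regular), wins_above_average(*cameo))
print("above replacement:", wins_above_replacement(*regular), wins_above_replacement(*cameo))
```

Against an average baseline the one-PA player ranks ahead of the full-season regular. Against a replacement baseline the order flips, which is the point being made about giving credit for playing time.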
 
studes
(280)
Comment rating: 0

I agree 100% that we need replacement level due to the playing time issue. As far as I'm concerned, the exact level can be set a number of places. I used "bench" as my level for Win Shares Above Bench (and posted my research at Baseball Graphs), and I think Keith Woolner did the same thing in one of the BPro books. Is there a particular reason to believe that a replacement level equal to the "26th man" is better than another level?

Nov 03, 2010 5:44 PM on Replacing Replacement
 
studes
(280)
Comment rating: 1

Exactly right, Colin. Well said.

Sep 14, 2010 8:28 AM on Missing the WAR
 
studes
(280)
Comment rating: 0

So what *can't* Fieldf/x do?

 
studes
(280)
Comment rating: 0

It's sort of irrelevant now, but here is the study I was looking for: http://www.hardballtimes.com/main/article/base-stealer-intangibles-part-1/

 
studes
(280)
Comment rating: 0

Very cool, Matt. The other variable to consider is number of outs. It would be important with a runner on first.

 
studes
(280)
Comment rating: 0

Good question about controlling for batter effectiveness. I'm not sure. But I also was thinking along the lines you suggested, that perhaps the 2/3 situation balances out the first base situation in some way. It would be interesting to dive into. Good stuff, Matt.

 
studes
(280)
Comment rating: 0

http://baseballanalysts.com/archives/2010/06/shift_morneau_s.php http://baseballanalysts.com/archives/2010/06/on_defensive_al.php Actually, I'm not sure that these articles definitively answer the question. FWIW, Bill James and John Dewan have argued about this topic for many years.

 
studes
(280)
Comment rating: 0

By the way, John Walsh doesn't study this specific issue in the following article, but it's a nice overview of left-handed batters and why they seem to perform better than righties: http://www.hardballtimes.com/main/article/the-advantage-of-batting-left-handed/

 
studes
(280)
Comment rating: 0

This is the one I remembered off the top of my head: Page 323 of The Book by Tango et al. With a runner on first and less than two out, left-handed batters have a wOBA that's 20 points higher; for righties, it's ten points higher. That doesn't address BABIP specifically, though. I'll keep looking. Logically, lefties should have a higher BABIP with a runner on first and no out, 'cause the first baseman will be playing close to first and all batters naturally pull groundballs.

 
studes
(280)
Comment rating: 0

Matt, it seems to me that your results don't jibe with other studies I've seen that point to a significant increase in BABIP for lefthanded batters (relative to righties) with a runner on first. Perhaps it's not apples to apples?

 
studes
(280)
Comment rating: 0

Yes, exactly. That has been documented several times in the past. I remember it from Tango's Book, as an example. It's a good reason to put left-handed batters in the #3 spot of the lineup.

 
studes
(280)
Comment rating: 1

But that's the nice thing about Sam's proposal. It doesn't force a team into the FA market, it only makes it cheaper for them. If it doesn't make sense to sign a free agent, even at a cheaper rate, then the team won't do it. Theoretically...

 
studes
(280)
Comment rating: 0

Excellent perspective, Matt. I agree with you and Sam Mauser that a much better system would be one that tied the tax and its distribution to salary/payroll. Revenue is too squishy. On the surface, Sam's proposal makes a lot of sense. It might be worth getting more specific with it.

 
studes
(280)
Comment rating: 0

Well, never mind. Stupid question. If it works for 162 games, it should work for one. I'm just slow, is all.

Aug 25, 2010 3:03 PM on Support Group
 
studes
(280)
Comment rating: 0

By the way (and I understand this is off the point), but does the Pythag formula work well for determining the probability of winning a specific game? I don't believe I've seen it used that way before. Just curious...

Aug 25, 2010 2:45 PM on Support Group
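
For reference, a minimal sketch of the Pythagorean formula being asked about, assuming the classic exponent of 2 (other exponents are in use, and the comment does not say which one the article applied). Whether it predicts single games well is an empirical question this sketch does not answer. It only shows that the formula depends on the ratio of runs scored to runs allowed, so season totals and per-game rates give the same number.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expectation: RS^x / (RS^x + RA^x). The default exponent is an assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical team: 5 runs scored and 4 allowed per game, or 810 and 648 over 162 games.
per_game = pythag_win_pct(5.0, 4.0)
per_season = pythag_win_pct(810.0, 648.0)
print(round(per_game, 3), round(per_season, 3))
```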
 
studes
(280)
Comment rating: 0

Got it. Thanks.

Aug 25, 2010 2:38 PM on Support Group
 
studes
(280)
Comment rating: 0

Great stuff, Colin, though I agree with evo34. Looking at this data by team would get rid of the multicollinearity, or whatever you call it. BTW, I think there is an error in your formula. You use RPG for offense, but RA for defense. I assume that's Run Average and not Runs Allowed, but you multiply the RA by the total number of innings pitched for both starters and relievers. That would give you total runs allowed, not runs allowed per game. If I'm reading it correctly.

Aug 25, 2010 2:22 PM on Support Group
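
The units issue raised above can be checked with a short sketch. Colin's actual formula is not reproduced here. This assumes RA means run average (runs per nine innings), and the innings, run averages, and game count are hypothetical.

```python
def total_runs_allowed(run_average, innings_pitched):
    """Run average is runs per nine innings, so RA * IP / 9 is a total, not a per-game rate."""
    return run_average * innings_pitched / 9.0

def runs_allowed_per_game(starter_ra, starter_ip, reliever_ra, reliever_ip, games):
    """Combine starter and reliever totals, then divide by games to match an offensive RPG figure."""
    total = (total_runs_allowed(starter_ra, starter_ip)
             + total_runs_allowed(reliever_ra, reliever_ip))
    return total / games

# Hypothetical staff: starters 4.50 RA over 960 IP, relievers 4.00 RA over 486 IP, 162 games.
print(total_runs_allowed(4.5, 960) + total_runs_allowed(4.0, 486))   # season total
print(runs_allowed_per_game(4.5, 960, 4.0, 486, 162))                # comparable to RPG
```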
 
studes
(280)
Comment rating: 0

Congrats, Colin! Excellent move for BPro.

 
studes
(280)
Comment rating: 0

Yah, good point. I have no idea how to apply standard deviations to non-normal distributions. Think it substantially changes the 68%?

Jul 28, 2010 6:13 PM on Looking Farther Afield
 
studes
(280)
Comment rating: 0

Yes, that's 68% (34% on either side). Two standard deviations are 95% and three are 99%. FWIW, I would advocate using two standard deviations. 68% is okay, but not compelling, and using two standard deviations, or 95%, is more understandable and intuitive to the average reader. Obviously, down the line, you can make statements of how likely it is that Ozzie *isn't* the best fielding shortstop in your dataset.

Jul 28, 2010 2:20 PM on Looking Farther Afield
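
A quick sketch of where the 68% and 95% figures come from, assuming a normal distribution. For the non-normal case, the second function gives Chebyshev's bound, which holds for any distribution with a finite variance but is much looser. Neither function says anything about how the fielding margins of error in the article were actually computed.

```python
import math

def normal_coverage(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

def chebyshev_lower_bound(k):
    """Guaranteed minimum share within k standard deviations for any distribution."""
    return max(0.0, 1.0 - 1.0 / (k * k))

for k in (1, 2, 3):
    print(k, round(normal_coverage(k), 4), round(chebyshev_lower_bound(k), 4))
```

Under normality the three bands cover about 68.3%, 95.4%, and 99.7%. Without that assumption, two standard deviations guarantee only 75% and one standard deviation guarantees nothing.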
 
studes
(280)
Comment rating: 0

Big huzzahs for introducing margins of error, by the way. Love it. Another stupid question, though: what range does the margin of error represent? 99% of potential outcomes?

Jul 28, 2010 12:49 PM on Looking Farther Afield
 
studes
(280)
Comment rating: 0

By the way, I do see that you addressed that in your article. I guess I'm just pulling it out a bit more.

Jul 28, 2010 12:43 PM on Looking Farther Afield
 
studes
(280)
Comment rating: 0

I have a stupid question and/or comment. With the "ground ball adjustment," we're hurting above-average infielders who are paired with below-average outfielders, right? And vice versa? I think the reply is that the system is worse without that adjustment, and I would intuitively agree. But it seems that it ought to be noted that this is a possible problem with the system. It would also seem to set up the next question: can we make the system better by using simple batted ball types, or do you feel there is too much systematic bias in even that data? What's the tradeoff?

Jul 28, 2010 12:41 PM on Looking Farther Afield
 
studes
(280)
Comment rating: 0

Good point, Eric. I am not a fan of K/BB ratio at all. I think it tells us very little and can be downright misleading. Another approach, which we use in the THT Annual, is to assign linear weights to strikeouts and walks and total them up.

Jul 21, 2010 11:09 AM on To Subtract or Divide
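
A rough sketch of the linear-weights alternative to K/BB ratio described above. The run values are placeholders chosen for illustration, not the weights used in the THT Annual, and the two pitcher lines are hypothetical.

```python
import math

# Hypothetical run values per event (positive = runs saved by the pitcher).
K_RUN_VALUE = 0.27
BB_RUN_VALUE = -0.30

def k_bb_ratio(k, bb):
    """Strikeout-to-walk ratio; undefined with zero walks, reported here as infinity."""
    return math.inf if bb == 0 else k / bb

def weighted_runs_per_bf(k, bb, batters_faced):
    """Linear-weights total for strikeouts and walks, per batter faced."""
    return (K_RUN_VALUE * k + BB_RUN_VALUE * bb) / batters_faced

# Two hypothetical pitchers, each facing 800 batters.
control_guy = {"k": 80, "bb": 16}    # few walks, few strikeouts
power_guy = {"k": 200, "bb": 50}     # more walks, many more strikeouts

for name, p in (("control", control_guy), ("power", power_guy)):
    print(name, k_bb_ratio(p["k"], p["bb"]), weighted_runs_per_bf(p["k"], p["bb"], 800))
```

With these made-up numbers the control pitcher has the better ratio (5.0 against 4.0) while the power pitcher saves more runs per batter, which is one way the ratio can mislead.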
 
studes
(280)
Comment rating: 0

Neat interview. I find it ironic that a guy who is famous for what he did in Vegas is surprised that business is all about money.

Jul 15, 2010 2:24 PM on Jeff Ma
 
studes
(280)
Comment rating: 1

I'm somewhere in between Colin's viewpoint and many of the posts. The strikeout rate doesn't bother me, but it does seem that Wright is trying to find a new approach after his power outage of last year and still hasn't settled on it. He may be in the process of morphing into more of a TTO guy, which would be sad in many ways but not unprecedented. The really sad thing is that Wright has turned into a New York media whipping boy even though he's clearly been the Mets' best hitter so far. Heavy hangs the head...

May 16, 2010 4:51 AM on Wrighting the Wrong
 
studes
(280)
Comment rating: 0

I think you're misinterpreting the 26-31 year olds who signed with new teams. Their WARP went from -.34 to -.18, but you have them providing more value in their first year in your percentage table in the article. BTW, I also have a problem with this methodology, using the percentages to argue that players that re-signed with their teams "aged" better. Unfortunately, I can't really articulate why--I'll see if I can come up with something constructive. OTOH, I find your general conclusion (teams that re-sign their own players get better results cause they know their players best) intuitively correct. It would be great to really "prove" it.

 
studes
(280)
Comment rating: 1

Seems to me that WPA is what it is. That's its strength and weakness. As a story stat, I think WPA is superb. When it comes to giving "credit" to players, you're really quantifying their role in the game story. As Matt says, you're not passing some sort of moral judgment on them, and I don't know many people who are using WPA to pass moral judgment. Regarding Colin's specific example, Sweeney did get credit for extending the inning to Ichiro, but you're right that he didn't get credit for what Ichiro subsequently did. Should he? Well, if you give him that credit, that might be "fairer," but it also disrupts the game story. It becomes a "post hoc" story instead of an in-game story. To me, that's not worth it. Isn't leverage index the way to solve Clay's boundary issue?

May 03, 2010 8:11 AM on WPA
 
studes
(280)
Comment rating: 2

New BPro contest: find the Nash equilibrium in the MLB draft!

 
studes
(280)
Comment rating: 1

I would think there is one flaw in the draft slot game theory. IIRC, the draft salary slotting only started a few years ago--less than five years or so. Yet the AL superiority developing minor league talent dates back to the early 1990's.

 
studes
(280)
Comment rating: 0

Hey Matt, still chewing on this, but are you aware of the work Steve Treder has done in this area? He's presented at SABR, and also published his work in the most recent THT Annual. Bottom line, he published a long series of work at THT similar to your "No Turnover" Standings, but throughout baseball history. He called his the "Value Production Standings." You should be able to find it easily at our site. Bottom line, the balance in developing "original" talent shifted from the NL to the AL in the early 1990's and has stayed there ever since.

 
studes
(280)
Comment rating: 0

Matt, responding to your previous post, I did read your article in detail and didn't "show up to hate." In fact, I posted a review of it at THT. I'm sorry you find my comments hurtful, but I've certainly received a lot worse and not minded.

 
studes
(280)
Comment rating: -10

"For Phillies phans, this is actually a great sign." ...seems pretty close to "best contract ever." Just sayin'

 
studes
(280)
Comment rating: -8

You remind me of hot-shot consultants I used to hire who thought they could model how my business worked but wound up adding no value. Total waste of money. The bottom line is that teams still have a budget based on what they think their revenue will be in the next year. If you pay someone a certain amount of money from that budget, it represents less money for someone else. And budgets are typically based a year at a time, not for a three- or five-year period. The nature of baseball--in which wins can be hard to predict--makes long-term budgeting an intellectual exercise only. The bottom line is that, if the Phils win less games next year, they will have less money to pay ballplayers. No amount of theorizing can refute that. Howard's contract is more than just a "sunk cost"--it's an actual payment due from a future budget.

 
studes
(280)
Comment rating: 0

One other comment. I don't understand this statement: "The myth here is that teams spend money to be nice to their fans, but only up until they reach some arbitrary budget, and then they stop. This is untrue: teams spend money to make money." My experience is that teams don't follow an economists' ideal of matching marginal cost to marginal revenue. They don't consider player expenses to be investments, at least not financially. They have annual budgets, just like every other business. Do you have evidence to the contrary?

 
studes
(280)
Comment rating: -2

Matt, putting the analytic talk aside, the crux of your article seems to be that the Phillies know Howard better than analysts. That may be true, but there have been many instances in which teams overvalue players on hand. Groupthink can cut both ways, and I don't think an analytic approach that rests on the assumption that the Phillies have avoided groupthink is going to put the rest of us outraged analysts at ease.

 
studes
(280)
Comment rating: 0

George Lindsey probably deserves primary credit for creating the run expectancy tables in the early '60's. The Mills brothers took his work a step further and turned them into win expectancy tables and calculated WPA (not their term) for all players in 1969. However, the Mills brothers didn't create a "Runs Created" sort of stat. Skoog built on Lindsey's ideas (and Palmer's) to create his RC stat. I'm not aware that anyone calculated a specific RC stat before Skoog (though many, like Lindsey and Palmer, had the idea), but that's because Skoog had the Project Scoresheet data. At least, that's my understanding.

 
studes
(280)
Comment rating: 0

I'm enjoying the discussion. Just one small point: FIP doesn't talk about fly ball percentage except to the extent it's implied in the home run rate.

Feb 27, 2010 8:53 AM on Barry's World
 
studes
(280)
Comment rating: 0

Never mind. Eric just reminded me of our discussion about it last September.

Feb 26, 2010 7:58 AM on Part 4
 
studes
(280)
Comment rating: 0

"Additionally, as it pertains to xFIP, we spoke with Dave Studeman of The Hardball Times in order to determine that the expected number of home runs to be substituted into the FIP formula is to be calculated through home runs per outfield flies, not the sum of those and popups." I don't remember any conversation about this at all.

Feb 26, 2010 4:47 AM on Part 4
 
studes
(280)
Comment rating: 0

One correction: xFIP is not based on HR/FB, but HR/OF (home runs per outfield fly). Sounds minor, but it's an important distinction.

Feb 16, 2010 7:39 AM on Part 1
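
To make the HR/OF distinction concrete, here is a hedged sketch using the commonly published form of FIP. The additive constant and the league HR-per-outfield-fly rate are placeholder assumptions that vary by season, and the stat line is hypothetical. The only change xFIP makes is swapping actual home runs for an expected count based on outfield flies, with popups excluded from the denominator.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Commonly published FIP form; the constant is an assumed league adjustment."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

def xfip(outfield_flies, bb, hbp, k, ip, league_hr_per_of=0.11, constant=3.10):
    """FIP with home runs replaced by outfield flies times an assumed league HR/OF rate.

    outfield_flies should exclude infield popups.
    """
    expected_hr = outfield_flies * league_hr_per_of
    return fip(expected_hr, bb, hbp, k, ip, constant)

# Hypothetical line: 200 IP, 180 K, 50 BB, 5 HBP, 20 HR, 200 outfield flies.
print(round(fip(20, 50, 5, 180, 200), 3))
print(round(xfip(200, 50, 5, 180, 200), 3))
```

Counting popups as flies would inflate the fly total and therefore the expected home runs, which is why the choice of denominator matters.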
 
studes
(280)
Comment rating: 0

http://www.hardballtimes.com/main/article/ten-things-about-momentum-in-the-postseason/

 
studes
(280)
Comment rating: 0

Of course, if the Mets had actually created some offense at the time, Torre might not have changed his opinion after all.

Sep 28, 2009 8:37 AM on Weekend Update
 
studes
(280)
Comment rating: -1

I don't know about VORP, but that's been done with a number of other value stats. I did it with Win Shares for several years. In every case, free agent pitchers were much more expensive than free agent hitters. And I used WSAB, which attempts to adjust for the imbalance between starting pitching and everything else in Win Shares.

 
studes
(280)
Comment rating: -1

Fun stuff. Thanks.

 
studes
(280)
Comment rating: 0

I may be missing the point, but it seems to me there's nothing wrong with not spending a lot on pitching if you have a lot of good young pitching. A heavy payroll in a certain category usually means the team hasn't done a good job of developing that talent internally, right? This is particularly true of pitching, where the aging curve is mostly flat before eventually trending down. Plus, pitchers are generally not as good an investment on the free agent market as hitters. So this may be a consequence of smart money management by the Nats, not bad money management.

 
studes
(280)
Comment rating: -1

A couple of critical comments: 1. That first paragraph was incredibly mind-numbing. Talk about not grabbing your reader at the beginning. 2. What the heck is the definition of all those stats in the table? Z-Swing? I think you assume too much of the reader, particularly since those aren't BPro's stats. We try to avoid things like that at lesser sites like THT.

 
studes
(280)
Comment rating: 0

What most baseball analysts really object to is the reference to chemistry and pressure when those things may or may not have had an effect. How do you know the Rays "choked" or lost confidence? You don't, any more than I do. I think the tone of Joe's article was right on. Those things may have had an impact, and it's fun to think about them. But those psychological pressures may not have been the real story of the game. Sometimes players do and don't perform just because that's how the ball bounces.

 
studes
(280)
Comment rating: 1

Nice article, Joe. I don't have a problem with a baseball analyst being willing to consider the role of confidence or anxiety in a game or situation. Just don't make it a habit!