You work for a couple of teams ... have any teams made substantial sabermetric discoveries that would be of high interest to sabermetricians in general?
That is, suppose you have a scale of sabermetric discoveries something like this (your rankings may vary, feel free to recalibrate):
Runs Created: 10
Voros and DIPS: 8
Players' aging curves: 6
Strikeout pitchers have longer careers: 4
Clutch hitting doesn't exist much: 2
And suppose you ranked teams' top five discoveries that the general sabermetric public doesn't know about. How would they rank on that scale?
I’d like you to tell us the one thing you learned about women. Then I’d know one thing too!
Mitchel Lichtman analyzes another of the authors' claims here:
MGL, does it make a difference whether you use a cutoff of 0, 5, 10, 15, or something higher?
Because, if it only makes a little difference, and the peak still comes up around 27 regardless, wouldn't that pretty much settle the question of peak with respect to the delta method?
That is: the problem with the delta method is the dropouts. If the results are robust (roughly the same) regardless of what reasonable method you use to compensate for the dropouts, doesn't that give you a solid conclusion?
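The robustness check being proposed can be sketched numerically. Everything below is hypothetical toy data, not anyone's actual study: the "true" talent curve is assumed to peak at 27, playing time is drawn independently of skill (i.e., no selective dropout), and `min_pa` stands in for the debated cutoff. Under those assumptions, the delta-method peak shouldn't move as the cutoff changes:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_curve(age):
    return -0.003 * (age - 27.0) ** 2    # assumed quadratic talent curve, peak 27

rows = []                                # (player, age, PA, performance)
for pid in range(500):
    for age in range(24, 33):
        pa = int(rng.integers(0, 600))   # playing time, independent of skill here
        perf = true_curve(age) + rng.normal(0.0, 0.004)
        rows.append((pid, age, pa, perf))

def delta_method_peak(rows, min_pa):
    """Chain the weighted average year-to-year change at each age."""
    data = {(p, a): (pa, v) for p, a, pa, v in rows if pa >= min_pa}
    ages = sorted({a for _, a in data})
    curve = {ages[0]: 0.0}
    for a in ages[:-1]:
        deltas, weights = [], []
        for (p, age), (pa, v) in data.items():
            if age == a and (p, a + 1) in data:
                pa2, v2 = data[(p, a + 1)]
                deltas.append(v2 - v)
                weights.append(min(pa, pa2))   # one common weighting choice
        curve[a + 1] = curve[a] + (np.average(deltas, weights=weights)
                                   if deltas and sum(weights) > 0 else 0.0)
    return max(curve, key=curve.get)

for cutoff in (0, 100, 300):
    print(cutoff, delta_method_peak(rows, cutoff))
```

With this toy data the estimated peak stays at 27 for every cutoff, which is the "robust regardless of reasonable method" pattern the comment describes. The interesting real-world case is when dropout *does* correlate with skill, and the cutoffs start to disagree.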
Or believing that there is so much other, independent evidence that 1000 PA players have earlier peaks that, if that's not what JC found, it must be a problem with the algorithm he used when applied to these players.
Or believing it, but arguing that if "virtually identical" means only slightly lower, that is actually strong evidence of a lower peak when interpreted more closely.
I argue for both of the above in my post, which I linked to in one of the other comments.
OK, fair enough.
I disagree with the "standard practice." I think in this case what JC did is not hazardous and need not be avoided.
BTW, when CAN you extrapolate? I don't get it ... is this specific to curvilinear regression, or to any best-fit line?
Should I just stop worrying about global warming because the future is outside the data set?
Or, if my flesh-eating disease seems to be spreading from my toes to my ankles to my knees, should I avoid worrying that it's going to spread to the rest of me?
Snarkiness not meant for the commenters here, but to the textbook that preaches blind adherence to a rule of thumb.
I disagree with JC on this point. A couple of months ago, I argued that the results of lowering the PA to 1000 actually DO (if properly interpreted) constitute evidence that peak age is closer to 27 than 29. This is using a methodology similar to JC's (although not quite as rigorous).
That is: like MGL, I believe that JC's finding that hitters peak at 29 is completely due to his selective sampling of long-tenured players. That's even accepting his method itself without qualification.
I know that JC disagrees with my conclusions on this point, as he disagrees with MGL's. Either MGL and I are wrong, or JC is wrong.
Or, maybe one of us is 90% wrong and 10% right; or maybe we're each half right and half wrong; or maybe we're all full of crap. If you're interested, take a look and judge for yourself.
All my comments on JC's study are here. For the 1000 PA case in particular, look for the "part II" post.
I think on this point, JC is correct. Imagine throwing a ball at a 45 degree angle in a vacuum. If you measure the ball's position at three points -- say after 1, 2, and 3 seconds -- you can perfectly predict its entire path, and where it's going to peak, no matter how long it takes to come back to earth. It makes no sense to say, "hey, you can't extrapolate your estimate past three seconds," because in this case, as a matter of physics, you CAN, and precisely.
If your model is correct, there's no problem at all evaluating the estimated curve outside the sample range. Of course, the quadratic model is an approximation, and it's possible to argue that it's not valid at the extremes, like age 23.6. But that's hard to justify when it's so close to the sample, agreeing that it's OK at 24.0 but arguing that at 23.6, well, it's just too far off. At least not without a serious argument about why those 0.4 years make such a big difference.
"You should refuse to consider values of the function even 4% outside your sample domain" is like, "you should wear a helmet even if you're only walking to the backyard in case a meteor hits." I think you have to use a bit of common sense here.
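The ball analogy above can be checked numerically: three exact measurements of a quadratic path pin down the whole curve, including a peak that lies well outside the sampled range. The numbers (30 m/s launch speed, g = 9.8) are just illustrative:

```python
import numpy as np

g = 9.8                                  # gravity, m/s^2
v0 = 30.0                                # launch speed, m/s (illustrative)
vy = v0 * np.sin(np.radians(45))         # vertical velocity component

def height(t):
    return vy * t - 0.5 * g * t ** 2     # true path in a vacuum

# Sample the height at three early points, all well before the peak...
t_sample = np.array([0.3, 0.6, 0.9])
a, b, c = np.polyfit(t_sample, height(t_sample), 2)

# ...and the fitted quadratic's vertex recovers the true peak time,
# even though it lies outside the sampled 0.3-0.9 second window.
peak_t = -b / (2 * a)
print(f"fitted peak: {peak_t:.4f} s, true peak: {vy / g:.4f} s")
```

The catch, of course, is the premise "if your model is correct": the physics here is exactly quadratic, while an aging curve is only approximately so, which is the whole disagreement about how far outside the sample it remains trustworthy.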
Exactly! ProTrade's comment is exactly what MGL and others think is the problem with JC's study, in a two sentence nutshell.
Despite its negative rating. :)