The most shifty team in 2012 was managed by Joe Maddon. The least shifty team in 2016 was managed by... Joe Maddon.
Not directly, though K and <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BB" onmouseover="doTooltip(event, jpfl_getStat('BB'))" onmouseout="hideTip()">BB</a></span> park effects tend to be small, and I did test (and left on the cutting room floor) whether the effects were there specifically for team switchers and the results came out the same.
They are the same teammates, but we're looking at year-to-year change. So, if a GM likes guys who strike out a lot, they probably already were K-hogs, and therefore, their change scores aren't going to be that big.
Probably not, because the CF has to cover a lot more ground than LF and RF going backward, and it would be good to have a rangy guy who could cover that area.
The problem has never been "character doesn't matter." The problem is "We have no data on character!"
I did write that! In fact, I was going to fold that into the discussion, but it got to be deadline time and I was tired...
Not as of 2016. I would imagine if we were to see another position start in this direction, it would be right field. I very much doubt it would ever make its way down to shortstop.
As to the mid-inning issue, I actually ran a separate set of analyses (which got left on the cutting room floor) where I specifically controlled for whether there were runners on base. (A pitcher brought in mid-inning is probably out there because the guy before him was giving up a lot of base runners.) The findings still held.
As to the rest of it, if they have different approaches, it's not showing up (in the aggregate) in their results. Perhaps they are conceptualizing/thinking/maybe even pitching the situation differently, but if they get the same outcomes...
Your reading is a little off. I'm saying that the pitchers themselves get the results consistent with their talent (and the talent of the batters that they face) in the 7th as they do in the 8th. If you're an amazing reliever in the 7th, you're an amazing reliever in the 8th. If you are awful in the 7th, you're still awful in the 8th.
Let's say you have two pitchers who are going to have to pitch today anyway, and the better one is "used to" pitching in the 8th. The problem is that in the 7th, the other team has the heart of the order coming up. Why not have the guy who's actually good come into the 7th, rather than follow a rigid progression?
Well, the Yankees _were_ up 3-0 in that series...
They did use B-H, but when you do adjustments like that, it doesn't eliminate false positives. It just means that you're being a bit more careful. The problem is that their conclusion that "jet lag has an effect" seems to be based on "We got _some_ significant findings, so they must mean something!" rather than "We got some significant findings, although even with our post-hoc corrections against Type I, there's still going to be a few false positives in here, and even if these are all real findings, they don't tell an actual baseball narrative that makes any sense."
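For anyone who wants to see mechanically why a B-H adjustment is "more careful" rather than a guarantee, here is a minimal Python sketch of the standard Benjamini-Hochberg step-up procedure. It is not the code from the study being discussed; the function name, the `q` level, and the example p-values are all made up for illustration. The procedure caps the expected share of false positives among the flagged findings at `q`, so some of the "significant" results can still be false.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where a finding is flagged at FDR level q."""
    m = len(p_values)
    # Rank the tests from smallest to largest p-value, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Everything ranked at or below max_k is flagged.
    flagged = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            flagged[idx] = True
    return flagged


# Hypothetical p-values from ten separate tests (not real data).
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example))
```

With these made-up numbers only the two smallest p-values survive, even though five of the ten are under the usual 0.05 cutoff. The survivors are still only "probably real," which is the point of the comment above.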
Also, I looked at a suite of outcome events (single, XBH, <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</a></span>, K, <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BB" onmouseover="doTooltip(event, jpfl_getStat('BB'))" onmouseout="hideTip()">BB</a></span>, etc.) and nothing really shook out significantly. There's just not a lot of LWTS to be had here.
The fact that the matchup data are reliable didn't surprise me, but I did want to check. I wondered whether the fact that some of the datapoints in there were collected 7-8 years apart would mean that over time, using the pitcher-batter as the constant factor that tied the data together, rather than the temporal factor (i.e., 2016 is the common factor) would make a difference. Turns out that the answer is no. It just says that matchup data, after a while, provide a reasonably coherent measure. If it hadn't, I wouldn't have run Part 2.
As to the second inquiry, what surprised me was the variance partitioning. "This year" still beat "matchup history" by a 2:1 ratio, but that's out of alignment with the usual guidance on this work. As you suggest, the answer might just be "Smith does well against <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=GB" onmouseover="doTooltip(event, jpfl_getStat('GB'))" onmouseout="hideTip()">GB</a></span> pitchers and Jones is a GB pitcher" and maybe that's the answer for 80 percent of cases. That's fine and I'm happy to see more work in that area. (You're quite right I haven't gotten there yet!) Personally, I walked away from this one with "Gee, I really should be taking matchup data more seriously." If others walk away with that as well, I am a happy man.
It's likely that the two measures are going to be strongly correlated, but there will be cases when they aren't, and those are the cases that we want to adjudicate. The regression says that both measures provide reliable and unique information, so even if they are strongly correlated, there's enough variance that is not shared _and_ which correlates with the outcome variable (what happens in this at bat). I think the more important takeaway from this is that we need to re-think our basic approach to how those data are valued strategically.
It's possible that the answer is that "Smith just sees lefties really well and Jones is a lefty." And that's something we know is out there, and maybe it's the answer in this specific case. I wonder though if there are more answers than just "he's got a platoon split."
Not necessarily, because the control variable/base expectation is fitted off of units. The regression knows that <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=67119">Andrelton Simmons</a></span> is out there and adjusts accordingly.
Yeah, this controls for baseline talent. You can be overall amazing, but the slope points downward as the season goes on or you can be terrible, but at least maintain that level of terrible throughout the season. The point is that the power to adjust that slope has value as well as the mean.
I was going to make my annual plaintive cry for more <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=68495">Stetson Allie</a></span> coverage, but he's gone. :(
It isn't just the 47 vs. 53 here that we have to contend with. That's just "is a critical seventh inning situation likely to be the most important situation of the game?" You also have to consider the fact that the manager has to sniff out that there might be a key situation that's coming up, in advance of when it appears, and we've seen that those don't usually give a lot of warning.
I wouldn't rule it out. One thing that I thought of was that maybe having the closer start getting ready at the beginning of the eighth inning in a one-run game might make sense. If the situation gets hairy, he's ready. If not, he's coming into the ninth anyway.
You are correct that LI doesn't consider any of that, but rarely do our analyses where we whine about managers include that either!
Some of them don't. Some of them, you just click on <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=49617">Andrew Miller</a></span> and suddenly, he appears on the mound ready to go!
I had that thought too, although didn't write about it for length concerns. I agree that it is a bias and that it is a small bias.
We've seen in the past that "save situation" conversion rates aren't all that different for teams in the seventh inning as the ninth. The ninth inning guys are better, but not by enough to really move the needle on this one.
And who can pitch like <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=49617">Andrew Miller</a></span>.
You can actually thank Harry for that one. When I ran by him that I wanted to use him as a bit of comic foil, I figured that I'd at least give him the right to pick his own entrance music. (Awesome pick on his part.)
This is my favorite piece of the year, every year.
There's plenty of incentive to make sure that issues like race and language don't get in the way of a good team, both from a "racism sucks" point of view and a competition point of view. A lot of the response in the clubhouse is self-organizing, and that's cool. It helps to have guys who are mature enough to realize that hey, we all need to work together. I don't know how much teams put their thumb on the scale (in either direction... they might be doing all they can.)
Perhaps forgive the guy writing at 11 at night after putting three kids to bed a little math error?
I doubt it because the 4-5 percent number is based on community surveys. They go out and actually test people for ADHD, including people who have been formally diagnosed and people who haven't. It's possible that people with ADHD self-select into being baseball players, although that doesn't seem likely.
It's not awful. It's better than using the All-Star game. Doing it "the right way" using some sort of context-adjusted records would be too Byzantine to explain. It will probably be right more often than it is wrong, but then again using alternating years (and by extension, saying that we don't care about record) means never being wrong (or right).
There were 20-something players who notched a 5 win season (or better) last year (pending which WAR version you use). There will probably be a similar number next year, as there were last year. We have a decent idea of who those players will be. Sure, not all of them are on the free agent market (or even FA eligible). But, if we assume $8M per win, we should be seeing a few $40M per year salaries. Even going back a few years when we expected "only" $7M per win, we never saw a $35M salary. Some of that was risk tacked onto the back end of 6 year deals, but there is a compression effect at the top, and I think that elite closers are in the borderland where that compression effect works against them.
For relievers, WAR generally applies a leverage adjustment based on entrance LI. Basically, you calculate their context neutral WAR and then multiply that by a factor that is halfway between their entrance LI and 1. I've examined that previously and I think we can do a better job methodologically, but the answers that I got mirrored what that method was doing anyway. This was a third attempt to try to figure out closer value. Again, I think the upper bound is that a super-elite closer is worth 3-4 wins.
Pitchers have a separate accounting for the framing runs that their catchers so generously provide them (or take away from them)
All in good time.
Sylvester Codmeyer III likes this.
The Puckett reference followed the comparison of the effects of steroids to having a 5 mph wind at your back. As an aside, I recalled the days when the Twins would turn on the air conditioning during close games when the Twins were up to actually give them a little extra "wind" in the Metrodome (click through on the linked story). I went with the first two Twins-iest names I could think of from that era as exemplars.
I see where you're going with that, but I was more interested in the question on an individual basis (the HOF angle). The problem there is that to go back in time like that, I'd have to go with less reliable stats (<span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</a></span> rate and <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=ISO" onmouseover="doTooltip(event, jpfl_getStat('ISO'))" onmouseout="hideTip()">ISO</a></span> are the obvious ones, I could probably dream a few more up.) There's going to be some noise built into those. I'm not opposed to doing it, it was just not the angle I wanted to take on this one.
I suppose I should be thankful for the luxury of not having to file a ballot!
I can respect a reasonable and well-thought-out argument on either side of the moral issue, and I think you're right in that there's a qualitative difference between those who did it when it wasn't _officially_ against the rules vs. those who did it when it was most definitely against the rules (and they got caught!) There will be a bloc of voters (and probably more than 25 percent of them, which means he doesn't get in) who will be #NeverARod folks based entirely on his admission of use. I'm interested to see if his candidacy fares better than that of Bonds, who has been suspected of PED use, though he denies any use.
I'm just here for the Matt Christopher name-drop.
Should be SS
That's in there. In the first set of analyses, everyone is facing a new pitcher whom they did not see the last time they were up. The key variable is whether that pitcher is of the same or a different handedness. Doesn't make much difference.
Me too, Ben...
Yeah, a lot of the critiques that we've made can be made of other businesses/industries. And you are correct that a lot of companies lead with the technical skills and I think it is for that reason. But it seems like shrugging about the issue gives up a chance to really make some hay. Very valuable hay.
In a business that is super-competitive and where a new idea can literally be worth millions, is it a good idea to stick with a system that might be screening some of those ideas and the people who have them out before they even have a chance to be heard? Maybe there eventually becomes a capacity problem in the ability to process all of those ideas, but I wouldn't want to filter those ideas before they got to me and I could decide whether they were decent or not.
Thanks so much!
That is accurate if you aren't going to the PEBO job fair, but it will certainly cost you airfare to wherever the meetings are (DC this year), and hotel and transit when you get there. That's a couple hundred bucks right there. You also need to have the ability to randomly take a couple days in December away from work. That's a lot of money to interview for a job that might not pay enough to cover your living expenses. (Or anything.)
Some of that might be the way that we defined things. We looked at formally titled front office workers. There may be former players who are playing key advisory roles, but not under the usual titles that have normally defined the front office. There are several front office people who tend to be former players, many of them Hall of Famers, who are listed as "special assistants to the GM" and we did not include them, because there are a lot of them who are there more for ceremonial reasons than anything. That could be tilting things a bit.
As to when it began, we don't have systematic data on that.
Well, given that I helped to design that particular piece of the puzzle, I happen to be a fan. It's not the perfect design, but it's better than nothing and it actually worked to get some data, mostly because it was simple. It could be used in MLB, and if anyone out there on the inside is willing to help, then certainly, I'm game.
I think there's an interesting distinction in your question. There's a difference between making decisions based on intuition and making a decision which is primarily focused on maintaining chemistry in the clubhouse.
The decision to focus on the chemistry might lead to what looks like a sub-optimal decision on the field in that moment, but a manager can say that he's playing the long game and that the chemistry is going to be more important down the road. We don't have the tools to fully research that right now, but we might at some point. At that point, it becomes a data-driven decision.
Sadly, I think "intuition" is often a codeword for "path of least resistance" or "guessing."
"We're almost entirely the same person we were when we were 6, just with more facts and different incentives."
This basically sums up the seven years that I spent in grad school.
This might be the smartest thing I've read all year.
There's no way to control for pitcher/park/etc. with the publicly available data. If someone has better data and can control for all of that, I am happy to read that. I used <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BABIP" onmouseover="doTooltip(event, jpfl_getStat('BABIP'))" onmouseout="hideTip()">BABIP</a></span> (and total bases, which is basically just SLGcon) entirely for the reason that it's the only thing available. I also strongly suspect that there is an effect on <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</a></span>, <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=BB" onmouseover="doTooltip(event, jpfl_getStat('BB'))" onmouseout="hideTip()">BB</a></span>, and K and I suspect that the effect is negative, but I can't prove it. I would love to include something more comprehensive (wOBA would be fine), but it isn't out there for me to play with.
I'm putting this all out there not fully formed because the information that we do have says something that doesn't connect with the generally accepted belief. There are plenty of ways that I could be proven wrong, and if someone can prove me wrong, then I'm wrong. I haven't the slightest bit of fear of that. I'm a married man, so it wouldn't be the first time.
There are data out there that could answer the question, but they aren't public. So, effectively, we have to sit here partially in the dark. The only data that are available are shift splits -- based on BIS data -- that Fangraphs have put up. They only cover balls in play.
But let's take stock of what we do have on hand. We know that BABIP is at least part of the equation (and an important part at that), and for that part, the effects of the shift are quite minimal (again, in the aggregate... I don't suggest completely abandoning The Shift, but that does suggest that it's over-used). Within what restrictions we have to work with, it is at least un-settling to what had been considered a settled question, and I think that there's value in that.
It could be that teams use the shift behind more <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=GB" onmouseover="doTooltip(event, jpfl_getStat('GB'))" onmouseout="hideTip()">GB</a></span> happy pitchers, which could explain the increased GB rates...
I guess it does implicitly make that assumption. I have no way of controlling for that or knowing if it's true.
It's a frustrating walk, because I don't have the super-secret data to look at. (If I did, I'd probably be forbidden from talking about it publicly anyway... I don't.) I wouldn't say that just because the "smart" teams are shifting more means much. Having a lot of data or analysts doesn't make you smart. If you're asking the question wrong, it makes you a wrong person with a lot of data.
There do seem to be minor effects. But yeah, I think fans under-estimate how resilient these guys are.
Hadn't thought of that angle. Makes sense. Personally, I'm more interested in the non-differences between the 7th, 8th, and 9th inning comebacks. My original title was "Tis Better to Have Loved and Lost... in the Eighth?"
Oh how I wish we had this sort of data stream going back 10 years or so. We'll never be able to prove it for Mo specifically, but I'm guessing that we'll be able to nail down some of the "DIPS beaters" who are currently active and that inducing weaker contact will be a part of the puzzle.
Oh yeah, that. I have the strange distinction that I have never actually watched an episode of Friends. True story.
I delimited the sample so that everyone had at least 300 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PA" onmouseover="doTooltip(event, jpfl_getStat('PA'))" onmouseout="hideTip()">PA</a></span>, so it's not a problem with the sample itself. However, there could be a problem in who that leaves out... but I'm not sure how that would operate...
I ran them chronologically because that's how we get them in real time. I'm guessing that upfront, we have the issue that's been identified, but as you go further into the season, it smoothes out.
That makes sense.
It really doesn't. Those epidemiological rates (4.4% of adults and 5.4% of males) are based on community survey methodologies. Essentially, you take a cross-section of people from the general public and assess them for ADHD (among other things.) Some of them have probably already been identified and treated. Some haven't. But the base rate in the population as a whole is 4-5 percent.
If 5-6 percent of MLB players had a TUE, then I'd feel comfortable with the thought that this was just a reflection of the population base rate. It's more likely that baseball provided an incentive to get everyone who did have ADHD to be properly screened and -- if needed -- medicated. But the fact that it's 9 percent means that either something fishy is happening or that baseball is drawing in more people with ADHD than the general population, that these people are more likely to have been assessed, but that ADHD -- which can still cause issues even when it's been treated -- doesn't negatively impact the ability to play baseball at a high level. (Or that this is an amazing statistical fluke.)
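To put a rough number on the "amazing statistical fluke" option, here is a small Python sketch of the binomial tail probability. The 5.4% male base rate and the 9 percent figure come from the comments above. The pool of 1,200 players is a hypothetical round number chosen only for illustration, not an actual count of players covered by the TUE figures.

```python
from math import exp, lgamma, log


def log_binom_pmf(n, i, p):
    """Log of P(X = i) for X ~ Binomial(n, p), kept in log space to avoid overflow."""
    log_choose = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_choose + i * log(p) + (n - i) * log(1 - p)


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(exp(log_binom_pmf(n, i, p)) for i in range(k, n + 1))


N_PLAYERS = 1200       # hypothetical player pool, an assumption for illustration
BASE_RATE = 0.054      # community-survey rate for males, from the comment
OBSERVED_RATE = 0.09   # share with a TUE, from the comment

k = round(N_PLAYERS * OBSERVED_RATE)
tail = binomial_tail(N_PLAYERS, k, BASE_RATE)
print(f"P(at least {k} of {N_PLAYERS} at a {BASE_RATE:.1%} base rate) = {tail:.2e}")
```

Under these assumptions the chance of seeing 9 percent by luck alone is tiny, which is why the fluke explanation sits in parentheses at the end. The calculation says nothing about which of the other explanations is the right one.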
<span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=31282">Paul Zuvella</a></span>.
Well, there's a certain amount of gamesmanship there. Batters are generally aware of their own weaknesses. Before the game, the hitting coach might say "Look, they're going to try to attack you low and away. Be ready for that." Of course, the pitcher has to pick something to throw, but constantly going to the thing that the batter knows is coming eventually probably isn't a sound strategy.
But rocks are fun!
Amazingly sneaky since according to the scoreboard, he's also on deck.
Myers-Briggs is a four-letter word.
Like any tax, it would discourage consumption.
The market based solution would be to tap the players in those countries on the shoulders and say "Hey, look at what players in the US draft get..." or point out some of the numbers that have been paid to KBO and Japan or Cuban defectors and tell them "You could ask for a lot more and they'd probably say yes."
It's a problem because there's a real chance of one of the top 750 available baseball players not playing in MLB for no really good reason.
And I think it's worth considering from Desmond's point of view. He's earned the right to go to free agency, but he gets no say in whether he gets a QO or not. Once the Nats make that offer, he's got two options. He can accept and stay in DC and have to live with the uncertainty of a one-year deal or he can go out into the free agent market knowing that it will hamper his earning potential. It seems like a dent in the idea of free agency that Desmond gets for being decent enough to "qualify."
It's probably more of a problem from the player's perspective than the team's, but the players are one side of the CBA.
In fairness, this is a bit of a selective sampling nightmare. When I insisted that the model only look at guys who made it through at least 27 batters, well... that's a sample of guys who were having a good enough day to be worth leaving out there for hitter #27. But take off that limit and you get guys who are pulled after hitter #24 specifically because they just gave up 3 straight hits... perhaps they would have been OK facing #25-27, but we'll never know that...
When I say "the technology", I mean that we may or may not know enough to be able to pinpoint when a specific pitcher is "too tired" in some mathematically precise way, at least in-game. I think there's something to be said for a risk management strategy though that says that we know enough to know when that point is still in the future, and if we understand that getting to that point can be disastrous, it's possible to design the team's roster (boost the bullpen) to avoid it happening.
Could you re-do that prediction and lie to me instead?
<span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=68495">Stetson Allie</a></span>. Is he gonna make it? At this point, I'll take one <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=AB" onmouseover="doTooltip(event, jpfl_getStat('AB'))" onmouseout="hideTip()">AB</a></span> for him. He went to my high school and my high school has never had a Major Leaguer.
I think if there's a hole in this, it might be that. I tipped the cap to it in the article, but I wish I would have had time to go deeper on this very issue.
Well, a catcher doesn't need the same level of precision in mentally tracking that pitch that an umpire does to do his job. All the catcher really has to do is get his glove in front of the ball so that it doesn't fly to the backstop. The umpire has to be able to tell whether -- in mid-flight -- it ended up in an imaginary box and often when the pitcher is specifically throwing a pitch that is on the very edge of that imaginary line. If the catcher's glove is two inches to the right, it doesn't much matter. If the umpire is missing calls by two inches, he'll be given a coupon for the nearest optometrist. For a catcher, it's much more about how he positions his body when he receives the ball.
My concern is that the type of pitch which is most benefitted by a good framer is the pitch which paints the edge. Now, it's perfectly reasonable to play "paint the edge" as a pitcher. That's one way to get hitters out, especially if you are good at it. But it's also a higher risk pitch. Of course, throwing further away from the strike zone is likely to suppress <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</a></span> rates, but the problem is that you always run the risk that the ball doesn't do what you think it's about to do. Pitchers screw up every now and then, and having a ball break 4 inches instead of the 5 you were hoping for can be disastrous when you are playing that game. Instead, a pitcher could opt to play "Let's see if he'll chase this one low and outside." A good framer isn't going to affect that one in either direction but if you miss an inch of movement, you've just thrown a ball that was slightly less outside than you had planned.
My concern is that having a good framer behind the plate tempts the pitcher into playing the more dangerous game more often. Combine that with an optimism bias. It's easy to say "If I nail the location on this pitch, Lucroy will frame it and it's strike 2!" without thinking about "If I screw this up, I'll give up a home run."
The idea is that the "problem" that "causes" framing effects is a base neurological problem. The human eye can't see exactly where a ball going 90ish MPH crosses a small area (home plate) and resolve that in three dimensions. The human brain just can't resolve images that quickly. The umpire then uses other clues as to what happened. If the catcher's glove is moving away from the plate, the brain naturally uses that information to say "I guess it was tailing away from the plate." If your best guess was that it was on the edge of the zone anyway and your brain is already thinking it must have been moving away, then you are more likely to call a ball. This isn't something that's done consciously. All of this takes place much too fast to involve conscious thought.
The problem is more that maybe he should have thrown that pitch low and away and nowhere near the plate. Pitching on the edge is great if it works. If your ball doesn't get as much break as you hope for, it will hang over the plate and politely be deposited in the left field seats.
The regression coefficient was pointing upward, though not significant. It's in the strange area where we have to say "We can't statistically distinguish it from zero, but what small effect there might be is more likely to mean more home runs than fewer home runs."
I think your alternate explanation is reasonable, although the regression equation already "knows" the pitcher's baseline <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=HR" onmouseover="doTooltip(event, jpfl_getStat('HR'))" onmouseout="hideTip()">HR</a></span> rate, and the only thing that's really varying in these equations is the framing capabilities of the catcher.
The sinister piece is that the pitcher is giving up fewer fly balls, but the same number of HR. We generally just assume <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=GB" onmouseover="doTooltip(event, jpfl_getStat('GB'))" onmouseout="hideTip()">GB</a></span> = good for pitchers, and if it were just a random sampling of fly balls turned into grounders, then that's fine. But the point is that it doesn't seem to be a random sample. And I think the Peltzman Effect is at least a reasonable hypothesis as to why.
Last year's WAR is a pretty good estimation for what he will do this year. Not perfect, of course, but frankly that's what most projections systems spit out anyway.
My point was not to show long-term projection, but to show that in the first year or two of a deal, where we can reasonably expect a player to hold relatively steady, that players at the top of the food chain are getting much less than the rest of the market. Long term, I wouldn't bet on last year's numbers, but in the short term, it makes sense.
I think that the case for framing is that it's taking advantage of a neuropsychological weakness in umpires (and all humans). Technically, it's a strike based strictly on where it crosses home plate, but of course, it's doing that at a speed of 80-100 mph while curving and moving and the umpire trying to figure out if a pitch was either half an inch to the outside or inside of an imaginary line in the air. That's a hard task for any human being to do, and so umpires naturally use other information. If they see the glove moving away from the plate, they -- whether consciously or unconsciously -- make the assumption that the ball was breaking that way.
Could umpires adjust? Maybe. Being aware of these biases might be a good place to start, but a lot of them operate at an unconscious, near-automatic level. The brain fills in information based on conjecture all the time.
The other option would be robot umps.
Yes. The regression is aware of the pitcher's quality.
This is controlled for in my analyses. The regression "knows" that a somewhat weaker hitter is in the batter's box.
It isn't that WAR strips out LI. It's that it does a really bad job of adjusting for LI. It basically looks at the average leverage index that the pitcher faced and goes halfway. So, if the reliever faced an average LI of 1.5, it multiplies his WAR by 1.25 (halfway between 1.5 and a league-average LI of 1.0).
Closers really have two jobs. There are the save situations that they are actually paid for, and then they soak up an inning or so here and there because the manager needs an arm on the mound and he was available. (These are the 'Why is Kimbrel out there if we're losing 13-4' innings.) They just happened to be the short-straw/need to pitch guy in that case. Teams do not care how they perform in these situations, but LI records it as "He was in the game with an LI of 0.02, and we're going to weight that equally with the 5-out, came into the 8th inning with 2 runners on, up by 1 save he had last week."
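The halfway adjustment and the equal-weighting complaint can both be shown in a few lines of Python. This is only a sketch of the arithmetic as described in these comments, not any site's actual WAR code. The function names, the 1.0 WAR figure, and the save-situation LI of 2.0 are made up for illustration; the 1.5 and 0.02 values are the ones used above.

```python
def leverage_multiplier(avg_li, league_avg_li=1.0):
    """Halfway point between the reliever's average LI and the league-average LI."""
    return (avg_li + league_avg_li) / 2


def leveraged_war(context_neutral_war, entrance_lis):
    """Scale context-neutral WAR by the halfway multiplier.

    Every appearance counts equally in the average, whether it was a save
    situation or a mop-up inning.
    """
    avg_li = sum(entrance_lis) / len(entrance_lis)
    return context_neutral_war * leverage_multiplier(avg_li)


print(leverage_multiplier(1.5))  # 1.25, the example from the comment

# Hypothetical closer: three save situations at LI 2.0, with and without
# one 13-4 mop-up inning at LI 0.02.
saves_only = leveraged_war(1.0, [2.0, 2.0, 2.0])
with_mop_up = leveraged_war(1.0, [2.0, 2.0, 2.0, 0.02])
print(saves_only, with_mop_up)
```

The single mop-up inning drags the multiplier down even though, as the comment says, the team did not care how he pitched in it.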
I'm officially a double-adult.
WAR specifically seeks to strip out the context from everything (including leverage) in order to compare everyone to the same baseline regardless of team context and the role in which they were used. That has its uses. But the entire point of "closer" is that the role is based entirely on context. You go in during very specific situations. It isn't that you evaluate the player differently, it's that you evaluate the effect that this player will have in that magnified role differently.
It isn't that it makes sense. It's that sometimes you sign a guy who you think will give you 65 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=IP" onmouseover="doTooltip(event, jpfl_getStat('IP'))" onmouseout="hideTip()">IP</a></span> of 1.95 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=ERA" onmouseover="doTooltip(event, jpfl_getStat('ERA'))" onmouseout="hideTip()">ERA</a></span>, and maybe that's his true talent, but sometimes, randomness is not your friend. It makes sense to sign that guy for $10M if you're fairly well convinced that's his true talent, but you have to live with the risk that he might lay an egg when he actually gets on the mound.
This morning, after I wrote that, I thought, I should have said "Fiddlesticks Utterance in the Context of a Strikeout." Much better joke.
He’s a good player (6).
6. B.J. Upton by the Braves
I had a similar idea run through my head. I might do this at another time.
This is brilliant, even in its non-finding.
They're both from Florida! The <span class="teamdef"><a href="http://www.baseballprospectus.com/team_audit.php?team=CLE" target="blank">Cleveland Indians</a></span> play a "rivalry" series against the <span class="teamdef"><a href="http://www.baseballprospectus.com/team_audit.php?team=CIN" target="blank">Cincinnati Reds</a></span> every year for the same reason despite the fact that it takes 4 hours to drive from Cleveland to Cincy and only 2 to drive to Pittsburgh.
Tom, it would be a link to a bunch of regression output that showed non-significant finding after non-significant finding. There's only so many times someone can read p = .872 until they glaze over. There just wasn't anything there to show.
The Rockies did it a few years ago (4 man rotation, 75 pitch limit...) Evaluating anything to do with pitching in Colorado is dicey.
The Rays have had some success with a model emphasizing a starter going 18 batters (and then out!) this past year.
I don't know the answer to that one yet.
I'd say that Taylor Swift was a guilty pleasure of mine, but you have to feel guilty about it first.
It isn't so much that I blame myself for having such awful predictions. I could have had stellar methodology, caught some bad breaks, and just ended up with some duds. I think just about everyone had the Nats as the NL East winner, and... well, they didn't. And that was a perfectly reasonable pick (it was mine!)
The problem is that I can't just chalk all of it up to bad luck. Some of it was just bad method.
I think it's quantifiable using different methods than we usually see in Sabermetrics. I've written about the problem of n = 1 research. Might come in handy here.
As a Cleveland native and an Indians fan in the 90s myself, this is a reasonable question. I wonder if <span class="playerdef"><a href="http://www.baseballprospectus.com/player_search.php?search_name=Dennis+Martinez">Dennis Martinez</a></span> or <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=17155">Orel Hershiser</a></span> would care to comment!
True story. I saw the title of this article on the main page and clicked on it, without looking to see who wrote it. It took me a few paragraphs into it to realize that... I wrote it.
Working on it!
Whoops... I got my strategies mixed up. This is why my 6-year-old always beats me. Best strategy is to take one of the four "middle" squares.
I thought about that, but I wanted this to be a guy who was around all season, and wasn't just some waiver wire guy that a team picked up on a two-week rental just to patch a hole. I figured there would be more of those than guys like Giambi.
You're of course right that we could simply ask, but even assuming that I could do that leg work, I'd have to ask under cover of anonymity... there would be a strategic advantage in withholding that information.
Part of the beauty of Sabermetrics more generally is that it doesn't require you to ask questions where the person has a reason to lie. You just observe the behavior and analyze that. The next step would probably just be looking at pitch selection based on infield defense. Maybe we could identify guys that way.
There's actually evidence that two good defenders next to each other makes them both a little worse! http://www.baseballprospectus.com/article.php?articleid=21215
And at this point... I'll defer to anyone in the audience who knows the answer to that! I know that I've never seen a genotyping like that done prior to a prescription. My guess is that if the drug can't be processed, the family would simply see no improvement and the prescriber might turn to a different treatment approach (using antihypertensives, for example).
You are correct. I was more referring to the fact that ADHD is something that you manage, rather than cure.
Many of the behavioral strategies that I (used to) teach around ADHD would be very applicable for a child with sensory issues (especially sensory seeking). In that sense, there is some crossover. On a personal note, I'm happy to hear that if your son is going to have to work around SPI, he was diagnosed early. I didn't even hear about SPI until I was in my mid-20s. It explained a lot of things from my childhood!
Well, as a point of reference, I have sensory processing issues and do not have ADHD. It's possible to have both, but the areas of the brain affected (as best as we understand them) are different for the two diagnoses. For those who have a constant need for sensory stimulation, it often comes across as hyperactivity and lack of focus. This is why I always encourage people to have a really good diagnostic test for ADHD.
I agree that it would be hard for a sensory seeker to play baseball. I'm sure that most just give up before reaching the majors, but maybe there's someone who has enough <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=%23want" onmouseover="doTooltip(event, jpfl_getStat('#want'))" onmouseout="hideTip()">#want</a></span> and figures out a way to work around it. It could happen.
My experience has been mostly that people who don't want medication do so out of some sort of principle. Some for religious/philosophical reasons, some because they just don't like the idea of taking medications. It varies.
If someone comes in worried about side effects, I can present them with research that shows the rates of side effects compared to the expected improvements, and the evidence shows these medications are pretty good. For me personally, if one of my daughters had ADHD (and I was convinced of the diagnosis), I would ask for a prescription for medication for her.
I can also talk about how if side effects do actually happen on the medication, we can set up a follow-up appointment to figure out what's going on and make a change if necessary. Some people think that if you sign up for medication, you have to take it no matter what and that they're basically on their own afterward. If any doctor practices like that, that's a seriously flawed practice.
Those who oppose medication on principle probably won't change their mind based on the same 10-minute conversation.
Depression, anxiety, trauma reactions, poor sleep or nutrition, sensory processing issues, thyroid problems (and a few other endocrine issues), or just the fact that baseball requires an astounding amount of attention over a lot of long boring periods. Some people are better at that sort of sustained attention than others, even though there's nothing clinically concerning about it.
I guess I'd amend that sentence to include "all else equal." A couple of paragraphs earlier, I talk about how it might not be the case in the specific. Chapman would probably have to lay off the 101 mph stuff and settle for "only" 96. Could he do that and still be effective? Maybe.
I will grant that there are a lot of variables that we can't account for here. My goal was to look at what precious little data we could grab on this question.
I'll grant that this isn't ideal methodology, but it's at least one analog that we have actual data on.
There is risk in Mazara. Dude can hit, no question. Can he field? That's going to affect his value. Still going to be a hell of a player.
I agree that it's a bit of a stretch given the Rangers' current situation. 6 years of Mazara, when he gets to the show in a year or two, vs. a year and a half of Gomez right now is a steep price to pay. But Gomez is one of only a handful of players who can even think about putting up 6 wins and the fact that he'll make 9M next year is criminal. You have to pay a steep price to get that. (If Gomez were on the open market for a one-year deal... what do you suppose he would get?) For an org that has a lot of really expensive contracts that aren't going well, it's the kind of guy that helps to re-balance the books a bit.
It's actually "Mr. Mojo Risin'." If you re-arrange those letters, you get <span class="playerdef"><a href="http://www.baseballprospectus.com/player_search.php?search_name=Jim+Morrison">Jim Morrison</a></span>.
That's a reasonable characterization there. I'm looking at this strictly the way that a robot would. I understand that there are other motivations that play into it, but I find that they are much more fear than reality. There's a kids/Sesame Street book called "The Monster at the End of the Book" in which Grover tries to keep the reader from turning the pages for fear of the monster, but soon discovers that it is just him at the end. I think the same lesson applies.
I get the fear of having a night where the guy you traded away comes back to your park and beats your team, but that's only one loss and it's not as likely that it happens as people think. The lovely thing about baseball is that they play another game the next night, so if you have a night like that, there will be another story tomorrow, one hopefully that has a better ending. As a GM, you have to think rationally. Which strategy is going to get me the greatest likelihood of winning the World Series? There is a "penalty" to be paid for trading within the division, but it's not that big.
There probably is a signaling effect, although by the time you get to where it makes sense to be making a "veteran dump" trade, I think most people who would be buying tickets (or basing their ticket buying decision on the competitiveness of the team) have already figured it out. The trade is just the completion of a disappointing first 4 months of the season.
The shrug was more an attempt at humor based on "Brian Cashman's" emoji fueled trade offer. But basically, yeah. Projectable stuff, but as my M.O., I wanted something a little closer to MLB level, rather than to signal a complete tear-down and rebuild. That may not be the absolute best course of action from a #GoryMath point of view, but I think that impulse does have to be taken into account.
In my head, I was most interested in getting something back that would be fairly close to MLB ready. Maybe that's not the best strategy for maximizing value, but that was where I think the Reds are in this case. Maybe they'll surprise me when it comes to the actual Cueto trade in the next two weeks.
But eeek, Arcia is probably a -20 in left field over a full season, and that's him in his mid-20s. Sure, the power might develop (more) but that's a guy who needs to be in a league with a DH.
Cleveland, actually. Which should explain both my fawning love and snark for Cincinnati. I drove through last week on my way up from Atlanta to visit my parents and stayed in Florence on the Kentucky side of the river.
No one appreciated the Gary Burbank reference?
First time I've ever done it.
I was surprised to see his name in an offer as well. It's a big gamble on going all in for this year. Stroman will come back from his ACL tear, and hopefully will regain form. But he's not useful for 2015 and suddenly, the Blue Jays are the new Royals, with the longest playoff drought in baseball and the constant handicap of playing in the AL East. I can see the motivation for throwing a deal like this out there. I don't know that I would do the same, but I at least get where Rian was going with this.
Thanks. That was the idea.
The only real lie up there was that I used 2012-2016. It was real 2010-2014 data. Everything in this is a real finding.
That is _the_ question, isn't it. And I don't know how to answer it yet. At this point, I feel comfortable saying that the effect isn't zero. I'd also feel comfortable saying that the same "clubhouse" guy might have a different effect in a different clubhouse. Maybe it's time to at least take a stab and try to develop ENZYME or some sort of chemistry measure.
He might also just be a year older.
A reasonable hypothesis. Maybe it's all placebo. Maybe a rookie sees the new veteran guy whose poster he had on his wall growing up and doesn't want to disappoint, despite the veteran guy being completely clueless about it! Maybe it is all just an illusion of positive thinking.
I did run all of the above analyses looking at the number of throws to first (rather than yes/no, which is all I reported). There are additional effects for the additional throws to first. I didn't dive too deep into it though.
Remember that two (or three) batters with Upton at first base counts as two (or three) times.
Oh I think you're spot on. The throw to first is just one tactic that a pitcher can use. There's such wide variation in how often it's used, and that usage doesn't seem to coincide with results, so it's likely something that some guys use as part of their plan and others don't. But, we don't have data on feints.
Jobu is real. As a Cleveland native, I can tell you that.
10 million extra credit points for the <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=18210">Cory Snyder</a></span> reference.
Third base coaches, as a rule, are far too conservative in sending runners.
I don't honestly know enough about how handegg/orangeball do it to really borrow from them. With StatCast, we will have much better measures of where the ball landed and where the players were before the ball was hit (and their reaction time and speed, etc.) That's probably where the frontier is.
This was specifically designed to work in a much lower-information environment.
I've never been involved with how <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=WARP" onmouseover="doTooltip(event, jpfl_getStat('WARP'))" onmouseout="hideTip()">WARP</a></span> is calculated.
I showed that drawing to my wife and we made a bet about how long it would take until someone suggested that those weren't <span class="teamdef"><a href="http://www.baseballprospectus.com/team_audit.php?team=COL" target="blank">Colorado Rockies</a></span>. I had much more faith in you guys. ;)
I actually ran (but did not report on) number of pitches previously seen in this game from this pitcher as a potential predictor, but it never poked its head above significance and never beat out either times through the order or pitch count overall. It got left on the cutting room floor for that reason. Because it would be conflated with overall pitch count and times through the order, it makes sense.
You are correct that there's generally a stair-step pattern to times 1 then 2 then 3 on the outcomes, although the steps that I saw were a little steeper for the transition from 2 to 3 than from 1 to 2. That could just be "fatigue is not linear" or it could be that on the second <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=PA" onmouseover="doTooltip(event, jpfl_getStat('PA'))" onmouseout="hideTip()">PA</a></span>, guys are a little more contact happy, but by the third, they are fully invested in the approach, for whatever reason.
I focused on 2->3 because no one takes the starter out after 9 batters. The tell for me was that it wasn't hitters taking the same approach and just getting better at everything. Strikeouts go down, but outs in play go up. In some sense, we're just changing how the outs are made. On top of that, we have the evidence about the changing of the approach.
It doesn't shock me that baseball players would use a sub-optimal strategy. Isn't that why we all have jobs writing this stuff? ;) You could say that early in the game, guys are being too picky, or they're too invested in the idea of running up the pitch count and they are taking too many pitches. Maybe it's not sub-optimal given where the pitcher is at the time. When the pitcher is fresh, it pays to take bigger, riskier swings and be more picky. When he's more tired, it makes sense to just go up there and hack.
If a team wanted to really go deep into this, they would probably have to adopt a four-man rotation, if only because they'd need more bullpen arms. They might also try something more along the lines of having more relievers who are "longer" rather than the 58 tactical relievers that most pens have now.
There is and there isn't. My analyses control for the fact that the #1 hitter is likely better than the #9 hitter. All of this is relative to seasonal expectations. If there is magic to the 19th batter, or if there is some preventive power in pulling the starter at that point, it's in that it possibly stops the offense from adopting a more potent strategy (more focus on putting the ball in play). There's no specific reason why the offense can't do that at other times, but these numbers suggest that for whatever reason, they don't. So, putting in a reliever would keep them in sub-optimal mode.
There are two possible theories as to what's happening. One is that I do better against the pitcher the third time around because his pitch count is at 80 and he's been working for an hour and a half, vs. back in the first inning when his pitch count was 10. The other is that I've now seen him three times and have a better feel for him. Of course, the two are going to be conflated. The lack of a clear victory for either theory in my initial set of regressions suggests that there's a certain amount of conflation.
What surprised me though was that the results weren't "strikeouts and outs in play go down, everything else goes up." It was strikeouts and walks go down, everything involving the ball being hit into play goes up. The difference isn't that hitters are universally better, just that they are putting the ball into play more often (and are getting better value for doing so).
The reliever issue was telling. The starter vs. reliever was "guys in their third <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=AB" onmouseover="doTooltip(event, jpfl_getStat('AB'))" onmouseout="hideTip()">AB</a></span> of the game", so if it was just a matter that as the game went on, guys got more contact happy, we'd see no difference for starter vs. reliever. But there was one. I guess my argument is that batters treat a third time against a starter differently. In essence, that the batters become (somewhat) different hitters and that's actually what makes them better.
You'd be amazed at what teams are into that they _don't_ talk about.
We could say that his talent level during the last month seemed to depart from his career norm and say it with some relative amount of certainty. But using that to say "and we expect it to stay at X for the rest of the season" is much more shaky. I think the problem is that people treat talent as something that is much more static than it actually is.
I think that there is still some game theory, even though the defense has to "show their hand" first. Let's say that tomorrow, Ortiz starts bunting all the time and has an .800 <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=OBP" onmouseover="doTooltip(event, jpfl_getStat('OBP'))" onmouseout="hideTip()">OBP</a></span> over the next two weeks -- all bunt singles. Teams would do the obvious and not shift any more. (At the very least, they would move the one remaining infielder on the left side to play like a pinched in third baseman and leave the entire left side of the infield virtually unprotected.)
But then Ortiz starts swinging away and enjoying the fruits of not having the second baseman cut off his grounders in short right field. (Or if they play a three on right, third baseman pinched in, he will get some extra hits based on balls that he hits in the general shortstop area.) Teams once again think about going back to the shift.
If Ortiz is mini-maxing everything correctly, he will resume bunting and you're right that game theory doesn't really apply. Then it's just a math equation for the defense as to which alignment allows the fewest runs/hits, and for Ortiz to respond appropriately.
But we've already seen that guys like Ortiz aren't taking the obvious solution in front of them now (bunt!) because "I'm not getting paid to bunt." Ortiz could see the initial flurry of bunts as a way to get rid of the shift. A temporary con, if you will, but he has no intention of becoming a permanent bunt machine based on his view of who he is as a hitter. (It's not efficient, but male pride is not known for its efficiency.) He's banking on the fact that "Now that they see that I can bunt, they won't put the shift on me any more and I can swing away in peace" and hopes that the defense doesn't, in effect, call his bluff. That's where the game theory comes in. Even if Ortiz shows that he has the talent to do it, there's a reasonable chance, based on what we know about how players think about bunting against the shift, that he'll actually employ the mathematically correct decision when it comes time.
I didn't break it down like that here. For a player being played without a shift, if he's known as a threat to try to bunt for a hit, he's probably also a threat to lay down a sac bunt. I'm guessing that the defense will play him pinched in at third either way. For guys bunting against the shift, last year, I found that for hitters who had exactly one bunt attempt (whether it went fair or foul), they were successful in landing it in fair territory roughly half of the time. Presumably if they got one down fair, it would be an "automatic" base hit for them.
I was actually really surprised by how sensitive that decision is to the overall quality of the #8 hitter (and the pitcher less so, but still, it's a big deal). I originally thought this might be one of those things where the evidence was overwhelming that a team should always do one or the other. Not so!
I focused only on innings 1-6, when pinch hitting for the pitcher is rather rare, and only plate appearances in which the pitcher actually appeared after the eighth hitter were considered.
Me either. Plus, Cubs.
The point isn't that it's more/less likely that the Royals would run off a 7-game streak to open the season vs. a 7-game streak at any time. By definition, it's much more impressive to do something you only get one shot to do vs. something you get 156 shots to do.
The question that Rob, Joe, and CJ were rolling around was whether the fact that they won the first seven games had any special predictive power for their season chances over and above the fact that they won 7 games. Is the fact that it's games 1-7 special vs. a streak from games 65-71? The answer says that there's something to the fact that they opened the season that way, although there are other periods of the baseball season which are more predictive.
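For anyone wondering where "156 shots" comes from, one reading (an assumption on my part about the arithmetic, with a made-up function name) is that it counts the distinct 7-game windows in a 162-game season, only one of which starts on Opening Day:

```python
def streak_windows(season_games=162, streak_length=7):
    """Number of distinct starting points for a streak of the given length."""
    return season_games - streak_length + 1


print(streak_windows())  # 156
```

Games 1-7 are one of those windows, and games 65-71 are another.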
Yeah, .5 runs over a season is a rounding error. You can make that up by turning one walk into a strikeout. If there's something to be said, the effect of the rage that Tommy LaStella feels for being put at the bottom of the lineup probably isn't huge either, so maybe we're talking about a rounding error in the other direction, but yeah, this is pretty much a coin flip.
It's an interesting inefficiency argument here. I did find that 8th hitters aren't IBBed as much as we think. Maybe that has something to do with the fact that the other team thinks "Well, it's just the #8 guy." As we saw, the average #8 hitter has an <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=OBP" onmouseover="doTooltip(event, jpfl_getStat('OBP'))" onmouseout="hideTip()">OBP</a></span> twice that of the pitcher, so in a critical situation where the <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=IBB" onmouseover="doTooltip(event, jpfl_getStat('IBB'))" onmouseout="hideTip()">IBB</a></span> is "available" it would make a lot of sense to see more IBBs than we do now. Perhaps managers are responding to the cultural meanings that we attach to being a #8 hitter rather than a cold look at the probability. If the pitcher is hitting 8th, the other team has to look at the option of IBB for the 7th hitter. The further you move the pitcher up the lineup, the more of a contrast there is between the pitcher and the hitter before him (if the pitcher hits 5th, the team would likely walk the cleanup hitter almost every time!)
Joe Maddon has echoed the "second leadoff hitter" thread. I suppose it's all in how you frame it.
That's how often the 8th spot comes up in a situation where a team is trailing, but it's close (I used within 2 runs). The 8-spot will come up at some point in either the 5th, 6th, or 7th, but there's no guarantee that it'll be a close game.
Because even the best players in baseball only avoid making an out 40 percent of the time. Because if you have someone who is honestly that good, he's worth more to you coming up 4 or 5 times in a game, even if you don't get to pick his spots, than he is in one leveraged spot. Because if he's that good, he'll probably be walked as soon as you PH him.
They're trained to be functional bunters, and they get a lot of chances to show that particular skill, but that's not the same as being a good bunter. The point of a sac bunt is that you want to get it far enough away from the catcher that he can't pick it up and throw to second to get the lead runner, but not hit it so hard that it's basically a <span class="statdef"><a href="http://www.baseballprospectus.com/glossary/index.php?search=GB" onmouseover="doTooltip(event, jpfl_getStat('GB'))" onmouseout="hideTip()">GB</a></span> to third or to the pitcher, who can spin around and throw to second. If you can do this and keep it fair, it's not an automatic play at first, but the reasonably adequate 3B can make that charge play and throw to first. It doesn't matter if it was an easy or hard play at first, just that you took away the option of second. There's some room for error there.
Bunting for a hit is another story. To take away the play at first you need to either be really fast (most pitchers aren't) to cut down just on sheer time that the defense has to react or you have to put the ball perfectly to where no one can get there in time, either by dragging it and using the foul line as your friend or a more standard bunt that's placed perfectly in the proverbial Bermuda Triangle between P, C, and 3B. Much harder to do.
If I had the processor power to re-run Ruane's study, I would. This is a fascinating little topic and there's a little bit of room here and there for lineup optimization. I think we might have to settle for basic principles on the order of "Stop hitting the bad hitter leadoff just because he's fast!"
This is a fascinating read here: <span class="playerdef"><a href="http://www.baseballprospectus.com/card/card.php?id=28343">Babe Ruth</a></span>'s batting game log for 1918.
It was the first year that Ruth started playing in the field. The Red Sox had started using him as PH in the years before, but now we see him starting to play 1B and LF early in the season, moving into the cleanup spot, and still taking the occasional turn in the rotation. But look at his game on 5/15/1918. He was the starting pitcher that day, and despite having hit cleanup in his previous few starts (and he would hit cleanup the _next day_ playing LF), Ruth was still banished to the 9th spot. Because he was the pitcher. That didn't last though. Ruth didn't pitch as much during 1918, but when he did, he hit 4th.
The Baseball Musings tool uses a much simpler algorithm. Mine actually simulates innings and knows that after three outs, the inning resets. I also have inputs for speed and some baserunning events as well as double plays. Given Tom Ruane's work, I very much doubt the difference of .158 from one simple lineup change.
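To give a feel for what "simulates innings and knows that after three outs, the inning resets" looks like, here is a toy sketch. It is not my actual simulator: the outcome names, the probabilities you would feed it, and the base-advancement rule (every runner moves up exactly as many bases as the batter on a hit) are all simplifying assumptions for illustration, and it leaves out speed, baserunning events, and double plays entirely.

```python
import random

# Bases taken by the batter on each kind of hit (walks are handled separately).
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def advance(bases, outcome):
    """Return (new_bases, runs_scored) for a walk or a hit.

    bases is [first, second, third] as booleans.
    """
    if outcome == "walk":
        first, second, third = bases
        runs = 1 if (first and second and third) else 0
        # Only forced runners move up on a walk.
        return [True, second or first, third or (first and second)], runs

    n = HIT_BASES[outcome]
    runs = 0
    new_bases = [False, False, False]
    for i, occupied in enumerate(bases):
        if occupied:
            if i + n >= 3:
                runs += 1  # runner reaches home
            else:
                new_bases[i + n] = True
    if n == 4:
        runs += 1  # the batter scores on a home run
    else:
        new_bases[n - 1] = True
    return new_bases, runs


def simulate_game(lineup, rng, innings=9):
    """Simulate one team's offense; lineup is a list of {outcome: probability} dicts."""
    runs = 0
    batter = 0  # the batting order carries over from inning to inning
    for _ in range(innings):
        outs = 0
        bases = [False, False, False]  # after three outs, the inning resets
        while outs < 3:
            probs = lineup[batter % len(lineup)]
            outcome = rng.choices(list(probs), weights=list(probs.values()))[0]
            batter += 1
            if outcome == "out":
                outs += 1
            else:
                bases, scored = advance(bases, outcome)
                runs += scored
    return runs
```

To compare two batting orders, you would run `simulate_game` many thousands of times for each order with something like `random.Random(2015)` as the `rng` and compare the average runs. Every hitter needs a non-zero chance of making an out, or the inning never ends.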
Oh the 1/3 number is probably very dependent on a number of factors. I want to create something that's "good enough" given a low-information environment (going back in time to the era where we only have play-by-play data, for example)
I wonder what would happen if everyone gave me a million dollars. How would that change baseball? Nah, no time to investigate that.
I thought about going there, but because I eventually want to use this to chop up something like WPA, I'd rather leave the umpire out of it. Strictly a personal preference on that one.
Yeah, that's the problem with using Retrosheet data. It relies on stringers to determine whether something is a line drive or a fly ball. The press box theory has been put out there by <a href="http://www.baseballprospectus.com/author/colin_wyers">Colin Wyers</a> (#RIPColin) and I would treat those findings gently.
I'm alive and well.
Also, I was listening to the Vengaboys when I was writing it.
Then you shall have it! I made up the word. Andrew nailed my thought process above. We have a word for the concept of an albatross contract, and so we worry about it. Language is a powerful thing. The fact that we don't have a word for the opposite issue might mean that people don't have a way to even think about the issue, and so they over-value and over-weight what they do have a word for.
It's a good thing that I read this article while sitting in a room filled with clinical psychologists.
When I took her to her first game last year, my wife looked at me and said "You've been planning this since the little strip turned blue, haven't you." Actually, yes.
It's probably not a good thing that my boss thinks I'm on crack.
Well... strikes are much more valuable than people give them credit for.
This is another thing I do want to tear into. At least I know what next week's article would be.
Re: Farrell vs. Black: That's a decent hypothesis and would represent added value. Not sure how to pull that one apart, but a good one to start thinking about.
All of the measures have intra-class correlations in the .60 to .70 range. Pitchers slightly less than hitters. (ICC is a measure of year-to-year reliability, just that it can take into account more than 2 years.)
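For readers who haven't run into ICC before, here is a minimal sketch of one common version, the one-way random-effects ICC, computed from a players-by-seasons table. That this is the exact variant behind the numbers above is an assumption for illustration, and the function name and data layout are made up.

```python
def icc_one_way(rows):
    """One-way random-effects ICC for a table of players (rows) by seasons (columns).

    Assumes every player has the same number of seasons (at least two).
    """
    n = len(rows)       # number of players
    k = len(rows[0])    # seasons per player
    grand_mean = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]

    # Between-player and within-player mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(rows, row_means) for x in r
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical players who post the same rate every year: perfectly reliable.
print(icc_one_way([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))  # 1.0
```

With only two seasons per player this behaves a lot like a year-to-year correlation, which is why the parenthetical above describes it that way. The difference is that it uses all the seasons at once.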
By definition, if a pitcher throws a strike, a batter takes a strike. So, in the aggregate, those rates move completely in tandem. But we know that over time, that aggregate rate tilts a little bit (against the batter) as the season goes on, but that some managers seem to have consistent second-level effects. There could be other covariates which explain some of that, and I'll test for those.
Hoping to look at this very issue next week, although Collins, Sandberg, and Gardenhire were leaders in 2014. Not exactly a list of playoff managers.
Ah, so many things to explore...
The regression doesn't ask "Who was actually managing on day 90?" I picked 90 days because it's roughly halfway through a six month season. I was trying to create a standard set of circumstances to measure everyone equally.
In Sandberg's case, we know that he picked up near the end of the season, when we would expect levels of Grind to be high. We can look at how the Phillies actually performed under his care. In his 2013 case, it's a small sample size (the regression can handle incomplete data, although like any small sample size, it's a little iffy how much you can trust it), but he was good in 2013 and 2014.
Bochy always draws rave reviews for his handling of bullpens. When I get around to that, we'll see how he does. But don't mistake winning the World Series for being good. There's a lot of luck in baseball.
Yeah, something that I've thought of and a line that got left on the "cutting room floor." What I am calling "manager" here is probably some mix of manager/coaches. I might look into this further. What happens when the hitting coach leaves, but the manager stays?
Some of the same methods might apply. I could look at defensive efficiency over the course of a season. Less data to work with, but worth a look.
My daughter is 5. Will you represent her?
The same thought occurred to me. There are several big variables missing in here, including hitter expectations and hitter talent at spotting up different pitches. I'm totally with you on that. There's so much more to do in this series!
I am Russell, so I'll take a stab too. Let me create a parallel universe in which all baseball players are roughly within a win of each other. The distance from the best player to the worst is one win of value and people fall mostly in between. The good players would still get paid, but the small market teams, signing guys toward the bottom of the market, would be getting guys who would be nearly as good, because the talent pool is all smushed together. Now imagine a world where the best players are 100 win players. (I think it's called the NBA.)
We know that the shift affects a certain type of hitter (pull-happy lefties, mostly) more than others. Some are David Ortiz, who will be a star-level player, even with the shift. But some are barely hanging on as replacement level players. Taking away that little bit of value pulls them off the table and they are replaced by someone else. But the guy who got shifted out would have been better in a world without the shift than the guy who now replaces him. It means that the talent distribution gets pulled a little further apart. Even if it's a small amount, that's bad for the small market teams.
I have to wonder how many players never made it to MLB because of off-the-field stuff that was entirely preventable (or at least, addressable).
That would be a nice bonus, but not really required. In this sort of role, it makes more sense to have enough credibility and people skills to get people to talk. I could train someone up to be appropriately adept at recognizing when it was time to refer out in a couple of weeks.
There are many ways to the truth. I am a traveler of but one of them.
I can really only speak for myself on this one (and I have very little to do with PECOTA) but here's why I don't even bother with PED allegations. I just don't care. I know it all looks so sanitized on TV, but it's just not. I can't sit there and wag my finger at someone while drinking my own morning performance enhancing caffeinated beverage.
You can view the individual ballots in the spreadsheet. Matt Sussman is the man who left both off his 10-man ballot, although his "extended" ballot included both men.
Let's assume that there are 100 voters. That means 1000 votes available. 75 would be needed to get in. If everyone colluded perfectly, 13 could get in (1000 / 75 = 13.3, rounded down). Those numbers work if you adjust them for the actual number of voters, because it would all be multiplied by the same factor.
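The ballot arithmetic is easy to check. This is a quick sketch assuming 10 slots per ballot and a 75 percent threshold, as in the comment. The function name `max_inductees` is just an illustrative label.

```python
import math


def max_inductees(voters, slots_per_ballot=10, threshold=0.75):
    """Most candidates who could clear the threshold under perfect collusion."""
    votes_available = voters * slots_per_ballot
    votes_needed = math.ceil(voters * threshold)  # votes one candidate needs
    return votes_available // votes_needed


print(max_inductees(100))  # 1000 // 75
print(max_inductees(400))  # same ratio, larger electorate
```

When 75 percent of the electorate isn't a whole number, rounding the threshold up can move the ratio slightly.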
I briefly considered suggesting this very thing. The mischief maker in me says that this would be so awesome to see someone do. I honestly don't know what BBWAA would do. Probably invalidate the ballot.
Well, if you only have 10 votes... why vote for someone who will get in with or without your vote?
I didn't expect otherwise. I just wanted to see what the effect size was. I'm more concerned with the fact that people make grand statements about how teams should copy this model or that model, when in reality the difference isn't that big.
Not necessarily. It means that you won't get all that much out of it, but it's there for the taking, so why not?
Oh sure, I write that yesterday and then the US and Cuba decide to make nice with each other.
The Dodgers have robbed us of a season of hearing Vin Scully say Erisbel Arruebarrena over and over. We'll just have to settle for listening to Vin Scully say absolutely anything else.
I actually ran (but didn't report) the regular vs. irregular DHs. The small effect that a player gets from DHing yesterday is mostly driven by guys who play positions regularly.
The wear-and-tear/injuries question is harder to get at. A good reason to move someone to DH for a day or two is because he's hurt, but that also means that he's a bigger injury risk going forward. We might see that DHing once in a while is associated with a greater risk of serious injury, when the DHing is just an indicator, not the cause.
Yeah, but what fun is that?
Maddon rated in the slightly positive (good) direction on this one. His players tended to get better over the course of the season, although not tremendously so.
Well, I have to wonder whether it's pitchers "settling down" or batters "getting tired." We always assume it's the pitchers.
I believe you spelled "Robot umps now" wrong.
There probably will be a counter-move in terms of player development. Maybe we will start to see more money devoted in the draft to hitters or more guys who are potential two-way talents going the hitter route. But how long until that shift bears fruit...
IP/G for starters (it's in the form of outs/G, but I'm assuming people here can do the #GoryMath and divide by 3) is here: http://www.baseballprospectus.com/article.php?articleid=22320
I like your theory on why relievers are suddenly striking more guys out. Makes sense.
That's a reasonable hypothesis, although I'm not sure that DL use is actually down or that recovery times are down or that pitchers don't lose as much mojo after. I'd have to check.
There's a reason I'm afraid of 7.
I actually have that in the "to do" queue.
That's in there. Maybe the guys to focus on would be the ones who statistically show no difference between win and loss, but as I pointed out in the last part, I'm hesitant to even speak prescriptively about whether the effect _should_ be positive or negative. If one of the reasons that teams have a bad day is because they fall out of tune in one direction or another, then a good corrective manager should put them back in the proper direction. It's all rather speculative at this point.
I've actually mused on that topic before, as well:
Yeah, re-reading this the day after, Sam was probably right.
Saul was blinded on the Road to Damascus... I hope I have better luck.
Take heed. I'm a lyrical poet.
Rozenson apparently hasn't heard the rule around the office that we don't mention Craig Counsell or the '97 World Series when I'm around... *cries quietly in the corner*
Because of that book, I've probably spent more time than I care to mention doing #GoryMath on random baseball questions instead of talking to my wife.
I also talk in #hashtags
I should have worked that into the line about the nutrition program. The Dodgers can afford Dijon ketchup.
We know that we expect less out of a hitter facing Clayton Kershaw than facing me and my 40 fastball (that's not a scouting grade, that's MPH) and we generally assume that the effect is sorta linear. As the pitchers get better and better, the outcomes we expect for hitters decrease in line with how much better the pitcher is. You seem to be suggesting that there exists a type of hitter who has a non-linear curve. Once you get into the "good" pitchers, his performance actually has a significant inflection point. (And obviously, you are suggesting that the Yankees had a critical mass of these players.)
Y'know, I can't dismiss this out of hand. I might take a look.
There's probably some positive skew in that distribution. Occasionally you get a team who runs out of options and has to send someone out there who's just awful or they are irrationally wedded to some veteran who has clearly lost it. Most of those guys don't last long enough to meet our usual min PA threshold, but once in a while, they stick around. Those guys probably have True Talents above .300 on BABIP. Because MLB pitchers are pulled from the right tail of the talent pool, it's much more likely that you'll get someone who's awful than amazing.
In statistical terms, we start with var (obs) = var (true) + var (error). DIPS assumes that the variance between pitchers is much less than the error variance (the statistical way of saying "it's mostly luck").
I suppose the twin questions are "What does the distribution of true talent look like?" (DIPS, the way that it's practiced, basically says mean = .300, SD = 0) and "What does the distribution of error look like?"
We do know that for your typical starting pitcher sample size for one season (call it 700-800 batters faced? Maybe 500 balls in play, give or take), no matter how you do it, the reliability estimate (whether that's year-to-year, intra-class, split half, KR-21) gets up to, at best, .2 or so.
Using .260 and .340 as admitted guesses as to your basic +/- 2 SD range, around observed BABIPs, and assuming that var (true) and var (error) aren't correlated, it means that we can regress backward toward .300 and suggest that true talents have a spread between .292 and .308. (Even if we fiddle with those numbers a bit, we're going to get the same basic results.) We know that the noise is deafening.
Now, that's (as you point out) all nice and neat and Gaussian, and we know that there's research that suggests that the error term contains both random error and measurement error (need to treat GB/FB pitchers differently, knuckleballers are weird, park factors matter, etc.) There may even be an error covariance and local variations in there that the simple Gaussian model doesn't account for. However, the research all points out that while we can account for some of that var (error) with these measurement error/bias issues, the amount that we can account for is still relatively minor.
All that said, I get the desire to account for every last scrap of variance that we can, but I don't think that you escape the conclusion that even if it's not exactly right, the mnemonic "everyone will regress toward .300" is going to be right much more often than it is wrong.
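The regression step above can be sketched in a few lines. This assumes the simple linear shrinkage the comment describes: pull each observed BABIP toward .300 by a reliability of about .2. The function name `regress_to_mean` is illustrative, and the .260 and .340 endpoints are the admitted guesses from the comment.

```python
def regress_to_mean(observed, mean=0.300, reliability=0.2):
    """Shrink an observed rate toward the mean.

    With reliability r, only r of the observed gap from the mean is kept.
    """
    return mean + reliability * (observed - mean)


low = regress_to_mean(0.260)   # about .292
high = regress_to_mean(0.340)  # about .308
print(f"{low:.3f} to {high:.3f}")
```

Changing the endpoints or nudging the reliability up or down moves the range a bit but leaves it tightly bunched around .300, which is the point of the mnemonic.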
I have to say that I enjoyed the movie, but I wasn't dying to use it. It just kinda nicely fell into place.
Mike Trout should win all of these awards.
FWIW, if you pump up the sample size to include LCSs and World Series since 1995, the finding still holds (n = 755 games that weren't Game 1).
How dare you do #GoryMath?
What kind of pompous fool would use a phrase like that?
I think the answer will be better, but not to the point where it gets past the fundamental noisiness of what we're trying to measure.
There's probably some of that in there. But you try standing still for 15 minutes without having much to do and tell me you don't wiggle a bit ;)
In terms of percentage of chances, sure, but in batting, there's always the chance to do something good (or bad) and it's much more in the batter's control to actually go and do it.
Basically, those 60-90 balls are the only ones not in either the "any idiot can get that" or the "no one could get that" area. Imagine if all hitters were evaluated on the basis of 60-90 plate appearances and that the outcomes of the rest of those plate appearances were basically pre-determined.
Oh there's plenty more to do. Baseball is such a rich dataset.
At that particular juncture, I was talking about center fielders. As Dave points out, you rarely see a 30 or 40 runner in CF, nor do you see a true 80 runner in a corner. The point is that even these "huge" gaps are really only worth a few feet of advantage for a good fielder and that there are chance factors that can wipe out those few feet's worth of advantage with relative ease.
Just me totally nerding out, but mixed-linear model?
There is something to be said for this as "makeup" or one aspect of it. Jason Parks used to tell me that the most important thing that a player can have is the ability to deal with failure.
That's a fair point. It's entirely likely that the "casualty" of all this losing would be a player who is primarily there right now because he owns a glove and likes the color orange. But there is a bigger message. Yes, there really are effects of losing on players.
That thought occurred to me. Maybe bad teams are bad teams because they take chances on the "wrong" players in general, and it's a systemic issue. I don't know that there's a way to control for that...
My arb hearing is coming up, so every little bit helps.
One billion points! (Although, you will actually only receive 37 of them. I took care of all the applicable taxes and other fees for you.)
I am humbled by your kind words. Thank you.
I actually consulted my mama on this one (I went upstairs). She had some interesting advice.
Thanks. I only looked at starters, so at that point, if he is removed, the regression doesn't care any more. It's possible that if you have a game where the starter's last act is giving up a three-run shot (and obviously, his manager thinks that he is completely gassed), you could make the case that's a survivorship bias, but that's a problem of missing data bias at that point, because we don't know what would have happened if his manager had left him in.
Well, through log-odds, we have a reasonable estimate of the probabilities of what is about to happen next. We know that Smith makes outs 70 percent of the time and that Jones induces outs 68 percent of the time. The rest is just a little #GoryMath.
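Here is roughly what that little bit of #GoryMath can look like. This is a sketch of the common odds-ratio form, which the comment doesn't spell out, so treat it as an assumption. It also needs a league-average out rate, which the comment doesn't give; the .675 below is a placeholder.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def matchup_probability(batter_rate, pitcher_rate, league_rate):
    """Expected out probability for one batter-pitcher matchup.

    Assumed form: add the two log-odds and subtract the league's,
    so the league baseline is not counted twice.
    """
    return inv_logit(logit(batter_rate) + logit(pitcher_rate) - logit(league_rate))


# Smith makes outs 70 percent of the time, Jones induces outs 68 percent.
# The league rate of .675 is a hypothetical stand-in.
print(round(matchup_probability(0.70, 0.68, 0.675), 3))
```

If the pitcher is exactly league average, the formula hands back the batter's own rate, which is a handy sanity check.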
In the strict sense, because we're looking retrospectively, we're working in a closed system, so the subset of "plate appearances that follow on-base events" is by definition a set where we've at least eliminated one plate appearance (immediately before) that featured an on-base event. However, in a sample that large (5 years' worth of play-by-play) the effect will be negligible.
That's a really important observation and probably the biggest issue in this sort of research.
The log-odds ratio method specifically corrects for that though. Going into each at-bat, the regression "knows" that the batter has a certain seasonal stat line and adjusts expectations based on that knowledge.
It's easy to pile on a guy when he's down and when you don't have complete access to all of the info. There are things that I would have done differently than RAJ, but that doesn't make him an idiot.
Look at his O-Swing rates around that time. He started swinging more at stuff on the edges. It apparently worked.
Yeah, that's the other piece of it. There wasn't a beat reporter following me around when I was 19.
Thanks for the kind words. We totally should have nerded out on program evaluation shop talk! Next year!
I'd argue that the primary goal of parenting during this time is to "launch" the kids so that they can do things on their own. Some folks have parents, either their actual parents or some other mentor, who help them through this time, but not everyone does. It's not about micro-management. It's about structuring services to meet needs.
Teams use a lot of different resources. Teams do hire outside professionals to come give talks on specific subjects and retain consultants (nutritionists were mentioned more than once to me... no surprise there).
Many teams have their players stay with host families and form relationships that way. I don't know how much training the host families get, but that's a place where teams might look to improve. How can we make sure that these host families aren't just "groupies" and could actually have some net benefit to the team?
Perry threw spitballs?
We need to have a rule in the comments sections. Never bring up Game 7 of the '97 World Series when Russell is around.
*cries in the corner*
I've seen this one on ESPN Classic. Fantastic game.
Actually, that's just because of the realities of the publication schedule. I didn't have time to look into whether it washed out or whether it was a net positive. Your critique is valid. My only defense is that I had a busy week.
He could end up with the same results, just through different means (grounding to second, rather than striking out). Even here, I found that Ks and BBs went down, but outs in play and doubles/triples went up. It's possible that in terms of run production, it all washes out.
I used swing tendencies as a proxy for "Is affected by high leverage."
Oh if only I could find that dataset...
I can live with that. I think clutch is majorly over-sold as a narrative, but a limited view of it might stand up to more scrutiny.
The Yerkes-Dodson curve suggests that people are actually at their peak at a moderate amount of stress, and that too little or too much have the same effect of driving down performance, but we don't know where the high point of the curve is.
Thanks. I'm probably among the last of the Clevelanders who was "raised" by Herb. May he rest gently.
And proud of it.
I actually ran those just to look, but didn't save it. Four strikeouts should happen about once every year. Four HR is supposed to happen once every 11 or 12 years, using 2013 numbers as my baseline.
My goal is to have a week where they mention me in all 5 episodes.
I actually started down that proverbial rabbit hole and then thought better of it. Those are possible ways that it could happen, but again, reaching on a dropped third is such a rarity that covering those would add so little to the probability estimate.
What kind of idiot would name himself after an auxiliary kitchen implement?
This has been studied in the lab a bunch of times and that's generally the explanation that people give when they turn down a free nickel. It's generally seen as a test of how much people value societal notions of fairness. I could see an argument that within a broader society, teaching the lesson might be more valuable. In a baseball setting, I would take the money.
As always with Vegas odds, recognize the statistical impossibility. 2/1 odds mean a predicted 33% chance of winning (2 chances he doesn't win to 1 chance he does). That's why Vegas reports the lines in odds ratios, rather than percentages, because people have no idea what odds ratios are.
If you convert all of those odds to percentages, Vegas is predicting that there will be 1.33 winners of the HR Derby. Of course, there can be only one. If you adjust those odds so that they would sum to 1.0 winners, then that book believes that Stanton has about a 25 percent chance of winning, so they should be laying 3/1 odds, but they'll only pay out at 2/1.
The smart way to bet in these situations (not that books let you do this) is to back the losing side of the bet in all 10 cases. You will make 10 dollars minus the laid odds that the eventual winner had. (So if Stanton actually wins, your payout is 8 bucks; if Morneau wins, you are down 5.) Probabilistically, you will come out ahead because you are saying that there will be 9 losers (the truth) and Vegas is only saying that there will be 8.67.
Never bet against Vegas. They're better at math than you are.
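The odds conversion above can be sketched as follows. It uses only the two numbers from the comment (Stanton at 2/1 and a book total of 1.33 winners), since the full 10-man line isn't reproduced here. The function names are illustrative.

```python
def implied_probability(odds_against):
    """Convert X/1 odds against into the probability they imply."""
    return 1.0 / (odds_against + 1.0)


def fair_odds(probability):
    """Convert a probability back into X/1 odds against."""
    return (1.0 - probability) / probability


def normalize(probabilities):
    """Rescale a full slate of implied probabilities so they sum to 1."""
    total = sum(probabilities)
    return [p / total for p in probabilities]


stanton_raw = implied_probability(2)      # 2/1 implies about 33 percent
book_total = 1.33                         # total "winners" per the comment
stanton_fair = stanton_raw / book_total   # about 25 percent
print(round(stanton_fair, 2), round(fair_odds(stanton_fair), 1))
```

With the whole line in hand, `normalize` would do the same rescaling for all 10 hitters at once.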
I won't vouch for these numbers 100% because I did it with a piece of scratch paper and tally marks, but it is indeed true that Zachary does have the most guest appearances on EW, tied with Paul Sporer (7 each). Woj and I have both made 6 guest appearances. Ken Funck has been on 5 times. A whole bunch of people are 4-timers (Ian Miller, Doug Thorburn, Matthew Kory, Jason Parks, and Geoff Young). Colin Wyers, Jay Jaffe, and Will Woods have all been on 3 times.
The first on-air guest was Marc Normandin (episode 57).
The first voice other than Ben or Sam to appear was Ian Miller on Episode 8, co-hosting in Ben's absence (from what I can tell, the only show that Ben missed), creating a Miller vs. Miller dynamic. Woj and I have guest hosted thrice each, along with single guest host appearances by Will Woods, Zachary Levine, and Ian. There were two simulcast episodes with FanGraphs, so I guess Carson Cistulli was a two-time co-host.
We should probably also acknowledge Pete Barrett and Nick Wheatley-Schaller who each produced 30 segments during the season-previews.
Why do you hate all 30 teams?
They'd never actually do it, but the correct answer is probably the 10-3 Alfredo Simon.
Never heard of him.
I haven't run the numbers... it wouldn't have to be a lot. Just enough to affect an already small sample.
Scott Patterson was a real-life AAA pitcher in the Braves and Yankees system before taking up acting. Thankfully someone got the Gilmore Girls reference!
The statistician in me keeps looking at (n = 31) and having that same thought.
There are probably a lot of New England Ex-Patriots (*rimshot*) out there in college at NC State and Clemson. Maybe I should go back to see what happens by place of birth.
I left out Alaska and Hawaii too. It was all about sample size.
The counter-move is beginning...
Many projection systems do incorporate in-season data like that. Those are the most recent data we have, and if you're going to use something, it might as well be that.
The problem with using the stabilization points in the manner that you suggest (and it's a method which I commonly see used), is that they were never meant to be used that way. The fact that Betances reached 70 PA means that we can feel "comfortable enough" with that sample to say that over those 70 PA (17 innings), he really was (past tense) a pitcher with a talent level around 42% (yikes!). It's reasonable to think that over another 70 or 100 or 500 PA, he'd pitch similarly, but that's an assumption. Perfectly reasonable assumption, but not ironclad.
Going forward, he won't have the element of mystery any more (pitchers usually have the upper hand in the first meeting with a batter), he'll be a bunch of pitches deeper into his season (and more tired?) and he'll face a different suite of hitters, probably in higher leverage situations. In other words, the next 70 hitters could be very different than the first 70. Stabilization answers the question "If we gave him two sets of 70 batters in roughly the same circumstances, how closely would the two performances match?" The question of what he will do next month is entirely different.
My seventh grade teacher taught me that you should re-arrange questions to declarative sentences to figure out what goes where. "We would pick whom now."
Then again, Miley Cyrus taught us that "Things don't run we."
*cries quietly after re-imagining the '97 World Series with Billy Wagner*
Aside from extenuating circumstances, I've broken the "never leave early" rule once in my life. August 5, 2001, Seattle at Cleveland. Look that one up.
(I had a plane to catch the next morning to see my then-girlfriend, now-wife for her birthday. I have since forgiven her.)
I just googled "cost of MRI" and it turns out that no one really knows. Just grabbing a number, $2000 per MRI, which seems to be at least realistic, times 6 months times say 50 pitchers in the system is $600k. Teams could probably get a punch-card deal where every 5 they buy earns a free one and knock some of that cost off, so let's say that it's somewhere in the low-to-mid 6 digit range. That's the right order of magnitude.
Now, for that to make sense, you'd have to make the case that the MRIs would give additional information that would save the team more than 600k (or whatever) in lost production. Would constant MRIs mean that they could intervene to actually prevent an injury or would it just document the inevitable decline into a shoulder or elbow injury? I don't have the expertise to comment on that one.
Also, there's the side effect that no one enjoys an MRI... and if signing with this team means I have to do 6 of them a year... I'm going over to that other team.
I was happy when that one popped into my head.
I threw in the single year (2013) MLB salary stuff as a quick benchmark. The message there is that the correlation ain't .80 even when we have MLB data to work from and we're not projecting into the future.
I actually ran just about everything in here with draft position as the predictor. Signing bonus was a better predictor consistently. These drafts (03-08) are back in the era before slotting (or even "slotting") and teams made "signability" picks in the top ten, so there were likely players whom the teams believed weren't as good who got picked ahead of better players who wanted bigger bonuses.
It's true that the talent may bunch in different ways, but in an efficient market, where everyone has perfect information (obviously not actually the case), the signing bonuses should sort that out as well. What strikes me about all of this is that the market is a long way away from efficient, which I take as "prospectin' is hard."
Yeah, that's probably in there somewhere. I think there's been explicit work done on this in the NBA. It's why I also considered career WAR so that we had something to balance against that threat.
In those comparisons, I'm comparing HS to HS and college to college. So, there will be some 2003 HS grads and 2008 HS grads in there, same as there will be 2003 college and 2008 college grads. Sure, there's bias in that a 2003 HS grad would be 29 now, while a 2008 HS grad would only be 24. It's why I sliced things a few other ways. What amazed me was that the correlations stayed so consistent. And moderate.
The gory math warning was just something that I've turned into a personal trademark/gimmick. It was actually based on Dante's Inferno (Abandon all hope ye who here enter!). I started doing it a few years ago actually to specifically allow people to skip the details if they just wanted the conclusion. Some people like hearing about covariance matrices. Some don't.
Thought about that... maybe we need 2-3 years to see the process unfold fully. My decision to keep it to one year though was based more on the fact that people a) freak out when hitters/pitchers are doing so much better/worse than last year and b) people call for firings as a result.
The more I research this stuff, the more it seems to me that the biggest inefficiency out there is understanding that leadership aspect and the systemic effects that come with it.
I didn't save the data file, mostly because it was all events from 1948-2013, which is about 10M lines. Yeah...
The problem is that you have to run a LAG function (I use SPSS) a few times on a couple of different variables to make sure you're selecting out the correct cases. It's not terribly complex to program, it just takes a lot of processor time to actually plow through.
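For readers who don't use SPSS, here is a rough Python stand-in for that kind of lagged selection. Everything in it is hypothetical: the `lag` helper, the toy event rows, and the selection rule (keep a plate appearance if the previous row is from the same game and was an on-base event). It is not the actual syntax or the actual criteria from the article.

```python
def lag(values, n=1):
    """Return a list where position i holds values[i - n].

    The first n positions are None, standing in for missing values.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    keep = max(len(values) - n, 0)
    return [None] * min(n, len(values)) + list(values[:keep])


# Toy play-by-play rows, in chronological order (made up).
events = [
    {"game": "A", "on_base": True},
    {"game": "A", "on_base": False},
    {"game": "A", "on_base": True},
    {"game": "B", "on_base": False},
    {"game": "B", "on_base": True},
    {"game": "B", "on_base": False},
]

prev_game = lag([e["game"] for e in events])
prev_on_base = lag([e["on_base"] for e in events])

# Keep rows whose previous row is in the same game and was an on-base event.
selected = [
    i for i, e in enumerate(events)
    if prev_game[i] == e["game"] and prev_on_base[i]
]
print(selected)
```

The logic is simple. The cost comes from repeating it over several variables across roughly 10M rows.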
Yeah, there's that aspect of it. I wish I had a way around that...
I've met Josh. Ben's right, he could totally do it.
That's not true at all. You'd be amazed at the things that I have heard "old school" people say. At its core, Sabermetrics comes down to making a reasonable argument, and you don't last in baseball if you're not willing to listen to a reasonable argument.
In a perfect world, people would understand that it's a simple risk-benefit calculation and that sometimes, you do the right thing and it doesn't work. But, that's not the world we live in and we need to work within that world.
If you hear Mongolian yaks talking to you, please consult your doctor!
That should actually be reversed. 4.39 after a day off, 4.43 with no day off. It's a very small effect.
Mike, we have an opening for a Yak F/X expert, if you're interested.
In fairness, they may have behind the scenes. They didn't report it in the article. Until they either say "We didn't do that" or reveal their methods (and results) the proper thing to do is reserve judgment.
I do wish BP (and everyone else) would publish error bars. I get why BP (and everyone else, including ESPN) doesn't. It's mostly a space issue, but the old stats prof in me says if mean, then standard deviation.
As someone who lived and died with that particular series in my native Cleveland...
Because of the bottom of the ninth.
Ahhh... the super-computer...
There will always be noise. Which is good, because it means I always have something else to write about.
There's some amount of truth in this. I guess the way to think about it is to look at what happens when AL teams go to NL parks. Most place their DH in the field (at 1B or LF) in some sort of an attempt to bury him as a fielder (the defensive #9 hole?). It seems an implicit endorsement of the idea that maintaining the integrity of the batting order is better than maintaining the integrity of the fielding grid. Other fielders can sorta cover for the defensive weaknesses of the usual DH, while no one can help a batter as he stands there. If we accept that maintaining a good groove in the lineup is even slightly more important, then the league which can indulge in offense over defense, because of the DH and the lack of a need to maintain as much defensive flexibility, will have an advantage.
Infield metrics are more reliable than outfield metrics, so we at least have that going for us here. In terms of tipping the analysis one way or the other though, the results should be big enough that even leaving room for some margin of error, the choice is fairly obvious.
In a perfect world, a Yankee manager would have a Markov-type analysis at his disposal looking at different configurations of lineups based on the players at his disposal. My goal here is to show that there are structural issues to consider when making these sorts of decisions and that WAR doesn't take those into account. The magnitude of those coefficients is one part an observation on the environment and one part real structural effects. Pulling apart how much is what would take some deeper digging. Your point is well-taken though.
Don't worry. The irony isn't lost on us either.
Someone's gotta keep Woolner in line. Best of luck, Max!
Confirmed that they are all good ole fashioned Pearson correlations.
Sorta. I ran (and didn't report) a step-wise regression throwing both of those into the pot. There was a small effect for the end of season estimate, but it was very small R-squared wise.
For the seasonal numbers, on average you're talking about 480 PA. For anything involving a trend line (#3, #4, #5 on the list), it represents an average of 132 PA powering each number (some more, some less, obviously). For #6, as you point out, it's 100.
Heart go boom. That bad.
And that's the sort of analysis that we're gonna need. Assuming that Iannetta's hot streak was due to getting contacts, the only thing he'll need to remember is to put his contacts in every morning. Of course, that quickly accelerates beyond our data.
I have no specific reason to call for a Trout regression other than some sort of existential discomfort. There's always the element of Waiting for Godot in baseball. The past was always glorious and someday, another all-time great will emerge, but that time is not now. That's my entire reasoning.
Someone has clearly never seen Lindor play.
It's on Detroit...
Laugh a little. It's proven to help you live longer.
I don't know yet.
I didn't explicitly go there yet but I probably will next time. Something like a Bonferroni (or at least asking p to be less than .001). Right now, I'm just trying to establish the method.
You read it correctly...
Yes it does and yes there are ways to control for that. I made a conscious decision not to get too deep into those weeds yet. Right now, we've got a nascent model, and little idea of where it's going to end up. Best not to over-complicate something if you don't need to.
But... he went to my high school!
My guess is that "Don't make stupid challenges" will become an unwritten rule in the game. I just wonder whether baseball is ready for the first time that someone gets beaned in response to a frivolous replay challenge.
Another idea I had after we had gone to press was that perhaps a lost challenge could lead to the opposing manager being able to select a player from the other team's bench or bullpen who would be taken off the potential list of substitutes. Think they got the call wrong? Are you willing to bet being able to use Joe Nathan at the end of the game?
BRef's WAR adjustment is (WAR * (1 + average gmLI) / 2), but that's going to be dragged down by "need to pitch" outings that the team doesn't care about. In addition, they use game entry leverage, in a well-conceived attempt to make sure that a pitcher can't just cook up his own high leverage situations by being bad at his job, but the fact is that in a ninth inning situation, if the closer goes 1-2-3, the leverage actually increases as those outs go on the board. All told, he faced much greater leverage than we give him credit for. I also don't understand the derivation of "let's go halfway back to 1" other than "let's just split the difference." Further, that adjustment will be agnostic to whether the pitcher was good in mopup/bad when it counted vs. bad in mopup/good when it counted vs. same performance in either situation.
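To make the "halfway back to 1" point concrete, here is a small sketch of the adjustment as quoted above, next to a version that credits full entry leverage. The function names and the example closer (2.0 WAR, gmLI of 1.8) are assumptions for illustration, not Baseball Reference's actual code or any real pitcher's numbers.

```python
def leverage_adjusted_war(war, avg_gmli):
    """Scale WAR by a leverage multiplier pulled halfway back toward 1."""
    multiplier = (1 + avg_gmli) / 2
    return war * multiplier


def full_leverage_war(war, avg_gmli):
    """For comparison only: credit the full average game-entry leverage."""
    return war * avg_gmli


# Hypothetical closer: 2.0 unadjusted WAR, average entry leverage of 1.8.
halfway = leverage_adjusted_war(2.0, 1.8)  # multiplier is 1.4
full = full_leverage_war(2.0, 1.8)         # multiplier is 1.8
print(round(halfway, 2), round(full, 2))
```

A pitcher with an average entry leverage of exactly 1 is untouched by either version; the gap between the two only opens up for the guys who are used when it counts.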
The point isn't to figure out how to assign credit to the pitcher. It's to realize that, from the team's perspective, there is a real incentive to make sure you have good pitchers to staff those high leverage situations, because, by definition, they have a greater impact on the game than low leverage situations. Teams control not only whom they sign, but what role he will fill. WAR is only somewhat sensitive to that leverage, and the market seems to be drawing more in line with WAR. So, there's room to exploit an inefficiency.
When I speak of the traditional Sabermetric orthodoxy, I refer more to the idea that any idiot with a right arm can close a game. In that case, why would teams pay all this extra money for a veteran (often a "proven closer") when they could simply have some league minimum guy handle it (and pick up 30 saves in the process)?
I'm looking at that relationship in the reverse. Yes, WPA inflates numbers for closers, specifically because the 9th up by a run is a high leverage situation, and most certainly, you don't pay for past success or past leverage. But, if you sign a guy with the intent of putting him in those high leverage situations, WAR isn't doing as much as it should to highlight the out-sized impact that he can have on games.
My argument isn't about whether we should use WPA to evaluate individual relievers. It's about the fact that WAR doesn't properly reflect what relievers do.
I'm willing to deliver cookies to Atlanta based subscribers.
When Julio Urias was born, I was Julio Urias's age...
Yeah, I did. I may have written that line with a 4 month old wriggling in my arms.
The problem is that if you're going to use a tandem 5th starter (model 4) you have to settle on what day the two men would pitch and what their rest schedule would be. So, on some days, you'd effectively have a 6 man bullpen.
You could say that maybe if you did it on the 3rd day ("start" on day 1, rest 2 and 3, available on 4, rest on 5, "start" on 6.) that if he was available for light duty on day 4 that on day 3, the manager could deploy his pitchers differently. That's reasonable. But on 2 and 5, you're a roster spot short.
You are correct in your interpretation. I interpreted those data in roughly the same way in my head. Pitching into the sixth usually means he's doing well enough not to have been taken out. The fact that he made the sixth is a quick marker. That's all I intended for it to be.
If you showed me a video where Barry Bonds was holding a big jug labeled "Steroids! (now in mint)" and drinking from it, I would still vote for him.
I can't wait to read it!
Edmonds will probably get a couple of courtesy votes, which is a shame. His career bWAR is 60, which puts him around the McGwire/Piazza/Sosa/Sheffield range. He was never "the best player in the league" but there's something to be said for a decade of 5-6 win seasons. Jones is almost a carbon copy of that same profile. However, I doubt that either will receive his proper due.
They're both a bit #BigHall for my own personal tastes (I know, I'm a little crazy on that count), but they show the difficulties of giving proper credit to defense in the game. We don't even have the vocabulary to discuss it at this point. That's going to be a problem.
Thanks for that... I'll be taking 732 of those points back now.
One billion points! Also, you get some extra credit for your sophomore bio class. That was the function, but circumstances have rendered that function irrelevant.
It's one possibility. Stay tuned... ;)
It's probably related to the fact that starts that last 50-60 pitches are relatively rare and that the distribution is very much skewed toward at least 90 pitches.
You're right... I started writing one sentence and then finished another. It's high 20s early on, peaking around 40% in the 70s, and then falls from there.
One billion points!
You're right, I did! Also, you can have lunch with Dayton Moore.
Let's leave aside your point about the selection bias in the sample (it's a valid critique, but for a moment, let's assume that it doesn't make a difference). In the 60s, we see that starters in their final innings were a quarter run worse than relievers in the same inning. Over time, we see that managers moved more toward relievers, and as they did, the lines converged. My hypothesis is that this was the managers realizing (consciously or not) that they were sending out starters when they had a reliever who would have been a better option, and becoming better about telling the difference. In the 60s they weren't. If they went back to the 60s, it would be returning to the days of incorrect assessment.
The argument that I'm making is that managers have gotten better about picking their spots. If they were to revert to the old days, they would send pitchers out whom they shouldn't send. It's not the starters that have improved, but the managers who have done so.
One thing I did think about was to look only at cases where either the starter or the reliever who entered finished the inning, as a way to delete the possibility of it being a starter who suddenly lost it after everything was looking great.
There is some sense of counter-balance in that relievers can also have just absolutely horrid innings.
In my head, yes, although it seems that over time, the historical trend has dealt with this by having a couple extra guys in the bullpen, so as not to overtax the relief corps.
Evan, I said this on Twitter, but I will repeat it here. I know that this work is preliminary, but it is very strong and a major conceptual step forward. The ability to mix methods and data sources, blending the quantitative with the qualitative will drive the field forward in some very interesting ways. Thanks for conceiving of this and sharing with us.
Ben wins a billion points for saying "this is the refractory period". That means Tuesday was...
That's one reason that you'd pretty much only get to do it once. And probably in the third game of a series where you don't have to see that same team the next night!
One I'd love to see along these same lines. With a runner at second and a RH batter being intentionally walked, there would be a called play where on the fourth pitch (3-0), the batter would actually swing, if for no other reason than to surprise/confuse the catcher (that might buy you a tenth of a second), and the runner would break for third. You'd want to study some game film to look for signs that the pitcher and catcher fall into a lull while issuing an IBB and pick your spot, because you'd really only get to do it once. But if you do it right, you end up with a runner on third and a 3-1 count on a hitter whom they wanted to walk anyway. They either finish the walk or try to get cute... with a good hitter spotted a 3-1 count and a runner on third.
After I sent Sam that e-mail, I had something of the same thought. Perhaps limited range at first base becomes even more limited at third, and this might let more things leak through. We'd have to account for that. It probably would work best if you have a first baseman who's a convert or a part time 3B anyway. But, you could still make a case that it might work.
The thing that worries me is whether we're assigning players based on their abilities to make particular iconic plays positionally (the guy with the big arm HAS to play RF because he might have to make a throw to third once a week) vs. placing them based on whether their abilities match what will be most often asked of them. We're scared to take an action that might make one position (or the chances of making one type of play) worse, even if the corresponding improvement is a net gain.
That sort of tactic is probably already baked into the comparison data.
I fully understand that this would be boring to watch (you're probably talking about a few extra minutes onto the game of just guys getting into position), but it would have a small effect on helping a team to win more games. Is the job of a team to be entertaining or to win?
Also, what would be the rule change? You'd pretty much have to say that teams could not make mid-inning defensive positioning changes with two players not in the game, and that if you are the stated left fielder, you actually have to be to the left of the center fielder.
That would effectively outlaw the Lawrie Shift as well where the third baseman is the third infielder on the right side of the infield, and the much-rangier shortstop holds down the fort on the left side. It probably also deletes the ever-popular "5th infielder" tactic that you'll see used at the end of games with a runner on third, less than 2 out, and the game tied.
I would agree that you would have to account for this, but I'd also argue that there are plenty of players who play both spots on a semi-regular basis. If a player knew he was going to do this, he could prepare for it better. I don't know whether that prep time would actually work. However, in one of the articles that I linked from earlier this year, I found that when a player does switch positions in the middle of the game, he performs at the same level that we would normally expect of him at that position. It even worked when I looked only at the first inning of the switch.
I understand this sentiment fully, and there's probably something to be said for it. I would happily listen to an argument that says that showing him up is bad for the clubhouse and that's going to have a far greater effect than the handful of runs that this would save.
A question in reply though, meant only as a test of the limits: At what point would you?
Congratulations graduates! (And Jason, your psychoanalysis appointment is still on for 3:00 today.)
Also, re: the 4 WAR CF becoming a trade chip, you can flip him for something else, but with reduced leverage. The other team knows that he's surplus to parts, so why should they pay full price?
It's more than just setting the intercept to the league minimum (500k). It's that when you suppress the intercept, it messes with how the line is best fit to the data. (Click on the JC Bradbury link for more info.)
It depends on how you measure it. In 2013, the number paid per win was roughly 5.2M for hitters and 7.4M for pitchers, but that includes salaries from contracts signed in 2012, 2011, 2010...
If we're looking only at new contracts, it seems that number really is creeping closer to 7M.
You could make the case that the Schaeffer squeeze wasn't really all that gutsy. If it works, you win. If it doesn't, then you still have a runner at 3rd and just need one run and you still have another out in the inning. Even if you don't score you just go to the 10th.
The #GoryMath reason actually had to do with how I did service time allocations. I didn't want to count a guy who got a handful of PA or a mop up inning in September one year to have that count as his "rookie" season.
I figured that some of these guys (say, guys who got 150 PA) were partial year call ups, so the actual salary outlay for them would be a pro-rated amount of the minimum, but my model credited them with receiving the full 500k. However, there would be others who got 90 PA, who also got a pro-rated amount, that the model doesn't see.
Mine would be the world's most boring memoir of my "life" on the inside. It would be a few pages of "And then they asked me about this topic, so I gathered together some data and ran some #GoryMath and e-mailed them with some answers." Just repeat that over and over.
I suppose you could interpret it that way. Or perhaps as a failure to maintain a good balance between aggression and caution.
Binary logit does freak out a little bit near 0 and 100, but with roughly 2-3% of the cases being positive, it should be OK.
The Astros front office is starting to resemble a StatSpeak reunion.
It looks like the Cards are shortening their swings. Let's (over)simplify that trade off to being a trade between chances of a HR and chances at a single (with a greater weight being placed on the chances of a single... you might trade 1 HR for 3 singles).
In a bases empty situation, a single doesn't have a lot of value, compared to a HR. In RISP situations, it has more value.
Check the bottom of the article. There's a "contact the author" link.
Hadn't thought of that... good point.
Of course, building type I error is always a problem. In general, I like to run several variations on the same analysis and see if they are all producing the same basic results. If the answer is yes, then I'm happier in saying that the results are real.
That's baked in. I studied the effects of the bunt before knowing the outcome. If you look at the second graph, it shows the rates of both the happy outcomes (batter gets a hit or reaches on an error) vs. the sad outcomes (double play or foul bunt for a strikeout).
The Book does break it down by whether there was a bunt attempted at some point (a bunt-and-miss or a foul bunt) in the plate appearance. Uribe's HR the other night would be a case of this. I was more interested in the bunt itself as an event.
I had originally planned to look at the distribution of runs (specifically on the issue of maximizing one run vs. risking a wider range of outcomes). I ran out of time.
From the preliminary work that I ran, I did find that it did do a good job in maximizing the probability of a one-run inning over swinging away. However, that's really only useful in certain situations in which one run is absolutely critical and multiple runs aren't really needed (i.e., bottom of the ninth and it's tied or you're down by one and it's late and you need to tie). The general pulse that I get is that we are much more forgiving of these "tactical" bunt attempts (and we should be!)
I would surmise that the most frustrating bunt attempts are done away from these situations where the idea of maximizing runs should always win out.
I didn't look at the team level. I'm guessing that most of a single team's bunt attempts will be concentrated on a couple of hitters, so we'd really be looking at those hitters rather than some sort of team-level effect.
The idea of looking at MiLB bunting is interesting. I do wonder if there's something to it.
James is crafty like that.
Gomes actually doesn't have rookie eligibility. He was on the Toronto roster too long last year. I looked.
I had Gomez #3 but in general, there were a bunch of guys all clustered around that same range of WAR(P). Can't say I really fault any of the names above Gomez.
For what it's worth, four people (myself included) did not submit any ballots for manager of the year. Hurdle appeared on 24 out of 27 ballots.
Funny enough, when I wrote that, my first thought was similar. They are two very disjointed thoughts. The thing is that even though there is an inherent contradiction, both statements are actually true.
The fear that you have to live with as a small market team when you find a hidden advantage is that it will be easily copied by anyone willing to sink some money into it and that the breakthrough is all in the brilliant idea rather than the execution. Then, it becomes something that is common knowledge, and everyone else has more money than you do to implement it.
With a big market team, you have to worry that some other team (especially a small market team who has incentive to put a lot of resources into identifying these sorts of things) will copy you and do it better. In that case, it's an idea that requires not only a flash of inspiration but a lot of technical skill in implementing it. You might be doing the pilot testing for a small market team who was afraid of committing to the initial expense. But once they know that it works, they might be able to implement it better, and you've just shot yourself in the foot twice!
Absolutely, if I were in charge of an evaluation of such a program, I'd check on a few of these things. The problem is that you still have to make the second level case that all of those effects lead to wins at the MLB level. It's not an impossible case to make, but you still have to draw the line. Obviously, it's a lot easier to convince an owner/team president/GM that it's a worthwhile program if you can say "Well, they are doing more of the things that we know lead to growth and development (watching more video, staying healthy)."
That's a low-risk approach. You only copy things that have been shown to work, but in the process, you never expose yourself to either the upside of gaining an advantage or the downside of spending money on a failed idea. I can imagine that when it comes to "non-baseball" interventions, front office people correctly identify that they don't know enough to be right more than they are wrong, so they pursue a low-risk strategy.
There is nutrition counseling provided, but the reality is that players get per diem (and not much of it) and have to eat wherever the bus stops. The original idea behind "Food for Kids" is that it's hard to learn things (which is what the minors are about) on a poor diet. The effects of the program wouldn't be felt for a few years at the MLB level.
I personally struggle with how to put pitchers into the MVP debate. In a perfect world, I would have the MVP be for position players while the Cy Young awarded the best pitching performance, but that's not reality (my own ballot had Justin Verlander in 2nd last year). In general, I don't explicitly adjust when comparing pitchers to hitters. I'm open to the thought that maybe I should.
In general, I stick with WARP standings. When I deviate (and I do sometimes), it's between players who are bunched fairly close together while mumbling the phrase "margin of error." Usually, it's a case where there's been a potential weakness identified in WARP that hasn't been fully examined (e.g., catcher framing) and where I believe that WARP is either over- or under-estimating a player's worth.
I will use "premium position" to break a tie. I don't much care about team standings.
I don't think it's WARP that we're married to. It's the idea of scientific, dispassionate inquiry. In this piece, I wanted to raise the critique that too often it seems that we start with the conclusion (Miguel Cabrera is MVP) and then fit the available evidence to serve that conclusion. WARP starts with "Here's a reasonable definition of value" and from there, the numbers will fall where they may. If Miggy was way ahead, I'd gladly vote for him.
Actually... that was a subtle cue in the writing. I was concerned that someone might be catching on as I went along and said "Wait a minute... Cabrera's not quite that low." I wanted people to stay with the premise until the end.
Spearman's rank-order correlation (rather than Pearson) is one. You could also use 20-30-40-50-60-70-80 as separate categories and load the variables that way into a regression, or code guys as perhaps 50 or better vs. 40 or worse and run things binarily. The trick is remembering that ordinal variables are more akin to nominal variables than interval or ratio variables.
People often shove 20-80 into a regression anyway and pretend that it's a ratio variable, which is OK if you're just looking for an effect size so obvious that it'll show up no matter what you do.
If you're trying to predict ordinal variables, there are ordinal logit regression techniques, although I don't know why (in this situation) a scouting score would ever be a dependent variable.
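A rough sketch of the three options mentioned above for handling 20-80 grades: a Spearman rank-order correlation, separate indicator (dummy) categories for a regression, and a binary 50-or-better split. Everything here is plain Python with hypothetical function names; the rank correlation is computed the usual way, as a Pearson correlation on tie-averaged ranks.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def dummy_code(grade, levels=(20, 30, 40, 50, 60, 70, 80)):
    """One indicator per grade, dropping the lowest as the reference category."""
    return [1 if grade == level else 0 for level in levels[1:]]


def fifty_or_better(grade):
    """Binary coding: 1 for a grade of 50 or better, 0 otherwise."""
    return 1 if grade >= 50 else 0
```

The rank-based version only cares about ordering, so it never has to pretend that the gap between a 40 and a 50 is the same size as the gap between a 70 and an 80.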
I didn't actually offer a billion points, but... one billion points to you!
Actually, the reason that I shy away from using the word qualitative is because it looks and sounds so much like quantitative data that it makes for a hard read. When I taught stats, I would spend five minutes just making sure that people were hearing the words correctly. You're correct that the term most certainly applies and in my head, that's what I call it.
60 is supposed to be 1 SD above the mean, and that's a good analogy for how to conceptualize and use 20/80, and there are probably plenty of people who use it that way. The problem is that what we're trying to get at is so nebulous and un-standardized that, if we're being super technically nerdy about our stats, it's better to just say it's ordinal and be done with it rather than make the ordinal/ratio jump.
They do pay the bills, don't they...
Good bunch of guys.
An inspiration for this piece.
I know on the "steroid" issue. I was making a very tortured joke.
Hey! You take that back!
This was a fascinating question. In 2012, only guys who started the game:
Felix Hernandez, 16
Mat Latos, 13.33
Clayton Kershaw, 13
Zack Greinke, 12
Cole Hamels, 11
Justin Verlander, 11
Cliff Lee, 10.66
Clay Buchholz, 10.33
Homer Bailey, 10
David Price, 10
Not a lot of surprises on the list. But a good question.
Well, he had the fortune of being on a team that played in a lot of close games.
Did I just unwittingly get a stat named after me?
I wonder if it's olde timey training methods or olde timey understandings of risk management.
It's always more complex than the numbers suggest. Teams have constructed the role of pitcher differently in the past 10 years than they did in the 70s. They also train differently, etc. The game is constantly evolving. I'd happily concede that my findings only apply to recent years, but thankfully, that recent era is still going on!
One of my new year's resolutions is to kidnap Doug Thorburn and just pick his brain for an hour on the subject.
A thought that has crossed my mind as well. One problematic issue is that Pf/x doesn't capture mechanics. But, the motion for curvy stuff is probably harder on the arm than a fastball (which is why curves are often banned in little league). Maybe there's something to it.
I think Jason just interviewed himself for that one.
Well, assuming that the game is close and the ninth really matters, would you rather have a really tired starter or a fresh and elite reliever in there? If it were a 2-hitter, there would be no question.
Sam edited out the part at the end where I said "picks up phone and calls Hahn." It was supposed to be an internal monologue. This is what I get for going high-concept.
A 30-to-1 chance actually happens once every 31 times.
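For anyone checking that, a tiny sketch of the conversion from odds against to a probability. The function name is hypothetical; the arithmetic is just "one favorable outcome out of 30 + 1 total."

```python
from fractions import Fraction


def odds_against_to_probability(against, in_favor=1):
    """Convert 'against-to-in_favor' odds into the chance of the event happening."""
    return Fraction(in_favor, against + in_favor)


long_shot = odds_against_to_probability(30)
print(long_shot)  # Fraction(1, 31)
```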
Listen to this twice.
Slight correction on your math. Let's say he's a 4.00 RA/9 pitcher usually, but throws a shutout (so, 4 runs per 9 less than usual). Figure that his starts aren't usually 9 innings, but he's a 180 IP guy seasonally (for ease of calculation). We'd expect him to give up 80 runs over 180 innings. Taking this masterful, but long, start out, we assume 80 runs in 171 innings, which means that he's something like a 4.21 RA/9 pitcher. I don't know that we can make those sorts of static state assumptions in real life, but the point is well-taken.
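The arithmetic above, written out as a short sketch under the same static assumptions (a 4.00 RA/9 pitcher, 180 IP, one nine-inning shutout). The helper name is made up for illustration.

```python
def ra9(runs, innings):
    """Runs allowed per nine innings."""
    return 9 * runs / innings


season_innings = 180
season_runs = 4.00 * season_innings / 9  # 80 runs at a 4.00 RA/9 pace

# Remove the shutout: nine innings come off the ledger, but no runs do.
rest_of_season = ra9(season_runs - 0, season_innings - 9)
print(round(rest_of_season, 2))  # 4.21
```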
I believe that the argument you're making is that even the small effect I found might be even smaller, which I am happy to support.
I hinted at this in the article where I pointed out that you can make a decent argument that the couple of run penalty can actually be justified. Your ace is cruising, it's a tight game, and the bullpen pitched 6 innings yesterday. It might not be a horrible idea.
It's a variation on the "punishment illusion". Lincecum went 140 pitches _because_ he was throwing a no-hitter, and it's not likely that his next game will be as good. (Where do you go from there but down?)
I had considered that. I figured that the effect is ameliorated by the fact that my baseline for performance is his average stats for that year (although as you point out, this includes his likely awesome performance, which will skew the results). If anything, if some of the decrease in performance is due to a regression to the mean bias, the small effect that I found just got smaller.
I looked only at in-zone, as best as I could. So for example, when looking at the third baseman, I looked at balls to the '5L' '5' and '56' zones (even though the last one is a "shared" zone). If the shortstop scooped a ball in the third baseman's zone, I tried different ways of handling that (penalize, non-event, success). Didn't really matter either way.
This has been attempted from time to time, although not so formally. Something of an annotated bibliography of Sabermetric findings. Part of the problem comes from the fact that for a lot of concepts, even broader concepts, there aren't multiple guys who do a lot of research on it. For example, with my effects of coaching work, when I looked around for people who had done similar work, there wasn't a lot out there.
I haven't looked at outfielders. Yet.
There is part of me that worries that there's a methodological Jamie Quirk that I didn't see. Maybe the model is over-fitted. Maybe the bias in the Project Scoresheet files (they were done by human stringers) is systematically biasing the results. I actually agonized over whether to go to press with this, because it's so counter-intuitive. But after a bunch of different runs, the results kept coming up the same way.
At the very least, I think we can dismiss (in general) the idea that a good infielder has a secondary effect on the fielders around him.
There probably is some state-dependency that goes with it. For example, I once took the Conners attention test myself at 4:00 pm after an exhausting day. It pretty much diagnosed me as having ADHD. In general, we would do this sort of test first because if a kid was well-rested and not tired from other tests, but still showed limited attention, then that was an indication that there was a problem.
Clinical in a program focused on child and adolescent. I ended up working mostly with adolescents, oddly enough. I had planned on working school age. Sometimes weird things happen in life.
Raven's was one of the things that I left on the cutting room floor as well, and you are right that the issue of culture and language is something that would need to be considered (wrong place to have that discussion here, but certainly a good issue to think about). The issues around self-efficacy would make for some interesting research though.
Attention and focus are going to be affected not just by frontal lobe activity. You are correct that one problem seen in ADHD is that people have a hard time turning on "the filter." There are a billion things going on around us at one time and we don't notice them, because we don't attend. And we can't... it would be too distracting to process all that.
But attention is also a function of whether something provides a level of stimulation that matches the cognitive resources that a person has available at the time. Something that is interesting is often high stimulation, and if baseball matched up with your frontal lobe's desire for stimulation, it probably was something that you enjoyed and that you could properly process. That's a wonderful thing.
I had a section on the D-KEFS ready to go, but chopped it for space. I'm fascinated by the idea of executive functioning assessment and what it could do for both scouting and development. If nothing else, Stroop will give you a good idea of parallel processing and response inhibition, both of which would be good to have in the middle of a game where you have multiple strategic concerns to balance.
That article was actually in the back of my head when I wrote this.
Will to actualize natural talent. W.A.N.T.
It's half a win if everything breaks perfectly. It's probably just worth a couple of runs. That's real value, but it's just not in line with how much people seem to think it matters.
In all honesty, because it was the quickest to calculate.
That's not an argument. That's an anecdote, and the plural of anecdote is not data.
Yeah, it's an old argument, and Joe has been at this much longer than I have. One of these days...
I can confirm that I'm at least 18. And then some. And yes, for some reason, my wife still puts up with me and my baseball obsession.
I do think regression to the mean is in there, and stated as much in the piece. I don't know how much of the improvement is due to regression to the mean (and I'm not immediately sure how to model that), but there may very well be more to it.
I've never watched the West Wing.
I'm a Godfather!
Doug? You in?
There are several "next big things" in Sabermetrics. But I'm convinced that one of them is moving away from the large-N data model and into understanding each player as a unique data set.
There will always be room for "Show me the numbers" and eventually, there will be numbers. But, the standard model that we've employed over the years is a number dump, followed by exasperation that people didn't read the number dump. You have to package the message. It's something that we've been ignoring too long.
The thing about volleyball positions is that while players technically have to rotate after each side out, once the ball is live, they may move where they choose. You'll often see "real" volleyball teams line up with the front line players in a little clump by the net, arranged so that each is technically to the left, center, or right of the others, but then re-arrange themselves quickly after the serve goes up so that the middle hitter and the outside hitters get into their "natural" positions, rather than the ones that are "assigned" to them. Volleyball also has much looser substitution rules than baseball (a player may enter/exit 3 times in a set), which leads to some specialization of backcourt vs. frontcourt-only players. Also, there has been a recent introduction of the "libero" position, a defensive specialist who stays in the backcourt, cannot approach the net (nor serve, if I remember correctly), and does not rotate.
Yeah, you'd pretty much need to have everyone in the pen able to go multiple innings, when necessary, and a 2-to-3-inning fireman/closer would be nice to have, but the end result of this plan is that you probably do spread out the saves. And yeah... some guys would probably carp about that.
There's a variation on this that Tom Tango uses in his book. A team employs 3 "real" starters who pitch on days 1, 2, and 4. Two other starters pitch 3 innings/50 pitches each on days 3 and 5.
If there is a(nother) great unexplored frontier in Sabermetrics, it's implementation research. Everyone knows that smoking is bad, but people still smoke. How do you convince people to put down the cigarettes, and to stop holding on to pitching wins? Want a Sabermetric Nobel prize... work on that one.
In my original draft, I had a section on needing one of the relievers to have a rubber arm (or be a knuckleballer) just to pick up the garbage time innings. It's a good idea.
There is an advantage to being able to use the tandem starter model to gain platoon advantage, if your two starters are of different handedness. Because the other team knows that it will be a lefty then a righty (or vice versa), they can't gain as much platoon advantage with their starting lineup. Maybe the ability to gain the platoon advantage that you lose in the 8th inning is canceled out by the extra platoon advantages you gain in the 5th inning. If anything, it seems that a tandem starters model is built around trying to win the game in the 5th inning through prevention rather than in the 8th inning through strategy.
The average MLB pitcher makes 2.7 million per year. So, 8 guys at that rate is 21.6 million. Looking through salary data, a lot of those guys made under a million bucks last year. (Fundraiser will be next week.) A team could probably get above average performance, maybe for savings around 10 million.
The 15 guys who I named had an xFIP of at most 3.75 in their first 50 pitches in 2012, which would put them around the top 30 of all starting pitchers. That's pretty solid performance. So, I'd invest in the offense.
As to the Astros farm system, I wonder whether every system has a bunch of guys who fall into the "good for 50 pitches, but not for 100" and are labeled failed starters, because there's no role for them.
Max, what's the stability from year to year in framing "ability"?
Even if he's trolling, it's a teachable moment. Even if Heyman doesn't take the lesson, there will be others who will read and understand. Even if he's trolling, he brings up a reasonable and important question, and it's one that needs to be addressed, and we need to honestly address it, even if it means that the answer is "We don't know yet and you are correct."
I remember listening to that episode while riding on a bus from Chicago to Cleveland coming home from a wedding. It was the first time I'd ever heard Colin's voice.
Harper (139 G/597 PA) logged more playing time than did Conigliaro (111 G/444 PA). Harper played primarily in CF, while Conigliaro played in LF. It looks like you're quoting the Baseball Reference version of WAR. You're correct that most of the difference between the two seasons comes down to Conigliaro being rated as 10 wins below average in the field and Harper as 14 above. For Harper (and for everyone from 2003 onward), BBRef uses Defensive Runs Saved from BIS (The Fielding Bible people). For pre 2003, they use a measure called Total Zone, which was invented by Sean Smith. I once created a similar measure. TotalZone uses roughly (and I mean very roughly) the same ideas I've used here.
The broader point is that Harper's claim to being the best teenager ever rests a good deal on his superior (in the eyes of the metrics) fielding performance. Conigliaro was clearly the more productive hitter. We can put more faith in the reliability of those hitting metrics than the defensive metrics. There's a decent case to be made that Conigliaro deserves a second look and that the case in favor of Harper is not so clear cut.
I am taking some small liberties with KR-21. I'm assuming that a grounder is a grounder is a grounder (and that all are of equal difficulty), mostly because in the data set I'm using, I can't tell the difference as to which grounders were soft two-bouncers right at the fielders and which were screamers headed through the middle. The way that I have the database structured, I lined up the "test questions" in chronological order. So for first basemen, "question" #1 was the first ground ball that he saw from 1993 onward that was hit in his general area. For some guys, that was an easy one, for others a near impossible ball to get. What I'm counting on is that the noise all cancels out in the wash.
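For readers who haven't met KR-21, here is a minimal sketch of the textbook formula, which treats every "question" as equally difficult (the same liberty described above). The function name, the use of population variance, and the fielder totals are all hypothetical illustrations, not the actual database or syntax behind the article.

```python
from statistics import mean, pvariance

def kr21(total_scores, n_items):
    """KR-21 reliability estimate.

    total_scores: one total per player (number of 'questions' answered
                  correctly, e.g. grounders converted into outs).
    n_items:      number of questions each player faced.
    Assumes every item has the same difficulty, which is what separates
    KR-21 from KR-20.
    """
    k = n_items
    m = mean(total_scores)
    var = pvariance(total_scores)  # population variance; an assumption here
    if k < 2 or var == 0:
        raise ValueError("need at least 2 items and some spread in scores")
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * var))

# Hypothetical totals: plays made by six fielders on their first 20 grounders.
plays_made = [9, 12, 15, 10, 17, 13]
print(round(kr21(plays_made, 20), 3))
```

Rerunning this at larger values of `n_items` would show where the estimate crosses whatever reliability threshold you care about.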
Actually... I originally ran the numbers and calculated the rate at 140 (and wrote that section while I waited for some other numbers to calculate), then realized my syntax was doubling everything and changed it accordingly. 70 is correct.
I was working as a therapist at the time, and therapists have more than their fair share of stalkers. I didn't want my patients googling me and finding my baseball work and wanting to talk about that in session. My real name has always been something of an open secret in the baseball world.
Actually, one of my favorite researchers of all time did something on long two-strike-foul battles. Weird guy though. Named himself after a kitchen utensil.
That would take a very different set of analyses to do... maybe I'll have some time tonight...
I ran a mixed-linear model, controlling for pitcher identity with an AR(1) covariance matrix. I looked at the effects of age (as a fixed linear covariate). The data covered 2003-2012.
The effect of age was non-sig (p = .55) although the trend line was pointing down, by about a ten thousandth of a point each year of age.
Interesting point. I hadn't considered that aspect of it.
The statistical tool that you were looking for in your discussion of aces is basically an independent samples t-test. You are saying that you know the mean (each pitcher's ERA) and imposing some sort of distribution and then drawing randomly from those distributions. Basically the question that you are asking is that if Verlander is really better than... me, how often would we be able to detect that in a small sampling frame (1 game). That's a question of statistical power. (For the initiated, in a t-test, it's 1-beta.)
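To put a rough number on the power question, here is a minimal sketch. It swaps in a one-sided two-sample z-test with a known, shared game-to-game standard deviation, because a t-test can't estimate variance from a single game per pitcher. The 2-run gap and the 3-run standard deviation are made-up illustration values, not measured ones.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_power(true_gap, game_sd, games_each, alpha=0.05):
    """Power (1 - beta) of a one-sided two-sample z-test.

    true_gap:   true difference in mean runs allowed per game (hypothetical).
    game_sd:    game-to-game SD of runs allowed, assumed equal for both
                pitchers and treated as known (hypothetical).
    games_each: number of games sampled from each pitcher.
    """
    std_normal = NormalDist()
    standard_error = game_sd * sqrt(2.0 / games_each)
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    return std_normal.cdf(true_gap / standard_error - z_crit)

# How often would we detect a 2-run true gap with 1, 5, or 30 games apiece?
for n in (1, 5, 30):
    print(n, round(one_sided_power(2.0, 3.0, n), 3))
```

Under these assumptions a single game gives very little power, which is the point of the comment: the gap can be real and still go undetected in a one-game sampling frame.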
From the archives: http://statspeakmvn.wordpress.com/2009/02/16/did_i_scare_you/
We do know that HBP rate stabilizes quickly. Some guys do seem to have a tendency (probably because of their stance) to get hit more often. I could run a chi-square test to see whether Greinke hit Quentin, but the sample size is far too small to be powerful enough to detect any effect.
You are correct that these are not issues confined to the US (and of course, not all players in MLB come from the US.) There are other cultural concepts (machismo, for example) which present their own challenges. I hesitate to write about them more broadly because I just don't understand them enough to do it.
That makes sense on first glance. I found that very small (10 BIP) samples don't do well as predictors, maybe because it's just too small a sampling frame to get a good read on what's going on. Maybe if we conceptualized it in a different way we'd find an effect. It's an open question at this point.
In the interests of investigating whether this theory works at the extremes, I ran some new analyses.
Using the same basic framework as I did in the original, I took the league average and the past 100 BIP and let them fight it out in the same logistic regression.
I only took cases where the last 100 BIP yielded a prediction of .280 or lower, then .275 or lower, then .270 or lower, etc. There does come a point where league BABIP is a better predictor, and it seems to happen somewhere between .270 and .265. However, it should be noted that the past 100 BIP still holds some significant sway, even as you descend even further.
Perhaps .240 is too lucky to believe, but .270 is not.
I think you're right on looking into how many of these cases where you have an "extreme" value (say, .240) there are, and how well the model would perform in these cases, but that's testable.
As to your second point, I'd love to know how this works too! If the fundamental message of what I'm trying to say here holds, then it opens up a lot of different avenues of investigation!
On your third question, I don't know that one yet.
Yeah, but you can't trust the kitchen utensil guy.
We're not talking here about how far to regress. That's a different set of analyses. We have two variables fighting it out to see who is better at predicting the outcome of the next ball in play, which is the true measure of how good a predictor is. These analyses tell you that recent history does a better job modeling the outcome of the next BIP than does league average. That right there suggests that the standard DIPS assumption that everyone is league average deep down should be treated with suspicion.
So at that moment, he is better described as a .240 BABIP rather than a .300 BABIP.
My personal favorite example is Troy Percival who had BABIPs usually in the .270 range year after year.
Figure a starter faces 25 or so batters per start, and strikes out/walks 7 or 8, then 100 BIP would be roughly 6 starts.
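The back-of-the-envelope math above can be written out as a short sketch. It keeps the same simplification as the comment (every batter who doesn't strike out or walk counts as a ball in play, ignoring home runs and hit batsmen), and the function name is just an illustration.

```python
def starts_to_reach(bip_target, batters_per_start, k_bb_per_start):
    """Approximate number of starts needed to accumulate bip_target balls in play.

    Simplification carried over from the comment: batters faced minus
    strikeouts and walks is treated as balls in play.
    """
    bip_per_start = batters_per_start - k_bb_per_start
    if bip_per_start <= 0:
        raise ValueError("no balls in play per start under these inputs")
    return bip_target / bip_per_start

# 25 batters per start, 7 to 8 of them striking out or walking.
for k_bb in (7, 7.5, 8):
    print(k_bb, round(starts_to_reach(100, 25, k_bb), 2))
```

All three inputs land between five and a half and six starts, which is where "roughly 6 starts" comes from.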
Good point. My original thought was that the recency of 10 BIP would have a strong amount of predictive power, but it seems that it's just too small/noisy a sample size to really get a read on what the pitcher is up to. But it turns out that when you pull back a little bit, the sample size is big enough to provide at least some clarity. Predicting the next ball in play may very well be a function of how the pitcher is feeling that day, but the previous 10 BIP just doesn't give us enough of a read on that to tell how he's doing.
On the third point, if BABIP is far from the league norm over the last 100 BIP (say it's .240), then from a variance explained point of view, the recent personal history of the pitcher is more important than the league average. However, understand that the recent personal history of the pitcher is not a static number.
By definition, you can't predict randomness. If BABIP is random, why am I able to find a predictor?
In #2, the idea is that I created indicators of xBABIP based on where balls off the pitcher were hit (and how many outs a league average team should have recorded based on that), so that the pitcher wasn't penalized for having a bad defense (or credited for a good one). Then, I took the defense's BABIP for when that pitcher wasn't around.
(Natural) log of the odds ratio is just a statistical trick that I used because I used a lot of logit regression. It has to do with raw percentages not being normally distributed, and using LOR corrects for that. Also, when logit does its actual modeling, it spits out a function that gives you the LOR of the probability that you want to model.
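As a minimal sketch of the transformation being described: the natural log of the odds, p / (1 - p), is the scale that logit regression models, and it can be converted back to a probability afterward. The function names here are illustrative only.

```python
import math

def log_odds(p):
    """Natural log of the odds p / (1 - p); defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def from_log_odds(x):
    """Inverse transform: turn a log-odds value back into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# A .300 BABIP on the log-odds scale, and back again.
x = log_odds(0.300)
print(round(x, 4), round(from_log_odds(x), 3))
```

A probability of .500 maps to exactly 0, and values below .500 come out negative, so the scale is unbounded in both directions rather than squeezed between 0 and 1.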
In #2, the idea was to see how well these predictors performed relative to each other from the point of view of variance explained (as much as logit lets you do that.) Was it the pitcher's general talent in steering the ball toward a fielder? Was it the sparkling defense? Was it the batter steering the ball himself?
Call me if you need a tutor.
*clips for arbitration hearing file*
One of these days, I wanna have a conversation with all of you about makeup.
Only 5 players all time have logged 3 or more seasons with 50 doubles. Tris Speaker, Stan Musial, Paul Waner, Albert Pujols, and Brian Roberts.
He's only on pace for 324 this year. At that pace, it will actually be 2015 before he tops 763.
I wish my name was either Ben Lindbergh or Jon Shepherd so I could lie to people and say I wrote this.
For what it's worth, I've never met Jack Morris. He might actually be a really nice man.
I can live with that critique. There's a lot of upside for those guys, but everyone's got enough upside to win the 2016 World Series... if everything goes right... just ask them.
Y'know, that's not a bad re-frame. It's a lot to bill to the marketing department, but... whatever gets you through the day.
You are hardly the first. This has been an idea that has kicked around for a while.
Hooray for Stetson Allie! He went to my high school.
Paul McCartney always told me that name-dropping isn't polite. ;-)
The biggest issue here is that you're trying to mix something that naturally has six beats into a rhythm that has 5. If you're figuring that a tandem takes care of the fifth starter role, then really they'd only be available for relief duties for the third starter. (#1 and #2 would be their rest days) Even then, a relief appearance might only leave them one day to recover, and they couldn't cover for #4 because they have to pitch the next day. In some sense, what you lose is the roster spot.
Alternately, you could run the experiment as 1/2/tandem/3/4/tandem, but that's essentially pushing the four "real" starters into a six-day loop, rather than a 5-day. Especially if one of them is really good, you want him to pitch more often, rather than less.
Maybe a team might try this in September when they are out of contention and the rosters are expanded.
On the topic, an interesting article from 40 years ago in SI on Earnshaw Cook and the ideas in his book.
Actually, you see a few of these guys pop up every post-season... usually the team's fifth starter. Tommy Hunter also seemed to be reprising a similar role.
It does happen in lower level leagues, but that's with a view to slowly building guys up to go 7 innings.
Or guys who have been injured aren't able to keep it up through 6 innings. Injured pitchers do come at a discount...
Tom's model was to have three "normal" starters who would pitch on days 1, 2, and 4. On days 3 and 5, a tandem starting team would take the hill. I've played around with that model a bit. It does have the problem that your normal starters need to be decent... and that negates the cost-savings aspect of my plan. But certainly his way has its merits.
I guess if there's one message that I could get across it's that it doesn't take a huge effect size of chemistry for it to be meaningful. And it isn't that huge a jump to assume that it might have a small effect. I doubt it turns scrubs into stars, but we don't need it to do that.
Maybe Inge really is a nice guy and helps people out.
I hold out hope that it can be quantified. And used to great effect. But right now, hope is all I have on the matter...
Would you believe me if I told you that this very discussion got left on the proverbial cutting room floor? I was going to add it in after the part where I hypothesized that Inge might have reached some players but not others.
In the professional literature around psychotherapy, the #1 predictor of improvement is not the experience or skill of the therapist in implementing therapy, but the extent to which the client feels a strong connection/bond/alliance with the therapist, and as you point out, people respond to different personalities and characteristics. When I worked as a therapist, I experienced this firsthand. Some people responded well to me, some did not. This is why I wish we had BFF F/x...
As to the issue of younger players being more affected, I briefly considered restricting the age range to something like 23-26 (for that very reason), but it dropped the sample sizes dangerously low.
This is what happens when you read "Tigers" and "SEC Championship" in a sentence but don't read the rest too carefully...
Nathan, I do almost all of my own syntax writing and analysis. I have a small library of code fragments that I can just pull out for common tasks. Saves a lot of time.
The thought occurred to me to do that study too. It's somewhere in the queue. I promise.
As my wife points out when I dip into my 90s nostalgia, all of that stuff is now on the oldies station...
I not only computed the effect on K rate, but also, walks, HBP, singles, 2b/3b (put together), HR, and outs on balls in play. Those are the 7 basic outcomes of a plate appearance. There weren't any effects to speak of.
The reason I was even at Moscow State was that my wife was presenting at an academic conference. Everyone else was dressed for a conference. I was... not.
Actually, that's just me everyday.
Everyone became more likely to call for a steal after a CS. All of them. But, there were some who showed less proclivity to fall into this pattern than others.
The model looks for changes in SB attempt rates. Perhaps it's picking up on the fact that those gentlemen run no matter what, whereas the others only run when challenged by failure.
They paid market rate for a decent pitcher. It wasn't a great move, but it was understandable.
Ideally, yes, but those data aren't easily available, and this is a case of "direction before precision"
2012 was not a validation year. I looked at all pitchers (who met the criteria) from 2006-2012. Sample size was a few hundred player-seasons. In theory, I probably should have used some sort of GEE to account for using the same players across multiple years.
They were in the list of factors that were available to the model to select stepwise.
It's not that Jose Reyes is a malcontent. It's that if the Blue Jays' moves don't work, people will find bad things to say about the players that they brought in.
Your reading is correct. And it is scary.
The reason that I didn't do velocity measures was that I didn't have time to merge in the Pf/x database. It's worth a look.
Oddly enough, I put in BMI, but not height... And I have height right there ready to go
Bad researcher. No Hot Pocket for you.
The marginal risk of pitch #2000 is different than the risk of pitch #3000, if ever so slightly. It then becomes a risk-tolerance question. You can push Smith for another inning and he may be better suited to handle it than anyone and you might need this inning, and maybe you don't blow out his arm. But maybe that's the sinker that breaks Smith's elbow.
I think you're right on. I don't know if we can pull apart the chicken from the egg on that one.
I did look at many of those issues, but didn't report due to space. I entered interaction variables (e.g., number of pitches x age) into the regression. Being older added injury risk, as might be expected. As to whether power/finesse pitchers are more likely to be injured, those were mostly secondary predictors when they were significant. But then, my model was a first pass at this sort of thing. Must. Improve. Regression. Equation.
It's all regression-based, so you have to interpret that as "holding everything else constant, another inning actually predicts a lessened chance of injury."
A lot of it comes down to pitch efficiency. The big message is that it's the number of pitches, not the number of innings that you rack up.
I could see that working. My reach for social net was based on the fact that all proclamations of the amazing chemistry in the clubhouse start with "we're all such good friends" or some other non-sense like that.
I took it once and actually split three of the four down the middle.
Rule #3: Never mention the Myers-Briggs around me.
This is why I wrote this article. I didn't even think of that.
A billion points!
Of course it exists. The question is whether it matters on the field.
If you measure things longitudinally across the season (and across several seasons) you could get a better idea of this.
There are, but since "chemistry" is a team-level term, I'd need at least a good response rate within each team, and since there are only 30 teams, I'd need a good chunk of them to have any sample size to look at team-level effects. So, I'd need a response rate that was pretty high.
Pat, all of those were issues that I left on the proverbial cutting room floor. Per #1, my guess is that there is probably more downside than upside, but I hold out hope that some innovative person could figure out a way to optimize the upside.
Per #2, I briefly considered comparing a clubhouse to a junior high classroom and all the drama that could ensue. There probably are players who love the drama, but that's not a sign of a healthy group in general. Friendly competition is one thing, but "No, I won't tell you the thing that I noticed about the opposing pitcher because I'm mad at you" doesn't help anyone.
Per #3, I focused only on the quantifiable for this article, but there's a fantastic opportunity within Sabermetrics for people who have qualitative and textual analysis skills. I wish there were more of those people out there.
1. I've never seen the switching positions study done. Fits into the "reasonable hypothesis" category. It might even be true.
2. I would debunk my own mother if she were wrong. (Mom, if you are reading this, you've never been wrong).
3 & 4. Good questions to which I have no answers. There's so much more to figure out.
The biggest problem with Verducci's formulation is that it is too broad. We need a more fine grained look. Another for the queue.
I do worry about that. Our injury database actually lists injuries that took place (or were discovered/reported) during training camp. I might be able to look specifically at this.
That's a reasonable thought. Some might have more injury risk after changing mechanics (and I could see some being better off for the change both short and long term). If anything, it adds to the confusion of the Verducci sample.
To break that into stat terms, we over-estimate the variance that our models explain. There are many factors for which we don't (and in some cases, can't) assess.
This is the sort of line of work that would have to start with a "direction before precision study." Age and spread of age are easy to figure (mean and SD), and that's a good starting place.
This was something of a strategic methodological choice on my part. The biggest danger that I saw was random extreme variations (i.e., career years) distorting year-to-year improvement stats, so I only controlled for year-1 raw OBP. Your approach is perfectly defensible. I just went a slightly different direction.
A reasonable hypothesis.
Isn't there a game theory issue here? If I know for sure that the other manager won't pitch out, then I can feel a little more comfy at first base as I plan my mad dash to second and my manager can call for the SB at will. You have to do it once in a while if for no other reason than to keep them honest.
Sam Miller, what are you drinking?
This is a reasonable theory. I do wonder how much "last chance" vote Morris got/will get. Frankly, that sort of thing bugs me, but it's a real phenomenon.
Clemens/Bonds supporters voted for an average of 6.6 names other than C&B. Their overall total was 8.6 names.
Yeah, but 1997 was fun.
Taubensee and Willie Blair. That's OK, I think the Indians did OK in that deal. They got Dave Rhode too!
Completely true story: I have never seen the movie Major League. My parents had a "No R-rated movies" policy, and by the time I was old enough, it seemed rather campy and by that point, I wasn't much into TV or movies in general.
John Wayne Airport is a truck stop that just happens to serve airplanes.
Sorta. Sports psychologists might have training in neuropsych assessment, but I'm thinking of this being something that's talked about within an organization. If a GM ever comes out and uses the term "executive functioning" or "frontal lobe" correctly in a sentence, I will declare his team to be my new favorite forever and ever.
Hooray, after 2 years of begging, it finally happens!
Or would they want to take advantage of their park?
There'd be plenty of temptation to defect though. If I'm the only guy in the league who has the DH, I can not only have an offensive advantage (more attendance?), but sign a guy who will be very helpful 81 games a year while the rest of the league has no such guy, and since I'm the only buyer, I have monopsony power.
I should what...?
Hooray, I got podcast lovin'
Not necessarily. Kevin estimated that he'd be missing 12 things. It appears that there's one extra thing that he's getting half right.
If MLBTR/Rosenthal/Morosi/etc. get 8% of what's going on, then for every one thing that they have, there would be 11.5 other things going on.
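The arithmetic behind the 11.5 figure, as a tiny sketch (the function name is made up for illustration): if the reported share is 8% of everything going on, the unreported share is 92%, and the ratio of the two is the number of unseen things per seen thing.

```python
def unseen_per_seen(coverage):
    """Number of unreported items for each reported item, given the
    fraction of all activity that gets reported (0 < coverage <= 1)."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    return (1.0 - coverage) / coverage

print(round(unseen_per_seen(0.08), 1))  # 0.92 / 0.08 -> 11.5
```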
Bad closers usually become 7th inning guys. Think of how many former closers who are now just replacement level relievers are around because "he had a 30 save season a few years ago and maybe he still has something in the tank."
But he went to my high school...
Whither my fellow alumnus, Stetson Allie? Can he make it at 3B? (We actually did go to the same high school.)
Yeah, the agents would probably hate it.
It would have to be a stipulation that came with declaring for the draft. Your rights are held by the drafting team and your options are sign with that team and accept the bonus or go home. The tradeoff is that your bonus is essentially being bid up to the highest bidder. The bonus system right now is determined by a bunch of games of chicken being played between draftees and teams. I'm suggesting a more rational market approach.
They would have a larger pool of cash so would be able to outbid (in theory) the better teams. And I'd contend that the current system forces them into a high-risk rebuilding strategy, whether they like it or not. They may choose to pursue it in this system, but they may go with another method.
Proven winner. Was on the 1995 Braves World Championship team.
I'd envision it as pick 1 gets auctioned, the player is selected, and then pick 2 is auctioned. Lather, rinse, repeat.
That's some high praise. Thank you. Could you say that louder and make sure that my boss is in earshot?
Googling is cheating...
This is officially required reading for anyone who considers her/himself to be a Sabermetrician.
You could at least adjust for "Well, of guys who were age-23, the average improvement made was X. Smith improved Y."
Yes. Lots of upside. Potential 7 on the snark.
This falls into the "hitting is the only way to help a team" fallacy. In fact, if the award were best hitter in general, it would (and should) go to Miguel Cabrera. But, a good chunk of Trout's value was in his baserunning and defense. During the months of August and September, when you look at Trout's contributions to the Angels including these other areas, he actually outperformed Cabrera in the last month of the season.
As to not making the playoffs, had Detroit been located on the West Coast and Anaheim in the Midwest, then the Angels would have won the AL Central.
Tip that Royals/Brewers/Angels hat to Mr. Greinke. Actually, no matter what hat you're wearing, tip it to Zack.
I don't know anything about what his potential suitors have in mind to help him. My hope is that it's "a lot".
In Texas, he had an accountability partner who traveled with him. I'm guessing that he had friends on the team who helped him out (as a good friend should.)
There are data on throws to first going back (I think) to 1993, and a throw to first does decrease SB success rates.
Are teams throwing over to first less often?
No one pointed out the biggest need of the off-season. The Astros don't have a DH.
This is brilliant.
Ding ding ding. A billion points for you!
The big unanswered question is whether that "counts" as a defined role. For a long time, roles have been based mostly on inning (aside from LOOGYs). Is it enough to say, "When you see things getting crazy, start warming up"? I think that's an open question.
And in response to a question posed by a particularly handsome young man, using the term "replacement level" in a sentence.
I always budget things high, and I put in there a little bit about how a team could probably negotiate a pretty good price given the size of that sort of contract. If the price falls, then the amount of gain that would have to come for it to make sense goes down too!
Yeah, there will be a lot of deadweight loss in that a lot of food will go to guys who will never see a major league uni. But how do you do it otherwise and not cause a knife fight? Even with the extra expense, it's still cost effective and maybe you strike gold with a guy who just needed a good sandwich.
That's massively exaggerated. A player may have limited English, but that doesn't mean he doesn't have survival skills. One of the first things that he will likely do is to figure out either where he can get food and order in his native language or he will prioritize words related to food in what he looks up or he'll ask a friend on the team.
I agree on the issue of whether or not kids... erm, prospects will eat their veggies. Plus, you can't stop them from stopping off and grabbing a not-so-healthy bite to eat after the game. My argument would be not that this will solve all problems, but that it is better than the current situation.
As to the salad, the problem is that a salad is offered by restaurants as a low-calorie option. Baseball players need their calories, but unfortunately, the only thing that can provide them with those calories comes in the form of high-fat food. The body needs protein and complete carbs as well.
There are teams who have "looked at" this issue. You mentioned the Cardinals and MLB ran this on them a while back: http://stlouis.cardinals.mlb.com/news/article.jsp?ymd=20100223&content_id=8127110&vkey=news_mlb&fext=.jsp&c_id=mlb
A few other teams have nutritional consultants who work with the players and some teams do provide some food. But even then, you're talking mostly about giving out good advice to players rather than proactively setting food in front of them. Information alone does not solve public health problems. I'm saying that a team who went fully into a comprehensive program, even if it were expensive, would net some games from it.
Maybe Cespedes (Cuba) and Darvish (Japan) get points for coming to another country and putting up good numbers. That's about all I got.
And no, New Jersey is not another country.
Not sure... I'd have to look.
As to the second question: there have been roughly 100k plate appearances in post-season history, and about a third of them are concentrated in the wild card era. From a methodological point of view, bigger sample sizes are always appreciated. If all that we had was the pre-divisional World Series only model, that would make things a little tougher for me as a researcher.
I'm personally not much of a purist. I've found most of the purist critiques of the expanded playoffs are aesthetic in nature. I have opinions on the matter (I like the expanded playoffs), but as far as arguing whether it makes the game more beautiful, that's a very subjective call.
I don't have a grand unifying theory of the post-season yet. I wish I did. I'd be rich. But, I think that we don't take into account that players deal with different types of pressures in the post-season and that these can have real effects. What exactly those effects are needs to be properly studied and not outsourced to cliches, but I think we ignore them at our peril.
There are very few teams who aren't Saber-friendly any more, or at least Saber-aware, including teams that would make you say "Really? Them?"
Ack, my bad. I had Berkman and Holliday switched in my head.
Cano's September/October OPS was .999. Below that of Cabrera to be sure, but not Neifi Perez-esque either. I wonder what their defensive and baserunning contributions were during those months.
I'm personally willing to entertain the hypothesis that performance in crunch time is harder (and more valuable?). But, I'd also suggest that we need to be careful of recency bias. Part of the reason that September is so salient to the minds of voters (whether official or not) is that it takes place right before the voting and is easier to remember than May. Then there's the flip side of the "crunch time" argument: had one of these guys gone nuts in May, his team could have stored up wins, clinched earlier and coasted through September. It makes for a less interesting story, but an equally effective way to win a division.
If you're going to apply the logic that clubhouse presence counts, you have to apply it fairly. You can't simply assume that Cabrera (or Trout) "made their team better" while ignoring the fact that the other might have done so as well.
Jay just likes to swear.
To his credit, I would have put Miggy 4th. In general, I'd personally rather see the MVP be a position player only award, but since pitchers are eligible, you gotta give props to Verlander who's having a fantastic year and is more valuable to the Tigers than is Cabrera. Cano's raw numbers aren't as pretty as Cabrera's, but he plays an above average second base compared to Cabrera's below average 3B. Plus, positional adjustments matter. And if you want narrative, Cano's been the only Yankee who hasn't been hurt this year... and the Yankees are going to the playoffs.
The reasons that there are several different versions of WARP come down to different methodologies that are used to determine the run value of various events. There are also different philosophies on where to set replacement level. They're all roughly the same from 30,000 feet, which is why you don't see a guy with 10 wins on one measure and 2 on another, but when you get down into the gritty details, they do differ.
I guess it would have to be a 3-team tiebreaker running concurrently with a 2-team tiebreaker and the winners would play each other. Even more fun if it was a 5-way tie for the second wild card. Elimination baseball!
The standard cutoff points seem to fit, but I think interpreting it as a validation of the standard peak model falls short. The last sentence that you write is the one that gives me the most trouble.
A nice, smooth, uniform, upward curve/line would have a high correlation coefficient running through it year-to-year. Straight lines have a correlation of 1.0, after all. These numbers tell me that we need to view 24-26 as more of a chaotic, malleable period. Some will take bigger jumps than others. Some will fall. At 26 though, the chaos stabilizes. At 29, some start to decline, while others hold.
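To illustrate the point about uniform curves and correlation, here is a minimal sketch. The `pearson` helper and all of the OBP values are made up for illustration; nothing here is the data behind the comment. If every player moves by the same amount from one season to the next, the year-to-year correlation is 1.0, and it drops once players move in different directions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical OBP for four players at age 24 (illustrative values only).
age_24 = [0.300, 0.320, 0.340, 0.360]

# Everyone improves by the same ten points: a uniform, smooth curve.
uniform_25 = [x + 0.010 for x in age_24]

# Some jump, some fall: a chaotic period.
chaotic_25 = [0.350, 0.310, 0.365, 0.330]

r_uniform = pearson(age_24, uniform_25)
r_chaotic = pearson(age_24, chaotic_25)
print(r_uniform, r_chaotic)
```

A low year-to-year correlation in an age band is therefore a sign that players are not all riding the same curve through it.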
I think that we need to get away from the assumption that everyone follows the same curve (up, plateau, down) and embrace player development for the much more chaotic process that it is.
Good. I am Chuck Norris. This is what I do when I'm not spin-kicking people.
Trout actually leads in WPA. And Cabrera has been the 4th most un-clutch player in the AL.
I look forward to the day where even in sports, we look at someone and say, "Is s/he bringing more love and kindness into the world?" and if the answer is yes, to say "That's awesome!"
Unfortunately, the full answer involves brain dissection...
I think a good way to see whether it's a habit or conscious process is to see whether it persists in situations in which the behavior is clearly not beneficial.
I should have been better about saying that explicitly... Win some, lose some... (and when you lose some, you're winsome)
Indeed he did. My point in this series isn't to deconstruct the players that I highlight themselves. My goal is to illuminate issues that I saw come up in Jason's series. There are probably a couple dozen guys who would benefit from a more aggressive approach. Wil Myers was simply a vehicle to bring that idea to the forefront.
MLB specifically ruled on this one a few years ago that in the event of a "mixed" three way tie where you are tied for both the division and (at the time the only) Wild Card, being in game #163 for the division will not knock you out of the Wild Card tie.
It's actually not random who gets the bye. There's a set of tie-breakers to determine seeding (starts with head-to-head). The top seed gets to pick whether they want to play two home games (beat both #2 and #3 at home) to advance or one road game (make #2 and #3 play, and then go to the winner's stadium for a winner-take-all game).
My point there is that at 11-0, the math is pretty obvious and the logical thing to do is to punt the rest of the game and live to see tomorrow. But let's say that you do some math the day before and realize that the best idea is to punt the whole game. In both cases, you're giving up. It's just a matter of when. Would the Yankees or Orioles or whatever team did it be hailed as wise or fools?
I would gladly admit that the circumstances that would need to come about are a little far-fetched. But let's say that the Yankees would start CC while the A's, due to having to sprint through the last week of the season, would only have some #4 guy available. And let's give the O's Justin Verlander.
I meant this piece to be more theoretical than anything. My ultimate thought is that there exists some set of not-impossible circumstances in which punting makes logical sense.
Why yes they did...
Could have sworn I've seen that title somewhere.
Also, as an Indians fan in the 1980s, thou shalt not diss Brook Jacoby as a mere throw-in.
'91 was the season that the Indians moved the fences back in an attempt to build the park around... Alex Cole (true story). And I may or may not have seen a couple of those home-runs from the good seats at Cleveland Muni despite having an upper-deck ticket.
I won't tell...
Welcome to BP, Cory. I may end up writing about negative self-talk, which I would generally (although not completely) lump under depression. I don't practice therapy any more, but there are a few techniques that can be used around negative self-talk. Most of them follow the Stuart Smalley approach.
Ben, I'm going to use this in my arb hearing this winter.
We'll talk about anger control soon enough.
I really am 5'11". And married.
And we need to do another StatSpeak reunion.
Those data could be parsed...
I appreciate the desire to be precise in terminology. Here, I use "racism" very broadly, and I can appreciate that, for some, too broadly.
I recognize that attributing the results to a bias is an assumption, but it's at least a reasonable one. The methodology provides direction, but not precision. Still, it's the best that we have right now.
Yeah, but announcers gonna announce.
That was actually brought up in some of the original critiques to the Atlantic article. I suppose that work ethic can transcend the language barrier (you can watch a guy work out/take extra grounders/etc.) But there is something to be said for this. Suppose you have broadcasters who don't speak Spanish and miss a player who does A LOT to lead in the clubhouse... because they don't understand what he's saying. In this data set, there's no way to pull that apart.
Thanks. I giggled when I wrote that.
Yeah, that's another problem with qualitative analysis. Because there's no standard definition of what you're looking at, there's all sorts of room for bias in creating that definition (and then implementing that code.) There are ways of increasing rigor (inter-rater reliability, which the researchers do not report on...) but it's always lurking in the background as a research methodology problem.
This is the sort of research that isn't fun to sit with. There are a lot of what-ifs. I'm also convinced that learning to live a bit more with that uncertainty would push the field of Sabermetrics further.
Awww, I got podcast lovin'!
A team with 2 or 3 good relievers wouldn't be able to hang around in a 19 inning game, unless the other team's pen is just as bad. Perhaps they are lucky enough to be involved in games that don't call for that. Perhaps not.
I believe that you misunderstand what I'm saying there. We agree that baseball is a game with a lot of randomness in it. At least the way that I'm measuring things, there doesn't seem to be a lot of repeatable skill on the team level as to what happens in a one-run game. In the same way that the playoffs are a crapshoot (because of the shortened timeframe, which breeds more noise), an extra inning game is really a 1-inning game... on the team level, that breeds more noise.
Also, 1000 bonus points for using antepenultimate correctly.
I recommend Sam Miller's article on the subject (from today). Bullpen performance is beset by all sorts of small sample size problems and luck.
It's true that the Orioles have played a lot of extra inning games, which are best won by good bullpen performance. To put it less charitably, the Orioles have been fortunate to be in a bunch of the type of one-run games that have called for a good bullpen and fortunate that their relievers all seem to have four-leaf clovers in their back pockets this year.
The model only sees games that end in one-run. Suppose that the O's go into the 7th inning leading by one, and pitch 3 shutout innings. The game ends with a one-run O's win. Conclusion: Good bullpen performance, but why didn't the offense make it a 3 run game?
Would a better variable be time elapsed? Figure that all pitchers stop hitting once they leave high school (age 18). Take the pitcher's age minus 18. I can't imagine that it's pitching extra innings that somehow eats away at the neurons that encode the muscle memory for hitting, but rather the simple passage of time and their not being renewed/tended to.
Thanks Jay, I will. ;)
Is this not just an engineering problem though? If we could get the robots to process fast enough, then the calls could be instant.
That's one of the next things in my queue.
Direction, then precision.
How could you not include Stetson Allie? He totally went to the same high school that I did!
I'm happy to take requests.
Hooray! Someone caught it!
I'll talk about that in next week's article. The short version is that you can be fairly certain that a player's GB rate over those 80 PA was reflective of his true talent _over that period of time_. But true talent unto itself can change and does so more rapidly than we'd like to think.
Guy's a no-talent hack if you ask me.
The standard way that this has been done is that at r = .7, you take 70% performance and 30% league. As to how to weight performance vs. projection, that's one that I've never really looked at.
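To make the 70/30 rule concrete, here is a minimal sketch. The function name `regress_to_mean` and the sample numbers (a .400 observed OBP against a .330 league average) are made up for illustration; only the weighting itself comes from the comment above.

```python
def regress_to_mean(observed, league_mean, reliability):
    """Blend an observed rate with the league mean.

    reliability is the correlation r at the sample size in question:
    the observed rate gets weight r, the league mean gets weight 1 - r.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return reliability * observed + (1.0 - reliability) * league_mean


# Hypothetical inputs: .400 observed OBP, .330 league OBP, r = .7
estimate = regress_to_mean(0.400, 0.330, 0.7)
print(round(estimate, 3))
```

With r = .7 the estimate sits 70% of the way from the league mean toward the observed figure; at r = 0 it collapses to the league mean, and at r = 1 it is the observed figure itself.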
I use the statistical program SPSS for my work, which can be publicly bought (although it is expensive). The good news is that any spreadsheet style program can handle Retrosheet data.
I had dreams of doing these initially... I'll see if I can fire up my PFX data base later.
A few thoughts come to mind:
1) Players wear their own team's uniform, suggesting that the game is supposed to be a collection of individuals. If baseball wants the idea that teams should take pride in themselves, perhaps they should wear those batting practice jerseys that they do at the Home Run Derby?
2) On snubs, people will point out the big ones (Michael Bourn), but what percentage of the roster would get say 75% approval that "yeah, he belongs there." 90%?
I think I've actually seen this done elsewhere. I just can't remember where. Confidence intervals are woefully under-used in baseball... and life.
Richard is right. There's a big difference between "It doesn't exist" and "I haven't found a good way to measure it yet... maybe it's because it doesn't really exist and maybe because I'm bad at math."
It never does. Don't worry, there's plenty more stuff to write about.
My wife and I just bought a house. I own my own basement!
There are some topics that frankly I will be silently avoiding. I certainly can't duplicate things I did internally, and I probably will tread lightly in other areas. With that said, there are plenty of areas that I think are interesting (and I hope y'all do too!) which don't conflict with my NDA.
I can't talk about specific data sources, but I will say I believe that there is still A LOT that can be learned from good ole Pitch FX and Retrosheet.
Question everything. Constantly. If we are to be scientists of baseball, constant questioning is the only way that science works.
As you might imagine, there are a few projects that I've had to shelve over the past 2 years. There will be fun to be had. Stay tuned, and watch out for the gory mathematical details.
He's not a terrible man. He's a man who views the world of baseball through a different lens than I do. There are many paths to knowledge. I am but a traveler on one of them.
I'd love to say that there's a juicy story, because it would be so much more interesting, but there's not. We did a bunch of projects, we got to the end of one, and it didn't make sense to start another one. I have no hard feelings and I genuinely wish them well.
You might appreciate that I'll have to decline to talk about the specifics. My work on the inside is covered by a non-disclosure agreement.
But, may I point out a similar process that has taken place in public. DIPS has gone from "there is no difference between pitchers in preventing base hits on balls in play" to "Well... sorta... but it's a lot more complicated than that." (Shout out to Mike Fast, among others, for a great deal of work on that.)
Awww thanks. Good to be back.
There probably is. Those are Retrosheet data 2003-2006, and Colin Wyers has previously (and elegantly) pointed out the flaws in RS batted ball data. Also, that's LD/PA, not per BIP. That probably makes it look a little more stable.
Thanks for the "gory details" nod.
One minor edit. In my original piece, I looked for where the R passed .70, which is (roughly) an R^2 of 50%. At that point, a projection of talent incorporating regression to the mean would be 70% performance and 30% league average (or whatever mean you want to use).
I swear, I'm alive and well, Ben!
@abbiet: I am alive and well, I promise. I am still doing baseball work, although for people who prefer that I keep my findings to myself (and them...) I miss writing for BPro something awful though. As you can infer from Eric's shout out, I still keep in touch with what's going on, but for now, it will have to be behind the curtain.
The Cubs haven't won a World Series because... they're the Cubs. Actually, I think that the day games would work in the favor of the Cubs. They can get used to playing day ball. The other teams have to adjust. I don't think that we would have the sample size necessary to look into this hypothesis, but it's an interesting question, no doubt.
Am I now the official "sample size guy?"
You could screen for sleep disorders with a sleep study, but mostly, this is behavioral, such as picking a good bed time and sticking to it.
The outcomes on the ground balls include some errors. When I say "single, runner to second" that might just be a muffed grounder that went for an error. Either way, it doesn't matter. The point is that the end state was 1st and 2nd, no out. That's what I want to model.
Umpires aren't in there. There are a number of factors that in theory could be added, but then you overload the regression.
There are certain players who just seem to get beaned a lot. It's a relatively stable stat year to year for hitters. It probably has to do with standing close to the plate. Or being a jerk.
@Brian24, it's not that the recognition pathways aren't there, it's that they aren't strongly connected to then reacting as a lefty. He's seen a pitcher throw lefty, he's hit left-handed, but he's never done both at the same time. That's the issue.
I don't personally. But you could perhaps see how someone might mistake him for a good long term investment.
The thought process would be that while Howard is a drop from Pujols, if the Cards aren't confident that they can sign Pujols long-term, then the decision is between Howard and some other random first baseman. It does assume that the Cards are taking a longer view, at least in the financial aspect of things. This is probably where the deal falls apart.
I would argue that the booing of Favre had more to do with "You can't leave us! You broke our hearts!" rather than "Your departure has made our team worse." The focus is still on the relationship between the individual and the fans. I don't know much about football, but if Favre's departure made the Packers a better team, then the proper response would be to cheer him to thank him for leaving.
You're right that it's much easier to pick out an individual from a cognitive load perspective (it's easier to focus on an individual rather than do all that messy work), but I'd argue that it's deeper than that. Hence, my example on the name the greatest teams vs. name the greatest players. There's no philosophical reason why teams shouldn't be as available. It's a matter of what we're taught to focus on.
Is Paul Zuvella still playing?
It was an ESPN piece. They ask that we keep it short and sweet.
Well, that assumes that most batters are up there taking on the first pitch. That's not a bad assumption, as it seems rather common. But, if you have a batter who swings a lot on the first pitch, it might be a good idea to throw a pitch outside and hope he chases it.
Here, we have to make a distinction between getting strike one (always a good thing) and how you get the strike. Should you throw something down the middle, try to paint the corner, or throw something that you hope the batter will chase?
The permutations on this are maddening. Understanding the inner workings of this matchup is the next big task of Sabermetrics.
I'm not so sure that Markov would work so much as a repeated interaction Nash-type game. Surely, a batter's expectations are going to be influenced by the previous few pitches, because he intuitively knows that a pitcher is not truly randomizing his pitches and he might have an idea of the pattern that the pitcher is using.
When I originally conceived of this article, I thought about going there, but to be honest, I ran out of time this week... It's rather involved to run the numbers. I might at some point in the future though.
This "intro" of which you speak would probably end up being a book.
My father put it a little less elegantly when he said that he knew enough French to either get lucky or slapped, but that his face was red a lot.
The reason that I ignored it was that I couldn't quite figure out which correlation coefficient was being referred to... If you mean the stats on the regression itself, "combined talent" had a Wald value (how you measure factors in binary logit) in the 2000's. Recent OBP had one around 25. Both are significant, but talent is much much much better, and when you look at the actual effect for recent OBP, the effect was seen at the fifth decimal place.
I suppose that enjoyment is in the eye of the beholder. Here, I'm much more concerned with winning the game rather than enjoying the game. It's not the way that everyone wants to experience a game, and I can respect that. But that's the goal here. Take it as you will.
I have done quite a bit of this research. It really depends on the stat in question. Some stats stabilize after 50 PA. Some never do over the course of a season.
I had originally slotted a paragraph in here for that very thought, but then cut it for space. I often hear both on the radio/TV. I think that the hot streak is a little more seductive, based on the fact that it's recent and is likely to have won a team a game (and we like to single people out as "having won the game.")
You're a little off on your methodology. The dependent variable is "got on base in this at bat." It's either yes or no. The predictor variables are his OBP over the last 10/25/100 PA, as well as the control variable of pitcher/batter quality. So, I'm looking at his recent performance as a continuous variable.
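As a rough sketch of how the "recent OBP" predictor could be built for each plate appearance: the helper name `recent_obp` and the toy outcome list are hypothetical, and treating every plate appearance as counting toward OBP is a simplifying assumption, not a description of the actual study.

```python
def recent_obp(outcomes, window):
    """Recent OBP entering each plate appearance.

    outcomes: list of 1 (reached base) or 0 (did not), in chronological order.
    Returns one value per PA: the on-base rate over the `window` PAs
    immediately before it, or None when there is not enough history yet.
    """
    result = []
    for i in range(len(outcomes)):
        if i < window:
            result.append(None)
        else:
            prior = outcomes[i - window:i]
            result.append(sum(prior) / window)
    return result


# Toy data: eight plate appearances, with a 4-PA window for brevity.
outcomes = [1, 0, 0, 1, 1, 0, 1, 0]
predictor = recent_obp(outcomes, 4)
print(predictor)
```

Each non-None entry pairs with the yes/no result of that same plate appearance, so the predictor stays continuous while the outcome stays dichotomous. The same function would be called with windows of 10, 25, and 100.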
It's an interesting suggestion to break it down a little bit and look at the cold players (say, recent OBP below .200) vs. the hot players (recent OBP over .500). I might.
It would need a slightly different methodology, but no reason that the study couldn't be done... hmmm.... *begins scribbling equations furiously*
I should post that above my desk as a mission statement.
For the reasons that OBP is really easy to calculate (work smart, not hard!) and that I needed a dichotomous variable for my outcome. BABIP is a big deal in pitcher's OBP against, although in a giant sample of 10 years, that can wash out a bit. If it helps you to know, the combined batter/pitcher OBP factor was the one that was unbelievably significant.
I can't control for injuries, and yes that is a confound. It's the major weakness of this methodology. I wish I had a way around it.
I've had those times too. Of course, when I'm bowling it means I might break 100.
An interesting counter-example. 2008 ALCS, Red Sox-Rays. Boston comes back from a 3-1 deficit to tie the series at 3, much like they had done the year before to my beloved Indians... and I heard something or other about them doing that to the Yankees one year... It was assumed that it was a foregone conclusion that the Red Sox would win Game 7, because they were locked in.
Did I mix up my occipital and temporal lobes again? My old neuropsych supervisor wouldn't be pleased. Thankfully, he doesn't like baseball.
People who talk a lot but don't know what they're saying. I can't think of a better definition of politician.
The time element is fascinating. I may go deeper into that issue in the future.
... don't say it...
Not necessarily. Walking is actually more related to being reluctant to swing. Strike zone command leads to avoiding strikeouts.
As someone who taught college classes, GPA tells you almost nothing about anything.
I actually asked Kevin Goldstein how this sort of assessment plays out in scouting. He said that it varies, but that there's not a lot of it out there.
Over the last 30 years, there's been a lot more attention paid to mental health issues. People are beginning to realize that they are nothing to be ashamed of and that they can be treated. Would it have happened 15 years ago? It's less likely.
I don't know that we could create a measure for confidence based on game data (Retrosheet, Pitch F/X, etc.) that everyone would agree is "confidence." But we could take a look at some of these issues and try to make some reasonable conjecture. This is the problem with psychology research more generally. No one is psychic and people lie about their true motivations/emotional states.
I can't tell you how many times I say the words "If only..."
I'm not a fan of writing these types of articles either. I like to have a punchline at the end too. If there is a conclusion to be drawn, it's that just because a methodology works in one situation, it might not work in another.
You're correct that context matters in other decisions, and in a 15-2 game when both teams just want to go home, it's not a big deal whether the runner stays or goes. You've identified another specific situation where it might not hold (the ninth inning, behind by a few runs), but what about the other eight innings?
This is a very simple, well-defined decision (stay/go), that has definable outcomes (via the run expectancy matrix... if you want to monkey around for some other context, fine) and if the goal is maximizing the number of runs scored (which is kinda the point of the game) then this is a way to maximize run scoring.
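The stay/go decision can be sketched as a break-even calculation on run expectancy. This is a minimal illustration, not the method from the article: the function name and all three run expectancy values are placeholders chosen for the example, so substitute figures from whatever run expectancy matrix you trust.

```python
def break_even_probability(re_stay, re_safe, re_out):
    """Success probability at which sending the runner equals holding him.

    re_stay: run expectancy if the runner holds
    re_safe: run expectancy if he is sent and scores (include the run itself)
    re_out:  run expectancy if he is sent and thrown out
    """
    if re_safe <= re_out:
        raise ValueError("re_safe must exceed re_out")
    return (re_stay - re_out) / (re_safe - re_out)


# Placeholder values, NOT taken from the article or from a real table.
# Situation: fly ball caught for the first out, runner on third only.
RE_STAY = 0.95  # runner on third, one out
RE_SAFE = 1.27  # the run scores (1.0) plus bases empty, one out (0.27)
RE_OUT = 0.10   # bases empty, two out

p_star = break_even_probability(RE_STAY, RE_SAFE, RE_OUT)
print(f"break-even success rate: {p_star:.1%}")
```

Sending the runner adds expected runs whenever his chance of scoring is above the break-even rate, which comes out a little under 73% with these placeholder inputs.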
The weighting is done by the binary logit analysis that I alluded to in the text. I didn't fully report it, because ESPN wanted me to keep the numbers light. However, in that case, I controlled for both distance and speed.
Small tweak: more runs would be scored with 100% send than are scored now. There's probably a point around (just guessing here) 90ish% that would be the optimal point.
Yeah, that's a goof on my part. I still stand by the overall message. There's no reason for success rates to be that high.
That about covers my thoughts on the issue. There are situations when the runner should hold, but they are much rarer than is generally believed.
I used the "always send" to have a little fun with the data. The results surprised me. I went with it. It's not actually going to happen (and it isn't optimal), but it's better than what's being done now.
I used 250 for illustration. It's probably better read as "a really long throw."
Conventional baseball wisdom is filled with all sorts of things that are demonstrably false. The run expectancy matrix builds in the fact that even with one out, the likelihood that the runner scores is high.
It's a lot harder than you might think. Consider that we're talking about a pinpoint 250 foot throw, under pressure. Get the angle wrong or the force behind the ball wrong, and you've messed up the throw. The runner just has to run straight ahead for four seconds and slide.
In a separate study, I was able to control for depth of the fly ball and speed of the runner. I still haven't found good outfield arm ratings. The findings were quite strong.
Well, when that's what you get paid to do...
It's not easy, but somewhere in the back of his mind, the 3BC is doing some sort of calculation. It's just skewed toward keeping the runner there more than it should be.
19 stayed at third base. Had they gone, they might have been safe, they might have been out. We'll never know.
I think this line of logic falls into the trap of "runs in the ninth inning are worth more than runs in the second." Runs are runs.
I am making that assumption. But if there are a bunch of 100% chances, then why aren't they all being sent?
There are situations where I wouldn't send the runner unless it was a total gimme. Ninth inning down by 2, why risk the runner when you're going to need another hit anyway comes to mind. However, runs are runs and the game is always about scoring more of them.
Good points there.
You all beat me to it. BP readers are smrt.
That thought occurred to me too. However, I still think that the findings stand. If 3BCs sent everybody from the 73% chances to the 99.9% chances, we'd likely see an overall success rate around 85% or so. I'm consistently seeing 95%+. The other thing, and I was only able to allude to this in the piece, was that I have done other work suggesting that it's almost never a bad idea to send a guy on a possible sac fly. I generated "would he make it?" probabilities based on distance and speed, and they were almost always (98% of the time or so) above the break-even point.
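A back-of-the-envelope check on the "around 85%" figure, under one loud assumption: that the chances being sent are spread evenly between 73% and 99.9%. The comment does not say how they are actually distributed, and the function name is hypothetical.

```python
def mean_success_rate(low, high, steps=1000):
    """Average success probability over chances spaced evenly from low to high."""
    probs = [low + (high - low) * i / (steps - 1) for i in range(steps)]
    return sum(probs) / len(probs)


# Assumption: send decisions are uniform between the 73% and 99.9% chances.
overall = mean_success_rate(0.73, 0.999)
print(f"{overall:.1%}")
```

Under that assumption the overall rate lands in the mid-80s, well short of an observed 95%+, which is consistent with the argument that the marginal chances are not being sent.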
In that case, I meant "runner on third, no other runners." Sorry, I wasn't clear. To be in that situation, you'd basically need a leadoff triple followed by a fly ball.
Gasp! You've uncovered my kryptonite. I don't have a good minor league database on hand. But you're right. It is limiting the sample. I'm willing to live with that limitation for now.
I have recent Ph.D. syndrome. I just spent 7 years jumping through hoops to get those letters and I'm going to enjoy them!
I'll call Matt Swartz.
Predicting peak is a whole 'nother series of articles. Clearly, we don't have a crystal ball to see when his career will end. I think this work gives us direction (better players seem to peak later, and some guys may be late bloomers.)
If I hope to get any point across in this article, it's that this mantra of "peak at 27" is over-played. If you absolutely forced me at gunpoint to say one number, 27 is the best number, but it vastly over-simplifies things. Human development isn't that linear or precise. It's messy, and I think that the bulk of the work is in getting in and cleaning up the mess.
Those are guys who "survived" their first year, but not to age 27. So they played beyond their debut season.
Matt, no I didn't experiment with covariance matrices. I'm not as familiar with my covariance structures as I could be. I've heard of compound symmetric before, but never had the time to fully study it. Can you give me (and the rest of BP reading this) a quick summary? I like AR(1) specifically because I have repeated measures and I know that the year-to-year measures are going to be correlated in some ways.
It's possible. There are a small number of guys who played to age 35 with no discernible offensive talent, but who could field at short like a dream. Clearly, the team figured there was some value in keeping them around.
Actually, I think we have a lot to learn from studying the margins. People are fascinated by the exceptional cases, but there's so much more to be learned by figuring out why it is that some players just don't make it.
I don't have that handy. There's also the issue of top-10% when? At the beginning of the contract? At age 27? In a specific year? It's something of a moving target. My guess is that they are best represented by the early debut/long career group.
Yes, once the pitcher moves off the mound and into the field in the AL, the DH is gone.
Second order interaction terms!
But on a day when Freddy Fourth-Outfielder is playing LF...
You're right. It wouldn't work in Fenway, even if Fenway were an NL park. Your second point is some interesting counter-strategy. I hadn't thought of that. I'm not sure if the pitcher would get some additional tosses after switching. Let's assume not... but relievers when they throw two innings also have to deal with the break when their team is batting.
Luke, I hear ya and I struggle with how to balance out how to present exotic methods. Part of the problem is that if I went all the way into explaining AR(1), then my articles would become the Wikipedia entry and that would overwhelm the baseball end of things. To that end, I try to simply state what I do. There are people with experience in this area who will understand the short-hand, but alas there are people who won't (not through any fault... like you, they probably just haven't learned this particular method.)
I regret that it sometimes gets into "trust me, I'm an expert."
The short version of AR(1) goes like this: I'm using a technique called mixed linear modeling. What that does is that it allows me to run a regression that incorporates both repeated measures (multiple seasons of data from the same player) but also look at factors that are outside the player that might influence the outcome (manager, park, etc.) Now, when we're dealing with within-subjects effects like that, we need to specify to the program what sort of covariance matrix that the program should use. AR(1) is a covariance matrix that allows for the fact that if we know what Player X did in 2004, we should have a pretty good idea what he did in 2005. The performances are no doubt correlated. AR(1) specifically says that we would expect certain elements in the covariance matrix to be correlated, because they come from the same person, and that the model should count this as within-subjects (individual person) variance, and not mistake it for variance associated with the other factors under consideration.
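For readers who want to see the shape of the thing, here is a minimal sketch of the textbook AR(1) covariance structure, in which the covariance between two seasons of the same player is the variance times rho raised to the number of seasons separating them. The function name and the values of `variance` and `rho` are illustrative assumptions; the comment does not report the fitted parameters.

```python
def ar1_covariance(n_seasons, variance, rho):
    """Build an AR(1) covariance matrix as a list of rows.

    Entry (i, j) is variance * rho ** |i - j|, so adjacent seasons are
    the most strongly related and the link fades as seasons get further apart.
    """
    return [
        [variance * rho ** abs(i - j) for j in range(n_seasons)]
        for i in range(n_seasons)
    ]


# Illustrative values only: four seasons, unit variance, rho = 0.6.
matrix = ar1_covariance(4, 1.0, 0.6)
for row in matrix:
    print([round(v, 3) for v in row])
```

The whole matrix is described by just two numbers, which is part of the appeal: the model accounts for the within-player correlation without having to estimate a separate covariance for every pair of seasons.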
Peter, my degree is in psychology. I learned all my stats through the research requirements in the program. If you do want to learn, take a class. This sort of model is an advanced model and you have to work your way up to it, but it's only knowledge.
What you've said is true and might make a good follow up. For what it's worth, I was more interested in looking at the shape of the production for my own nefarious purposes. I was thinking of looking at some one-number outcomes, but it got cut for space.
The model specifically corrects for that. That's the beauty of HLM. For example, if I end up managing a team of walk-happy guys who have been walk-happy elsewhere, then the model will not credit me with increasing their walk totals.
It's possible that Williams was better when he had a chance to sit down, think things all the way through and write it out, but couldn't explain things on the spot in a coaching situation. It's two different skill sets.
Having been a professor, it's more than just that. Understanding the underlying process is necessary, but not sufficient for either doing something or teaching it. Williams was able to physically execute it. Guillen is apparently able to explain it well. There are some people who can both physically execute and explain well, but those skills are probably independent of one another.
On the first part, I think that's a fair statement in that the manager should be treated like the principal of the school (to extend the metaphor).
On the second part, when I did my pitchers study (need to get those archives back on line somewhere!), I ran into some similar problems. I specifically sought to correct for them in this study (some of those overly-stringent inclusion criteria were aimed at that). The Angels may walk more because of an Abreu effect, and that's fine. It won't show up as a Scioscia effect and if it isn't, it shouldn't.
I entered home park as a fixed effect. Perhaps it isn't clearing up all the noise.
The full lists would have overwhelmed the text. If you (or anyone else) would like a copy of the full list, send me an e-mail back channel and I will send it to you.
I think that you're closer to the truth here. Much of the dirty work is probably the job of the specialty coach. However, I subscribe to the theory of management that says that Ozzie picked the coach and let him do his work, so Ozzie gets credit for that.
Tversky indeed! I have the feeling that you wouldn't get any takers from the GMs. I think that the appeal of RoboPitcher here is that he looks human, but instead of having to suffer through the inconsistency that goes with humans, he's a guarantee. I think that's what the GMs would pay for. And they'd pay for it out-of-sync with your proposal, which of course is exactly the same thing.
Conclusions: humans are irrational.
Just looking at the list, it doesn't look like there's a correlation one way or the other, although a lot of the guys who got fired were at the top.
You're right that there's a lot of other gunk in there. It was the most difficult piece of the work. I was pleased that I was able to get some sort of a reliable measure, but without mind reading abilities, this one would be really hard to do properly.
I'm not nearly that cool. Or good looking.
I meant 5 SB opportunities as well. I think we're on the same page methodologically. You are correct in that the number of PA/BF/opportunities can affect ICC, much in the same way that it would affect yty. However, as Mr. Solow points out, so long as you set your inclusion criteria high enough, it's not going to make a big difference. In this case, I actually upped the criteria a bit and didn't get much improvement in ICC. It's something of an asymptotic relationship.
In this particular case, there are two different questions that one can ask. One is, "How reliable is this stat year to year?" (which is the one I chose to ask; the answer was .538). The other is, "How many PA/BF/opps does it take before this stat becomes reliable?" I haven't run that one yet.
Not sure. Brian's comment below develops this same idea. I think you've both hit on something interesting.
Ah, more variables. Actually, this is a good point. I've got a lot on my plate this week, but this one might be worth a look.
Mr. Solow's response is mostly right. ICC is a measure of consistency across the years. I did toss out most of the interim managers who only had a few games at the helm when I ran that ICC, specifically for sample size reasons. (A manager had to call for at least 50 SB attempts to be included.)
Think of ICC like year-to-year. If I only had five observations per year, then I'd probably get a lot of random variation and so not a lot of consistency within managers over the years. My choice of inclusion cutoff was somewhat arbitrary, but based more on the realities of what we're observing. We look at managers based on the season-to-season level, so I evaluated them as such.
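For anyone who wants to see the mechanics, here is a minimal sketch of a one-way intraclass correlation. It is not the model that produced the numbers in this thread; it assumes a balanced design (every manager has the same number of seasons), and the function name and toy inputs are made up for illustration.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way random-effects ICC for a balanced design.

    groups: list of lists, one inner list per manager, each holding
    the same number of season-level values (hypothetical input shape).
    """
    n = len(groups)
    if n < 2:
        raise ValueError("need at least 2 managers")
    k = len(groups[0])
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need the same number (>= 2) of seasons per manager")

    grand = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]

    # Variation between managers, scaled by seasons per manager.
    ms_between = k * sum((m - grand) ** 2 for m in group_means) / (n - 1)
    # Variation within a manager from season to season.
    ms_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Managers who never change from season to season are perfectly consistent.
print(icc_oneway([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]))
```

When most of the variation is between managers rather than within a manager across seasons, the value approaches 1. With only a handful of opportunities per season, the within-manager noise grows and the value drops, which is the sample-size point above.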
My program actually takes variables like that and dummy codes them. But good catch. Someone was reading closely! Ten extra credit points.
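A quick illustration of what dummy coding does, in case the term is unfamiliar. This is a sketch and not what my stats package runs internally; the function name and the team codes are made up. A categorical variable with k levels becomes k-1 indicator columns, with one level held out as the reference.

```python
def dummy_code(values, reference=None):
    """Turn a categorical variable into 0/1 indicator columns.

    Returns (columns, rows). The reference level gets no column of its
    own, so it shows up as a row of all zeros.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # assumption: first level alphabetically
    if reference not in levels:
        raise ValueError("reference level not found in values")

    columns = [lvl for lvl in levels if lvl != reference]
    rows = [[1 if v == lvl else 0 for lvl in columns] for v in values]
    return columns, rows


columns, rows = dummy_code(["BAL", "CLE", "BAL", "TOR"])
print(columns)
print(rows)
```

Each coefficient on an indicator column is then read as the difference from the reference level, which is why treating something like a year or a park as a category does not impose any ordering on it.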
http://www.philbirnbaum.com/btn2007-05.pdf It was in SABR's "By the Numbers" newsletter.
The standard of comparison is all managers from 2003-2009. The league was a little bit more aggressive last year than in previous years.
Actually, I'd say that's the manager's fault more than anything. Francona, instead of making the decision in the moment, makes the decision ahead of time. Either way, he's still giving the green light to steal.
True in theory, hard to quantify. Not that I won't try.
I usually cut the whole list for space because the extremes are the more interesting ones to talk and read about. However, since several of you asked:
Managers listed in order of aggressiveness, with each value given as a proportion of model expectation (1.00 means exactly as expected).
B. Geren 1.66
O. Guillen 1.43
J. Maddon 1.36
M. Scioscia 1.34
C. Hurdle 1.26
T. Francona 1.21
C. Cooper 1.19
J. Riggleman 1.18
J. Tracy 1.15
E. Wedge 1.14
B. Melvin 1.12
J. Torre 1.10
B. Black 1.08
B. Cox 1.08
R. Washington 1.06
T. Hillman 1.04
D. Baker 1.04
J. Manuel 1.03
D. Tremblay 1.03
J. Russell 1.02
M. Acta 1.01
J. Girardi .99
T. LaRussa .98
L. Piniella .97
B. Bochy .94
C. Manuel .94
R. Gardenhire .89
C. Gaston .86
K. Macha .83
A.J. Hinch .82
F. Gonzalez .81
D. Wakamatsu .74
J. Leyland .70
Don't worry... logit is coming.
Freel spent nine games with the Orioles.
It's sweet to know that he did that. Maybe he's got "future child psychologist" written all over him.
Last week, someone pointed out Russell Branyan as an alternative to Huff. I'm not a fan of signing guys like Freel in general. The FA market right now is kind of a dumpster dive. Felipe Lopez is better but probably won't come cheap. I'm intrigued by Rick Ankiel for a team who needs an outfielder. He's the opposite of a speedy slap hitter, but hey, he's versatile. He can probably still pitch an inning or two in a pinch!
Not a bad idea. I don't know whether it would work logistically on our end (it might!) but it is a good idea. I think in the meantime, I'll flag the "gory details" part so that if people want to read it or want to skip it, they can make up their own minds.
Interesting hypothesis. If I have a moment, I'll go back and check that.
Yes, they certainly would be. However, in this particular context, my point was that, for a batter, getting the ball through the infield on a GB, and getting an infield hit once an infielder has gotten to the ball are separate skills, although both would fall under the category "singles on a GB."
Lengthy? Probably. Depends how much my daughter lets me write. Snarky? Oh yeah. Oriole-centric? Nah. The Orioles are just one of 30 teams.
To their credit, the Orioles have said they don't want Huff back and had the good sense to flip him to Detroit and get something for him mid-season.
Plus David Segui got a HOF vote, so he can't be all that bad.
He's the type of player that will probably hang around until mid-Feb, and then when a team gets really nervous and decides that they need a "proven veteran" he'll be sitting right there.
You're inferring my intention properly. I was drawing attention to the fact that as a DH, to achieve a high VORP, you have to have a really really good year, better than what you would need as a shortstop.
20 extra credit points to Kniker for the Burger Time reference. And an extra pepper.
One other benefit that might induce you to apply: obviously, you'll put this type of thing on your resume. When you are out interviewing for real jobs, even if they're not in baseball, maybe that hiring manager is a subscriber. You'll get a few "You worked with them?" comments and you'll get to start a nice little conversation that ends with "and then Will Carroll says to me..."
My original plan for this article actually included a test for the idea that putting young relievers into high leverage situations was hazardous to their health, but eventually, I had to drop it for time.
That's the underlying theory for why I think that the signing was made. Perhaps another day, I'll take a look at the hard data.
Kay Hanley will be there. Perhaps she'll play "Pizza Cutter".
I did do some preliminary checks for survivorship bias, but did not report them. There was no association between early winning percentage and washing out of baseball. If there was a "survivorship bias" it would be that some of the players in the first part of the sample had not yet made it to age 29, and as such I couldn't include them.
The point about negatively skewed feedback is well-taken. I hope to do some more writing on the subject soon. A blown lead in the ninth subjectively hurts more than a blown lead in the sixth inning, but the result is the same objectively. I think this disparity drives a lot of silly decisions in baseball. Could it have an effect here, especially around Gonzalez? Maybe. That would take a little more data digging.
Your reading is correct. Early winning percentage does not affect future SLG or OBP.
Early winning percentage correlates with early OBP at .234 and SLG at .191. Those are significant numbers, although not compellingly large. Players on better teams are better, but better players make for better teams. It becomes a correlation-causation trap.
Contact me via backchannel if you want the specifics. I enjoy me some good geekiness. My e-mail's at the end of the article text.
Whether the saved payroll money would have been spent elsewhere is something only the O's know for sure, but theoretically, it could have simply been banked. Or spent on minor league development. Someone might make the case that this sort of spending would actually be more beneficial in the long run.
I actually intended for the piece to be about the effects of winning/losing on player development. I was rather surprised that the comments went that way, but that's what people wanted to do. Anyone still up for the discussion on player development and culture of winning?
I promise I'm not an amateur. I have a Ph.D. in clinical psychology (hence, Baseball Therapy) with an emphasis on children and adolescents. A lot of the stuff that I have written in the past and will continue to write will be looking at a number of commonly held beliefs about baseball which are really just bad amateur psychology. But I trust that means I won't get booted out? ;)
FWIW, I hope you keep reading. As someone who read BP for a long time before I got hired here, I miss a lot of those first generation folks too. They wrote some really cool stuff and I enjoyed reading it too. In fact, one of the coolest moments in my life was actually getting to meet Dan Fox at the BP event in Pittsburgh this past summer and having him say "Oh yeah, I've read your stuff, it's really good." I'll do my best to carry on the proud tradition.
Let me see if I can sum up the responses, and correct me if I get something wrong.
1) Most of you would like to see us inject a little bit more of our scintillating personalities into the work, so that BP doesn't start reading like the Journal of Applied Statistical Science. Extra credit will be given to obscure references to 16th century Belgian military history. Fair enough. (Note to self: go to library this weekend.)
2) People prefer applied topics or at least, as my grad school advisor liked to say, a good story to go along with your table. OK. Connect things in so that they are relevant. Sound advice for any writer.
3) People here aren't stat-phobic, cuz ummm, that's kinda the whole point... anyway, it seems that most people don't mind if we break out the major numerical nerdiness. (Y'all know that there are some writers who are more prone to that than others. Personally, I know that I look for excuses to pull things out of my statistical tool belt.) It's not everyone's thing, and that's cool. But it seems that people are hungry for more than just tweaks in the methodology, or at least want to know what difference it makes if we tweak the formula for WXYZ42BNL.
In other words, after the digit gymnastics, we need to be able to answer the question "So what?"
Pbconnection, I also rue the fact that Nate and a lot of those first generation writers have either cut back or moved on. But give us newbies a chance. You might grow to like us.
Alright, a fair number of you have suggested that BP would do better with less of this or that in our articles. I ask you as a real live BP writer, and in a spirit of actually wanting to deliver a better product to the customer, what exactly is it that you are looking for?
I can understand not wanting to read things that sound like academic journals. What would you rather it sounded like? More snark? More Miley Cyrus references (oh yeah, I totally would)? I want to put the emphasis on the word "more" though. More looking at recent events? More looking at history? More work explaining how WXYZQQFJLDAs work? More application to individual players/teams?
If all I hear is that we want less of something, I soon have nothing to write about. So, start your sentences the way that Britney Spears would "Gimme gimme more..." I can't guarantee that I can give you everything you hope for, but if you've got an idea and it makes sense... well then I'd be a fool not to take it.
Or perhaps in some cases, replace "playoffs" with 81 wins?
I agree whole-heartedly with that line of reasoning. It's irrational behavior on the part of all involved, but human beings are rarely rational.
The economic impact is a little beyond my reach right now, but I will say this both from a research perspective and from experience. I'm amazed at the power of the bandwagon effect that happens when a team starts winning.
Why can't it be used in 2012? Baseball teams have bank accounts too. Those millions can sit there and collect interest. Money in the pocket doesn't have to be spent. Consider what the Marlins have done over the past 12 years.
Let's assume that the Teix money (17.5M per year) is still in the warchest. Why spend it on someone who won't get you into the playoffs this year, and instead bank it until 2012, when you can have all that saved money to spend to add pieces to a team which has naturally grown to the point where those moves make sense? I can understand that this is not a pleasant thought of having to endure another couple of losing seasons, but as a therapist, sometimes you have to tell people things that they don't want to hear.
I actually rather like Gonzalez himself as a pitcher. If the goal was "get a closer", then the Orioles did well in that regard. I'm questioning the underlying assumption that "we need a closer."
The draft pick is an issue, but it's a cost of doing business. I think that the "draft picks are gold/draft picks are toothpicks" pendulum has swung a little too far in the gold direction. Yes, there's nothing better than a draft pick who works out. There's nothing more frustrating than a draft pick who doesn't. A draft pick is a high-stakes coin flip, not a guaranteed future star who will only make $500K per year and take us to the World Series in the process. He might become that. He might flame out in AA.
You're correct, I don't live in Baltimore. I live in Cleveland. (It's not like living through the Indians bullpen woes has been easy either.) I believe many of the issues you bring up here are addressed in comments I made above.
Let me add this: My goal in writing this piece isn't so much to tweak the Orioles' management for the signing (OK, maybe a little). What's done is done. But there's another team out there thinking of doing this same type of thing right now. I want people not to evaluate moves based on the assumption of a steady-state past. I'd rather that they took a broader view of the options available to them and projected those into the possible future.
The problem with making signings to minimize frustration is that if you manage with your emotions, you get burned. I'm convinced that a good deal of what passes as baseball "strategy" is an attempt to make fans/players/coaches feel better rather than to win the game.
Does George Sherrill really need to be replaced? Maybe. But what about thinking about it from another direction? Maybe it's best to take the short term hit of a lousy bullpen if it means a better chance at winning down the road. The answer may come out to be that to sign Gonzalez is the better plan. But a simple reflexive "Sherrill left, need a closer" over-simplifies all of the options available to a team and cuts off what might be a better option to pursue.
It's worth thinking about. Here's where I disagree with this type of plan. Suppose they flip him for a prospect. A prospect is a "maybe" and in 3-4 years, he'd still be a kid.
Gonzalez is signed for two years, so either this July or next July, he might end up in a trade package. But why not save the money now and bank it, so that in 3-4 years, when you need that fully developed piece, you have the cash? In the free agent market, you are buying relative certainty of what you're going to get. When you're on the steep part of the marginal revenue curve, would you rather plug in that "maybe," or would you rather have a guy with a track record? A lot depends on your time horizon.
A good addendum to my thought process. What's curious about humans is that they want to see a nice neat trend line. Suppose that the Orioles put Gonzalez's money in the bank, give the job to a rookie this year and next year, lose those couple of extra games, and then in two years when the time is right, signed a couple of free agents. Over the next few years, they might stagnate around the same win total. And you're right that free agents might interpret that as "well, they'll win that many next year, so why bother going there?" It's an irrational thought process, but it's a real one.
Fair enough. Mr. Easterbrook probably isn't the first man to do something like this either nor is he the only one I've seen do it, but I have to admit that I read his work and enjoyed it. Now, all I need to do is get a clue about football!
Eric, were the "relief appearances" perhaps a former starter working out of the pen as a long-reliever, a role in which you are basically a starter, but you come into the game in the 3rd inning due to the fact that the real starter is currently nursing a bad case of whiplash from all the HR he gave up? How many of these guys went from starter to 1-2 inning relievers?
Tim, I would argue that a manager who continues to stick with a faltering closer (Capps and Lidge were your examples) should be penalized. Sticking with a guy because he is the anointed one is just stubborn.
Has someone yet put together an "offers database?" For example, if it's reported that the Mapleland Bees offered 5 years and $7 gazillion for Jason Bay, we can get some idea of the pricing process. Of course, some of those offers are bogus (probably floated by agents to inflate the price). Some offers are likely never reported. Obviously we know the winning bid, but do we know the losing bids and how the rest of the league prices Player A?
I also wonder if the better market analogy isn't a rarities market. It seems that at the beginning of each off-season, before anyone can officially make bids, there's a generally accepted "contract that Larry Larfelschnarger will get." And he usually gets that. Teams can walk around this rarities market and either choose to buy or not buy, but it always seems that the price is set in advance.
"His elbow/knee/leg/spleen is fine structurally... he's just gotta get the confidence back in it to use it." How true is this really?
Someone posted above on brain injuries (concussion being the most prevalent). Some work on the basic neuropsychology of a concussion would be awesome.
Statistics never lie. Liars use statistics.
I think this one has two parts, both worthy of a little bit of further thought. Is losing detrimental to the development of young players? Maybe there's some sort of (dare I say) psychological price to be paid for coming up in a losing environment? The other issue is the fan base. In Vegas, it's well-known that people generally have a point where they feel their gambling losses are too much, and so they stop. (Vegas manipulates the heck out of this, btw.) Maybe there's a point where Orioles' fans will cry uncle and leave en masse. That one would be harder to quantify though...
And right there, you nailed down the struggle that I have when I do this sort of work. I did have the thought of trying to use some sort of MLE at the time of being brought up as a control on the model, but the truth is I'm awful at MLE's.
I did some earlier work of this type with managers and tried specifying for pitcher age, home ballpark, and year-league, as well as using an AR(1) covariance matrix, but even then, the model was acting a little screwy. I'm not familiar with the test you suggested (I have to admit, it's been a while since I was fully immersed in HLM) but I'll check it out. Thanks for the tip.
The biggest problem, though, is that the omitted variables that "catcher mentor" picks up on, as I mentioned in connection with M. Redmond, are organizational variables. What sort of guys do the Twins draft? Whom do they promote? Whom do they keep around? I suppose that an MLE approach might answer pieces of those questions. But how to quantify the rest???
Maybe I should just stick to t-tests!
Christos Razdajetsja! (a few days early.)
I think there's some value in that process though. One of the things that I think has happened in Sabermetrics is that we've valued construct validity (it makes sense in our heads) over environmental validity (it actually makes a difference out there in the real world.)
A psych major in Sabermetrics. That would never work. ;-)
IIRC, the idea that Rodriguez was signed simply to be a draw at the gate was floated as well. Seems likely that it's about the only benefit that the Nats will derive from him.
I don't know that the theory holds. Obviously, the home team will be cheered for in their own park, and the new acquisition will be cheered in his new home park. But the crowd at a Cubs-Sox game is decidedly mixed, and the response to a returning veteran might be "you betrayed us in free agency!" or more likely, "Oh yeah, didn't you used to play here last year?"
I've actually heard MLB players interviewed saying that they take the cheering for the other team and pretend that the crowd is cheering for them.
I'd also contend that HFA isn't just a function of the actual field of play. Even if ballparks were completely uniform, I think we'd still see these effects. Consider your high school for a moment. It has the same basic anatomy as all the other high schools in America. But you could tell your high school immediately when you walked in, even if it has changed a lot (like mine!) The nuances at a ballpark can be in the ambient noise in the background (is the park in the middle of downtown with a bunch of traffic around? By a river?)
It affects everyone in baseball. The reason is very simple. Your brain runs on about forty watts of power, which is probably less than is powering that light bulb above your head right now. It has to economize and use shortcuts, and there's only enough mental energy to keep attention on a small amount of stimuli at a time. Consider that the participants in the experiment probably would have noticed that the person behind the counter changed if someone drew their attention to it directly. The problem is that in the absence of this attention, the brain just fills in the blanks with what it thinks goes there.
A small example. Try to recall an event that happened a year or two ago (maybe longer) involving several people to whom you are close, and one where you have pictures from when it happened. Family parties, weddings, etc. fit this rather well. Try to recall as many details about the event as you can. When you can later today, go look at those pictures. I'm guessing that in your mind's eye, you envisioned most of the people as they look today, without remembering that Larry had that mustache back then but has since shaved it off, and Curly has since lost all that weight, and Moe was with whatshername that you never liked, but for some reason you assume that he had been there with the new girl, whom you don't like either. That's because you probably didn't make detailed observations about everyone during the event. Instead, you have a general idea what these guys look like and your brain projects that backward into the past.
In situations where change is gradual (a pitcher's motion changing slightly as he gets more tired), unless you have specific training in recognizing the differences and are watching for that, you probably don't recognize that it happens. However, if I showed you his delivery in the first inning and the seventh inning and told you to pay attention, you'd probably pick up on it. But before that, you'd swear it was the same delivery all game.
Thanks. Actually, I'd argue the other way. HFA would still be as strong. The differences between stadia are more than just outfield dimensions and foul territory. In basketball, the acoustics are a little different everywhere you go, I'm sure the floor is slightly different in consistency... little things like that. It's not that they directly interfere with play, but they do impart a slightly different feel to the arena/stadium. I think that's part of what's being responded to.
That's actually the exact reason that I used the odds ratio correction method. I'm measuring outcomes relative to expectancies. So, if the player traded was an overall .400 OBP guy, the model knows that and expects him to be on base 40% of the time.
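To make "relative to expectancies" concrete, here is a sketch of the odds ratio calculation in its commonly used form. The function names and the sample inputs are illustrative, and I'm assuming a single league-wide baseline rate; this is a stripped-down version for illustration, not the full model.

```python
def odds(p):
    """Convert a probability to odds."""
    if not 0 < p < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1 - p)


def expected_prob(batter, pitcher, league):
    """Expected on-base probability for one batter-pitcher matchup.

    Multiply the batter's odds by the pitcher's odds, divide by the
    league odds, then convert back to a probability.
    """
    o = odds(batter) * odds(pitcher) / odds(league)
    return o / (1 + o)


# A .400 OBP batter against a league-average pitcher (.330 allowed,
# .330 league rate): the pitcher and league terms cancel out.
print(round(expected_prob(0.400, 0.330, 0.330), 3))
```

Against a league-average pitcher the expectation is just the batter's own rate, which is the ".400 OBP guy is expected to be on base 40% of the time" case. A tougher pitcher pulls the expectation down, and the outcome is then judged against that adjusted number rather than against a flat average.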
I'd personally like to see that one in Pitch F/X. (Has someone done this study?) The effect could be that road pitchers pitch a little more tentatively.
I agree that batting last plays a huge part. In fact, it's on my to-do list to look into some of the mechanisms of why exactly that happens.
Homestead Homies of '95? Twenty extra credit points for that reference!
I think there is some merit to this argument, especially given that "that team from the Bronx" has been mentioned in connection with Halladay. Trade him there and you have to see him 5-6 times per year pitching against you. Maybe that plays into the calculations, especially since you hope to rebuild the Jays in 3-4 years and Halladay might still be in pinstripes at that point. However, I think that's just something that you build into the "cost" of the offer. Maybe the Yankees or Red Sox have to pay a premium because of the division issue, but to outright refuse to deal with them seems silly.
Thanks all so far for the comments. My 'reply to' button is also malfunctioning, so I'll have to do one big round-up here.
Several of you have brought up good points about other variables which will no doubt play into the Blue Jays' thought process. Of course, nothing in baseball is simple enough that it can fit into a few thousand words.
A couple of people have asked whether this is properly a true analog to the Ultimatum Game. It's true that a year of Halladay and his compensatory draft picks aren't worthless, but there's probably a better offer out there. True, there are multiple teams (I assume) making offers, but eventually, they will all bid up to their highest point. One of them will be the best. That's when it becomes a two player game. What if that offer isn't "equal value" to Halladay? So long as it beats the two draft picks, then the Blue Jays should take it. The idea of taking things to the trading deadline is a rational idea, but eventually, you have to make a decision. It's possible that you get a better deal in July, but at that point, you might still find yourself not getting equal value.
I think some people misunderstand here. It's not that the Jays should take the first offer that comes along that's better than the two draft picks. They should build a market, encourage teams to bid up, etc. When everyone says "... and that really is my final offer" they should take that one.
The other issue is whether there is some value in not cooperating, whether in the form of "hey you know you screwed the other guy over" (which, added to $3.00, will get you a cup of coffee), or whether building a reputation for further trades is worth it. (Something like: "Anthopoulos is soft and will take less than full value.") I don't think that holds up in this case for a couple of reasons. First, Halladay is a major stakes game. How often do you trade a guy like him? You need to make sure that you get something for him. If that means you get a little less when you trade some random middle infielder next year, so be it. Second, I don't know that it really is "giving in" here as much as bowing to the reality of the situation. Structurally, the Jays are screwed. They have no leverage other than "make a better offer or we'll send him to Team B," but even that has its upper limit. When they make another deal where they have some leverage, they can hold out all they want. They can spin this one (correctly) as "what were our other options?"
Yes, the fans are going to be mad. Trading Halladay is an admission that the Jays don't think they have a chance for the next few years, and that's discouraging, because Americans... oh right... want a winner and want it now, and most of the time, you can only have one of those.
Don't hang around with clinical psychologists. They're all nuts.