Yeah, you'd pretty much need to have everyone in the pen able to go multiple innings, when necessary, and a 2-to-3-inning fireman/closer would be nice to have, but the end result of this plan is that you probably do spread out the saves. And yeah... some guys would probably carp about that.
There's a variation on this that Tom Tango uses in his book. A team employs 3 "real" starters who pitch on days 1, 2, and 4. Two other starters pitch 3 innings/50 pitches each on days 3 and 5.
If there is a(nother) great unexplored frontier in Sabermetrics, it's implementation research. Everyone knows that smoking is bad, but people still smoke. How do you convince people to put down the cigarettes, and to stop holding on to pitching wins? Want a Sabermetric Nobel prize... work on that one.
In my original draft, I had a section on needing one of the relievers to have a rubber arm (or be a knuckleballer) just to pick up the garbage time innings. It's a good idea.
There is an advantage to being able to use the tandem starter model to gain platoon advantage, if your two starters are of different handedness. Because the other team knows that it will be a lefty then a righty (or vice versa), they can't gain as much platoon advantage with their starting lineup. Maybe the ability to gain the platoon advantage that you lose in the 8th inning is canceled out by the extra platoon advantages you gain in the 5th inning. If anything, it seems that a tandem starters model is built around trying to win the game in the 5th inning through prevention rather than in the 8th inning through strategy.
The average MLB pitcher makes 2.7 million per year. So, 8 guys at that rate is 21.6 million. Looking through salary data, a lot of those guys made under a million bucks last year (Fundraiser will be next week.) A team could probably get above average performance, maybe for savings around 10 million.
The 15 guys who I named had an xFIP of at most 3.75 in their first 50 pitches in 2012, which would put them around the top 30 of all starting pitchers. That's pretty solid performance. So, I'd invest in the offense.
As to the Astros farm system, I wonder whether every system has a bunch of guys who fall into the "good for 50 pitches, but not for 100" and are labeled failed starters, because there's no role for them.
Max, what's the stability from year to year in framing "ability"?
Even if he's trolling, it's a teachable moment. Even if Heyman doesn't take the lesson, there will be others who will read and understand. Even if he's trolling, he brings up a reasonable and important question, and it's one that needs to be addressed, and we need to honestly address it, even if it means that the answer is "We don't know yet and you are correct."
I remember listening to that episode while riding on a bus from Chicago to Cleveland coming home from a wedding. It was the first time I'd ever heard Colin's voice.
Harper (139 G/597 PA) logged more playing time than did Conigliaro (111 G/444 PA). Harper played primarily in CF, while Conigliaro played in LF. It looks like you're quoting the Baseball Reference version of WAR. You're correct that most of the difference between the two seasons comes down to Conigliaro being rated as 10 runs below average in the field and Harper as 14 above. For Harper (and for everyone from 2003 onward), BBRef uses Defensive Runs Saved from BIS (The Fielding Bible people). For pre-2003, they use a measure called Total Zone, which was invented by Sean Smith. I once created a similar measure. Total Zone uses roughly (and I mean very roughly) the same ideas I've used here.
The broader point is that Harper's claim to being the best teenager ever rests a good deal on his superior (in the eyes of the metrics) fielding performance. Conigliaro was clearly the more productive hitter. We can put more faith in the reliability of those hitting metrics than the defensive metrics. There's a decent case to be made that Conigliaro deserves a second look and that the case in favor of Harper is not so clear cut.
I am taking some small liberties with KR-21. I'm assuming that a grounder is a grounder is a grounder (and that all are of equal difficulty), mostly because in the data set I'm using, I can't tell the difference as to which grounders were soft two-bouncers right at the fielders and which were screamers headed through the middle. The way that I have the database structured, I lined up the "test questions" in chronological order. So for first basemen, "question" #1 was the first ground ball that he saw from 1993 onward that was hit in his general area. For some guys, that was an easy one, for others a near impossible ball to get. What I'm counting on is that the noise all cancels out in the wash.
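For anyone who wants to see the formula, here is a minimal sketch of the standard KR-21 calculation. It treats every "item" (here, every grounder) as equally difficult, which is exactly the liberty described above. The function name and the example numbers are made up for illustration; they are not from the actual data set.

```python
def kr21(n_items, mean_score, score_variance):
    """Kuder-Richardson Formula 21 reliability estimate.

    n_items        -- number of "test questions" (e.g., grounders) per fielder
    mean_score     -- mean number of items "passed" across fielders
    score_variance -- variance of those totals across fielders
    """
    if n_items < 2 or score_variance <= 0:
        raise ValueError("need at least 2 items and a positive variance")
    # KR-21 assumes all items are of equal difficulty.
    correction = mean_score * (n_items - mean_score) / (n_items * score_variance)
    return (n_items / (n_items - 1)) * (1 - correction)


# Hypothetical example: 100 grounders each, fielders convert 70 on average,
# with a variance of 50 in the number converted.
example_reliability = kr21(100, 70, 50)
```

If grounders really do differ a lot in difficulty, KR-21 will tend to understate reliability relative to KR-20, so the equal-difficulty assumption is a conservative one.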
Actually... I originally ran the numbers and calculated the rate at 140 (and wrote that section while I waited for some other numbers to calculate), then realized my syntax was doubling everything and changed it accordingly. 70 is correct.
I was working as a therapist at the time, and therapists have more than their fair share of stalkers. I didn't want my patients googling me and finding my baseball work and wanting to talk about that in session. My real name has always been something of an open secret in the baseball world.
Actually, one of my favorite researchers of all time did something on long two-strike-foul battles. Weird guy though. Named himself after a kitchen utensil.
That would take a very different set of analyses to do... maybe I'll have some time tonight...
I ran a mixed linear model, controlling for pitcher identity with an AR(1) covariance matrix. I looked at the effects of age (as a fixed linear covariate). 2003-2012.
The effect of age was non-sig (p = .55) although the trend line was pointing down, by about a ten thousandth of a point each year of age.
Interesting point. I hadn't considered that aspect of it.
The statistical tool that you were looking for in your discussion of aces is basically an independent samples t-test. You are saying that you know the mean (each pitcher's ERA) and imposing some sort of distribution and then drawing randomly from those distributions. Basically the question that you are asking is that if Verlander is really better than... me, how often would we be able to detect that in a small sampling frame (1 game). That's a question of statistical power. (For the initiated, in a t-test, it's 1-beta.)
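To make the power idea concrete, here is a rough sketch. It uses a normal (z-test) approximation with a known standard deviation rather than a true t-test, mostly because with one game per pitcher there is no variance to estimate. The means and standard deviation in the example are invented purely for illustration.

```python
from statistics import NormalDist


def approx_power(mean_a, mean_b, sd, games_each, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-sample z-test.

    Assumes both pitchers' runs allowed per game share a known standard
    deviation `sd` and that each pitcher is observed for `games_each` games.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two sample means.
    std_error = sd * (2 / games_each) ** 0.5
    shift = abs(mean_a - mean_b) / std_error
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical: an ace allowing 3.0 runs/game vs. a scrub allowing 4.5,
# both with a game-to-game SD of 3.0 runs.
power_one_game = approx_power(3.0, 4.5, 3.0, games_each=1)
power_thirty_games = approx_power(3.0, 4.5, 3.0, games_each=30)
```

The comparison of interest is between the two results: power climbs as the sampling frame grows, which is the whole point about a one-game sample.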
From the archives: http://statspeakmvn.wordpress.com/2009/02/16/did_i_scare_you/
We do know that HBP rate stabilizes quickly. Some guys do seem to have a tendency (probably because of their stance) to get hit more often. I could run a chi-square test to see whether Greinke hit Quentin, but the sample size is far too small to be powerful enough to detect any effect.
You are correct that these are not issues confined to the US (and of course, not all players in MLB come from the US.) There are other cultural concepts (machismo, for example) which present their own challenges. I hesitate to write about them more broadly because I just don't understand them enough to do it.
That makes sense on first glance. I found that very small (10 BIP) samples don't do well as predictors, maybe because it's just too small a sampling frame to get a good read on what's going on. Maybe if we conceptualized it in a different way we'd find an effect. It's an open question at this point.
In the interests of investigating whether this theory works at the extremes, I ran some new analyses.
Using the same basic framework as I did in the original, I took the league average and the past 100 BIP and let them fight it out in the same logistic regression.
I only took cases where the last 100 BIP yielded a prediction of .280 or lower (then .275 or lower, then .270 or lower, etc.). There does come a point where league BABIP is a better predictor, and it seems to happen somewhere between .270 and .265. However, it should be noted that the past 100 BIP still holds some significant sway, even as you descend even further.
Perhaps .240 is too lucky to believe, but .270 is not.
I think you're right on looking into how many of these cases where you have an "extreme" value (say, .240) there are, and how well the model would perform in these cases, but that's testable.
As to your second point, I'd love to know how this works too! If the fundamental message of what I'm trying to say here holds, then it opens up a lot of different avenues of investigation!
On your third question, I don't know that one yet.
Yeah, but you can't trust the kitchen utensil guy.
We're not talking here about how far to regress. That's a different set of analyses. We have two variables fighting it out to see who is better at predicting the outcome of the next ball in play, which is the true measure of how good a predictor is. These analyses tell you that recent history does a better job modeling the outcome of the next BIP than does league average. That right there suggests that the standard DIPS assumption that everyone is league average deep down should be treated with suspicion.
So at that moment, he is better described as a .240 BABIP rather than a .300 BABIP.
My personal favorite example is Troy Percival who had BABIPs usually in the .270 range year after year.
Figure a starter faces 25 or so batters per start, and strikes out/walks 7 or 8, then 100 BIP would be roughly 6 starts.
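The back-of-envelope math, sketched out. This only subtracts strikeouts and walks, as in the estimate above; it ignores home runs and hit batters, which would push the number of starts up a little. The function name is just for illustration.

```python
def starts_to_reach_bip(target_bip, batters_per_start, k_and_bb_per_start):
    """Rough number of starts needed to accumulate `target_bip` balls in play."""
    bip_per_start = batters_per_start - k_and_bb_per_start
    return target_bip / bip_per_start


# 25 batters faced, 7 or 8 of whom strike out or walk.
starts_low = starts_to_reach_bip(100, 25, 7)   # 18 BIP per start
starts_high = starts_to_reach_bip(100, 25, 8)  # 17 BIP per start
```

Both values land between five and six starts, so "roughly 6" holds up.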
Good point. My original thought was that the recency of 10 BIP would have a strong amount of predictive power, but it seems that it's just too small/noisy a sample size to really get a read on what the pitcher is up to. But it turns out that when you pull back a little bit, the sample size is big enough to provide at least some clarity. Predicting the next ball in play may very well be a function of how the pitcher is feeling that day, but the previous 10 BIP just doesn't give us enough of a read on that to tell how he's doing.
On the third point, if BABIP is far from the league norm over the last 100 BIP (say it's .240), then from a variance explained point of view, the recent personal history of the pitcher is more important than the league average. However, understand that the recent personal history of the pitcher is not a static number.
By definition, you can't predict randomness. If BABIP is random, why am I able to find a predictor?
In #2, the idea is that I created indicators of xBABIP based on where balls off the pitcher were hit (and how many outs a league average team should have recorded based on that), so that the pitcher wasn't penalized for having a bad defense (or credited for a good one). Then, I took the defense's BABIP for when that pitcher wasn't around.
(Natural) log of the odds ratio is just a statistical trick that I used because I used a lot of logit regression. It has to do with raw percentages not being normally distributed, and using LOR corrects for that. Also, when logit does its actual modeling, it spits out a function that gives you the LOR of the probability that you want to model.
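A small sketch of the transform in question, for the curious. It shows the natural log of the odds for a probability (what logit regression models) and the way back to a plain percentage. The function names are mine, and the .300 and .240 inputs are just example BABIP-style values.

```python
import math


def log_odds(p):
    """Natural log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1 - p))


def from_log_odds(x):
    """Inverse transform: turn a log-odds value back into a probability."""
    return 1 / (1 + math.exp(-x))


# Example: a .300 and a .240 probability of a ball in play falling for a hit.
lo_300 = log_odds(0.300)
lo_240 = log_odds(0.240)
back_to_p = from_log_odds(lo_300)  # round-trips to 0.300 (within float error)
```

The log-odds scale runs from negative to positive infinity rather than being squeezed between 0 and 1, which is why it behaves better in a regression.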
In #2, the idea was to see how well these predictors performed relative to each other from the point of view of variance explained (as much as logit lets you do that.) Was it the pitcher's general talent in steering the ball toward a fielder? Was it the sparkling defense? Was it the batter steering the ball himself?
Call me if you need a tutor.
*clips for arbitration hearing file*
One of these days, I wanna have a conversation with all of you about makeup.
Only 5 players all time have logged 3 or more seasons with 50 doubles: Tris Speaker, Stan Musial, Paul Waner, Albert Pujols, and Brian Roberts.
He's only on pace for 324 this year. At that pace, it will actually be 2015 before he tops 763.
One billion points!
I wish my name was either Ben Lindbergh or Jon Shepherd so I could lie to people and say I wrote this.
For what it's worth, I've never met Jack Morris. He might actually be a really nice man.
I can live with that critique. There's a lot of upside for those guys, but everyone's got enough upside to win the 2016 World Series... if everything goes right... just ask them.
Y'know, that's not a bad re-frame. It's a lot to bill to the marketing department, but... whatever gets you through the day.
You are hardly the first. This has been an idea that has kicked around for a while.
Hooray for Stetson Allie! He went to my high school.
Paul McCartney always told me that name-dropping isn't polite. ;-)
The biggest issue here is that you're trying to mix something that naturally has six beats into a rhythm that has 5. If you're figuring that a tandem takes care of the fifth starter role, then really they'd only be available for relief duties for the third starter. (#1 and #2 would be their rest days) Even then, a relief appearance might only leave them one day to recover, and they couldn't cover for #4 because they have to pitch the next day. In some sense, what you lose is the roster spot.
Alternately, you could run the experiment as 1/2/tandem/3/4/tandem, but that's essentially pushing the four "real" starters into a six-day loop, rather than a 5-day. Especially if one of them is really good, you want him to pitch more often, rather than less.
Maybe a team might try this in September when they are out of contention and the rosters are expanded.
On the topic, an interesting article from 40 years ago in SI on Earnshaw Cook and the ideas in his book.
Actually, you see a few of these guys pop up every post-season... usually the team's fifth starter. Tommy Hunter also seemed to be reprising a similar role.
It does happen in lower level leagues, but that's with a view to slowly building guys up to go 7 innings.
Or guys who have been injured aren't able to keep it up through 6 innings. Injured pitchers do come at a discount...
Tom's model was to have three "normal" starters who would pitch on days 1, 2, and 4. On days 3 and 5, a tandem starting team would take the hill. I've played around with that model a bit. It does have the problem that your normal starters need to be decent... and that negates the cost-savings aspect of my plan. But certainly his way has its merits.
I guess if there's one message that I could get across it's that it doesn't take a huge effect size of chemistry for it to be meaningful. And it isn't that huge a jump to assume that it might have a small effect. I doubt it turns scrubs into stars, but we don't need it to do that.
Maybe Inge really is a nice guy and helps people out.
I hold out hope that it can be quantified. And used to great effect. But right now, hope is all I have on the matter...
Would you believe me if I told you that this very discussion got left on the proverbial cutting room floor? I was going to add it in after the part where I hypothesized that Inge might have reached some players but not others.
In the professional literature around psychotherapy, the #1 predictor of improvement is not the experience or skill of the therapist in implementing therapy, but the extent to which the client feels a strong connection/bond/alliance with the therapist, and as you point out, people respond to different personalities and characteristics. When I worked as a therapist, I experienced this firsthand. Some people responded well to me, some did not. This is why I wish we had BFF F/x...
As to the issue of younger players being more affected, I briefly considered restricting the age range to something like 23-26 (for that very reason), but it dropped the sample sizes dangerously low.
This is what happens when you read "Tigers" and "SEC Championship" in a sentence but don't read the rest too carefully...
Nathan, I do almost all of my own syntax writing and analysis. I have a small library of code fragments that I can just pull out for common tasks. Saves a lot of time.
The thought occurred to me to do that study too. It's somewhere in the queue. I promise.
As my wife points out when I dip into my 90s nostalgia, all of that stuff is now on the oldies station...
I not only computed the effect on K rate, but also, walks, HBP, singles, 2b/3b (put together), HR, and outs on balls in play. Those are the 7 basic outcomes of a plate appearance. There weren't any effects to speak of.
The reason I was even at Moscow State was that my wife was presenting at an academic conference. Everyone else was dressed for a conference. I was... not.
Actually, that's just me every day.
Everyone became more likely to call for a steal after a CS. All of them. But, there were some who showed less proclivity to fall into this pattern than others.
The model looks for changes in SB attempt rates. Perhaps it's picking up on the fact that those gentlemen run no matter what, whereas the others only run when challenged by failure.
They paid market rate for a decent pitcher. It wasn't a great move, but it was understandable.
Good bunch of guys.
Ideally, yes, but those data aren't easily available, and this is a case of "direction before precision."
2012 was not a validation year. I looked at all pitchers (who met the criteria) from 2006-2012. Sample size was a few hundred player-seasons. In theory, I probably should have used some sort of GEE to account for using the same players across multiple years.
They were in the list of factors that were available to the model to select stepwise.
It's not that Jose Reyes is a malcontent. It's that if the Blue Jays' moves don't work, people will find bad things to say about the players that they brought in.
Your reading is correct. And it is scary.
The reason that I didn't do velocity measures was that I didn't have time to merge in the Pf/x database. It's worth a look.
Oddly enough, I put in BMI, but not height... And I have height right there ready to go
Bad researcher. No Hot Pocket for you.
The marginal risk of pitch #2000 is different than the risk of pitch #3000, if ever so slightly. It then becomes a risk-tolerance question. You can push Smith for another inning and he may be better suited to handle it than anyone and you might need this inning, and maybe you don't blow out his arm. But maybe that's the sinker that breaks Smith's elbow.
I think you're right on. I don't know if we can pull apart the chicken from the egg on that one.
I did look at many of those issues, but didn't report due to space. I entered interaction variables (e.g., number of pitches x age) into the regression. Being older added injury risk, as might be expected. As to whether power/finesse pitchers are more likely to be injured, those were mostly secondary predictors when they were significant. But then, my model was a first pass at this sort of thing. Must. Improve. Regression. Equation.
It's all regression-based, so you have to interpret that as "holding everything else constant, another inning actually predicts a lessened chance of injury."
A lot of it comes down to pitch efficiency. The big message is that it's the number of pitches, not the number of innings that you rack up.
I could see that working. My reach for social net was based on the fact that all proclamations of the amazing chemistry in the clubhouse start with "we're all such good friends" or some other nonsense like that.
I took it once and actually split three of the four down the middle.
Rule #3: Never mention the Myers-Briggs around me.
This is why I wrote this article. I didn't even think of that.
A billion points!
Of course it exists. The question is whether it matters on the field.
If you measure things longitudinally across the season (and across several seasons) you could get a better idea of this.
There are, but since "chemistry" is a team-level term, I'd need at least a good response rate within each team, and since there are only 30 teams, I'd need a good chunk of them to have any sample size to look at team-level effects. So, I'd need a response rate that was pretty high.
Pat, all of those were issues that I left on the proverbial cutting room floor. Per #1, my guess is that there is probably more downside than upside, but I hold out hope that some innovative person could figure out a way to optimize the upside.
Per #2, I briefly considered comparing a clubhouse to a junior high classroom and all the drama that could ensue. There probably are players who love the drama, but that's not a sign of a healthy group in general. Friendly competition is one thing, but "No, I won't tell you the thing that I noticed about the opposing pitcher because I'm mad at you" doesn't help anyone.
Per #3, I focused only on the quantifiable for this article, but there's a fantastic opportunity within Sabermetrics for people who have qualitative and textual analysis skills. I wish there were more of those people out there.
1. I've never seen the switching positions study done. Fits into the "reasonable hypothesis" category. It might even be true.
2. I would debunk my own mother if she were wrong. (Mom, if you are reading this, you've never been wrong).
3 & 4. Good questions to which I have no answers. There's so much more to figure out.
The biggest problem with Verducci's formulation is that it is too broad. We need a more fine grained look. Another for the queue.
I do worry about that. Our injury database actually lists injuries that took place (or were discovered/reported) during training camp. I might be able to look specifically at this.
That's a reasonable thought. Some might have more injury risk after changing mechanics (and I could see some being better off for the change both short and long term). If anything, it adds to the confusion of the Verducci sample.
To break that into stat terms, we over-estimate the variance that our models explain. There are many factors for which we don't (and in some cases, can't) assess.
This is the sort of line of work that would have to start with a "direction before precision" study. Age and spread of age are easy to figure (mean and SD), and that's a good starting place.
This was something of a strategic methodological choice on my part. The biggest danger that I saw was random extreme variations (i.e., career years) distorting year-to-year improvement stats, so I only controlled for year-1 raw OBP. Your approach is perfectly defensible. I just went a slightly different direction.
A reasonable hypothesis.
Isn't there a game theory issue here? If I know for sure that the other manager won't pitch out, then I can feel a little more comfy at first base as I plan my mad dash to second and my manager can call for the SB at will. You have to do it once in a while if for no other reason than to keep them honest.
Sam Miller, what are you drinking?
This is a reasonable theory. I do wonder how much "last chance" vote Morris got/will get. Frankly, that sort of thing bugs me, but it's a real phenomenon.
Clemens/Bonds supporters voted for an average of 6.6 names other than C&B. Their overall total was 8.6 names.
Yeah, but 1997 was fun.
Taubensee and Willie Blair. That's OK, I think the Indians did OK in that deal. They got Dave Rhode too!
Completely true story: I have never seen the movie Major League. My parents had a "No R-rated movies" policy, and by the time I was old enough, it seemed rather campy and by that point, I wasn't much into TV or movies in general.
John Wayne Airport is a truck stop that just happens to serve airplanes.
Sorta. Sports psychologists might have training in neuropsych assessment, but I'm thinking of this being something that's talked about within an organization. If a GM ever comes out and uses the term "executive functioning" or "frontal lobe" correctly in a sentence, I will declare his team to be my new favorite forever and ever.
Hooray, after 2 years of begging, it finally happens!
Or would they want to take advantage of their park?
There'd be plenty of temptation to defect though. If I'm the only guy in the league who has the DH, I can not only have an offensive advantage (more attendance?), but sign a guy who will be very helpful 81 games a year while the rest of the league has no such guy, and since I'm the only buyer, I have monopsony power.
I should what...?
Hooray, I got podcast lovin'
Not necessarily. Kevin estimated that he'd be missing 12 things. It appears that there's one extra thing that he's getting half right.
If MLBTR/Rosenthal/Morosi/etc. get 8% of what's going on, then for every one thing that they have, there would be 11.5 other things going on.
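The arithmetic behind that, as a quick sketch (the function name is made up for illustration):

```python
def unseen_per_reported(fraction_reported):
    """How many unreported things exist for each reported one."""
    total_per_reported = 1 / fraction_reported
    return total_per_reported - 1


# If 8% of what's going on gets reported, each reported item stands in
# for 12.5 items total, i.e., about 11.5 others.
others = unseen_per_reported(0.08)
```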
Bad closers usually become 7th inning guys. Think of how many former closers who are now just replacement level relievers are around because "he had a 30 save season a few years ago and maybe he still has something in the tank."
But he went to my high school...
Whither my fellow alumnus, Stetson Allie? Can he make it at 3B? (We actually did go to the same high school.)
Yeah, the agents would probably hate it.
It would have to be a stipulation that came with declaring for the draft. Your rights are held by the drafting team and your options are sign with that team and accept the bonus or go home. The tradeoff is that your bonus is essentially being bid up to the highest bidder. The bonus system right now is determined by a bunch of games of chicken being played between draftees and teams. I'm suggesting a more rational market approach.
They would have a larger pool of cash so would be able to outbid (in theory) the better teams. And I'd contend that the current system forces them into a high-risk rebuilding strategy, whether they like it or not. They may choose to pursue it in this system, but they may go with another method.
Proven winner. Was on the 1995 Braves World Championship team.
I'd envision it as pick 1 gets auctioned, the player is selected, and then pick 2 is auctioned. Lather, rinse, repeat.
That's some high praise. Thank you. Could you say that louder and make sure that my boss is in earshot?
Googling is cheating...
This is officially required reading for anyone who considers her/himself to be a Sabermetrician.
You could at least adjust for "Well, of guys who were age-23, the average improvement made was X. Smith improved Y."
Yes. Lots of upside. Potential 7 on the snark.
This falls into the "hitting is the only way to help a team" fallacy. In fact, if the award were best hitter in general, it would (and should) go to Miguel Cabrera. But, a good chunk of Trout's value was in his baserunning and defense. During the months of August and September, when you look at Trout's contributions to the Angels including these other areas, he actually outperformed Cabrera in the last month of the season.
As to not making the playoffs, had Detroit been located on the West Coast and Anaheim in the Midwest, then the Angels would have won the AL Central.
Tip that Royals/Brewers/Angels hat to Mr. Greinke. Actually, no matter what hat you're wearing, tip it to Zack.
I don't know anything about what his potential suitors have in mind to help him. My hope is that it's "a lot".
In Texas, he had an accountability partner who traveled with him. I'm guessing that he had friends on the team who helped him out (as a good friend should.)
There are data on throws to first going back (I think) to 1993, and a throw to first does decrease SB success rates.
Are teams throwing over to first less often?
No one pointed out the biggest need of the off-season. The Astros don't have a DH.
This is brilliant.
Ding ding ding. A billion points for you!
The big unanswered question is whether that "counts" as a defined role. For a long time, roles have been based mostly on inning (aside from LOOGYs). Is it enough to say, "When you see things getting crazy, start warming up"? I think that's an open question.
And in response to a question posed by a particularly handsome young man, using the term "replacement level" in a sentence.
I always budget things high, and I put in there a little bit about how a team could probably negotiate a pretty good price given the size of that sort of contract. If the price falls, then the amount of gain that would have to come for it to make sense goes down too!
Yeah, there will be a lot of deadweight loss in that a lot of food will go to guys who will never see a major league uni. But how do you do it otherwise and not cause a knife fight? Even with the extra expense, it's still cost effective and maybe you strike gold with a guy who just needed a good sandwich.
That's massively exaggerated. A player may have limited English, but that doesn't mean he doesn't have survival skills. One of the first things that he will likely do is to figure out either where he can get food and order in his native language or he will prioritize words related to food in what he looks up or he'll ask a friend on the team.
I agree on the issue of whether or not kids... erm, prospects will eat their veggies. Plus, you can't stop them from stopping off and grabbing a not-so-healthy bite to eat after the game. My argument would be not that this will solve all problems, but that it is better than the current situation.
As to the salad, the problem is that a salad is offered by restaurants as a low-calorie option. Baseball players need their calories, but unfortunately, the only thing that can provide them with those calories comes in the form of high-fat food. The body needs protein and complex carbs as well.
There are teams who have "looked at" this issue. You mentioned the Cardinals and MLB ran this on them a while back: http://stlouis.cardinals.mlb.com/news/article.jsp?ymd=20100223&content_id=8127110&vkey=news_mlb&fext=.jsp&c_id=mlb
A few other teams have nutritional consultants who work with the players and some teams do provide some food. But even then, you're talking mostly about giving out good advice to players rather than proactively setting food in front of them. Information alone does not solve public health problems. I'm saying that a team who went fully into a comprehensive program, even if it were expensive, would net some games from it.
Maybe Cespedes (Cuba) and Darvish (Japan) get points for coming to another country and putting up good numbers. That's about all I got.
And no, New Jersey is not another country.
Not sure... I'd have to look.
As to the second question: there have been roughly 100k plate appearances in post-season history, and about a third of them are concentrated in the wild card era. From a methodological point of view, bigger sample sizes are always appreciated. If all that we had was the pre-divisional World Series only model, that would make things a little tougher for me as a researcher.
I'm personally not much of a purist. I've found most of the purist critiques of the expanded playoffs are aesthetic in nature. I have opinions on the matter (I like the expanded playoffs), but as far as arguing whether it makes the game more beautiful, that's a very subjective call.
I don't have a grand unifying theory of the post-season yet. I wish I did. I'd be rich. But, I think that we don't take into account that players deal with different types of pressures in the post-season and that these can have real effects. What exactly those effects are need to be properly studied and not outsourced to cliches, but I think we ignore them at our peril.
There are very few teams who aren't Saber-friendly any more, or at least Saber-aware, including teams that would make you say "Really? Them?"
Ack, my bad. I had Berkman and Holliday switched in my head.
Cano's September/October OPS was .999. Below that of Cabrera to be sure, but not Neifi Perez-esque either. I wonder what their defensive and baserunning contributions were during those months.
I'm personally willing to entertain the hypothesis that performance in crunch time is harder (and more valuable?). But, I'd also suggest that we need to be careful of recency bias. Part of the reason that September is so salient to the minds of voters (whether official or not) is that it takes place right before the voting and is easier to remember than May. Then there's the flip side of the "crunch time" argument: had one of these guys gone nuts in May, his team could have stored up wins, clinched earlier and coasted through September. It makes for a less interesting story, but an equally effective way to win a division.
If you're going to apply the logic that clubhouse presence counts, you have to apply it fairly. You can't simply assume that Cabrera (or Trout) "made their team better" while ignoring the fact that the other might have done so as well.
Jay just likes to swear.
To his credit, I would have put Miggy 4th. In general, I'd personally rather see the MVP be a position player only award, but since pitchers are eligible, you gotta give props to Verlander who's having a fantastic year and is more valuable to the Tigers than is Cabrera. Cano's raw numbers aren't as pretty as Cabrera's, but he plays an above average second base compared to Cabrera's below average 3B. Plus, positional adjustments matter. And if you want narrative, Cano's been the only Yankee who hasn't been hurt this year... and the Yankees are going to the playoffs.
The reasons that there are several different versions of WARP come down to different methodologies that are used to determine the run value of various events. There are also different philosophies on where to set replacement level. They're all roughly the same from 30,000 feet, which is why you don't see a guy with 10 wins on one measure and 2 on another, but when you get down into the gritty details, they do differ.
I guess it would have to be a 3-team tiebreaker running concurrently with a 2-team tiebreaker and the winners would play each other. Even more fun if it was a 5-way tie for the second wild card. Elimination baseball!
The standard cutoff points seem to fit, but I think interpreting it as a validation of the standard peak model falls short. The last sentence that you write is the one that gives me the most trouble.
A nice, smooth, uniform, upward curve/line would have a high correlation coefficient running through it year-to-year. Straight lines have a correlation of 1.0, after all. These numbers tell me that we need to view 24-26 as more of a chaotic, malleable period. Some will take bigger jumps than others. Some will fall. At 26 though, the chaos stabilizes. At 29, some start to decline, while others hold.
I think that we need to get away from the assumption that everyone follows the same curve (up, plateau, down) and embrace player development for the much more chaotic process that it is.
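To make the correlation point above concrete, here is a minimal sketch in Python. The rates are made-up numbers for five hypothetical players, not data from the study. If everyone improves by the same amount from one age to the next, the year-to-year correlation is 1.0. If some jump and some fall, it drops sharply.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical rates for five players at age 24 (illustrative only).
age_24 = [0.300, 0.320, 0.340, 0.360, 0.380]

# Everyone improves by the same amount: a smooth, uniform curve.
uniform_25 = [x + 0.010 for x in age_24]

# Some jump, some fall: the "chaotic" picture.
chaotic_25 = [0.350, 0.310, 0.385, 0.330, 0.345]

r_uniform = pearson_r(age_24, uniform_25)
r_chaotic = pearson_r(age_24, chaotic_25)
print(round(r_uniform, 3), round(r_chaotic, 3))
```

A low year-to-year correlation in a given age band is what you would expect to see if players are moving in different directions, even when the group average is still rising.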
Good. I am Chuck Norris. This is what I do when I'm not spin-kicking people.
Trout actually leads in WPA. And Cabrera has been the 4th most un-clutch player in the AL.
I look forward to the day where even in sports, we look at someone and say, "Is s/he bringing more love and kindness into the world?" and if the answer is yes, to say "That's awesome!"
Unfortunately, the full answer involves brain dissection...
I think a good way to see whether it's a habit or conscious process is to see whether it persists in situations in which the behavior is clearly not beneficial.
I should have been better about saying that explicitly... Win some, lose some... (and when you lose some, you're winsome)
Indeed he did. My point in this series isn't to deconstruct the players that I highlight themselves. My goal is to illuminate issues that I saw come up in Jason's series. There are probably a couple dozen guys who would benefit from a more aggressive approach. Wil Myers was simply a vehicle to bring that idea to the forefront.
MLB specifically ruled on this one a few years ago that in the event of a "mixed" three way tie where you are tied for both the division and (at the time the only) Wild Card, being in game #163 for the division will not knock you out of the Wild Card tie.
It's actually not random who gets the bye. There's a set of tie-breakers to determine seeding (starts with head-to-head). The top seed gets to pick whether they want to play two home games (beat both #2 and #3 at home) to advance or one road game (make #2 and #3 play, and then go to the winner's stadium for a winner-take-all game).
My point there is that at 11-0, the math is pretty obvious and the logical thing to do is to punt the rest of the game and live to see tomorrow. But let's say that you do some math the day before and realize that the best idea is to punt the whole game. In both cases, you're giving up. It's just a matter of when. Would the Yankees or Orioles or whatever team did it be hailed as wise or fools?
I would gladly admit that the circumstances that would need to come about are a little far-fetched. But let's say that the Yankees would start CC while the A's, due to having to sprint through the last week of the season, would only have some #4 guy available. And let's give the O's Justin Verlander.
I meant this piece to be more theoretical than anything. My ultimate thought is that there exists some set of not-impossible circumstances in which punting makes logical sense.
Why yes they did...
Could have sworn I've seen that title somewhere.
Also, as an Indians fan in the 1980s, thou shalt not diss Brook Jacoby as a mere throw-in.
'91 was the season that the Indians moved the fences back in an attempt to build the park around... Alex Cole (true story). And I may or may not have seen a couple of those home-runs from the good seats at Cleveland Muni despite having an upper-deck ticket.
I won't tell...
Welcome to BP, Cory. I may end up writing about negative self-talk, which I would generally (although not completely) lump under depression. I don't practice therapy any more, but there are a few techniques that can be used around negative self-talk. Most of them follow the Stuart Smalley approach.
Ben, I'm going to use this in my arb hearing this winter.
We'll talk about anger control soon enough.
I really am 5'11". And married.
And we need to do another StatSpeak reunion.
Those data could be parsed...
I appreciate the desire to be precise in terminology. Here, I use "racism" very broadly, and I can appreciate that, for some, too broadly.
I recognize that attributing the results to a bias is an assumption, but it's at least a reasonable one. The methodology provides direction, but not precision. Still, it's the best that we have right now.
Yeah, but announcers gonna announce.
That was actually brought up in some of the original critiques to the Atlantic article. I suppose that work ethic can transcend the language barrier (you can watch a guy work out/take extra grounders/etc.) But there is something to be said for this. Suppose you have broadcasters who don't speak Spanish and miss a player who does A LOT to lead in the clubhouse... because they don't understand what he's saying. In this data set, there's no way to pull that apart.
Thanks. I giggled when I wrote that.
Yeah, that's another problem with qualitative analysis. Because there's no standard definition of what you're looking at, there's all sorts of room for bias in creating that definition (and then implementing that code.) There are ways of increasing rigor (inter-rater reliability, which the researchers do not report on...) but it's always lurking in the background as a research methodology problem.
This is the sort of research that isn't fun to sit with. There are a lot of what-ifs. I'm also convinced that learning to live a bit more with that uncertainty would push the field of Sabermetrics further.
Awww, I got podcast lovin'!
A team with 2 or 3 good relievers wouldn't be able to hang around in a 19 inning game, unless the other team's pen is just as bad. Perhaps they are lucky enough to be involved in games that don't call for that. Perhaps not.
I believe that you mis-understand what I'm saying there. We agree that baseball is a game with a lot of randomness in it. At least the way that I'm measuring things, there doesn't seem to be a lot of repeatable skill on the team level as to what happens in a one-run game. In the same way that the playoffs are a crapshoot (because of the shortened timeframe, which breeds more noise), an extra inning game is really a 1-inning game... on the team level, that breeds more noise.
Also, 1000 bonus points for using antepenultimate correctly.
I recommend Sam Miller's article on the subject (from today). Bullpen performance is beset by all sorts of small sample size problems and luck.
It's true that the Orioles have played a lot of extra inning games, which are best won by good bullpen performance. To put it less charitably, the Orioles have been fortunate to be in a bunch of the type of one-run games that have called for a good bullpen and fortunate that their relievers all seem to have four-leaf clovers in their back pockets this year.
The model only sees games that end in one-run. Suppose that the O's go into the 7th inning leading by one, and pitch 3 shutout innings. The game ends with a one-run O's win. Conclusion: Good bullpen performance, but why didn't the offense make it a 3 run game?
Would a better variable be time elapsed? Figure that all pitchers stop hitting once they leave high school (age 18). Take the pitcher's age minus 18. I can't imagine that it's pitching extra innings that somehow eats away at the neurons that encode the muscle memory for hitting, but rather the simple passage of time and their not being renewed/tended to.
Thanks Jay, I will. ;)
Is this not just an engineering problem though? If we could get the robots to process fast enough, then the calls could be instant.
That's one of the next things in my queue.
Direction, then precision.
How could you not include Stetson Allie? He totally went to the same high school that I did!
I'm happy to take requests.
Hooray! Someone caught it!
I'll talk about that in next week's article. The short version is that you can be fairly certain that a player's GB rate over those 80 PA was reflective of his true talent _over that period of time_. But true talent unto itself can change and does so more rapidly than we'd like to think.
Guy's a no-talent hack if you ask me.
The standard way that this has been done is that at r = .7, you take 70% performance and 30% league. As to how to weight performance vs. projection, that's one that I've never really looked at.
I use the statistical program SPSS for my work, which can be publicly bought (although it is expensive). The good news is that any spreadsheet style program can handle Retrosheet data.
I had dreams of doing these initially... I'll see if I can fire up my PFX data base later.
A few thoughts come to mind:
1) Players wear their own team's uniform, suggesting that the game is supposed to be a collection of individuals. If baseball wants the idea that teams should take pride in themselves, perhaps they should wear those batting practice jerseys that they do at the Home Run Derby?
2) On snubs, people will point out the big ones (Michael Bourn), but what percentage of the roster would get say 75% approval that "yeah, he belongs there." 90%?
I think I've actually seen this done elsewhere. I just can't remember where. Confidence intervals are woefully under-used in baseball... and life.
Richard is right. There's a big difference between "It doesn't exist" and "I haven't found a good way to measure it yet... maybe it's because it doesn't really exist and maybe because I'm bad at math."
It never does. Don't worry, there's plenty more stuff to write about.
My wife and I just bought a house. I own my own basement!
There are some topics that frankly I will be silently avoiding. I certainly can't duplicate things I did internally, and I probably will tread lightly in other areas. With that said, there are plenty of areas that I think are interesting (and I hope y'all do too!) which don't conflict with my NDA.
I can't talk about specific data sources, but I will say I believe that there is still A LOT that can be learned from good ole Pitch FX and Retrosheet.
Question everything. Constantly. If we are to be scientists of baseball, constant questioning is the only way that science works.
As you might imagine, there are a few projects that I've had to shelve over the past 2 years. There will be fun to be had. Stay tuned, and watch out for the gory mathematical details.
He's not a terrible man. He's a man who views the world of baseball through a different lens than I do. There are many paths to knowledge. I am but a traveler on one of them.
I'd love to say that there's a juicy story, because it would be so much more interesting, but there's not. We did a bunch of projects, we got to the end of one, and it didn't make sense to start another one. I have no hard feelings and I genuinely wish them well.
You might appreciate that I'll have to decline to talk about the specifics. My work on the inside is covered by a non-disclosure agreement.
But, may I point out a similar process that has taken place in public. DIPS has gone from "there is no difference between pitchers in preventing base hits on balls in play" to "Well... sorta... but it's a lot more complicated than that." (Shout out to Mike Fast, among others, for a great deal of work on that.)
Awww thanks. Good to be back.
There probably is. Those are Retrosheet data 2003-2006, and Colin Wyers has previously (and elegantly) pointed out the flaws in RS batted ball data. Also, that's LD/PA, not per BIP. That probably makes it look a little more stable.
Thanks for the "gory details" nod.
One minor edit. In my original piece, I looked for where the R passed .70, which is (roughly) an R^2 of 50%. At that point, a projection of talent incorporating regression to the mean would be 70% performance and 30% league average (or whatever mean you want to use).
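The 70/30 weighting described above can be written out as a one-line blend. This is a minimal sketch, and the function name and the example numbers (.400 observed OBP, .330 league OBP) are hypothetical, chosen only for illustration.

```python
def regress_to_mean(observed, league_mean, reliability):
    """Blend observed performance with the league mean.

    reliability is the correlation r at the sample size in question:
    at r = .70 the estimate is 70% performance and 30% league mean.
    """
    return reliability * observed + (1.0 - reliability) * league_mean

# Hypothetical example: .400 observed OBP, .330 league OBP, r = .70.
estimate = regress_to_mean(0.400, 0.330, 0.70)
print(round(estimate, 3))  # 0.379
```

Any other mean (positional average, a player's own projection) can be passed in place of the league figure. The weighting logic stays the same.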
I swear, I'm alive and well, Ben!
@abbiet: I am alive and well, I promise. I am still doing baseball work, although for people who prefer that I keep my findings to myself (and them...) I miss writing for BPro something awful though. As you can infer from Eric's shout out, I still keep in touch with what's going on, but for now, it will have to be behind the curtain.
The Cubs haven't won a World Series because... they're the Cubs. Actually, I think that the day games would work in the favor of the Cubs. They can get used to playing day ball. The other teams have to adjust. I don't think that we would have the sample size necessary to look into this hypothesis, but it's an interesting question, no doubt.
Am I now the official "sample size guy?"
You could screen for sleep disorders with a sleep study, but mostly, this is behavioral, such as picking a good bed time and sticking to it.
The outcomes on the ground balls include some errors. When I say "single, runner to second" that might just be a muffed grounder that went for an error. Either way, it doesn't matter. The point is that the end state was 1st and 2nd, no out. That's what I want to model.
Umpires aren't in there. There are a number of factors that in theory could be added, but then you overload the regression.
There are certain players who just seem to get beaned a lot. It's a relatively stable stat year to year for hitters. It probably has to do with standing close to the plate. Or being a jerk.
@Brian24, it's not that the recognition pathways aren't there, it's that they aren't strongly connected to then reacting as a lefty. He's seen a pitcher throw lefty, he's hit left-handed, but he's never done both at the same time. That's the issue.
I don't personally. But you could perhaps see how someone might mistake him for a good long term investment.
The thought process would be that while Howard is a drop from Pujols, if the Cards aren't confident that they can sign Pujols long-term, then the decision is between Howard and some other random first baseman. It does assume that the Cards are taking a longer view, at least in the financial aspect of things. This is probably where the deal falls apart.
I would argue that the booing of Favre had more to do with "You can't leave us! You broke our hearts!" rather than "Your departure has made our team worse." The focus is still on the relationship between the individual and the fans. I don't know much about football, but if Favre's departure made the Packers a better team, then the proper response would be to cheer him to thank him for leaving.
You're right that it's much easier to pick out an individual from a cognitive load perspective (it's easier to focus on an individual rather than do all that messy work), but I'd argue that it's deeper than that. Hence, my example on the name the greatest teams vs. name the greatest players. There's no philosophical reason why teams shouldn't be as available. It's a matter of what we're taught to focus on.
Is Paul Zuvella still playing?
It was an ESPN piece. They ask that we keep it short and sweet.
Well, that assumes that most batters are up there taking on the first pitch. That's not a bad assumption, as it seems rather common. But, if you have a batter who swings a lot on the first pitch, it might be a good idea to throw a pitch outside and hope he chases it.
Here, we have to make a distinction between getting strike one (always a good thing) and how you get the strike. Should you throw something down the middle, try to paint the corner, or throw something that you hope the batter will chase?
The permutations on this are maddening. Understanding the inner workings of this matchup is the next big task of Sabermetrics.
I'm not so sure that Markov would work so much as a repeated interaction Nash-type game. Surely, a batter's expectations are going to be influenced by the previous few pitches, because he intuitively knows that a pitcher is not truly randomizing his pitches and he might have an idea of the pattern that the pitcher is using.
When I originally conceived of this article, I thought about going there, but to be honest, I ran out of time this week... It's rather involved to run the numbers. I might at some point in the future though.
This "intro" of which you speak would probably end up being a book.
My father put it a little less elegantly when he said that he knew enough French to either get lucky or slapped, but that his face was red a lot.
The reason that I ignored it was that I couldn't quite figure out which correlation coefficient was being referred to... If you mean the stats on the regression itself, "combined talent" had a Wald value (how you measure factors in binary logit) in the 2000s. Recent OBP had one around 25. Both are significant, but talent is much much much better, and when you look at the actual effect for recent OBP, the effect was seen at the fifth decimal place.
I suppose that enjoyment is in the eye of the beholder. Here, I'm much more concerned with winning the game rather than enjoying the game. It's not the way that everyone wants to experience a game, and I can respect that. But that's the goal here. Take it as you will.
I have done quite a bit of this research. It really depends on the stat in question. Some stats stabilize after 50 PA. Some never do over the course of a season.
I had originally slotted a paragraph in here for that very thought, but then cut it for space. I often hear both on the radio/TV. I think that the hot streak is a little more seductive, based on the fact that it's recent and is likely to have won a team a game (and we like to single people out as "having won the game.")
You're a little off on your methodology. The dependent variable is "got on base in this at bat." It's either yes or no. The predictor variables are his OBP over the last 10/25/100 PA, as well as the control variable of pitcher/batter quality. So, I'm looking at his recent performance as a continuous variable.
It's an interesting suggestion to break it down a little bit and look at the cold players (say, recent OBP below .200) vs. the hot players (recent OBP over .500). I might.
It would need a slightly different methodology, but no reason that the study couldn't be done... hmmm.... *begins scribbling equations furiously*
I should post that above my desk as a mission statement.
For the reasons that OBP is really easy to calculate (work smart, not hard!) and that I needed a dichotomous variable for my outcome. BABIP is a big deal in pitcher's OBP against, although in a giant sample of 10 years, that can wash out a bit. If it helps you to know, the combined batter/pitcher OBP factor was the one that was unbelievably significant.
I can't control for injuries, and yes that is a confound. It's the major weakness of this methodology. I wish I had a way around it.
I've had those times too. Of course, when I'm bowling it means I might break 100.
An interesting counter-example. 2008 ALCS, Red Sox-Rays. Boston comes back from a 3-1 deficit to tie the series at 3, much like they had done the year before to my beloved Indians... and I heard something or other about them doing that to the Yankees one year... It was assumed that it was a foregone conclusion that the Red Sox would win Game 7, because they were locked in.
Did I mix up my occipital and temporal lobes again? My old neuropsych supervisor wouldn't be pleased. Thankfully, he doesn't like baseball.
People who talk a lot but don't know what they're saying. I can't think of a better definition of politician.
The time element is fascinating. I may go deeper into that issue in the future.
... don't say it...
Not necessarily. Walking is actually more related to being reluctant to swing. Strike zone command leads to avoiding strikeouts.
As someone who taught college classes, GPA tells you almost nothing about anything.
I actually asked Kevin Goldstein how this sort of assessment plays out in scouting. He said that it varies, but that there's not a lot of it out there.
Over the last 30 years, there's been a lot more attention paid to mental health issues. People are beginning to realize that they are nothing to be ashamed of and that they can be treated. Would it have happened 15 years ago? It's less likely.
I don't know that we could create a measure for confidence based on game data (Retrosheet, Pitch F/X, etc.) that everyone would agree is "confidence." But we could take a look at some of these issues and try to make some reasonable conjecture. This is the problem with psychology research more generally. No one is psychic and people lie about their true motivations/emotional states.
I can't tell you how many times I say the words "If only..."
I'm not a fan of writing these types of articles either. I like to have a punchline at the end too. If there is a conclusion to be drawn, it's that just because a methodology works in one situation, it might not work in another.
You're correct that context matters in other decisions, and in a 15-2 game when both teams just want to go home, it's not a big deal whether the runner stays or goes. You've identified another specific situation where it might not hold (the ninth inning, behind by a few runs), but what about the other eight innings?
This is a very simple, well-defined decision (stay/go), that has definable outcomes (via the run expectancy matrix... if you want to monkey around for some other context, fine) and if the goal is maximizing the number of runs scored (which is kinda the point of the game) then this is a way to maximize run scoring.
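For anyone who wants to see how the stay/go decision falls out of a run expectancy matrix, here is a minimal sketch in Python. The run expectancy values below are illustrative placeholders, not the figures from my data set. Plug in whatever matrix you prefer. The break-even success rate p solves p * RE(safe) + (1 - p) * RE(out) = RE(stay).

```python
def break_even(re_stay, re_safe, re_out):
    """Success probability at which sending the runner equals holding him.

    Solves p * re_safe + (1 - p) * re_out = re_stay for p.
    """
    return (re_stay - re_out) / (re_safe - re_out)

# Illustrative (hypothetical) run expectancies for a fly ball caught with
# a runner on third, nobody else on, and no outs before the catch:
RE_STAY = 0.95  # runner holds: man on third, one out
RE_SAFE = 1.27  # runner scores: one run in, bases empty, one out
RE_OUT = 0.10   # runner thrown out: bases empty, two outs

p = break_even(RE_STAY, RE_SAFE, RE_OUT)
print(round(p, 3))
```

Sending the runner adds expected runs whenever his chance of being safe is above p. Change the base/out state or the matrix and the threshold moves with it.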
The weighting is done by the binary logit analysis that I alluded to in the text. I didn't fully report it, because ESPN wanted me to keep the numbers light. However, in that case, I controlled for both distance and speed.
Small tweak: more runs would be scored with 100% send than are scored now. There's probably a point around (just guessing here) 90ish% that would be the optimal point.
Yeah, that's a goof on my part. I still stand by the overall message. There's no reason for success rates to be that high.
That about covers my thoughts on the issue. There are situations when the runner should hold, but they are much rarer than is generally believed.
I used the "always send" to have a little fun with the data. The results surprised me. I went with it. It's not actually going to happen (and it isn't optimal), but it's better than what's being done now.
I used 250 for illustration. It's probably better read as "a really long throw."
Conventional baseball wisdom is filled with all sorts of things that are demonstrably false. The run expectancy matrix builds in the fact that even with one out, the likelihood that the runner scores is high.
It's a lot harder than you might think. Consider that we're talking about a pinpoint 250 foot throw, under pressure. Get the angle wrong or the force behind the ball wrong, and you've messed up the throw. The runner just has to run straight ahead for four seconds and slide.
In a separate study, I was able to control for depth of the fly ball and speed of the runner. I still haven't found good outfield arm ratings. The findings were quite strong.
Well, when that's what you get paid to do...
It's not easy, but somewhere in the back of his mind, the 3BC is doing some sort of calculation. It's just skewed toward keeping the runner there more than it should be.
19 stayed at third base. Had they gone, they might have been safe, they might have been out. We'll never know.
I think this line of logic falls into the trap of "runs in the ninth inning are worth more than runs in the second." Runs are runs.
I am making that assumption. But if there are a bunch of 100% chances, then why aren't they all being sent?
There are situations where I wouldn't send the runner unless it was a total gimme. Ninth inning down by 2, why risk the runner when you're going to need another hit anyway comes to mind. However, runs are runs and the game is always about scoring more of them.
Good points there.
You all beat me to it. BP readers are smrt.
That thought occurred to me too. However, I still think that the findings stand. If 3BCs sent everybody from the 73% chances to the 99.9% chances, we'd likely see an overall success rate around 85% or so. I'm consistently seeing 95%+. The other thing, and I was only able to allude to this in the piece, was that I have done other work suggesting that it's almost never a bad idea to send a guy on a possible sac fly. I generated "would he make it?" probabilities based on distance and speed, and they were almost always (98% of the time or so) above the break-even point.
In that case, I meant "runner on third, no other runners." Sorry, I wasn't clear. To be in that situation, you'd basically need a leadoff triple followed by a fly ball.
Gasp! You've uncovered my kryptonite. I don't have a good minor league database on hand. But you're right. It is limiting the sample. I'm willing to live with that limitation for now.
I have recent Ph.D. syndrome. I just spent 7 years jumping through hoops to get those letters and I'm going to enjoy them!
I'll call Matt Swartz.
Predicting peak is a whole 'nother series of articles. Clearly, we don't have a crystal ball to see when his career will end. I think this work gives us direction (better players seem to peak later, and some guys may be late bloomers.)
If I hope to get any point across in this article, it's that this mantra of "peak at 27" is over-played. If you absolutely forced me at gunpoint to say one number, 27 is the best number, but it vastly over-simplifies things. Human development isn't that linear or precise. It's messy, and I think that the bulk of the work is in getting in and cleaning up the mess.
Those are guys who "survived" their first year, but not to age 27. So they played beyond their debut season.
Matt, no I didn't experiment with covariance matrices. I'm not as familiar with my covariance structures as I could be. I've heard of compound symmetric before, but never had the time to fully study it. Can you give me (and the rest of BP reading this) a quick summary? I like AR(1) specifically because I have repeated measures and I know that the year-to-year measures are going to be correlated in some ways.
It's possible. There are a small number of guys who played to age 35 with no discernible offensive talent, but who could field at short like a dream. Clearly, the team figured there was some value in keeping them around.
Actually, I think we have a lot to learn from studying the margins. People are fascinated by the exceptional cases, but there's so much more to be learned by figuring out why it is that some players just don't make it.
I don't have that handy. There's also the issue of top-10% when? At the beginning of the contract? At age 27? In a specific year? It's something of a moving target. My guess is that they are best represented by the early debut/long career group.
Yes, once the pitcher moves off the mound and into the field in the AL, the DH is gone.
Second order interaction terms!
But on a day when Freddy Fourth-Outfielder is playing LF...
You're right. It wouldn't work in Fenway, even if Fenway were an NL park. Your second point is some interesting counter-strategy. I hadn't thought of that. I'm not sure if the pitcher would get some additional tosses after switching. Let's assume not... but relievers when they throw two innings also have to deal with the break when their team is batting.
Luke, I hear ya and I struggle with how to balance out how to present exotic methods. Part of the problem is that if I went all the way into explaining AR(1), then my articles would become the Wikipedia entry and that would overwhelm the baseball end of things. To that end, I try to simply state what I do. There are people with experience in this area who will understand the short-hand, but alas there are people who won't (not through any fault... like you, they probably just haven't learned this particular method.)
I regret that it sometimes gets into "trust me, I'm an expert."
The short version of AR(1) goes like this: I'm using a technique called mixed linear modeling. What that does is that it allows me to run a regression that incorporates both repeated measures (multiple seasons of data from the same player) but also look at factors that are outside the player that might influence the outcome (manager, park, etc.) Now, when we're dealing with within-subjects effects like that, we need to specify to the program what sort of covariance matrix that the program should use. AR(1) is a covariance matrix that allows for the fact that if we know what Player X did in 2004, we should have a pretty good idea what he did in 2005. The performances are no doubt correlated. AR(1) specifically says that we would expect certain elements in the covariance matrix to be correlated, because they come from the same person, and that the model should count this as within-subjects (individual person) variance, and not mistake it for variance associated with the other factors under consideration.
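For the curious, here is what an AR(1) covariance structure looks like when written out. This is a minimal sketch of the textbook form, in which the covariance between two seasons is sigma^2 * rho^|i - j|, so adjacent seasons are the most strongly related and the relationship fades with distance. The function name and the numbers are hypothetical, and a stats package may parameterize it slightly differently.

```python
def ar1_covariance(n_seasons, sigma2, rho):
    """Build an n x n AR(1) covariance matrix.

    Entry (i, j) is sigma2 * rho ** |i - j|: the variance on the
    diagonal, and covariances that shrink as seasons get further apart.
    """
    return [
        [sigma2 * rho ** abs(i - j) for j in range(n_seasons)]
        for i in range(n_seasons)
    ]

# Hypothetical example: four seasons, unit variance, rho = 0.5.
matrix = ar1_covariance(4, 1.0, 0.5)
for row in matrix:
    print(row)
```

With these inputs the first row is 1.0, 0.5, 0.25, 0.125: knowing 2004 tells you a lot about 2005, somewhat less about 2006, and so on.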
Peter, my degree is in psychology. I learned all my stats through the research requirements in the program. If you do want to learn, take a class. This sort of model is an advanced model and you have to work your way up to it, but it's only knowledge.
What you've said is true and might make a good follow up. For what it's worth, I was more interested in looking at the shape of the production for my own nefarious purposes. I was thinking of looking at some one-number outcomes, but it got cut for space.
The model specifically corrects for that. That's the beauty of HLM. For example, if I end up managing a team of walk-happy guys who have been walk-happy elsewhere, then the model will not credit me with increasing their walk totals.
It's possible that Williams was better when he had a chance to sit down, think things all the way through and write it out, but couldn't explain things on the spot in a coaching situation. It's two different skill sets.
Having been a professor, it's more than just that. Understanding the underlying process is necessary, but not sufficient for either doing something or teaching it. Williams was able to physically execute it. Guillen is apparently able to explain it well. There are some people who can both physically execute and explain well, but those skills are probably independent of one another.
On the first part, I think that's a fair statement in that the manager should be treated like the principal of the school (to extend the metaphor).
On the second part, when I did my pitchers study (need to get those archives back on line somewhere!), I ran into some similar problems. I specifically sought to correct for them in this study (some of those overly-stringent inclusion criteria were aimed at that). The Angels may walk more because of an Abreu effect, and that's fine. It won't show up as a Scioscia effect, and if it isn't one, it shouldn't.
I entered home park as a fixed effect. Perhaps it isn't clearing up all the noise.
The full lists would have overwhelmed the text. If you (or anyone else) would like a copy of the full list, send me an e-mail back channel and I will send it to you.
I think that you're closer to the truth here. Much of the dirty work is probably the job of the specialty coach. However, I subscribe to the theory of management that says that Ozzie picked the coach and let him do his work, so Ozzie gets credit for that.
Tversky indeed! I have the feeling that you wouldn't get any takers from the GMs. I think that the appeal of RoboPitcher here is that he looks human, but instead of having to suffer through the inconsistency that goes with humans, he's a guarantee. I think that's what the GMs would pay for. And they'd pay for it out-of-sync with your proposal, which of course is exactly the same thing.
Conclusions: humans are irrational.
Just looking at the list, it doesn't look like there's a correlation one way or the other, although a lot of the guys who got fired were at the top.
You're right that there's a lot of other gunk in there. It was the most difficult piece of the work. I was pleased that I was able to get some sort of a reliable measure, but without mind reading abilities, this one would be really hard to do properly.
I'm not nearly that cool. Or good looking.
I meant 5 SB opportunities as well. I think we're on the same page methodologically. You are correct in that the number of PA/BF/opportunities can affect ICC, much in the same way that it would affect a year-to-year (yty) correlation. However, as Mr. Solow points out, so long as you set your inclusion criteria high enough, it's not going to make a big difference. In this case, I actually upped the criteria a bit and didn't get much improvement in ICC. It's something of an asymptotic relationship.
In this particular case, there are two different questions that one can ask. One is, "How reliable is this stat year to year?" (that's the one I chose to ask, and the answer came out to .538). The other is, "How many PA/BF/opps does it take before this stat becomes reliable?" I haven't run that one yet.
Not sure. Brian's comment below develops this same idea. I think you've both hit on something interesting.
Ah, more variables. Actually, this is a good point. I've got a lot on my plate this week, but this one might be worth a look.
Mr. Solow's response is mostly right. ICC is a measure of consistency across the years. I did toss out most of the interim managers who only had a few games at the helm when I ran that ICC, specifically for sample size reasons. (A manager had to call for at least 50 SB attempts to be included.)
Think of ICC like year-to-year. If I only had five observations per year, then I'd probably get a lot of random variation and so not a lot of consistency within managers over the years. My choice of inclusion cutoff was somewhat arbitrary, but based more on the realities of what we're observing. We look at managers based on the season-to-season level, so I evaluated them as such.
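To put the "think of ICC like year-to-year" idea into something runnable, here is a minimal sketch of a one-way random-effects ICC in plain Python. This is an assumption about the flavor of ICC, chosen because it is the simplest one to show; it is not necessarily the calculation behind the .538 figure. The function name and the toy rates are hypothetical, and the sketch assumes every manager has the same number of seasons.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for equal-sized groups.

    groups: one inner list per manager, one value per season.
    Returns (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    n = len(groups)       # number of managers
    k = len(groups[0])    # seasons per manager
    if n < 2 or k < 2:
        raise ValueError("need at least 2 managers and 2 seasons each")
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equal seasons per manager")

    grand_mean = sum(sum(g) for g in groups) / (n * k)
    group_means = [sum(g) / k for g in groups]

    # Variation between managers' averages.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    # Variation of each manager's seasons around his own average.
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical steal-attempt rates for three managers over three seasons.
rates = [[0.10, 0.11, 0.09], [0.05, 0.06, 0.05], [0.08, 0.07, 0.08]]
print(round(icc_oneway(rates), 3))
```

When managers differ a lot from each other but each one looks the same from season to season, the value heads toward 1. When the season-to-season noise swamps the differences between managers, it heads toward 0 or below.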
My program actually takes variables like that and dummy codes them. But good catch. Someone was reading closely! Ten extra credit points.
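For anyone unfamiliar with dummy coding, here is a minimal sketch of what a stats package does behind the scenes with a categorical variable. The function name and the team codes are hypothetical, and picking the alphabetically first level as the reference category is an assumption; real packages differ on which level they hold out.

```python
def dummy_code(values):
    """Turn a categorical variable into 0/1 indicator columns.

    The alphabetically first level is treated as the reference
    category and gets no column of its own.
    """
    levels = sorted(set(values))
    columns = levels[1:]  # every level except the reference
    rows = [[1 if v == level else 0 for level in columns] for v in values]
    return columns, rows


# Hypothetical categorical variable with three levels.
columns, rows = dummy_code(["BOS", "NYY", "BOS", "TOR"])
print(columns)
for row in rows:
    print(row)
```

Each observation ends up with one column per non-reference level, so the regression estimates every level's effect relative to the reference.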
http://www.philbirnbaum.com/btn2007-05.pdf It was in SABR's "By the Numbers" newsletter.
The standard of comparison is all managers from 2003-2009. The league was a little bit more aggressive last year than in previous years.
Actually, I'd say that's the manager's fault more than anything. Francona, instead of making the decision in the moment, makes the decision ahead of time. Either way, he's still giving the green light to steal.
True in theory, hard to quantify. Not that I won't try.
I usually cut the whole list for space because the extremes are the more interesting ones to talk and read about. However, since several of you asked:
Managers listed in order of aggressiveness, shown as a ratio to model expectation (1.00 = exactly as the model expected).
B. Geren 1.66
O. Guillen 1.43
J. Maddon 1.36
M. Scioscia 1.34
C. Hurdle 1.26
T. Francona 1.21
C. Cooper 1.19
J. Riggleman 1.18
J. Tracy 1.15
E. Wedge 1.14
B. Melvin 1.12
J. Torre 1.10
B. Black 1.08
B. Cox 1.08
R. Washington 1.06
T. Hillman 1.04
D. Baker 1.04
J. Manuel 1.03
D. Tremblay 1.03
J. Russell 1.02
M. Acta 1.01
J. Girardi .99
T. LaRussa .98
L. Piniella .97
B. Bochy .94
C. Manuel .94
R. Gardenhire .89
C. Gaston .86
K. Macha .83
A.J. Hinch .82
F. Gonzalez .81
D. Wakamatsu .74
J. Leyland .70
Don't worry... logit is coming.
Freel spent nine games with the Orioles.
It's sweet to know that he did that. Maybe he's got "future child psychologist" written all over him.
Last week, someone pointed out Russell Branyan as an alternative to Huff. I'm not a fan of signing guys like Freel in general. The FA market right now is kind of a dumpster dive. Felipe Lopez is better but probably won't come cheap. I'm intrigued by Rick Ankiel for a team who needs an outfielder. He's the opposite of a speedy slap hitter, but hey, he's versatile. He can probably still pitch an inning or two in a pinch!
Not a bad idea. I don't know whether it would work logistically on our end (it might!) but it is a good idea. I think in the meantime, I'll flag the "gory details" part so that if people want to read it or want to skip it, they can make up their own minds.
Interesting hypothesis. If I have a moment, I'll go back and check that.
Yes, they certainly would be. However, in this particular context, my point was that, for a batter, getting the ball through the infield on a GB, and getting an infield hit once an infielder has gotten to the ball are separate skills, although both would fall under the category "singles on a GB."
Lengthy? Probably. Depends how much my daughter lets me write. Snarky? Oh yeah. Oriole-centric? Nah. The Orioles are just one of 30 teams.
To their credit, the Orioles have said they don't want Huff back and had the good sense to flip him to Detroit and get something for him mid-season.
Plus David Segui got a HOF vote, so he can't be all that bad.
He's the type of player that will probably hang around until mid-Feb, and then when a team gets really nervous and decides that they need a "proven veteran" he'll be sitting right there.
You're inferring my intention properly. I was drawing attention to the fact that as a DH, to achieve a high VORP, you have to have a really really good year, better than what you would need as a shortstop.
20 extra credit points to Kniker for the Burger Time reference. And an extra pepper.
One other benefit that might induce you to apply: obviously, you'll put this type of thing on your resume. When you are out interviewing for real jobs, even if they're not in baseball, maybe that hiring manager is a subscriber. You'll get a few "You worked with them?" comments and you'll get to start a nice little conversation that ends with "and then Will Carroll says to me..."
My original plan for this article actually included a test for the idea that putting young relievers into high leverage situations was hazardous to their health, but eventually, I had to drop it for time.
That's the underlying theory for why I think that the signing was made. Perhaps another day, I'll take a look at the hard data.
Kay Hanley will be there. Perhaps she'll play "Pizza Cutter".
I did do some preliminary checks for survivorship bias, but did not report them. There was no association between early winning percentage and washing out of baseball. If there was a "survivorship bias" it would be that some of the players in the first part of the sample had not yet made it to age 29, and as such I couldn't include them.
The point about negatively skewed feedback is well-taken. I hope to do some more writing on the subject soon. A blown lead in the ninth subjectively hurts more than a blown lead in the sixth inning, but the result is the same objectively. I think this disparity drives a lot of silly decisions in baseball. Could it have an effect here, especially around Gonzalez? Maybe. That would take a little more data digging.
Your reading is correct. Early winning percentage does not affect future SLG or OBP.
Early winning percentage correlates with early OBP at .234 and SLG at .191. Those are significant numbers, although not compellingly large. Players on better teams are better, but better players make for better teams. It becomes a correlation-causation trap.
Contact me via backchannel if you want the specifics. I enjoy me some good geekiness. My e-mail's at the end of the article text.
Whether the saved payroll money would have been spent elsewhere is something only the O's know for sure, but theoretically, it could have simply been banked. Or spent on minor league development. Someone might make the case that this sort of spending would actually be more beneficial in the long run.
I actually intended for the piece to be about the effects of winning/losing on player development. I was rather surprised that the comments went that way, but that's what people wanted to do. Anyone still up for the discussion on player development and culture of winning?
I promise I'm not an amateur. I have a Ph.D. in clinical psychology (hence, Baseball Therapy) with an emphasis on children and adolescents. A lot of the stuff that I have written in the past and will continue to write will be looking at a number of commonly held beliefs about baseball which are really just bad amateur psychology. But I trust that means I won't get booted out? ;)
FWIW, I hope you keep reading. As someone who read BP for a long time before I got hired here, I miss a lot of those first generation folks too. They wrote some really cool stuff and I enjoyed reading it too. In fact, one of the coolest moments in my life was actually getting to meet Dan Fox at the BP event in Pittsburgh this past summer and having him say "Oh yeah, I've read your stuff, it's really good." I'll do my best to carry on the proud tradition.
Let me see if I can sum up the responses, and correct me if I get something wrong.
1) Most of you would like to see us inject a little bit more of our scintillating personalities into the work, so that BP doesn't start reading like the Journal of Applied Statistical Science. Extra credit will be given to obscure references to 16th century Belgian military history. Fair enough. (Note to self: go to library this weekend.)
2) People prefer applied topics or at least, as my grad school advisor liked to say, a good story to go along with your table. OK. Connect things in so that they are relevant. Sound advice for any writer.
3) People here aren't stat-phobic, cuz ummm, that's kinda the whole point... anyway, it seems that most people don't mind if we break out the major numerical nerdiness. (Y'all know that there are some writers who are more prone to that than others. Personally, I know that I look for excuses to pull things out of my statistical tool belt.) It's not everyone's thing, and that's cool. But it seems that people are hungry for more than just tweaks in the methodology, or at least want to know what difference it makes if we tweak the formula for WXYZ42BNL.
In other words, after the digit gymnastics, we need to be able to answer the question "So what?"
Pbconnection, I also rue the fact that Nate and a lot of those first generation writers have either cut back or moved on. But give us newbies a chance. You might grow to like us.
Alright, a fair number of you have suggested that BP would do better with less of this or that in our articles. I ask you as a real live BP writer, and in a spirit of actually wanting to deliver a better product to the customer, what exactly is it that you are looking for?
I can understand not wanting to read things that sound like academic journals. What would you rather it sounded like? More snark? More Miley Cyrus references (oh yeah, I totally would)? I want to put the emphasis on the word "more" though. More looking at recent events? More looking at history? More work explaining how WXYZQQFJLDAs work? More application to individual players/teams?
If all I hear is that we want less of something, I soon have nothing to write about. So, start your sentences the way that Britney Spears would "Gimme gimme more..." I can't guarantee that I can give you everything you hope for, but if you've got an idea and it makes sense... well then I'd be a fool not to take it.
Or perhaps in some cases, replace "playoffs" with 81 wins?
I agree whole-heartedly with that line of reasoning. It's irrational behavior on the part of all involved, but human beings are rarely rational.
The economic impact is a little beyond my reach right now, but I will say this both from a research perspective and from experience. I'm amazed at the power of the bandwagon effect that happens when a team starts winning.
Why can't it be used in 2012? Baseball teams have bank accounts too. Those millions can sit there and collect interest. Money in the pocket doesn't have to be spent. Consider what the Marlins have done over the past 12 years.
Let's assume that the Teix money (17.5M per year) is still in the warchest. Why spend it on someone who won't get you into the playoffs this year when you could instead bank it until 2012, and then have all that saved money to spend on adding pieces to a team which has naturally grown to the point where those moves make sense? I can understand that having to endure another couple of losing seasons is not a pleasant thought, but as a therapist, sometimes you have to tell people things that they don't want to hear.
I actually rather like Gonzalez himself as a pitcher. If the goal was "get a closer", then the Orioles did well in that regard. I'm questioning the underlying assumption that "we need a closer."
The draft pick is an issue, but it's a cost of doing business. I think that the "draft picks are gold/draft picks are toothpicks" pendulum has swung a little too far in the gold direction. Yes, there's nothing better than a draft pick who works out. There's nothing more frustrating than a draft pick who doesn't. A draft pick is a high-stakes coin flip, not a guaranteed future star who will only make $500K per year and take us to the World Series in the process. He might become that. He might flame out in AA.
You're correct, I don't live in Baltimore. I live in Cleveland. (It's not like living through the Indians bullpen woes has been easy either.) I believe many of the issues you bring up here are addressed in comments I made above.
Let me add this: My goal in writing this piece isn't so much to tweak the Orioles' management for the signing (OK, maybe a little). What's done is done. But there's another team out there thinking of doing this same type of thing right now. I want people not to evaluate moves based on the assumption of a steady-state past. I'd rather that they took a broader view of the options available to them and projected those into the possible future.
The problem with making signings to minimize frustration is that if you manage with your emotions, you get burned. I'm convinced that a good deal of what passes as baseball "strategy" is an attempt to make fans/players/coaches feel better rather than to win the game.
Does George Sherrill really need to be replaced? Maybe. But what about thinking about it from another direction? Maybe it's best to take the short term hit of a lousy bullpen if it means a better chance at winning down the road. The answer may come out to be that to sign Gonzalez is the better plan. But a simple reflexive "Sherrill left, need a closer" over-simplifies all of the options available to a team and cuts off what might be a better option to pursue.
It's worth thinking about. Here's where I disagree with this type of plan. Suppose they flip him for a prospect. A prospect is a "maybe" and in 3-4 years, he'd still be a kid.
Gonzalez is signed for two years, so either this July or next July, he might end up in a trade package. But why not save the money now, bank it, and in 3-4 years when you need that fully developed piece, you have the cash? In the free agent market, you are buying relative certainty of what you're going to get. When you're on the steep part of the marginal revenue curve, would you rather plug in that "maybe" or would you rather have a guy with a track record? A lot depends on your time horizon.
A good addendum to my thought process. What's curious about humans is that they want to see a nice neat trend line. Suppose that the Orioles put Gonzalez's money in the bank, give the job to a rookie this year and next year, lose those couple of extra games, and then in two years when the time is right, signed a couple of free agents. Over the next few years, they might stagnate around the same win total. And you're right that free agents might interpret that as "well, they'll win that many next year, so why bother going there?" It's an irrational thought process, but it's a real one.
Fair enough. Mr. Easterbrook probably isn't the first man to do something like this either nor is he the only one I've seen do it, but I have to admit that I read his work and enjoyed it. Now, all I need to do is get a clue about football!
Eric, were the "relief appearances" perhaps a former starter working out of the pen as a long-reliever, a role in which you are basically a starter, but you come into the game in the 3rd inning due to the fact that the real starter is currently nursing a bad case of whiplash from all the HR he gave up? How many of these guys went from starter to 1-2 inning relievers?
Tim, I would argue that a manager who continues to stick with a faltering closer (Capps and Lidge were your examples) should be penalized. Sticking with a guy because he is the anointed one is just stubborn.
Has someone yet put together an "offers database?" For example, if it's reported that the Mapleland Bees offered 5 years and $7 gazillion for Jason Bay, we can get some idea of the pricing process. Of course, some of those offers are bogus (probably floated by agents to inflate the price). Some offers are likely never reported. Obviously we know the winning bid, but do we know the losing bids and how the rest of the league prices Player A?
I also wonder if the better market analogy isn't a rarities market. It seems that at the beginning of each off-season, before anyone can officially make bids, there's a generally accepted "contract that Larry Larfelschnarger will get." And he usually gets that. Teams can walk around this rarities market and either choose to buy or not buy, but it always seems that the price is set in advance.
"His elbow/knee/leg/spleen is fine structurally... he's just gotta get the confidence back in it to use it." How true is this really?
Someone posted above on brain injuries (concussion being the most prevalent). Some work on the basic neuropsychology of a concussion would be awesome.
Statistics never lie. Liars use statistics.
I think this one has two parts, both worthy of a little bit of further thought. Is losing detrimental to the development of young players? Maybe there's some sort of (dare I say) psychological price to be paid for coming up in a losing environment? The other issue is the fan base. In Vegas, it's well-known that people generally have a point where they feel their gambling losses are too much, and so they stop. (Vegas manipulates the heck out of this, btw.) Maybe there's a point where Orioles' fans will cry uncle and leave en masse. That one would be harder to quantify though...
And right there, you nailed down the struggle that I have when I do this sort of work. I did have the thought of trying to use some sort of MLE at the time of being brought up as a control on the model, but the truth is I'm awful at MLE's.
I did some earlier work of this type with managers and tried specifying for pitcher age, home ballpark, and year-league, as well as using an AR(1) covariance matrix, but even then, the model was acting a little screwy. I'm not familiar with the test you suggested (I have to admit, it's been a while since I was fully immersed in HLM) but I'll check it out. Thanks for the tip.
The biggest problem though is that the omitted variables that "catcher mentor" picks up on, as I mentioned in connection with M. Redmond, are organizational variables. What sort of guys do the Twins draft? Whom do they promote? Whom do they keep around? I suppose that an MLE approach might answer pieces of those questions. But how to quantify the rest?
Maybe I should just stick to t-tests!
Christos Razdajetsja! (a few days early.)
I think there's some value in that process though. One of the things that I think has happened in Sabermetrics is that we've valued construct validity (it makes sense in our heads) over environmental validity (it actually makes a difference out there in the real world.)
A psych major in Sabermetrics. That would never work. ;-)
IIRC, the idea that Rodriguez was signed simply to be a draw at the gate was floated as well. Seems likely that it's about the only benefit that the Nats will derive from him.
I don't know that the theory holds. Obviously, the home team will be cheered for in their own park, and the new acquisition will be cheered in his new home park. But the crowd at a Cubs-Sox game is decidedly mixed, and the response to a returning veteran might be "you betrayed us in free agency!" or more likely, "Oh yeah, didn't you used to play here last year?"
I've actually heard MLB players interviewed saying that they take the cheering for the other team and pretend that the crowd is cheering for them.
I'd also contend that HFA isn't just a function of the actual field of play. Even if ballparks were completely uniform, I think we'd still see these effects. Consider your high school for a moment. It has the same basic anatomy as all the other high schools in America. But you could tell your high school immediately when you walked in, even if it has changed a lot (like mine!) The nuances at a ballpark can be in the ambient noise in the background (is the park in the middle of downtown with a bunch of traffic around? By a river?)
It affects everyone in baseball. The reason is very simple. Your brain runs on about forty watts of power, which is probably less than is powering that light bulb above your head right now. It has to economize and use shortcuts, and there's only enough mental energy to keep attention on a small amount of stimuli at a time. Consider that the participants in the experiment probably would have noticed that the person behind the counter changed if someone drew their attention to it directly. The problem is that in the absence of this attention, the brain just fills in the blanks with what it thinks goes there.
A small example. Try to recall an event that happened a year or two ago (maybe longer) involving several people to whom you are close, and one where you have pictures from when it happened. Family parties, weddings, etc. fit this rather well. Try to recall as many details about the event as you can. When you can later today, go look at those pictures. I'm guessing that in your mind's eye, you envisioned most of the people as they look today, without remembering that Larry had that mustache back then but has since shaved it off, and Curly has since lost all that weight, and Moe was with whatshername that you never liked, but for some reason you assume that he had been there with the new girl, whom you don't like either. That's because you probably didn't make detailed observations about everyone during the event. Instead, you have a general idea what these guys look like and your brain projects that backward into the past.
In situations where change is gradual (a pitcher's motion changing slightly as he gets more tired), unless you have specific training in recognizing the differences and are watching for that, you probably don't recognize that it happens. However, if I showed you his delivery in the first inning and the seventh inning and told you to pay attention, you'd probably pick up on it. But before that, you'd swear it was the same delivery all game.
Thanks. Actually, I'd argue the other way. HFA would still be as strong. The differences between stadia are more than just outfield dimensions and foul territory. In basketball, the acoustics are a little different everywhere you go, I'm sure the floor is slightly different in consistency... little things like that. It's not that they directly interfere with play, but they do impart a slightly different feel to the arena/stadium. I think that's part of what's being responded to.
That's actually the exact reason that I used the odds ratio correction method. I'm measuring outcomes relative to expectancies. So, if the player traded was an overall .400 OBP guy, the model knows that and expects him to be on base 40% of the time.
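For readers who haven't seen an odds ratio correction before, here is a minimal sketch of the common form: turn each rate into odds, multiply the batter's odds by the pitcher's odds, divide by the league odds, and turn the result back into a probability. The exact form used in the article isn't spelled out here, so treat this as an assumption. The function names are hypothetical, and the .330 league OBP is a placeholder, not a real league figure.

```python
def to_odds(p):
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def expected_rate(batter, pitcher, league):
    """Expected on-base probability for one matchup via odds ratios."""
    matchup_odds = to_odds(batter) * to_odds(pitcher) / to_odds(league)
    return matchup_odds / (1.0 + matchup_odds)


# A .400 OBP batter against a league-average pitcher (placeholder .330).
print(round(expected_rate(0.400, 0.330, 0.330), 3))
# The same batter against a pitcher who allows fewer baserunners.
print(round(expected_rate(0.400, 0.300, 0.330), 3))
```

Against an exactly league-average pitcher the expectation comes back as the batter's own rate, which is the "expects him to be on base 40% of the time" baseline. Outcomes are then judged against that expectation rather than against a flat league average.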
I'd personally like to see that one in Pitch F/X. (Has someone done this study?) The effect could be that road pitchers pitch a little more tentatively.
I agree that batting last plays a huge part. In fact, it's on my to-do list to look into some of the mechanisms of why exactly that happens.
Homestead Homies of '95? Twenty extra credit points for that reference!
I think there is some merit to this argument, especially given that "that team from the Bronx" has been mentioned in connection with Halladay. Trade him there and you have to see him 5-6 times per year pitching against you. Maybe that plays into the calculations, especially since you hope to rebuild the Jays in 3-4 years and Halladay might still be in pinstripes at that point. However, I think that's just something that you build into the "cost" of the offer. Maybe the Yankees or Red Sox have to pay a premium because of the division issue, but to outright refuse to deal with them seems silly.
Thanks all so far for the comments. My 'reply to' button is also malfunctioning, so I'll have to do one big round-up here.
Several of you have brought up good points about other variables which will no doubt play into the Blue Jays' thought process. Of course, nothing in baseball is simple enough that it can fit into a few thousand words.
A couple of people have asked whether this is properly a true analog to the Ultimatum Game. It's true that a year of Halladay and his compensatory draft picks aren't worthless, but there's probably a better offer out there. True, there are multiple teams (I assume) making offers, but eventually, they will all bid up to their highest point. One of them will be the best. That's when it becomes a two player game. What if that offer isn't "equal value" to Halladay? So long as it beats the two draft picks, then the Blue Jays should take it. The idea of taking things to the trading deadline is a rational idea, but eventually, you have to make a decision. It's possible that you get a better deal in July, but at that point, you might still find yourself not getting equal value.
I think some people misunderstand here. It's not that the Jays should take the first offer that comes along that's better than the two draft picks. They should build a market, encourage teams to bid up, etc. When everyone says "... and that really is my final offer" they should take that one.
The other issue is whether there is some value in not cooperating, whether in the form of "hey you know you screwed the other guy over" (which, added to $3.00, will get you a cup of coffee), or whether building a reputation for further trades is worth it. (Something like: "Anthopoulos is soft and will take less than full value.") I don't think that holds up in this case for a couple of reasons. First, Halladay is a major stakes game. How often do you trade a guy like him? You need to make sure that you get something for him. If that means you get a little less when you trade some random middle infielder next year, so be it. Second, I don't know that it really is "giving in" here as much as bowing to the reality of the situation. Structurally, the Jays are screwed. They have no leverage other than "make a better offer or we'll send him to Team B," but even that has its upper limit. When they make another deal where they have some leverage, they can hold out all they want. They can spin this one (correctly) as "what were our other options?"
Yes, the fans are going to be mad. Trading Halladay is an admission that the Jays don't think they have a chance for the next few years, and that's discouraging, because Americans... oh right... want a winner and want it now, and most of the time, you can only have one of those.
Don't hang around with clinical psychologists. They're all nuts.