You would never know baseball had undergone a data revolution if you only followed the NCAA. This might finally be changing.
The following is excerpted from The Evolution of Data and Statistics in Baseball, a thesis paper written by Jake Garcia for the Walter Cronkite School of Journalism and Mass Communication, at Arizona State University.
Doing introductory research on a niche topic like the interplay between baseball, stats and data, I initially pegged baseball as experiencing something similar to a Schumpeterian moment, a concept from an economic theory derived from the work of Karl Marx that describes a process of destroying an old structure and creating a new one. But ever since the game’s inception, baseball and numbers have been heavily intertwined, and the Schumpeterian moment has been fixed in a never-ending cycle. The destruction of old stats and analytics and the creation of new ones has been incessant and perpetual. It did not stop when the concept of the run batted in was introduced in 1879; it definitely did not stop when computers came along in the 1960s to encourage in-depth statistical analysis; and it will not stop any time in the near future with technological advances like Statcast continuing to mold the way people think about baseball. How does the change continue and how will it manifest itself?
Spotlighting college baseball was presumably a good starting point. Since the NCAA is chock-full of players striving to embark on their professional careers in the near future, you would expect it to also be the perfect grounds for experimentation with data, stats and technology, much like the Arizona Fall League, right? Well, not exactly.
Somewhere, a recreational Sunday softball league is about to get really, really good.
Seemingly every year or two for the past decade when David Ortiz has gone through a rough stretch—a bad April, a slow start coming back from the disabled list, or even just a hitless key series—it has become a story in Boston, with attention-grabbing headlines asking if he’s washed up. The answer has always been a resounding no. In fact, few players in baseball history have as thoroughly and convincingly avoided being washed up for as long as Ortiz.
At age 26 he was released by the Twins—as a Minnesotan, the state ban prevents me from discussing this matter any further—and from the moment Ortiz started putting up big numbers in Boston many people have been waiting for him to come crashing back down to earth. He never has, topping an .850 OPS in 13 of the past 14 seasons, with a low-water mark of .794 in 2009 that really pressed the “he’s washed up!” alarm.
Just last season Ortiz was hitting .219 in early June when a local television reporter asked him about being washed up, which led to this memorable rant a few days later:
When runners are in scoring position, home runs go down, while everything else--including doubles and triples--goes up. Why?
In this article, I looked for evidence that some hitters perform better with runners in scoring position (RISP) than in other situations. (I didn’t find much.) To help me define “perform,” I calculated run expectancies with RISP and discovered that, of the common batting metrics (batting average, on-base percentage, slugging percentage, and OPS), slugging percentage correlates best with driving in runs. So I looked for hitters who slugged higher with RISP than in other plate appearances.
A couple of commenters suggested that this is a low bar. After all, they reasoned, hitters should have a higher slugging percentage with RISP than without, for a number of reasons:
· Pitchers have to pitch from the stretch (though the perception that pitchers give up velocity by pitching from the stretch doesn’t appear to be true)
· Infielders must position themselves to hold runners on, opening up lanes for base hits that wouldn’t exist with the bases empty
· Similarly, with fewer than two outs and a runner on third, infielders may move up to prevent a run, creating additional opportunities for singles
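For readers who want to run the comparison themselves, here is a minimal Python sketch of the split described above: slugging percentage (total bases divided by at-bats) with RISP versus in all other plate appearances. The function name and both stat lines are hypothetical, invented purely for illustration; they are not any real hitter's numbers.

```python
def slugging(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Hypothetical split for one made-up hitter (not real data).
slg_risp = slugging(singles=20, doubles=8, triples=1, home_runs=3, at_bats=100)
slg_other = slugging(singles=60, doubles=15, triples=2, home_runs=12, at_bats=350)

# A positive gap means the hitter slugged higher with RISP than without.
risp_gap = slg_risp - slg_other

print(f"SLG with RISP: {slg_risp:.3f}")
print(f"SLG otherwise: {slg_other:.3f}")
print(f"Gap:           {risp_gap:+.3f}")
```

If the commenters are right, a positive gap on its own says little; the more telling comparison would be a hitter's gap against the league-wide gap.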
Don't forget how great Joe Nathan was. And as he prepares for a comeback, don't forget what he's still working toward.
The last big-league pitch Joe Nathan threw was an 86 mph, 1-2 slider to Torii Hunter on Opening Day of last season. Hunter checked his swing, got rung up by umpire Joe West for a game-ending strikeout, and argued his way into a meaningless ejection (followed by several days of the usual “Joe West is the worst” headlines). Detroit beat Minnesota, Nathan got his 377th career save, and two days later he was placed on the disabled list with an elbow injury that eventually required Tommy John surgery.
It was the second Tommy John surgery and third major arm surgery of Nathan’s career and at age 41 it seemed like the end of the line for the six-time All-Star closer, with a headline-grabbing one-out save against his former team and former teammate serving as a memorable final act. Instead, he rested and rehabbed, and last week Nathan signed a major-league contract with the Cubs that includes a spot on the 60-day disabled list until he’s ready to pitch again. As of now he’s aiming for early July.
Nathan wasn’t great for the Tigers before blowing out his elbow—posting a 4.78 ERA and 55/29 K/BB ratio in 58 innings—but having closely watched his entire Twins career it’s my duty to remind everyone of how great he was for a long time in Minnesota and later in Texas. Nathan at his best was as dominant as nearly any reliever in baseball history, and Nathan was at his best a lot. For instance, here’s a list of the pitchers since 1920 with the most seasons in which they threw at least 50 innings and posted an ERA below 2.00:
DRA in depth: Finding a run-expectancy curve that would eliminate the negative DRA.
This is the second in a series of articles explaining in depth the updated formulation of Deserved Run Average. The overview can be found here, and Part I of the in-depth discussion of the revised approach can be found here.
Call me Jonathan.
For most of this offseason, my (entirely metaphorical) White Whale was baseball’s run expectancy curve; the distribution, if you will, between the minimum and the maximum number of runs yielded by pitchers per nine innings of baseball. Why would something so seemingly arcane be so very important to me? Let’s start with some background on run expectancy.
In 2015, ERAs for pitchers with at least 40 innings pitched ranged from .94 (Wade Davis) to 7.97 (Chris Capuano). In more prosperous times for hitters, such as the 2000 season, pitcher ERAs at the same threshold ranged from 1.50 (Robb Nen) to 10.64 (Roy Halladay). For something more in the middle, we can turn to 1985, when a starter (!), Dwight Gooden, had the lowest ERA at 1.53, and Jeff Russell topped things off at 7.55.
Here’s what those seasons look like on a weighted density plot, side by side:
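As a rough illustration of what a weighted density involves, here is a minimal Python sketch of a weighted Gaussian kernel density estimate. It assumes the weights are innings pitched, which the text does not actually specify, and the (ERA, innings) pairs, the bandwidth, and the function name are all hypothetical. It is not the code behind the plot.

```python
import math


def weighted_density(values, weights, grid, bandwidth=0.5):
    """Weighted Gaussian kernel density estimate evaluated at each grid point."""
    total_weight = sum(weights)
    norm = total_weight * bandwidth * math.sqrt(2 * math.pi)
    densities = []
    for x in grid:
        kernel_sum = sum(
            w * math.exp(-0.5 * ((x - v) / bandwidth) ** 2)
            for v, w in zip(values, weights)
        )
        densities.append(kernel_sum / norm)
    return densities


# Hypothetical (ERA, innings pitched) pairs; not real pitchers.
pitchers = [(1.80, 70), (2.95, 210), (3.60, 180), (4.25, 150), (5.10, 95), (7.40, 45)]
eras = [era for era, _ in pitchers]
innings = [ip for _, ip in pitchers]  # assumed weighting: innings pitched

step = 0.05
grid = [i * step for i in range(-100, 301)]  # ERA values from -5.0 to 15.0
density = weighted_density(eras, innings, grid)

# Sanity check: a proper density should integrate to roughly 1.
area = sum(density) * step
peak_era = grid[density.index(max(density))]
print(f"area under curve: {area:.3f}, peak near ERA {peak_era:.2f}")
```

Weighting by innings means a 200-inning starter shapes the curve far more than a 40-inning reliever, which is why the weighted and unweighted pictures of the same season can differ.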
Putting the Twins' top pitching prospect's disaster start in perspective.
Monday night I tuned into the Twins-Tigers game, in part because I'm a glutton for punishment with a Pavlovian need to watch Twins games no matter how bad things get, but also because top prospect Jose Berrios was making his fourth career start. I was excited, or at least as excited as baseball fans in Minnesota get these days. Berrios' first three starts weren't great, but his previous outing was encouraging enough to make me think perhaps the 21-year-old former first-round pick was ready to become a full-time member of the rotation for the next decade or so.
He wasn't. Actually, whatever the opposite of that is, he was ready to do that and only that. Berrios failed to make it out of the first inning, recording two outs while allowing seven runs. First-inning knockouts have always fascinated me. It’s like showing up to your office, spilling coffee down the front of your shirt, slipping and falling on the wet floor beneath you, knocking over a filing cabinet in the process, and then being asked to go home by your boss. Not only did you embarrass yourself in front of co-workers, now they’re all watching you exit in shame. It’s made even worse by knowing that everyone else has to keep working for the rest of the day.
Twins manager Paul Molitor came out to remove Berrios following a bases-loaded double by light-hitting shortstop Jose Iglesias, and their limited interaction had a "put him out of his misery" vibe. Shortly after the game--which the Twins came back to tie at 8-8 before losing, thus providing their fans with the maximum possible pain--Berrios was demoted to Triple-A. There he joins fellow top prospect Byron Buxton, who began the season as the Twins' starting center fielder before being demoted to Triple-A three weeks later after hitting .156 in 17 games.
Buxton ranked No. 2 overall on our top-101 prospect list and Berrios was No. 17, so in addition to all the losing happening in Minnesota it's getting increasingly difficult to convince people here that prospects are worth believing in. I'm still a big believer in Berrios (and Buxton too), but his disastrous start Monday did give me some pause, in that it got me wondering how often a prospect as young and as highly touted as Berrios has ever been that helpless on a mound. My hope was that it actually happens quite a bit, and better yet happens quite a bit to prospects who go on to become amazing pitchers. But... well, it doesn't.
Here's the complete list of pitchers since 1995 who've been knocked out of a start in the first inning while allowing seven or more runs before turning 22 years old:
Regardless of the number, the man could drink beer. Is that enough for us?
One of the completely under-sold perks of writing for Baseball Prospectus—aside from the deeply intelligent community, colleagues, and conversation, of course—is that you sometimes get baseball cards with your paychecks. When I was told I might get “cardboard bonuses,” I assumed the person I was talking with was making a cutesy allusion to the cardboard that would come in the envelope behind my check. I may need to reassess how sub-whimsical my previous writing jobs have been.
In any case, I was pleasantly surprised to see a 1991 Leaf Wade Boggs card in my mail the other day, and I almost immediately determined to write something at least tangentially related to the Red Sock they, apparently, called Chicken Man. My first thought was to analyze his historic cross-country flight upon which he drank either 64 or 107 beers, maybe going beer by beer and rhapsodizing briefly on each. But it seemed maybe a little poetically cocky for my fifth column. Unfortunately, the idea of Boggs’ beer consumption, which had to have been a record of some sort, couldn’t be shaken. No amount of half-drafted articles on how weird it was that he was at one point a Devil Ray could get me off the trail.
So I decided to zig instead of zag and write a bit about the reasons that records, particularly in baseball, are so persistent in the minds of the sport’s fans and writers. This persistence is the evergreen cliché of baseball, of course—any film or TV show that includes a character who “loves baseball” will be sure to show that character casually tossing out Ty Cobb’s lifetime average, name dropping Ted Williams as the most recent .400 hitter, or rhapsodizing about Hank Aaron or Barry Bonds’ home run records. And saber-savvy fans online are the same way at times, or perhaps you’ve forgotten about the several dozen “Mike Trout Facts” twitters we all follow? On some level, records—the absolute statistical outlier—fascinate us.
Maybe the reason is simple enough and implied by “statistical outlier.” Baseball is a game that, even in a season not-yet-finished, is awash in numbers, some predictive, some not so predictive, and all of which tell a narrative. A 162-game season is pretty difficult to make sense of without a zoomed-out view, which the numbers help to give us for better or worse. And as a subset of “baseball numbers,” the record numbers represent high and low boundary points, ways to understand the scale of the achievements we see in-season. If Rickey Henderson’s 130 stolen bases are the modern era’s single-season record, then we have a high water mark to judge the speedsters of today.
But while this seems very straightforward, the issue gets clouded when we start assigning transcendent meaning to records. Either we take the comparative qualities of the record too seriously, or we make hagiographies of the players who achieved these feats. If we do the former, we usually find ourselves embroiled in yet another debate over steroids, and whether different eras can be compared due to different drug use and non-use. And then we’ll find ourselves in some debate over whether things were better in the past, as if the Cubs-Brewers game is the perfect jumping off point to talk about what’s wrong with the kids these days. Things aren’t much less tedious if we deify the record-holders, either. Fun facts, as BPro’s resident podcast Effectively Wild has proven, are a blast, but they also keep us from enjoying the product on the field. It’s cool that Bonds had an OBP over .600 in 2004; it isn’t meaningful.
That distinction, between important and meaningful, is dicey, though. Why distinguish between these two fairly close descriptors at all? Why not just assume an important thing is a meaningful thing? Well, Swiss linguist Ferdinand de Saussure, considered by many the father of contemporary semiotics, has a pretty good, if roundabout, explanation as to why.
Russell sits down with Ben and Sam to discuss the experience of writing and living through their best-selling book, The Only Rule Is It Has To Work.
This one is special. Many of the people reading this article have already bought and read the book The Only Rule Is It Has to Work, by Baseball Prospectus’s own editor-in-chief, Sam Miller, and our former editor-in-chief, Ben Lindbergh, about the summer they spent running the Pacific Association’s Sonoma Stompers. And somewhere in the blitz of press that they have been doing to promote the book, they took some time to chat with me.
As we were setting up before the interview, I apologized in advance for the fact that I am not a real journalist, and I’ve never really done a sit-down interview like this before. The only model that I really had to draw from was my time when I worked as a therapist. I told them that the only rule was that they had to answer my questions.
(And yeah, there are a few spoilers in here…)
I don’t know that I can properly plug the book in a way that hasn’t already been done, but I will try. There’s a certain fantasy that someday, if we just yell loud enough, teams, managers, fans, and everyone else will stop doing all of the irrational things that we yell loudly about. This book made me think, “Huh, maybe I’m the one who needs to stop yelling.”
Russell: I think that most of the readers at Baseball Prospectus know the story of how the project was conceived. I think a good place for us to start would be during the gestational period. There was a point where you had sent all the e-mails and phone calls and everyone had signed off, and there was probably a moment of “Oh dear, what am I getting myself into?” But then there was some preparation before, Ben, you took the cross-country flight and Sam, you got into your Honda Fit and drove out to Sonoma. Tell me about how you prepared for what you thought was about to happen.
What you need to know before your sweeping take about a player's exit velocity.
Note: Baseball Prospectus has removed the leaderboards mentioned in this article. Thank you for your interest in our work and for your patience as we attempt to resolve this issue.
Last year, the folks at MLB Advanced Media started publishing what is commonly described as “exit velocity”: the pace at which the baseball is traveling off the bat of the hitter, as measured by the new Statcast system.
As a statistic, exit velocity is attractive for several reasons. For one thing, it is new and fresh, and that’s always exciting. It also makes analysts feel like they are traveling inside the hitting process, and getting a more fundamental look at a hitter or pitcher’s ability to control the results of balls in play.
However, we’ve seen many people take the raw average of a player’s exit velocities and assume it to be a meaningful indication, in and of itself, of pitcher or batter productivity. This is not entirely wrong: Raw exit velocity can correlate reasonably well with a batter’s performance.
But this use of raw averages also creates some problems. First, if you use exit velocity as a proxy for player ability, then you must also accept that one player’s exit velocity is a function of his opponents, be they batters or pitchers. Put more bluntly, a player’s average exit velocity is biased by the schedule of the player’s team.
Second, and much more importantly, we have concluded that Statcast exit velocity readings, as currently published, are themselves biased by the ballpark in which the event occurs. This goes beyond mere differences in temperature and park scoring tendencies. In fact, it appears that the same player generating the same hit will have its velocity rated differently from stadium to stadium, even if you control for other confounding factors.
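To make the idea of a park bias concrete, here is a deliberately crude Python sketch. It estimates each park's offset as the average amount by which readings there differ from the same batters' overall averages, then subtracts that offset. The readings, names, and method are all hypothetical assumptions for illustration; this is not the model used in the article, and it ignores the opponent and schedule effects described above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings (batter, park, exit velocity in mph); not real Statcast data.
readings = [
    ("Batter A", "Park 1", 91.0),
    ("Batter A", "Park 2", 89.0),
    ("Batter B", "Park 1", 87.0),
    ("Batter B", "Park 2", 85.0),
]

# Step 1: each batter's overall average across all parks.
by_batter = defaultdict(list)
for batter, _, ev in readings:
    by_batter[batter].append(ev)
batter_mean = {b: mean(v) for b, v in by_batter.items()}

# Step 2: a park's offset is the average amount by which readings there
# differ from the batters' own overall averages.
by_park = defaultdict(list)
for batter, park, ev in readings:
    by_park[park].append(ev - batter_mean[batter])
park_offset = {p: mean(v) for p, v in by_park.items()}

# Step 3: subtract the park offset from every reading.
adjusted = [(b, p, ev - park_offset[p]) for b, p, ev in readings]

for park, offset in park_offset.items():
    print(f"{park}: reads {offset:+.1f} mph relative to batters' averages")
```

In this toy data, each batter's two readings collapse to a single value once the park offsets are removed, which is the behavior you would want if the only difference between the parks were how the same contact gets measured.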
Can anybody in 2016 identify what the Twins are good at?
Twins general manager Terry Ryan is a Well-Respected Baseball Man™.
He was drafted by the Twins in 1972 and pitched four seasons in their farm system. From there he became a scout and, eventually, the Twins' scouting director. In the fall of 1994, when two-time World Series-winning general manager Andy MacPhail left the Twins to take the same job with the Cubs, the team chose Ryan as his replacement. He's been the Twins' general manager for 18 total seasons split between two stints, separated by a self-imposed four-season hiatus. Terry Ryan is the Minnesota Twins.
That cliché about someone who has forgotten more about something than most people will ever know is absolutely true of Ryan, a 62-year-old baseball lifer who has earned universal respect from his peers in baseball and from the media covering baseball. All of that is undeniable. However, also undeniable is that Ryan's overall winning percentage as Twins general manager is just .474; the team has won a grand total of one playoff series since 1995. They haven't won a playoff game since 2004, and the Twins have the second-worst record in baseball during Ryan's second stint, with a fifth 90-loss season in the past six years currently looking likely following a disastrous 10-27 start.
When the Twins were winning six AL Central titles in nine years from 2002 to 2010, they were known for remaining old school as MLB front offices increasingly went new school. Basically they were known for being Terry Ryan, continuing to rely on their scouting chops and well-established organizational approach as waves of analytics and innovation swirled around them. All of that remains true now, except the Twins have fallen even further behind in the various new-school categories while failing to dominate on the old-school side like they used to. In short, it's not obvious what they're even good at relative to the other 29 teams anymore.
It's been quite a while since Ryan's actual moves and the Twins' actual record matched his sterling reputation. There aren't many teams that would stick with a GM for two decades of .474 baseball and zero playoff success. There aren't many markets in which that GM and his longtime front office assistants would receive little criticism and tons of praise for producing 11 losing seasons in 18 years. But the Twins and Minnesota are that rare combination, which is why this preamble seems somehow necessary just to get to a point where it feels comfortable to say ... well, it's no longer clear that Terry Ryan should be the Twins' general manager.
Ryan is extraordinarily conservative, which has shown itself in his aversion to spending big money on outside free agents and in several seasons deciding to flat-out leave $10 million or more in projected, ownership-approved payroll unspent. He's targeted mid-level, low-upside veterans in free agency rather than going after bigger fish, most recently spending $200 million on the meh-worthy pitching quintet of Ricky Nolasco, Ervin Santana, Phil Hughes, Mike Pelfrey, and Kevin Correia. Those five free agent additions have combined to give the Twins a 4.60 ERA in 1,435 innings; three of the contracts stretch beyond this season.
Is Pittsburgh vs. Cincinnati turning into a turf war, on a global scale? We'd rather hear both sides of the tale.
Wednesday night, while Max Scherzer was striking out 20 Tigers, the Reds and Pirates were striking each other. There were six hit batters in the game, four Pirates and two Reds. Reds’ reliever Ross Ohlendorf was ejected after the last of them, when he hit David Freese with a runner on second after the Pirates had taken a 5-4 lead.
This is not something new for these teams. Since the start of the 2012 season, there have been 94 players hit in 56 games between the Reds and Pirates. (Across the majors, there are, on average, .66 batters hit per game, or .33 per side.) The six hit batters Wednesday represent an apex, but the teams combined for five hit batters on June 2, 2013, four on April 8 this year, and three on seven other occasions. In fairness, some of this is probably personnel-related. When you employ batters for whom getting hit by pitches is part of their on-base toolkit, like Shin-Soo Choo (hit seven times in Reds-Pirates games in 2013 alone) and Starling Marte (hit 14 times in Reds-Pirates games dating back to August 2012), it’s reasonable to expect things to get plunky. And Pirates games, in particular, feature a lot of HBPs in the box score. Since the start of the 2012 season, Pirates batters have been hit 328 times, the most in the majors and 15 percent more than the second-place Cardinals. Pirates pitchers have hit 293 batters, also the most in the majors, and 9 percent more than the second-place White Sox. (The Reds are third at hitting batters and 14th at getting hit.) But six in one game is an awful lot, as is 94 since the start of 2012.
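For a sense of scale, the arithmetic behind those numbers can be sketched in a few lines of Python. The variable names are mine, and the calculation reads the 56 as games played between the two teams, which is an assumption; if it counts only games with at least one hit batter, the per-game rate below is overstated.

```python
# Figures quoted in the text.
hbp_in_series = 94          # batters hit in Reds-Pirates games since 2012
games_in_series = 56        # read here as games played (an assumption)
league_hbp_per_game = 0.66  # both sides combined

series_hbp_per_game = hbp_in_series / games_in_series
ratio_to_league = series_hbp_per_game / league_hbp_per_game
wednesday_ratio = 6 / league_hbp_per_game

print(f"Reds-Pirates HBP per game: {series_hbp_per_game:.2f}")
print(f"Multiple of league rate:   {ratio_to_league:.1f}x")
print(f"Wednesday's six HBP:       {wednesday_ratio:.1f}x league rate")
```

Under that reading, the rivalry runs at about 1.68 hit batters per game, roughly two and a half times the league-wide rate.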
This has led to discussion of what might be done about this sort of thing. A hard ball, thrown at high speeds, can cause damage to the human body. Per Brooks Baseball, the pitches that hit the six batters on Wednesday night were thrown at 91.7 (Alfredo Simon in the fourth), 94.8 (Juan Nicasio in the fourth), 80.9 (Simon in the sixth), 86.4 (Steve Delabar in the seventh), 92.5 (Jared Hughes in the seventh), and 95.0 (Ohlendorf in the ninth) miles per hour. Nobody appeared to get hurt in the game, but of course, batters aren’t always that lucky. So what can be done?
On Josh Donaldson, Wade Davis, the Chicago Cubs, and the beautiful regenerative power of mistakes.
I’ve been living in Chicago since 2010, so when people ask me about the Cubs’ current run of success, it’s less because I’m a baseball fan and more because I’m the closest they have to an on-the-ground correspondent. It’s as if Anderson Cooper is breathlessly questioning me about The Baseball Spring: “After all this time, can it be true? Is the old regime truly gone? Can you comment on the people’s reactions to this new dawn?”
And while the Cardinals and Pirates wait in the wings to potentially shock this triumphant narrative back into the dreary everyday, they're a healthy 8 1/2 and nine games back, and there is a level of palpable optimism and confidence that I’ll admit I didn’t see for five years living, say, a block and a half from Wrigley. So when people ask, I tell them, yeah—people are really, really excited. It’s been a long time coming.
The long time coming, not the Cubs, is what I want to interrogate a bit today. Because throughout the long rebuilding process in Chicago, Cubs fans often loathed that long time and questioned it, Moses in the Desert style. It’s no fun to wander for 40 days and 40 nights, especially if that involves watching blowouts in the 42 degree Chicago spring. People on the radio questioned Theo Epstein and Jed Hoyer—“I thought this was supposed to be a three-year process!” “Theo’s plan makes it a 10-year process, we’re never gonna see a pennant!”—and around, say, 2013, there was widespread pessimism. How long, the average fan asked, can I handle a 65-win team?
The answer to that question is a bit murky, if only because it’s beyond my pay grade to psychoanalyze the thousands of Cubs fans I waded through to get to my apartment or the El. But a related question we might more fruitfully pose is this: How many 65-win seasons can a team, or a player, handle? In the era of the pre-planned tank in baseball, this is a fairly crucial question boiling down to, if you are an owner, the calculus of balancing your diminishing on-field returns with your financial bottom line. How bad, in other words, is too bad? When does failure start to cost more than it’s worth?
It seems to me there are two ways to look at this: practically and theoretically. The practical side of things is a little difficult. We all know that the “player who doesn’t have the fire of the postseason” cliché about young players on losing teams is silly. Starlin Castro has played just fine in New York; Felix Hernandez, despite being on a perpetually snakebitten M’s team, remains sublime; I’m sure if Sam Miller put his prodigious play indexing abilities to work, he could find a number of tremendous, high-WARP players who never had a shot on a winning team. Good players play well regardless of locale.
It also is true, at least anecdotally, that losing streaks rarely prompt the dissolution or relocation of an entire team. The Montreal Expos were, yes, abysmal through much of their later pre-Nationals tenure, but in two of the three seasons prior to the move (2002 and 2003), they were above .500 and would’ve probably been in the hunt in the two-wild-card era. And many teams have suffered through monstrous losing streaks, from the Tampa Bay Devil Rays’ first 10 years to the 20 years of losing baseball that are finally in the Pirates’ rearview mirror, and while they have led to firings, they have rarely prompted total organizational failure. Without being able to see the actual books of MLB teams, we may never know if losing streaks really truly do put teams in jeopardy of going belly up, but my guess is that, no, simply losing for a while cannot destroy a franchise.