March 11, 2004

Baseball Prospectus Basics: The Science of Forecasting

by Nate Silver

My name is Nate, and I am a forecaster. I forecast how baseball players are going to perform. And I pretty much get the worst of it. Tell somebody that their childhood hero is going to hit .220 next year, or that the dude they just traded away from their fantasy team is due for a breakout, and you're liable to get called all kinds of names. A bad prediction will inevitably be thrown in your face (see also: Pena, Wily Mo), while a good one will be taken as self-evident, or worse still, lucky. The truth is, though, that those of us who make it our business to forecast the performance of baseball players have it pretty easy. For one thing, we've got an awesome set of data to work with; baseball statistics are almost as old as the game itself, and the records, for the most part, are remarkably accurate and complete. For another, it's easy to test our predictions against real, tangible results. If we tell you that Adam Dunn is going to have a huge season, and instead he's been demoted to Chattanooga after starting the year 2-for-53, the prediction is right there for everyone to see in all its manifest idiocy. Not so in many other fields, where the outcomes themselves are more subject to interpretation.

March 10, 2004

Baseball Prospectus Basics: The Concept of "Clutch"

by Joe Sheehan

The concept of "clutch" is one of the clearest dividing lines between traditional coverage of baseball and what you'll find here at Baseball Prospectus. In the mainstream, performance in important situations is often attributed to some wealth or deficit of character that causes a particular outcome. Here, we're more likely to recognize that when the best baseball players in the world go head-to-head, someone has to win and someone has to lose, and it doesn't mean that one side has better people than the other. Clutch performances exist, to be sure; you can't watch a day of baseball without seeing a well-timed hit, a big defensive play or a key strikeout that pushes a team towards victory. The biggest moments in baseball history are almost all examples of players doing extraordinary things in extraordinary circumstances. Those moments make the game great and the players responsible for them deserve credit, and even adulation, for their heroics.

March 9, 2004

Baseball Prospectus Basics: Integrating Statistics and Scouting

by Dayn Perry

With the rise of quantitative analysis in baseball and the prominence of Michael Lewis's bestseller Moneyball (which, contrary to the ruminations of Joe Morgan, was not written by Oakland GM Billy Beane), a turf rivalry of sorts has been cultivated between traditional scouting types and their propeller-head assailants. It's my position (and the position of probably all of my colleagues here at Baseball Prospectus) that this rivalry is silly, unnecessary, and ultimately counterproductive. That's because as organizations begin to recalibrate their approach to making player personnel decisions, they don't need to be asking: which method do we choose? Instead, it should be: how do we integrate both approaches? You see, there's no need to replace traditional scouting with performance scouting (a term sometimes used to describe what we do here at Baseball Prospectus), and there's no need to ignore the latter completely in blind preference to the former. In a column I wrote last year, I made a "beer and tacos" metaphor out of the dilemma. It's a little like asking the question: "Which do you want, beer or tacos?" The answer, of course, is: "Both. Now, please."

March 5, 2004

Baseball Prospectus Basics: OPS

by Christina Kahrl

One of the objectives of the Basics series is to sort of rehash everything that is very basic: what we know now, and how did we get to the point that we know it? Filling in some of the back-story of what's up in terms of player analysis serves a few important purposes. First, it helps eradicate some of the potential barriers anyone might have to analysis: take a look, and you'll see that this isn't all rocket science. If even a non-math person and ex-Teamster like me can get it or get some of it, I'm willing to bet that everybody else can too. But if you like the flavor and you want more, there's a really important second goal the Basics series can achieve if you're new to this. Or, if you're already familiar with this sort of stuff, the series serves as a general reminder to those of us who think we know it all. That second lesson is: When in doubt, don't quit early. Whether you call the line of inquiry about baseball that we're involved in here "performance analysis" or "sabermetrics" or snarky and insufferable, one of the perils of working within this community is that it's stocked with bright people devising ever-better mousetraps to define player value statistically, particularly offensive value. As a result, you run the risk of getting lost in the inevitable alphabet soup of different newfangled metrics. And rather than try to sort through them all, it's perhaps easier to settle for a figure that some people refer to as simple and elegant: OPS, or On-base percentage Plus Slugging percentage. And perhaps worse yet, if you're an analyst, it's probably easiest to use OPS, because it's the easiest to explain. As we mentioned earlier in the series, OPS winds up doing a pretty decent job of mimicking a description of overall offensive value. So it works, right? And if it works, and it's simple, why not use it as a gateway stat to introduce fans to the broader, more diverse world of statistical analysis?
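For readers who want to see just how simple OPS is, here is a minimal sketch in Python using the standard definitions of on-base percentage and slugging percentage. The function names and the sample stat line are made up for illustration; they are not taken from the article.

```python
def on_base_percentage(ab, h, bb, hbp, sf):
    """OBP: times on base divided by the plate appearances that count toward it."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slugging_percentage(ab, h, doubles, triples, hr):
    """SLG: total bases per at-bat."""
    # Every hit is worth one base; extra-base hits add one, two, or three more.
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab


def ops(ab, h, doubles, triples, hr, bb, hbp, sf):
    """OPS is nothing more than the two rates added together."""
    return (on_base_percentage(ab, h, bb, hbp, sf)
            + slugging_percentage(ab, h, doubles, triples, hr))


# Hypothetical season line, invented purely for illustration.
sample = ops(ab=500, h=150, doubles=30, triples=5, hr=20, bb=60, hbp=5, sf=5)
print(f"OPS: {sample:.3f}")
```

Note that the two pieces use different denominators (roughly plate appearances for OBP, at-bats for SLG), which is part of why adding them is convenient rather than rigorous.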

March 3, 2004

Baseball Prospectus Basics: How We Measure Pitcher Usage

by Rany Jazayerli

So to understand the methods we use to analyze pitcher usage, it's important to appreciate that while every team in baseball today employs essentially the same usage pattern--starting pitchers work in a five-man rotation, with four or five days of rest between starts, and never relieving in between--that usage pattern is far from the norm historically. As recently as 30 years ago, starters were expected to start every fourth day, with only three days of rest between starts. This does not appear to have had a detrimental effect on the pitchers of that era; in fact, over half of the 300-game winners of the live-ball era were in the prime of their careers in the early 1970s. There is no definitive proof that pitching in any kind of rotation is a necessary ingredient for successful pitching staffs. Through the 1950s, starting pitchers would routinely get six or seven days off to pitch against a team they matched up favorably against, then return to the mound on just two days' rest for their next start. There is no evidence that starting pitchers who relieve on their days off between starts suffer adversely for doing so. Starting pitchers routinely made 10 or 15 relief appearances a season for the better part of half a century.

March 2, 2004

Baseball Prospectus Basics: A Brief History of Pitcher Usage

by Rany Jazayerli

In the beginning, there were no rotations. There were no relievers. There was only one pitcher, and the term "everyday player" had no meaning. In 1876, George Bradley started all 64 games for the St. Louis Brown Stockings, completing 63 of them; his teammates combined to throw four innings all year. Of course, in the early days of the National League, the task performed by the pitcher bore little resemblance to what we call "pitching" today. At various times in the first two decades of professional baseball, the distance from the pitcher to home plate was less than 50 feet; a walk required nine balls; bunts that landed in fair territory before skidding to the backstop were considered fair balls; hitters could call for a "high" or "low" pitch; pitchers could throw the ball from a running start; and curveballs and overhand pitches were illegal. The game changed quickly, and it quickly became impossible for a team to rely on a single pitcher for its entire season. And once that point was reached, the question of how best to maximize each pitcher's usage was born.

March 1, 2004

Baseball Prospectus Basics: Evaluating Defense

by James Click

It is one of the most suspenseful moments in a baseball game. There's a smash to the second baseman, he slides, knocks it down, picks up the ball, throws from his knees, and the first baseman can't dig it out. The crowd waits, and then the message appears on the scoreboard: "On the last play, the official scorer has ruled: HIT." Many of the problems inherent in evaluating defense are evident in the situation above. The first, and most crucial, is the fact that one of the most basic statistics involved in defense, the error, is assigned by one of baseball's loosest rules, left to the interpretation of the various official scorers. While the league has struggled for the past few seasons to remove the subjectivity inherent in calling the strike zone, it has done nothing to remove the same from the assignment of errors. Rules 10.05.a-e discuss in detail what is to be considered a "base hit"--essentially any ball that could not be fielded with "ordinary effort," a phrase that is never defined or clarified. In any field, statistics are only valuable if they are consistent and accurately reflect the action on the field. Errors, especially recently, have become assigned in such an ad hoc fashion as to relegate the statistic to nearly unusable status.

February 29, 2004

Baseball Prospectus Basics: Just Another Out?

by Ryan Wilkins

As we've stated on a number of different occasions throughout the Baseball Prospectus Basics series, one of the goals of performance analysis is to separate perception from reality. Sometimes that means interpreting numbers, and sometimes that means interpreting events with our eyes. Either way, it's about collecting information, and getting a little bit closer to the truth. Evaluating the importance of strikeouts, especially for hitters, is something that has traditionally fallen into the second category. And it's easy to understand why: baseball is a game that centers around the ongoing conflict between pitcher and batter, and there are few outcomes that capture the drama of that conflict better than a mighty whiff, followed by a long walk back to the bench. On the surface at least, a strikeout appears to be the ultimate failure for a hitter--infinitely worse than a Texas-leaguer or a fly-out to center.

February 27, 2004

Baseball Prospectus Basics: Stolen Bases and How to Use Them

by Joe Sheehan

Think of stealing bases as a bit like one of those commercials for breakfast cereal. You know, the ones where they say it takes 14 bowls of Cereal X to equal what you get from one bowl of Cereal Y. In this case, it takes three stolen bases to equal one walk of shame back to the dugout. If you're stealing at less than a 75% success rate, you're better off never going at all.
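The 75% figure follows directly from the three-to-one exchange rate. A rough sketch in Python, assuming (as the paragraph does) that one caught stealing costs as much as three successful steals gain; the function names are illustrative, not BP terminology:

```python
def break_even_rate(steals_per_caught):
    """Success rate at which steals exactly pay for the times caught.

    If one caught stealing wipes out `steals_per_caught` successful steals,
    then p * 1 = (1 - p) * steals_per_caught at break-even, so
    p = steals_per_caught / (steals_per_caught + 1).
    """
    return steals_per_caught / (steals_per_caught + 1)


def net_steals(sb, cs, steals_per_caught=3):
    """Net value of a runner's attempts, measured in 'successful steal' units."""
    return sb - steals_per_caught * cs


print(break_even_rate(3))      # the three-for-one trade from the text
print(net_steals(sb=30, cs=10))  # a 75% base stealer nets out to nothing
```

Under that assumption a runner who goes 30-for-40 has added exactly nothing, and anyone below that line is costing his team runs.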

February 26, 2004

Baseball Prospectus Basics: How to Run a Bullpen

by Derek Zumsteg

Closers are an aberration in baseball's history, a massive misallocation of resources, and eventually will go the way of the dinosaurs Carl Everett doesn't believe in. A pure closer is a reliever who only comes in to protect a one- to three-run lead, only in the ninth. The worst pitcher in baseball stands a great chance of pitching the ninth inning without giving up three runs. With no outs, a team with an average offense against an average pitcher can expect to score half a run. The best offense in baseball last year, the Red Sox, averaged about .65 runs/half inning over the course of the season. The worst reliever in the major leagues last year was Jaret Wright, who gave up 46 runs in just over 56 innings of work--.82 runs an inning. Given a three-run lead in the ninth, pitching against the Red Sox, Wright could reasonably be expected to give up an average of a run each appearance, and if he did it all season, he'd rack up 20 saves, be anointed a proven closer, and sign with the Mets for $4 million a year.
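The arithmetic behind that comparison is easy to check. Here is a back-of-the-envelope sketch in Python; it treats "just over 56 innings" as exactly 56, which is an approximation that slightly overstates Wright's rate, and the variable names are invented for illustration.

```python
def runs_per_inning(runs, innings):
    """Average runs allowed (or scored) per inning."""
    return runs / innings


# Figures quoted in the text. Innings are rounded down to 56 as an assumption,
# because the text only says "just over 56 innings".
wright_rate = runs_per_inning(runs=46, innings=56)
average_offense_rate = 0.5   # "can expect to score half a run"
red_sox_rate = 0.65          # "about .65 runs/half inning"

lead = 3  # the largest lead that still counts as a save situation in the ninth
print(f"Wright: {wright_rate:.2f} runs per inning")
print(f"Cushion left on a {lead}-run lead, on average: {lead - wright_rate:.2f} runs")
```

Even the worst reliever's average inning leaves most of a three-run lead intact, which is the point of the paragraph above.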

February 25, 2004

Baseball Prospectus Basics: How to Read a Box Score

by Keith Scherer

In our introduction to the Baseball Prospectus Basics series we wrote, "We always want to improve our understanding of the game--each player, each play, each pitch, each throw, each hit--what does it really mean?" We have a storehouse of data to help us refine our understanding of how baseball really works and how it can be improved. We have a team of performance analysts who help us see things we might have never perceived on our own. But the unrefined essentials of what we use are harvested from the box scores you and I read every morning from April through October. The title of this essay is misleading: there is no correct way to read a box score. Roto gamers approach a box score like it's a greatest hits record. Retrosheet's patrons dust each stroke of agate as if it were an artifact. You pay for the morning paper, you get to use your box scores however you wish, even as fishwrap. Box scores now tell us nearly everything that occurs in a game. They tell us how warm it was, the direction and speed of the wind, and how many people came out to the park. We can find out who the umpiring crew was. Baserunning blunders, substitution patterns, clutch hits, high-leverage relief appearances--it's all in a good box score, along with groundballs, flyballs, balls, strikes, and pitch counts.

February 24, 2004

Baseball Prospectus Basics: About EqA

by Clay Davenport

Dayn Perry explained why various statistics--like batting average (AVG) and runs batted in (RBI)--were not as reliable as you've always been told, and why we at Baseball Prospectus don't use them in our analysis terribly often. Today, we're going to look into one of the statistics we do use: Equivalent Average, or EqA. In its rawest state, EqA is a simple combination of batting numbers, not so very different from OPS. Compared to OPS, it counts walks and HBP a little higher (at 1.5 instead of 1), it has stolen bases, and hits and extra bases are counted a little less (since they are divided by plate appearances, not just at-bats). What, then, makes EqA different from the other statistics? Simply put, it's more accurate, it's unbiased, and it models the scale of batting average, so it's easy for a new fan to understand.
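To make the comparison with OPS concrete, here is a sketch of one commonly cited form of the raw EqA calculation. The excerpt above does not spell out the formula, so treat the exact terms below (in particular the caught-stealing and stolen-base adjustments in the denominator) as an assumption rather than the article's definition. The sample stat line is invented.

```python
def raw_eqa(ab, h, tb, bb, hbp, sb, cs):
    """Raw EqA in one commonly cited form (an assumption, not quoted from the text).

    Walks and HBP are weighted at 1.5, stolen bases are credited in the
    numerator, and the denominator is close to plate appearances.
    """
    numerator = h + tb + 1.5 * (bb + hbp) + sb
    denominator = ab + bb + hbp + cs + sb / 3
    return numerator / denominator


# Hypothetical season line, for illustration only.
print(f"{raw_eqa(ab=500, h=150, tb=250, bb=60, hbp=5, sb=9, cs=3):.3f}")
```

The sketch stops at the raw figure, which lands on roughly the same scale as OPS. Converting it to the batting-average scale mentioned above takes further adjustment steps that are not shown here.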

February 23, 2004

Baseball Prospectus Basics: The Support-Neutral Stats

by Michael Wolverton

"...and the tough-luck loser in tonight's game is..." We hear the above quote in dozens of post-game wrap-ups every year. A starting pitcher goes seven or eight innings and gives up only one or two runs, but his team's offense can't produce anything, so he gets stuck with an "L" next to his name in the box score. The fact that "tough-luck loser" is such a commonly invoked cliche suggests that it's widely recognized that the "L" isn't doing a very good job of measuring the starter's contribution, at least in those situations. But that still doesn't stop the W/L record from being possibly the most prominently used statistic to evaluate starting pitchers in major media baseball coverage. The idea behind the pitcher's W/L record is flawed on its face. Wins are a team thing, after all, not a pitcher thing. If the offense fails to put runs on the board, or if the bullpen melts down in the late innings, the starter won't get the win no matter how well he pitches. Conversely, if the offense is having a great night (or if they're going up against the Rangers, which is pretty much the same thing), the starter doesn't have to do anything more than last five innings to get the W.

February 20, 2004

Baseball Prospectus Basics: Statistical Consistency

by James Click

February, in the baseball world, is the month of predictions. Every analyst, writer, web site, undefeatable computer program, guy with a beer, and book (some better than others) will spend the next month looking over the offseason wasteland and espousing conclusions. The method behind these processes varies more widely than Johnny Depp's acting roles; some are based purely on numbers, some purely on empirical data, some purely on names, and some purely on nothing. So what can you count on? For one thing, you can count on me not offering you any spectacular predictions, guaranteed to be more accurate than anything on the market. If you want that, read up on BP's own PECOTA projection system. Instead, the aim will be to lay a basic groundwork for your expectations of the consistency of basic statistics from season to season. Surmising the volatility of various metrics, and their consistency from year to year, is the primary goal.
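One common way to put a number on season-to-season consistency is the correlation between the same players' figures in consecutive years. The sketch below uses a plain Pearson correlation; that choice is an assumption made for illustration, not necessarily the method the article goes on to use, and the batting averages in it are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: near 1.0 means a stat carries over year to year,
    near 0.0 means last year's number says little about this year's."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical batting averages for the same five hitters in back-to-back seasons.
year_one = [0.250, 0.300, 0.275, 0.260, 0.310]
year_two = [0.262, 0.281, 0.290, 0.248, 0.295]
print(f"year-to-year correlation: {pearson(year_one, year_two):.2f}")
```

Running the same calculation on different statistics for the same group of players gives a rough ranking of which numbers you can count on repeating.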

February 19, 2004

Baseball Prospectus Basics: Measuring Offense

by Dayn Perry

Before delving into those harrowing inhabitants of the Baseball Prospectus statistics page like VORP, RARP, EqA or any other acronym that sounds like a debutante sneezing or something uttered on Castle Wolfenstein circa 1986, it's worth asking: What's wrong with those comfy traditional offensive measures like RBI, batting average and runs scored? This Baseball Prospectus Basics column is going to address that question and, ideally, demonstrate why the traditional cabal of offensive baseball statistics tells only a piece of the story. Later, someone smarter (but shockingly less handsome) than I will take you on a tour of the more advanced and instructive metrics like the aforementioned VORP, RARP and EqA. For now, though, we'll keep our focus on why we need those things in the first place.

February 18, 2004

Baseball Prospectus Basics: Reshaping the Debate

by Joe Sheehan

"Stathead." "Stat-drunk computer nerd." "Rotisserie geek." You can earn a lot of derision when you look at things in a new way, and the people who have applied statistical tools to evaluate baseball players and teams have heard the above epithets and more. The work of people such as Bill James, Craig Wright and Clay Davenport has often been dismissed as the mind-numbing analysis of people who need to put their slide rules away and get out and watch a game once in a while. Their efforts, which have been dubbed "statistical analysis," have expanded and improved the body of objective baseball knowledge, and their work is even beginning to penetrate the insular world of baseball front offices. But the term "statistical analysis," as applied to baseball, isn't descriptive enough. Actuaries analyze statistics, and while the work pays well, it is pretty dry stuff. Life-expectancy tables and risk/benefit workups aren't going to get your average Red Sox fan excited, nor should they: baseball fans care about their teams, and the players on them, not a series of numbers. But baseball statistics are not numbers generated for their own sake. Statistics are a record of performance of players and teams. Period. Benjamin Disraeli's oft-quoted line--"There are three types of lies: lies, damned lies, and statistics"--just doesn't apply.

February 17, 2004

Baseball Prospectus Basics: Introduction

Baseball Prospectus

If you're not familiar with Baseball Prospectus, here's what we're all about: understanding the game better, and innovating in order to do it. Everyone at BP loves the game of baseball with a passion that most people just don't understand. We feel that this greatest of games is so compelling that we want to know everything about it. We always want to improve our understanding of the game--each player, each play, each pitch, each throw, each hit--what does it really mean? Those arguments that take place in bars about the relative merits of different players? We really want to know the definitive answer to those questions. But we don't want to kill the joy of the game while we're looking. To help better understand what we're all about, we're launching a series of articles, entitled "Baseball Analysis Basics." The series seeks to make our work more accessible to new readers, and to remind those familiar with our work of the underlying concepts. As Keith Woolner's recently published "Hilbert Questions" article noted, there is much work still to be done.

