
September 29, 2010

Reintroducing PECOTA

The Hits Just Keep On Coming

by Ben Lindbergh and Colin Wyers

Were Ichiro Suzuki represented by Scott Boras, the super-agent might be able to make a more convincing case than usual for his client’s singular, once-in-a-generation talent. Actually, Ichiro-types don’t come along even as often as that, especially in PECOTA’s post-World War II player comparison pool; the baseball gods appear to have both made and broken the mold especially for him.

The Mariners’ NPB import is an outlier in more ways than one, which makes him both a fan favorite and a likely future Hall-of-Famer. Of course, every player could accurately be described as unique, whether because of some aspect of his play on the field, his background, or his choice of breakfast cereal. But Ichiro’s uniqueness is impossible to ignore.

As it happens, some of the very qualities that endear Ichiro to baseball fans render him persona non grata with the developers of forecasting systems, at least in their professional capacities. In addition to being a great quote, Suzuki has famously collected at least 200 hits in each of 10 consecutive seasons, a feat that distinguishes him from every other player in history. The traits that have enabled him to amass those remarkable hit totals also mark him as the rarest of roses in and of themselves.

By virtue of his speed, tendency to hit the ball on the ground, and, perhaps, some innate ability to hit ’em where they ain’t, Suzuki has managed to sustain a .357 BABIP over more than 7,000 plate appearances in environments where the “average hitter” musters only a near-.300 figure. In not-unrelated news, Suzuki has led the American League in infield hit percentage for five straight seasons.
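For readers who want the arithmetic behind that figure, BABIP is conventionally computed as hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The sketch below applies that standard definition to a made-up season line; the function name and the sample numbers are illustrative assumptions, not Ichiro’s actual totals.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play, using the conventional formula
    (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, invented purely for illustration.
sample = babip(hits=200, home_runs=10, at_bats=600, strikeouts=60, sac_flies=5)
print(f"{sample:.3f}")
```

Home runs and strikeouts are removed from both sides of the ratio because neither puts a ball in play for a fielder to handle, which is why the stat isolates the skills (and luck) discussed above.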

The statistical quirks that have made Ichiro such good news for Seattle have doubled as bad news for the accuracy of PECOTA’s projections. Since automated projection algorithms aren’t tailored to individuals, players to whom the normal rules don’t apply (or apply only loosely) present a challenge. Let’s take a look at how PECOTA’s past and present projections for the speedy right fielder stack up against reality. The following table displays Suzuki’s actual stats since suiting up on this side of the Pacific (omitting his rookie season):

Year   PA    AVG    OBP    SLG
2002   728   .321   .388   .425
2003   725   .312   .352   .436
2004   762   .372   .414   .455
2005   739   .303   .350   .436
2006   752   .322   .370   .416
2007   736   .351   .396   .431
2008   749   .310   .361   .386
2009   678   .352   .386   .465
2010   704   .314   .359   .395
AVG    730   .329   .375   .427

For comparative purposes, here’s how PECOTA projected Ichiro in each of our annual publications since the system hit the scene. PECOTA was little more than a gleam in Nate Silver’s eye in 2002, so we’ll look at 2003 on:

Year   AVG    OBP    SLG
2003   .306   .368   .419
2004   .309   .351   .423
2005   .311   .355   .415
2006   .308   .343   .406
2007   .310   .354   .398
2008   .304   .346   .384
2009   .292   .338   .359
2010   .322   .375   .426

Withholding comment until we’ve presented all the data, let’s take a look at the retroactive forecasts (sans aging adjustments, which we’ll cover later this week) for the same seasons, generated by the latest PECOTA methodology:

Year   AVG    OBP    SLG
2002   .315   .345   .424
2003   .315   .356   .420
2004   .312   .352   .421
2005   .327   .368   .428
2006   .320   .358   .423
2007   .319   .362   .422
2008   .325   .366   .422
2009   .321   .363   .407
2010   .319   .359   .412
AVG    .319   .359   .420

 
To put the different implementations of PECOTA on a level playing field, we can compare Ichiro’s MLB stats with those dueling forecasts for 2003-2010, weighting each season’s projection by Ichiro’s actual PA total:

Results      AVG    OBP    SLG
Actual       .330   .374   .428
New PECOTA   .320   .360   .420
Old PECOTA   .308   .354   .404

Actual Ichiro outperforms even the new-and-improved projected Ichiro, but not by much: only 10 points of batting average separate the two. A projection system can’t predict luck, and since some random fluctuation is inevitable, no method can pinpoint batting average infallibly. Ichiro’s true batting-average ability may have remained more or less stable even as his results jumped from as low as .303 to as high as .372, but “new” PECOTA wisely split the difference, never calling for a figure lower than .312 or higher than .327.
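The weighted comparison above can be reproduced from the AVG columns of the earlier tables. The sketch below is a minimal illustration rather than the code PECOTA runs on: it weights each season’s batting average by Ichiro’s actual plate appearances for 2003-2010, and the function and variable names are hypothetical. Weighting a batting average by PA is itself an approximation (it is properly a per-at-bat rate), so the results should be expected to match the published figures only to within rounding.

```python
# Actual plate appearances, 2003-2010, from the table of Ichiro's actual stats.
PA = [725, 762, 739, 752, 736, 749, 678, 704]

# Batting averages for the same seasons, taken from the three tables above.
AVG = {
    "Actual":     [.312, .372, .303, .322, .351, .310, .352, .314],
    "New PECOTA": [.315, .312, .327, .320, .319, .325, .321, .319],
    "Old PECOTA": [.306, .309, .311, .308, .310, .304, .292, .322],
}


def pa_weighted(rates, weights):
    """Average of per-season rates, weighted by plate appearances."""
    if len(rates) != len(weights):
        raise ValueError("rates and weights must be the same length")
    return sum(r * w for r, w in zip(rates, weights)) / sum(weights)


for label, rates in AVG.items():
    print(f"{label:<11} {pa_weighted(rates, PA):.3f}")
```

Run as written, the three weighted averages land within a point of the .330, .320, and .308 shown in the table; swapping in the OBP or SLG columns works the same way.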

So what’s responsible for the improvements in Ichiro’s forecast? As Nate Silver acknowledged several years ago, PECOTA wasn’t doing a great job of grasping the legitimacy of Ichiro’s high batting averages. Not all high batting averages are created equal, and as Nate lamented about the system’s former failings, “PECOTA thinks that Ichiro is due for a major correction because it thinks he’s like Luis Polonia, and when a hero like Luis Polonia hits .330 or something, it is almost certainly a fluke, a lucky year by a banjo hitter.”

Nate dubbed Ichiro “unique,” but he’s not the only batter whose high BABIPs manage to confound PECOTA on a regular basis. Matt Swartz has written about these “BABIP Superstars” on multiple occasions. Along with Ichiro, the group he identified includes luminaries like Derek Jeter and Joe Mauer, which hasn’t helped to obscure PECOTA’s deficiencies in the BABIP department.

The problem lies in projecting batting average as a single number in the first place. There are any number of component skills that contribute to a player’s ability to hit for average: his ability to hit home runs, his ability to make a lot of contact, his ability to leg out a few additional singles. But PECOTA was lumping all of those skills into one catch-all metric, one that is typically subject to a great deal of noise.

So we’ve broken hitting down into a much larger set of component skills than PECOTA has in the past – utilizing play-by-play data from Retrosheet, we can break out things like infield singles and reaching on errors. We can then break this more detailed batting line down into an even more detailed set of components, and project them all independently before combining them into an overall batting line.

This lets us do a better job of projecting players with unique skill sets – by taking a closer look at the variety of skills that make up their batting line, we can do a much better job of identifying the underlying skill and not regressing it away as “luck.”
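To make the idea concrete, here is a toy sketch of component-wise projection. It is not PECOTA’s actual method: the shrinkage formula is a generic regression-to-the-mean estimator, and every component, league rate, stabilization constant, and stat line below is an invented assumption, chosen only to show how a skill that stabilizes quickly (infield singles, in this example) can survive regression that would flatten a single lumped batting average.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    events: int           # observed count of the event
    chances: int          # opportunities for the event
    league_rate: float    # assumed league-average rate
    stabilization: float  # assumed pseudo-chances of league-average play


def regress(c: Component) -> float:
    """Shrink an observed rate toward the league rate.

    Equivalent to adding `stabilization` league-average chances to the sample.
    """
    return (c.events + c.stabilization * c.league_rate) / (c.chances + c.stabilization)


def project_avg(strikeout, home_run, infield_hit, other_hit) -> float:
    """Combine independently regressed components into a batting average.

    strikeout and home_run are per at-bat; the two hit types are per ball in play.
    """
    k, hr = regress(strikeout), regress(home_run)
    balls_in_play = 1.0 - k - hr
    return hr + balls_in_play * (regress(infield_hit) + regress(other_hit))


# Hypothetical hitter over 650 at-bats; every number here is made up.
AB, K, HR = 650, 65, 8
BIP = AB - K - HR  # balls in play, ignoring sacrifice flies for simplicity
components = dict(
    strikeout=Component("K/AB", K, AB, league_rate=0.18, stabilization=100),
    home_run=Component("HR/AB", HR, AB, league_rate=0.03, stabilization=200),
    infield_hit=Component("IFH/BIP", 50, BIP, league_rate=0.03, stabilization=150),
    other_hit=Component("OtherH/BIP", 150, BIP, league_rate=0.27, stabilization=1200),
)

observed_avg = (HR + 50 + 150) / AB
lumped = regress(Component("AVG", HR + 50 + 150, AB, league_rate=0.265, stabilization=900))
by_component = project_avg(**components)
print(f"observed {observed_avg:.3f}  lumped {lumped:.3f}  by component {by_component:.3f}")
```

With these made-up inputs, the component-wise figure stays closer to the observed average than the lumped one does, because comparatively little of the infield-hit and strikeout rates is regressed away. The real system’s components and weights may differ considerably.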

This also has implications for pitchers: we’re not stuck using official pitching stats to project pitchers anymore. We can get an exact count of (for instance) doubles and triples allowed, and to the extent that pitchers have a persistent skill in allowing extra-base hits on balls in play, we can use that information to project their runs allowed.

(This also reduces the amount of code needed to run PECOTA, because we can share more code between the hitter and pitcher forecasts. That means less possibility for bugs and more shared improvements between the two sets of projections.)

But what about players whose skills aren’t unique but whose situations are? Tomorrow, we look at how we’re making PECOTA smarter about injuries.

Ben Lindbergh is an author of Baseball Prospectus. 
Colin Wyers is an author of Baseball Prospectus. 
