Baseball Prospectus home
  
  

June 6, 2011

BP Unfiltered

In-season PECOTA updates

by Colin Wyers

Some changes coming down the pipe for the Depth Charts (and by extension, the Player Forecast Manager) and PECOTA that I wanted to share with you:

  • We are now projecting players’ rest-of-season stats to a 2011 baseline, rather than a 2010 baseline, and
  • We are now using an updated PECOTA that incorporates a player’s 2011 performance (major league numbers only; minor league translations will be incorporated at a later date), and
  • Rest-of-Season PECOTA will be updated daily.

We’re going to be pushing these updated forecasts into the Depth Charts and PFM (as well as the tops of the player cards) on a daily basis. The “last updated” fields in the DC and PFM will still change only when we have manually tweaked a player’s playing time. (If you come across any player news or have an update for our depth charts, you can let our Fantasy crew know at our comments page, by sending an e-mail to dc@baseballprospectus.com, or by sending @BProDepthCharts a message on Twitter.)

I’m sure you’re all more interested in the PECOTA aspects of things, and I’ll be going into more detail on that in just a moment, but I do want to emphasize that we’ve shifted baselines as well. If you see a pitcher with a lower forecasted ERA than what you’ve seen in the past, even though he hasn’t pitched well to date, that’s because we’re expecting all pitchers to have a lower ERA than we would have prior to the start of the season. The change in baselines is not necessarily meant to reflect our expectations of what the rest of the season will be in terms of offensive levels (as it gets warmer, we should see the baselines rise), but to facilitate easier comparison between a player’s season-to-date performance and his rest-of-season projection.

Now, as to the PECOTA updates: We are not rerunning the entire PECOTA process on a daily basis. First off, that would simply be impractical; by the time we got done, the next day’s stats would already be waiting for us. Secondly, it would be the wrong tool for the job. Much of the computational horsepower behind PECOTA is spent figuring out how a player will change with age. The effects of age between now and late August are minimal enough to be ignored, and the aging process used to figure a player’s aging between seasons would be very ill-suited to capturing them anyway.

Instead, we are taking a player’s season-to-date numbers and, in effect, “regressing” them toward the pre-season PECOTA forecast. The weighting is determined by two things: (1) a player’s playing time so far this season and (2) the reliability of a player’s preseason forecast. The more a player plays this season, the more the rest-of-season forecast can move, but at the same time, the forecast for a rookie is more likely to move than that of an established veteran.
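To make the idea concrete, here is a minimal sketch of that kind of regression as a plate-appearance-weighted average. It is not the actual PECOTA calculation, whose weighting is not spelled out here; the function name, the idea of expressing forecast reliability as an equivalent number of plate appearances, and all of the numbers are assumptions made purely for illustration.

```python
def rest_of_season_rate(observed_rate, observed_pa,
                        preseason_rate, preseason_weight_pa):
    """Blend a season-to-date rate with a preseason forecast.

    preseason_weight_pa is a hypothetical stand-in for forecast
    reliability: the number of plate appearances the preseason
    forecast is treated as being "worth" in the average.
    """
    total_pa = observed_pa + preseason_weight_pa
    return (observed_rate * observed_pa
            + preseason_rate * preseason_weight_pa) / total_pa


# Two made-up hitters with the same forecast (.300) and the same hot
# start (.350 over 200 PA), differing only in forecast reliability.
veteran = rest_of_season_rate(0.350, 200, 0.300, 1500)
rookie = rest_of_season_rate(0.350, 200, 0.300, 500)

print(f"veteran: {veteran:.3f}")
print(f"rookie:  {rookie:.3f}")
```

With the same performance to date, the hitter whose forecast carries less weight (the rookie) ends up with the larger adjustment, and either hitter's adjustment grows as observed plate appearances accumulate.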

I’m anticipating some questions from y’all, so I’ll start off with the first one I expect to see: Why aren’t we projecting Jose Bautista to hit like Barry Bonds for the rest of the season?

It’s a good question. Let’s say this up front: Bautista is very nearly a singular case in the history of baseball, insofar as his transformation from journeyman to premier hitter. It is possible that a model like PECOTA, based on historic data, is having a hard time coping with a player as unique as Bautista. The trouble is that since Bautista is unique, it is impossible for us to test this proposition any other way than to just let Bautista play and see what he does next. But the updated PECOTA is by nature conservative, not just for Bautista but for all hitters. The reason this clashes with our expectations is a phenomenon called recency bias. Humans have a tendency to overweight more recent information at the expense of older information. One of the big benefits of using a forecasting system like PECOTA is that it forces us to confront our recency bias and to account for all of the information we know about a player.

The next question I expect is, “So why does Fangraphs have a much higher rest-of-season projection for Bautista than you do?” My answer is simple: they are wrong. Since I obviously have an interest in saying so, I shall explain why this is the case:

Fangraphs uses a stat called wOBA as their all-encompassing batting rate; conceptually it and TAv are very similar. For our purposes, the main difference between them is that wOBA is baselined to the OBP scale rather than the batting average scale. (In Fangraphs’ implementation, it is reconciled to the league OBP for that season, unlike TAv where the average is held constant over time.) Prior to the season, ZiPS (the projection system designed by Dan Szymborski) projected Bautista to have a .381 wOBA. This is not too far from where PECOTA had him; depending on how you want to handle converting between wOBA and TAv, these could be identical forecasts in terms of overall batting productivity.

Since the start of the season, Bautista has hit for a .516 wOBA (by Fangraphs’ reckoning; other sites such as Statcorner figure wOBA slightly differently and thus come to different results) in 235 plate appearances. ZiPS now gives Bautista a projected .415 wOBA for the rest of the season, equivalent to a TAv somewhere around .330 (depending on the assumed league OBP for the rest of the season), significantly higher than what rest-of-season PECOTA says. If we were to take a weighted average of his preseason forecast and his season-to-date performance, then in order for the result to equal his rest-of-season projection you’d have to treat the preseason forecast as worth 698 plate appearances, right around the number of PAs Bautista had in 2010 alone, whereas a projection that took into account the previous three seasons would be closer to 1500 PAs. ZiPS is underweighting Bautista’s preseason projection in favor of his most recent performance. If the point of a projection system is to help overcome recency bias, this kind of rest-of-season forecast helps less than it hurts: instead of combating recency bias, it reinforces it.
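The 698 figure can be reproduced by solving that weighted average for the weight on the preseason forecast. The sketch below assumes a simple PA-weighted average, as described in the paragraph above; the function name is made up for illustration, and the inputs are the three-decimal wOBA values quoted in the text, so the result is only as precise as that rounding allows.

```python
def implied_preseason_weight(observed_rate, observed_pa,
                             preseason_rate, ros_rate):
    """Solve (pre*N + obs*PA) / (N + PA) = ros for N.

    N is the number of plate appearances the preseason forecast
    would have to be "worth" for a simple weighted average to land
    on the published rest-of-season projection.
    """
    if ros_rate == preseason_rate:
        raise ValueError("projection did not move; weight is undefined")
    return observed_pa * (observed_rate - ros_rate) / (ros_rate - preseason_rate)


# Bautista, using the wOBA figures quoted above.
bautista_weight = implied_preseason_weight(
    observed_rate=0.516, observed_pa=235,
    preseason_rate=0.381, ros_rate=0.415,
)
print(round(bautista_weight))  # 698
```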

And this is not an issue related to Bautista’s singular nature; let's look at the top twenty players in terms of absolute change between the preseason projection and the current rest-of-season forecast:

Name              PA   wOBA_Obs  wOBA_Pred  wOBA_ROS  wOBAscale_diff  TAvscale_diff
Russell Branyan   93   .243      .374       .339      .035            .028
Jose Bautista     235  .516      .381       .415      .034            .027
Reed Johnson      69   .452      .295       .328      .033            .026
Matt Joyce        206  .440      .327       .357      .030            .024
Greg Dobbs        150  .369      .298       .325      .027            .021
Jose Molina       75   .391      .264       .290      .026            .020
Alberto Gonzalez  83   .215      .277       .253      .024            .019
Matt Kemp         254  .432      .337       .361      .024            .019
Eric Chavez       39   .357      .255       .279      .024            .019
Laynce Nix        139  .389      .312       .335      .023            .018
Jason Michaels    44   .215      .316       .295      .021            .017
Ryan Raburn       182  .256      .347       .326      .021            .017
Jhonny Peralta    201  .391      .318       .339      .021            .017
Brett Hayes       39   .427      .259       .280      .021            .017
Lance Berkman     207  .433      .359       .380      .021            .017
Paul Janish       180  .237      .293       .273      .020            .016
Alex Avila        180  .373      .306       .326      .020            .016
Chone Figgins     224  .209      .311       .291      .020            .016
Dan Uggla         242  .244      .353       .334      .019            .015
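For readers who want to reproduce the ordering of the ZiPS list, here is a small sketch of the ranking: take the absolute difference between the preseason projection (wOBA_Pred) and the rest-of-season projection (wOBA_ROS), then sort from largest to smallest. Only four of the rows are included, and the variable names are illustrative.

```python
# (name, wOBA_Pred, wOBA_ROS) for a handful of hitters from the ZiPS list.
zips_rows = [
    ("Dan Uggla", 0.353, 0.334),
    ("Jose Bautista", 0.381, 0.415),
    ("Matt Joyce", 0.327, 0.357),
    ("Russell Branyan", 0.374, 0.339),
]


def absolute_change(row):
    """Size of the move from preseason to rest-of-season projection."""
    _, preseason, rest_of_season = row
    return abs(rest_of_season - preseason)


# Largest movers first, regardless of direction.
ranked = sorted(zips_rows, key=absolute_change, reverse=True)

for name, preseason, rest_of_season in ranked:
    print(f"{name:16s} {abs(rest_of_season - preseason):.3f}")
```

Because the ranking uses the absolute value, hitters whose projections fell (Branyan, Uggla) sit alongside those whose projections rose (Bautista, Joyce).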

Now, looking at in-season PECOTA:

Name              PA   TAv_Obs  TAv_Pred  TAv_ROS  wOBAscale_diff  TAvscale_diff
Michael Saunders  152  .182     .247      .231     .020            .016
James Loney       230  .226     .269      .254     .019            .015
John Jaso         127  .221     .264      .250     .018            .014
Chase Headley     227  .286     .266      .253     .017            .013
Nate Schierholtz  129  .266     .265      .252     .017            .013
Emmanuel Burriss  45   .220     .230      .217     .017            .013
Gordon Beckham    211  .249     .263      .251     .015            .012
Buster Posey      185  .265     .296      .285     .014            .011
Brandon Wood      93   .210     .249      .238     .014            .011
Jerry Hairston    171  .237     .242      .231     .014            .011
Mark Ellis        222  .214     .247      .236     .014            .011
Jeff Mathis       118  .212     .213      .202     .014            .011
Omar Infante      241  .239     .262      .251     .014            .011
Drew Butera       111  .156     .209      .198     .014            .011
Ramon Castro      50   .229     .259      .248     .014            .011
Skip Schumaker    96   .184     .256      .245     .014            .011
Brandon Belt      67   .235     .291      .280     .014            .011
Cesar Izturis     29   .171     .218      .208     .013            .010

The ZiPS projections, first of all, show a lot more movement, equivalent to .028 points of TAv at the most extreme, compared to .016 for PECOTA. Saunders, in fact, is the only player on the PECOTA list with a larger change than the lowest player on the ZiPS list. The next notable thing is that the players on the ZiPS list seem to be much more likely to be established veterans, while the PECOTA list leans much more heavily toward rookies. Veterans, as a rule, should be less amenable to projection changes than younger, inexperienced players - when you have three full seasons of a guy in the majors, it should take more information to change your mind than it should for someone whose projection is based on less than a full MLB season and some translated minor league data; these lists show PECOTA behaving that way but not ZiPS.

The next question I anticipate is, when will these rest-of-season forecasts be available outside of the PFM? Right now I am working on incorporating the rest-of-season forecasts into the rest of our PECOTA offerings, including the 10-year forecasts, which I anticipate being able to debut sometime next week. We will also be offering updated in-season numbers for players who are not included on the depth charts in the very near future.

Colin Wyers is an author of Baseball Prospectus. 