Four months ago, Harper was voted the “most overrated player in baseball” (for the second consecutive year) by his fellow players in ESPN The Magazine. Even though he had racked up 1,489 Major League plate appearances without facing a pitcher younger than him (it finally happened when he faced New York’s Jacob Lindgren on June 10th) and possessed once-in-a-generation-type natural ability, there wasn’t a fantasy analyst on the planet confident enough to forecast that Harper, coming off an injury-marred campaign the previous season, was on the precipice of blossoming into a fantasy superstar at just 22 years old.

Granted, there were some authentic signs during spring training that Harper was finally poised for a breakout, but given his struggles to stay healthy and his backslide in approach at the plate last year, he carried substantially more risk than almost any other player in re-draft formats. Investing an early-round pick in Harper was the fantasy baseball equivalent of Kramer and Newman’s scheme to make an extra five cents recycling bottles and cans by driving them to Michigan in an extra mail truck. Sure, the potential for a massive payoff was there if everything broke right, but the odds of Harper blowing up like he did (or Kramer and Newman finding a way to Michigan without incident) were slim at best.

No matter how you slice it, Harper is having an incredible season. He’s emerged as a legitimate MVP candidate, slashing .339/.464/.704 with 26 home runs, 59 runs scored, 61 RBI, and four stolen bases in 343 plate appearances. He leads all of baseball with a 1.168 OPS and will make the third All-Star Game appearance of his career tonight in Cincinnati.

The burning questions now become: What should fantasy owners expect from Harper going forward? Is his otherworldly performance sustainable? Finally, how does his future performance match up with his trade value? Because let’s face it, Harper’s value in re-draft formats will never be higher than it is right now.

I won’t go full Russell Carleton “gory math alert” (seriously, read his work, it’s brilliant) to explain just how valuable Harper has been for his fantasy owners this season, because Baseball Prospectus elder fantasy statesman (if you don’t get that joke, go listen to the Flags Fly Forever podcast right now) Mike Gianella has already done that. According to his latest in-season, Rotisserie-style valuations for American League-only and National League-only formats, Harper is one of just three hitters (Paul Goldschmidt and Dee Gordon are the others) on pace to earn over $40 in the NL this year.

The concept of “value” is at the core of every fantasy baseball owner’s strategy. My esteemed Baseball Prospectus colleague Jeff Quinton could do a vastly superior job explaining the theory behind this idea, but I’ll give it a shot. If the goal is to employ a process that plays the percentages (what is most likely to occur given the information at our disposal) as often as possible, then expecting Harper to continue to produce at his current level is unrealistic.

I’m not advocating that fantasy owners look to trade a true once-in-a-generation-type talent like Harper, especially not in keeper or dynasty formats, but let’s stop for a minute and acknowledge just how far into uncharted waters Harper’s performance is venturing from a re-draft league perspective. In order to do that, we need to dive into the vast ocean that is PECOTA.

In addition to the weighted-mean projections for players, Baseball Prospectus’ PECOTA projections also include percentiles, because forecasting is an inexact science. So what do the percentiles mean exactly? Here is what Baseball Prospectus’ Colin Wyers wrote when they were first released in 2013:

“Percentiles allow us to put a range of outcomes around a single-point forecast, to illustrate how uncertain the forecast is and what range of outcomes are most likely. The percentiles, then, represent the spread of outcomes if we were to have a player go through the 2013 season thousands upon thousands of times. Imagine a bell curve, with the 50th percentile at the very peak. Twenty percent of the time, a player's results should fall in between the 40th and 60th percentiles—or 60 percent of the time, a player should perform at his 60th percentile or worse, while 40 percent of the time, he should play better.”
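For anyone who wants to see the "thousands upon thousands of seasons" idea in action, here is a minimal Python sketch. It is only an illustration, not how PECOTA actually works: the bell-curve shape, the mean OPS of .796 (the 50th percentile OBP plus SLG from Harper's pre-season forecast) and the .090 spread are all assumptions I made up for the example.

```python
import random
from statistics import quantiles

random.seed(2015)  # fixed seed so the illustration is repeatable

# ASSUMPTION: season OPS follows a bell curve centered on .796 with a
# made-up standard deviation of .090. PECOTA's real distributions are
# not described in this article.
N_SEASONS = 100_000
simulated_ops = [random.gauss(0.796, 0.090) for _ in range(N_SEASONS)]

# quantiles(..., n=10) returns nine cut points: the 10th, 20th, ..., 90th
# percentiles of the simulated seasons.
cuts = quantiles(simulated_ops, n=10)
p40, p60, p90 = cuts[3], cuts[5], cuts[8]

# Share of seasons landing between the 40th and 60th percentiles.
share_40_to_60 = sum(p40 <= ops <= p60 for ops in simulated_ops) / N_SEASONS

# Share of seasons beating the 90th percentile.
share_above_90 = sum(ops > p90 for ops in simulated_ops) / N_SEASONS

print(f"Between 40th and 60th percentile: {share_40_to_60:.1%}")
print(f"Above 90th percentile: {share_above_90:.1%}")
```

By construction, roughly one simulated season in five lands between the 40th and 60th percentiles and roughly one in ten tops the 90th. Whether real players beat their 90th percentile that often depends on how well calibrated the projection system is.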

PECOTA 2015 Regular Season Forecast

Here is a brief look at how Harper’s pre-season forecast compares to his actual statistics through 81 games this season.

• 90th percentile: 651 PA – 91 R – 26 HR – 94 RBI – 13 SB – .304/.385/.513

• 50th percentile: 597 PA – 75 R – 21 HR – 78 RBI – 11 SB – .268/.344/.452

• 10th percentile: 543 PA – 61 R – 17 HR – 64 RBI – 9 SB – .233/.304/.393

As you can clearly see, Harper is on pace to shatter his 90th percentile PECOTA projection coming into the year (especially in the power and on-base departments).

• 2015 Statistics: 343 PA – 59 R – 26 HR – 61 RBI – 4 SB – .339/.464/.704
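To put a rough number on "on pace," here is a small Python sketch that stretches Harper's counting stats out to the 651 plate appearances in his 90th percentile forecast. The straight-line proration is my own simplifying assumption (it presumes he keeps producing at exactly his first-half rate), and the `prorate` helper is hypothetical, written just for this illustration.

```python
# Counting stats from the article: actual 2015 line and the pre-season
# 90th percentile PECOTA forecast.
actual = {"PA": 343, "R": 59, "HR": 26, "RBI": 61, "SB": 4}
pecota_90th = {"PA": 651, "R": 91, "HR": 26, "RBI": 94, "SB": 13}


def prorate(stats, target_pa):
    """Scale counting stats linearly to a target number of plate appearances.

    ASSUMPTION: production per plate appearance stays constant, which is
    a naive pace calculation rather than a forecast.
    """
    scale = target_pa / stats["PA"]
    return {name: round(value * scale) for name, value in stats.items()}


pace = prorate(actual, pecota_90th["PA"])

# Difference between the naive pace and the 90th percentile forecast.
gap = {name: pace[name] - pecota_90th[name] for name in ("R", "HR", "RBI", "SB")}

for name in ("R", "HR", "RBI", "SB"):
    print(f"{name}: pace {pace[name]} vs. 90th percentile {pecota_90th[name]} ({gap[name]:+d})")
```

On that naive pace, the runs, home runs and RBI all land well above the 90th percentile line, while the stolen bases fall short of it.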

It’s hard to blame PECOTA because Harper’s breakout season has corresponded with legitimate improvements at the plate (which have been well documented) that couldn’t have been anticipated given his past performance.

One of the best takes on Harper’s change in approach at the plate came from BP’s very own Dan Rozenson, who noted in late May how Harper has more than doubled his slugging percentage against breaking balls this season. Perhaps the most noticeable improvement can be found in Harper’s plate discipline. Not only has he doubled his walk rate (18.4%), but he has also cut his strikeout rate by six percent from a year ago. Is Harper’s elite performance sustainable? It seems logical, thanks to the adjustments he has made at the plate, which have enabled him to become one of the game’s most feared power hitters. However, PECOTA expects at least some regression in the power categories over the second half of the season.

PECOTA Rest of Season Forecast

• 90th percentile: 323 PA – 45 R – 13 HR – 47 RBI – 6 SB – .301/.383/.515

• 50th percentile: 276 PA – 35 R – 10 HR – 37 RBI – 5 SB – .270/.348/.461

• 10th percentile: 229 PA – 27 R – 8 HR – 28 RBI – 4 SB – .239/.312/.408

Barring an injury, there is a solid chance that Harper eclipses the sacred 40-home run plateau. Even baking in a little regression, at this point in the season in a re-draft league, aside from Paul Goldschmidt or Mike Trout, there isn’t another hitter for whom you would deal Harper straight up.
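For reference, here is the simple arithmetic behind the 40-homer question, sketched in Python. It just adds PECOTA's rest-of-season home run percentiles to the 26 Harper has already hit. Treating the percentiles as additive to his current total is my assumption for illustration.

```python
# Home runs already hit, plus PECOTA's rest-of-season HR percentiles
# (both taken from the figures in this article).
hr_so_far = 26
ros_hr = {"10th": 8, "50th": 10, "90th": 13}

# ASSUMPTION: a full-season total is simply current HR plus the
# rest-of-season forecast at each percentile.
season_totals = {pct: hr_so_far + hr for pct, hr in ros_hr.items()}

# Home runs still needed to reach 40.
needed_for_40 = 40 - hr_so_far

for pct, total in season_totals.items():
    print(f"{pct} percentile finish: {total} HR")
print(f"Needs {needed_for_40} more to reach 40")
```

Taken at face value, PECOTA's rest-of-season 90th percentile leaves him one short of 40, so the case for clearing that plateau rests on believing he keeps hitting closer to his first-half rate than the forecast expects.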

British economist John Maynard Keynes famously said, “When the facts change, I change my mind. What do you do, sir?” He was likely referring to European economic conditions, but it has some relevance to Harper’s evolution as a hitter and the impact it has had on his statistics. We shouldn’t ignore PECOTA’s pre-season or rest of season forecasts. Don’t write them off as completely useless because Harper is a different hitter, but rather view them as indications of just how insanely good his performance has been, in relation to what PECOTA reasonably expected.

Harper slugged .477 as a rookie teenager and PECOTA only saw .515 as a best case scenario at age 22? Is "best case" a good way to describe 90th percentile or is it a bit below that? How often do players blow by the 90th percentile of projections?
I feel like I could write an entire article just responding to this question because it's a good one. I'm not sure there is a simple answer here; it's extremely complicated. Harper is a unique natural talent, so let's get that caveat out of the way. As I pointed out in the piece, he's made a number of significant adjustments as a hitter this season that PECOTA couldn't have possibly anticipated he would make given the statistical evidence we had coming into the year.

You have to factor in aging curves as well. Age 22 isn't when a player typically reaches his peak; it's around 27 for most players, historically speaking. Which brings us back to Harper being that once-in-a-generation-type talent who kind of defies an aging curve altogether. At the end of the day, yes, tremendous raw power potential as you pointed out, but difficulty staying healthy (which depresses playing time projection) and the backslide statistically at the plate over the previous two years (which may have been the result of injuries) limit what a projection system like PECOTA can forecast. It can't just anticipate that Harper will make adjustments (like he did) when there is no evidence of that in his profile coming into the season.

When PECOTA produces its 90th percentile projection, it isn't factoring in that Harper will double his walk rate and his slugging percentage against breaking and off-speed pitches and cut his strikeout rate by six percent. I think a better way to view 90th percentile than a "best case" scenario is to look at it from the sense that Harper's profile as a hitter has improved so dramatically that he's exceeded what PECOTA believed was possible in an outlier year. The percentiles are a range of outcomes, so it's fair to say that the 90th percentile would be a sort of best case scenario projection, but the likelihood of it happening is almost non-existent (unless you see legitimate improvement at the plate like Harper has made).
That explanation seems... inadequate. PECOTA's rest of season 90th percentile for Harper is basically identical to its preseason 90th percentile projection (.301/.383/.515 vs .304/.385/.513, and counting stats merely cut in half). He has 26 home runs at the break, and PECOTA gives him less than a 10% chance of breaking 40. In other words, despite what he's done in 2015 to date, PECOTA hasn't changed its mind about him at all on what his 'best case scenario' is. PECOTA is just dismissing Harper's first half as a complete fluke.

To me, that points to a flaw in PECOTA that should be investigated, or at least in how the ROS projections are being created. Harper's first half should have had *some* impact on the upper range of PECOTA's projections for him going forward, and it's had none.
I wouldn't call it a flaw in PECOTA. Projection systems aren't designed to account for dramatic changes in skill level during the pre-season (PECOTA comes out in March). We can all agree that Harper has made adjustments and improved dramatically as a hitter which PECOTA isn't going to forecast before the year.

Where I think you make an excellent point is that PECOTA isn't taking those into account when forecasting going forward. I'll have to look into it for you to see how often the rest of season projections are updated to reflect new information/performance.
"Projection systems aren't designed to account for dramatic changes in skill level during the pre-season"

There is nothing preventing a projection system from doing this properly; it's just that yours doesn't.
I'm confused how PECOTA's ROS 90th percentile is somehow a little WORSE in two of the three categories than its 90th percentile from before the season. It's somehow taking the additional information gained from his incredible first half and predicting his upside talent level is lower?!
It's not a significant enough difference in AVG or OBP to really matter. PECOTA doesn't take into account "upside". The percentile projections represent a range of potential outcomes and among those, the 90th percentile is seen as sort of an extreme outlier, but possible.

Given how off the charts Harper's first half performance has been compared to the rest of his career, PECOTA is (and should be) accounting for some regression. Whether or not that actually happens is another story altogether, which was kind of the overall tone of the piece. Has Harper changed so much that we should ignore the PECOTA forecast entirely or is it an indication of just how much he's outperforming what we should reasonably expect?
How can PECOTA miss so badly on Harper? That's easy. Because PECOTA is a joke! And so is STEAMER.
So what would you suggest for a projection system? Just curious...
great article george, and i think it speaks to the constant challenge that fans generally-- and fantasy owners specifically-- face in trying to weigh current production against projections. It seems crazy to think harper will maintain his current pace ROS-- that OPS is currently the highest by any player in the 00s not named Bonds or Sosa on the BBREF all time season leaderboards. There's half a season for him to regress, or half a season for him to continue proving the projections wrong.

these edge cases, where performance and projections are so far apart, are always learning experiences. Players like harper will likely constantly break projections because of what you indicated about his aging curve.

So as a fantasy owner the question is what do you do with players like this? The answer seems to be-- overpay on draft day based on the assumption the player lives up to expectation. the extra $1 for harper on auction day in march was certainly worth it in 2015, but not in 2014/2013.
Thanks! I'm glad you liked the piece and you make some great points. The main goal of this piece is to spark discussion about projections and how to handle outlier seasons like Harper. Let's just step back and acknowledge how good Harper has been compared to what one of the best projection systems out there in PECOTA reasonably forecasted coming into the year.
"The main goal of this piece is to spark discussion about projections"

And yet you didn't seem to think cracker73's "PECOTA is a joke!" added to the discussion. ;-)
I can understand why a projection system doesn't really capture a young player's breakout performance. I don't understand why it doesn't incorporate more of that breakout in projections going forward. The 50th %ile ROS is basically the same as it was when the season started. This is a skills breakout - not just a lucky streak. While luck/randomness can be expected to regress, skills don't.
You hit the nail on the head with the point about the rest of season projection. Unfortunately, I'm not 100% certain about the frequency with which the ROS projections are updated from when PECOTA is released in March. I'm going to bring it up with our PECOTA experts and see what they say. It's certainly a topic I will be writing about again.
How have you determined what part of his performance is skill based and what part 'luck'? Considering the wide range of actual performance around a talent level, I think you need more than half-a-season to make that judgment. Even if PECOTA is adjusting for current results, it would be foolhardy to have a system for all players that jumps in with both feet on a half-season breakout. It would have to average that against all prior similar players who had such breakouts, and to the extent no comparables are available, it will as a systems matter discount the breakout considerably. No one expects the Spanish Inquisition, even when it knocks on your door.
George, it does seem that PECOTA often "feels" low on top prospects, who often get that ranking because scouts see the possibility for the kind of adjustments Harper made this year. Does PECOTA factor in prospect ranking or other stand-ins for scouting and, if it doesn't, would that help?
Neither PECOTA (nor any other projection system for that matter) factors in the possibility of a dramatic skills breakout when it forecasts prospects. It simply uses all of the data available at the time of the projection to forecast performance going forward. It doesn't just assume that a huge breakthrough is coming.

You are correct that prospects are notoriously difficult to forecast because all we have to work with are minor league statistics, but PECOTA does use Major-league equivalencies to project how a player will perform. Another core part of the system is its career-path adjustments, which incorporate information about how comparable players' stats changed over time.

This is why using projection systems (statistics) along with scouting (knowing what adjustments to look for in order to project a breakout on the horizon) is so important. PECOTA gives us a baseline forecast to work with when it comes to evaluating players for fantasy purposes. The rest is up to traditional scouting.
Does anyone know what percent of players exceed their 90th percentile projection historically (YoY)? Based on how regression analysis works (and I don't really know so this is probably not even true) it should be like 1% right? It would be interesting to see all time Pecota busts/booms (including what Pecota would have projected during the golden era). I think y'all did a series last year that looked at Pecota outliers but can't remember exactly who wrote that.
I wish I had the answer as far as what percentage of players significantly outperform their 90th percentile PECOTA projection. I would assume that not only is it a small percentage, but there would be a huge correlation between legitimate skills growth (think J.D. Martinez or Corey Kluber type skills breakouts) and shattering PECOTA's expectations. Less of the Danny Santana or Josh Harrison (BABIP-driven) variety. What I can tell you is that there is a boatload of content on the PECOTA front coming your way very soon.
I mean, shouldn't it be 10%?

GoryMath :)
As bad as this miss is on Bryce Harper, nothing will ever compare to the miss on Matt Wieters.