BP Unfiltered: Comparables and Upside

March 29, 2011

We've just rolled out some updates to the player cards:

We've added Top N Comparables to the player cards. Any player who has a PECOTA projection should have comparables. We generated 100 and then dropped the ones who didn't play the next year. The Similarity Score is back, and Similarity Index returns as well.

We've had a lot of questions about comparables this year beyond "When will they be here?" Many of the comparables in Baseball Prospectus 2011 that look a little different than they have historically can be traced to our decision to use only major-league comps for players. There are only so many 21-year-old catchers in the major leagues, for example, and they often turn out to be pretty good players, hence the problem in generating comps with this strategy. We'll be reviewing this policy going forward.

The comparables are presented similarly to last year's, except that we've made them clickable. This feature is available to Premium subscribers only (except for San Francisco Giants player cards; see Freddy Sanchez and Tim Lincecum for examples).

We've added Upside By Year to the player cards. Anyone who has a PECOTA projection should have an UPSIDE score listed. We computed UPSIDE by adhering closely to the glossary definition, but we broke it out per year, as we started doing in 2010. UPSIDE is based on major league runs above average for each comparable player, and this year we were able to run it over a player's top 200 comparables. This feature is available to Premium subscribers only (except for San Francisco Giants player cards; see Pat Burrell and Barry Zito for examples).
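
As a rough illustration of the per-year breakdown described above, here is a minimal Python sketch. It is not the production PECOTA code. The function name, the data layout, and the two scoring rules (a below-average season counts as zero, and a comparable with no season in a given year counts as zero) are assumptions made for the example.

```python
from typing import Dict, List


def upside_by_year(comps: List[Dict[int, float]], years: List[int]) -> Dict[int, float]:
    """Average the above-average runs of a player's comparables, year by year.

    comps: one dict per comparable player, mapping a season to that
           comparable's major league runs above average (hypothetical layout).
    years: the seasons to report.
    """
    result: Dict[int, float] = {}
    for year in years:
        total = 0.0
        for comp in comps:
            # Assumption: a comparable with no season that year contributes zero.
            raa = comp.get(year, 0.0)
            # Assumption: below-average seasons are floored at zero.
            total += max(raa, 0.0)
        result[year] = total / len(comps) if comps else 0.0
    return result


# Two made-up comparables, reported over two seasons.
example_comps = [{2011: 10.0, 2012: -5.0}, {2011: 20.0}]
example = upside_by_year(example_comps, [2011, 2012])
```

With these made-up inputs, 2011 averages the two positive seasons and 2012 comes out to zero, since one comparable was below average and the other has no season listed.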

We've also made some progress on the remaining components of the projections:

The ten-year projections were supposed to be released last week; apologies again for the delay. We saw the opportunity to release the comparables and Upside first. I'm going to try to refrain from forward-looking statements of any specificity, but they are our top priority.
We are planning on releasing revised MORP in a few days.

While we're on the subject of the Fantasy product, I wanted to clear up something we've been seeing in comment threads and email and plainly haven't been doing a good job of communicating. We have not released a PECOTA Weighted Means Spreadsheet since the initial release. In previous seasons, we'd run PECOTA updates during the preseason either to add players whom we hadn't projected originally (such as a fringe minor leaguer going bananas in spring training and pushing his way into the picture on the major-league team) or because we were tinkering with the settings. The former was a requirement only when we were running in Excel; in the last couple of seasons, we've simply generated projections for everyone (roughly 6000 projections per year). As for the latter, we haven't touched PECOTA's inner workings since we released the original WMS.

We decided not to include the projected playing time in the WMS this year for a couple of reasons. First, it wasn't there originally, and we heard from people who were confused when we added it. Second, we had a lot of higher-priority issues to work on, so we decided to add a button in the PFM that subscribers could click to download all of the data in CSV or TAB format. Our intention was that users who wanted the playing-time-projected numbers could download them there at any time.

We've changed around virtually all of the other statistical processes on the site, and we've been squashing some bugs as we've found them in some of the fantasy products. For example, we recently had too many wins listed for Seattle pitchers on the depth charts and in the PFM. We've resolved most of those issues, and we've also added checks to ensure that they don't re-occur wherever we've been able to do so. We're aware of the issue with Trevor Hoffman's 2007 WARP scores not matching in the "Standard" and "Recent Performance" tables on his player card, and we'll be addressing that shortly.

Thanks again for your continued patience. More is coming soon.

Dave Pease

rawagman

3/29

Dave - thanks for the updates. Maybe you can explain how Travis Snider gets a 0% chance of breaking out? With his age and previous playing time limitations, wouldn't he be a prime candidate for breaking out? If PECOTA said 2%, I wouldn't have thought much about it, but 0% just doesn't smell right.

denny187

3/30

Why is this question never answered by one of the authors? It's like they KNOW something is screwed up with the Improve/Breakout rates, it won't be fixed, and they are sweeping it under the rug.

dpease

3/31

I'll get an answer on this.

ScottBehson

3/29

David Wright hit 29 home runs last year. His 90% Pecota projection is for 24, and his mean projection is for 21. Que?

jessehoffins

3/29

also, at the same time his upside is like, top 3 leaguewide. only miggy is in his range.

His comps seem to suggest way more than 21 dingers.

markpadden

3/31

That's just a poor calculation of variance in a projection. No player should have an 80% chance to finish within 3 HR of his projection...

cwyers

3/31

That's not what the forecast line is saying.

The percentile forecasts are based on the key "total value" rate stat, in this case True Average. What the percentile line is saying is that his expected True Average (given the expected quantity of PA - if you lower the projected PA, the range of the percentiles increases) should be within his 90th and 10th percentile 80 percent of the time.

What the component batting line at each percentile represents is ONE way for a player to achieve the forecast TAv at that percentile.

Wright is actually a great historic example of this - he hits 10 dingers one year, 29 the next, and his TAv moves five points between the two seasons. Yeah, there's a 52 PA difference between the two, but that's not going to give you 19 extra HR at the same TAv. What we see is that the wild swing in HR was offset by difference in walks, doubles and singles (it's a 24 point swing in batting average). The variance for each component of the batting line is higher than the variance of the total value of the batting line, because the variance of individual components will cancel out to some extent when you aggregate.

In Wright's particular case, if I look at the raw components going into his forecast and look ONLY at the variance in home runs on contact, the range between his 50th and 90th percentile forecast is something like six home runs, given his expected opportunities. Now obviously it's a bit more complicated than that - you also have to factor in the variance of his walk and strikeout rates as well. (Not just in terms of numbers of opportunities, but how the number of opportunities affects the variance of the home runs per contact.)
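
The point about component variance canceling out in the aggregate can be seen in a small Python sketch. The numbers below are invented purely for illustration (they are not Wright's or any real player's data): five hypothetical seasons in which the run value from home runs and the run value from everything else move in opposite directions.

```python
from statistics import pvariance

# Hypothetical run values relative to a player's own average, by season.
# Invented numbers: when the home run component is up, the rest of the
# batting line tends to be down, and vice versa.
hr_runs = [-6.0, -2.0, 0.0, 2.0, 6.0]
other_runs = [5.0, 1.0, 0.0, -1.0, -5.0]

# Total value of the batting line in each season.
total_runs = [h + o for h, o in zip(hr_runs, other_runs)]

var_hr = pvariance(hr_runs)        # spread of the home run component
var_other = pvariance(other_runs)  # spread of the remaining components
var_total = pvariance(total_runs)  # spread of the total value
```

Because the two components are negatively correlated in this toy data, the variance of the total is far smaller than the variance of either component, which is the general identity Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) with a negative covariance term.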

BurrRutledge

3/29

Dave, the lists at the bottom of the player pages that link back to BP articles have not been showing up in IE for me since this weekend. Not sure if it's just me.

Rest of the cards look great.

dpease

3/31

Which version of IE? Thanks for the note--we'll get it fixed.

acammarano56

3/29

Thank you for continuing to address this. Do you plan on placing the "skills" graph (used to be at the top right of a player's card) and the "star/scrub" graph (used to be towards the bottom on left) back into the player cards? I felt these two pieces of data gave a very good and very quick tell on what PECOTA thought of the player, where it felt that player's strengths are and what it felt was the ceiling for the potential of that player. All with a quick glance at these two charts. It also distinguished BP from a lot of the other data out on the internet. This was unique and valuable IMHO. Thanks again.

dpease

3/31

you bet--more graphs are coming to the cards. If anyone's having any trouble seeing the WARP and fantasy graphs on a current player's card, we'd love to hear from you. Thanks!

leites

3/29

Dave - Thanks for providing a response to the questions about comps for younger players. But I still have a more general, and perhaps related, concern about the projections for those younger players. For players 25 and younger that already have big league track records, the projections seem so low as to be at odds with everything I thought I knew about growth curves -- for instance, Evan Longoria projected to do worse this year than he did in any of the previous three years in the majors, even though he is approaching his peak performance years in terms of age. Absent some explanation for these projections, I have simply ignored the PECOTA forecasts for younger players, and suspect that some other readers may be doing the same.

leites

3/29

The Travis Snider example, cited by Wagman, seems like another example of this.

jrmayne

3/30

Check out Mike Trout's upside. The calculation of the metric is quite wrong, or the projections are seriously wrong.

I'll go out on a limb and say that Mike Trout's 10-year upside exceeds that of Orlando Cabrera, even if Pecota disagrees.

dpease

3/31

We're looking at Trout--I agree those numbers don't look right. We're not listing any comps for him either which is another part of the problem. (Of course, he's not the easiest player to find comps for.)

jrmayne

3/31

Thanks, Dave.

It's not just Trout, though Trout looks to be the most severe case. Wil Myers, who has great comps, has terrible upside ratings, too.

Colin said more than once that minor league players would get more minor league comps, and instead the use of only MLB comps for MiLB players has appeared to foul up the projections short- and long-term for young and old. (John Bowker's first player card comp is Will Clark, and his long-term upside exceeds all actual prospects, I think. His 10-year upside, for instance, exceeds that of Trout, Montero, Freeman, and Hosmer combined. By a lot.)

You've made some helpful statements on these points (as here) and in my view it's important that you continue to concede the errors that have been baked in this particular cake. Fixing this for 2012 ought to be prioritized.

--JRM

markpadden

3/31

Why on earth would you restrict a minor league player's comps. (and thus his Upside calc.) to major league players? That defeats the point of using comps. for prospects.

I fear the 10-year projections for prospects (probably the single most valuable output of this whole process) will be ruined by this policy.

dpease

3/31

This was something we did for the book, but we haven't done it for the website.

tbwhite

3/29

Were the comps re-run between the release of the weighted-means and now? I looked at a few of the interesting comps from the weighted means release and there are big differences. For example, Jaff Decker's top 3 comps were Carlos May, Curt Blefary and Willie Mays in the weighted means spreadsheet. Now, May and Blefary are gone from the list of comps. Both were still major leaguers, so they wouldn't have been dropped because of that. Willie Mays is now in a tie for the 6th best comparable with Decker, instead of 3rd as he was in the spreadsheet. Also interesting is that many of Decker's comps appear to be speedy centerfielders like Mantle, Mays, Vada Pinson, and the Brothers Upton. It would be interesting to know how the comps factor in position, especially for older players where the data only shows they played OF, but not where in the OF.

Everett Williams has no comps on his player card, while Mickey Mantle was his top comp on the weighted means spreadsheet.

Here's a fun one: Nick Franklin had top 3 comps of Adrian Beltre, Willie Mays and Hank Aaron. His new top 3 comps are Tim Foli, Ed Brinkman and Robin Yount. Consequently, the player compared to Hall of Famers now looks like he is projected to be out of the majors by 2015 (since he has no upside beyond 2014, and even an old guy like Bobby Abreu has upside out to 2018).

Is there any comment on the projection for Longoria to have the worst year of his career?

scothughes

3/29

Any particular reason to use a new nomenclature for league/level on the player cards? "1A" "1B" "1C" "3A" etc seems to be almost intentionally opaque. Why not use the standard designations? Sure, maybe you need 1 more character to fit LoA, HiA, AAA, but it'd add clarity.

dpease

3/31

I'm not sure, but I'll check and re-reply.

shmooville

3/29

So is there a way to filter all of the upside scores on a spreadsheet now? I understand not wanting to re-release an entire pecota spreadsheet update, and while I'm glad Upside is back in some form, the value to me was to filter out and highlight specific players with the most and note that on my draft list value page. I can't go through every player card individually to get there. This is just another method I used to use to find that rare talent early that I may have overlooked for keeper leagues. It looks like it is too late this year to get it back, but next year I'd appreciate it if you could go back to providing that on the spreadsheet.

dpease

3/29

hi there,

1-year UPSIDE is available in the PFM spreadsheet. Please have a look. That's different than classic UPSIDE, of course, but rather than exactly match the former definition we wanted to look at the question that UPSIDE is trying to answer and see if we've learned anything we can apply in the last few years.

shmooville

3/30

Thanks Dave for the heads up on this. There certainly is utility in this, however, and I've commented on this in the past, in order to get those upside figures into the spreadsheet; the PFM uses them to calculate values. Now this is certainly less of an issue than in the past when the Upside was a full ten year calculation and you would get minor league catchers showing actual value etc.... However, and maybe this is just me, but I liked to use the Upside figures as a side note versus an actual part of the calculation. For example, I'm preparing for a late draft for this weekend; without using Upside David Wright is valued at $34.81 by the PFM, but using Upside in the calculation his value shoots up to $41.16 and his position moves up 9 spots in terms of relative value. I.e., it has a meaningful effect. Now I can work around this using a spreadsheet for now with some manual entering of data, but it would really be great if next year you could add Display Only columns to the PFM output. I liked to use things like Upside and what used to be Beta etc. as notes. Every draft is different, but there are always times when you have to choose between players that may be going at a similar price, or maybe inflation is high that year so you have to choose who to overpay for at a given position etc... I also liked to use it on those dollar derby players for the future. Who does pecota think down the road might be a better potential keeper and worthy of a next year type investment in a keeper league? The old Upside figure was better for that. Anyway, that's enough spiel from me on this for this year. Thanks for the heads up.

dpease

3/31

hi there, you can use the "Raw CSV Data" or "Raw TAB Data" orange buttons at the top of either the PFM or the depth chart home page to download all the raw playing-time-adjusted data that the PFM uses. Sorry for the lack of clarity on that.
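
For readers who want to sort and filter a downloaded file themselves, here is a minimal Python sketch. It assumes the download is an ordinary comma-separated file with a header row. The column names (`NAME`, `POS`, `UPSIDE`) and the sample rows are hypothetical, so check the real header before using anything like this.

```python
import csv
import io


def top_by_column(csv_text: str, column: str, n: int = 3) -> list:
    """Return the n rows with the largest numeric value in `column`.

    Rows where the column is blank or missing are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [r for r in reader if (r.get(column) or "").strip()]
    rows.sort(key=lambda r: float(r[column]), reverse=True)
    return rows[:n]


# Made-up sample standing in for a downloaded file; the real column
# names and values will differ.
sample = """NAME,POS,UPSIDE
Player A,3B,41.2
Player B,SS,
Player C,OF,55.0
Player D,C,12.5
"""

top = top_by_column(sample, "UPSIDE", n=2)
names = [row["NAME"] for row in top]
```

To use a real download, read the file's contents into a string (for example with `open(path, newline="").read()`) and pass it in place of `sample`. For the TAB version, `csv.DictReader` accepts `delimiter="\t"`.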

markpadden

3/31

Add a col. for the 10-year upside, and put it on the card as well.

jrbdmb

3/29

Of course the issue with using the PFM is that only players that are projected to get playing time in the majors are included.

The only place to get consolidated info on minor league prospects is the WMS, and as noted above multiple times there *do* seem to have been updates to player data (at least as reflected in the player cards).

Also, the case of Nick Franklin is interesting - as noted above his comparables changed quite dramatically between the WMS and his current player card, but his projected stats for 2011 remain essentially unchanged. ???

Oleoay

3/29

I kind of wonder how, every so often, a comparable will show up for a 20 year old when that comparable was not in the minors or majors at age 20.

dpease

3/29

Really? Like who?

Oleoay

3/30

I'll flip through the annual tomorrow and find some examples.

dpease

3/31

Oh, in the annual--I thought you meant the current cards.

Oleoay

3/31

Nope, just meant the annual.

mwright

3/29

Thanks for the update. I hope the process has been cleaned up enough to roll out the "upside" figures sooner in future years and also include it on the spreadsheet. Like shmooville, it was one of the factors I used to sort on. Since my drafts have now already occurred it is almost of no use to me for this year except for the occasional trade evaluation.

zstine1

3/29

top 14 in NL in 2011 upside:
david wright, pujols, fielder, votto, braun, holliday, utley, ramirez, uggla, ethier, bowker, beltran, soto, CarGo,

which name does not belong? what is with pecota's continued love affair with bowker? since he is within a week of my age, i appreciate that pecota thinks he is still young enough to have my upside, but pecota is getting like a stalker with him.

rbross

3/30

Glad to see this stuff is up and running.

Do you plan to put the 10-year UPSIDE figures in the PFM? The "question" I believe UPSIDE best answers is this: what is the potential quantitative value of a prospect in relation to other prospects? This is a measurement that is virtually unique to Baseball Prospectus. In my eyes, it's the greatest feature BP has to offer. It's what distinguishes it from Baseball America, Fangraphs, etc. But with just one-year figures available for comparable purposes, UPSIDE loses its applicability to prospects.

Moreover, is BP able to guarantee that these figures (in whatever form) will be rolled out about a month and a half sooner next year? As many people have already noted, they are practically worthless for fantasy players, having been issued at such a late date.

My subscription expires in a couple weeks and I'm thinking seriously about not resubscribing for the first time in three or four years.

leites

3/30

Bob - the 10-year UPSIDE figures are a wonderful idea, but I suspect that the current version of PECOTA would make incredibly gloomy predictions, given that it thinks young players such as Travis Snider have no breakout potential. My guess is, given the current configuration it would predict an "upside" of Longoria, Snider, Stanton, Alvarez, J. Upton, etc. all being out of baseball before they hit the age of 30.

dpease

3/31

We could put 10-yr UPSIDE in the PFM, yes... what we did is compute UPSIDE fairly close to the original definition (though it's by year, not 5-year) but what I'd really like is for us to figure out the best way to answer the question UPSIDE was designed for with the information we now have available to us.

dpease

3/31

Sorry to reply to myself but in answer to your question about availability, we're finishing recapturing all of these processes. That's what took so long this year but it's all going to be exactly the same next year (other than any improvements we make in the interim, but we're not going to rewrite it again) so we should finally be quite early with our data.

markpadden

3/31

As stated by another user, Upside should be listed as a 5-year or 10-year total value -- both in the Pecota cards and in the PFM spreadsheet.

markpadden

3/31

Projected Playing Time (with projected stats) should be moved to the very top of the cards. Right now, if you want to know how many ABs BP is projecting for a given player, you have to scroll down 2 pages.

dpease

3/31

PA has been added to the bio box for the projection. We'll get those projections higher on the cards as well.
