
Believe it or not, most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.

Without further ado, let's kick off the series by extending a warm BP welcome to Jeff Sullivan. Jeff Sullivan has been writing about the Mariners since 2003, and has been running Lookout Landing since 2005. Additionally, he briefly ran Beyond The Box Score and serves as an editor at SBNation/MLB. He can be found in Oregon bars.

I’ve been granted the honor of going first in the brand-new ProGUESTus series, which is one of those things I probably find a lot cooler than any of you do. I don’t know why I was asked to lead off, since I was a pitcher in high school who wasn’t allowed to hit, but that’s more a problem with the manager, not me. That’s a baseball joke. All right, good start.

To much fanfare, Colin Wyers released the 2011 PECOTA projections earlier this week. PECOTA is probably the most well-publicized, well-known, and complex projection system on the planet, and so the date of its release is always one of the most exciting days in the early part of the year. When there isn’t any baseball, the best substitute is thinking about and analyzing the baseball to come, and PECOTA grants us that opportunity.

As soon as I heard the projections were available, I downloaded the spreadsheet and, like most other people with access to it, scrambled to find out what PECOTA thinks of Rajai Davis. (Not much.) This is where a lot of the fun with PECOTA lies. Once you get the numbers, you want to see what the system thinks about certain players. A common activity is to group the projections for players on your favorite team, figure out projected playing time, and then turn that into a projected record. That’s what we all care about, right? How many wins and losses our teams are going to end up with?

And that’s great. That’s PECOTA’s strength, and that’s PECOTA’s purpose. But the thing about PECOTA that I don’t think gets enough attention is the inclusion of player comparisons. As many of you know, the whole PECOTA system is built upon these comparisons, and for reader convenience, they’re included as part of the output. Scroll to the right of the 2011 spreadsheet and you’ll find a “Comparables” column. There you’ll see a selection of the players throughout baseball history to which the current player is being compared.

There’s fun to be had here. Jose Guillen as Julio Franco? Jason Bergmann as Shaun Marcum? Twenty-two different pitchers as Antonio Bastardo? The comps add some color and make the final product more entertaining. And they’re not without their analytical value, either. The strength of a given player’s comps, one figures, ought in theory to be directly related to the strength of a given player’s projection, and said strength, or confidence, or whatever you want to call it is shown here in parentheses after the comparable names.

So I’m here now to make an effort to get people to pay more attention to this part of PECOTA. That number in parentheses—Baseball Prospectus calls it “Similarity Index,” and it’s defined in the site glossary. The higher the number, the easier it is to find similar players with similar performances. The lower the number, the harder it is to find similar players with similar performances.

That’s good and sensible, but let’s go ahead and call it the "Ordinary Index" instead. Changing the name doesn’t change the meaning; it just makes it more descriptive. It follows, then, that the higher the number, the more ordinary the player, and the lower the number, the more extraordinary the player.

Who are the most ordinary and extraordinary players in baseball? Before, this would’ve been a difficult—if not impossible—question to answer. There are so many things to consider. But now we have an Ordinary Index. It’s right there in the name. All of a sudden, coming up with an answer to the question couldn’t be easier, and the results are shown below. I hope we can all learn a thing or two from this exercise.
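For anyone who would rather not eyeball the spreadsheet, the lookup itself is simple: group the players by role and level, then take the highest and lowest index in each group. Here is a minimal Python sketch, not the actual process used for this article. The row layout, the `extremes` helper, and the placeholder players are all assumptions made up for illustration; the real PECOTA spreadsheet's columns aren't reproduced here.

```python
from collections import defaultdict

# Hypothetical rows: (name, level, role, similarity_index).
# These are placeholders, not real PECOTA output.
rows = [
    ("Hitter A", "majors", "hitter", 87),
    ("Hitter B", "majors", "hitter", 43),
    ("Hitter C", "majors", "hitter", 61),
    ("Pitcher A", "majors", "pitcher", 91),
    ("Pitcher B", "majors", "pitcher", 91),
    ("Pitcher C", "majors", "pitcher", 32),
]


def extremes(rows):
    """Return the most ordinary and most extraordinary players per group.

    Groups are keyed by (role, level). Ties are kept, so each answer
    is a sorted list of names rather than a single name.
    """
    groups = defaultdict(list)
    for name, level, role, index in rows:
        groups[(role, level)].append((name, index))

    result = {}
    for key, players in groups.items():
        highest = max(index for _, index in players)
        lowest = min(index for _, index in players)
        result[key] = {
            # Highest index: easiest to find comparable players.
            "most_ordinary": sorted(n for n, i in players if i == highest),
            # Lowest index: hardest to find comparable players.
            "most_extraordinary": sorted(n for n, i in players if i == lowest),
        }
    return result


for group, answer in extremes(rows).items():
    print(group, answer)
```

Returning lists instead of single names is a deliberate choice: two players can share the same index, and a plain `max` would quietly drop one of them.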

—–

Most Ordinary Hitter, Minors: John Murphy (Ordinary Index: 88)

The Yankees’ second-round pick in the 2009 draft, Murphy is a right-handed catcher who last year posted a .703 OPS with A-ball Charleston. Ordinary from birth, his parents gave him the most common first name in the United States. Additionally, Baseball-Reference lists his middle name as “R.”, a sign that, while his parents understood the necessity of providing a middle name, they in no way intended to suggest that their son was in any way unique, so they opted for an all-encompassing initial.

Most Ordinary Hitter, Majors: Danny Worth (OI: 87)

Worth is a righty middle infielder who crawled his way up the ladder despite mediocre numbers and broke into the bigs with the Tigers last season. He’s fond of vanilla ice cream, the color blue, Jeff Dunham, hanging out with his friends, Bud Light commercials, and fast food french fries that are salty but not too salty.

Most Ordinary Pitcher, Minors: Mason Tobin (OI: 93)

Tobin has been drafted three times—once by the Braves in 2005, again by the Braves in 2006, and once by the Angels in 2007. The righty pitcher missed most of 2009 and all of 2010 after undergoing Tommy John surgery on his elbow. As a bullied teenager in middle school, all Tobin wanted was to be like everyone else. In time, he got his wish.

Most Ordinary Pitcher, Majors: Cesar Jimenez, Yhency Brazoban (Tie! OI: 91)

Former big league relievers Cesar Jimenez and Yhency Brazoban are so ordinary that they can’t even get a rank to themselves on a list. Jimenez is a lefty with an injury history and a high-80s fastball, and Brazoban is a righty with an injury history and a low-90s fastball. Oooh.

Most Extraordinary Hitter, Minors: Brandon Belt (OI: 57)

The Giants’ fifth-round draft choice in 2009, Belt broke into the professional ranks last season and, between A-ball San Jose, Double-A Richmond, and Triple-A Fresno, batted .352 with a 1.075 OPS. PECOTA had so much trouble finding comparable players that his third listed comp on the spreadsheet is someone named “Curt Blefary,” who I’m pretty sure didn’t exist.

Most Extraordinary Hitter, Majors: Ichiro Suzuki (OI: 43)

The perennial bane of any and every projection system, Ichiro has continued to excel despite repeated statistical assertions that he would be Matty Alou. Ichiro is so historically unusual that he once consulted his dog before signing a contract extension.

Most Extraordinary Pitcher, Minors: Tyler Matzek (OI: 47)

Matzek was selected 11th overall by the Rockies in 2009 and handed a large signing bonus. Fresh out of high school, the lefty reported to A-ball Asheville and struck out a batter per inning while allowing a ton of walks. PECOTA projects a six percent chance of improvement and a one percent chance of collapse, meaning that PECOTA projects a 93 percent chance that Matzek either stays the same or gets a little bit worse. Thus, Matzek is evidently so extraordinary that PECOTA seems to think he ages at twice the normal rate.

Most Extraordinary Pitcher, Majors: Winston Abreu (OI: 32)

Abreu narrowly beat out Craig Kimbrel and Jamie Moyer for the honor. He hasn’t thrown a pitch in the majors since July 2009, but what makes him extraordinary is that, over his last three years as a reliever in Triple-A, he’s struck out 241 batters in 158.2 innings. The other thing that makes him extraordinary is that he’s a 6-foot-2, approximately-20-pound Dominican with half the name of a British prime minister.
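For scale, that strikeout total can be turned into a rate. In box-score notation, "158.2 innings" means 158 and two-thirds innings, not 158.2 as a decimal, so the conversion needs one small step first. A quick Python sketch, with hypothetical helper names chosen for illustration:

```python
def innings_to_float(ip: str) -> float:
    """Convert baseball innings notation to a real number of innings.

    The digit after the dot counts outs (thirds of an inning),
    so "158.2" means 158 innings plus 2 outs.
    """
    whole, _, outs = ip.partition(".")
    thirds = int(outs) if outs else 0
    if thirds not in (0, 1, 2):
        raise ValueError(f"invalid innings notation: {ip!r}")
    return int(whole) + thirds / 3


def strikeouts_per_nine(strikeouts: int, ip: str) -> float:
    """Strikeouts per nine innings pitched."""
    return 9 * strikeouts / innings_to_float(ip)


# Figures quoted in the paragraph above.
print(round(strikeouts_per_nine(241, "158.2"), 2))  # 13.67
```

That is a bit more than a strikeout and a half per inning.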

—–

PECOTA, more than anything else, is a projection system, and like all the other projection systems, it’s popular for its statistical forecasts. But unlike all the other projection systems, PECOTA’s got more to it than a triple slash line. PECOTA’s an animal, and while you could simply remove the meat from its bones, doing so leaves so many other bits for which you could find a handy use.

Look at the player comps section. Savor the comps, and consider the Similarity Index (whatever you choose to call it). They’re included in part for your enjoyment. Enjoy them.


BurrRutledge
2/11
Welcome, ProGuestus Prime. A very entertaining read. Fun stuff.

And thank you, BP, for kicking off this column.
jimoneill
2/11
I'm not sure I understand the Curt Blefary reference, "pretty sure he didn't exist". Beyond the fact that a quick check of Baseball Reference would have revealed that Blefary did, indeed, exist, it would also have let you know that Blefary was AL ROY in 1965 and had a short but moderately successful stint in MLB, putting up power numbers during the offensively depressed years of the mid-'60s.

Sorry, but us older folk who know that baseball didn't start in 1990 expect a bit more.

Bodhizefa
2/11
It's supposed to be a joke. Y'know, 'cause the name sounds funny and made up?
yankeehater32
2/11
Jeff knew he was real, but couldn't escape the fact that he sounds more like a Pokemon than a baseball player.
dethwurm
2/11
Heh, Pokemon was the first thing I thought of too...
NYYanks826
2/11
Of course Curt Blefary is real. So were his teammates, Ron Sigglypuff and Martin Bikachu.
padresprof
2/11
Speaking for myself, unless you know for a fact that Curt Blefary legally changed his name to Curt Blefary, jokes made about a person's attribute that are usually beyond a person's control are in bad taste. While it is possible to change one's name (or skin color, etc.) it is exceedingly rare and often requires a herculean effort. I suspect there are writers at BP who can attest to being the target of jokes because of other uncontrollable attributes. I bet they do not find the jokes humorous.
padresprof
2/14
(-14 and expanding)...
Thanks for reminding me of the maturity level of too many of the readers of this website. I am amused.
crperry13
2/11
Nothing before 1990 was worth remembering.
briankopec
2/11
Would somebody please get this article off of my lawn!
jeffsullivan
2/11
I'm sorry, but there's no way that Curt Blefary ever existed. That's not even a last name.
JimmyJack
2/12
Let's blame it on Emma Span. Maybe her name is made up too. Or not. Both?
jimoneill
2/11
It's a generational thing, apparently. If Curt Blefary is an inherently funny name, what the hell is Yhency Brazoban?
crperry13
2/11
South American mixed drink - guava juice mixed with fine-grit sand, with a touch of everclear and a twist of irony.
leites
2/11
From the earlier thread, it seems clear that the way PECOTA makes its comps is badly broken, at least for minor league players, and for younger major leaguers such as Evan Longoria.

As Brian DewBerry-Jones pointed out, it seems odd that the current version of PECOTA seems to compare minor leaguers only to major leaguers, rather than to other minor leaguers. This, perhaps, is why so many current minor leaguers who are unlikely ever to rise as far as AA have as their top comps either Willie Mays or Mickey Mantle.
leites
2/11
Also, the current version of PECOTA does not seem to take the player's position or body type into account.

The comps for Carlos Santana are David Wright, Alvin Davis and Carl Yastrzemski. Players this year who have Gary Sheffield as a comp include Gordon Beckham, Bobby Abreu, Kosuke Fukudome, Shin-Soo Choo, Magglio Ordonez (who also has Stan Musial as a comp), Jacob Smolinski (who also has Ron Santo), and Carlos Beltran. Jay Austin's comps are Roberto Clemente, Robin Yount and Adrian Beltre.

Rich Poythress's top comps are Adrian Gonzalez, Prince Fielder, and Kent Hrbek. Clint Robinson's are Adrian Gonzalez, Joey Votto, and Ryan Garko. Gerald Sands' are Prince Fielder, Willie McCovey, and Mark McGwire. Matt Carpenter's top comp is Chipper Jones. Trent Oeltjen's comps are Carlos Beltran, Vada Pinson, and Roberto Clemente. Thomas Field's top comp is Tim Raines.
KrisM615
2/11
Limiting comparables to just players that play(ed) the same position would be limiting PECOTA's effectiveness. The larger the field of possible comps the system has to choose from, the more likely it'll find a good comp.

As for body type, it's rare for a player's listed weight to be accurate, so that wouldn't be of any help. Having someone go through baseball history and label every single player as "thin," "stocky," etc. would be subjective, and just weird.
leites
2/11
Kris - fair points. But right now, for players under the age of 25, PECOTA seems far less effective than common sense. I'd be happy to make a bet -- my own, completely non-scientific projections for those players vs. the PECOTA weighted means. Or, alternatively, PECOTA vs. some other standard projection system (e.g. Marcel) for the major-league sub-set of those players (such as Longoria or Upton).

Anyone feel confident enough about the current version of PECOTA to take me up on either of those offers?
eighteen
2/11
No, because I know of 2 systems that make better predictions than PECOTA.

That doesn't mean PECOTA has no value. As with any system, its value depends on the user's expertise.
leites
2/11
Eighteen - So, based on your expertise, can you give me an example of where you think PECOTA has unique value for a player under the age of 25?
eighteen
2/11
I think you misunderstood. I meant I would not take you up on those bets because I know PECOTA isn't the best projection system.

I'm not familiar enough with PECOTA to comment on what you perceive to be its limitations/shortcomings for players under 25 - but I'm certain those who are can ably defend its methodology and conclusions.
leites
2/11
I don't understand the logic here.

Let's take a particular example -- Matt Wieters's PECOTA projection from two years ago, which BP has admitted was based on faulty Davenport translations (as were all of PECOTA's minor league projections from the two minor leagues in which he played that year). How was that projection, based on faulty analysis, useful to anyone?

If, as I suspect, this year's PECOTA projections for younger players suffer from some similarly overarching flaw, how are those projections anything other than misleading?

leites
2/11
Or to take an example from this year: how is it useful to know that PECOTA thinks Rich Poythress is most similar to Adrian Gonzalez, Prince Fielder and Kent Hrbek? Does that raise your opinion of Poythress' abilities? Seriously?
tbwhite
2/11
Poythress is a great example. Last year his top 3 comps were:
Randall Butts, Charlie Smith, and Mark Merchant.

This year: Adrian Gonzalez, Prince Fielder, and Kent Hrbek.

In between he did go all .315/.381/.580 on the California League, but is that really that unusual for the Cal League? And it's not like he was young for the Cal League; he turned 23 in August.

There was very little data on him for 2009, and he was a 2nd round pick, but it feels like PECOTA is a touch excitable about guys in the minors.
cwyers
2/12
Poythress actually has a better forecast if I exclude the comps and use the generic aging curve alone than if I use the aging curve you see in PECOTA (which combines the generic aging curve with the comps-based curve). Not by a whole lot, mind you - something like 20 points of OPS. But it's not like the comps are inflating his projection any. He's forecast as a touch below replacement level anyway.

(I'm writing up a blog post that explains a bit more about how we come up with the comps and the resulting age curves right now.)
leites
2/12
Colin - I look forward to reading your write-up. I hope you'll explain how the age curve affects younger major leaguers such as Longoria as well. It seems counter-intuitive that even the most talented young major league hitters, with several years of high performance, are projected to significantly decline as they reach age 25. In that respect, this year's PECOTA looks very different from the years when Nate did it.
philly604
2/12
I look forward to reading that as I also noticed the high number of seemingly pointless HoF comps for unremarkable minor leaguers.

It seems to me where we're heading with Colin's increasingly transparent explanations of what PECOTA does and doesn't do is that a lot of its supposedly differentiating features like the comps simply don't matter very much at all.

If changing a player's comps from Butts, Smith, Merchant to Gonzalez, Fielder, Hrbek doesn't make much of a difference, then it's hard to buy the argument that the entire comps system makes much of a difference. As a result, what was once portrayed as a key feature could be revealed as nothing more than pretty bells and whistles.

We'll see though.
tbwhite
2/12
This is where the comp stuff gets so confusing.

We have a 1B who, at age 22, played well in the Calif. League and is projected as a below replacement level guy if he played in the majors at age 23.

At age 22, Hrbek hit .301/.363/.485 IN THE MAJORS in over 500 AB's.

Presumably there is some growth built into Poythress's 2011 projection, which means if he is supposed to be below replacement level in 2011, he probably was in 2010 as well. So, how on earth can Poythress, a below replacement level guy, have as one of his best comps Hrbek, who was an All-Star and 2nd in Rookie of the Year voting at the same age?

Fielder, by the way, hit .271/.347/.483 at age 22 in the majors. Gonzalez posted an .821 OPS in AAA at age 22. I don't think either would be classified as below replacement level.


lucastate
2/12
Based on its handling (in comps and projections) of older minor leaguers, young major leaguers, and a few other subsets of players, I will assume until corrected that PECOTA is currently broken. Look forward to it being fixed -- hopefully in tandem with PFM's release.
leites
2/12
Yes. I think the idea of PECOTA is wonderful, and I am rooting for it to be successful. But the devil is in the details, and I fear that with such a system there are very many things that can go badly wrong.
tbwhite
2/12
I'm skeptical at the moment about PECOTA based on some of the comps, but I think it's premature to say that it is broken. If you focus on extreme observations (and minor leaguers, by virtue of their shorter track record, are the extreme observations), it's often fairly easy with a quantitative model to pick out unintuitive results. Most times those unintuitive results can be explained pretty easily. That doesn't make the results "right", but it often deepens your understanding of how the model works, and helps you make better use of it. All of which is a long-winded way of saying to Colin that yes, it would be wonderful if you could write up a couple of case studies.

Oh, one other thing: my understanding is that the comps only affect the arc of the player's career. Given that we only have the weighted means for 2011, it seems unlikely that the comps would be having a HUGE impact on the results. IF there is a problem, my guess is that it would relate to the way the minor league translations are being done. For example, what would happen if the minor league run environment didn't change, but the ML did (which it has in recent years)? How would that affect translations of minor league stats?
aquavator44
2/13
I think stating that PECOTA is broken, when all you have is the weighted mean and the top three comparables (out of more than twenty), is kind of hasty. The best thing about PECOTA is that it gives a range of possibilities. It seems ridiculous to me to take the weighted means spreadsheet as anything other than a sneak preview.
lucastate
2/13
I didn't "state" that it is broken; I said that based on the primary results produced thus far, I will assume until corrected that it is broken. The strange downturns expected from most young stars, and hall-of-fame comps for average minor leaguers, are at this point the best we have to go on -- and a preliminary indicator that leads me to assume PECOTA is broken, as it doesn't appear to be in line with the results PECOTA produced when it was working.

That's not a statement that it is broken -- on the contrary, I would love for Colin to engage in this thread, or with jrmayne in the other thread, and explain why the hall-of-fame comps are correct, why Kila, Dan Johnson, and others are going to be huge this year, and why it's more likely that young stars like Longoria will regress with age rather than grow. For the same reason that it was useful to discuss the Wieters and McLouth predictions (as well as the Colby Lewis predictions), and what did and didn't make sense about them, it will be useful to engage in that conversation this time around -- as I am sure that Colin will do, once he's finished with the hard work he is putting in now to meet subscriber requests for player cards, PFM, etc.

If PECOTA's not broken, but has just reached a new level of insight, I'll be the first guy cheering as I ride to the league title with Dan Johnson at 1B and Kila hitting DH. I'd just like to understand if BP's really confident that those results are intentional, and not the result of something being broken in PECOTA.