
In case you hadn’t noticed, it’s been a roller-coaster season for Alex Rodriguez. Between steroid revelations, hip surgery, a .219 batting average on balls in play, an inflated walk rate, and a recent eight-game, five-homer tear, he’s done plenty to confound expectations, both good and bad. Yet the ever-controversial 33-year-old slugger’s .313 Equivalent Average through the first 81 games is just two points off his PECOTA weighted mean projection of .311.*

Rodriguez’s is hardly the only on-the-nose projection our system has had halfway into the season. Of the 227 players with at least 200 plate appearances through Sunday (the schedule’s official midpoint), 97 are within 15 points of their PECOTA weighted mean Equivalent Averages. The variations are roughly normally distributed, with 154 players (68 percent) within 29 points, or one standard deviation, of their projections, and 213 (94 percent) within two standard deviations. Here’s a non-random selection of players within 15 points either way:


Player           Team      PA   Actual Projected Diff.
Justin Upton     D'backs  325   .302    .287     .015
Matt Kemp        Dodgers  336   .302    .290     .012
Robinson Cano    Yankees  347   .271    .264     .007
Jacoby Ellsbury  Red Sox  338   .275    .270     .005
Jason Bay        Red Sox  348   .299    .295     .004
David Wright     Mets     353   .325    .323     .002
Hanley Ramirez   Marlins  336   .326    .324     .002
Alex Rodriguez   Yankees  221   .313    .311     .002
Mark Teixeira    Yankees  356   .309    .308     .001
Miguel Cabrera   Tigers   330   .306    .308    -.002
Ken Griffey Jr.  Mariners 250   .265    .272    -.007
Emilio Bonifacio Marlins  347   .223    .233    -.010
Dustin Pedroia   Red Sox  365   .270    .284    -.014

This list is simply a baker’s dozen of players (mostly from the East Coast) who have been surrounded by lofty and in some cases unreasonable expectations. We’ve got three of the game’s six highest-paid hitters (Rodriguez, Teixeira, and Cabrera), the reigning AL MVP (Pedroia), the whipping boy of Queens (Wright), a prodigal son returned (Griffey), a 21-year-old phenom (Upton), arguably the game’s best all-around player (Ramirez), a horrible idea for a leadoff man (Bonifacio), and a few others who frequent conversations in the Northeast corridor. Despite the varying shapes of performance hidden by EqA, they’re all about as productive as PECOTA, if not the chattering classes, expected.
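For readers who want to test the normality claim about the 227 qualifying hitters, here is a minimal sketch in Python. It compares the counts quoted in this article with the shares a true normal curve would put inside the same cutoffs. The function name `normal_share_within` is my own, and treating the 29-point standard deviation as exact is an assumption made for illustration.

```python
import math

def normal_share_within(points, sd):
    """Share of a normal distribution lying within +/- `points` of its mean."""
    return math.erf(points / (sd * math.sqrt(2)))

PLAYERS = 227  # hitters with at least 200 PA (from the article)
SD = 29        # one standard deviation, in points of EqA (from the article)

# (cutoff in points of EqA, players observed inside it), all from the article
observed = [(15, 97), (SD, 154), (2 * SD, 213)]

for cutoff, count in observed:
    actual = count / PLAYERS
    expected = normal_share_within(cutoff, SD)
    print(f"within {cutoff:2d} points: observed {actual:.1%}, normal curve {expected:.1%}")
```

The observed shares come out at about 43, 68, and 94 percent, against roughly 40, 68, and 95 percent for a textbook normal curve. That is close enough to support the description, though it is only a quick check and not a formal test of normality.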

Turning to the extremes, here are the players exceeding their projections by the widest margin:


Player           Team       PA   Actual Projected Diff.
Jason Bartlett   Rays      248   .334    .243     .091
Ben Zobrist      Rays      270   .331    .259     .072
Joe Mauer        Twins     256   .370    .298     .072
Joey Votto       Reds      205   .355    .298     .057
Ichiro Suzuki    Mariners  342   .312    .258     .054
Adrian Gonzalez  Padres    349   .346    .295     .051
Raul Ibañez      Phillies  280   .336    .286     .050
Prince Fielder   Brewers   362   .355    .305     .050
Adam Lind        Blue Jays 354   .319    .270     .049
Pablo Sandoval   Giants    307   .318    .269     .049
Kendry Morales   Angels    308   .287    .240     .047
Gary Sheffield   Mets      218   .318    .271     .047

By and large, that’s a youngish group, with an average age (weighted by PA) of 28.5. Sandoval, at 22 the baby of the group, is in his first full season, as is the 25-year-old Lind. The latter is one of eight here between the prime ages of 25 and 29, while Ichiro (35), Ibañez (37), and Sheffield (40) are the only over-30s. Ichiro and Sheffield are the only two here for whom this year’s performance wouldn’t be a career high (and by a wide margin, at that), and both are particularly confounding age expectations. PECOTA really ought to know better regarding the former; year after year it takes Suzuki’s high BABIPs as a fluke, but this year’s .383 mark would be his fourth season out of nine above .370. As for Sheff, he’s defiantly rebounded from last year’s 43-point shortfall (the second-largest among those with 400 PA) and a springtime release by the Tigers to become arguably the second-best player in the decimated Mets’ lineup. His EqA is a ringer for his .315 career mark.

Seven players from this hot-hitting dozen were named to their respective leagues’ All-Star teams on Sunday, with Ibañez, Mauer, and Ichiro voted into the starting lineups, and Bartlett, Fielder, Gonzalez, and Zobrist named as reserves. The two first basemen are proven commodities, good players having great years, while the two Rays infielders are less defensible selections given their short track records; their 2009 performances look fairly fluky. Ibañez had parlayed a more favorable ballpark, an easier league, and some good luck on fly balls to put together a career-year start before suffering a groin strain three weeks ago; along with Mauer, Votto, and Bartlett, he’s helped at least somewhat by the smaller sample size created by his DL stint. Morales, dismissed in this space in February as sub-replacement fodder, has turned out to be a relative asset; perhaps the difficulty of translating Cuban stats and the extremes of Salt Lake City have confounded PECOTA.
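The "weighted by PA" average age cited for this group can be sketched as follows. Only five of the 12 ages are stated in the text, so this example uses just those five (with plate appearances from the table) and will not reproduce the 28.5 figure; the function name `pa_weighted_mean_age` is hypothetical.

```python
# (player, age, plate appearances): ages as stated in the text, PA from the table.
# Only the five overachievers whose ages are given are included here.
known = [
    ("Pablo Sandoval", 22, 307),
    ("Adam Lind", 25, 354),
    ("Ichiro Suzuki", 35, 342),
    ("Raul Ibañez", 37, 280),
    ("Gary Sheffield", 40, 218),
]

def pa_weighted_mean_age(rows):
    """Average age in which each player counts in proportion to his PA."""
    total_pa = sum(pa for _, _, pa in rows)
    return sum(age * pa for _, age, pa in rows) / total_pa

simple = sum(age for _, age, _ in known) / len(known)
weighted = pa_weighted_mean_age(known)

print(f"simple mean age:      {simple:.1f}")
print(f"PA-weighted mean age: {weighted:.1f}")
```

For this subset the weighted figure comes in a bit below the simple mean, because the two youngest players have more plate appearances than the two oldest. Weighting by PA keeps a part-timer or a player back from the DL from counting as much as an everyday regular.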

To the underperformers:


Player            Team       PA   Actual Projected  Diff.
Brian Giles       Padres    253    .196   .290     -.094
Jimmy Rollins     Phillies  347    .216   .288     -.072
Bill Hall         Brewers   203    .204   .270     -.066
Garrett Atkins    Rockies   255    .222   .286     -.064
Kelly Johnson     Braves    263    .229   .290     -.061
Elijah Dukes      Nationals 211    .242   .302     -.060
Dioner Navarro    Rays      252    .199   .256     -.057
Alfonso Soriano   Cubs      348    .241   .294     -.053
David Ortiz       Red Sox   311    .248   .297     -.049
Kevin Kouzmanoff  Padres    317    .234   .282     -.048
Chris Young       D'backs   270    .229   .277     -.048
Magglio Ordoñez   Tigers    283    .245   .291     -.046
Rick Ankiel       Cardinals 227    .234   .280     -.046
Eric Byrnes       D'backs   210    .214   .260     -.046

As you’d expect, this group is older than the previous one, though at 29.8 years, not by much. Six of these 14 players are over 30, headed by the looking-quite-cooked Ordoñez (35) and Giles (38). The rest are between 25 and 29, with those younger than that likely to have been sent down to the minors. Johnson and Kouzmanoff counter Gonzalez when it comes to the age-27 phenomenon.

These players were expected to be solid contributors, with Navarro and Byrnes the only ones projected for EqAs below .270. It’s unclear whether injuries are a factor; several of these players may be at less than 100 percent, but only Ankiel, Byrnes, Dukes, and Giles have served DL stints this year, mostly following poor performances rather than preceding them. It’s possible that Byrnes, Dukes, Navarro, Ordoñez, Ortiz, and Soriano, all denizens of the DL last year, may be dealing with extensions of older woes, though Ortiz has notably come around of late. Elsewhere, we’ve got slumps galore, including the pull-happy Rollins, the righty-shy Hall, and the sub-Mendozoid Young.

Note that while luck on balls in play contributes to over- or underperformance, it doesn’t explain the entirety of these lists. Overachievers Votto, Mauer, Bartlett, Suzuki, and Sandoval accompany the on-target Wright, Kemp, Ramirez, and Upton in the BABIP top 20. Underachievers Giles, Byrnes, Rollins, Navarro, Johnson, Atkins, and Young join Griffey, Rodriguez, and the overachieving Gonzalez in the bottom 20.

In any event, as the sample sizes increase, bank on these extreme performances to regress, and the projections to gain in their accuracy. Last year’s standard deviation among players with 400 PA was just 22 points of EqA, with every player within 51 points of his weighted mean projection, and a majority (111 out of 213) within 15 points. It may be a bit much to call PECOTA “deadly accurate,” but as hitter projections go, it ain’t too shabby.
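To put last year's full-season figures in context, here is a rough sketch using only numbers quoted in this article. It works out the share of hitters inside 15 points, what a normal curve with a 22-point standard deviation would predict for that cutoff, how far out the largest miss sits, and how much narrower the full-season spread is than this year's half-season one. The variable names are mine, and the normal-curve comparison is an assumption, not something the projection system itself reports.

```python
import math

def normal_share_within(points, sd):
    """Share of a normal distribution lying within +/- `points` of its mean."""
    return math.erf(points / (sd * math.sqrt(2)))

# Last year, players with 400 PA (figures from the article)
LAST_YEAR_PLAYERS = 213
LAST_YEAR_SD = 22    # points of EqA
WITHIN_15 = 111      # players within 15 points of projection
LARGEST_MISS = 51    # every player was within this many points

# This year at the midpoint, players with 200 PA (from the article)
HALF_SEASON_SD = 29

observed_share = WITHIN_15 / LAST_YEAR_PLAYERS
expected_share = normal_share_within(15, LAST_YEAR_SD)
largest_miss_in_sd = LARGEST_MISS / LAST_YEAR_SD
shrink = 1 - LAST_YEAR_SD / HALF_SEASON_SD

print(f"within 15 points: observed {observed_share:.1%}, normal curve {expected_share:.1%}")
print(f"largest miss: {largest_miss_in_sd:.2f} standard deviations")
print(f"full-season spread is {shrink:.0%} narrower than the half-season spread")
```

The observed share (about 52 percent) sits close to the normal-curve figure (about 50 percent), the largest miss is a bit over 2.3 standard deviations, and the spread is roughly a quarter narrower than at this year's midpoint. Keep in mind that the two samples use different PA cutoffs and different seasons, so the comparison is suggestive only.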

*: For this article I’m using numbers from our page for 2009 EqAs and the final weighted mean spreadsheet for PECOTA that was dated March 14.

A version of this story originally appeared on ESPN Insider.

dalbano
7/07
Great read Jay, thanks.

At what point can we look at pitchers in a similar manner? Basically, is there a large enough sample size for pitchers that adds value to an analysis looking to compare actual performance to expectations? Is the sample size at this point large enough to determine to a certain extent who we can expect to continue doing what they have done thus far, for those that have significantly exceeded expectations?
jjaffe
7/07
There will be a pitchers PECOTA article later this week, but I'm not sure who's doing it.
davelamb
7/07
Great stuff. Good point on BABIP causing the extreme performances to regress. Although it's going to be harder for Chris Young to get a BABIP bounce when he's striking out in 29% of his ABs, and harder still for Kelly Johnson and Magglio Ordonez to do it from the bench.
cjgeisler
7/07
Having 3 of the top 5 under performers on my NL roto team, plus Bonifacio, goes a long way toward explaining my team's "suckitude". :-)
schlicht
7/07
I expected to see Jody Gerut's name on your underachiever list:

Projected EqA .304
Actual EqA -.076
delta .380
jjaffe
7/07
Aside from the fact that you don't have the correct numbers above (should be projected .304, actual .201, delta .103), you didn't read the fine print: he needed 200 PA to be considered for the piece, but he's only got 121. It's certainly an underachievement -- and a questionable projection as far as I'm concerned -- but the fact that he doesn't have enough playing time to get anything near a fair shake mitigates that somewhat.
schlicht
7/07
Sorry, I missed his overall number, I got the -.076 from the Milwaukee team list.
I don't seem to have much time for fine print these days ;>} and besides he was projected to have 455 AB
jessehoffins
7/07
Matt Wieters also falls into this category. I'd love to see a pecota projection re run based on his newest numbers at triple a and the bigs.
jjaffe
7/07
Oh noes, he's 52 points off his WM EqA through his first 102 major league plate appearances -- abandon ship!

Seriously. Dude had a rough first week but is hitting .319/.382/.493 since June 7. Give him time.
Darsox64
7/07
Is it accurate to say that the projections are normally distributed when you dropped all the sub-200 PA players from the sample? PECOTA is optimized to estimate VORP, not a rate stat like EQA, right? If you include all projected players (fudging minor leaguers if you must), wouldn't you get a long-tailed distribution for the differences rather than a normal one?
jjaffe
7/07
There's a selection bias at work by using a PA cutoff, sure. I'm not sure what the shape of the distribution would be if you included minor leaguers (who do have translated stats and thus EqAs), but I do know that you can't use players with inadequate sample sizes (few plate appearances) to check the system's accuracy.

If anything I'd say PECOTA is far more geared towards EqA - which is basically an estimate of productivity (runs per plate appearance, park adjusted and placed on a normalized scale resembling batting average) - than VORP, which gets very fuzzy and almost worthless for multi-positional guys.
dpowell
7/08
You could and probably should include everyone and just use analytical weights (based on PAs) to account for the fact that you're more confident in some observations than others. Basically, this is just a heteroscedasticity issue.

That doesn't completely solve the problem of how to check the accuracy of a projection system. The weights (or an arbitrary cutoff) are partially a function of how well the projection is doing (players who are playing way below expectations get fewer PAs). But weights are much better than dropping based on some cutoff.
jjaffe
7/08
Heteroscedasticity? If I use that word in a BP/ESPN article, they'd automatically invoke their out clause in our relationship.

In any event, I'll leave the heavy lifting on PECOTA to a harder-core number cruncher who's in a position to do something about it at the end of the year, when the sample sizes are more legitimate.
Yatchisin
7/07
You can add Kelly Johnson to the DL list.
jjaffe
7/07
Indeed. Missed that bit of news while traveling over the holiday weekend. Luckily I had recently picked up Adam Kennedy in both my Scoresheet and fantasy leagues.