What is it about baseball that makes its statistics so special? The game gives cold, calculated figures a warm, human face. Numbers that are sterile and inert in any other environment suddenly become gregarious storytellers, revealing the tale of the line-drive hitter who traded speed for power as he aged, or the aging ex-flamethrower who learned to hit his spots to compensate for the loss of his fastball.
As Bill James wrote 20 years ago–I’m obligated by contract to quote James in at least one-third of my articles–“A chart of numbers that would put an actuary to sleep can be made to dance if you put it on one side of a card and Bombo Rivera’s picture on the other.”
We recognize the tale that the numbers tell because while the specific numbers may be unique from player to player, the patterns tend to become recognizable. We look at Miguel Cabrera’s two rows of numbers and hear in our minds the echoes of numbers we’ve seen before next to names like Hank Aaron and Frank Robinson. A stroll down Randy Johnson’s lines conjures memories of other pitchers who found greatness after taming their wild heat, from Nolan Ryan to Sandy Koufax.
I’m not ashamed to admit that I’m a numbers guy. The number patterns that baseball players produce, both individually unique and historically evocative, have been a ceaseless source of fascination for me since I badgered my dad into buying me my first copy of The Baseball Encyclopedia when I was six years old.
Look at enough numbers for enough years, and eventually they start to seep so deep into your brain that your unconscious mind starts to recognize familiar patterns for you. The first time I saw Edgar Renteria’s stat line after the 2000 season, I immediately had to bring up Edgardo Alfonzo’s numbers to make sure I wasn’t imagining things. Sure enough, the similarities between Renteria’s 2000 performance and Alfonzo’s 1998 season are almost eerie. Looking at Cesar Izturis’ numbers a year ago, I didn’t need PECOTA to tell me that one of his ten best comps of all time was Omar Vizquel. This winter, after seeing the Midwest League stat line for Padres’ catching prospect George Kottaras, I knew even before looking that I had seen his numbers somewhere before. Specifically, Mike Sweeney’s line in the same league, at the same age, 12 years ago.
I don’t know how or why my brain comes up with these associations seemingly at random, though I suspect Malcolm Gladwell might. So long as it does, though, I might as well use these associations the way any self-respecting analyst or stock trader would: to better predict the future. It figures that if a modern player has a historical comp, that comp would be awfully useful in predicting where today’s player might be heading.
Of course, this is exactly what PECOTA does, only on a much grander, more statistically accurate scale. Which makes it reasonable to ask, if we’ve got PECOTA, why bother to use intuition?
Well, PECOTA doesn’t make a single comparison; it makes thousands of comparisons, then calculates an aggregate prediction based on the probabilities that a player will follow the career path of a hundred different guys. It’s not going to make a leap of faith that David Wright is the next Scott Rolen, only predict that there’s a small probability that it will happen. (Not to mention the fact that Rolen isn’t among the list of Wright’s 20 most comparable players.)
To put it another way: a computer is without peer when it comes to calculating the probabilities of an almost infinite array of possible outcomes. But a computer can’t play a hunch, Wily Mo Pena excepted.
So here’s a list of five players who broke out last season, and one man’s gut feeling about which player fits each profile most closely this season. I might be wrong more than I’m right. In fact, I’ll probably be wrong more than I’m right (I’m still waiting on Joey Hamilton to go all Greg Maddux on the league). But that’s the thing about breakout seasons: if they were common, they wouldn’t be breakouts.
Breakout Profile #1: The Unremarkable Sinkerballer.
Score one for intuition: a year ago, despite no statistical evidence to support my opinion, I had a sneaking suspicion that Jake Westbrook was due for a breakout season. What was disturbing to me wasn’t just that I had this feeling over a player with a 5.33 career ERA and a strikeout-to-walk ratio of 58 to 56 the year before, but that I couldn’t figure out what it was that I liked about him.
Then I remembered to check his groundball/flyball ratios. Sure enough, his G/F ratio in 2003 was a lofty 3.02, and his home-run rate was correspondingly excellent (just nine surrendered in 133 innings). My unconscious was telling me that if Westbrook could lower his walk rate significantly–and he was young enough to do so–he could thrive even without a high strikeout rate by forcing teams to beat him with singles.
In 2004, Westbrook’s walk rate plunged to 2.55 from 3.79 in ’03, his ERA dropped to 3.38, and he was the fourth-most valuable pitcher in the American League according to VORP.
Who’s this year’s Westbrook? Here are the number patterns to consider:
Player      G  GS   IP    H  HR   BB    K   ERA   G/F
Westbrook  71  34  246  286  22   94  127  5.33  2.36
Zach Day   61  44  285  277  24  119  147  4.01  2.48
That’s Westbrook’s career before 2004. The big difference between the two lines is in ERA, which is largely due to significantly different outcomes between the two pitchers on balls in play. In the areas over which pitchers exert a lot of influence–strikeouts, walks, homers and groundball/flyball ratio–the two are very similar.
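One way to check that claim is to convert the counting stats in the table above into per-nine-inning rates. What follows is only a minimal sketch: the helper names `per_nine` and `rate_line` are my own, and innings are taken as the whole numbers printed in the table, so the results are approximate.

```python
# Career lines copied from the table above (innings as printed, whole numbers).
LINES = {
    "Westbrook": {"IP": 246, "HR": 22, "BB": 94, "K": 127},
    "Zach Day": {"IP": 285, "HR": 24, "BB": 119, "K": 147},
}


def per_nine(count, innings):
    """Scale a counting stat to a rate per nine innings."""
    return 9 * count / innings


def rate_line(line):
    """Return K/9, BB/9 and HR/9 for one pitcher's line, rounded to 2 places."""
    ip = line["IP"]
    return {stat + "/9": round(per_nine(line[stat], ip), 2) for stat in ("K", "BB", "HR")}


if __name__ == "__main__":
    for name, line in LINES.items():
        print(name, rate_line(line))
```

On these inputs the two strikeout rates come out nearly identical, at roughly 4.6 per nine innings, with Day walking slightly more hitters and allowing slightly fewer home runs.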
Much as Westbrook had his best overall season in 2003, presaging his breakout in 2004, Zach Day also provided some statistical glad tidings last season. Ignoring intentional walks, Day gave out just 38 free passes in 116 innings last year, a rate which was much better (2.93) than his previous career rate (3.68).
Westbrook turned 27 last September. Day turns 27 this June.
Day, who didn’t even have a job assured to him with the Nats going into spring training, clinched the fifth-starter role with an excellent performance this month. After playing for a team with park effects that have been all over the place–the Expos’ 2003 run environment was more hitter-friendly than Coors Field–he’s moving to a stadium that will almost certainly play as a significant pitchers’ park. As Erik Siegrist points out, the entire NL East plays as a pitchers’ division, which ought to add a nice glossy sheen to Day’s stat line. I’ll go on record as predicting that Day is going to be one of the most pleasant surprises in the National League this season.
Breakout Profile #2: The Minor League Slugger Who Finally Gets A Shot.
Thanks to concerns about his defense, Travis Hafner didn’t get a shot at playing every day in the majors until last spring, when he was just a few months shy of turning 27. Hafner had career minor-league numbers of .298/.401/.515, and he ranked 20th on our 2003 Top Prospects list. All he did in his first real opportunity was hit .311/.410/.583, ranking just a rounding error behind David Ortiz on the list of the most valuable DHs in baseball.
Hmmm…where can a Royals fan find a comparable big, hulking left-handed slugger who has nothing left to prove in the minors, but has had trouble getting opportunities in the majors because of his glove?
Player       G   AB   H   D  T  HR   R  RBI  BB   K   AVG   OBP   SLG
Hafner     114  353  89  23  4  15  41   46  30  96  .252  .327  .467
Pickering   88  237  55  10  1  13  33   42  40  77  .232  .341  .447
Calvin Pickering hasn’t had quite as much playing time in the majors as Hafner had prior to last season, but as you can see they both showed pretty much the same skills, right out of the Ken Phelps Tool Kit. Pickering also made our Top Prospects list–he ranked 29th in 1999–only to disappear for a few years because of injury and weight problems. After hitting .314/.451/.712 in Omaha last year, with 35 bombs in 89 games, he got a shot with the Royals and slugged .500 after being called up.
There is one problem with this comparison, which is that Pickering is still engaged in a Sumo Wrestling Deathmatch with Ken Harvey for the final roster spot on the Royals. We’ll venture that eventually talent must rise to the top, and Harvey must sink to the bottom.
Breakout Profile #3: The Promising Young Starter Who Hasn’t Put It All Together Yet.
It’s hard to believe that barely 18 months ago, the consensus opinion was that the Pirates didn’t get nearly enough for Brian Giles when they swapped their star outfielder to the Padres for Oliver Perez and Jason Bay. It might have been the most wildly inaccurate first impression since the sports media chuckled at those rubes in Montreal for trading their star second baseman, Delino DeShields, for a diminutive middle reliever named Pedro Martinez.
While Bay won Rookie of the Year honors, it’s Perez who is the real star in Pittsburgh. Just 22 last season, he ranked sixth in the league with a 2.98 ERA and struck out 239 hitters in 196 innings. Just two starting pitchers in major-league history have had a strikeout rate that high at such a young age: Dwight Gooden and Kerry Wood.
There were signs that Perez was better than the pitcher who finished 4-10 with a 5.47 ERA the year before. In 2003, he struck out 141 hitters in 127 innings, yet somehow gave up more than a hit an inning. Only one other pitcher in history (Octavio Dotel) has ever surrendered over a hit an inning while striking out at least 10 batters per nine.
Perez was definitely hit-unlucky in 2003. Other than improved command, the only significant difference in his performance in 2004 was that his luck evened out.
Who is this year’s Perez?
Pitcher    Year   IP    H  HR  BB    K   ERA  Age
Perez      2003  127  129  22  77  141  5.47  22 years, 10 months
Bonderman  2004  184  168  24  73  168  4.89  22 years, 8 months
Jeremy Bonderman took over Perez’s spot as the youngest starter in the majors when he opened the 2003 season in the Tigers’ rotation. While he was clearly overmatched in his rookie season, Bonderman quietly made enormous strides in 2004. I say “quietly,” because his 4.89 ERA gives the impression that he still has a long way to go.
Perez’s problem in 2003 was that he was hit-unlucky. Bonderman’s problem was that he was run-unlucky. His component ERA–what his ERA should have been based on his peripheral numbers–was just 3.93.
The biggest reason for the discrepancy between his real and projected ERA is pretty simple: he clumped most of his baserunners in the first half of the season. Here’s how Bonderman’s numbers break down:
Split                  IP    H   R  ER  BB    K  HR   ERA
Through August 1st  111.1  111  75  75  49  100  21  6.06
After August 1st     72.2   57  26  25  24   68   3  3.10
After August 1, Bonderman cut his walk rate by 25% and his home run rate by an absurd 78%. Interestingly, Perez showed similar in-season improvement in 2003, cutting his walk rate by 29% and his home run rate by 43% after the All-Star Break.
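The Bonderman percentages can be reproduced from his split lines. The sketch below assumes the usual baseball innings notation, in which 111.1 means 111 and one-third innings; the function names `innings`, `per_nine` and `pct_cut` are illustrative, not anything from a particular stats package.

```python
def innings(ip_text):
    """Convert baseball notation ('111.1' = 111 and 1/3 innings) to a float."""
    whole, _, outs = ip_text.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def per_nine(count, ip):
    """Scale a counting stat to a rate per nine innings."""
    return 9 * count / ip


def pct_cut(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100 * (before - after) / before


# Bonderman's 2004 splits as listed in the table above.
first = {"IP": innings("111.1"), "BB": 49, "HR": 21}
second = {"IP": innings("72.2"), "BB": 24, "HR": 3}

walk_cut = pct_cut(per_nine(first["BB"], first["IP"]),
                   per_nine(second["BB"], second["IP"]))
homer_cut = pct_cut(per_nine(first["HR"], first["IP"]),
                    per_nine(second["HR"], second["IP"]))

print(f"walk rate cut: {walk_cut:.0f}%, home run rate cut: {homer_cut:.0f}%")
```

Treating the decimal as thirds of an inning matters here: reading 72.2 as a plain decimal would slightly overstate his second-half rates.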
Such in-season improvement proved a harbinger for Perez. If it’s a harbinger for Bonderman, he’s going to be one of the best starters in baseball this year.
Breakout Profile #4: The Hard-Throwing Reliever With Command Issues.
Before last season, Brad Lidge was a good middle reliever best known for an almost comical series of injuries that kept the former first-round pick from reaching the majors until he was almost 26.
Then he struck out 157 hitters in 95 innings.
It’s almost impossible to convey just how ridiculous that strikeout rate is. Lidge faced 369 hitters all year and whiffed nearly 43% of them. It wasn’t the highest strikeout rate of all-time–Eric Gagne in 2003 and Billy Wagner in 1999 both bested Lidge by a little bit–but at least with the other two, there was some warning that they were among the elite closers in the game. Lidge recorded a historic season out of relative obscurity.
Who could break out of relative obscurity this year? If the standard is what Lidge did, the obvious answer is “no one.” If I had to put my money on one reliever to have a breakout year, I’d split my cash between these two:
Pitcher       Year    IP   H  HR  BB   K   ERA  Age
Brad Lidge    2003  85.0  60   6  35  97  3.60   26
Chad Cordero  2004  82.2  68   8  39  83  2.94   22
Juan Cruz     2004  72.0  59   7  29  70  2.75   25
Note that I’ve taken intentional walks out of the walks column above, which is relevant because both Lidge and Chad Cordero dispensed enough of them to artificially impact their control stats.
Like Lidge, Cordero was a first-round pick who thrived in the bullpen as a rookie. He is not quite Lidge’s comparable in the strikeout department, but those are still pretty good numbers for a 22-year-old pitcher. Like Zach Day, Cordero gets the benefit of moving into RFK’s spacious environs this season. Despite the age difference, Cordero fits the statistical profile of last year’s Lidge better than anyone else does.
Juan Cruz’s inclusion is not based simply on his numerical similarities with Lidge; consider him a hunch. Like Lidge, Cruz is a converted starter, moved to the bullpen not out of injury concerns but due to command problems. For that reason, the drop in his walk rate–from 5.11 to 4.13 to 3.63 the last two years–takes on added significance, as last year was the first in which he was used exclusively as a reliever. Now that he’s a member of the A’s, where missing bats appears the order of the day in the bullpen, there’s a decent chance he’ll reach triple digits in the strikeout department.
Breakout Profile #5: The Unassuming Middle Infielder.
This might be the hardest profile to fit, because there’s simply little precedent for what Carlos Guillen did last year. Players who hit .270 with seven homers every year simply don’t wake up one April and decide to lead all major league shortstops in OPS. If you’re looking for a middle infielder who’s going to take a quantum leap forward this season, though, I have two names for you:
Player   Year      G    AB    H   D   T  HR    R  RBI   BB    K   AVG   OBP   SLG
Guillen  2002-3  243   863  231  43   9  16  136  108   98  155  .268  .341  .394
Hudson   2003-4  277   963  259  53  13  21  127  115   90  185  .269  .335  .416
Jimenez  2003-4  298  1124  305  52  10  26  145  124  148  188  .271  .356  .405
Orlando Hudson has made few headlines since complimenting J.P. Ricciardi’s fashion sense a few springs ago, but in addition to establishing himself as quite possibly the best defensive second baseman in the game, he made quiet gains in his power and walk rates last season, breaking the one walk per ten at-bats milestone in 2004. Like Guillen and D’Angelo Jimenez, Hudson is a switch-hitter, and he just turned 27; Guillen was 28 last year.
Jimenez has been around seemingly forever–he cracked our Top Prospects list in both 1999 and 2000 before his career nearly ended on a road in the Dominican Republic–and yet he’s actually nine days younger than Hudson. Like Guillen, his defensive reputation is sketchy at times; in his favor, he has the best plate discipline of the trio. He also has a minor-league record (mostly compiled before his car accident) that makes you wonder if he’s capable of more than we’ve seen from him so far.
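The plate-discipline and power comparison can be read straight off the two-season lines in the table above. This is a rough sketch under my own assumptions: walks per at-bat stands in for plate discipline, and isolated power (slugging minus batting average) stands in for power. Because the lines are two-season aggregates, they cannot show Hudson's 2004 gains on their own.

```python
# Two-season lines copied from the table above.
PLAYERS = {
    "Guillen": {"AB": 863, "BB": 98, "AVG": 0.268, "SLG": 0.394},
    "Hudson": {"AB": 963, "BB": 90, "AVG": 0.269, "SLG": 0.416},
    "Jimenez": {"AB": 1124, "BB": 148, "AVG": 0.271, "SLG": 0.405},
}


def walks_per_ab(line):
    """Walks drawn per at-bat, a crude proxy for plate discipline."""
    return line["BB"] / line["AB"]


def isolated_power(line):
    """Slugging minus batting average: extra bases per at-bat."""
    return line["SLG"] - line["AVG"]


# Player with the highest walk rate and the highest isolated power.
best_eye = max(PLAYERS, key=lambda name: walks_per_ab(PLAYERS[name]))
most_pop = max(PLAYERS, key=lambda name: isolated_power(PLAYERS[name]))

for name, line in PLAYERS.items():
    print(f"{name}: BB/AB {walks_per_ab(line):.3f}, ISO {isolated_power(line):.3f}")
print("best walk rate:", best_eye, "| most isolated power:", most_pop)
```

By these measures Jimenez has the highest walk rate of the three and Hudson the most isolated power, while Hudson's two-season walk rate still sits just under one per ten at-bats.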
Chances are that most of the guys above aren’t going to have that big breakout season in 2005. Some of them will probably even regress. But it only takes one or two breakout campaigns to make or break a baseball team. If your favorite team–real or fantasy–decides to take a flyer on a couple of these guys and it pans out, congratulations. If it doesn’t…well, whoever said you should rely on your hunches anyway?