A Letter to a Young, Inconsistent Pitcher

Imagine, for a moment, that you’re a young pitcher who has been slapped (fairly or unfairly) with a reputation for inconsistency by the local media. You realize all the possible ill effects, from salary implications to fan relations, such a label could have on your career. What could we possibly tell you to do? In short, my advice is to finish the year better than you started.

The Hobgoblin

We’ve been discussing consistency recently here on BP, particularly with regard to starting pitchers. In the immediate context of the season, nothing is more frustrating than inconsistency, whether from a team or a player. Consistency is one of those fleeting notions seen only at the periphery of statistics; the more you focus on it, the more ephemeral it becomes. And while you’ll certainly find those who swear up and down that it is valuable, you’ll find nearly as many who claim that the idea of consistency having value in and of itself is as ridiculous a notion as turkey bacon.

While I’m as much a sucker for numbers as the next guy, every once in a while I like to look at my data in a more immediate way. So I’ve followed a pretty simple method: I’ve taken the first two current major-league pitchers to appear twice in a Google News search for the terms “consistent baseball” as well as the first two pitchers to appear twice in a similar search for the terms “inconsistent baseball.” What my method lacks in sophistication, it compensates for with simplicity and accessibility. One nice thing about this way of going about it is that the choices were made for me; all that remains is testing them.

First, let’s introduce our contestants. The consistent: CC Sabathia and Joe Blanton. The inconsistent: Chad Billingsley and Joel Pineiro. It’s worth noting that none of these pitchers are particularly bad, but summed together the “consistent” pitchers are the better pair. Here are their simple, combined 2009 statistics.

Guys          IP    H    R    BB    K    RA
Consistent   425.1 395  185  126   360  3.91
Inconsistent 410.1 391  188  113   284  4.12

The two pairs match up relatively well, but the “consistent” pitchers struck out and walked more batters. On balance, they allowed fewer runs per nine innings last year. But the question remains how well each label fits.
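For readers who want to check the arithmetic, here is a minimal sketch that turns the combined lines in the table into per-nine-inning rates. It assumes the usual box-score convention that the digit after the decimal point in IP counts outs (so 425.1 means 425 and 1/3 innings); that reading reproduces the RA column. The helper names `innings` and `per_nine` are mine, not part of any library.

```python
def innings(ip: str) -> float:
    """Convert innings notation ("425.1" = 425 and 1/3 innings) to a float."""
    whole, _, outs = ip.partition(".")
    return int(whole) + int(outs or 0) / 3


def per_nine(count: int, ip: str) -> float:
    """Rate of an event per nine innings pitched."""
    return 9 * count / innings(ip)


# Combined 2009 lines copied from the table: (IP, R, BB, K)
groups = {
    "Consistent": ("425.1", 185, 126, 360),
    "Inconsistent": ("410.1", 188, 113, 284),
}

for name, (ip, runs, walks, strikeouts) in groups.items():
    print(
        f"{name:<12} RA {per_nine(runs, ip):.2f}  "
        f"BB/9 {per_nine(walks, ip):.2f}  K/9 {per_nine(strikeouts, ip):.2f}"
    )
```

The strikeout gap between the two pairs is much wider than the walk gap, which is most of the difference between them.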

The LOWESS Branch on the FIGS Tree

For the purposes of this exercise, I’ve adopted the form of fielding-independent game score (FIGS) favored by Fungoes. Their implementation of the stat is as follows: 50+BF/3+2*SO-2*BB-8*HR. I have taken all the starts for our four pitchers and calculated a FIGS value for each. I’ve then plotted the starts in chronological order, with FIGS on the y-axis. I’ve then applied a local regression line to filter out some of the noise visually. (It’s a LOWESS regression with a smoothing factor of 2/3.)
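To make the method concrete, here is a rough sketch of the two steps: the FIGS formula as quoted above, and a simplified LOWESS-style smoother over the starts in order. The smoother is my own stand-in, not the routine used for the charts: it does a tricube-weighted local linear fit with the 2/3 span and skips the robustness passes that a full LOWESS implementation usually adds. The start lines are made up for illustration and do not belong to any of the four pitchers.

```python
def figs(bf: int, so: int, bb: int, hr: int) -> float:
    """Fielding-independent game score: 50 + BF/3 + 2*SO - 2*BB - 8*HR."""
    return 50 + bf / 3 + 2 * so - 2 * bb - 8 * hr


def lowess_sketch(y: list[float], frac: float = 2 / 3) -> list[float]:
    """Tricube-weighted local linear fit at each start (no robustness passes)."""
    n = len(y)
    if n < 2:
        return list(y)
    x = list(range(n))  # starts in chronological order
    k = max(2, min(n, round(frac * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance from this start to its k-th nearest start.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        w = [max(0.0, 1 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical start lines (BF, SO, BB, HR), for illustration only.
starts = [
    (28, 7, 2, 0), (25, 4, 3, 1), (27, 6, 2, 1), (30, 9, 1, 0),
    (22, 3, 4, 2), (29, 8, 2, 1), (26, 5, 1, 0), (27, 6, 3, 1),
]
scores = [figs(*s) for s in starts]
smoothed = lowess_sketch(scores)

for i, (raw, fit) in enumerate(zip(scores, smoothed), start=1):
    print(f"start {i}: FIGS {raw:5.1f}  smoothed {fit:5.1f}")
```

A flat smoothed line says the pitcher's typical start did not drift much over the season; it says nothing by itself about how widely individual starts scattered around that line.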

Let’s take a look at the consistent group, starting with CC “splish splash, I was taking” Sabathia:

Chart 1

Sabathia appears, visually, to be rather consistent. The regression line is rather flat, and the range of the FIGS is approximately 30. He was markedly better after his 20th (or so) start. Now let’s look at his consistent team partner, Joe Blanton (perceived to be the innings eater’s innings eater):

Chart 2

Blanton, too, appears relatively consistent from the regression line. However, his individual starts were distributed somewhat more widely, and the range on his FIGS was closer to 45 than Sabathia’s 30. Like Sabathia, Blanton was better in the second half of the season than he was in the first. Blanton, however, tailed off a little at the end of the year.

How about the inconsistent group? Here’s Billingsley, whose inconsistency bumped him from the Dodgers’ playoff rotation:

Chart 3

Billingsley, as you can see, declined “consistently” over the course of the season. Although he started strong, he did not finish well at all. His starts also varied in quality as the season progressed. His range of approximately 40, however, was superior to Blanton’s. How about Pineiro, who put together a renaissance season under the tutelage of Dave Duncan in St. Louis?

Chart 4

Pineiro also tailed off a little at the end of the season, but almost imperceptibly. From May through September, he was more or less the same pitcher. Additionally, over the same period, his starts cluster very close to his average performance, suggesting Pineiro was indeed rather consistent. One bad start in September pushed Pineiro’s range to about 35. Absent that outlier, his range would have been approximately 20.
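The sensitivity of range to a single start is easy to see in a small sketch. The FIGS values below are invented, chosen only so that the arithmetic echoes the 35 and 20 quoted above; they are not Pineiro's actual starts. The function names are hypothetical, and the standard deviation is included as one alternative spread measure, not as the measure used in this piece.

```python
from statistics import median, pstdev


def figs_range(scores: list[float]) -> float:
    """Spread between the best and worst start."""
    return max(scores) - min(scores)


def drop_most_extreme(scores: list[float]) -> list[float]:
    """Copy of the scores without the single start farthest from the median."""
    med = median(scores)
    extreme = max(scores, key=lambda s: abs(s - med))
    trimmed = list(scores)
    trimmed.remove(extreme)
    return trimmed


# Hypothetical FIGS values, for illustration only.
season = [52, 58, 61, 55, 43, 63, 57, 28, 60, 54]
trimmed = drop_most_extreme(season)

print("range, all starts:       ", figs_range(season))
print("range, outlier removed:  ", figs_range(trimmed))
print("std dev, all starts:     ", round(pstdev(season), 1))
print("std dev, outlier removed:", round(pstdev(trimmed), 1))
```

Because range depends only on the two most extreme starts, one blowup can nearly double it, while a measure that uses every start moves less.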

In the end, I’m not sure how much we can learn from the lessons of just four pitchers. A few caveats bear noting. First, a reputation for consistency (or lack thereof) is presumably something that develops over many seasons, and so looking at just last year’s data may not tell the full story. Additionally, by using FIGS, I am obscuring the impact that balls in play have on perceived consistency in the form of runs scored. However, viewing the FIGS data visually does bring into relief some interesting phenomena. The consistent pitchers finished stronger than the inconsistent ones relative to their seasonal baselines. Interestingly, the consistent pitchers did not cluster their performances any more closely than the inconsistent ones did. It also bears noting that Pineiro, who in fact may have been the most consistent of the four despite his “inconsistent” label, enjoyed a sharp spike in ground-ball rate last season (by any batted ball classification data). Whether ground-ball pitchers are more consistent on a metric like FIGS, or whether there was some third factor that caused Pineiro’s ground ball rate to spike with his consistency (call it the Duncan Effect), remains unclear.

Question of the Day

What advice would you give to a young pitcher seeking to gain a reputation for consistency? Are there factors (other than pitching well all the time) that might help a pitcher gain such a reputation? Is this the sort of thing that a pitcher can have control over?

Blanton's wider FIGS range seems to be the product of two starts. His 4th start gave him a FIGS around 35, and his 9th start gave him a FIGS around 80. Remove those two starts from the equation, and his FIGS range is identical to Sabathia's.
I think you were going for a rather simplistic analysis, but a small change would probably have provided a better representation of consistency than FIGS range. Just calculate the standard deviation of the raw data around your LOWESS regression curve. Or simply use the standard way of calculating standard deviation from the mean (or median). Also, range can be skewed in this case. The higher the average FIGS for a pitcher, the higher the range will probably be. For example, a good pitcher who typically has high FIGS scores and then has one really bad game will have a larger range than an average pitcher with typically lower FIGS scores who has a really bad game. Either way, I think analyzing consistency can be done a vast number of ways, but in the end you must see if those results are statistically significant (e.g., using a chi-squared test).
I would probably characterize it as "simple" rather than "simplistic," but I hear your concerns. Thank you for the feedback.
What I'd be interested to see is how FIGS would view the consistency or lack thereof of given seasons from elite pitchers across time: i.e. a comparison of 1999 Pedro, 1971 Seaver, 2004 Santana, 1995 Maddux, etc. Would it be more or less consistent? How would the trendlines look? And would such brilliance translate into greater consistency by the FIGS metric? (I assume the answer to the last question is yes, but I also recall Pedro getting demolished twice that season, including once in Kansas City...yeah, hard to believe, I know, but perhaps these outliers would skew things). I just hope you know you've opened up a can o' worms here, Tom.
What advice would you give to a young pitcher seeking to gain a reputation for consistency? Short and simple - learn to throw strike one on the first pitch to each batter.
better advice: focus on the third pitch.
I don't believe that Chad Billingsley is the best example. The other three pitchers are above or near 30, while Billingsley will be 26 in July and will be a full-time starter for his third season. He was inconsistent, but he did break his leg last winter and he hurt his hamstring down the stretch. While he is a great example of inconsistency, going from an All-Star first half to not being able to start in the playoffs, he is still a young pitcher. Pineiro's graph is interesting because he did fall off in September and the playoffs, but the start plot seems to have less noise or variance than those of the two consistent pitchers. I agree with ferret that Billingsley needs to throw his first strike. He still struggles with command and doesn't seem to be comfortable going after batters, but his strikeouts tell you that he can improve if he stops walking people. He walked more pitchers than I care to remember. Tommy, I don't mean to go against what you are writing about, because I don't think that you are fervently arguing that your findings prove the media labels. I would be interested in seeing more parts to this form of analysis. Perhaps you would pick two or three pitchers that you believe to be the most consistent and inconsistent. For me, Jon Garland comes to mind for consistent and Javier Vazquez for inconsistent.
Since the pitcher is seeking a "reputation for consistency", not actual consistency, I would advise the young man to frame all of his answers to the press with talk about consistency. If you pitch a bad game, you stress that you need to be more consistent and that consistency is your goal. If you pitch a great game, downplay it: you did what you always do and just got lucky tonight, it's all about consistency, and if you can just remain consistent, great games will follow from that. If you are mediocre, talk about how you battled, you kept the same approach and tried to stay consistent, kept your team in the game and tried to give them a chance to win. After a while, reporters will internalize the consistency meme and you'll be thought of as consistent whether it is warranted or not.
It seems like doing what you suggest, if it works, would cause "consistency" to equal "success" and "inconsistency" to equal "failure" in the minds of the press and fans. So, if the pitcher is largely successful, he might get the label of "consistent". But if he's average or slightly worse, I'd imagine he's even more likely to be deemed inconsistent, so your strategy would backfire.
Actually, check out the overall totals for Billingsley over the last three seasons:
2009: 196.1 IP, .244 BA, .323 OBP, .376 SLG, .699 OPS
2008: 200.2 IP, .248 BA, .321 OBP, .363 SLG, .684 OPS
2007: 147.0 IP, .241 BA, .318 OBP, .379 SLG, .696 OPS
Considering that we're talking about a 25 1/2-year-old pitcher, that's a very, very consistent overall line. I don't think there are more than a handful of starting pitchers in all of baseball whose OPS-against totals fall within .015 of each other for three straight years. Of course, the knock on Billingsley is the lack of "consistency" on a start-to-start basis. That may drive one crazy in an individual game. Still, in a seasonal context, this is quite an impressive track record for consistency...
Indeed, "consistency" describes nothing until it's qualified by a time frame. Does a pitcher consistently hit his spots with each pitch? Is he consistent from batter to batter? From inning to inning? Start to start? Season to season? I mean, is consistency even a real trait? My perception of consistency is similar to that of tbwhite above and Brian24 below in that it is a euphemism rather than a description.
The interesting thing about the word "inconsistent" as baseball analysts use it is how often it is a euphemism for "consistently bad." It's amazing the number of times I see a player like, I don't know, Jack Wilson come to the plate and the analyst will say, "He's been great in the field, but he needs to find more consistency with the bat." And I think, "No. He's consistent."
I'm a little concerned by the recent trend toward using LOESS Regression at BP. LOESS wants nothing more than to overfit the noise in your data; it's best-suited to extremely large data sets for that reason. 34 starts by CC Sabathia is not an "extremely large data set". If you use a classical test for serial correlation on Sabathia's FIGS data, what kind of p-value do you get? If you can't reject the hypothesis of "no serial correlation", at a high confidence level, then using LOESS (which is an explicit attempt to model that correlation) is inappropriate. LOESS is really just "fuzzy splines"; good at capturing the pattern of what happened, but bad at telling you whether it was an accident or a real effect.
I think you make a good point, but I'm not using LOESS for anything other than to capture the pattern of what happened. You're right that if I were to use the fit line for some grand project it would be silly, but the main idea was to give some indication of the course of a pitcher's season. For example, the LOESS line shows very clearly the trajectory of Chad Billingsley's season (which would also show up in a linear regression), but it also gives some indication that Sabathia did in fact pitch a little better as the season wore on (which linear regression would not show as clearly).