May 28, 2009
Checking the Numbers
The Cain Mutiny
The Giants are currently mired in the middle of a mediocre division, hovering around the .500 mark despite consistently putting together lineups with an aggregate slash line eerily similar to that of Jeff Francoeur. Entering the season, nobody had any misconceptions about their strengths and weaknesses; the pitching staff was considered effective enough to keep the team competitive, but they were expected to fall prey to the poor run support. After all, a team can only win so many games while averaging fewer than four runs per contest. As expected, Tim Lincecum has continued to dominate, but the Giants have also been the grateful recipients of a stellar start from the younger, more experienced Matt Cain. Whether or not the 24-year-old righty can sustain his early-season heroics has certainly become a hot topic, but Cain has unquestionably done his part over the past two months.
Type his name into Google and a surplus of articles is bound to surface, proclaiming that Cain should be sold high in fantasy leagues because his performance is not "real." These articles tend to home in on a disconnect between Cain's ERA and FIP, almost going so far as to suggest that his current statistical résumé is due to nothing more than luck. Perhaps Cain has benefited from some favorable bounces or from a few metrics that are bound to regress over the remainder of the season, but to sound these alarms without any evidence other than a FIP somewhat higher than his earned-run mark is absurd. In fact, all of these ERA-FIP articles are just begging for a reminder of why we use FIP in the first place.
Fielding Independent Pitching quantifies precisely what its name suggests: the contributions of a pitcher relative to the events under his control, or those events immune to the effects of defense or luck. Walks, strikeouts, and home runs are recognized as controllable skills based on year-to-year correlations of moderate-or-greater strength. Because these three metrics provide a more telling window into a pitcher's skill set, FIP serves as a better predictor of future ERA than ERA itself. A FIP higher than an ERA does not erase performance up to that point or indicate that the numbers lack validity. Instead, it merely suggests that the success, or lack thereof, up to that point is not entirely related to the act of pitching.
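For readers who want to see the arithmetic, here is a minimal sketch of the commonly cited FIP formula, rewritten to accept per-nine rates. The constant of 3.20 is an assumption (the league constant shifts from year to year), hit batsmen are ignored, and the function name is purely illustrative. The inputs are Cain's 2009 rates as listed in the peripherals table later in this piece, so the result should land near, but not exactly on, his listed FIP.

```python
def fip_from_rates(k9, bb9, hr9, constant=3.20):
    """Approximate FIP from per-nine rates.

    Uses the common weights (13 per HR, 3 per BB, -2 per K).
    The constant is an assumed league adjustment, and HBP are left out,
    so this will not reproduce a published FIP exactly.
    """
    return (13 * hr9 + 3 * bb9 - 2 * k9) / 9 + constant


# Cain's 2009 rates from the peripherals table (UBB/9 used for walks).
cain_2009 = fip_from_rates(k9=6.2, bb9=3.6, hr9=0.9)
print(round(cain_2009, 2))
```

The small gap between this estimate and the listed 4.35 comes from the rounded inputs, the omitted hit batsmen, and the assumed constant.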
The relationship between FIP and ERA is often directly tied to the strand rate, or the percentage of baserunners that fail to add to their runs-scored total. The league-average pitcher will strand about 72 percent of his runners. If a pitcher boasts a solid FIP with a putrid strand rate, he figures to have a worse earned run average. Inversely, a higher FIP married to a fantastic strand rate can transform an average pitcher into a Halladay clone. The latter situation has occurred with Cain this season, as his FIP currently rests at a good (but not great) 4.35 that is almost two runs higher than his ERA thanks to an otherworldly 89 percent strand rate.
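One widely used approximation of strand rate works from a pitcher's season totals rather than tracking each runner. The sketch below assumes that approximation, with the 1.4 home-run weight it conventionally carries; the function name and the input totals are hypothetical and are not Cain's actual line.

```python
def strand_rate(hits, walks, hbp, runs, home_runs):
    """Approximate strand rate (LOB%) from season totals.

    Numerator: baserunners who did not score.
    Denominator: baserunners, less a weighted home-run term, since the
    batter on a home run never occupies a base to be stranded on.
    """
    baserunners = hits + walks + hbp
    return (baserunners - runs) / (baserunners - 1.4 * home_runs)


# Hypothetical totals, for illustration only.
example = strand_rate(hits=50, walks=24, hbp=2, runs=17, home_runs=6)
print(f"{example:.1%}")
```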
From 2005-08, only an Alfonseca-sized handful of pitchers have exceeded a strand rate of 80 percent, with Johan Santana's 82.6 percent in 2007 topping the chart. Suffice it to say, the likelihood that Cain's impressive rate will hold is extremely low. Even with that caveat, such a high rate of marooning baserunners indicates an ability to bear down when runners reach base. Unfortunately, too many neglect to ask why or how this has occurred, simply reaching this point and dismissing the hurler's performance as a fraud. Keep in mind that over the last ten years the league-average slash line with the bases empty is .260/.322/.419, compared to .270/.336/.426 with runners on. Cain has been spitting in the face of those numbers so far this season:
Runners have not necessarily struggled to get on, but they end up trapped upon reaching their destinations. Additionally, as the situations become more tense, Cain has produced another counter-intuitive split. Most pitchers fare better in low-leverage situations, but Cain has allowed hitters to slash a mere .172/.265/.241 throughout the most crucial circumstances. In less important plate appearances, hitters have managed a .286/.353/.390 line. Continuing with Cain's reverse splits, the 6'3", 230-pound righty has dominated lefties more than his same-handed brethren:
These numbers reveal Cain's career splits, where lefties have a higher REqA, as do batters hitting with ducks on the pond. On top of that, the reverse splits mock the idea of context, since Cain has strayed so far from the general population. Despite the ultra-low ERA and dominance with runners on base, many have already thrown in the towel with regard to Cain, since he has experienced a decline in whiffs per nine while exhibiting some interesting marks in the plate discipline department. After inducing swings out of the zone at a steadily increasing clip, peaking at 26 percent last season, Cain has dropped to 21 percent in a league where the average is 24.5 percent. He has been bailed out by a 59 percent rate of contact on these outside swings, way down from the 65 percent found in his career data prior to this season.
Entering 2009, hitters made contact on just 84-85 percent of Cain's pitches in the actual strike zone, a very solid rate indicating that his raw stuff has overpowered the opposition. This season, that rate has risen to 90 percent, despite a slightly lower rate of overall swings. More contact on fewer swings is a Cook Yourself Thin recipe for strikeouts. Here are Cain's peripherals since 2006:
Year   GP   IP      K/9   UBB/9   HR/9   ERA    FIP
2006   32   190.2   8.5   4.1     0.9    4.15   3.96
2007   32   200.0   7.3   3.4     0.6    3.65   3.78
2008   34   217.2   7.7   3.4     0.8    3.76   3.91
2009    9    60.0   6.2   3.6     0.9    2.40   4.35
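To put the 2009 disconnect in context, the following sketch subtracts ERA from FIP for each season in the table above. The figures are copied directly from the table; the variable names are illustrative.

```python
# (ERA, FIP) by season, copied from the peripherals table.
seasons = {
    2006: (4.15, 3.96),
    2007: (3.65, 3.78),
    2008: (3.76, 3.91),
    2009: (2.40, 4.35),
}

# Positive gap: FIP is higher than ERA, so run prevention has
# outpaced what the walk, strikeout, and home-run rates imply.
gaps = {year: round(fip - era, 2) for year, (era, fip) in seasons.items()}

for year, gap in gaps.items():
    print(year, f"{gap:+.2f}")
```

In the three prior seasons the two marks never sit more than about two-tenths of a run apart; the 2009 gap is nearly two full runs.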
With subtle spikes in both the walk and home-run rates as well as a dramatic drop-off in strikeouts derived from shifts in the approaches of opposing hitters, it stands to reason that Cain will have to make adjustments of his own when those tremendous numbers with men on base begin their inevitable trek downward. At this juncture, one question stands out: How in the wide, wide world of sports has Cain been able to transform himself into Pedro Martinez circa 1999 with runners on base? For starters, here's a look at Cain's pitch selection data over the past four years:
Year   Fastball   Curve   Slider   Change
2006   72.2       14.1     6.5      5.8
2007   64.5        8.6    16.5     10.4
2008   65.4       10.2    13.8     10.6
2009   62.9       15.4     9.3     11.6
Upon first arriving in the big leagues, Cain used his powerhouse mechanics, complete with a slight jump towards the end of his delivery, to rocket 93 mph fastballs almost three-quarters of the time. Since then, he has matured with respect to pitch selection, incorporating off-speed offerings much more often. These frequency increases of the non-fastball components of his repertoire were not random, but rather the result of actual strides made in improving the pitches. Since 2007, Cain has added two inches of vertical movement to his curveball and 1.7 inches of horizontal movement to his changeup. As we discussed not too long ago, vertical movement in the Pitch-f/x data set refers not to how much a pitch rises, but rather to the extent to which it does not drop relative to a pitch thrown at the same velocity with no spin. At -7.3 inches of vertical movement on the curveball, compared to -5.3 inches back in 2007, Cain is imparting much more spin on the ball, so much in fact that he has caused the pitch to drop considerably more than if gravity were acting alone.
With runners on base, Cain's data shifts markedly in a few key areas, primarily the movement on both his fastball and curveball. Here is his data with the bases empty:
Pitch      %      Velo   PFX   PFZ
Fastball   60.4   91.7   4.1   10.6
Curve      16.3   74.4   6.2   -6.9
Slider     10.6   85.6   2.6    3.1
Change     12.5   85.9   7.2    4.2
And with any or all bases occupied:
Pitch      %      Velo   PFX   PFZ
Fastball   66.2   91.6   4.7   10.7
Curve      14.3   73.7   6.6   -7.8
Slider      7.7   84.8   2.5    2.2
Change     10.4   85.2   7.1    4.5
Though a fraction of an inch here or there is not necessarily significant in terms of release point or movement, throwing a curveball with 0.9 more inches of vertical movement and a fastball with 0.6 more inches of horizontal movement than with the bases empty certainly helps to explain some of Cain's stranding prowess. Context is always important when discussing splits, but as with our look at Javier Vazquez not too long ago, controlling for all of the factors that make pitchers unique leaves us with a very tiny sample of comparables. It really isn't meaningful at all to compare Cain's deltas in these areas to the five or six other pitchers with similar repertoires, pitch data, and workloads. It does, however, seem that Cain's pitch data deltas do exceed the averages with fewer factors controlled.
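The deltas cited above can be recomputed from the two pitch tables. This sketch assumes PFX is horizontal movement and PFZ is vertical movement, both in inches, and uses the values exactly as listed; the dictionary layout is just one convenient way to hold them.

```python
# (PFX, PFZ) in inches for each pitch, copied from the two tables.
bases_empty = {
    "Fastball": (4.1, 10.6),
    "Curve": (6.2, -6.9),
    "Slider": (2.6, 3.1),
    "Change": (7.2, 4.2),
}
runners_on = {
    "Fastball": (4.7, 10.7),
    "Curve": (6.6, -7.8),
    "Slider": (2.5, 2.2),
    "Change": (7.1, 4.5),
}

# Runners-on minus bases-empty. A more negative PFZ on the curve
# means additional drop.
deltas = {
    pitch: (
        round(runners_on[pitch][0] - bases_empty[pitch][0], 1),
        round(runners_on[pitch][1] - bases_empty[pitch][1], 1),
    )
    for pitch in bases_empty
}

for pitch, (dx, dz) in deltas.items():
    print(f"{pitch:<9} PFX {dx:+.1f}  PFZ {dz:+.1f}")
```

The fastball's horizontal change and the curveball's vertical change match the 0.6 and 0.9 inches discussed above.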
The fastball velocity issue has raised eyebrows as well, since it is fairly rare to see a 24-year-old with fluid mechanics and no real prior injury history suddenly drop from 93 mph to 91.7 mph in under three seasons. While the knee-jerk reaction would involve suggesting some kind of injury, the more likely cause is a combination of incorporating more off-speed pitches as well as learning not to max his effort on every single offering. Cain will not finish the season with an ERA this far from his FIP, since one will work like a magnet, drawing the other closer. His performance right now is very real, and his success with runners on seems to stem from much more than just luck-based indicators bound for regression.
The numbers suggest that he has been able to kick the gears into overdrive when a baserunner reaches, digging deep for the extra movement needed to put the batters away and prevent any damage. Luck has certainly been cast in at least a supporting role, but it doesn't have as much of an effect as his abilities. The BABIP marks will regress, as they usually do, but Cain's rate of strikeouts should also revert to his previously established norms. They may not cancel each other out, but the regression highway handles traffic on both sides. Matt Cain may never become a true ace, and his end-of-season statistics may be unrecognizable relative to his present production, but we now have the data available to investigate the roots of these statistical shifts. Let's use it, instead of relying on more obvious data and merely treating that as gospel.