April 27, 2006
Of Crowds and Splits
“I have never wished to cater to the crowd; for what I know they do not approve, and what they approve I do not know.”
--Epicurus
In the fall of 1906 Francis Galton (1822-1911), the British polymath and half-cousin of Charles Darwin, decided to attend a country fair near his home. Galton was a man of many and varied talents--he invented the weather map, a method for classifying fingerprints, and even the silent dog whistle--but among them was a statistical bent, and he had used his skills to try and understand human differences and heredity.
Those of us in the performance analysis community particularly owe him a debt, since he is generally credited with creating the concept of regression and discovering regression to the mean, both of which we often apply with a vengeance. Of course, he was also a big proponent of eugenics, although in his defense, we can note that he cautioned people against the negative forms that became popular in the first half of the twentieth century.
Be that as it may, on that particular day, Galton watched as 800 people (many “non-experts” in Galton’s own words) bought tickets to try and guess the weight of an ox after it had been slaughtered and dressed, in order to win a prize. Being interested in the intelligence of individuals, he asked for the tickets after the event. After tabulating them, he noticed something interesting. The ox had actually weighed 1,198 pounds and while the guesses of the individuals formed a normal distribution, the mean of those guesses was 1,197.
What Galton had discovered, and what Epicurus apparently did not, was what we now call the “wisdom of crowds.” And if this vignette rings a bell, it may be because some readers know that James Surowiecki uses it to open his wonderful 2004 book The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations.
As you might imagine, one of the wonderful byproducts of the information age is the ability to harness the wisdom Galton discovered through the blogosphere and the miracle of email. Well, in response to my column on platoon splits two weeks ago, I received a whole boatload of wisdom (although I would not classify readers as non-experts) and so this week I want to cull that wisdom to take a look at the two most frequently mentioned aspects that were missing from that original article.
Slice and Dice
For those who haven’t read the original article, the thesis was simply this: performance analysts don’t typically put much emphasis on individual platoon splits because the splits are so variable, even across halves of careers. For the majority of players, there is little quantifiable difference between assuming a general platoon advantage for right-handed hitters versus left-handed pitchers (and vice versa) and positing an inherent skill above and beyond that general advantage.
The data set used in that article included all 505 players who garnered 2,000 or more plate appearances from 1970 through 1992 (incidentally, the 1993-1998 and 2005 play-by-play data are now available on the Retrosheet web site).
Although we broke this group of hitters into left- and right-handers and measured both the average platoon split and how it correlated across even and odd years of careers, several readers wondered what the data would look like if it were further sliced. For example, is it the case that platoon splits are larger or smaller for certain kinds of hitters?
To take a first run at that question we can calculate the overall Isolated Power (ISO, defined as slugging percentage minus batting average) for each hitter and then divide the hitters into three groups.
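As a minimal sketch of that grouping step, the Python below computes ISO from career totals and assigns one of three buckets. The `HitterLine` fields and the handling of hitters who land exactly on .100 or .150 are assumptions made for illustration; the article does not say how the boundaries were treated.

```python
from dataclasses import dataclass


@dataclass
class HitterLine:
    """Career totals for one hitter (hypothetical field names)."""
    at_bats: int
    hits: int
    total_bases: int


def isolated_power(line: HitterLine) -> float:
    """ISO = SLG - AVG, which reduces to (total bases - hits) / at-bats."""
    if line.at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return (line.total_bases - line.hits) / line.at_bats


def iso_group(iso: float) -> str:
    """Assign one of the three ISO buckets.

    Assumption: exactly .100 and .150 both fall in the middle bucket.
    """
    if iso < 0.100:
        return "<.1"
    if iso <= 0.150:
        return ".1-.15"
    return ">.15"
```

Because slugging percentage and batting average share the same denominator, ISO can be computed directly from extra bases per at-bat without calculating either rate first.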
                    --------Right--------  --------Left---------  ----Platoon Split----
ISO     Count Bats   AVG   OBP   SLG  OPS   AVG   OBP   SLG  OPS   AVG   OBP   SLG  OPS
<.1       122  R    .253  .300  .325  626  .269  .317  .358  675  .016  .017  .033   49
               L    .276  .328  .364  692  .250  .302  .313  615  .026  .025  .052   77
.1-.15    183  R    .262  .312  .381  693  .280  .332  .423  755  .019  .020  .042   62
               L    .283  .340  .417  757  .258  .312  .354  665  .025  .029  .063   92
>.15      200  R    .261  .326  .437  762  .278  .348  .479  827  .017  .023  .042   65
               L    .277  .351  .470  822  .251  .319  .396  715  .026  .032  .074  106
What this table shows is that as a group, the higher the ISO, the larger the platoon split, at least when it comes to on base percentage and slugging percentage. This can also be shown graphically below, where differences in OPS are indicated by the Y-axis on the left, and split differences by AVG, OBP, and SLG are indicated on the right.
As we move to the right, the bars for right- and left-handed hitters get a little higher. One might theorize that the reason for this is that hitters with more power generally have longer swings. And with a longer swing comes the need to commit earlier and a greater likelihood of being victimized by pitchers of the same hand who feature hard breaking stuff.
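To make the Platoon Split columns of the ISO table concrete, here is a small Python sketch that derives the OPS split from a group's OPS against right- and left-handed pitchers. The function name and signature are illustrative assumptions, not the tooling used for the article; the numbers in the usage lines are the published values for the < .1 ISO group.

```python
def platoon_split(bats: str, ops_vs_right: int, ops_vs_left: int) -> int:
    """Return the OPS points gained when facing opposite-handed pitching.

    bats is "R" or "L"; OPS values are in points (e.g. 626 for .626).
    """
    if bats == "R":
        # Right-handed hitters hold the advantage against left-handers.
        return ops_vs_left - ops_vs_right
    if bats == "L":
        # Left-handed hitters hold the advantage against right-handers.
        return ops_vs_right - ops_vs_left
    raise ValueError("bats must be 'R' or 'L'")


# Published OPS values for the < .1 ISO group
right_handed_split = platoon_split("R", ops_vs_right=626, ops_vs_left=675)
left_handed_split = platoon_split("L", ops_vs_right=692, ops_vs_left=615)
```

Recomputing a split from the rounded values can differ from the published column by a point. For left-handed hitters in the > .15 group, 822 minus 715 gives 107 while the table lists 106, presumably because the published figure was taken from unrounded numbers.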
This is borne out by taking a quick look at the pitchers that gave these hitters the most trouble. The following list includes the fifteen pitchers with the lowest OPS against left-handed hitters in the < .1 ISO group (with more than 250 plate appearances), along with their repertoire as listed in the highly-recommended The Neyer/James Guide to Pitchers.
Pitcher           Throws   PA  OPS  Repertoire
Mike Norris       R       275  443  screwball, fastball, curve, change
Frank Tanana      L       338  524  fastball, hard curve (then slow curve, fastball, change)
Dwight Gooden     R       353  525  fastball, sweeping curve, change
Roger Clemens     R       330  526  rising fastball, forkball, curve, slider
Dave Stieb        R       574  548  slider, sinking fastball, high fastball, curve, change
Jerry Koosman     L       334  552  fastball, curve, change, slider (1974)
Larry Gura        L       302  552  curve, slider, fastball, change
Scott McGregor    L       265  559  slow curve, fastball, change
Dan Petry         R       432  559  fastball, slider, change, curve
Mike Flanagan     L       279  563  slow curve, heavy sinker, fastball, change (1979)
Mike Caldwell     L       258  564  sinker, fastball
Tommy John        L       369  566  sinking fastball, curve, cut fastball
Mark Gubicza      R       294  570  fastball, slider, curve
Steve Carlton     L       352  576  slider, high fastball, sweeping curve
David Cone        R       251  578  fastball, slider, sharp curve
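The three things to notice about this list are that it splits almost evenly between right-handers (seven) and left-handers (eight), that it is populated largely with the better overall pitchers of the period, and that the repertoires vary widely.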
In other words, there isn’t a very discernible pattern.
We can then compare the previous table to one that includes the fifteen pitchers who performed best against lefties in the > .15 ISO group.
Pitcher           Throws   PA  OPS  Repertoire
Bob Lacey         L       253  484  N/A
Joe Sambito       L       276  536  fastball, slider
Andy Hassler      L       521  545  live fastball, hard slider
Fritz Peterson    L       347  552  fastball, curve, slider, screwball (1971)
Willie Hernandez  L       519  559  screwball, fastball
Paul Assenmacher  L       281  560  overhand curve, fastball, slider
Craig Lefferts    L       343  560  fastball, slider, screwball, cut fastball (1989)
John Candelaria   L       573  566  fastball, hard curve, slider, change
Zane Smith        L       317  572  sinking fastball, slider, curve
Paul Splittorff   L       855  578  fastball, slider, curve, change
Jimmy Key         L       627  591  curve, fastball, change, slider
Kelly Downs       R       292  595  split, fastball, curve
Mark Davis        L       308  596  hard curve, fastball
Mitch Williams    L       330  597  fastball, hard slider
Gary Lavelle      L       347  600  fastball, slider
This list is very different. It includes fourteen southpaws and just a single right-hander, and is not necessarily populated with the best overall pitchers of the period. And whereas the repertoires of the first group of pitchers varied, ten of those here featured sliders or hard curves in their top two pitches. Clearly the harder breaking stuff proved to be more difficult for the more powerful hitters.
Incidentally, Neyer/James did not have a listing for Bob Lacey, and I couldn’t find out what he threw before publication of this article. Inquiring minds do want to know, and so I’m sure some A’s fan out there can fill us in.
The other question then is whether or not splits for players with a larger ISO are more predictive. In other words, is a larger ISO indicative of players with more consistent platoon splits?
The answer appears to be 'no.' On balance, the correlations between even and odd years for the highest ISO group are no larger than those for the lowest group, and there is no trend in either direction. Although this seems counterintuitive, it may simply indicate that as a group, power hitters have the same variability in their splits as do other hitters, only their distribution is centered on slightly larger differences.
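For readers who want to try the even/odd-year test themselves, the following Python sketch shows one way to set it up: average each hitter's yearly split over even and odd calendar years, then correlate the two sets of values across hitters. Averaging yearly splits (rather than pooling plate appearances) and the helper names are assumptions for illustration; the article does not spell out how the splits were aggregated.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def even_odd_splits(seasons: Dict[int, float]) -> Tuple[float, float]:
    """Average one hitter's yearly platoon split over even and odd years.

    seasons maps calendar year -> platoon split (e.g. OPS points).
    """
    even = [split for year, split in seasons.items() if year % 2 == 0]
    odd = [split for year, split in seasons.items() if year % 2 == 1]
    if not even or not odd:
        raise ValueError("need at least one even and one odd season")
    return sum(even) / len(even), sum(odd) / len(odd)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

Running `even_odd_splits` on every hitter in an ISO group and passing the resulting even-year and odd-year lists to `pearson_r` yields one correlation per group. A value near zero would suggest that a hitter's split in one half of his career says little about the other half.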
Last year, David Smith of Retrosheet fame reprised for The Baseball Research Journal a study he originally did in 1996 (warning: pdf) that examined the question of whether hitters learn during a game. His conclusion was that, with a few caveats, indeed they do.
This same question repeated itself many times, as readers asked whether platoon splits actually shrink over time. In other words, over the course of their careers, do hitters learn to hit better against same-side pitchers, and therefore reduce the opposing manager’s advantage when using his pen?
To look at this, we can take all of the hitters in the original study and focus in on only those who played in the majors for 15 consecutive years. That leaves us 104 players, 40 left-handed hitters and 64 right-handed.
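The 15-consecutive-year filter can be sketched in Python as below. The function names and the idea of representing a career as a list of season years are assumptions for illustration; the article does not describe how a qualifying season was defined.

```python
from typing import Iterable


def longest_consecutive_run(years: Iterable[int]) -> int:
    """Length of the longest run of back-to-back seasons in years."""
    seasons = set(years)
    best = 0
    for year in seasons:
        # Only start counting at the first season of a run.
        if year - 1 not in seasons:
            length = 1
            while year + length in seasons:
                length += 1
            best = max(best, length)
    return best


def has_consecutive_seasons(years: Iterable[int], required: int = 15) -> bool:
    """True if the hitter appeared in `required` or more seasons in a row."""
    return longest_consecutive_run(years) >= required
```

For example, a hitter with seasons in every year from 1970 through 1984 qualifies, while one who missed 1977 in that same span does not, since his longest run is only seven seasons.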
We can then graph their performance over the course of these 15 years against both right and left-handed pitchers for each type of hitter.
By glancing at the graphs, you can see that split differences for both right- and left-handed hitters don’t really change much throughout the period. The gap between the red and navy blue, the pink and green and the aqua and purple lines remains comparatively constant, indicating that there appears to be little "learning" going on relative to performance against their strong side. For right-handed hitters, the gap remains roughly 18 points of batting average, 20 points of on-base percentage, and 38 points of slugging percentage, while for lefties the gaps are 26, 32, and 61 points respectively. However, there are two trends that are discernible.
First, the performance of right-handed hitters against right-handed pitchers declines precipitously around year 13, while performance against left-handed pitchers does so more slowly, thereby widening the gap a bit. For left-handed hitters this trend is not present. As a result, older right-handed hitters are probably relatively more valuable in a platoon role than are older left-handed hitters.
Second, left-handed hitters' performances against left-handed pitchers basically peak by year five, and then hold steady as left-handers continue to improve against the opposite side through year nine, widening the gap just a bit. By contrast, right-handed hitters seem to improve against right-handed pitchers (as they do against lefties) more steadily through year eight before they begin to decline.
Focusing only on 15-year veterans gives us a long period of time to examine, but it cuts down the sample size and ensures that we’re looking only at a group of generally pretty good ballplayers. It certainly could be the case that players with lesser skills have a different progression, although it seems unlikely. In any event, it's a test that will have to wait for another day.
Like Galton, I’m always impressed with the wisdom of crowds, and especially the insightful wisdom of BP readers. Keep it coming, and feel free to share your thoughts, opinions, complaints, and insights any time.