"I'm reminded a bit of the principle of superposition–each player in the game produces a contribution that has an effect on the probability of winning, somewhat analogous to a wave function. Add up these "wave functions" for each team, and you get a result that expresses how likely the team is to win with these particular sets of contributions, yet at this point it's still unknown whether the team actually wins (much like the fate of Schrödinger's cat inside the box). However, the wave function only collapses to the actual result when the game is played (or the box containing the cat is opened)."
–Keith Woolner, “Aim for the Head” October 24, 2001

When Woolner wrote that over four years ago, this column wasn’t even a twinkle in BP’s collective eye. I do love the analogy, though, and before moving on to this week’s topic–which connects to Woolner’s quote–I wanted to take a minute to explain the title of this column and how, other than the play on words, it relates to baseball. (If quantum physics doesn't interest you, skip ahead to “Win Expectancy 101” for the baseball part of the article.)

Erwin Schrödinger was an Austrian physicist who, in 1926, formulated the fundamental equation of quantum mechanics. His equation described a world where properties of a particle (such as the location of an electron) at a specified time can be pinpointed only probabilistically. In other words, the particle may have a greater chance of being in one place than in another, but its location is described by a wave where the peak represents the position with the greatest probability. The quantum world appeared to be a fuzzy one governed by probability, unlike the clockwork deterministic world of our everyday experience.

By 1935, many physicists, including Niels Bohr (although famously not Albert Einstein), had interpreted this waveform equation to mean that particles do not in fact possess specified properties (such as location) before measurements are taken; they are in a spread out and fuzzy state of superposition (literally beyond position) until a measurement is taken and causes the waveform to “collapse” to a particular value.

As a response to this view, Schrödinger devised a thought experiment that came to be known as “Schrödinger’s Cat”; he believed it showed that Bohr’s interpretation of quantum theory was, at the very least, incomplete.

In short, the experiment involves a cat in a box with a trigger that releases poison. That trigger is tied to a device that measures a property of a particle. According to the waveform equation there is a 50% chance that the particle will be in state A and a 50% chance of it being in state B. If the device measures it in state A the poison is released and the cat dies.

The core question for Schrödinger was simply this: if the particle’s property is not determined until it is measured, under Bohr’s interpretation isn’t the cat–through its connection with the particle via the device–also left in an undetermined state and therefore neither dead nor alive until the box is opened? Bohr’s interpretation didn’t really answer the question since it didn’t define any rules about the nature of measurement and observation.

To make a long story short (one told in wonderfully accessible prose by Brian Greene in The Fabric of the Cosmos), the questions raised in this thought experiment baffled physicists for years but have now been mostly resolved by applying a concept known as decoherence. That concept holds that long before the box is opened, the influence of the environment (from photons to air molecules and other particles) has nudged the wave function into taking on a specific value, meaning that the cat is in fact really dead or alive and not caught in some state of limbo.

What I like about this episode in the history of science is that Schrödinger devised a clever experiment used to test a common perception in his own field of quantum mechanics. That experiment made people think deeply about what they knew or thought they knew about the nature of reality itself. And while I’m not pretending that baseball has anything profound to say about such matters (it is, after all, just entertainment), I do hope that through this column, at least now and then, we can devise clever experiments that put to the test both conventional and sabermetric wisdom and help us think more deeply about our shared distraction.

Before moving on I should also mention that reader John MacKenzie noted that he’s been using the moniker Schrödinger’s Bat (with a different spelling) for his fantasy league team for several years. We were of course unaware of that usage, so please don’t give John a hard time thinking that he lifted it from us.

And now on to your regularly scheduled programming…

Win Expectancy 101

The concept of Win Expectancy (or Win Probability Added) is now an old one in performance analysis circles. Simply put, Win Expectancy is the probability of winning a game given the inning, score, and base/out situation. Using the Expected Win Matrix here at BP you can see, for example, that in 2005 when the visiting team was behind by a run in the top of the 6th inning with runners on first and third and nobody out, their probability of winning was exactly 50%.

Changes in that probability throughout a game can be tracked and then applied to a host of questions both strategic (when to sacrifice, when to steal, when to issue an intentional walk, when to bring in a reliever) and reflective (who contributed most or least to increasing their team’s chances of winning in 2005–their aggregate contribution to the waveform function for each game in which they played, to use Woolner’s analogy).
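Crediting players with those changes amounts to simple bookkeeping. The sketch below is only an illustration: the tuple layout and the sample numbers are invented here, and in practice the before-and-after probabilities would come from a win expectancy table such as the matrix mentioned above.

```python
from collections import defaultdict


def sum_win_expectancy_changes(plate_appearances):
    """Credit each batter with the change in his team's win expectancy.

    `plate_appearances` is an iterable of (batter, we_before, we_after)
    tuples, where the two probabilities are the batting team's chance of
    winning before and after the plate appearance (hypothetical layout).
    """
    totals = defaultdict(float)
    for batter, we_before, we_after in plate_appearances:
        totals[batter] += we_after - we_before
    return dict(totals)


# Made-up plate appearances, for illustration only
sample = [
    ("Batter A", 0.50, 0.62),
    ("Batter B", 0.62, 0.55),
    ("Batter A", 0.48, 0.45),
]
totals = sum_win_expectancy_changes(sample)
for batter, total in sorted(totals.items()):
    print(batter, round(total, 3))
```

A positive total means the hitter's plate appearances raised his team's chances on balance, and a negative total means they lowered them.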

Those readers who’ve treated themselves to The Numbers Game by Alan Schwarz or Curve Ball: Baseball, Statistics, and the Role of Chance in the Game by Jim Albert and Jay Bennett know all about the Mills brothers and their computation of “Player Win Averages” (PWA) for the 1969 season, published in their 1970 book Player Win Averages: A Computer Guide to Winning Baseball Players. There they devised a system where changes in win expectancy were assigned to players and multiplied by a point system to compute Win and Loss points. The ratio of those became the PWA, their goal being to formulate a statistic like batting average that could be used to discover clutch performers.

But simply because it’s a topic with some legs doesn’t mean there aren’t new applications and refinements that can be made. Woolner himself contributed to this endeavor through the publication of the Win Expectancy Framework (WX), first discussed in the 2005 Baseball Prospectus and again in the 2006 version as well as in Baseball Between the Numbers, where it is applied to topics ranging from relief pitching to stolen bases.

For those unfamiliar, the framework allows for the computation of the probability of winning a game given the current inning, score, base/out state, run environment (both home and visiting teams), and run differential. It does so by calculating all the permutations of possible outcomes from that point forward to determine the probability of each team winning.
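To give a feel for what enumerating outcomes looks like, here is a deliberately stripped-down sketch. It is not Woolner's framework: it ignores the base/out state entirely, starts only at the beginning of an inning, and assumes Poisson scoring in each half inning (an assumption made purely for convenience). The function names and parameters are hypothetical.

```python
from functools import lru_cache
from math import exp, factorial


def half_inning_runs(runs_per_game, max_runs=15):
    """Probability of scoring 0..max_runs runs in one half inning.

    ASSUMPTION: Poisson scoring with mean runs_per_game / 9. Real scoring
    is lumpier than this; the distribution is only a stand-in.
    """
    mean = runs_per_game / 9
    probs = [exp(-mean) * mean**k / factorial(k) for k in range(max_runs)]
    probs.append(max(0.0, 1.0 - sum(probs)))  # lump the tail into max_runs
    return probs


def home_win_probability(lead, innings_left, home_rpg, away_rpg):
    """Chance the home team wins, given its lead (negative if trailing)
    with `innings_left` full innings still to play."""
    home = half_inning_runs(home_rpg)
    away = half_inning_runs(away_rpg)

    # One extra inning: who outscores whom. A tied inning is simply replayed.
    home_more = sum(h * a for i, h in enumerate(home)
                    for j, a in enumerate(away) if i > j)
    away_more = sum(h * a for i, h in enumerate(home)
                    for j, a in enumerate(away) if i < j)
    extra_innings = home_more / (home_more + away_more)

    @lru_cache(maxsize=None)
    def prob(lead, innings_left):
        if innings_left == 0:
            if lead != 0:
                return 1.0 if lead > 0 else 0.0
            return extra_innings
        # Enumerate every combination of runs scored in the next inning.
        return sum(
            p_away * p_home * prob(lead - a + h, innings_left - 1)
            for a, p_away in enumerate(away)
            for h, p_home in enumerate(home)
        )

    return prob(lead, innings_left)


# Home team up by one with three innings left, in a low and a high run environment
low = home_win_probability(1, 3, 3.4, 3.4)
high = home_win_probability(1, 3, 5.7, 5.7)
print(round(low, 3), round(high, 3))
```

Even this toy version takes each team's run environment as an input, which is the kind of flexibility an empirical table averaged over all teams cannot offer.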

The key difference between the framework and matrices such as the one referenced previously is that the probabilities produced are theoretical and the situations from which they derive needn’t have occurred in real life. This has a twofold advantage:


  1. It allows the framework to be more flexible by considering parameters such as the offensive environment of each team instead of averaging across all teams.
  2. It eliminates the problem of small sample size, where a particular situation that occurred only a handful of times–or not at all–results in probabilities that are counterintuitive.

For example, in the scenario described above, the visiting team had a 50% chance of winning as revealed in the table. However, in the intuitively less favorable situation where the visitors had a runner only on first, their probability of winning in 2005 was 52.4%. Because WX derives its probabilities theoretically rather than from a count of how often each situation happened to occur, it avoids such anomalies. From that perspective, WX is more similar to the approach used by the Mills brothers in computing PWA, where they used computer simulation to derive the probabilities.

Leveling the Playing Field?

In any case, in his 2006 Baseball Prospectus article “Adventures in Win Expectancy” Woolner applied WX to hitter seasons using play-by-play data extending from 1960 through 2005. In other words, he calculated and then summed the change in win expectancy across all plate appearances for each hitter using the WX framework to produce a kind of “number of wins above average” contributed by each player. The results were then shown in two tables that reveal the 15 highest and lowest seasonal Batting WX and the 20 highest and lowest career Batting WX for the time period. The two tables below show the top and bottom five for each.

Seasonal Batting WX
Year   Name            PA      WX
2004   Barry Bonds    617   12.07
2001   Barry Bonds    664   11.71
2002   Barry Bonds    612   10.45
1969   Willie McCovey 623   10.02
1998   Mark McGwire   681    9.65


2003   Royce Clayton  543   -4.28
1970   Larry Bowa     577   -4.29
1968   Hal Lanier     518   -4.45
1997   Gary DiSarcina 583   -4.76
2002   Neifi Perez    585   -6.69

Career Batting WX
Name                 WX
Barry Bonds      115.71
Willie McCovey    74.11
Hank Aaron        71.15
Willie Mays       63.41
Frank Robinson    63.04


Tim Foli         -24.61
Doug Flynn       -25.69
Royce Clayton    -27.29
Alfredo Griffin  -28.90
Larry Bowa       -31.50

Obviously the gap between Barry Bonds and the rest of the pack is wide, partly because of Bonds’ late-career and unprecedented 1999-2004 performance illustrated in the first table, but also because Willie Mays, Hank Aaron, and Frank Robinson all played significant portions of their careers prior to 1960: Mays' rookie season was 1951, Aaron's 1954, and Robinson's 1956. And what about Babe Ruth, Ted Williams, Ty Cobb and the rest? How do they compare to Bonds?

Fortunately, we can augment these lists to stretch back through time by applying a formula and a table of slopes and intercepts Woolner provided to estimate the win value of an offensive event given any offensive environment.
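As a rough sketch of what applying such a table looks like, the snippet below assumes the formula is a straight line in runs per game, which is what a table of slopes and intercepts suggests but is an assumption on this page. The coefficients are placeholders invented for illustration and are not the values from Woolner's table.

```python
# PLACEHOLDER coefficients, invented for illustration; they are NOT the
# values from Woolner's table. Each entry is (intercept, slope).
PLACEHOLDER_COEFFICIENTS = {
    "1B": (0.060, -0.003),
    "HR": (0.200, -0.012),
}


def win_value(event, runs_per_game, coefficients=PLACEHOLDER_COEFFICIENTS):
    """Estimated wins added by one event: intercept + slope * runs per game.

    ASSUMPTION: the relationship is a straight line in runs per game.
    """
    intercept, slope = coefficients[event]
    return intercept + slope * runs_per_game


for rpg in (3.32, 5.68):  # NL runs per game in 1908 and 1930
    print(rpg, round(win_value("1B", rpg), 4), round(win_value("HR", rpg), 4))
```

With negative slopes, as in these placeholders, each event is worth more wins when fewer runs are being scored.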

First, by applying the formula to the league run environment over time for the National League we can produce the following two graphs:


chart 1


chart 2

There is an interesting aspect of the first graph, as noted by Woolner. In eras of higher run scoring–such as when the average NL team scored 7.36 runs per game in 1894, 5.68 runs per game in 1930, and around 5.00 runs per game in 1999-2000–each offensive event contributes less to a win than in lower run scoring environments such as 1908 (with 3.32 runs per game), and 1968 (at 3.42).

In other words, contrary to the notion that home runs during the dead-ball era weren’t as important as small ball tactics, they were in fact even more important, since each extra-base hit–especially one that plates a run–has a larger relative impact on winning the game. Looking closer, you’ll notice that the more bases an event gains, the more its relative value rises as the run environment falls. So in 1908 a home run is worth 3.14 times as much as a single, while in 1930 it’s worth exactly three times as much.

It is in that context that the following quote from the supreme hitter of the dead-ball era is relevant:


"If I had set out to be a homerun hitter, I am confident in a good season I would have made between twenty and thirty homers…I would naturally have sacrificed place hitting, which, to my way of thinking, is the supreme pinnacle of batting art."
–Ty Cobb, as quoted in F.C. Lane’s Batting

If Cobb could indeed have hit 25 home runs a season in the days before 1920, as he is also purported to have contended in the oft-recited anecdote in which he hit three home runs to prove the point, then he would have been well served to do so.

In the second graph the values of various kinds of outs are shown, and what is revealing is that the win values of strikeouts and other kinds of outs don’t change very much over time. The graph also shows how much more costly getting caught stealing is than other kinds of outs and that the cost of a caught stealing fluctuates more with the run environment. In higher scoring eras getting thrown out doesn’t cost as much as in lower run scoring eras, since when runs are scarce and runners are hard to come by, losing a baserunner has a larger relative impact on winning or losing. The long and short of it, as illustrated by Woolner in the original article, discussed by James Click in Baseball Between the Numbers, and shown in the following graph, is that you have to steal successfully at a higher rate in low run scoring environments than you do when runs are more plentiful.


chart 3

Attentive readers will note that the break-even percentages shown here vary somewhat and are lower than those shown in the original article. The reason is that these are based on the overall win expectancies calculated using Woolner’s formula and not on specific situations in various run environments.
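For anyone who wants to see the arithmetic, a break-even rate falls out of the two win values directly: it is the success rate at which the wins gained by stealing exactly offset the wins lost by getting caught. The function below is a minimal sketch, and the win values fed to it are made up for illustration rather than taken from the graphs.

```python
def break_even_rate(steal_gain, caught_cost):
    """Success rate at which stolen base attempts neither add nor cost wins.

    steal_gain:  win value of a successful steal (positive)
    caught_cost: win value of a caught stealing (negative)
    Solves p * steal_gain + (1 - p) * caught_cost = 0 for p.
    """
    if steal_gain <= 0 or caught_cost >= 0:
        raise ValueError("expected a positive gain and a negative cost")
    return -caught_cost / (steal_gain - caught_cost)


# Made-up win values: the same gain, but a caught stealing hurts more
# when runs are scarce.
runs_plentiful = break_even_rate(0.020, -0.040)
runs_scarce = break_even_rate(0.020, -0.050)
print(round(runs_plentiful, 3), round(runs_scarce, 3))
```

The bigger the cost of a caught stealing relative to the gain from a steal, the higher the success rate a runner needs just to break even.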

So by joining lower win values for offensive events in higher run scoring environments with win values for most outs that stay very similar even in lower run scoring environments, you get something rather counterintuitive. But both of those observations have their roots in the fact that the basic structure of the game hasn’t changed much. Despite styles of play that come in and out of vogue, you still get just three outs per inning and 27 outs per regulation game, and a home run has always been the most efficient way to score runs.

So let’s apply the formula to individual batter seasons and adjust for both the league run environment and the ballpark using three-year park factors. After all, just as an extra-base hit increases the win probability in the low run scoring environment of 1968 more than it does in 2001, it does so to a greater degree at Dodger Stadium in 2001 than it does at Coors Field.
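In outline, the season calculation is a weighted sum: count each kind of event, look up its win value in the park-adjusted run environment, and add everything up. The sketch below assumes the park factor is a simple multiplier on league runs per game with 1.0 as neutral, which may not be exactly how the adjustment was made, and it uses a made-up stand-in for the win value formula.

```python
def season_wx1(event_counts, league_rpg, park_factor, win_value):
    """Sum a batter's event counts times the win value of each event.

    ASSUMPTIONS: park_factor is a run multiplier where 1.0 is neutral, and
    it simply scales the league's runs per game. win_value is any function
    (event, runs_per_game) -> wins.
    """
    run_environment = league_rpg * park_factor
    return sum(count * win_value(event, run_environment)
               for event, count in event_counts.items())


def stand_in_win_value(event, runs_per_game):
    """Made-up straight lines standing in for the real formula."""
    intercept, slope = {"1B": (0.060, -0.003),
                        "HR": (0.200, -0.012),
                        "OUT": (-0.024, 0.0)}[event]
    return intercept + slope * runs_per_game


line = {"1B": 100, "HR": 30, "OUT": 400}  # hypothetical season
pitchers_park = season_wx1(line, 4.0, 0.9, stand_in_win_value)
hitters_park = season_wx1(line, 4.0, 1.1, stand_in_win_value)
print(round(pitchers_park, 3), round(hitters_park, 3))
```

The same batting line earns more credit in the park that suppresses scoring, which is the Dodger Stadium versus Coors Field effect described above.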

After making the calculation for 83,733 player seasons (starting in 1876 in the NL and 1901 in the AL) we find the following top and bottom 15 seasons. Note that there are two tables of the bottom performers, since the bottom performers were dominated by pre-1900 players.

Year  Tm  Name                     PA     WX1
2001  SFN Barry Bonds             664   11.59
2002  SFN Barry Bonds             612   10.85
1923  NYA Babe Ruth               699    9.83
1921  NYA Babe Ruth               693    9.61
1920  NYA Babe Ruth               615    9.44
1927  NYA Babe Ruth               691    9.17
1926  NYA Babe Ruth               652    9.10
1927  NYA Lou Gehrig              717    9.09
1941  BOS Ted Williams            606    8.94
1946  BOS Ted Williams            672    8.94
2004  SFN Barry Bonds             617    8.79
1957  NYA Mickey Mantle           623    8.74
1917  DET Ty Cobb                 669    8.66
1924  SLN Rogers Hornsby          640    8.55
1942  BOS Ted Williams            671    8.49

Year  Tm  Name                     PA     WX1
1894  CHN Jiggs Parrott           536   -6.02
1933  SLA Jim Levey               567   -5.84
1886  KCN Jim Lillie              427   -5.52
1893  SLN Joe Quinn               584   -5.36
1894  NY1 John Ward               575   -5.24
1894  CL4 Chippy McGarr           554   -5.07
1885  NY1 Joe Gerhardt            423   -5.05
1895  PHI Jack Boyle              625   -5.03
2002  KCA Neifi Perez             585   -5.03
1890  CL4 Bob Gilks               582   -5.02
1884  BFN Jim Lillie              476   -4.81
1891  CIN Germany Smith           551   -4.81
1890  BRO Germany Smith           526   -4.67
1879  CN1 Will White              300   -4.64
1892  BSN Joe Quinn               574   -4.64

Post 1900 Only
Year  Tm  Name                     PA     WX1
1933  SLA Jim Levey               567   -5.84
2002  KCA Neifi Perez             585   -5.03
1933  SLA Art Scharein            522   -4.62
1953  SLA Billy Hunter            604   -4.54
1934  SLA Ski Melillo             589   -4.53
1909  BRO Bill Bergen             372   -4.48
1999  COL Neifi Perez             732   -4.47
1932  SLA Ski Melillo             659   -4.45
1931  SLA Jim Levey               540   -4.42
1936  PHA Skeeter Newsome         508   -4.33
1937  CHA Jackie Hayes            631   -4.30
1977  OAK Rob Picciolo            446   -4.09
1902  CLE John Gochnauer          506   -4.08
1970  CIN Tommy Helms             605   -4.07
2000  COL Neifi Perez             699   -4.07

What stands out of course is that the WX values for Bonds from the tables shown previously don’t match the WX1 values in the first table here. The reason is that the formula applied to calculate these values is more of an approximation and doesn’t put into complete context each individual plate appearance. As a result you would expect to see more variability when play-by-play data is used since a player may find himself more or less frequently used in highly leveraged situations through both chance and managerial decision.

In other words, the price we pay for being able to reach back before play-by-play data was available is a loss in precision. However, given that clutch hitting ability–if it exists at all–is likely quite small, some might argue that WX1 has the advantage of removing the effect of randomness and in that way actually provides a more “pure” technique for comparison.

Bonds’ 2001 and 2002 seasons still come out on top, but Ruth makes his mark with five consecutive entries on the list which is rounded out by appearances by Lou Gehrig, Ted Williams, Mickey Mantle, Ty Cobb, and Rogers Hornsby. This is pretty much what you’d expect from similar lists that look at Equivalent Runs (EqR), or park adjusted Runs Created (RC) or BaseRuns (BsR). The top active player is Albert Pujols whose 2003 season came in 35th with a WX1 of 7.47.

Cubs fans will no doubt be disheartened to see Neifi Perez grab three of the bottom 15 slots since 1900.

We can then sum the WX1 values for entire careers and provide the following top and bottom 20 career performers with the bottom performers list being duplicated once again for post 1900.

Name                 PA       WX1
Babe Ruth          10616    117.37
Barry Bonds        11636    108.73
Ty Cobb            13072    105.14
Ted Williams        9791     98.37
Hank Aaron         13940     90.41
Stan Musial        12712     87.85
Willie Mays        12493     85.29
Mickey Mantle       9909     82.43
Lou Gehrig          9660     79.31
Rogers Hornsby      9475     78.44
Tris Speaker       11988     77.15
Frank Robinson     11743     73.09
Mel Ott            11337     72.06
Honus Wagner       11739     65.08
Eddie Collins      12037     65.03
Rickey Henderson   13346     58.73
Jimmie Foxx         9670     57.57
Jeff Bagwell        9431     56.65
Joe Morgan         11329     55.48
Frank Thomas        8602     53.22

Name                 PA       WX1
Tommy Corcoran      8275    -41.25
Joe Quinn           6341    -38.43
Germany Smith       4652    -34.46
Alfredo Griffin     7330    -34.24
John Ward           7470    -34.14
Bobby Lowe          7741    -33.95
Bill Bergen         3228    -33.22
Kid Gleason         8198    -32.70
Malachi Kittridg    4446    -32.18
Ozzie Guillen       7133    -31.80
Bones Ely           5000    -30.73
Davy Force          3081    -30.04
Fred Pfeffer        6563    -29.65
Ed Brinkman         6640    -29.40
Don Kessinger       8529    -29.13
Ski Melillo         5536    -29.03
Herman Long         7845    -28.97
Everett Scott       6373    -28.77
Larry Bowa          9103    -28.64
Tim Foli            6573    -28.49

Post 1900 Only
Name                 PA       WX1
Alfredo Griffin     7330    -34.24
Bill Bergen         3228    -33.22
Ozzie Guillen       7133    -31.80
Ed Brinkman         6640    -29.40
Don Kessinger       8529    -29.13
Ski Melillo         5536    -29.03
Everett Scott       6373    -28.77
Larry Bowa          9103    -28.64
Tim Foli            6573    -28.49
George McBride      6235    -27.02
Tommy Thevenow      4484    -26.57
Neifi Perez         5123    -25.53
Aurelio Rodrigue    7078    -25.16
Hal Lanier          3940    -24.77
Leo Durocher        5827    -24.60
Mark Belanger       6602    -24.50
Luke Sewell         6041    -23.92
Roy McMillan        7653    -23.74
Wally Gerber        5816    -23.18
Rabbit Warstler     4611    -22.97

You’ll notice that the total WX1 here for Bonds is about seven wins less than in the table shown earlier, while Ruth overtakes him at 117.37. Of course, Ruth’s contribution to winning here does not include his pitching performance, which would further distance him from Bonds. Nor do these values include fielding, which would help Bonds close the gap a bit.

Aaron and Mays also add 19 and 22 wins, respectively, by including their entire careers; Ty Cobb comes out very well, and both Ted Williams and Stan Musial round out the top seven. Jeff Bagwell and Frank Thomas, two underrated players of the modern era, also make the top 20.

Perhaps the most interesting thing about the top performers list is that Willie McCovey–second only to Bonds with 74.11 in Woolner’s original table–comes in 24th at 50.10 in WX1. His 1969 season that was rated at 10.02 in WX comes out to 7.00 in WX1. The most probable explanations: McCovey happened to have more highly leveraged plate appearances over the course of his career than would have been expected, he happened to hit well in the highly leveraged opportunities he had, he was one of the few true clutch hitters, or a combination of all three.

Clearing the Bases

To wrap up, there are also a couple of issues I wanted to address from last week’s column regarding platoon splits.

I mentioned last week that The Book notes that right-handers need about 2,000 plate appearances against lefties before their measured platoon split can be considered reliable. I received several comments on this to the effect that since 2,000 plate appearances is the equivalent of 10 to 12 years of playing time, that seems like an awfully long time to wait before you can say anything about a player’s split.

I agree. The point is not that you can’t know anything about the player’s split ability in fewer plate appearances. The point is that if you had only two pieces of information–a hitter's platoon split and the average split for right-handed batters–and you had to choose which was more accurate, you would choose the average split.

That doesn't mean that you couldn't get a better estimate by regressing the player's split to the mean using a weighted value, which the authors also discuss. So you certainly don't need to ignore the measured platoon split of players like Wily Mo Pena or Eduardo Perez. However, in the case of a player like Perez, who has just over 300 career plate appearances against southpaws, your best estimate of his true platoon split would be pretty heavily regressed to the mean.
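A weighted estimate of that kind can be sketched in a few lines. The version below treats the roughly 2,000 plate appearances as the point where the measured split and the average split deserve equal weight, which is an assumption about how to turn that threshold into a weight rather than the authors' exact method, and the split values are invented for illustration.

```python
def regressed_split(observed_split, plate_appearances, average_split,
                    regression_pa=2000):
    """Weighted blend of a hitter's measured platoon split and the average.

    ASSUMPTION: regression_pa is the number of plate appearances at which
    the measured split and the average split get equal weight.
    """
    weight = plate_appearances / (plate_appearances + regression_pa)
    return weight * observed_split + (1 - weight) * average_split


# Invented numbers: a .060 measured split in 300 PA against a .020 average
estimate = regressed_split(0.060, 300, 0.020)
print(round(estimate, 4))
```

With only 300 plate appearances the measured split gets about 13% of the weight, so the estimate lands much closer to the average than to the hitter's own number.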

Second, because in this case the statistical threshold is so high, teams can and do combine both scouting information and statistical data to make predictions about future performance. So Epstein’s comments about Pena’s ability to perhaps contribute immediately because of his platoon split hopefully also reflect the team’s scouting of his swing mechanics and pitch recognition, among other attributes.

And finally, I’d like to thank all the regular BP readers who have so kindly welcomed me into the fold. Your support is appreciated and your feedback encouraged anytime.