Last week, we talked a bit about measuring the uncertainty in our estimates of offense. I hinted at having a few additional ideas on quantifying the uncertainty involved. Let’s examine two different routes we could take, both of which would offer less uncertainty than what we quantified last week.
When we did our estimates of uncertainty last week, we compared the linear weights value of an event to the actual change in run expectancy, given the base-out states before and after the event. What we can do instead is prepare linear weights values by base-out state and find the standard error of those instead. Looking at official events:
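| Event | LWTS | STDERR |
|---|---|---|
| Out | 0.246 | 0.184 |
| 2B | 0.724 | 0.179 |
| 1B | 0.443 | 0.164 |
| 3B | 1.010 | 0.065 |
| K | 0.261 | 0.021 |
| NIBB | 0.296 | 0.000 |
| (unlabeled) | 1.398 | 0.000 |
| (unlabeled) | 0.315 | 0.000 |
| (unlabeled) | 0.174 | 0.000 |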
The high value of the out is somewhat misleading: it includes things like reaching on error, which we separate out in our current linear weights implementation. But here, the source of error comes from the potential differences in baserunner advancement. It makes a certain kind of sense. An Adam Dunn double and a Juan Pierre double present different opportunities for a runner at first to advance, for instance. (A Juan Pierre double is probably closer to an Adam Dunn single, and an Adam Dunn double is probably a good chance at a triple for Juan Pierre.)
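For readers who want to see the mechanics, here is a minimal sketch of one way to get an error figure like the ones in the table. It assumes the error is the root-mean-square gap between each play's actual change in run expectancy and the average value of that event in that base-out state. The function name, the tuple layout, and the toy numbers are all hypothetical, and the exact formula behind the table may differ.

```python
from collections import defaultdict
from math import sqrt


def state_level_error(plays):
    """Return {event: error} for a list of (event, state, run_value) tuples.

    ASSUMPTION: "error" here is the root-mean-square difference between a
    play's run value and the mean run value of the same event in the same
    base-out state. This is an illustrative reading, not a confirmed formula.
    """
    # Collect run values for every (event, base-out state) cell.
    by_cell = defaultdict(list)
    for event, state, run_value in plays:
        by_cell[(event, state)].append(run_value)

    # State-specific linear weight: the mean run value within each cell.
    cell_mean = {cell: sum(vals) / len(vals) for cell, vals in by_cell.items()}

    # Accumulate squared residuals per event across all states.
    squared = defaultdict(float)
    count = defaultdict(int)
    for event, state, run_value in plays:
        squared[event] += (run_value - cell_mean[(event, state)]) ** 2
        count[event] += 1

    return {event: sqrt(squared[event] / count[event]) for event in squared}


# Toy, made-up plays: a walk always does the same thing in a given state,
# while a double's value depends on how far the runner on first advances.
toy_plays = [
    ("NIBB", "100-0", 0.58),
    ("NIBB", "100-0", 0.58),
    ("2B", "100-0", 1.0),
    ("2B", "100-0", 1.6),
]
errors = state_level_error(toy_plays)
```

In the toy data the walk comes out with zero error and the double does not, which is the same pattern the table shows: events with no baserunning discretion have nothing left to vary once you know the base-out state.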
So now you’ve reduced your estimated error without changing your run estimates! Congratulations. The downside is that you’re now measuring your error against something that I suspect most people have a hard time understanding. You’re getting pretty far into the weeds of hypothetical runs, rather than measuring against a good proxy for actual runs, like what we did last week.
Another thing we can do is look at the change in run expectancy for each event. This isn’t a particularly new idea (Gary R. Skoog came up with it in 1987, calling it the Value Added approach[i]), although it hasn’t been especially popular because of the play-by-play data needed to compute it. Let’s pull up the same run expectancy chart we used last week:
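| Bases | 0 outs | 1 out | 2 outs |
|---|---|---|---|
| 000 | 0.489 | 0.263 | 0.101 |
| 100 | 0.858 | 0.512 | 0.221 |
| 020 | 1.073 | 0.655 | 0.319 |
| 003 | 1.308 | 0.898 | 0.363 |
| 120 | 1.442 | 0.904 | 0.439 |
| 103 | 1.677 | 1.146 | 0.484 |
| 023 | 1.893 | 1.290 | 0.581 |
| 123 | 2.262 | 1.538 | 0.702 |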
To get the Value Added for a plate appearance, you take the runs scored on the play, add the ending run expectancy, and subtract the starting run expectancy. So a bases loaded home run with no outs would have an ending run expectancy of .489, plus runs scored of 4, minus 2.262: a Value Added of 2.227. A home run with the bases empty and two outs has the same run expectancy at the beginning and the end, so you end up with a Value Added of only 1.
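The arithmetic above is simple enough to sketch in a few lines of Python. The run expectancy values come straight from the chart; the function name and argument layout are hypothetical, and treating a play that ends the inning as having an ending run expectancy of zero is an assumption on my part rather than something spelled out above.

```python
# Run expectancy by bases occupied (rows) and outs (tuple index 0, 1, 2),
# copied from the chart above.
RUN_EXPECTANCY = {
    "000": (0.489, 0.263, 0.101),
    "100": (0.858, 0.512, 0.221),
    "020": (1.073, 0.655, 0.319),
    "003": (1.308, 0.898, 0.363),
    "120": (1.442, 0.904, 0.439),
    "103": (1.677, 1.146, 0.484),
    "023": (1.893, 1.290, 0.581),
    "123": (2.262, 1.538, 0.702),
}


def value_added(start_bases, start_outs, end_bases, end_outs, runs_scored):
    """Runs scored on the play, plus ending RE, minus starting RE."""
    start_re = RUN_EXPECTANCY[start_bases][start_outs]
    # ASSUMPTION: once the third out is made, no more runs are expected.
    end_re = 0.0 if end_outs >= 3 else RUN_EXPECTANCY[end_bases][end_outs]
    return runs_scored + end_re - start_re


# The two home runs from the text.
grand_slam = value_added("123", 0, "000", 0, runs_scored=4)
two_out_solo = value_added("000", 2, "000", 2, runs_scored=1)
print(round(grand_slam, 3), round(two_out_solo, 3))
```

The same function handles outs: a play that moves the state from bases empty, no outs to bases empty, one out with no runs scored is worth 0.263 minus 0.489.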
This approach, needless to say, does a much better job of reconciling with actual runs scored than the linear weights approach. It also comes closer to measuring performance “in the clutch,” although it ignores inning and run differential (we’ll talk about that at a later date). So why not use it instead of linear weights? The data needed to power it is readily available now, at least for the modern era, as is the computing power required to accomplish it.
The issue, if you’ll remember back to our goals laid out in the first week, is that we want to avoid overcrediting a player for the accomplishments of his teammates. A player is not directly responsible for the base-out states he comes to bat in; that’s the product of the hitters ahead of him in the lineup. And there’s a substantial relationship between the average absolute Value Added of the situations a player comes to bat in and the difference between his linear weights runs and his Value Added runs (per PA).
In other words, a player’s Value Added is driven in part by the quality of opportunities he has, not simply what he does with them. The ability for a player to impact plays is greater in some situations than others, and players who get to bat in those situations more than the typical player will have more of a chance to accrue Value Added.
But we can adjust for this. Using a variation on Leverage Index called base-out Leverage Index, we can adjust the Value Added run values for the mix of base-out situations a player comes up in. We take the average absolute change in run expectancy in each base-out state and compare that to the average absolute change in run expectancy in all situations, so that 1 is an average situation and higher values mean more possible change in run expectancy. Then we divide the Value Added values by the leverage to produce an adjusted set of values that reflects a player’s value in the clutch without penalizing or rewarding a player based on the quality of his opportunities.
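Here is a rough sketch of that adjustment, assuming each play is recorded as a (base-out state, Value Added) pair. The function names, the state labels, and the toy league data are hypothetical, and a real implementation would build the leverage table from a full season or more of play-by-play data.

```python
from collections import defaultdict


def base_out_leverage(plays):
    """Return {state: leverage} from a list of (state, value_added) pairs.

    Leverage is the average absolute Value Added in a state divided by the
    average absolute Value Added across all plays, so 1.0 is average.
    """
    total_abs = defaultdict(float)
    count = defaultdict(int)
    for state, va in plays:
        total_abs[state] += abs(va)
        count[state] += 1
    overall = sum(total_abs.values()) / sum(count.values())
    return {state: (total_abs[state] / count[state]) / overall
            for state in total_abs}


def adjusted_value_added(player_plays, leverage):
    """Sum a player's Value Added after dividing each play by its leverage."""
    return sum(va / leverage[state] for state, va in player_plays)


# Toy, made-up league sample: bases-loaded plays swing run expectancy
# three times as much as bases-empty plays.
league = [
    ("000-0", 0.2), ("000-0", -0.2),
    ("123-0", 0.6), ("123-0", -0.6),
]
leverage = base_out_leverage(league)

# One hypothetical hitter with one good bases-loaded play.
hitter_total = adjusted_value_added([("123-0", 0.6)], leverage)
```

In the toy sample the bases-loaded state has a leverage of 1.5, so a 0.6-run play there is credited as 0.4 adjusted runs, the same as a 0.2-run play with the bases empty.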
There is still a substantial relationship between linear weights runs and adjusted Value Added, so most players won’t see a drastic change. But some will. Take Robinson Cano, for instance. In 2009, he had a pretty good offensive season by context-neutral stats, batting .320/.352/.520 in 674 plate appearances. That’s good for 25.5 runs above average. But Cano had some pretty pronounced splits that season, hitting .376/.407/.609 with the bases empty but .255/.288/.415 with men on. Cano’s “clutch” performance was so bad, he ended up being worth -1.6 adjusted Value Added runs, worse than the average hitter despite hitting well above average for the season on the whole.
It comes down to what we want to measure. Do we want the context-neutral runs, which say Cano was a superb hitter in 2009? Or do we want the clutch-based Value Added runs, which say he was below average? Or should we present both, and let readers decide which they prefer? Here’s your chance to weigh in. We’re not taking a poll; it’s not an election, per se. But we’ll listen to arguments on both sides, and I promise you that this isn’t a trick question.