Prospectus Toolbox: Still Stranded

I was ready to move on, but last week’s column generated a lot of comment, so we’re sticking with runners left on base for today’s column. Last week we looked at the correlation between leaving runners on base (the Team LOB statistic) and run scoring, using team totals of those stats going back to 1971. I kept the conversation limited to raw totals of runners left on base, times on base, and runs scored for two reasons: first, because that was the question that had been asked, and second, because raw totals are the way the left-on-base stat is most often used and discussed. Once the strike-shortened seasons of 1981, 1994, and 1995 were omitted (although I failed to take the first strike year, 1972, out of the sample), the teams were on more or less equal footing in terms of opportunities to put men on base, score runs, or strand them.

As several readers noted, using raw counting stats left a number of questions unanswered. For example, reader S.B. pointed out:

The better stat to look at is probably Percentage of Baserunners who are Stranded (or a very-pleasing acronym POBWAS). This is analogous to the RBI/RBI opportunity stat for an individual batter being a better indicator than total RBI.

Statheads often favor rate statistics over counting stats, because rate statistics provide us with valuable context, usually by contrasting a counting stat against opportunities. Reader M.P. took things a bit further than S.B. did, crunching some numbers while using the stats I provided in last week’s column:

It seems to me that the key stat for LOB would actually be the percentage of runners left on base, a sort of offensive strand rate, OSR perhaps? Just using the data from 2007 that you presented, the coefficient of correlation between runs scored and OSR would be -.62, which shows a pretty strong negative correlation between OSR and runs scored. That would explain the Tigers outscoring the Nationals despite the Nationals stranding fewer runners. The Nationals stranded over 58 percent of their runners while Detroit stranded just under 53 percent. It would be interesting to see if home runs are a factor. I would think Detroit hit a good number more homers than Washington, making their offense much more efficient.

OSR also explains Oakland’s high LOB/low runs situation. No one stranded runners at a higher rate than Oakland in 2007, leaving nearly 59 percent of their runners high and dry.

OSR would explain why it is so frustrating to watch a team strand runners sometimes. Red Sox and Yankees fans do not get to the end of the season feeling like their teams left too many players on base because, even though they left more runners on base than 25 other teams, they were both in the top half in percentage of runners brought home (the Yankees were second in fact).

It would also be interesting to see how GIDP fits into this equation. If we counted GIDPs as LOBs as well (since they sort of are), then wouldn’t the numbers change? Maybe the correlation would increase and some of the anomalies would even out? Something to think about…

The rate stat described by both S.B. and M.P., the ratio of runners left on base to times reached base safely, or LOB/TOB, would seem to be a pretty good (albeit inverse) measure of a team’s offensive efficiency. A negative correlation between that stat and run-scoring would undercut the conclusion of last week’s study, that a high number of LOB isn’t necessarily bad for your offense. M.P.’s point about Detroit’s home runs makes some intuitive sense, and is bolstered by the fact that the Yankees, another power-hitting ballclub, featured a similarly low LOB/TOB ratio. However, the most efficient offense last year by this measure, that of the Los Anaheim Angels of Angeles, had the exact same number of homers (123) as the Washington Nationals, who had the second-highest LOB/TOB. We’ll shelve further investigation of that theory for the moment.
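To see how the rate and the raw total can point in opposite directions, here is a minimal Python sketch of the LOB/TOB calculation. The two team lines are made-up round numbers, not figures from last week’s data, and the function name `strand_rate` is only a label for illustration.

```python
def strand_rate(lob, tob):
    """Share of a team's baserunners who were left on base (LOB/TOB)."""
    if tob <= 0:
        raise ValueError("tob must be positive")
    return lob / tob


# Hypothetical season totals, chosen only to illustrate the point.
team_a = {"lob": 1160, "tob": 2200}  # more runners stranded in total
team_b = {"lob": 1110, "tob": 1900}  # fewer stranded, but fewer baserunners too

rate_a = strand_rate(team_a["lob"], team_a["tob"])
rate_b = strand_rate(team_b["lob"], team_b["tob"])

print(f"Team A LOB/TOB: {rate_a:.4f}")
print(f"Team B LOB/TOB: {rate_b:.4f}")
```

In this invented example, Team A leaves more men on base in total but strands a smaller share of its baserunners, which is the pattern M.P. describes for Detroit and Washington.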

M.P.’s other point, about how much double plays might throw off the LOB/TOB ratio, was interesting, but when I followed up on it, it seemed incomplete. If double plays should be added to LOB, why not triple plays? That’s easy enough, but then there were other questions (“Why not caught stealing? Why not pickoffs?”) which were a bit of a headache. So, with the assistance of William Burke, I took a look at another statistic, Team OBI percentage (OBI%), that includes double plays and takes the batter scoring on a home run out of the equation.

As it happens, I wasn’t the only person who thought that maybe LOB/TOB wasn’t the end of the discussion. Reader L.H. chimed in with his own preferred metric:

You missed one factor of interest: the ratio of Runs to LOB. The more Runs a team scored, the higher the ratio. What was very interesting was that the ratio for the majors last year was .666. Those that fell below, did worse. To put it another way, the lower-scoring teams got 33-35 percent of their TOB around to score, the higher-scoring teams 40 percent. The number 15 team, Tampa, got 36.4 percent across. Everyone above them had a better rate. Only three of the remaining 15 teams below them had a better rate, and none by more than one percent.

So LOB is important. It is the ratio. If my team has more than twice as many LOB as Runs in a game, I’m going to be upset. Run the numbers against your database and give a better idea of what we should accept.

Another reader, M.C., suggested the remaining combination among the stats I listed last week:

A couple of years ago (when the Red Sox were leading the league in runs scored and OBA annually), I heard a lot of Sox fans complaining about all the runners left on base, so I decided to see what percentage of base runners scored (R/TOB). As it turned out, despite leading the league in LOB, the Sox plated the second-highest percentage of base runners, which I felt indicated that they were pretty efficient in driving runners home.

I don’t have the data any more, so I can’t recall any other specifics, but I would be interested to see the correlation between percentage of runners scoring and total runs scored.

So, we have a quandary: one concept (find a statistic that describes a team’s “efficiency” on offense) and four different proposed metrics, with the usual alphabet soup of acronyms in tow. Confronted with that challenge, I decided to calculate each metric’s correlations to run scoring, again going back to 1971. Since mixing counting stats and rate stats is a bit like mixing acid and water, I decided to correlate the various metrics to another rate stat, runs per game (R/G). Another 2007 leaderboard follows:

Year Team     R/G   LOB/TOB    R/LOB      OBI%    R/TOB
2007  NYA    5.98    .5268     .7750     15.96    .4083
2007  PHI    5.51    .5657     .6888     14.35    .3897
2007  DET    5.48    .5261     .7726     16.56    .4065
2007  BOS    5.35    .5579     .6718     14.78    .3747
2007  COL    5.28    .5504     .6880     14.79    .3787
2007  ANA    5.07    .5176     .7473     16.11    .3868
2007  TEX    5.04    .5409     .7473     15.27    .4042
2007  CLE    5.01    .5593     .6669     14.46    .3730
2007  ATL    5.00    .5618     .6722     14.57    .3776
2007  NYN    4.96    .5573     .6722     14.64    .3746
2007  MIL    4.94    .5497     .7171     14.74    .3942
2007  SEA    4.90    .5418     .7045     14.81    .3817
2007  FLO    4.88    .5657     .6628     13.85    .3749
2007  CIN    4.83    .5576     .6692     13.60    .3732
2007  TBA    4.83    .5558     .6707     14.35    .3727
2007  BAL    4.67    .5549     .6563     14.46    .3642
2007  TOR    4.65    .5521     .6772     14.62    .3739
2007  CHN    4.64    .5749     .6319     14.17    .3633
2007  OAK    4.57    .5870     .5890     12.79    .3458
2007  SDN    4.55    .5725     .6427     14.01    .3679
2007  LAN    4.54    .5725     .6125     14.15    .3507
2007  SLN    4.48    .5634     .6207     13.67    .3497
2007  PIT    4.47    .5603     .6470     14.39    .3625
2007  HOU    4.46    .5739     .6122     13.34    .3513
2007  MIN    4.43    .5545     .6411     14.36    .3554
2007  ARI    4.40    .5621     .6532     14.11    .3672
2007  KCA    4.36    .5545     .6483     14.80    .3595
2007  CHA    4.28    .5579     .6453     13.34    .3600
2007  SFN    4.22    .5768     .5986     13.43    .3453
2007  WAS    4.15    .5838     .5787     13.64    .3379
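
Three of these columns are tied together arithmetically: since R/LOB = (R/TOB) / (LOB/TOB), any two of them determine the third. The short Python check below, using a handful of rows copied from the leaderboard, is only a sketch; the tolerance is an assumed allowance for the four-decimal rounding in the table, and the function name is illustrative.

```python
# (team, LOB/TOB, R/LOB, R/TOB), copied from the 2007 leaderboard above
rows = [
    ("NYA", 0.5268, 0.7750, 0.4083),
    ("PHI", 0.5657, 0.6888, 0.3897),
    ("ANA", 0.5176, 0.7473, 0.3868),
    ("OAK", 0.5870, 0.5890, 0.3458),
    ("WAS", 0.5838, 0.5787, 0.3379),
]

TOLERANCE = 0.0005  # assumed allowance for rounding in the published table


def implied_r_per_lob(lob_per_tob, r_per_tob):
    """Recover R/LOB from the other two ratios: (R/TOB) / (LOB/TOB)."""
    return r_per_tob / lob_per_tob


for team, lob_tob, r_lob, r_tob in rows:
    implied = implied_r_per_lob(lob_tob, r_tob)
    status = "ok" if abs(implied - r_lob) < TOLERANCE else "check"
    print(f"{team}: listed {r_lob:.4f}, implied {implied:.4f} ({status})")
```

Because the columns overlap this way, LOB/TOB, R/LOB, and R/TOB are three views of the same counts (runs, runners stranded, times on base), not three independent measurements.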

Another benefit of using runs per game was that I could run the correlations against all years going back to 1971, including the strike years (since the strikes cut down on totals, not ratios).

The results? M.P. was correct about the negative correlation between LOB/TOB and run-scoring: the correlation I found (-0.52) was almost exactly as strong as the positive correlation between runs scored and LOB last week. OBI Percentage had a strong positive correlation to run scoring (0.82), but not as strong as R/LOB (0.87). That isn’t surprising, since having runs in the numerator of your metric increases the chances of positive correlation to run-scoring, and neither is the rock-solid correlation between R/G and R/TOB (0.93).
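
For readers who want to try this at home, the sketch below computes a plain Pearson correlation between R/G and each of the four metrics, using only the 30 rows of the 2007 leaderboard. The figures quoted above come from every team-season back to 1971, so these single-season coefficients will not match them; the helper names `pearson` and `column` are illustrative, not anything from the original study.

```python
from math import sqrt

# 2007 leaderboard rows: team, R/G, LOB/TOB, R/LOB, OBI%, R/TOB
TABLE = """
NYA 5.98 .5268 .7750 15.96 .4083
PHI 5.51 .5657 .6888 14.35 .3897
DET 5.48 .5261 .7726 16.56 .4065
BOS 5.35 .5579 .6718 14.78 .3747
COL 5.28 .5504 .6880 14.79 .3787
ANA 5.07 .5176 .7473 16.11 .3868
TEX 5.04 .5409 .7473 15.27 .4042
CLE 5.01 .5593 .6669 14.46 .3730
ATL 5.00 .5618 .6722 14.57 .3776
NYN 4.96 .5573 .6722 14.64 .3746
MIL 4.94 .5497 .7171 14.74 .3942
SEA 4.90 .5418 .7045 14.81 .3817
FLO 4.88 .5657 .6628 13.85 .3749
CIN 4.83 .5576 .6692 13.60 .3732
TBA 4.83 .5558 .6707 14.35 .3727
BAL 4.67 .5549 .6563 14.46 .3642
TOR 4.65 .5521 .6772 14.62 .3739
CHN 4.64 .5749 .6319 14.17 .3633
OAK 4.57 .5870 .5890 12.79 .3458
SDN 4.55 .5725 .6427 14.01 .3679
LAN 4.54 .5725 .6125 14.15 .3507
SLN 4.48 .5634 .6207 13.67 .3497
PIT 4.47 .5603 .6470 14.39 .3625
HOU 4.46 .5739 .6122 13.34 .3513
MIN 4.43 .5545 .6411 14.36 .3554
ARI 4.40 .5621 .6532 14.11 .3672
KCA 4.36 .5545 .6483 14.80 .3595
CHA 4.28 .5579 .6453 13.34 .3600
SFN 4.22 .5768 .5986 13.43 .3453
WAS 4.15 .5838 .5787 13.64 .3379
"""

METRICS = ["LOB/TOB", "R/LOB", "OBI%", "R/TOB"]  # table columns 2 through 5


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rows = [line.split() for line in TABLE.strip().splitlines()]


def column(index):
    """Return one numeric column of the table as a list of floats."""
    return [float(row[index]) for row in rows]


runs_per_game = column(1)
correlations = {
    name: pearson(runs_per_game, column(i))
    for i, name in enumerate(METRICS, start=2)
}

for name, r in correlations.items():
    print(f"R/G vs {name}: {r:+.2f}")
```

With one season of data, treat the output as a rough illustration: the signs should agree with the multi-year results, but the magnitudes need not.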

As it turns out, in the end, it’s best to cut out the middle man (LOB) and attack the question directly: how often do the guys who reach base come around to score?

Derek Jacques
