Murray Chass is at it again, or perhaps he never stopped. I'm not sure, as I'll admit an aversion to reading the blog of a writer who years ago declared his loathing for the form and its practitioners, but now dwells in that very ghetto himself since being laid off by the New York Times. Chass has made noise twice in recent weeks via missives bemoaning the diminishing primacy of pitcher wins and assailing the so-called "new-age thinking" of anyone who would introduce more modern measures, be they VORP, WAR(P), or quality starts. Even the last of these, which was introduced by Philadelphia Inquirer columnist John Lowe back in 1985, is too newfangled for Chass' tastes.

In mid-February, Chass took umbrage at the $2.025 million arbitration award which the Pirates' Ross Ohlendorf received following a hard-luck season in which the 28-year-old righty won one game while losing 11, albeit with a respectable 4.07 ERA. "[T]imes have changed for pitchers. They don’t have to win games any more. Just throw some good-looking statistics out there other than wins, and they can win Cy Young awards and salary arbitration cases," he wrote.

Never mind that even while battling back and shoulder woes, the Dorf's season was actually quite similar to his 2009 campaign, which featured a superficially more impressive won-loss record:

Ohlendorf's peripherals fluctuated from one year to the next, but all told, he outdid his SIERA by more than half a run in both years thanks to above-average defensive support. The biggest difference between his two seasons is that in 2010, he finished in a virtual tie with Johan Santana and Ryan Rowland-Smith for the honor of being the majors' worst-supported starter among those with at least 20 turns. To hear Chass tell it, this is evidence of the breakdown of society:

Under this new-age thinking, if a team doesn’t score more than three runs a game, a pitcher isn’t expected to win. No longer is a pitcher expected to win 3-2 or 2-1. If his team doesn’t score at least four runs, it’s not the pitcher’s fault if he doesn’t win.

There was once a time when pitchers were expected to win unless their team scored no runs, and then they were expected to tie. But those days disappeared with the advent of the quality start, the questionable creation of a Detroit writer, John Lowe, a nice guy but a little off in his thinking.

If a pitcher pitches six innings and gives up three or fewer earned runs he is credited with a quality start. Never mind that three earned runs in six innings computes to a 4.50 earned run average; that’s a quality start.

Eleven of Ohlendorf’s 21 starts fit the so-called quality category, but he won only one, which happened to be a genuine quality start in the dictionary definition of the word because he shut out the Phillies for seven innings.

Elsewhere in that piece, Chass bemoans the intricacies of the arbitration process, an irony given that he made his mark by pioneering the coverage of the business side of the game during the heyday of Marvin Miller, eventually earning the J.G. Taylor Spink Award. He stops short of calling for the abolition of arbitration, but it's clear from his insinuating tone that he's choking down every explanation for the pitcher's arbitration victory—including reliance on such arcane stats as ERA, innings pitched and run support—offered by those close to the case with the enthusiasm of a man forced to eat worms covered in boogers. Cheer up, Murray, is it really as bad as getting the Fire Joe Morgan treatment?

Last week, Chass again took up arms against the quality start (among other things) in an article titled "Pampered Pitchers and Their Enablers." He even talked to the stat's creator:

“I got the idea in 1983 and ’84,” Lowe said. “I was hearing managers saying they were looking for six innings from their pitchers. I heard Whitey Herzog say ‘all I want from my pitchers is six good innings.’”

That’s where six innings came from. And the runs? “Six and two is too stingy, six and four is too much. I wasn’t going to get into a more than or less than. This was new and had to be understandable.”

Why the need for a new statistic? “I didn’t like ERA as a definitive stat,” Lowe said. “One bad start could wreck your ERA. But I never said don’t look at wins and losses.”

Elsewhere in the piece, Chass lauds the won-loss record and the durability of Warren Spahn, who ranks sixth on the all-time wins list (and fifth on the pitcher WARP list) and who still excelled in his early forties. He echoes some valid points in terms of the way modern pitchers are conditioned (five-day rotations, pitch counts, and so on) relative to their forebears, but flogging them for the sake of exalting an outlier who retired just shy of half a century ago isn't a terribly convincing argument. Who's going to tell him that nobody hits like Ted Williams anymore, either?

In any event, with the quality start no spring chicken, it's worth looking back at its relationship to modern pitching history in order to clear up some misconceptions about the stat and its utility. Via Chass and his allies, the chief complaint about the stat is what we'll call the 4.50 Case, the times when a pitcher surrenders exactly three runs in six innings. I've argued elsewhere that the stat should be expanded to include starts in which a pitcher surrenders four earned runs in eight or more innings, and one can certainly take issue with the preservation of the artificial distinction between earned and unearned runs in this not-so-new metric, but for now, we'll stick with Lowe's definition so as to keep comparing apples to apples.
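For readers who want the definition pinned down, here is a minimal Python sketch of Lowe's rule and of the 4.50 Case. The function names are my own, innings are counted in outs to avoid fractional-inning notation, and I'm assuming the usual reading of "six innings" as "at least six innings."

```python
def is_quality_start(outs_recorded: int, earned_runs: int) -> bool:
    """Lowe's rule as described above: six or more innings (18+ outs)
    and three or fewer earned runs. 'At least six' is an assumption."""
    return outs_recorded >= 18 and earned_runs <= 3


def game_era(outs_recorded: int, earned_runs: int) -> float:
    """Earned runs per nine innings (27 outs) for a single start."""
    return 27 * earned_runs / outs_recorded


def is_450_case(outs_recorded: int, earned_runs: int) -> bool:
    """The bare-minimum quality start: exactly six innings, exactly three earned runs."""
    return outs_recorded == 18 and earned_runs == 3


# Six innings and three earned runs works out to a 4.50 game ERA.
print(game_era(18, 3))
```

Seven innings with the same three earned runs is still a quality start, but it is no longer a 4.50 Case, which is why the subset is smaller than the complaints suggest.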

That 4.50 game ERA is higher than the major league average in all but eight of the last 61 seasons (as far back as the available quality start data goes), so one can understand some of the resistance to hanging a "quality" tag on such an effort, but it's also true that such cases are infrequent. Last year, 4.50 Case starts made up just 8.5 percent of all quality starts, and 4.5 percent of all starts. Those pitchers' teams went 98-122 in those games, for a .445 winning percentage. Historically speaking, all of those figures are relatively high. Since 1950, the 4.50 Case starts have constituted just 5.9 percent of quality starts and 3.0 percent of all starts. Their frequency fell below 10 percent in the former category for the first time since 2005, a development that had plenty to do with the major league scoring rate (4.38 runs per game) falling to its lowest level since 1992. From 1950 onward, teams have won at a .419 clip in such games, but while team winning percentages fell below .333 no fewer than 15 times between 1950 and 1972, they've been that low in just two seasons since (1976 and 1981) and have reached .500 in just four seasons (1994, 2000, 2007, and 2008). That the strike-torn 1981 and 1994 seasons wind up among the outliers is duly noted; they may owe their presence on the list to smaller sample sizes. The takeaway is the same: teams don't win all that many of those 4.50 Case games, but those games make up only a small portion of quality starts.
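The 2010 arithmetic is easy to check. The sketch below uses the 98-122 record quoted above; the total of 4,860 starts is my assumption (30 teams times 162 games), not a figure taken from the data set.

```python
# Team record in 2010's 4.50 Case starts, as quoted above.
wins, losses = 98, 122
case_starts = wins + losses

win_pct = wins / case_starts

# Assumption: every team played a full 162-game schedule, one start per team-game.
total_starts = 30 * 162
share_of_all_starts = case_starts / total_starts

print(f"{win_pct:.3f}")              # 0.445
print(f"{share_of_all_starts:.1%}")  # 4.5%
```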

What should be fairly obvious given all of the above is that the frequency of quality starts is heavily dependent upon scoring rates. In fact, it tracks almost perfectly. From 1950 through 2010, the correlation between the major league scoring rate and the overall percentage of quality starts is -.94. When the scoring rises, the quality start percentage (QS%) falls just as surely as God made little green apples for teenagers to chuck at Chass' proverbial front porch. Note the way the two lines mirror each other:

The highest QS% was in 1968, the Year of the Pitcher, when teams scored just 3.42 runs per game—the lowest scoring level since 1908—and 62.6 percent of all starts were quality starts. The lowest QS% was in 1996, when teams averaged 5.04 runs per game—the highest level since 1936—and 45.8 percent of all starts were quality starts. In 2000, when scoring crested at 5.14 runs per game, just 46.3 percent of starts were quality starts. With runs per game falling, last year was the first time since 2005 and just the second time since 1993 that the QS% rose above 50 percent.
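For anyone who wants to run this kind of correlation themselves, here is a rough Python sketch of the Pearson coefficient. It uses only the three seasons cited above (1968, 1996, and 2000), so it illustrates the calculation and does not reproduce the -.94 figure, which comes from all 61 seasons. The function name is my own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Only the three extreme seasons quoted in the text: 1968, 1996, 2000.
runs_per_game = [3.42, 5.04, 5.14]
qs_pct = [62.6, 45.8, 46.3]

r = pearson_r(runs_per_game, qs_pct)
print(f"r = {r:.3f}")  # strongly negative, as expected from three extreme points
```

Three hand-picked extremes will always overstate the tightness of the relationship, so treat the output as a sanity check on direction only.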

Overall, the linear relationship between scoring rate and quality start percentage is described by the line y = -0.094x + .9263, where x is the scoring rate and y the frequency of such starts. In an environment of 4.0 runs per game, we'd expect 55.0 percent of starts to be quality, with the percentage dropping to 50.3 at 4.5 runs per game, and to 45.6 percent at 5.0 runs per game, very close to our actual observed percentages.
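The fitted line is simple enough to check by hand or in a few lines of Python. The sketch below plugs in the slope and intercept quoted above, then compares the line's output with the three observed seasons mentioned earlier; the function and variable names are mine.

```python
SLOPE = -0.094      # change in QS rate per additional run per game
INTERCEPT = 0.9263


def expected_qs_rate(runs_per_game: float) -> float:
    """Expected share of starts that are quality starts at a given scoring rate."""
    return SLOPE * runs_per_game + INTERCEPT


# The three scoring environments discussed above.
for rpg in (4.0, 4.5, 5.0):
    print(f"{rpg:.1f} R/G -> {expected_qs_rate(rpg):.1%}")
# 4.0 R/G -> 55.0%
# 4.5 R/G -> 50.3%
# 5.0 R/G -> 45.6%

# Observed (runs per game, QS%) for 1968, 1996, and 2000, as cited in the text.
observed = {1968: (3.42, 62.6), 1996: (5.04, 45.8), 2000: (5.14, 46.3)}
for year, (rpg, actual) in observed.items():
    predicted = expected_qs_rate(rpg) * 100
    print(f"{year}: predicted {predicted:.1f}, actual {actual:.1f}")
```

The line comes in about two points low at both extremes (1968 and 2000), which is about what you'd expect from a straight-line fit at the edges of the range.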

Regardless of the scoring rate or the era, pitchers making quality starts have collectively kicked ass. The combined ERA of starters in all quality starts in a given season has ranged from 1.63 in 1968 to 2.13 in 2000—yes, the two most extreme years again—with an overall average of 1.90. It rose above 2.00 just once from 1950 through 1992, coming in at 2.03 in the anomalous 1987, when the major-league scoring level (4.72 runs per game) rose higher than in any other season in a 43-year stretch (1951-1993). From 1993 through 2009, the ERA in quality starts was a smidge above 2.00 in every season save for 1995 (1.99); it dipped to 1.98 last year. Meanwhile, the combined ERA of starters in all non-quality starts has ranged from 6.98 (terrible) to 8.25 (godawful), with 1968 and 2000 again representing the extremes; 7.65 is the average, and 7.58 was last year's figure, down just a whisker from the previous two seasons. It topped 8.00 in four seasons (1994, 1996, 1999, and 2000). Graphically speaking:

Note that these ERA splits aren't all that different from what we see when looking at all pitcher ERAs in wins and losses. In 2010, pitchers posted a combined 1.96 ERA in wins, and 7.69 in losses. At the height of scoring in 2000, the numbers were 2.39 and 8.53; at the nadir in 1968, they were 1.36 and 5.22.

As you might expect, pitchers who deliver quality starts pitch deeper into games than the average starter. From 1950 through 1963, they averaged over 8.0 innings per start, and hovered just under that mark for over another decade. As late as 2000 they were still consistently over 7.0 innings per start, while for the past decade, they've hovered between 6.82 and 7.01 per start, landing at 6.89 last year. Meanwhile, the average number of innings pitched in all starts has decreased from a high of 6.76 per start in 1952, to a low of 5.79 in 2007; it's been below 6.0 in all but one year since 1995, but it did rise back to 5.98 last year. In general, quality starts hovered around 1.7 innings longer than the average start in the 1950s, slipped below 1.5 innings longer per start by the late 1970s, and spent the last decade falling from 1.07 innings longer to 0.91 last year, the first time it slipped below 1.0. Still, longer is good, and these guys work overtime.

In turn, pitchers delivering quality starts have left their teams with excellent chances of winning. Since 1950, teams have won at a .677 clip in games where they received a quality start, with the range running from .642 in 1968 to .717 in 1950. It hasn't been above .700 since 1958, when scoring was just 4.28 runs per game, but it was above .680 11 times from 1994 through 2007, with a high of .698 in 1996, when as noted before, scoring broke the 5.0 runs per game ceiling. The correlation between scoring rates and team winning percentage in quality starts is .71; the two are less directly tied together, but still more than a bit related.

While one might expect falling innings-per-start averages and complete game percentages to help explain the rest of the variance—perhaps pitchers not doing all the work their own damn selves harm their teams' chances of winning—such correlations surprisingly don't hold up. The correlation between team winning percentage in quality starts and annual complete game percentage is a negligible .06, and that between team winning percentage and annual average innings per start is -.15, meaning that the percentages actually fall with rising inning rates such as those observed in the first half of the '70s, when they were at their highest in two decades. In any event, the fact that teams win about two-thirds of the time when they receive quality starts is impressive enough.

Simplistic though it might be, the quality start metric has value. Over the past 61 years, the worth of a quality start has fluctuated in a relatively narrow range depending upon overall scoring levels (which to say the least are influenced by more than just the quality of pitching). No matter what the scoring level, the practitioners have collectively delivered elite-level ERAs around 2.00, with less than 10 percent producing the six-and-three performances over which detractors such as Chass obsess, and they've consistently left their teams in position to win even if they haven't been able to collect the W themselves. As for individual pitcher wins, we're not going there again today, except to point out that their decreasing stature owes plenty to the rising offensive levels, deeper lineups, longer at-bats, and increased reliever specialization which have made complete games largely a thing of the past. With the AL Cy Young awards going to Zack Greinke in 2009 and Felix Hernandez in 2010, it's clear that even most mainstream writers understand this, though some still cling to the bygone era.

For sure, the quality start metric could be improved by including unearned runs and expanding to eight-inning cases, or replaced entirely with something better. Here at Baseball Prospectus, we're partial to the Support-Neutral Winning Percentage (found on this leaderboard), which tells you how often a team would win a pitcher's starts given average offensive and bullpen support, but that's a story for another day. The humble quality start metric is doing a laudable job of recognizing quality, and it deserves better than the abuse it sometimes receives.



Nice piece. Any chance BP will add Quality Starts to the pitcher statistics files?
I'm hopeful that at some point we can add a BP-defined quality start incorporating unearned runs and eight-inning cases, but there are bigger fish to fry at the moment. And as I said above, I'm quite a fan of our Support Neutral Winning Percentage, which tracks pretty well with QS%. Among ERA qualifiers (162 IP), the correlation between QS% and SNWP was .79 in 2010.
They probably shouldn't. I think the point is not that the state is terribly useful, but that it's not NOT useful.

I think I generally agree with the author, but it's quite a lot of effort to pick a bone with a fired writer. Schadenfreude, anyone?
Mudslingers and propagandists like Chass deserve to have their bones picked. The assertion that modern metrics have no validity when compared with archaic ideas is as old and just as pointless as protestations in the 15th century that the world is indeed flat, no matter what evidence indicates.
Agreed. Few have continued to invite consistent servings of schadenfreude pie the way Chass has, and we go way back with him here at Baseball Prospectus.

While I don't normally make a point of picking bones with him - because for the most part that mandates providing links to his site, which I'd prefer not to do - the two columns cited did inspire me to learn something and in turn to pass that knowledge along to readers.
*stat* not state
To include unearned runs in the quality start metric would be ridiculous. Simply change it to 7 innings and 3 earned runs.
I think including unearned runs would be a good idea, but I would state that the pitcher has to have pitched more than 6 innings. In other words, 6.3 IP with 3 runs or less would be quality, but 6 innings flat, would not be.
For every moron that rails on about the 4.50 quality start in defense of wins, answer him with this pitcher "win":
Wow. How did Ramirez get the loss in that one rather than de los Santos?
Ramirez gave up the 6th run that gave Milwaukee the lead for good.
I meant the Giants of course.
There are so many problems with Quality Start as a useful measure of anything (run environment, defensive support, 'quality' vs. 'QUALITY!'), I can't see any point to it. Like calling for Ford to restart a line to build 2001 model cars because they're better than the 1991 models you still find out on the road here and there.
I was surprised to see the W% for the Case start at .455. I would have guessed it would be higher. I wonder what it is for 6 1/3 IP and 3 ER. Wherever the line is that gets you above .500, that is how I would define a quality start.
Interesting that only about 6% of QS are of an iffy sort. If that's enough to discard the entire stat as useless, as Mr. Chass would apparently say, then we must also be done with errors, ERA, park- or league- or era-unadjusted BA and XBH, certainly W-L, and probably most of the other traditional stats that he feels captured everything we'd ever want to know.
"You kids get off my lawn" -Murray Chass

"And I would have gotten away with it if it weren't for you meddling kids!" -Murray Chass

any others...
He even mentioned Eckstein! I've never missed FJM more than I do right now.
Thanks for the article, Jay! This is awesome, in fact.

To help make the case, it's probably worth comparing Wins with the same rigor:

From 1951 to 2010:
ERA in Quality Starts: 1.90
ERA in all other Starts: 7.65

ERA in Starter Wins: ?
ERA in other Starts: ?

One would expect to find a much narrower difference.

On a year-by-year basis, as in the chart at the end of the article, would one expect to find Wins shows more variance year to year?
Alas, it was late in the game when I thought that such a comparison might be worthwhile, and would have required hand-harvesting the data instead of querying our department.
That's too bad... would be lovely if there were a way to make that comparison. Would really drive home the point that QS is a better representation of the pitcher's performance than Ws.

Excellent analysis nonetheless. Have a great weekend!
I can't think of any reason Murray Chass would be worth discussing, even by way of introducing an otherwise interesting article.
Murray Chass is to baseball as Sarah Palin is to politics as Charlie Sheen is to intelligence.
Murray Chass reminds me of the monkeys in 2001: A Space Oddyseey when the monolith appears.
Yeesh - Odyssey
I like that guys like Chass are still writing stuff like that. The more they write and talk, the more silly they sound and the more folks who actually like to think and stuff will get interested in the "newer" stats (many of the "new" stats and ideas are getting into their 20s and 30s) and the newer ways of looking at baseball.

Even in the main stream media you see and hear more and more new ideas being written and talked about. The folks who want to learn more will. And those who want to stay stuck in the "good old days" (that never existed which is another huge glaring weakness in Chass' arguments) will do so regardless.

But guys like Chass provide a very good point of departure for explaining the more realistic look at baseball and statistics and how things really work.

So I'm glad Chass keeps writing this stuff.


I don't understand the caption on the picture for this article on the front page, according to Chass, Mathewson is in the best shape of his after-life.
Chass: "this, I think, is my favorite Spahn statistic: the man gained 25 percent of his victories, 91 of 363, when his team scored three or fewer runs."

This is why I love Murray Chass. His belief that the most notable thing about Spahn has nothing to do with what Spahn did himself, but what his teammates did not do. Remarkable, Murray Chass.
To give him his due there, it is about Spahn. He's suggesting Spahn 'pitched to the score' to gain those 91 victories.
Spahn's belief in the Easter Bunny was responsible for all of those eggs the children found.
Richie--I get Chass' intent. That's actually the primary reason I take issue with it. I don't think his point about Spahn is a good one. It's not like saying:

-"My favorite Felix Hernandez stat is that over 80% of his career wins have come when he's allowed two or fewer (or less than half the AL average) earned runs." Or ...
-"My favorite Felix Hernandez stat is that 31% of his career wins have come when he has allowed ZERO earned runs." Or ...
-"My favorite Felix Hernandez stat is that 28.3% of his career losses have come when he's allowed two or fewer runs."

Those have to do with Felix Hernandez, and tell us that 80% of Felix's wins are coming when he's twice as good or better than average. Whereas the stuff about Spahn has little to nothing to do with Spahn.
Well, he's also saying it's his "favorite" thing, not the "most notable" thing, about Spahn. Chass gives us statheads plenty of stuff to ridicule, without also spinning stuff a bit on him.

And some reason to ridicule, given his attitude toward us. But at this point I'm with worldtour. I think we've whacked him around enough for one day, particularly given that he's not that big a target anymore.
I don't really see why anyone would have suspected that quality start pct. did not correlate well with avg. runs scored, or with ERA, or with team win pct. I.e., I don't see why this analysis was performed or what it really proves.

The question is, what additional value (beyond RA or ERA) does the quality start metric provide? I would say not much. It shows how well-spaced runs allowed have been for the pitcher. The problem is that this is not likely to be a skill -- i.e., past run spacing is not predictive of future run spacing. And if you're in the market for a stat that is not predictive of future skill, you might as well use W-L record.
Isn't it the same kind of argument as batting average or BABIP's relation to OBP? Basically, quality starts have some consistency (but some fluctuation like batting average) though RA/ERA tell a fuller story just like OBP does.
I've never considered 'quality start' a particularly useful concept.
However, it does seem to say something about keeping your team in the game. I'm sure BP has the probabilities associated with eventually winning a ballgame if your opponent has scored 3 runs or fewer in 6 innings.

A robot pitcher who could be counted on to give up 3 runs in 6 innings every time he got the start would be far more valuable (i.e., a 4.50 ERA and a 0 Flake rating), than a Flake-star like Joe Saunders (overall ERA of 4.47, Flake ~.29), who had 10 starts in which he had an in-game ERA of 7.5 or better.
I decided to test this in response to evo34's comment above.
For pitchers with 100 or more IP in both 2010 and 2009, the correlation of FLAKE between years was r=0.08 (n=94).
For 2008 and 2009, r=0.17 (n=88).

Correlations between years for the top 32 flakiest were somewhat higher,
r= 0.175 between '09 and '10, and r = 0.44 for '08 and '09.

So, although the flakiest may have a tendency to stay flaky,
I think evo is probably right that steadiness doesn't seem to have any predictability.
Sometimes statistics impact reality and it seems possible the "quality start" is an example. If pitching six innings and allowing three runs or less is the goal, do pitchers and managers employ strategies designed to maximize this result in the (probably mistaken) view that such a strategy helps them win games? This interesting article notes that the number of '4.50 starts' last year was significantly higher than the historical average since 1950 (2010- 8.5%; Avg. 5.9%). Was 2010 an outlier, or is this a trend? If the latter, it may be that the stat is affecting the way the game is played. If so, this is probably unfortunate because of all of the reasons mentioned by others above about why the stat is not that useful (even if better than wins).

Quality starts would likely be a more informative stat if calibrated against the average run scoring of the opposition. At the extreme, how impressive is a quality start (particularly a 4.50 start) against last year's Mariners, who scored less than 3 runs a game? Compare that to throwing a quality start against last year's Yankees. If a pitcher's quality starts do not correlate to the run scoring of the opposition, that is probably the result of inconsistency in game to game performance (to the extent it is not a result of luck/randomness). On the other hand, if a pitcher's quality starts are predominantly against weaker hitting teams, that probably says a lot about the quality of the pitcher (good but not great?)

Given that the top starters are only going to get 34 starts a season, and there are few who get that many, is there really enough data in one season to make the quality start a useful stat?
You make some good points.

It's pretty clear that managers manage with an eye on wins and saves, nursing a creaky starter with a lead through five innings where at all possible and robotically deploying a closer to protect a three-run lead. In the data I collected, I don't see any evidence that managers are managing to the stat. With their ever-growing bullpens, they do seem to have identified sixth- and seventh-inning guys with more regularity, implicitly acknowledging they're only going to get so much out of their starters, but that's not the same.

As noted in the piece, the 4.50 Case starts are a small subset of all starts, and while the 2010 rate was higher than the historical average, it was its highest since 2003 and the first time below 10 percent since 2005. Having laid out the predictable connection between scoring and QS%, it should come as no surprise when I say that the QS rate has been above that 5.9% historical average in every year since 1993, when scoring began its rise.

As you say, there are more informative ways to measure pitcher contributions; our SNLVAR adjusts for the caliber of opposition in its win expectancy-based valuation, while SNWP provides a rate stat. My point is not that we should build a new stat from scratch, but merely that it's worth coming to a rapprochement with an oft-misunderstood mainstream stat, because it does have its uses.