Checking the Numbers: Under Pressure

March 25, 2009

For the past several years, the perception that closers perform poorly in non-save situations has grown. The thinking goes that these relief aces fail to look particularly sharp unless they’re under pressure and have the game’s fate in their hands. Our own experiences have helped fuel this idea; we’ve all witnessed an otherwise untouchable pitcher enter a game with a 3-0 deficit and allow a few more runs to score in an ineffective inning. Unfortunately, with the memory of these negative events in mind, a confirmation bias emerges in which every new example seems to provide further evidence of closer ineptitude when the game is not on the line. Is this strictly a confirmation bias, or are the discrepancies between save situations and non-save situations real and significant?

Last year I conducted a study on closers, pooling together all seasons with at least 15 saves from 1980-2007. The query offered 696 pitcher-years and 220 unique game-savers, but the analysis of their stats in and out of save situations was a solid first step at best. The results, which were deemed viable via a paired samples t-test that compares the means of two different variables, showed that closers did post somewhat improved rates in their save opportunities. The discrepancies in the rate stats measured (ERA, K/9, BB/9), though they were significant, differed only by plus or minus 0.25 units per nine innings.

The study failed to incorporate a few very important factors which, when controlled for, may produce vastly different results. The most obvious is the rust factor: a large chunk of non-save appearances consists of the hurler merely getting his work in. If the closer has not entered a game in four days, it makes perfect sense that he may not be at his best. His control may be solid but not pinpoint, or it might take several in-game pitches before he reaches his target velocity. The next important factor involves who makes the bulk of the save and non-save appearances. A very talented team is likely to win a good number of games, providing its closer with plenty of chances to record saves and only a handful of non-save appearances. Conversely, closers on poor teams are more prone to appearing in non-save situations, because the opportunities to save games are not as abundant. In a small enough sample, these opportunity factors could drastically skew the data; a closer on a bad team may be dynamite when it counts, but merely average, for random reasons, when he’s doing nothing more than getting in his work.

Additionally, the strength of opponents must somehow be factored into the equation. A large enough sample of games played is required before the average winning team will appear to be better than the average losing team. For instance, I currently participate in a superstar Strat-O-Matic league, and one of my favorite tools is the lineup evaluator, which runs simulations pitting certain lineup configurations against another team. In a set of 25 simulations, my squad can do pretty well against what is considered to be the best team in the league, but over 15,000 simulations, the true talent level of each team becomes much clearer. Combine these mitigating factors with the results discussed previously and it becomes evident that the results merit further control and adjustment before they can provide any true insight.
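The sample-size point can be sketched with a toy simulation. This is only an illustration under stated assumptions: the 45 percent win probability, the seed, and the function name are made up here, and the code has nothing to do with how the actual Strat-O-Matic lineup evaluator works.

```python
import random


def simulate_win_rate(true_win_prob, n_games, rng):
    """Fraction of n_games won by a team with the given true win probability."""
    wins = sum(1 for _ in range(n_games) if rng.random() < true_win_prob)
    return wins / n_games


rng = random.Random(2009)  # fixed seed so the run is repeatable
TRUE_WIN_PROB = 0.45       # assumption: the underdog wins 45% of the time

# Ten separate 25-game sets versus one 15,000-game set.
small_sets = [simulate_win_rate(TRUE_WIN_PROB, 25, rng) for _ in range(10)]
large_set = simulate_win_rate(TRUE_WIN_PROB, 15_000, rng)

print("ten 25-game sets:", [round(rate, 2) for rate in small_sets])
print("one 15,000-game set:", round(large_set, 3))
```

The 25-game sets tend to scatter widely around the assumed talent level, sometimes landing above .500, while the 15,000-game set should sit very close to it.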

What about Pitch F/X data? Do closers have different pitch data in and out of save situations? This aspect of performance would be largely immune to the factors detailed above. After all, a closer is unlikely to make a conscious effort to throw with less velocity or decreased movement just because the opponent has been on a seven-game losing streak. To begin, I queried one of my databases for all pitchers with at least 10 saves last season. (Which reminds me: with only one season of data in our sample, this analysis is more of a preview of a movie being released five years down the road than it is the feature itself.) The query produced 37 pitchers, most of whom were full-time closers for the entire season, along with some who had lost or gained the role halfway through. Four more who just missed the benchmark were added because they were closers at some point during the season, providing a grand total of 41 pitchers.

Flagging save opportunities programmatically is a very labor-intensive task, a fact that Sean Forman of Baseball Reference will undoubtedly vouch for, so instead I coded each appearance by hand and entered the results into a game-logs table. The game logs were then joined to the Pitch F/X information, and averages in and out of save situations were calculated. Fastballs were of primary concern; closers throw them practically three-quarters of the time. Here are the overall results for the group as a whole:


Overall  Velo (mph)  PFX (in.)  PFZ (in.)
Save       92.99       5.41       8.95
Non        92.48       5.53       9.04

And here are the results partitioned by average fastball velocity:


Type         Velo (mph)  PFX (in.)  PFZ (in.)
Fast-Save      95.13       5.23      10.04
Fast-Non       94.93       5.32      10.22
Medium-Save    92.46       5.66       9.17
Medium-Non     92.06       5.79       8.99
Slow-Save      89.12       5.27       6.61
Slow-Non       88.46       5.43       6.98
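For readers curious about the mechanics, the join-and-average step might look something like the sketch below. Everything in it is hypothetical: the row layouts, the pitcher names, the "FF" fastball code, and the numbers are invented for illustration and are not the study's actual schema or data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical hand-coded game logs: (pitcher, game_id) -> True if a save situation.
game_logs = {
    ("closer_a", 1): True,
    ("closer_a", 2): False,
    ("closer_b", 1): True,
    ("closer_b", 3): False,
}

# Hypothetical pitch rows: (pitcher, game_id, pitch_type, velo_mph, pfx_in, pfz_in).
pitches = [
    ("closer_a", 1, "FF", 95.0, 5.0, 10.0),
    ("closer_a", 1, "SL", 86.0, -2.0, 1.0),
    ("closer_a", 2, "FF", 94.0, 5.4, 10.2),
    ("closer_b", 1, "FF", 91.0, 5.6, 9.0),
    ("closer_b", 3, "FF", 90.0, 5.8, 9.2),
]


def fastball_averages(game_logs, pitches):
    """Join pitches to game logs on (pitcher, game_id); average fastballs by situation."""
    buckets = defaultdict(list)
    for pitcher, game_id, pitch_type, velo, pfx, pfz in pitches:
        key = (pitcher, game_id)
        if pitch_type != "FF" or key not in game_logs:
            continue  # keep only fastballs thrown in hand-coded appearances
        label = "Save" if game_logs[key] else "Non"
        buckets[label].append((velo, pfx, pfz))
    # zip(*rows) turns the list of rows into columns so each stat is averaged separately.
    return {
        label: tuple(round(mean(column), 2) for column in zip(*rows))
        for label, rows in buckets.items()
    }


averages = fastball_averages(game_logs, pitches)
for label, (velo, pfx, pfz) in averages.items():
    print(f"{label:5s} {velo:6.2f} {pfx:5.2f} {pfz:6.2f}")
```

Pooling every fastball this way is equivalent to weighting each pitcher's own average by the number of fastballs he threw in that situation.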

With the entire group attending the sample fair, there was roughly a half-mile per hour increase on the heater in save situations, albeit with slightly less movement. Breaking the group down diminishes the significance of an already questionable sample, but it does provide a window into what may become apparent with more pitcher-years added into the fold. Running paired samples t-tests shows that the only statistically significant difference involved overall horizontal movement, and even that is likely a Type I error (a false positive); in any case, the gap is underwhelming in practical application. This segues nicely into a discussion about statistical significance versus clinical significance. I am by no means a clinical psychologist, but I have hung around with one for over a year now, and some of the thinking rubs off every so often. The difference between the two forms can be summed up by the following question: even if a half-mile per hour difference on the fastball in save situations were statistically significant, would that result mean anything to the pitcher or to overall pitcher evaluations?
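A paired samples t-test of this kind can be sketched in a few lines. The velocities below are invented for illustration and are not the study's data; the function name is likewise just a placeholder. The sketch reports the mean difference alongside the t statistic, since the first speaks to practical importance and the second to statistical significance.

```python
import math
from statistics import mean, stdev


def paired_t(save_values, non_save_values):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(save_values) != len(non_save_values):
        raise ValueError("each pitcher needs one value per situation")
    diffs = [s - n for s, n in zip(save_values, non_save_values)]
    n = len(diffs)
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff, mean_diff / std_err, n - 1


# Hypothetical per-pitcher average fastball velocities in mph (one pair per pitcher).
save_velo = [95.1, 92.5, 89.0, 93.4, 91.2, 96.0]
non_velo = [94.8, 92.0, 88.6, 93.5, 90.4, 95.5]

mean_diff, t_stat, df = paired_t(save_velo, non_velo)
print(f"mean difference: {mean_diff:.2f} mph, t = {t_stat:.2f}, df = {df}")
```

The t statistic would then be compared against the critical value for the given degrees of freedom, while the size of the mean difference is what answers the "does it matter?" question.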

Despite the sample not yet being large enough to produce useful results, I fully expect to see similar findings when rerunning the study in future years. The question arises of whether or not these rather minimal discrepancies even matter. A half-mile per hour? I’ve yet to read any studies proving that pitchers generally average slower velocities in their ineffective games, and the overall movement components differ by roughly one-tenth of an inch, which is not very likely to make or break a game or a season given the standard deviations we’ve previously looked at and discussed.

What this all boils down to is that in pondering whether closers perform as well out of save situations as they do in them, we’re asking the wrong question. What should be of interest is whether or not they pitch differently, which these cursory results would seem to suggest. Then again, this would make intuitive sense from a clinical standpoint; I’m likely to exhibit slightly different tendencies when protecting a one-run lead than when entering a game with an eight-run lead. In the latter game, a closer may be less likely to induce swings out of the zone, merely attempting to pitch to contact, and if a run scores in the process, who cares? This lax approach would obviously not work in save situations.

A pitcher should pitch differently in non-save situations, especially those with a hefty lead, because his approach involves aspects of pitching with a higher probability of surrendering runs. In save situations or with a one-run lead, the pitcher is much more careful to not give up the tying run, but in trying to be too fine he may actually give up a few. These different approaches may occasionally produce worse rates in non-save situations, but a straight-up comparison of performance-based stats is inaccurate because the closers are implementing two different strategies. Such a comparison would be akin to comparing Tiger Woods’ performances in and out of major tournaments; he’s more likely to buckle down in the US Open than he is in the Perrotto/Jaffe Invitational. This may lead to worse rates, but in order to really prove that a closer was worse in non-save situations, he would need to give it his all in those appearances, and from a human evaluative standpoint, this is very unlikely to occur. Do they pitch worse in non-save situations? It’s not worth answering, because it’s the wrong question. Do they pitch differently? Yes… and they should.


Eric Seidman


ghorsche

3/25

There's one critical point that always seems to get missed when discussing closers in save vs. non-save situations:

If a pitcher for the away team allows the go-ahead run in the 9th inning or later, the game ends, and they no longer have the opportunity to foul up their stat line. If you're having a bad night in the 8th inning, you can turn a 5-run lead into a 3-run deficit and leave 3 more guys on base. If it's the bottom of the 9th and you're protecting a 1-run lead, you can give up a homer, two walks, and a double, and that 2nd walk doesn't even have the opportunity to cross the plate.

(Credit to a friend of mine who pointed this out during some random bar talk a few weeks ago.)


EJSeidman

3/25

Nate, definitely a valid point. I had that as a factor but didn't want to discuss ALL factors hurting the studies. Yours is definitely valid, though, as it can muddle up the save situation stats.


swartzm

3/25

Eric, I really liked this article, and I never had read the old one from last year so thanks for linking that.

I wonder how much of this is pitch selection. You mention pitching to contact in 8-run games, but I wonder if they are actually throwing MORE fastballs or maybe even more breaking balls if they are getting their work in. Is this something you can check for quickly in your database? Are they throwing more strikes when they throw their fastballs or nibbling more? I imagine pitch selection and where the pitchers aim would indicate why this is so.

Another thought-- you alluded to pitchers getting their work in. Do pitchers generally throw slower after 3-4 days rest than after 0-1 days rest? I know cumulative ERA is lower for relievers on shorter rest, but I have to imagine that's a sampling bias given how closers and set up men throw on consecutive days more and the guys throwing on 3-4 days rest are more frequently mop-up guys. So I would think that checking to see how closers do after a few days rest in save situations would be a natural extension.

Then again, I'm not the one who would have to code for all that, so maybe that's all just impossible to check. Good article, either way. Very informative.


EJSeidman

3/25

Matt,

Definitely good stuff for me to look up. I did find that the overall pitch selection discrepancy was minimal: 72% in save, 74% in non-save. As far as pitch data based on rest... well... you may have just given me an idea for an upcoming article ;-).


EJSeidman

3/25

That is, 72% fastballs vs 74% fastballs.


dsc250

3/25

Is it really true that closers on worse teams have fewer save situations? I always think back to Bryan Harvey in 1993, with his 45 saves on the 64-98 Marlins. Randy Myers, who led the NL that year with 53 saves, played for the barely-above-.500 84-win Cubs.

It sure makes sense that closers on better teams would have more opportunities, but we've seen again and again that what makes sense often isn't the case in baseball when closely scrutinized.


EJSeidman

3/25

As with all aspects of baseball, outliers will exist even when the majority represents something else. The idea is that better teams will have more opportunity for save situations... doesn't necessarily mean it is always true, or even that the teams will employ the designated closer in all of the games... just that more opportunity generally exists.


jdseal

3/26

Here in Cincinnati, one of the (many, undeserved) knocks on Adam Dunn was that while he hit his 40 home runs every year, a disproportionate number were solo homers, with no one on base, when the team was down by a bunch of runs late in the game; that he wasn't "clutch", most of his home runs didn't matter. I admit I had the same thought, watching the team every day, it did SEEM that way, but the analyst in me said it was probably just perception, or small sample size.

Until I got to thinking, the conventional wisdom when you're pitching with the bases empty and a big lead is you don't nibble around and walk the guy, you "go after" him. If there really is a different pitching style in certain situations, then it would be no surprise if results were different, and Adam Dunn really did hit more home runs when pitchers were "challenging" him.

If a different situational pitching style in low-leverage situations can show up in a hitter's stats, it could and should certainly show up in pitchers' stats as well, whether they are closers or not.

I would be curious whether there are differences in non-save situations that are non-save because the team has too big of a lead to qualify, versus situations where it's non-save because the game is tied (closers often used in extra inning games), or because the team is slightly, or hopelessly, behind.


EJSeidman

3/26

This interests me as well, but alas we do not have enough data. Let's promise to revisit this in 5 yrs!


ostrowj1

3/26

I think you still need WAY more data to even perform the analysis you are trying. I think you need to look at individual pitchers' performance splits in save / non-save games. Maybe the non-save data has a higher percentage of Joe Borowski fastball of suckiness. Of course, to get a sizable sample of non-save games you would probably need to go across years, which adds another realm of variance...

Regarding your previous article on effectiveness, I am surprised that the data showed closers were more effective in save situations. This seems like another instance where data collection may be impossible. How do you account for the difference in the quality of opposition? Non-save situations probably include a lot of Neifi Perez at-bats...


EJSeidman

3/26

At the risk of being rude, did you even read this article? I literally went over each of your points and fully acknowledged that this is just a start, nowhere even near the full study. Of course we need more data. This has 41 pitchers in 1 season... we need much more, but I mentioned that at least 4 times throughout.

I DID look at individual pitcher splits, and then calculated the weighted results, and to ensure that guys like Borowski weren't hurting the rest of the sample, I partitioned it further.

And I have a whole paragraph here designated to discuss how strength of opponent needs to be incorporated.


rsambrook

3/26

Great Article Eric

I'm looking forward to more analysis using Pitch F/X as more seasons of data are collected and the technology is perfected.


EJSeidman

3/26

As am I... as am I. For now, it's more of a supplement to analyses rather than the definitive form of analysis. In several years' time, we will be able to do many of the same things we have with other data, such as aging curves and studies along those lines.


AutomatedTeller

3/26

I have no problem believing that closers do worse in non-save situations than in save situations.

Maybe in save situations they have more adrenaline and they use that adrenaline effectively. And they are used to it, so their muscle memory is tied to the adrenaline... thus, when it's not a save situation, they are less effective, because they don't have the extra jolt.


EJSeidman

3/26

AutomatedTeller,

Exactly... there is something different about pressure, and we shouldn't be suppressing humanistic elements of the game but rather using them in conjunction with the stats. Like I wrote above, do closers have different pitch data in/out of save situations? Yes, and they should be expected to because the approach is different. In non-save they might be more prone to throw in the zone, nibble less, etc.

But I don't want us to say they do "worse." They pitch DIFFERENTLY, which may lead to different results.


pizzacutter

3/26

Don't hang around with clinical psychologists. They're all nuts.


LynchMob

3/26

It's my understanding that Trevor Hoffman has said that he is less effective in non-save situations ... and explained that the game situation (i.e. pressure) impacted the hitters and their response to the change-up ... "if save-situation, then eager/anxious/under-pressure and therefore susceptible to swinging early at a good change-up" ... but "if non-save-situation, then relaxed and able to recognize change-up and smash it" ...

As you state, there sure seems to be a lot of experiences that have helped fuel this idea ...

Thanks for digging into it for us! Can't wait for more data ...


JackieAtUPS

3/27

Eric,

Something I would be interested in seeing is how do different types of closers perform in regards to the split. By that, I mean, does Papelbon approach a non-save situation differently than someone like Joe Borowski? The data seems to show that guys may "relax" and pitch for contact more. But what about closers who already rely on contact? Guys like Borowski and Todd Jones and Everyday Eddie who were effective closers but didn't have lights out stuff. Are they pitching in the same way and it just so happens that a bleeder or two gets through and when they put up a stinker, we see it and say, "eh, he's a junkballer anyway, at least this happened tonight and not with a two run lead". I guess you would start with K/9 or something like that to differentiate between a "stuff" closer and "guile" closer. That might be interesting to take a look at.


EJSeidman

3/27

Do you mean a breakdown based on batted ball types? Like those who miss bats vs. those who pitch to contact? It's definitely doable. What I'd like to do is get a couple more ideas like this and put together an article dissecting the suggestions.

