March 2, 2011
Waiting for Relief
October 20, 1993: It was the end of the top of the eighth, and Jim Fregosi was looking at the smoldering wreck of a winnable ballgame.
David West had already given Fregosi's buggy eyes cause to pop from their sockets in the sixth, when the big lefty had narrowed a five-run lead over the Blue Jays to three. Getting the lead back up to 14-9 with six outs to go seemed to set things right; surely even the ramshackle Phillies' pen couldn't blow this, could they? Hold the lead, tie the World Series, with Curt Schilling slotted for Game Five, and it would be anybody's series to win. But after watching Larry Andersen put three men aboard and surrender a run, Fregosi knew that a one-for-one outs-to-runs tradeoff wasn't going to fly without upsetting this particular applecart.
So, the veteran skipper fatefully summoned someone new into this mess, with runners on second and third, a four-run lead, and five outs to go. The alternatives to put out the fire at that point were choices between gasoline, nitro, and napalm: the only unused company Mitch Williams had in the pen at that point of the ballgame were fifth starter Ben Rivera and Bobby Thigpen. Thiggy may have been just 30 years old, but he'd been looking done for a couple of years already, because of a bad back that didn't care if he was the single-season record holder for saves.
It was an awful contretemps, but the bitter fruits of this self-inflicted decision tree were there for the plucking, and the Phillies were plucked. Fregosi reached for the Wild Thing, only to have to sit through a death-by-inches inning as the game slipped away in five batters and just 19 pitches, capped by Devon White's lead-seizing triple. It would be an even more horrifying outcome than the subsequently famous Game Six Series-winning shot off the bat of Joe Carter, which would echo and expand the impact of Fregosi's eighth-inning Waterloo in Game Four.
The long-suffering Fregosi might well have wondered, “How did I get here?”* It's certainly not where he expected to be, but for want of an arm he could count on, he and the Phillies had achieved October infamy with their late-game pyrotechnics.
This time of year, it might seem as if nothing of import is happening, because the wires are humming with the absence of information. This is the stretch of time when we get space-filling stories on pitchers being ahead of the hitters, hitters being ahead of the pitchers, injuries slowing somebody's ability to be ahead of anybody, and the dutifully reported assertions about leadership, team play, execution, and happiness about just getting to be here. Which is another way of saying it's spring training for everybody, as even the writers have to get their clichés in regular-season form. They're your friends, after all.
How teams order their ducks now is important, however. There's actual baseball afoot, and if it just happens to be effectively opaque to most of the jabberocracy, that's because it progresses at the pace the game plays at: a day, a game, a practice at a time. The time for most of the elective decision-making that managers are going to exercise simply isn't ripe yet, so best to just get over it. In particular, one area where everything that happens matters is also the place where the majority of a manager's decisions are likely to get made: the bullpen. In the days and weeks to come, managers will be picking their Opening Day pens, and perhaps getting a handle on their best alternatives once one reliever or another falls out of favor.
As much as I might grouse about the tactically stultifying present, with so little action on offense, it's important to keep in mind that this can boil down to so much whinging for an idealized past. If, as L.P. Hartley observed, the past's a foreign country where they do things differently, the present has its offerings. Picking a bullpen is the area where a manager can exert the most creative impact these days, as he sorts out his own usage patterns and adapts—or not—to the strengths and weaknesses of his collection of possible selections.
Every skipper already knows what his lineup looks like and who does what, or what his rotation regulars are supposed to do. Picking a pen, however, isn't simply a matter of choosing who will be relieving, but also sorting out what those relievers get to do, and why, and when. It isn't simply about picking the best arms, but building the best unit, an exercise in puzzle-building that demands familiarity. So the weeks to come will feature 30 different managers sorting out who they've got and what they're for, and testing them in game situations. It isn't particle physics or 36-point headline material, but it's a necessary part of the process.
Deciding who sticks can be especially frustrating if you want to get hung up on small samples and limited data and even more limited exposure; life doesn't always lend itself to a sober aggregation of inputs and careful cogitation. Even armed with reams of data and scouting information, there are just seven weeks for a manager to pick his bullpen, and what happens in that time conspires to jumble up even the best-laid plans. Although most skippers might have no more than a dozen likelies to pick from, even then you can wind up with the odd surprise. Lou Piniella certainly didn't walk into camp last year thinking, “I want three lefties in my pen, and James Russell is just the man to get me there.”
Getting it right and picking the right guys has its rewards. It's not a secret that there's a direct relationship between winning and having a good bullpen. Consider the relationship of Adjusted Runs Prevented to winning percentage and to Pythagenpat in the divisional era (1969-2010):
The correlation coefficient between ARP and winning percentage is 0.485, or what might be referred to as “pretty good.” As a metric that evaluates how relievers prevent runs, it's a counting stat that operates from a straightforward context-neutral proposition, which might be a bit lo-fi, especially when the relationship between WXRL and winning percentage is a slightly better .518. However, I hope I'm not alone in giving some thought to what Colin Wyers has noted in the past when it comes to leverage-dependent stats, in that the sequential interdependence of events makes leverage-based outcomes a bit problematic in evaluating performance: leverage as a concept often involves a stochastic value getting passed farther and farther back into an individual ballgame, until we run out of ballgame; the closer gets to reap that value, in addition to his save tallies and Holtzman-derived riches.
Which is not to say we should just chuck WXRL outright, but it's a metric to use with an understanding that it tends to favor closers, where ARP just tells us how well a pitcher or group of pitchers pitched, expressed in runs, not wins. Obviously, nobody's ever going to say, “We're going to lead the league in WXRL this year,” but if you wanted to say, “We want the best bullpen in baseball,” I lean towards ARP as a more straightforward expression for it.
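For anybody who wants to poke at these relationships on their own, here is a minimal sketch of the two calculations involved: a Pythagenpat expected winning percentage and a plain Pearson correlation coefficient. It is an illustration rather than the method behind the numbers above. The function names are hypothetical, the 0.287 exponent constant is one commonly cited value (others use 0.285) and is an assumption here, and the team rows are made-up placeholders, not real ARP or win totals.

```python
from math import sqrt


def pythagenpat_win_pct(runs_scored, runs_allowed, games, power=0.287):
    """Expected winning percentage from runs scored and allowed.

    The exponent floats with the run environment: (runs per game) ** power.
    The default power of 0.287 is an assumed, commonly cited constant.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    exponent = runs_per_game ** power
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / (spread_x * spread_y)


# Hypothetical team seasons: (bullpen runs prevented, winning percentage).
# These rows are invented for illustration only.
made_up_teams = [(-25.0, 0.420), (-5.0, 0.480), (1.1, 0.510), (20.0, 0.560), (35.0, 0.540)]
bullpen_runs = [row[0] for row in made_up_teams]
win_pcts = [row[1] for row in made_up_teams]

print(round(pearson_r(bullpen_runs, win_pcts), 3))
print(round(pythagenpat_win_pct(800, 700, 162), 3))
```

Feed it every team season since 1969 with real ARP and winning percentages, and the first function call is the sort of calculation that produces a figure like the 0.485 cited above.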
What that might have meant for the 1993 Phillies is a little helpful. Their team-wide WXRL tally was 6.039, which doesn't sound so terrible, right? Six wins above replacement wasn't great in 1993, rating eighth in the 14-team National League, and it reflects that the Phillies weren't terrible when it came to converting late leads into wins. Mitch Williams converted 43 saves, after all, but via WXRL he ranked just 22nd out of 26 relievers who got 20 or more save opportunities. Happily for him, he had Andersen and West outperforming him:
Which is another way of telling you what you probably already know if you remember the '93 season or the Wild Thing's career: he might have been great for entertainment value, but "explosive" described his performance, not his stuff, the mystique of the save as a descriptor of value made ludicrous and manifest. Beyond Williams' panic-inducing pitching, however, the Phillies' more basic problem was that this was essentially all they had going for them in the pen, at a time when other teams were adapting the La Russa playbook and busily stocking up on specialists. The Phillies never found a reliable fourth reliever, let alone a fifth or a sixth. They mucked around with Mark Davis, traded for Roger Mason and Thigpen and (too late for the post-season roster) Donn Pall, but Fregosi's ballclub still pointedly led the league in complete games.
Switch over to ARP, and you get a sense of how shallow the Phillies' pen was, and as a result how little it was contributing as a unit to the club's collective cause: 1.1 Adjusted Runs Prevented as a team over 162 games is effectively the statistical equivalent of “participant,” tied for 368th out of 566 teams in NL history and 406th in relief-only FRA in the era of divisional play and the 162-game schedule, against a more mediocre 276th in WXRL. The Phillies' bullpen might have done reasonably well in the role-play of closing over 162 games, but they simply weren't armed with much quality or depth when it came to getting people out and keeping runs off the board.
It's failures like the Phillies', for want of a full spread of options, that keep me thinking that paying attention to who gets selected and why, and then who gets used and how, is a fairly important story to follow right now. At the very least, I consider it cause for a wee bit of patience with the proceedings at present. The Rays might seem reliant on a collection of unlikelies, but Grant Balfour's track record three or four years ago wasn't any better than Kyle Farnsworth's is now (although it was at least shorter). The Red Sox might seem to have more relievers than they know what to do with, but it remains to be seen how Terry Francona will decide to use them. Balfour's uses as an Athletic won't be defined by his getting shellacked the first time out, but you also have to figure Bob Geren won't define usage on reputation alone, not when he has almost as many late-game alternatives in Oakland as Francona will in Boston. Watching 30 different skippers sort through their individual erector sets isn't exciting for those seeking instant gratification, but it's necessary, as well as necessarily slowly resolved.
Which leaves me with the prosaic proposition that what's happening, the "nothing" of who's getting to pitch and how they do when they're out there, is what matters. That's the actual news event, after all, and every decision maker charged with evaluating the action is paying attention to it. Admittedly, it's as unsexy as all get-out. Spring training box scores, manna though they may be after the long winter, are a mess, and we know spring training performance isn't predictive of anything in particular.
However, the point of the performance isn't to summarize a player's usefulness, but to suggest it. If a guy's struggling with locating his off-speed stuff and doesn't correct it in camp, that doesn't go unnoticed. If it takes a pitcher out of the running for a skipper who's picking his pen, we have to settle for actual, not statistical, significance. Eight months from now, you can be sure that two managers will be able to tell themselves how they got there, because it will spring from what they're sorting through right now.
* Perhaps the better question to ask, as was mulled in at least one bar in Chicago by a quartet of semi-sozzled skeptics, was “Where's the Pope [Donn Pall] when you need him?”
Thanks to Rob McQuown for his research assistance.