
Baseball Prospectus home
  
  

March 13, 2013

Skewed Left

Saberizing the Gold Gloves

by Zachary Levine


So we won this weekend. At least I think we won. At least I think they told me we won.

It was announced that the Gold Glove Awards will add a metric component to the traditional voting of major-league managers and coaches, a presumed victory for everyone who prefers the analytical and objective over the judgment of the human eye.

So why no celebration in this virtual household, which stands for just that?

First of all, the release didn’t give much information about the metric itself. Here’s a portion of the release from the Society for American Baseball Research:

As part of the multi-year collaboration beginning with the 2013 season, SABR will develop an expanded statistical resource guide that will accompany the Rawlings Gold Glove Award ballots sent to major league-level managers and coaches each year. In addition, SABR will immediately establish a new Fielding Research Committee tasked to develop a proprietary new defensive analytic called the SABR Defensive Index™, or SDI™. The SDI will serve as an “apples-to-apples” metric to help determine the best defensive players in baseball exclusively for the Rawlings Gold Glove Award and Rawlings Platinum Glove Award selection processes.
…
Beginning in 2013, the managers/coaches vote will constitute a majority of the Rawlings Gold Glove Award winners’ selection tally, with the new SDI comprising the remainder of the overall total. The exact breakdown of the selection criteria will be announced once the SDI is created later this summer.

In other words, they’re working on it.

But this isn’t a critique of SABR’s motives, which are absolutely in the right place—taking a step toward greater likelihood of getting it “right” and spreading knowledge of defensive statistics. Nor is it really even about the uncertainty of what will come out of the SABR conclave.

It’s about the certainty of the ugly process that will ensue.

1. There will be a fight when the metric comes out.
The problem with giving this committee the task of inventing a metric by modifying/splicing existing metrics is that it’s virtually impossible. It’s not like the analytics community hasn’t been trying.

Using data provided by Baseball Info Solutions, Mitchel Lichtman pioneered the Ultimate Zone Rating, which measures a fielder’s capacity to make plays inside and outside of a given zone. We at Baseball Prospectus use Fielding Runs Above Average as the defensive component of our value statistics; it focuses more on total plays made given conditions such as the pitcher’s ground-ball rate, the batter’s handedness, the ballpark and the base-out scenario.

Both are exhaustively researched and justified, and yet the choice of using one or the other in Gold Glove voting would lead to totally different results. At the positions where UZR is calculated and available at Fangraphs.com, battery not included, here is the breakdown of the top players from 2012 who played the whole season in the same league.

Position  FRAA              UZR              Gold Glove
AL 1B     Casey Kotchman    Mark Teixeira    Mark Teixeira
NL 1B     Adam LaRoche      Joey Votto       Adam LaRoche
AL 2B     Howie Kendrick    Dustin Pedroia   Robinson Cano
NL 2B     Aaron Hill        Darwin Barney    Darwin Barney
AL 3B     Brett Lawrie      Mike Moustakas   Adrian Beltre
NL 3B     Pablo Sandoval    David Wright     Chase Headley
AL SS     J.J. Hardy        Brendan Ryan     J.J. Hardy
NL SS     Brandon Crawford  Clint Barmes     Jimmy Rollins
AL LF     Alex Gordon       Alex Gordon      Alex Gordon
NL LF     Alfonso Soriano   Alfonso Soriano  Carlos Gonzalez
AL CF     Jarrod Dyson      Peter Bourjos    Adam Jones
NL CF     Angel Pagan       Michael Bourn    Andrew McCutchen
AL RF     Nick Swisher      Josh Reddick     Josh Reddick
NL RF     Jason Heyward     Jason Heyward    Jason Heyward

Of 14 positions, UZR and FRAA agree on the Gold Glover in exactly three of them, which is a huge issue for the mainstream public appeal of this vote. (Not even mentioning what would happen in the discourse if a part-time player like Dyson or Bourjos were ranked first at a position.)
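For readers who want to check that count, here is a minimal sketch that tallies where two columns of the table name the same player. The data is transcribed from the table above; the names `LEADERS` and `agreements` are illustrative only and are not part of any published tool.

```python
# Top-ranked player at each position, transcribed from the table above.
# Tuple order: (FRAA leader, UZR leader, Gold Glove winner).
LEADERS = {
    "AL 1B": ("Casey Kotchman", "Mark Teixeira", "Mark Teixeira"),
    "NL 1B": ("Adam LaRoche", "Joey Votto", "Adam LaRoche"),
    "AL 2B": ("Howie Kendrick", "Dustin Pedroia", "Robinson Cano"),
    "NL 2B": ("Aaron Hill", "Darwin Barney", "Darwin Barney"),
    "AL 3B": ("Brett Lawrie", "Mike Moustakas", "Adrian Beltre"),
    "NL 3B": ("Pablo Sandoval", "David Wright", "Chase Headley"),
    "AL SS": ("J.J. Hardy", "Brendan Ryan", "J.J. Hardy"),
    "NL SS": ("Brandon Crawford", "Clint Barmes", "Jimmy Rollins"),
    "AL LF": ("Alex Gordon", "Alex Gordon", "Alex Gordon"),
    "NL LF": ("Alfonso Soriano", "Alfonso Soriano", "Carlos Gonzalez"),
    "AL CF": ("Jarrod Dyson", "Peter Bourjos", "Adam Jones"),
    "NL CF": ("Angel Pagan", "Michael Bourn", "Andrew McCutchen"),
    "AL RF": ("Nick Swisher", "Josh Reddick", "Josh Reddick"),
    "NL RF": ("Jason Heyward", "Jason Heyward", "Jason Heyward"),
}

FRAA, UZR, GOLD_GLOVE = 0, 1, 2  # column indexes into each tuple


def agreements(leaders, i, j):
    """Return the positions where columns i and j name the same player."""
    return [pos for pos, row in leaders.items() if row[i] == row[j]]


matches = agreements(LEADERS, FRAA, UZR)
print(len(matches), "of", len(LEADERS), "positions:", matches)
# prints: 3 of 14 positions: ['AL LF', 'NL LF', 'NL RF']
```

The same helper can compare either metric against the actual Gold Glove column by swapping the index arguments.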

It will be a very hard sell for the analysts out there to peddle this idea when it creates division within the sabermetric media and surely within SABR’s membership itself.

But that will only be the first step.

2. There will be a fight when the first vote comes out.
There already are gripes, and legitimate ones, when the award comes out. Remember Rafael Palmeiro as a DH in 1999? Derek Jeter winning all five of those awards? Adding the statistical component saves us from having designated hitters win, and that is certainly a good thing.

But instead of uniting them in some awkward arranged marriage, this has the potential to pit the traditionalists against the statistical analysts. When an award comes out where the vote doesn’t match the SDI, that will become a binary-outcome referendum on both parties.

The coaches got it wrong or the numbers got it wrong. One of them had to get it wrong, and dammit, we need to know who it was.

Instead of a celebration of the winner, it’s an examination of the process, which will become very tiresome very quickly. We don’t need any more “WAR, What Is It Good For” columns. Even if SDI has never been in a song lyric, let’s not take that chance over this. There are plenty of more worthy fights for the importance of analytical thinking.

3. We’ll argue over whether stats should be applied directly to more awards.
This is a tough one, because if statistics are going to be applied to any award voting, Gold Gloves might be the worst ones to start with. One-year defensive metrics are notoriously unstable: Alfonso Soriano’s 7.9 FRAA in 2012 came after a -6.4 in 2011 and a -8.0 in 2010. And the differences between metrics mentioned above make the Gold Glove possibly the worst award to add a statistical component to now. (Okay, Manager of the Year is worse.)

When the statistical community puts its stamp on this award, it has to be prepared to stand behind it. The stats say Soriano, always thought to be a poor defender, was the most accomplished left fielder in the league last year. The stats say a part-time player was the most accomplished center fielder in the league last year.

Is a statistic that research says can take three years to stabilize really the one we want imprinted on a single-season award? There’s an argument to be made that adding a WARP/WAR component to the Baseball Writers Association of America’s MVP awards—or one of the lesser-known MVP-type honors, or maybe even the Hall of Fame—would be a better step.

But SABR doesn’t control any of the awards, and the BBWAA has not been looking to cede any control over its awards (disclosure: the author is a member of both organizations). So this isn’t a knock on SABR, which is doing what it can to advance the discussion, just an unfortunate case of who came calling and had some ground to make a deal.

We hate the process of most awards, yet we haven’t come up with a much better one either here or in the case of the BBWAA awards. If coaches have proven to be the worst voters of any electorate—and it’s really neck-and-neck between them and fans—then change the electorate. Have the people in front offices paid to assess value hand out awards for defensive value.

Perfecting the Gold Glove and other awards is a noble pursuit. This is a small step toward that goal that might be missed in the very predictable reactions to every part of the process.

Zachary Levine is an author of Baseball Prospectus. 

Related Content:  Sabermetrics,  Sabr,  Gold Gloves,  SDI

21 comments have been left for this article.

bluesman98

So from the defensive metric numbers from last year we are to conclude that Alfonso Soriano is a better defensive player than Carlos Gonzalez. And not just a little better--a whole lot better.
What a complete joke.
While there is some merit to what you all do with the defensive metrics, when a result like the above comes out, you need to take a serious look at what you are doing.

Mar 13, 2013 06:24 AM
rating: -3
 
Behemoth

Or possibly you need to re-examine your own prejudices as well? It's always possible your eye is not the ultimate best way of determining defensive competence.

Mar 13, 2013 06:43 AM
rating: 6
 
Drungo

So what would you do when the metrics don't agree with your subjective observations? Veto the numbers? Have a subjective fudge factor? Invalidate all metrics that have outliers and idiosyncrasies?

Nobody believes Darin Erstad was a .355 hitter, so while there's some merit in what people are trying to do with offensive metrics, when a result like that comes out someone needs to take a serious look at what they're doing.

Mar 13, 2013 06:53 AM
rating: 3
 
whichthat

The Erstad example illustrates the problem here.

Batting average is a very simple stat that measures exactly one thing. What it says about Erstad's merits as a player is up for debate, but it is a factual statement that for one season, 240 of his 676 at-bats ended with a hit.

Advanced fielding metrics incorporate a lot of things and reflect various weightings and interpretations. I can't compute UZR without a spreadsheet -- come to think of it, I can't compute it AT ALL because it includes proprietary data.

Mar 13, 2013 10:41 AM
rating: 2
 
BarryR

Bingo. If just one of these two methodologies said that Soriano was the best LF, I would have just thought it was some bizarre systemic anomaly and ignored it. But when both of them say it, I have to wonder what caused this to happen. I mean, Alfonso Soriano... really? The problem is that I can't see the numbers, so I can't see if there is a flaw they have in common, or if they are really meaningful. I just have to trust that these things work. I don't trust FRAA, so I am dubious that its inclusion in the Gold Glove (or Platinum Glove, whatever that is) voting will be useful.

Mar 13, 2013 18:46 PM
rating: 2
 
whichthat

That reliance on trust is the ultimate problem. And it ties in with Russell Carleton's piece from Monday. If Rafael Palmeiro 1999 is a damning indictment of the current voting structure, shouldn't we consider Alfonso Soriano 2012 a serious shortcoming for FRAA and UZR?

Mar 13, 2013 21:26 PM
rating: 0
 
BarryR

I don't think so. Palmeiro won a popularity poll which he shouldn't have even qualified for; Soriano is based on statistical evidence, which carries a degree of objectivity.
Soriano was the leader in those two statistical measures; that is just a fact. The question is how worthwhile a fact that is, which depends on how much faith you have in those respective statistics as a measure of defensive value.

By the way, the system in these comments needs fixing. Just because bluesman98's comment received a certain number of negative votes, it was grayed out. Given that it was only a -4, that seems extreme to me, but worse, since it started the thread, graying it out took this whole thread with it. That is asinine, as this is a fairly interesting discussion. I voted bluesman back up, in an attempt to restore the thread. Please fix this.

Mar 13, 2013 22:12 PM
rating: -1
 
Dave Pease (BP staff)

Barry, the "hidden" comment floor is -4, but if it makes you feel any better, hidden comments are quite widely read at BP.

Mar 14, 2013 09:26 AM
 
Adam Hobson

Or you can read the entire article:

"One-year defensive metrics are notoriously unstable—Alfonso Soriano’s 7.9 FRAA in 2012 came after a -6.4 in 2011 and a -8.0 in 2012."

...

"The stats say Soriano, always thought to be a poor defender, was the most accomplished left fielder in the league last year. The stats say a part-time player was the most accomplished center fielder in the league last year.

Is a statistic that research says can take three years to stabilize really the one we want imprinted on a single-season award?"

Mar 13, 2013 07:41 AM
rating: 11
 
JimmyJack

I have never been confident with any defensive stats.

In my opinion, the Gold Gloves should be a collected vote by advance scouts. This is what they do for a living. They would know best.

Mar 13, 2013 08:41 AM
rating: 5
 
Gordon

As I understand it, the exact use of the fielding stats won't be known and will be one component of the voting. If I were to do it, I'd use some sort of agglomeration of the statistical fielding metrics for that component of the voting.

Mar 13, 2013 08:44 AM
rating: 0
 
dbiester

Isn't SDI the Reagan-initiated "Star Wars" defense against ICBMs?
http://en.wikipedia.org/wiki/Strategic_Defense_Initiative

Mar 13, 2013 09:26 AM
rating: 1
 
BarryR

That was my exact thought when I saw the little TM next to it. How can you trademark an acronym which already has historical significance?

Mar 13, 2013 16:41 PM
rating: 0
 
ScottyB

We need something like confidence intervals around our stats. If Soriano's defense is rated 7.9 (8.7 - 2.5) it gives better context.

Mar 13, 2013 10:54 AM
rating: 0
 
Ben Lindbergh (BP staff)

We can do that, and have done that (see this article by Colin Wyers, for instance). It makes sense. The potential problem with doing that is that you run the risk of confusing people who expect to see one number, and it's also sort of a pain from a display standpoint.

Mar 13, 2013 11:08 AM
 
Nathan Aderhold

What about displaying as a range rather than three separate numbers, perhaps in graph form?

Something like this -- http://robslink.com/SAS/democd6/col8.png -- but inverted so the ranges are horizontal.

Mar 13, 2013 13:16 PM
rating: 0
 
rweiler

You could order the players at each position by UZR and FRAA, weight by rank order and use the sum of the weighted scores or something like that. My big problem with any of the defensive metrics is that they jump around so much season to season. That might just be due to nagging injuries, or it could be that they are just fundamentally flawed.

Mar 13, 2013 12:40 PM
rating: 1
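The rank-weighting idea in rweiler's comment can be sketched roughly as follows. This is only one possible reading of "weight by rank order and use the sum of the weighted scores": it assigns Borda-style points within each metric and adds them up. The players and run values are hypothetical placeholders, not real 2012 figures, and the function names are made up for illustration.

```python
def rank_points(scores):
    """Give the best player n points, the next n-1, down to 1 for the worst."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {player: n - i for i, player in enumerate(ordered)}


def combined_ranking(metrics, weights=None):
    """Sum weighted rank points across metrics; return (player, total) best-first.

    Assumes every player appears in every metric. Ties keep insertion order.
    """
    weights = weights or {name: 1.0 for name in metrics}
    totals = {}
    for name, scores in metrics.items():
        for player, points in rank_points(scores).items():
            totals[player] = totals.get(player, 0.0) + weights[name] * points
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical runs-saved values for four made-up players at one position.
metrics = {
    "UZR": {"Player A": 9.0, "Player B": 4.0, "Player C": -2.0, "Player D": 1.0},
    "FRAA": {"Player A": 3.0, "Player B": 6.5, "Player C": -1.0, "Player D": 5.0},
}

for player, total in combined_ranking(metrics):
    print(player, total)
```

With equal weights, Player B finishes first here even though Player A leads in UZR, which shows how a rank-based blend can crown someone who tops only one of the two lists. It does nothing about the season-to-season volatility raised in the same comment.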
 
Chucko

I hear you 100%. The tendency for these things to generate ratings which run the gamut is hard to grab ahold of as a fan with limited sabermetric understanding (meaning myself). Taking it two steps further and urging casual fans and old-school managers to adopt it is going to lead to fireworks. Boring, boring fireworks.

That said, the idea that this year-to-year instability in the metrics is a definitively bad result strikes me as arbitrary and sorta silly (not directed at you, rweiler - hope that's obvious). Why is it that we can't handle volatility among our star defensive players? We see it in Cy Young and MVP ballots, so why do we expect fielding to create different results? I am more than willing to believe that defensive ability is not simply a flat, innate talent. Given that variables like health, park effects, weather, opponents' lineups, etc. all factor into every single play, it would be funny to expect consistent annual results. And this is even before the bias of the observer (e.g. the bias towards the flashy play instead of the smooth play) is recognized as playing a larger role in how we judge defense vs. hitting and pitching. I'd think we should be MORE skeptical of an award which features the same names year in and year out in this instance. But that's not how we're trained to see our star athletes, so it'll be a hard sell all the same.

Mar 13, 2013 14:02 PM
rating: 4
 
rweiler

The problem is that we do see considerable year to year stability in pitching and hitting metrics, though pitching metrics are substantially more variable due to injuries. That doesn't seem to be the case with fielding metrics.

Mar 13, 2013 15:46 PM
rating: 0
 
R.A.Wagman

This.
Why can't we just accept that fielding events took place and that, whatever else happened, these events may not accurately gauge a player's innate fielding ability? They are just something that he did, like driving in a baserunner.

Mar 13, 2013 20:11 PM
rating: 1
 
Cecilia Tan (BP staff)

I suppose the philosophical question at the heart of the debate is what is the Gold Glove award *for*? If it's to say "this guy had the best defensive performance in a given year" then the fact that a guy is "usually" not a great defender is irrelevant if that year he gets the best "score" by some metric.

In the ideal world, we'd have a metric that correlated to defensive ability, but in the real world maybe defensive *performance* is simply volatile season to season.

Mar 21, 2013 12:45 PM
 