
“I have not failed. I’ve just found 10,000 ways that won’t work.”

–Thomas A. Edison

We learn as much or more from our failures as we do from our successes, a lesson Edison learned in spades as he worked to develop the incandescent light bulb. Of course, he was also the holder of 1,093 patents, so he certainly enjoyed his successes as well. And while our little dabblings in baseball analysis in no way compare to the work of true inventors, we do share the common experience of sometimes having to engage in significant rework of an idea to bring it just a little closer to where we want it. So this week we revisit the topic of outfield defense and take a slightly different perspective in creating a play-by-play fielding metric for outfielders. We’ve already got version 1.0 of SFR for infielders, which was discussed last week.

However, before we delve back into the topic of outfield defense, I want to make good on my promise to release some minor league numbers. So at this link you’ll find the same spreadsheet linked to in last week’s article, this time with a new tab that includes all 10,774 2007 minor league player, team, and position combinations. Have fun.

We now return to our regularly scheduled topic.

A Different Approach

After my last foray into outfield defense a few weeks ago, I received helpful feedback from readers and the sabermetric community at large about the methodology I employed to calculate the SFR values. For those who weren’t with us last time, the algorithm described in that column can be summarized like so:

  • First, calculate a baseline for the year and league that includes the percentage of balls that fall for hits and the resulting total bases across the following axes: position, hit type (fly, line drive, pop up, and ground ball), and batter handedness. The resulting matrix will be used for comparison purposes.
  • Calculate the same matrix for each fielder for the year and league in question and compare the matrix to the baseline in order to calculate the expected number of runners and expected total bases given the same number and quality of opportunities. Each difference in the number of expected runners is credited at 0.74 runs (0.46 for the hit plus the negative of -0.27 for the out) and each additional total base above and beyond the number of expected runners is credited at 0.33 runs (the difference between the value of a double and a single).
  • Because outfielders are much more constrained by their park than infielders are, we need to make a park adjustment. This is done by creating a three-year park factor for each park and position using the same context (hit type and batter handedness) as above. The park factor is calculated by comparing all plays at the park and position in question with all plays in games by the home team when on the road. In other words, the park factor is calculated as the ratio of the rate at which balls are turned into outs and extra bases are gained at home, versus that done on the road over a three-year span. The park factor (not weighted but simply averaged over the three years) is then applied to each fielder opportunity at each park with the result being the adjusted SFR value.
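The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the published numbers: the function names, the bucket keys, and every input number in the example are made up, and the park factor here covers only the out-rate half of the ratio described above. The 0.74 and 0.33 run values are the ones quoted in the text. Deltas are taken as expected minus actual, so positive values favor the fielder.

```python
from statistics import mean

RUNNER_VALUE = 0.74      # runs per runner prevented (figure quoted in the text)
EXTRA_BASE_VALUE = 0.33  # runs per total base beyond the runner himself

def expected_from_baseline(opportunities, baseline):
    """Expected runners and total bases for a fielder's mix of chances.

    opportunities: {bucket: balls fielded}
    baseline:      {bucket: (hit_rate, total_bases_per_ball)}
    A bucket is a (position, hit_type, batter_hand) tuple (hypothetical layout).
    """
    exp_runners = 0.0
    exp_tb = 0.0
    for bucket, balls in opportunities.items():
        hit_rate, tb_per_ball = baseline[bucket]
        exp_runners += balls * hit_rate
        exp_tb += balls * tb_per_ball
    return exp_runners, exp_tb

def run_value(delta_runners, delta_tb):
    """Credit each runner prevented, then each extra base beyond the runners."""
    extra_bases = delta_tb - delta_runners
    return RUNNER_VALUE * delta_runners + EXTRA_BASE_VALUE * extra_bases

def park_factor(home_rates, road_rates):
    """Unweighted average of yearly home/road ratios (e.g. three seasons)."""
    return mean(h / r for h, r in zip(home_rates, road_rates))

# Made-up example: 140 chances in two buckets, 43 runners and 66 TB allowed.
opportunities = {("CF", "fly", "R"): 100, ("CF", "line", "L"): 40}
baseline = {("CF", "fly", "R"): (0.20, 0.35), ("CF", "line", "L"): (0.70, 1.00)}
exp_runners, exp_tb = expected_from_baseline(opportunities, baseline)
delta_runners = exp_runners - 43
delta_tb = exp_tb - 66
runs_saved = run_value(delta_runners, delta_tb)
print(round(runs_saved, 2))
```

How the park factor is then applied to each opportunity is not spelled out in enough detail to reproduce here, so the sketch stops at computing the ratio.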

When it was all said and done, for 2007 Coco Crisp was our leader at +35 runs, and Brian Giles found himself on the bottom at -23 runs. Although it wasn’t mentioned in the previous column, the correlation coefficients by position when compared with UZR for 2005 and 2006 were:

Table 1: SFR beta 1 Correlations with UZR 2005-2006
Pos         r
Overall  0.54
-------------
LF       0.57
CF       0.66
RF       0.34

The correlations here are lower than those for infielders in part because in this system we don’t have a method for partitioning balls between outfielders. If a ball falls in the gap between a left and center fielder, the fielder (let’s say the center fielder in this case) who picked it up will be debited although it could have been the case that the ball was really in the left fielder’s “area of responsibility” and the center fielder was simply doing his job in backing up the play.

Also, I’ll have to admit that the last row surprised me. I’ll have more on the right field issue a little later, but in the meantime, it should be noted that Sean Smith encountered a similar result in correlating his TotalZone and TotalZone+ systems with UZR.

But as mentioned above, some readers had questions and reservations with this approach, with the primary issue being how the park factors were calculated and applied. Of those issues, probably the largest complaint was that by calculating the park factors using all home and road games for a particular team associated with a park, a large percentage of the resulting factor will be influenced by the home team’s regular fielder at that position. For example, if Manny Ramirez plays the vast majority of games in left field for the Red Sox from 2005 through 2007, then the park factor for 2007 is nearly 50% based on the actions of Manny being Manny.

In order to correct for this and to streamline the calculations, I created a second beta version of SFR for outfielders that uses the following approach:

  • First, rather than create a baseline matrix as described above, this version immediately calculates the percentage of balls that become hits and the cost in terms of total bases for each fielder, park, position, hit type, and batter hand combination for the year and league. Everything is then calculated from this raw data. In order to prep the data for the next step, we also run this for a five-year span surrounding the year and league we’re dealing with (e.g. for 2005 we run data for 2003 through 2007).
  • Next, we calculate the expected runners and expected total bases, along with the run values for the differences by comparing how the fielder performed in each permutation of park, position, hit type, and batter handedness to how all fielders (not including himself) did in that combination in a five year span centered on the year in question. That’s a mouthful, but what it means is that rather than using a park factor based largely on the fielder who calls that park home, we now effectively calculate an individualized park factor for each fielder at each park for each hit type and batter hand.

    This approach not only removes the bias that may be present by over-representing a particular fielder in a particular park, but it also makes it more likely that an outfielder who is especially good at working with the eccentricities of his park will get the appropriate credit. You’ll also notice that this approach no longer requires data from road games played by a team that were formerly used to create the ratios for the park factor, nor does it require a baseline matrix, since the fielder is compared to all other fielders who have fielded balls at that park. The cost of making these changes is that they reduce the sample sizes upon which the “park factors” are based. However, we’ve made up for that somewhat by including two more years of data when available.

  • Finally, we’ll take one additional step. Because we no longer first compare individual fielders with a baseline for the year and league, we’ve lost a bit of the seasonal context. As a result, we then make a correction that ensures that the totals for the league and position equate to zero. In this way, each player’s SFR score is relative to a baseline of zero for the year and league and can be thought of as runs above or below average for the position.
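For the code-minded, the two ideas that distinguish this version can be sketched as follows: comparing a fielder to everyone except himself in the same bucket, and shifting the results so a league and position group sums to zero. The record layout and function names are hypothetical, the records are assumed to be pre-filtered to the five-year window, and the numbers are made up. Spreading the zero-sum correction in proportion to balls fielded is an assumption of this sketch; the description above only says that the totals are made to equate to zero.

```python
from collections import defaultdict

def leave_one_out_rates(records, fielder):
    """Hit rate and TB per ball in each bucket, excluding `fielder` himself.

    records: iterable of (fielder, bucket, balls, runners, total_bases),
    where bucket is a (park, position, hit_type, batter_hand) tuple.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for name, bucket, balls, runners, tb in records:
        if name == fielder:
            continue  # the fielder never contributes to his own comparison group
        t = totals[bucket]
        t[0] += balls
        t[1] += runners
        t[2] += tb
    return {
        bucket: (runners / balls, tb / balls)
        for bucket, (balls, runners, tb) in totals.items()
        if balls > 0
    }

def zero_center(raw_runs, balls):
    """Shift raw runs by a per-ball constant so the group sums to zero.

    raw_runs, balls: {player: value} for one year, league, and position.
    Assumption: the correction is proportional to balls fielded.
    """
    per_ball = sum(raw_runs.values()) / sum(balls.values())
    return {p: raw_runs[p] - per_ball * balls[p] for p in raw_runs}

# Made-up example: three fielders sharing one bucket.
bucket = ("Park X", "CF", "fly", "R")
records = [
    ("A", bucket, 100, 40, 55),
    ("B", bucket, 200, 100, 130),
    ("C", bucket, 100, 50, 70),
]
rates_for_a = leave_one_out_rates(records, "A")
hit_rate, tb_rate = rates_for_a[bucket]
delta_runners_a = 100 * hit_rate - 40  # expected minus actual runners for A

centered = zero_center({"A": 10.0, "B": -2.0, "C": 4.0},
                       {"A": 100, "B": 200, "C": 100})
```

Because the comparison group changes for every fielder, the rates must be recomputed once per fielder and bucket, which is consistent with the heavier number crunching this version requires.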

Software developer geek alert: Although there is far less code involved with this approach (from over 300 lines in the first version to just under 200 in this one), the amount of number crunching has increased because of the need to create the “individualized park factors.” Running the numbers for 2003 through 2007 took about three hours on a 2.33 GHz Core 2 Duo laptop with 2 gigabytes of memory.

So after all of that processing, Table 2 includes the new leaders and trailers in “Outfield SFR beta 2.” The Runners column is the number of runners who reached base on balls fielded by the fielder, TB is the resulting number of total bases, DRunners is the delta in terms of runners, and DTB is the delta in terms of total bases.

Table 2: Top and Bottom Ten 2007 Outfielders By Position
Player               Pos       Balls Runners     TB  DRunners   DTB     SFR
Covelli Crisp       Center       709     301    374      47      79      39
Carlos Beltran      Center       708     319    394      27      50      21
Grady Sizemore      Center       791     394    474      23      44      17
David DeJesus       Center       757     357    448      20      37      14
Nook Logan          Center       444     196    238      16      31      13
Felix Pie           Center       198      78     92      14      25      12
Alfredo Amezaga     Center       381     173    216      14      29      12
Nyjer Morgan        Center       148      64     74      10      17       9
Vernon Wells        Center       619     298    378      15      23       9
Melky Cabrera       Center       680     333    412      17      21       8
--------------------
Juan Pierre         Center       747     381    462      -3       4      -7
Torii Hunter        Center       777     388    482       0      -1      -7
Dave Roberts        Center       436     212    267      -4      -6      -7
Tike Redman         Center       161      83    101     -10      -9      -8
Mike Cameron        Center       742     377    461      -3      -6      -9
Hunter Pence        Center       493     233    315      -1     -15     -10
Aaron Rowand        Center       778     386    503      -1      -8     -10
Ryan Freel          Center       281     145    183     -10     -11     -10
Chris Young         Center       729     375    472      -9      -3     -11
Nick Swisher        Center       295     156    203      -8     -20     -12
Billy Hall          Center       592     297    386      -6     -20     -14
---------------------------------------------------------------------------
Matt Holliday       Left         663     367    477      28      48      26
Eric Byrnes         Left         460     221    287      18      29      16
Jason Botts         Left         104      41     55      14      18      12
Ryan Church         Left         401     205    262       7      26      11
Reggie Willits      Left         296     145    187      12      18      10
Alfonso Soriano     Left         546     301    395       6      22       9
Adam Lind           Left         278     141    177       6      17       8
Carl Crawford       Left         619     332    424       4      19       7
Norris Hopper       Left          76      39     48       7      13       7
Rob Mackowiak       Left         218     120    151       6      13       6
Andre Ethier        Left         161      88    106       5      14       6
--------------------
Barry Bonds         Left         369     207    276     -10     -14      -9
Josh Willingham     Left         558     347    458     -11     -13     -10
Jason Bay           Left         682     416    551      -8     -16     -10
Adam Dunn           Left         604     359    483      -9     -16     -10
Manny Ramirez       Left         463     281    387      -8     -19     -10
Luis Gonzalez       Left         462     270    370      -6     -23     -11
Moises Alou         Left         336     198    282      -6     -30     -13
Pat Burrell         Left         468     292    386     -17     -24     -16
Raul Ibanez         Left         571     347    470      -9     -37     -17
Chris Duncan        Left         401     243    321     -22     -30     -19
---------------------------------------------------------------------------
Jeremy Hermida      Right        505     258    314      12      45      18
Luke Scott          Right        388     191    241      20      31      18
Austin Kearns       Right        717     342    447      19      30      16
Vladimir Guerrero   Right        414     206    262      13      29      14
Carlos Quentin      Right        270     132    183      13      16      10
Magglio Ordonez     Right        538     277    373      19      10      10
Andre Ethier        Right        382     205    253       5      23       8
J.D. Drew           Right        426     214    269       8      15       7
Gabe Gross          Right        137      67     85       7      14       7
Nick Swisher        Right        193      84    114      10       8       6
Corey Hart          Right        470     217    288       8      11       6
--------------------
Nate Schierholtz    Right         99      51     70      -8     -12      -8
Ken Griffey Jr.     Right        615     324    414     -11      -6      -8
Shawn Green         Right        406     203    270      -7     -13      -8
Michael Cuddyer     Right        568     312    420      -7     -14      -9
Jack Cust           Right        190     111    157      -5     -22     -10
Jose Guillen        Right        616     349    445     -14     -13     -11
Juan Encarnacion    Right        288     163    218     -13     -18     -12
Bobby Abreu         Right        631     318    434      -5     -27     -13
Trot Nixon          Right        288     159    222      -9     -27     -13
Mark Teahen         Right        671     353    496     -17     -37     -21
Brian Giles         Right        472     256    358     -17     -42     -22

Coco Crisp still takes the top spot and gains four runs in the process, while Carlos Beltran, David DeJesus, and Grady Sizemore continue to look very good. Melky Cabrera falls from +15 to +8, and Nook Logan holds his own. At the bottom, the soon-to-be-infielder-again Bill Hall claims the title at -14, taking over from Chris Young, who moves up a few spots to -11. Nick Swisher (who at -12 in center fared much worse than he did in right field at +6) and Ryan Freel (-10) now also make appearances in this unenviable list. In left field Matt Holliday (+26) continues to shine, as does Eric Byrnes (+16), with Jason Botts (+12) making a surprise appearance. None of those at the bottom of the list will come as a surprise (and yes, for all of those readers who couldn’t believe Manny wasn’t near the bottom, I hope you’re satisfied with his -10) as Chris Duncan continues to lead them at -19. In right field, Jeremy Hermida (+18) and Luke Scott (+18) continue to shine while Carlos Quentin (+10) and Gabe Gross (+7) sneak into the mix. At the bottom of the right fielders we find Brian Giles at -22 (more on him in a moment) while the rest of the list is filled with familiar faces.

In order to get a feel for how this second beta version compares with UZR, I ran a regression on all outfielders who fielded 500 or more balls from 2003 through 2006. The summary results are found in Table 3.

Table 3: SFR beta 2 Correlations with UZR 2003-2006, players with 500 or more balls fielded
Pos        #       r
Overall  149    0.64
--------------------
LF        49    0.75
CF        50    0.78
RF        50    0.24

Ouch. While we were looking pretty good at left and center with the correlation coefficients approaching what we saw for infielders, things kind of fall apart in right field. A plot of the correlations follows in Figure 1.

Figure 1. SFR vs. UZR, >=500 Balls Fielded for 2003 through 2006

[Figure: scatter plot of SFR vs. UZR]

From this graph you can see how right fielders are more scattered than the other positions, with some real outliers in Brian Giles and Juan Encarnacion; Magglio Ordonez is the other blue dot almost directly below Giles. Their totals (across all positions they played) are shown below.

Table 4: Brian Giles
Year    SFR    UZR
2003     +0     -3
2004     +1    +24
2005    -20    +20
2006     -2    +23

Both 2005 and 2006 show huge differences in how the systems treat Giles and, as noted above, SFR thinks he’s no great shakes in 2007 either. Having watched him play in person many times in the past several seasons, I do find it hard to believe he could be worth anywhere near +20 to +25 runs, but then again he also doesn’t seem like a -20 type fielder either.

Table 5: Juan Encarnacion
Year    SFR    UZR
2003    -17    +14
2004     -4     +4
2005    -19     +9
2006     -3     +4

Encarnacion is also consistently at odds, as SFR rates him negatively every season, while UZR likes him. Once again, there is nothing obvious that jumps out at me that would indicate that one is clearly right and the other wrong.

These two players account for the bulk of the variation between the two systems, resulting in the low correlations. It would be interesting to hear theories as to why these two players (and to a lesser extent the flip side of Jose Cruz Jr. and Ichiro Suzuki with an SFR of +53 and UZR of +17) would differ so strongly. Is there a positioning issue? Are there particular park effects that are not somehow being accounted for? I’ll admit that as with the issue of first basemen and UZR I’m a bit stumped at the moment.

Moving on to happier things, we can note some pretty strong correlations in left and center field, so we’ll throw a few out there so that you can see how they compare.

Table 6: Manny Ramirez
Year    SFR    UZR
2003     -2     -9
2004     -3    -11
2005    -20    -47
2006    -15    -30

Although I didn’t think it possible, SFR now hates Manny almost as much as UZR.

Table 7: Ken Griffey
Year    SFR    UZR
2003     -8    -13
2004    -16    -25
2005    -34    -41
2006    -25    -34

It’s too bad we don’t have numbers on Griffey in his prime, but the older version finishes dead last in both seasons before his merciful shift to right field in 2007 (where he still fared poorly but didn’t cost his team as much).

Table 8: Carl Crawford
Year    SFR    UZR
2003     +3     +6
2004    +23    +27
2005    +23    +11
2006     +4     +0

Crawford shows a peak of sorts in 2004 and 2005 and a decline in 2006 and 2007, but overall the totals are very close.

Table 9: Grady Sizemore
Year    SFR    UZR
2004     +9     +9
2005    +24    +14
2006    +23    +22

Sizemore is now consistently rated among the best center fielders in SFR, including his +17 finish in 2007; UZR agrees.
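For anyone who wants to check this kind of agreement themselves, here is a minimal sketch of the correlation coefficient, computed by hand on the 15 player-seasons printed in Tables 6 through 9 (first column against the UZR column). These four players were picked as examples of agreement, so the resulting r will run high and is not comparable to the figures in Table 3. The function name is just for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# (first column, UZR) pairs copied from Tables 6 through 9.
PAIRS = [
    (-2, -9), (-3, -11), (-20, -47), (-15, -30),   # Manny Ramirez, 2003-2006
    (-8, -13), (-16, -25), (-34, -41), (-25, -34), # Ken Griffey, 2003-2006
    (3, 6), (23, 27), (23, 11), (4, 0),            # Carl Crawford, 2003-2006
    (9, 9), (24, 14), (23, 22),                    # Grady Sizemore, 2004-2006
]

sfr_values = [p[0] for p in PAIRS]
uzr_values = [p[1] for p in PAIRS]
r = pearson_r(sfr_values, uzr_values)
print(f"r = {r:.2f} on {len(PAIRS)} hand-picked player-seasons")
```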

To finish up today let’s take a look at the top and bottom 25 outfielders from 2003 through 2007 in terms of SFR by position. We haven’t yet calculated a rate statistic, but could easily do so using the number of balls fielded.

Table 10: Top and Bottom 25 Outfielders for 2003-2007 by Position
Player               Pos       Balls Runners      TB  DRunners     DTB     SFR
Grady Sizemore      Center      2497    1213    1461        68     141      73
Jose Cruz Jr.       Right       1613     762     971        68     136      69
Carl Crawford       Left        3160    1638    2154        65     115      63
Carlos Beltran      Center      3531    1638    2001        44     145      61
Ichiro Suzuki       Right       2588    1242    1606        56      92      53
Andruw Jones        Center      3718    1800    2265        62      95      52
Garret Anderson     Left        1760     899    1134        37      90      42
Austin Kearns       Right       2244    1071    1416        49      60      40
Covelli Crisp       Center      1929     924    1161        47      73      39
Vernon Wells        Center      3455    1743    2167        29      93      38
Covelli Crisp       Left         928     465     607        37      65      37
Matt Holliday       Left        2276    1289    1726        41      52      35
Nook Logan          Center      1291     585     719        31      68      34
Mike Cameron        Center      2961    1376    1689        40      73      33
Reed Johnson        Left        1138     597     750        21      70      32
Alfonso Soriano     Left        1221     650     846        22      61      29
David DeJesus       Center      2078     990    1241        33      56      29
Alexis Rios         Right       1945    1021    1295        11      75      29
Eric Byrnes         Left        1376     698     923        33      43      28
Vladimir Guerrero   Right       2475    1244    1631        29      52      27
Ryan Langerhans     Left         617     322     419        28      38      24
Jeff Francoeur      Right       1669     894    1180        27      33      23
Reggie Sanders      Right        946     462     574        12      56      22
J.D. Drew           Right       1957    1000    1303        24      41      21
Laynce Nix          Center       995     516     605         9      47      21
------------------------------------------------------------------------------
Matt Stairs         Right        468     268     375       -16     -40     -21
Juan Gonzalez       Right        373     222     295       -23     -34     -22
Chipper Jones       Left         631     394     522       -20     -35     -23
Marlon Byrd         Center      1558     809    1039       -21     -34     -24
Michael Tucker      Right        848     461     638       -18     -45     -24
Reed Johnson        Right        690     385     526       -20     -49     -25
Craig Biggio        Center       985     525     685       -21     -46     -28
Shawn Green         Right       2117    1120    1522       -19     -62     -28
Ryan Klesko         Left         767     426     595       -24     -61     -28
Carlos Lee          Left        3126    1741    2334       -28     -49     -29
Mark Kotsay         Center      2776    1383    1773       -18     -62     -29
Moises Alou         Left        1692     961    1307       -23     -58     -30
Adam Dunn           Left        2863    1638    2243       -17     -69     -30
Aubrey Huff         Right        893     458     623       -28     -64     -31
Luis Gonzalez       Left        2662    1533    2018       -41     -44     -33
Bernie Williams     Center      1628     853    1080       -29     -64     -35
Bobby Abreu         Right       3122    1635    2179       -24     -70     -35
Marquis Grissom     Center      1514     763     990       -27     -64     -35
Hideki Matsui       Left        2367    1338    1748       -38     -68     -39
Brian Giles         Right       2441    1310    1788       -28     -81     -39
Pat Burrell         Left        2542    1473    2003       -28     -86     -41
Juan Encarnacion    Right       2415    1278    1735       -35     -98     -48
Manny Ramirez       Left        2541    1537    2099       -50     -90     -51
Raul Ibanez         Left        2506    1417    1906       -35    -110     -52
Ken Griffey Jr.     Center      1653     876    1153       -84    -157     -83
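The rate statistic mentioned before the table could look something like the sketch below, which scales SFR to a fixed number of balls fielded. The 500-ball denominator and the function name are arbitrary choices made for illustration, not something defined in this column, and the inputs are the rounded SFR totals printed in Table 10.

```python
def sfr_per_balls(sfr, balls, per=500):
    """Scale an SFR total to a common number of balls fielded."""
    if balls <= 0:
        raise ValueError("balls must be positive")
    return sfr * per / balls

# (player, position, balls, SFR) rows copied from Table 10.
ROWS = [
    ("Grady Sizemore", "Center", 2497, 73),
    ("Jose Cruz Jr.", "Right", 1613, 69),
    ("Ryan Langerhans", "Left", 617, 24),
    ("Ken Griffey Jr.", "Center", 1653, -83),
]

rates = {
    (player, pos): sfr_per_balls(sfr, balls)
    for player, pos, balls, sfr in ROWS
}

# Sort from best to worst rate and print one line per player.
for (player, pos), rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{player:<18}{pos:<8}{rate:6.1f} runs per 500 balls")
```

On a rate basis a part-timer with fewer chances can outrank a full-time player with a larger raw total, which is the main reason to look at both.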

Baby Steps

I believe this second beta version of SFR makes some nice improvements over the previous version, with the added benefit that it is simpler from a conceptual and code perspective. As noted in the case of right fielders, though, there are still issues to be explored, but I’m sure that with the “wisdom of crowds” at our back here at BP, we’ll continue to make progress.


Dan Fox
