Glossary: PITCHf/x

Breaktotunnelratio

Break:Tunnel Ratio - This stat shows us the ratio of post-tunnel break to the differential of pitches at the Tunnel Point. The idea here is that having a large ratio between pitches means that the pitches are either tightly clustered at the hitter's decision-making point or the pitches are separating a lot after the hitter has selected a location to swing at. Either way a pitcher's ratio can be large.
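As a rough illustration of the "either way" point, the sketch below computes the ratio for two made-up pitch pairs. The function name, the inch units, and the sample values are assumptions for illustration, not Baseball Prospectus code or data.

```python
def break_tunnel_ratio(post_tunnel_break: float, tunnel_differential: float) -> float:
    """Ratio of post-tunnel break to tunnel differential (both in the same length unit)."""
    if tunnel_differential <= 0:
        raise ValueError("tunnel differential must be positive")
    return post_tunnel_break / tunnel_differential


# Hypothetical pair A: modest late break, but the pitches are tightly clustered at the Tunnel Point.
tight_tunnel = break_tunnel_ratio(post_tunnel_break=3.0, tunnel_differential=0.5)

# Hypothetical pair B: looser clustering at the Tunnel Point, but much more late break.
late_break = break_tunnel_ratio(post_tunnel_break=12.0, tunnel_differential=2.0)

print(tight_tunnel, late_break)
```

Both hypothetical pairs produce the same ratio, which is why the ratio alone does not say which of the two effects is responsible.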

CS Prob

This stat tells us the likelihood that a particular pitch will be called a strike based on a variety of factors. CS Prob is calculated on every pitch thrown by a pitcher. CS Prob is a proxy for control, or the ability of a pitcher to throw strikes.

DRA 2016

(Start here for three articles describing the new 2016 version of DRA: DRA 2016)

Where we once had one linear mixed model, we now have 24. Five of them are for hits (home runs, triples, doubles, infield singles, and outfield singles). Four are for not-in-play events (unintentional walks, intentional walks, hit-batsmen, and strikeouts). For single-out plays, we separately modeled putouts at each position, which makes for 9 more models. And finally, we modeled double-plays that began with 1 of the 6 infielders.

But how do you find the best predictors for each model? Last year we used the Akaike Information Criterion (AIC) and likelihood-ratio tests to find the best combinations. These are time-honored methods for evaluating mixed models, but they also test only in-sample data and make their own assumptions about how the data might be organized.

This year we decided to step up our game in two ways. First, while we still kept an eye on AIC and likelihood-ratio tests, we also moved to using 10-fold cross-validation, employing a random sample of half of the 2015 season as our testing ground. To test out of sample, we moved to scoring each model by the area under its Receiver Operating Characteristic (ROC) curve, commonly abbreviated in machine learning as AUC. Pioneered by World War II radar operators, and then extended to statistics and other fields, ROC analysis measures a model by how well it trades off true positives against false positives. A worthless score is .5, indicating that the outcomes of your model are essentially a coin toss. A perfect score is 1. Given the amount of random variation in baseball, our goal was to reach at least .6 for each modeled event, and some models exceeded that threshold by a substantial amount.
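To make the .5 and 1 endpoints concrete, here is a minimal sketch of AUC computed from its pairwise definition: the chance that a randomly chosen positive case gets a higher predicted probability than a randomly chosen negative case, with ties counted as half. The function name and the toy labels and scores are assumptions for illustration; this is not the DRA evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve, via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie is worth half a win
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = the event happened (say, a strikeout), 0 = it did not.
perfect = auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])        # every positive outranks every negative
coin_toss = auc([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])      # model cannot tell the cases apart
in_between = auc([1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.3, 0.2])

print(perfect, coin_toss, in_between)
```

The pairwise loop is quadratic in the number of observations, so it only suits small samples; library implementations sort the scores instead.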

(See these articles from 2015 for a history of DRA: http://bbp.cx/a/26195 and In Depth article)

DRA_MINUS

DRA-Minus ("DRA–") As noted above, we've received multiple requests for a "minus" version of DRA, something that rates pitchers by how well they compared to their peers rather than by an amount of predicted runs allowed in a given season. Knowledgeable baseball fans are familiar with statistics like this. Common examples include wRC+ and ERA-. The idea is to put an average player for each season at 100, and then rate players by how much they vary from the average. By rating every pitcher by how good (or poor) he was by comparison to his peers, we can make fairer comparisons across different seasons and different eras. These comparisons aren't perfect: We can't make baseball 50 years ago more diverse or force today's players to endure the conditions of 50 years ago, but metrics like DRA– allow comparisons of pitchers across seasons and eras to be much more meaningful.

Unlike cFIP (which measures true talent), DRA– (which measures true talent plus luck) will not have a forced standard deviation. The two numbers (which are otherwise both scaled to 100) can still be compared, but be mindful of that distinction. For both cFIP and DRA–, lower is better.
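The entry does not spell out the scaling formula. The sketch below assumes the simplest form used by similar "minus" statistics: 100 times the pitcher's figure divided by the league average for the same season, with no park or other adjustments. The function name and sample numbers are hypothetical.

```python
def minus_scale(value: float, league_average: float) -> float:
    """Scale a runs-allowed rate so the league-average pitcher is 100 (lower is better).

    Assumed form: 100 * value / league_average. The real DRA- calculation may differ.
    """
    if league_average <= 0:
        raise ValueError("league average must be positive")
    return 100.0 * value / league_average


# Hypothetical season in which the league-average DRA is 4.00.
average_pitcher = minus_scale(4.00, league_average=4.00)  # exactly average
good_pitcher = minus_scale(3.00, league_average=4.00)     # 25 percent better than average

print(average_pitcher, good_pitcher)
```

Under this assumed form, the same raw figure maps to different scaled values in different seasons, which is the property that allows comparisons across eras.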

See: http://www.baseballprospectus.com/article.php?articleid=26613

DRA_REP_RA

DRA_RUNS_SAVED

Diffatplate

Plate Differential - This statistic shows how far apart back-to-back pitches end up at home plate, roughly where the batter would contact the ball. This includes differentiation generated by pitch break and trajectory of the ball (which includes factors like gravity, arm angle at release, etc.).

Diffatrelease

Release Differential - When analyzing pitchers, we often talk about consistency in their release point, pointing to scatter plots to see if things look effectively bunched or not. This stat measures the average variation between back-to-back pitches at release.

Diffattunnel

Tunnel Differential - This statistic tells you how far apart two pitches are at the Tunnel Point—the point during their flight when the hitter must make a decision about whether to swing or not (roughly 175 milliseconds before contact).
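The plate, release, and tunnel differentials all compare back-to-back pitches at one point along their flight. A minimal sketch, assuming each pitch's position at that point is available as a two-dimensional (horizontal, vertical) coordinate and that "how far apart" means straight-line distance averaged over consecutive pairs; the function name and sample coordinates are made up.

```python
import math


def mean_pair_differential(positions):
    """Average straight-line distance between each pitch and the pitch thrown right after it.

    `positions` holds one (horizontal, vertical) location per pitch, in pitch order,
    all measured at the same point in flight (release, Tunnel Point, or home plate).
    """
    if len(positions) < 2:
        raise ValueError("need at least two pitches to form a pair")
    gaps = [math.dist(a, b) for a, b in zip(positions, positions[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical locations (in inches) of three consecutive pitches at the Tunnel Point.
tunnel_positions = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
tunnel_differential = mean_pair_differential(tunnel_positions)  # pairs are 5 and 0 inches apart

print(tunnel_differential)
```

Feeding the same function locations measured at release or at home plate would give the corresponding release or plate differential under the same assumptions.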

EPAA_PERCENT

(from http://bbp.cx/a/26195)

Under baseball’s scoring rules, a wild pitch is assigned when a pitcher throws a pitch that is deemed too difficult for a catcher to control with ordinary effort, thereby allowing a baserunner (including a batter, on a third strike) to advance a base. A passed ball is assigned when a pitcher throws a pitch that a catcher ought to have controlled with ordinary effort, but which nonetheless gets away, also allowing a baserunner to move up a base. The difference between a wild pitch and a passed ball, like that of the “earned” run, is at the discretion of the official scorer. Because there can be inconsistency in applying these categories, we prefer to consider them together.

Last year, Dan Brooks and Harry Pavlidis introduced a regressed probabilistic model that combined Harry’s pitch classifications from PitchInfo with a With or Without You (WOWY) approach. RPM-WOWY measured pitchers and catchers on the number and quality of passed balls or wild pitches (PBWP) experienced while they were involved in the game.

Not surprisingly, we have updated this approach to a mixed model as well. Unfortunately, Passed Balls or Wild Pitches Above Average would be quite a mouthful. Again, we’re trying out a new term to see if it is easier to communicate these concepts. We’re going to call these events Errant Pitches. The statistic that compares pitchers and catchers in these events is called Errant Pitches Above Average, or EPAA.

Unfortunately, the mixed model only works for us from 2008 forward, which is when PITCHf/x data became available. Before that time, going back to 1988, which is when pitch counts were first tracked officially, we will rely solely on WOWY to measure PBWP. For the time being, we won’t calculate EPAA before 1988 at all, and it will not play a role in calculating pitcher DRA for those seasons.

But, from 2008 through 2014, and going forward, here are the factors that EPAA considers:

• The identity of the pitcher;
• The identity of the catcher;
• The likelihood of the pitch being an Errant Pitch, based on location and type of pitch, courtesy of PitchInfo classifications.

Errant Pitches, as you can see, has a much smaller list of relevant factors than our other statistics.

In 2014, the pitchers with the best (most negative) EPAA scores were:

Name                 Errant Pitches Above Average (EPAA)
Carlos Carrasco      -0.405%
Ronald Belisario     -0.403%
Jesse Chavez         -0.392%
Clay Buchholz        -0.380%
Felix Doubront       -0.378%
Daisuke Matsuzaka    -0.375%

And the pitchers our model said were most likely to generate a troublesome pitch were:

Name                 Errant Pitches Above Average (EPAA)
Masahiro Tanaka      +0.611%
Jon Lester           +0.541%
Matt Garza           +0.042%
Dallas Keuchel       +0.334%
Drew Hutchison       +0.327%
Trevor Cahill        +0.317%

Flighttimediff

Speed Changes - This is the average difference in flight time, in seconds, between back-to-back pitches.

MPH_95_TILE

95th percentile of velocity among "hard" pitches thrown. These include 4-seam, 2-seam, and cut fastballs ("FA","SI","FC" from Pitch Info, as found on player cards and at Brooks Baseball site).
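A minimal sketch of the calculation: keep only the "hard" pitch types named above, then take the 95th percentile of their velocities. The nearest-rank percentile method, the function name, and the sample pitches are assumptions; the entry does not say which percentile definition is used.

```python
HARD_PITCH_TYPES = {"FA", "SI", "FC"}  # 4-seam, 2-seam (sinker), and cut fastballs


def mph_95_tile(pitches):
    """95th-percentile velocity among hard pitches, using the nearest-rank method.

    `pitches` is an iterable of (pitch_type, velocity_mph) pairs.
    """
    speeds = sorted(mph for pitch_type, mph in pitches if pitch_type in HARD_PITCH_TYPES)
    if not speeds:
        raise ValueError("no hard pitches in the sample")
    # Nearest rank: smallest rank r with r / n >= 0.95, in integer arithmetic.
    rank = (95 * len(speeds) + 99) // 100
    return speeds[rank - 1]


# Hypothetical sample: 20 fastballs from 90.0 to 99.5 mph plus two off-speed pitches.
sample = [("FA", 90.0 + 0.5 * i) for i in range(20)]
sample += [("CU", 78.0), ("CH", 84.0)]  # ignored: not hard pitches

print(mph_95_tile(sample))
```

Interpolating percentile methods would give a slightly different figure on small samples, so the choice matters mainly for pitchers with few tracked pitches.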

PITCHPAIRS

This is the number of sequential pitches in the sample for the given selection.

PI_PITCH_TYPE1

These are the selected pitches for drilling down on a specific sequence.

PI_PITCH_TYPE2

These are the selected pitches for drilling down on a specific sequence.

Pitcher CSAA

This stat details the additional called strikes outside the reference zone that are credited to the pitcher after accounting for catcher, umpire, pitch type, etc. This stat is calculated on all called pitches (i.e., balls not in play). Pitcher CSAA is a proxy for command, or the pitcher's ability to locate the ball precisely.

Posttunnelbreak

Break Differential - This stat tells us how much spin-induced movement is generated on each pitch between the Tunnel Point and home plate. Think of this like PITCHf/x pitch movement, except that it is only tracking the time between the Tunnel Point and home plate.

Releasetotunnelratio

Release:Tunnel Ratio - This stat shows us the ratio of a pitcher's release differential to their tunnel differential. Pitchers with smaller Release:Tunnel Ratios have smaller differentiation between pitches through the tunnel point, making it more difficult for opposing hitters to distinguish them in theory.

SRAA

TRAA_PERCENT

(from http://bbp.cx/a/26195)

Our hypothesis is that base-stealing attempts are connected with the pitcher’s ability to hold runners. When baserunners are not afraid of a pitcher, they will take more steps off the bag. Baserunners who are further off the bag are more likely to beat a force out, more likely to break up a double play if they can’t beat a force out, and more likely to take the extra base if the batter gets a hit.

Takeoff Rate stats consider the following factors:

• The inning in which the base-stealing attempt was made;
• The run difference between the two teams at the time;
• The stadium where the game takes place;
• The underlying quality of the pitcher, as measured by Jonathan Judge’s cFIP statistic;
• The SRAA of the lead runner;
• The number of runners on base;
• The number of outs in the inning;
• The pitcher involved;
• The batter involved;
• The catcher involved;
• The identity of the hitter on deck;
• Whether the pitcher started the game or is a reliever.

Takeoff Rate Above Average is also scaled to zero, and negative numbers are once again better for the pitcher than positive numbers. By TRAA, here were the pitchers who worried baserunners the most in 2014.

Name                 Takeoff Rate Above Average (TRAA)
Bartolo Colon        -6.09%
Lance Lynn           -5.91%
Hyun-jin Ryu         -5.82%
Adam Wainwright      -5.75%
T.J. McFarland       -5.17%
Nathan Eovaldi       -5.17%

And here were the pitchers who emboldened baserunners in 2014:

Name                 Takeoff Rate Above Average (TRAA)
Joe Nathan           9.60%
Tim Lincecum         9.41%
Drew Smyly           8.80%
Tyson Ross           8.08%
A.J. Burnett         7.61%
Juan Oviedo          7.55%

ZONE_RT

Zone Rate is calculated using PITCHf/x data and shows the percentage of pitches seen (by hitters) or thrown (by pitchers) that are in the rule-book strike zone.

Hitter Examples (2012):

Very few: Pablo Sandoval, 0.4005
Few: Kirk Nieuwenhuis, 0.4833
Around average: Andres Torres, 0.5054
Many: Bobby Abreu, 0.5244
Very many: Chone Figgins, 0.5787

Pitcher Examples (2012):

Very few: Jared Hughes, 0.33554
Few: Jared Burton, 0.4775
Around average: Jeremy Accardo, 0.4879
Many: Joe Blanton, 0.5217
Very many: Jake McGee, 0.5897
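A minimal sketch of how a Zone Rate like the ones above could be computed from pitch locations. It assumes each pitch has a horizontal location and height at the plate in feet, plus the top and bottom of that batter's zone, and it tests the center of the ball against a 17-inch-wide plate. The class, field names, and sample pitches are illustrative assumptions; the production calculation may treat the ball's radius or the zone edges differently.

```python
from typing import NamedTuple

HALF_PLATE_FT = 17.0 / 12.0 / 2.0  # home plate is 17 inches wide


class Pitch(NamedTuple):
    px: float      # horizontal location at the plate, feet from the center of the plate
    pz: float      # height at the plate, feet
    sz_bot: float  # bottom of the batter's strike zone, feet
    sz_top: float  # top of the batter's strike zone, feet


def in_zone(pitch: Pitch) -> bool:
    """True if the center of the ball crosses the plate inside the zone."""
    return abs(pitch.px) <= HALF_PLATE_FT and pitch.sz_bot <= pitch.pz <= pitch.sz_top


def zone_rate(pitches) -> float:
    """Fraction of pitches inside the zone, on the same 0-to-1 scale as the examples."""
    pitches = list(pitches)
    if not pitches:
        raise ValueError("no pitches in the sample")
    return sum(1 for p in pitches if in_zone(p)) / len(pitches)


# Four hypothetical pitches: two in the zone, one too wide, one too low.
sample_pitches = [
    Pitch(px=0.0, pz=2.5, sz_bot=1.5, sz_top=3.5),
    Pitch(px=0.9, pz=2.5, sz_bot=1.5, sz_top=3.5),
    Pitch(px=-0.5, pz=3.0, sz_bot=1.6, sz_top=3.4),
    Pitch(px=0.2, pz=1.0, sz_bot=1.5, sz_top=3.5),
]

print(zone_rate(sample_pitches))
```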