
Baseball Prospectus home
  
  

June 10, 2009

Checking the Numbers

Binomial Feliz

by Eric Seidman


Despite playing alongside Barry Bonds for several seasons, Pedro Feliz never learned to discipline his bat, entering this season with an abysmal .292 career on-base percentage. Given his antipathy toward taking free passes, it stands to reason that what transpired in a May 12 matchup between the Phillies and Dodgers could induce double-takes from even the most seasoned baseball people. In the bottom of the third, Clayton Kershaw issued a four-pitch walk to Feliz to begin the frame. The very next inning, with nobody out and a runner on second, Pedro held up on a 3-2 offering and earned his second straight base on balls. If Feliz had stopped here, his two-walk performance would still have been a relatively monumental feat for him, as he had batted in 983 games from 2001-08, and walked at least twice in a game on just 14 different occasions. Then, in the bottom of the sixth, with a runner on first, nobody out, and James McDonald in from the bullpen, Feliz took four pitches out of the zone, and trotted down to first for a third consecutive time. An inning later, Feliz stepped up with the bases loaded, and after Jayson Werth opened up a spot on the basepaths with a steal of home plate, Ronald Belisario threw two more balls, giving Pedro his fourth walk of the game.

Now, it's perhaps needless to note that Feliz has never before walked four times in one game, let alone in consecutive plate appearances. In fact, Pedro's entire season to date has been an outlier, as his OBP stood at .357 through 52 games at the time that I began researching this piece. I was curious about just how unlikely these achievements were, so I decided to calculate the probabilities that a player as stubbornly resistant when it comes to taking free passes as Feliz has been would walk four times in as many trips to the plate in a single game, and that a hitter with a career .292 OBP would produce a .357 rate through roughly one-third of the season.

To determine the probability of his four-walk game, Feliz's walk rate must first be isolated from the overall on-base percentage. From 2004-08, when Feliz played in a full-time capacity, he walked just 5.5 percent of the time, among the lowest marks for any regular player during that span. Walking is a yes-or-no proposition, in that a player either walks or he doesn't in a given plate appearance, which makes each plate appearance a Bernoulli trial. The binomial distribution, built from a series of such trials, gives the probability of a set number of successes in a fixed number of independent opportunities, based on prior knowledge of the subject's rate of success in the area being tested. For Feliz, we would be attempting to calculate the likelihood of exactly four walks in four plate appearances given a 5.5 percent success frequency.

In Excel, the formula would be BINOMDIST(4,4,.055,FALSE), where the FALSE signifies usage of the probability mass function as opposed to the cumulative distribution function; the former refers to exactly four walks, while the latter covers four walks at most. One stroke of the "enter" key later, and the answer of 0.0000092 indicates that Feliz has a 0.00092 percent shot of accomplishing this feat in a specific game. By the same logic that drives the Birthday Paradox (the idea that, as a group of people increases in size, the likelihood that any two share a birthday grows stronger), it is much more likely that Feliz would perform his four-walk extravaganza at some point over the course of a season than within any one specific game. Now, one issue emerges with this type of test: the probability assumes that Feliz has the same chance of walking in each plate appearance, and that each walk outcome is independent of the others, much like a fair coin always has a 50/50 shot of landing on tails. For all we know, Feliz may have been on a walking spree leading up to this game, or he could have made adjustments in facing Kershaw that second time through the order. Either of these could reasonably suggest that his probability of walking grew with each plate appearance. Still, the binomial calculation does a solid job of accomplishing our goal with this study, and for all we know, those four walks could have been independent of one another.
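For readers without Excel, the same calculation can be sketched in Python. This is only an illustration of the arithmetic described above; the helper names `binom_pmf` and `binom_cdf` are made up for this example and stand in for BINOMDIST with FALSE and TRUE respectively.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials (BINOMDIST ... FALSE)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials (BINOMDIST ... TRUE)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

WALK_RATE = 0.055  # Feliz's 2004-08 walk rate, as quoted in the article

four_for_four = binom_pmf(4, 4, WALK_RATE)
print(f"{four_for_four:.7f}")        # 0.0000092
print(f"{four_for_four * 100:.5f}%")  # 0.00092%
```

With four successes in four trials the formula collapses to the walk rate raised to the fourth power, which is why the result is so small.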

To turn the 0.00092 percent probability in a specific game into the likelihood of its occurrence over a 150-game span, the probability of the event not occurring must first be calculated: 1-0.0000092 = 0.9999908. Raise the new number to the power of 150 to arrive at 0.9986, and subtract from 1.0 to end up at 0.0014, or 0.14 percent. Essentially, based on our knowledge of Feliz's success rate when it comes to bases on balls, he had a minuscule 0.14 percent shot of walking exactly four times in four straight plate appearances in a game at some point during a 150-game campaign. If the same steps are taken but the formula is modified to calculate the probability of exactly two successful events out of four opportunities in any of 150 games over the course of a season, in an area with a success rate of 0.0025, we arrive at a probability of 0.56 percent, roughly four times as likely as Feliz's feat. This modification isn't arbitrary either, as I chose it for comparative purposes to illustrate just how unlikely it was for Feliz to accomplish what he did earlier this season.
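The single-game-to-season conversion described above can be sketched as follows. The function name `prob_at_least_once` is hypothetical, and the sketch keeps the article's simplifying assumptions: exactly four plate appearances in every game and a constant, independent 5.5 percent walk rate.

```python
def prob_at_least_once(p_single_game, games):
    """Chance an event with per-game probability p_single_game happens in at least one of `games` games."""
    p_never = (1 - p_single_game) ** games  # it fails to happen in every single game
    return 1 - p_never

p_game = 0.055 ** 4                        # four walks in four plate appearances
p_season = prob_at_least_once(p_game, 150)

print(f"{p_season:.4f}")  # 0.0014
print(f"{p_season:.2%}")  # 0.14%
```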

The 0.0025 rate of success refers to the quotient derived from dividing Juan Pierre's 13 career home runs by his 5,153 career at-bats entering this season. Juan Pierre is roughly four times as likely to hit precisely two home runs in the four at-bats per game that he has averaged as Feliz is to walk four times in as many plate appearances. Who'd have thought that?
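As a rough check on the Pierre comparison, here is a short sketch that runs both players through the same two steps. The helper names are invented for this example, and it uses the article's rounded 0.0025 home run rate and its assumption of four opportunities per game over 150 games.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def season_prob(p_game, games=150):
    """Chance the per-game event occurs at least once over the season."""
    return 1 - (1 - p_game) ** games

feliz_season = season_prob(binom_pmf(4, 4, 0.055))    # four walks in four PAs
pierre_season = season_prob(binom_pmf(2, 4, 0.0025))  # exactly two homers in four ABs

print(f"Feliz:  {feliz_season:.2%}")   # 0.14%
print(f"Pierre: {pierre_season:.2%}")  # 0.56%
print(f"Ratio:  {pierre_season / feliz_season:.2f}")
```

The ratio of the unrounded probabilities comes out slightly above four, so "four times" is best read as an approximation.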

The next step, involving the probability of an observed .357 OBP through 52 games for a player with a career .292 mark, is determinable in a couple of different ways. If a normal distribution is assumed, 68 percent of the data within a set will fall within one standard deviation (SD) of the mean; 95 percent of the data will fall within two SDs of the mean; and 99.7 percent of the data will fall within three SDs of the mean. Baseball data cannot always be categorized as normally distributed, so this method has the potential to produce inaccurate results, but because we'll go over the other method afterwards, comparing the results can prove interesting. To find the aforementioned probability operating under the normal distribution assumption, Feliz's average and SD must be known. The average has already been established at .292, and the SD can be found through the formula SQRT(P*Q/N), where P is the percentage of success, Q is 1-P (the percentage of failure), and N, in this case, stands for the number of plate appearances.

Since the goal here involves the probability of such a high on-base percentage over a 52-game stretch, I first tallied the statistics for Feliz in every career stretch of such length. This method has been discussed in this space before, and is equivalent to utilizing the game-logs summing tool at Baseball Reference. The 52-game stretches did not extend into subsequent seasons, however, and consisted of around 205 plate appearances on average since Feliz became an everyday player. Plugging in the numbers P=0.292, Q=0.708, and N=205, the expected standard deviation turns out to be 32 points. Again, assuming a normal distribution, we could then expect that a little over two-thirds of the 490 different 52-game stretches for Feliz from 2001-08 would feature OBPs ranging from .260 to .324. About 95 percent of these stretches would consist of OBPs ranging from .228 to .356. With five percent of the dataset out of the two-SD range, roughly 2.5 percent would fall below .228, and 2.5 percent would soar past .356. Under a normal distribution and given his history, Feliz would have a 2.5 percent probability of exceeding a .356 OBP in a 52-game stretch as he has done this season.
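The standard deviation and the one- and two-SD ranges can be reproduced with a few lines of Python. This is a sketch under the article's normal-distribution assumption; `normal_upper_tail` is a hypothetical helper built on the standard library's `math.erfc`, and it computes the tail beyond .357 directly instead of relying on the 95 percent rule of thumb.

```python
from math import sqrt, erfc

p, n = 0.292, 205            # career OBP and average PAs per 52-game stretch
sd = sqrt(p * (1 - p) / n)   # SQRT(P*Q/N)

one_sd = (p - sd, p + sd)
two_sd = (p - 2 * sd, p + 2 * sd)

def normal_upper_tail(x, mean, sd):
    """P(X > x) for a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return 0.5 * erfc(z / sqrt(2))

tail_357 = normal_upper_tail(0.357, p, sd)

print(f"SD: {sd:.3f}")                                   # 0.032
print(f"1 SD: {one_sd[0]:.3f} to {one_sd[1]:.3f}")       # 0.260 to 0.324
print(f"2 SD: {two_sd[0]:.3f} to {two_sd[1]:.3f}")       # 0.228 to 0.356
print(f"P(OBP > .357): {tail_357:.1%}")
```

Because .357 sits just past the two-SD mark, the directly computed tail lands a little under the 2.5 percent that the rule of thumb suggests.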

The data may not necessarily be normally distributed, however, making the binomial distribution the more accurate measure. Through 205 PAs, a .357 OBP could be produced with 73 on-base events. The probability of observing at least 73 successes in 205 opportunities, given a .292 success rate, is equal to one minus the probability of at most 72 successes. In Excel, this would be: 1-BINOMDIST(72,205,0.292,TRUE). The probability checks out at 2.4 percent, basically reporting that Feliz should observe a .357 OBP or higher over 52 games about once out of every 40 such stretches. Interestingly enough, over his eight-year career, despite the expected SD of 32 points, Pedro has been consistently poor in reaching base, actually producing a 22-point deviation. Out of the 490 different stretches, Feliz has ranged from .234 to .345, never falling more than 1.87 SDs below his career mark or rising more than 1.66 SDs above it.
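The Excel expression 1-BINOMDIST(72,205,0.292,TRUE) can be mirrored in Python with an exact sum over the binomial probabilities. The function names here are illustrative only, and the sketch inherits the same assumption of 205 independent plate appearances at a fixed .292 rate.

```python
from math import comb

def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials (BINOMDIST ... TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def binom_upper_tail(k, n, p):
    """Probability of at least k successes: one minus P(at most k - 1)."""
    return 1 - binom_cdf(k - 1, n, p)

p_at_least_73 = binom_upper_tail(73, 205, 0.292)

print(f"P(at least 73 times on base in 205 PA): {p_at_least_73:.1%}")
print(f"About once every {1 / p_at_least_73:.0f} stretches")
```

The article reports this figure as 2.4 percent, or about one stretch in 40.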

Next week, I plan on delving deeper into historical OBP spikes akin to what Feliz has achieved so far this season. The method is very similar to our earlier look at Cliff Lee and his sharp uptick in ground balls last season. Can hitters sustain their shiny new OBP rates? Or are vast increases in reaching base even rarer than unexpected growths in ground-ball frequency? For now, the goal was simply to investigate how extremely unlikely it was for someone with Feliz's skill set to defy the odds by this magnitude, which serves as a solid preface to our eventual inquiry into whether these outlying on-base percentages are flukes or a sign of improved future production lurking around the corner.

Special Thanks to Tom Tango, Ben Baumer, and Heiko Todt for keeping me sane throughout my research.

Eric Seidman is an author of Baseball Prospectus. 