June 10, 2009 Checking the Numbers: Binomial Feliz
Despite playing alongside Barry Bonds for several seasons, Pedro Feliz never learned to discipline his bat, entering this season with an abysmal .292 career on-base percentage. Given his antipathy toward taking free passes, it stands to reason that what transpired in a May 12 matchup between the Phillies and Dodgers could induce double-takes from even the most seasoned baseball people. In the bottom of the third, Clayton Kershaw issued a four-pitch walk to Feliz to begin the frame. The very next inning, with nobody out and a runner on second, Pedro held up on a 3-2 offering and earned his second straight base on balls. If Feliz had stopped here, his two-walk performance would still have been a relatively monumental feat for him, as he had batted in 983 games from 2001-08, and walked at least twice in a game on just 14 different occasions.
Then, in the bottom of the sixth, with a runner on first, nobody out, and James McDonald in from the bullpen, Feliz took four pitches out of the zone, and trotted down to first for a third consecutive time. An inning later, Feliz stepped up with the bases loaded, and after Jayson Werth opened up a spot on the basepaths with a steal of home plate, Ronald Belisario threw two more balls, giving Pedro his fourth walk of the game.
Now, it's perhaps needless to note that Feliz has never before walked four times in one game, let alone in consecutive plate appearances. In fact, Pedro's entire season to date has been an outlier, as his OBP stood at .357 through 52 games at the time that I began researching this piece. I was curious about just how unlikely these achievements were, so I decided to calculate the probabilities that a player as stubbornly resistant when it comes to taking free passes as Feliz has been would walk four times in as many trips to the plate in a single game, and that a hitter with a career .292 OBP would produce a .357 rate through roughly one-third of the season.
To determine the probability of his four-walk game, Feliz's walk rate must first be isolated from the overall on-base percentage. From 2004-08, when Feliz played in a full-time capacity, he walked just 5.5 percent of the time, among the lowest marks for any regular player during that span. Walking is a yes-or-no proposition, in that a player either walks or he doesn't in a given plate appearance, which makes each plate appearance a natural Bernoulli trial. A run of independent Bernoulli trials follows the binomial distribution, which gives the probability of a given number of successes within a fixed number of opportunities, based on prior knowledge of the subject's rate of success in the area being tested. For Feliz, we would be attempting to calculate the likelihood of exactly four walks in four plate appearances given a 5.5 percent success frequency. In Excel, the formula would be BINOMDIST(4,4,.055,FALSE), where the FALSE signifies usage of the probability mass function as opposed to the cumulative distribution function; the former refers to exactly four walks, while the latter covers four walks at most. One stroke of the "enter" key later, and the answer of 0.0000092 indicates that Feliz has a 0.00092 percent shot of accomplishing this feat in a specific game.
As with the Birthday Paradox (the idea that, as a group of people increases in size, the likelihood that any two share a birthday grows stronger), it is much more likely that Feliz would perform his four-walk extravaganza at some point over the course of a season as opposed to within any one specific game. Now, one issue emerges with this type of test in that the probability assumes that Feliz has the same chance of walking in each plate appearance, and that each walk outcome is independent of the others, much like a fair coin always has a 50/50 shot of landing on its tail. For all we know, Feliz may have been on a walking spree leading up to this game, or he could have made adjustments in facing Kershaw that second time through the order. Both of these events could reasonably suggest that his probability of walking grew with each plate appearance. Still, the binomial model does a solid job of accomplishing our goal with this study, and for all we know, those four walks could have been independent of one another.
To turn the 0.00092 percent probability in a specific game into the likelihood of its occurrence over a 150-game span, the probability of the event not occurring must first be calculated: 1 - 0.0000092 = 0.9999908. Raise the new number to the power of 150 to arrive at 0.9986, and subtract from 1.0 to end up at 0.0014, or 0.14 percent. Essentially, based on our knowledge of Feliz's success rate when it comes to bases on balls, he had a minuscule 0.14 percent shot of walking exactly four times in four straight plate appearances in a game at some point during a 150-game campaign.
If the same steps are taken but the formula is modified to calculate the probability of exactly two successful events out of four opportunities in any of 150 games over the course of a season, in an area with a success rate of 0.0025, we arrive at a probability of 0.56 percent, about four times as likely as Feliz's feat. This modification isn't arbitrary either, as I chose it for comparative purposes to illustrate just how unlikely it was for Feliz to accomplish what he did earlier this season. The 0.0025 rate of success refers to the quotient derived from dividing Juan Pierre's 13 career home runs by his 5,153 career at-bats entering this season. Juan Pierre is about four times more likely to hit precisely two home runs in the four at-bats per game that he has averaged than Feliz is to walk four times in as many plate appearances. Who'd have thought that?
The next step, involving the probability of an observed .357 OBP through 52 games for a player with a career .292 mark, is determinable in a couple of different ways.
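Before moving on, the walk and home-run arithmetic above can be reproduced outside of Excel. The following is a minimal Python sketch, not the author's spreadsheet: the function names `binom_pmf` and `at_least_once` are hypothetical helpers introduced here, and the code leans on the same assumptions as the text (four opportunities per game, a fixed success rate, independent trials, a 150-game season).

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials.

    Plays the role of Excel's BINOMDIST(k, n, p, FALSE).
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def at_least_once(p_game, games):
    """Chance that an event with per-game probability p_game happens
    in at least one of `games` independent games."""
    return 1 - (1 - p_game) ** games


# Feliz: exactly four walks in four plate appearances at a 5.5% walk rate.
feliz_game = binom_pmf(4, 4, 0.055)
feliz_season = at_least_once(feliz_game, 150)

# Pierre: exactly two home runs in four at-bats at the rounded 0.0025 rate.
pierre_game = binom_pmf(2, 4, 0.0025)
pierre_season = at_least_once(pierre_game, 150)

print(f"Feliz, single game:  {feliz_game:.7f}")
print(f"Feliz, 150 games:    {feliz_season:.4%}")
print(f"Pierre, 150 games:   {pierre_season:.4%}")
print(f"Pierre/Feliz ratio:  {pierre_season / feliz_season:.2f}")
```

Working from the unrounded values, the ratio comes out slightly above four rather than exactly four, which is why the comparison is best read as approximate.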
If a normal distribution is assumed, 68 percent of the data within a set will fall within one standard deviation (SD) of the mean; 95 percent of the data will fall within two SDs of the mean; and 99.7 percent of the data will fall within three SDs of the mean. Baseball data cannot always be categorized as normally distributed, so this method has the potential to produce inaccurate results, but because we'll go over the other method afterwards, comparing the results can prove interesting.
To find the aforementioned probability operating under the normal distribution assumption, Feliz's average and SD must be known. The average has already been established at .292, and the SD can be found through the formula SQRT(P × Q/N), where P is the percentage of success, Q is 1 - P (the percentage of failure), and N, in this case, stands for the number of plate appearances. Since the goal here involves the probability of such a high on-base percentage over a 52-game stretch, I first tallied the statistics for Feliz in every career stretch of such length. This method has been discussed in this space before, and is equivalent to utilizing the game logs summing tool at Baseball Reference. The 52-game stretches did not extend into subsequent seasons, however, and were comprised of around 205 plate appearances on average since Feliz became an everyday player.
Plugging the numbers in, P=0.292, Q=0.708, and N=205, and the expected standard deviation turns out to be 32 points. Again, assuming a normal distribution, we could then expect that a little over two-thirds of the 490 different 52-game stretches for Feliz from 2001-08 would feature OBPs ranging from .260 to .324. About 95 percent of these stretches would consist of OBPs ranging from .228 to .356. With five percent of the dataset out of the two-SD range, roughly 2.5 percent would fall below .228, and 2.5 percent would soar past .356.
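The standard-deviation arithmetic above can be sketched in a few lines of Python. This is only an illustration using the article's stated inputs (P = .292, N = 205); the variable names are assumptions of this sketch, and the exact normal tail computed at the end is an added comparison rather than a figure from the article, which relies on the two-SD rule of thumb.

```python
from math import sqrt
from statistics import NormalDist

p = 0.292  # career on-base percentage (probability of success)
n = 205    # average plate appearances in a 52-game stretch

# SQRT(P x Q / N), with Q = 1 - P
sd = sqrt(p * (1 - p) / n)

# Bands expected to hold roughly 68% and 95% of 52-game stretches.
one_sd = (p - sd, p + sd)
two_sd = (p - 2 * sd, p + 2 * sd)

# Exact upper tail of the fitted normal curve beyond a .357 OBP,
# for comparison with the "2.5 percent past two SDs" rule of thumb.
upper_tail = 1 - NormalDist(mu=p, sigma=sd).cdf(0.357)

print(f"SD: {sd:.3f}")
print(f"1-SD band: {one_sd[0]:.3f} to {one_sd[1]:.3f}")
print(f"2-SD band: {two_sd[0]:.3f} to {two_sd[1]:.3f}")
print(f"Normal tail beyond .357: {upper_tail:.2%}")
```

Because .357 sits a shade beyond two unrounded SDs, the exact tail is a little smaller than the rule-of-thumb 2.5 percent.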
Under a normal distribution and given his history, Feliz would have a 2.5 percent probability of exceeding a .356 OBP in a 52-game stretch as he has done this season. The data may not necessarily be normally distributed, however, making the binomial distribution the more accurate measure. Through 205 PAs, a .357 OBP could be produced with 73 on-base events. The probability of observing at least 73 successes in 205 opportunities, given a .292 success rate, is equal to one minus the probability of at most 72 successes. In Excel, this would be: 1-BINOMDIST(72,205,0.292,TRUE). The probability checks out at 2.4 percent, basically reporting that Feliz should observe a .357 OBP or higher over 52 games about once out of every 40 such stretches.
Interestingly enough, over his eight-year career, despite the expected SD of 32 points, Pedro has been consistently poor in reaching base, actually producing a 22-point deviation. Out of the 490 different stretches, Feliz has ranged from .234 to .345, never falling more than 1.87 SDs below his average, or rising more than 1.66 SDs above it.
Next week, I plan on delving deeper into historical OBP spikes akin to what Feliz has achieved so far this season. The method is very similar to our earlier look at Cliff Lee and his sharp uptick in ground balls last season. Can hitters sustain their shiny new OBP rates? Or are vast increases in reaching base even rarer than unexpected growths in ground-ball frequency? For now, the initial goal dealt more with just investigating the rather extreme unlikelihood that someone with Feliz's skill set could defy the odds by this magnitude, which serves as a solid preface to our eventual inquiry into whether or not these outlying on-base percentages are flukes, or a sign of improved future production lurking around the corner.
Special thanks to Tom Tango, Ben Baumer, and Heiko Todt for keeping me sane throughout my research.
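For readers who want to check the binomial figure for the 52-game OBP spike without Excel, here is a minimal Python sketch of the same calculation. The helper name `binom_cdf` is hypothetical and stands in for BINOMDIST(..., TRUE); the inputs (72, 205, 0.292) are the ones given in the article.

```python
from math import comb


def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials.

    Plays the role of Excel's BINOMDIST(k, n, p, TRUE).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# At least 73 on-base events in 205 plate appearances at a .292 rate
# equals one minus the chance of 72 or fewer.
tail = 1 - binom_cdf(72, 205, 0.292)

print(f"P(at least 73 of 205): {tail:.2%}")
```

The result depends on rounding choices such as the success rate and the number of plate appearances, so it should land near, but need not exactly match, the quoted figure.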
Eric Seidman is an author of Baseball Prospectus. 25 comments have been left for this article.
 
Feliz's four-walk game was amazing. The Phillies, due to Manuel and Milt Thompson, have been pretty good regarding getting guys to reach career highs in OBP, sometimes by a lot. Rod Barajas and Aaron Rowand are two great examples but there have been others. Feliz has fit into that scheme as he drew a career-high walk rate last season and an OBP better than .300 for only the second time in his career.
However, his increase in OBP this season is due to him hitting above .300 more than for any other reason. He is a career .255 hitter with a .294 OBP. Now he is hitting .305, which accounts for nearly all of his 60-point increase in OBP. When his average drops to the .250ish range, his OBP will be in the very low .300s. A slight improvement to his career numbers but not by all that much.
Feliz really has trouble with pitch recognition. He's trying to do better with the Phils it seems, as the Phillies are really an OBP type team. But he really doesn't look like he sees pitches all that well. He does hit them pretty well when he does get one in his zone though.
Another thing that's interesting about Feliz is he has some pretty nice clutch numbers throughout his career. I wonder if a talented hitter who has trouble with the pitch recognition thing (if that's possible, but it seems to be the case here) has some kind of advantage in the clutch.
His overall walk rate is still higher in 2009. Using the latest stats from the DT page we see: 18 BB in 212 PA (in 2009) versus 182 BB in 3490 PA (before 2009); that is, 0.085 BB/PA in 2009 versus 0.052 BB/PA before. Since we're really interested in the spike in his walk rate (and not his total OBP), the probability to see >=18 BB in 212 PA assuming a binomial distribution with p=0.052 is 3.0%. Accounting for the fact that there is also a statistical uncertainty on the 182 BB in 3490 PA, we find that the probability is 3.4%. I'd still call that interesting.
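The comment above does not say how the uncertainty on the prior walk rate was folded in. One plausible way to sketch it, offered purely as an assumption, is a beta-binomial model: treat the pre-2009 walk rate as Beta-distributed (here with a uniform prior updated by 182 walks in 3490 PA) and compare its upper tail with the plain binomial one. All function names below are hypothetical helpers, not anything from the comment.

```python
from math import comb, exp, lgamma


def binom_tail(k, n, p):
    """P(X >= k) for a plain binomial with a fixed success rate p."""
    return 1 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


def log_beta(x, y):
    """Natural log of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def betabinom_pmf(k, n, a, b):
    """P(X = k) when the success rate itself follows Beta(a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def betabinom_tail(k, n, a, b):
    """P(X >= k) under the beta-binomial model."""
    return 1 - sum(betabinom_pmf(i, n, a, b) for i in range(k))


# Fixed-rate version, using the p = 0.052 quoted in the comment.
fixed_rate_tail = binom_tail(18, 212, 0.052)

# Assumed model: uniform Beta(1, 1) prior updated with 182 BB in 3490 PA.
a = 182 + 1
b = (3490 - 182) + 1
uncertain_rate_tail = betabinom_tail(18, 212, a, b)

print(f"Binomial tail:       {fixed_rate_tail:.2%}")
print(f"Beta-binomial tail:  {uncertain_rate_tail:.2%}")
```

Allowing the underlying rate to vary widens the distribution, so the second probability comes out somewhat larger than the first, which is the direction of the adjustment described in the comment.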