
So what are Chase Utley’s chances of getting his hitting streak to 56 games?

Short answer for the really impatient: 1 in 194.

How I got there:

Hitting streaks are notoriously difficult beasts for a statistician to deal with, even though they seem to be a simple application of probability. You would think that it would work like this: take some representation of his ability to get a hit, say his current average of .328, and subtract it from 1 to get .672, his chance of not getting a hit in any given AB. If he gets four ABs a game, his chance of going hitless in a game is just .672 raised to the fourth power, or .204. Subtract that from 1 again to get .796 (his chance of getting a hit in a game), and raise that to the number of games needed, currently 23, so you have .796 raised to the 23rd power, or about 0.5%, roughly 1 chance in 190.
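The back-of-the-envelope arithmetic above can be checked in a few lines. This is only a sketch of that calculation; the function name `naive_streak_odds` is made up for illustration, and it assumes exactly four ABs in every game.

```python
def naive_streak_odds(avg: float, ab_per_game: int, games_needed: int) -> float:
    """Chance of getting at least one hit in each of `games_needed` games."""
    p_hitless_game = (1 - avg) ** ab_per_game   # no hit in any AB of a game
    p_hit_game = 1 - p_hitless_game             # at least one hit in a game
    return p_hit_game ** games_needed           # ...in every remaining game


odds = naive_streak_odds(avg=0.328, ab_per_game=4, games_needed=23)
print(f"{odds:.2%}, or about 1 chance in {1 / odds:.0f}")
```

Changing `ab_per_game` to 3 or 5 shows how sensitive the result is to the number of chances he gets each day.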

Breathe.

The big problem with this approach is that he doesn’t get 4 AB every game, and the permutations of juggling 3 AB games and 5 AB games quickly grow into a nightmare. I decided to take a shortcut by modeling Utley’s chances in almost exactly the way I model the rest of the season in the playoff odds reports. A simulation can build the probabilities by trial and error and good counting skills, rather than by solving an exact equation, although the simple off-the-cuff method did exceptionally well.

The model starts with a game counter. For each game, a random number will tell us how many AB Utley gets today, based on his distribution this season. Here’s that list:

AB  G
------
1   2
2   2
3  18
4  51
5  26
6   5

In the two games with one AB, Utley was a pinch-hitter, which we can safely assume is not going to happen while his streak is on the line. The games with 2 AB, however, were fully legitimate games. I’m going to mirror the distribution here, ignoring the 1 AB games; however, there is another small problem which I’m going to mention and then ignore. The Phillies no longer have Bobby Abreu and his .427 OBP; in his place they have David Dellucci, who’s more like a .360 OBP guy. That should cost the Phillies about .4 plate appearances per game, and that’s 23 plate appearances between now and the end of the year. If last night’s game is part of a trend with Utley moving from a mostly #2 hitter to #3 as a further consequence of the move, then he personally stands to lose about 8 ABs from the distribution presented above.
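One way to mirror that distribution, with the 1 AB games dropped, is sketched below. The names `AB_GAMES`, `sample_at_bats`, and `hit_game_probability` are illustrative rather than taken from the actual model, and the sketch makes no adjustment for the Abreu trade or a move to the #3 spot.

```python
import random

# Games this season by number of ABs, from the list above, minus the
# two pinch-hitting appearances with a single AB.
AB_GAMES = {2: 2, 3: 18, 4: 51, 5: 26, 6: 5}


def sample_at_bats(rng: random.Random) -> int:
    """Draw today's AB count in proportion to how often each count occurred."""
    return rng.choices(list(AB_GAMES), weights=list(AB_GAMES.values()))[0]


def hit_game_probability(avg: float) -> float:
    """Chance of at least one hit in a game, averaged over the AB distribution."""
    total_games = sum(AB_GAMES.values())
    return sum(
        games / total_games * (1 - (1 - avg) ** at_bats)
        for at_bats, games in AB_GAMES.items()
    )


rng = random.Random(0)
print("sample AB counts:", [sample_at_bats(rng) for _ in range(10)])
print(f"per-game hit probability at .328: {hit_game_probability(0.328):.3f}")
```

Because each game is treated as independent here, the blended per-game probability can also be raised to the 23rd power directly, which gives a quick sanity check on any simulation.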

Utley’s batting average is .328. That’s for this season only; for his career he’s at .292. He’s hitting .344 at home and .310 on the road, .357 vs. lefties and .315 vs. righties, and if I dug in enough I could tell you what his average is on Wednesdays, but I don’t really care. The point is that his true average on any given day is probably not .328, but somewhere between about .250 and .400 depending on who’s pitching, who’s umpiring, where he’s playing, and the weather conditions that day, not to mention things like whether his breakfast is agreeing with him. This sounds pointless, but it isn’t: streaks, which by definition end at the weakest link, are enormously sensitive to anything that lowers the probability, even for a day. A steady 4 AB a game produces better odds than a mix of 3, 4, and 5 AB games, assuming the 3s and 5s are equally common, because the 3 AB games hurt your chances a lot more than the 5 AB games help them. Likewise with average: a steady .328 average will produce better odds of building a long streak than any normal distribution around it. Of course, I wrote this up while running the simulation, and as it turns out it really didn’t make that much difference; going from a steady .328 batting average to .328 plus or minus 100 points reduced his chances by less than 5%, so this is another thing that I’m going to mention and then ignore.
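The asymmetry between 3 AB and 5 AB games is easy to see with a small comparison. This sketch assumes a fixed .328 average and an even split between 3 and 5 AB games for the mixed case; the function names are hypothetical.

```python
def hit_game_probability(avg: float, at_bats: int) -> float:
    """Chance of at least one hit in a game with the given number of ABs."""
    return 1 - (1 - avg) ** at_bats


def blended_hit_game_probability(avg: float, ab_mix: dict) -> float:
    """`ab_mix` maps ABs per game to the share of games with that many ABs."""
    return sum(share * hit_game_probability(avg, ab) for ab, share in ab_mix.items())


steady = blended_hit_game_probability(0.328, {4: 1.0})
mixed = blended_hit_game_probability(0.328, {3: 0.5, 5: 0.5})

print(f"steady 4 AB: {steady:.3f} per game, {steady ** 23:.3%} over 23 games")
print(f"3/5 AB mix:  {mixed:.3f} per game, {mixed ** 23:.3%} over 23 games")
```

Both cases average 4 AB a game, yet the mixed schedule comes out lower per game, and the gap compounds over 23 games.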

There’s also the likelihood that increasing scrutiny as he goes along raises the pressure and hurts his ability to play. I have no idea how to model hair falling out from stress and what it does to one’s batting average, but consider this: if we simply say that the stress will cost his batting average a point a game (so that at 23 games it will be down to .305), his chances of reaching 56 games drop by a third. Drop his average by 2 points a game and they lose another third (leaving about 44% of the original odds).
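A rough version of that stress adjustment is sketched below. It assumes a flat 4 AB per game rather than the full AB distribution, so its ratios should only land in the neighborhood of the simulated figures quoted above; `streak_odds_with_decline` is an illustrative name.

```python
def streak_odds_with_decline(
    avg: float, drop_per_game: float, games_needed: int, ab_per_game: int = 4
) -> float:
    """Chance of a hit in every game while the average slips a little each game."""
    odds = 1.0
    for game in range(1, games_needed + 1):
        game_avg = avg - drop_per_game * game   # .305 in game 23 at a point a game
        odds *= 1 - (1 - game_avg) ** ab_per_game
    return odds


baseline = streak_odds_with_decline(0.328, 0.000, 23)
for drop in (0.001, 0.002):
    odds = streak_odds_with_decline(0.328, drop, 23)
    print(f"{drop * 1000:.0f} point(s) a game: {odds / baseline:.0%} of the steady odds")
```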

So, then, using a simple .328 average, along with Utley’s current distribution of ABs per game, here’s how many times, in a million trials, he extended the streak by N or more games. He comes in with a 33-game streak, so 23 is the magic number to tie and 24 is the number to beat DiMaggio:

 1  797735   Utley's streak ends tonight in 20.2% of the simulations.
 2  635216
 3  505901
 4  402784
 5  320310
 6  255260
 7  202923   20% chance of reaching 40 games.
 8  161574
 9  128797
10  102361
11   81611   8% chance of reaching Rose and Keeler at 44.
12   65013
13   51736
14   41081
15   32719
16   25957
17   20693   Fifty game streak, 2% chance.
18   16461
19   13142
20   10463
21    8288
22    6537
23    5166   1 in 194 chance of reaching DiMaggio
24    4105   1 in 244 of passing him
25    3302
26    2618
27    2086   60 game streak, .2%. Notice we're losing one order of magnitude every 10 games.
28    1654
29    1295
30    1022
31     836
32     661
33     518
34     413
35     320
36     254
37     206   Seventy game streak, .02%.
38     155
39     122
40     102
41      83
42      67
43      58
44      52
45      46
46      33
47      27
48      19
49      16
50      11
51       9
52       7
53       5
54       4
55       3
56       3
57       1   One in a million of finishing the year with the streak intact, or 90 games.
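For anyone who wants to try something similar, here is a minimal Monte Carlo sketch of the approach described above. It is not the code behind the table: it assumes a flat .328 average, the AB distribution without the 1 AB games, 57 games left in the season, and independent ABs, and names like `simulate_streaks` are illustrative. With a different random stream the counts will differ from the table, though they should fall in the same general range.

```python
import random

AB_GAMES = {2: 2, 3: 18, 4: 51, 5: 26, 6: 5}   # ABs per game -> games this season
GAMES_LEFT = 57                                # assumed games remaining in the year


def simulate_streaks(avg: float, trials: int, seed: int = 0) -> list:
    """Return counts where counts[n] = trials in which the streak grew by >= n games."""
    rng = random.Random(seed)
    # One entry per game played, so a uniform draw mirrors the distribution.
    ab_pool = [ab for ab, games in AB_GAMES.items() for _game in range(games)]
    counts = [0] * (GAMES_LEFT + 1)
    for _trial in range(trials):
        for game in range(1, GAMES_LEFT + 1):
            at_bats = rng.choice(ab_pool)
            got_hit = any(rng.random() < avg for _ab in range(at_bats))
            if not got_hit:
                break                          # streak over for this trial
            counts[game] += 1
    return counts


trials = 100_000
counts = simulate_streaks(0.328, trials)
for n in (1, 7, 23, 24):
    print(f"{n:>2} more games: {counts[n]:>6} of {trials} trials")
```

A million trials in pure Python takes a while; 100,000 is enough to see the shape of the curve, though the far tail gets noisy.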


Special thanks to BP reader Timo Seppa for suggesting the column idea.
