April 28, 2014
Moonshot
How Quickly Do Team Results Stabilize?

With the end of April looming, we can begin to shed some of our fears regarding small sample size. Statistics like strikeout and walk rates have passed critical thresholds on their march toward stabilization, and so we are beginning to get a first look at how well individual players will perform. The requisite early-season loss of ~20% of each team’s starting rotation to the failure of a certain crucial ligament has taken its toll, resulting in a clearer picture of who will make each team’s starts.

All of which is to say, we can begin to turn our attention to matters larger than individual players. Since the ultimate goal of every team is to win a championship, and the best way to win a championship is to field a very good team, the question of utmost importance is simply: How good is my team? In light of this question, I examine here how quickly team quality stabilizes over the course of a season.

At a fundamental level, good teams are defined by 1) scoring lots of runs, and 2) not allowing the other team to score many runs. Therefore, I take as my measurements of quality runs scored per game and runs allowed per game. While there is a simple relationship between the number of runs scored/allowed and wins (via the Pythagorean expectation), that relationship is quite noisy. First and foremost, the noise results from sequencing, or the luck a team has in apportioning its runs to individual games. A bad team may thus end the season with an excellent record and a playoff berth, despite an underlying lack of quality. Nevertheless, all else being equal, good teams (those that score many runs and don’t allow many runs) are more likely to make the playoffs and win championships than bad teams.
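For readers who want to see the Pythagorean expectation mentioned above in concrete form, here is a minimal sketch. It assumes the classic exponent of 2 (other exponents are in common use, and the article does not say which one applies here), and the function names and the example team are hypothetical. Because the formula is a ratio, per-game rates and season totals give the same answer.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic form; treat it as a tunable assumption.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(rs_per_game, ra_per_game, games=162, exponent=2.0):
    """Scale the expected winning percentage to a full season."""
    return games * pythagorean_win_pct(rs_per_game, ra_per_game, exponent)


if __name__ == "__main__":
    # Hypothetical team: 4.5 runs scored and 4.0 runs allowed per game.
    print(round(expected_wins(4.5, 4.0), 1))
```

This gives only the expected record. The sequencing noise described above is exactly what the formula leaves out.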
Estimating Quality
19 comments have been left for this article.

R.A.Wagman (32721)
Robert, is it safe to say, looking at the final numbers, that the model assumes no player movement between teams?
Apr 28, 2014 08:47 AM

Yes. All of this is based on PECOTA's depth chart projections, which assume that player X will get N plate appearances with his current team. There's no accounting for what would happen if that player gets traded. That's probably a source of some inaccuracy, given trading deadline dynamics (good teams tend to buy, bad teams tend to sell).
Apr 28, 2014 08:53 AM

jrcolwell (67729)
Before running your projection model, did you update PECOTA depth chart projections to account for new playing time projections (like in the case of injuries)? Or are you still using the same preseason playing time projections?
Apr 28, 2014 13:43 PM

evo34 (33584)
What exactly is the model you came up with?
Apr 28, 2014 10:05 AM

There are 162 slightly different models, one per game number. As you might imagine, the weight on PECOTA decreases as the season goes on, and the weight on RS/G increases. As I said below, I'm going to look into this again and try to get a simple formulation for how much to weight PECOTA per game number. The intention here wasn't to maximize accuracy so much as to illustrate the overall trends.
Apr 28, 2014 18:54 PM

Peter Benedict (3131)
MN scoring the third most runs in all of baseball? Unpossible!
Apr 28, 2014 10:20 AM

Greg Ioannou (51725)
For about half the teams (BAL, DET, CIN, CLE, MIL, NYA, NYN, PHI, SFN, TBA, TOR, HOU, and WAS), the projected RS is below both the actual RS and PECOTA. I think there's a problem with the methodology. (Or perhaps you just made a mistake.)
Apr 28, 2014 10:28 AM

I think that this is a feature, not a bug. Why? Because teams generally scored fewer runs per game later in the season than earlier in the season, reducing the final RS/G number (or at least they did in the years I looked at [2012/2013]). So the model is systematically underpredicting the runs/game to account for that.
Apr 28, 2014 12:23 PM

Michael Bodell (89)
If that were the case for nearly half of the teams, then wouldn't a better prediction for PECOTA just be even fewer runs per team? Might there not be some small sample size issue (maybe a shift of offense from 2012 to 2013 or from 2013 PECOTA expectations to 2013 actual) that causes this effect?
Apr 29, 2014 18:50 PM

Michael Bodell (89)
Also, in general the teams that you have projected outside this band are the teams whose actual runs scored are most similar to their projected runs scored. In some sense that is expected, since the band of possible values between the two is smallest. But in other senses this is surprising: the teams whose performance PECOTA has projected most accurately are the ones that your model doesn't trust! You'd think the evidence to date should make you trust PECOTA more, not less.
Apr 29, 2014 18:55 PM

newsense (5112)
I think I see an error: The Nationals' projection is below both its current runs scored and PECOTA.
Apr 28, 2014 11:02 AM

ravenight (45272)
Would be interesting to see the full version (perhaps trained on 3 years of data), incorporating RA to make a wins prediction.
Apr 28, 2014 14:37 PM

You say: "In general, a given team’s projected RS number is going to be somewhere between PECOTA’s projection and the RS number the team has accrued so far."
Why would it ever be anything else? How, for instance, do we account for the Orioles who are outperforming their PECOTA so far, but are projected to finish below that level?
Mathematically, a linear model has the variables in it (in this case, RS so far and PECOTA projected RS), and also an intercept. So if the intercept for a particular model is, say, -.1, and the RS and PECOTA numbers are each pointing towards 4.2 RS, the model might spit out 4.1, because of that intercept.
Generally, I would take the above numbers with a grain of salt. They are provided for illustrative purposes, rather than as definitive predictions. The point of this article was to show how quickly RS/RA stabilized, and to demonstrate that preseason predictions still carry some weight (and will until ~game 100). If people are interested in maximally accurate predictions, maybe I can do a follow-up with some more sophisticated models.
(With that said, it's also possible I just made a mistake in entering the numbers in the table. I'll go back and check to make sure.)
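
The intercept point in the reply above can be made concrete with a small sketch. The coefficients below are hypothetical, not the fitted values from the article's 162 per-game models: the weights are chosen to sum to 1 and the intercept is set to -0.1, which is one way for a linear model to return 4.1 when both inputs point to 4.2. The function and constant names are likewise illustrative.

```python
def project_rs_per_game(rs_so_far, pecota_rs, w_actual, w_pecota, intercept):
    """Linear projection: intercept plus a weighted sum of the two inputs."""
    return intercept + w_actual * rs_so_far + w_pecota * pecota_rs


def is_between(value, a, b):
    """True if value lies in the closed band spanned by a and b."""
    low, high = min(a, b), max(a, b)
    return low <= value <= high


# Hypothetical coefficients for a single game number (assumed, not fitted).
W_ACTUAL, W_PECOTA, INTERCEPT = 0.3, 0.7, -0.1

if __name__ == "__main__":
    rs_so_far, pecota_rs = 4.2, 4.2  # both inputs point to 4.2 RS/G
    projection = project_rs_per_game(
        rs_so_far, pecota_rs, W_ACTUAL, W_PECOTA, INTERCEPT
    )
    # The projection lands just below the band between the two inputs.
    print(round(projection, 2), is_between(projection, rs_so_far, pecota_rs))
```

With a zero intercept and weights that sum to 1, the projection would always fall between the two inputs. A nonzero intercept is what lets it drift outside that band, most visibly for teams whose actual and projected RS are already close together.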