
Willie Mays famously started his career 0-for-12 before hitting a home run off of Warren Spahn. This season, Orioles über-stud catcher Matt Wieters has struggled to live up to expectations, posting a feeble .264/.310/.368 line since being called up in May. Talented rookies such as these present a twofold challenge to their teams: first, how to identify when they’re ready for promotion, and second, how to react when they fail to produce. These decisions can be driven by subjective considerations, such as a scout or manager’s evaluation of the player’s poise and confidence. Such things are certainly important, but it’s worth investigating what a purely objective mechanism for making these decisions might look like.

So, today we’ll try to answer the first question: How do you decide when a prospect’s ready? Let’s consider the common scenario in which a rookie player is competing with a veteran for the vet’s job. The veteran’s productivity is typically well established, while the rookie’s productivity is not known as precisely. Thus, we’re faced with a choice between a so-called sure thing, and an unknown but possibly superior alternative.

In the field of statistical decision theory, such choices are known as “multi-armed bandit” problems. They are so named because of an analogy to a slot machine with multiple levers, each of which has a different payoff rate. Our situation can be modeled as the relatively simple case of a machine with two levers, one of which has a known payoff rate. In order to construct the model, we’ll need two inputs: the veteran’s productivity (the payoff rate of the known “arm”) and a probability distribution on the rookie’s productivity (the payoff rate of the unknown “arm”). To measure productivity, we’ll use on-base percentage. Obviously, OBP is not a perfect measure of productivity; it does not consider defense, power, or baserunning skill. As a measure of offensive performance, though, it is pretty good, correlating with run-scoring to the tune of .91 (per some 2006 research by Dan Fox). Furthermore, the fact that it measures a binary outcome (either a player reaches base, or he doesn’t) is extremely convenient for modeling purposes, as we’ll see shortly.

Since the veteran’s productivity is well established, we’ll quantify it as his aggregate OBP over the previous three seasons. We’ll assign the rookie an OBP distribution (a beta, for my fellow stats geeks out there) based on PECOTA’s projections. Now we must compare two expected values: that of starting the veteran all season and that of provisionally starting the rookie. I say “provisionally” because the team always has the option of substituting the veteran if the rookie doesn’t perform well.
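For readers who want to see what such a distribution looks like, here is a rough Python sketch that summarizes a beta distribution. The parameters 107 and 166 are the ones quoted for Wieters in the comment thread below; the function names are hypothetical, and the quantile routine is a simple grid approximation, not necessarily how the fit was actually done.

```python
import math

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

def beta_quantile(a, b, q, steps=20000):
    """Approximate the q-th quantile by accumulating the density on a grid."""
    # Log of the normalizing constant 1 / B(a, b).
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    width = 1.0 / steps
    cumulative = 0.0
    for i in range(steps):
        x = (i + 0.5) * width  # midpoint of the i-th grid cell
        density = math.exp(
            log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
        )
        cumulative += density * width
        if cumulative >= q:
            return x
    return 1.0

# Beta(107, 166): the parameters quoted for Wieters in the comments.
mean, sd = beta_mean_sd(107, 166)
print(f"mean {mean:.3f}, sd {sd:.3f}")
print(f"10th pct {beta_quantile(107, 166, 0.10):.3f}")
print(f"90th pct {beta_quantile(107, 166, 0.90):.3f}")
```

The mean of a beta(a, b) is simply a / (a + b), so the two parameters can be read as "successes" and "failures" in an imaginary prior sample; larger totals mean a narrower distribution.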

Assuming 600 PA in a season, we can compute the expected performance of the veteran by multiplying his aggregate OBP by 600; this will give us the number of successes (i.e., non-outs) that he should contribute. Computing the expected value of provisionally starting the rookie is much more complicated, since his performance is variable and a substitution can occur at any time. I wrote an algorithm that accomplishes this by starting with the final plate appearance and iterating backward, considering all possible outcomes (in terms of successes and failures; this is why OBP is a convenient number). At each juncture, the algorithm chooses whether or not to replace the rookie based on which player offers a higher total expected value.
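The backward iteration can be sketched in a few lines of Python. This is a minimal illustration, not the original program: it assumes the rookie's OBP follows a beta(a, b) distribution that is updated after every plate appearance, and that once the veteran is substituted he plays the rest of the season. The function name and signature are hypothetical.

```python
def rookie_trial_value(a, b, vet_obp, n_pa=600):
    """Expected times on base over n_pa plate appearances if the rookie
    starts and the team may switch to the veteran for good at any point.

    Assumption: after s successes in t rookie PAs, the chance the rookie
    reaches base next is (a + s) / (a + b + t), the beta posterior mean.
    """
    # value[s] = best expected successes still to come, given the rookie
    # has s successes so far. After the final PA nothing remains.
    value = [0.0] * (n_pa + 1)
    for t in range(n_pa - 1, 0, -1):       # t = PAs the rookie has taken
        remaining = n_pa - t
        switch = vet_obp * remaining        # veteran plays out the season
        new_value = [0.0] * (t + 1)
        for s in range(t + 1):
            p = (a + s) / (a + b + t)
            keep = p * (1 + value[s + 1]) + (1 - p) * value[s]
            new_value[s] = max(keep, switch)
        value = new_value
    # The first PA always goes to the rookie: that is the option being valued.
    p0 = a / (a + b)
    return p0 * (1 + value[1]) + (1 - p0) * value[0]

if __name__ == "__main__":
    vet_obp = 0.348                         # Zaun's aggregate OBP
    trial = rookie_trial_value(107, 166, vet_obp)
    print(f"veteran all season: {vet_obp * 600:.1f}")
    print(f"rookie, provisionally: {trial:.1f}")
```

Because each state depends only on the two states one PA later, a single pass from the last plate appearance back to the first is enough; there are about 180,000 (PA, successes) states for a 600-PA season.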

Given the OBP of the veteran and the mean projected OBP of the rookie, the algorithm will determine whether or not the rookie should be given a shot. As it turns out, the algorithm recommends starting the rookie in nearly every case; the only exceptions are the very worst rookies who are up against very good veterans. Six hundred PA is a lot of playing time, so the cost of possible early failures by the rookie is small relative to the long-term gain to be had if the rookie turns out to be highly productive. To get a better idea of how profitable starting the rookie can be, we can ask how good his veteran replacement would need to be in order for playing the veteran to yield the same expected value as giving the rookie a shot. If the veteran’s OBP is greater than this “break-even” value, then the rookie should not start; if it is less, then the kid definitely should be in the lineup.
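One way to find such a break-even value is a bisection search over the veteran's OBP. The Python sketch below is an illustration under stated assumptions, not the original code: the rookie's OBP is a beta(a, b) updated after each PA, a substitution is permanent, and the helper names are hypothetical. A compact version of the backward iteration is included so the block runs on its own.

```python
def trial_value(a, b, vet_obp, n_pa):
    """Expected times on base if the rookie starts and may be benched for good."""
    value = [0.0] * (n_pa + 1)              # nothing left after the final PA
    for t in range(n_pa - 1, 0, -1):        # t = PAs the rookie has taken
        switch = vet_obp * (n_pa - t)
        value = [
            max(switch, p * (1 + value[s + 1]) + (1 - p) * value[s])
            for s in range(t + 1)
            for p in [(a + s) / (a + b + t)]  # posterior chance of reaching base
        ]
    p0 = a / (a + b)
    return p0 * (1 + value[1]) + (1 - p0) * value[0]

def break_even_obp(a, b, n_pa=600, tol=5e-4):
    """Veteran OBP at which starting him all season matches the rookie trial."""
    # The trial is worth at least the rookie's mean per PA, and a veteran
    # with a 1.000 OBP always wins, so the answer lies between the two.
    lo, hi = a / (a + b), 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if trial_value(a, b, mid, n_pa) > mid * n_pa:
            lo = mid                         # rookie still preferred
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    # Beta(107, 166) is the Wieters distribution quoted in the comments.
    print(round(break_even_obp(107, 166), 3))
```

Bisection works here because the advantage of the rookie trial over the veteran shrinks steadily as the veteran's OBP rises. Whether this sketch lands exactly on the break-even figures reported in the table depends on modeling details the article does not spell out.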

This has all been rather abstract, so let’s examine some specific cases from this year to see the algorithm in action. The rookies we’ll look at are Wieters and someone who is perhaps a more typical prospect, Marlins center fielder Cameron Maybin. Below are their pre-season PECOTA projected OBP percentiles, along with the OBP of their initial likely veteran replacements and the “break-even” OBP produced by the algorithm:


Rookie    Mean   90th   10th   Veteran   Vet's OBP (2006/2007/2008)   Break-even OBP
Wieters   .392   .432   .356   Zaun      .348 (.363/.341/.340)        .415
Maybin    .345   .388   .311   Amezaga   .323 (.332/.324/.312)        .370

This information is perhaps better presented visually. In the following diagrams, the rookie’s mean projected OBP is in blue, the veteran’s aggregate OBP is in red, and the break-even OBP is in green:

[Figure: Wieters projection]

[Figure: Maybin projection]

We can see that, based on expected OBP alone, both rookies should have started over their veteran counterparts, and it’s not even close. The Orioles would have needed a catcher who could be expected to post a .415 OBP in order to justify keeping Wieters in the minors; Zaun is adequate at the plate, but he’s nowhere near that level. Likewise, Amezaga’s aggregate OBP is well below the .370 needed to deny Maybin a shot. In point of fact, Maybin was the Marlins’ Opening Day center fielder, while Wieters was kept in the minors until the end of May; the latter decision was almost certainly motivated by service-time concerns, however.

This analysis is rather involved; fortunately, there appears to be a trend that leads to a simple rule of thumb. Notice that these break-even values are roughly .020 higher than the rookie’s mean projected OBP. I performed this analysis with several other rookies, and found this to be the case generally. Thus, it appears that a rookie should start unless a veteran can be expected to post an OBP at least .020 higher than the rookie’s mean projection.
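As a quick check on the two cases in the table, the gap between break-even and mean OBP can be computed directly, and the rule of thumb written as a one-line test. This is only a sketch; the function name and the default margin parameter are illustrative.

```python
# (mean projected OBP, break-even OBP) from the table above.
rows = {"Wieters": (0.392, 0.415), "Maybin": (0.345, 0.370)}

# Gap between break-even OBP and the rookie's mean projection.
gaps = {name: round(be - mean, 3) for name, (mean, be) in rows.items()}
print(gaps)

def should_start(rookie_mean, vet_obp, margin=0.020):
    """Rule of thumb: start the rookie unless the veteran's expected OBP
    is at least `margin` higher than the rookie's mean projection."""
    return vet_obp < rookie_mean + margin

print(should_start(0.392, 0.348))  # Wieters vs. Zaun
print(should_start(0.345, 0.323))  # Maybin vs. Amezaga
```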

Of course, this rule is not meant to be definitive. The analysis on which it is based considers only OBP, ignoring power, defense, and other relevant factors; in addition, the assumption that the veteran’s OBP is fixed, though reasonable, is clearly false. Rather, the rule should be seen as a starting point to be supplemented with other information. It also serves to highlight just how much a team stands to gain by giving a promising rookie a chance. Even if there’s only a small chance that a rookie will be more productive than an established player, it is usually worth investing a few games of baseball’s long season to find out if this is the case. After all, the rookie can always be benched or sent down to the minors if the experiment doesn’t work out. In the next part of this diptych, I’ll attempt to determine just how badly the rookie needs to perform in order to justify such a decision.


Dan Malkiel is an intern for Baseball Prospectus.

gbrisbee
9/10
If you do the disparity between Bengie Molina and Buster Posey's projected OBP, will I need a widescreen monitor to view the graph?
irablum
9/10
This is a very interesting analysis. Of course, depending on OBP is simplistic. Also, it's not always that a given player is replaced. Sometimes they are "displaced". I remember back to just before spring training when the decision was made to move Michael Young to third base, bench Hank Blalock, and install Elvis Andrus at shortstop.

I'd guess you'd have to compare Elvis' OBP to Blalock's.... which has worked out with Elvis' .340 to Blalock's .278

but the real gain here was on defense, which was improved at 2 positions. Elvis was much better than Mike Young at shortstop, and Young was better than Blalock at third. You can go even further because Blalock wasn't the only one to man third in 2008. You had Ramon Vazquez (ugh), German Duran (help), Chris Davis (shoot me), and Travis Metcalf (gack) all log 10+ games at third. Also because of that, 5 players manned first base (Davis, Blalock, Chris Shelton, Frank Catalanotto, and Ben Broussard).

In contrast, the 2009 infield has been relatively set, with Elvis at short, Young at third, Kinsler at second and Davis (or Blalock) at first. With Omar backing up at second, third, and short.
drmboat
9/10
Dan-
The Wieters example should show the failing in your methodology...the break-even OBP is only that high because you continually expect Wieters to have his PECOTA distribution despite the mounting evidence that shows that the forecast is wrong. As of today, Zaun's .319 would be marginally better than Wieters' .310, yet your analysis would still expect Wieters to mash going forward and would not take into account the .310.

I think you'd have a better analysis if you used Wieters' expected curve and adjusted it as time went forward. Once it was pretty clear that Wieters wasn't going to have a .400 OBP the difference between him and Zaun should have looked much smaller.
hiredgoon1
9/10
This part of the analysis is meant to determine who should start at the beginning of the season, so I haven't yet considered Wieters' MLB performance. I will be doing exactly that in part 2.
harderj
9/11
Thanks for the clarification. I, too, missed that this was a start of the season analysis. Looking forward to part 2!
roryasdfasdf
9/10
The Wieters example really shows this methodology needs a little bit of tweaking. If Zaun was posting a .400 OBP this analysis would suggest replacing him with Wieters when a .400 OBP would make Zaun the 2nd best catcher in baseball in terms of OBP. Clearly that doesn't make a lot of sense.

Even the .370 OBP needed by Amezaga to avoid being replaced by Maybin is wonky. A .370 OBP would have made Amezaga the 6th best center fielder in baseball when Maybin's 90th percentile is only .388. That seems nonsensical as well.

I think because the veteran is modelled as a single probability, the upper tail of the distribution for the rookie is weighting everything in favor of the rookie. In the given model the established veteran has no chance of having an amazing career year but the rookie does. So of course the rookie should get a chance at having that year, especially when, if it goes south, he can just get replaced by the veteran's solid OBP. Which is the point you were trying to investigate, I know, but I think the way the model is set up, it's falling out of the simplifications of the model and not the data per se. That being said I think you will still see a shift in favor of the rookie starting, but I think the type of model you chose is magnifying the difference between what-could-be with the rookie and the veteran.

Finally, the shape of the beta distribution seems not right to me. The beta distribution looks almost normal out there and it seems intuitively that you would expect it to be left shifted for a prospect having to adjust to the major leagues.

Fun idea! Looking forward to seeing part 2 and seeing if it gets tweaked.
hiredgoon1
9/11
Suppose Zaun could post a .400 OBP; Wieters' 90th percentile OBP is .432. The idea is that it's worth the small cost of a few PA to give him a chance to match or exceed that mark.

You're right that it's incorrect to model the veteran as a single probability. However, the problem is MUCH more difficult to solve without this assumption, which is not entirely unreasonable for a stable veteran. Besides, if one were to give the vet a probability distribution, it would necessarily be quite narrow, and so the updating (which I'll discuss in part 2) would have little effect.

The beta is tailored to match PECOTA's projections. In case you're wondering, Wieters' distribution is a beta(107, 166) and Maybin's is a beta(88, 166).
roryasdfasdf
9/10
I left out a sentence there. I think that if you modeled the veteran as a random variable and not a single value, you would still see a shift in favor of the rookie starting, but it would not be as pronounced.
mentalmeat
9/11
If Wieters' PECOTA projected 10th percentile OBP was .356, and he is at .310 (looks like .319 now), what is that? Off the charts, practically. 2nd percentile? A PECOTA limbo.
moscow25
9/13
The percentiles for Wieters are based on a full season. The 10th percentile for half a season might be lower.

Also in the model, how long is the expected time to establish a level of performance? I would imagine it has to be at least a few weeks before you bench/send down a prospect?
blcartwright
9/13
My first thought is that the veteran also has a distribution of projected performance, just that most of the time the uncertainty of the vet will be less than that of the rookie.
cjones06
9/16
Oh man, gotta say, I've always wanted to see this precise idea fleshed out when analyzing young players (though of course service time accumulation complicates the value to the team).

In your comments though, you mention that this would be much more difficult to do if the veteran was not modeled as a known parameter. This isn't really true though. It's been shown in the statistical literature that the multi-armed bandit problem can be reduced to the single-armed bandit, which is actually the case you're considering. You just end up getting a 'cut-off' point for each of the players, and the team would (in theory) want to choose whichever one was higher. As we would surmise, higher means and greater variance will both increase this cut-off.