Yesterday’s column made the claim that small differences in age among high school hitters can have a dramatic impact on their return as draft picks. Today, I intend to prove that claim.

Rather than simply looking at the youngest and oldest players in each draft year, I’ve taken all 846 players in the draft study and separated them by age into five roughly equal bins: Very Young, Young, Average, Old, and Very Old. I then calculated the combined expected value of the players in each bin based on where they were drafted and the combined Discounted WARP that they actually generated.

(If you want the technical details: “Very Young” players were less than 17 years, 296 days old on draft day; “Young” players were between 17 years, 296 days and 18 years, 38 days; “Average” players were between 18 years, 38 days and 18 years, 120 days; “Old” players were between 18 years, 120 days and 18 years, 200 days; “Very Old” players were more than 18 years, 200 days old.)
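For readers who want to reproduce the binning, here is a minimal Python sketch of the cutoffs quoted above. The function name `age_bin` and the choice to put a player who sits exactly on a cutoff into the older bin are assumptions made for illustration; the article does not say how ties were handled.

```python
# Age cutoffs quoted in the article, expressed as (years, days) on draft day.
# Each entry is the exclusive upper bound of the bin it labels.
# Assumption: a player exactly on a cutoff falls into the older bin.
CUTOFFS = [
    ((17, 296), "Very Young"),
    ((18, 38), "Young"),
    ((18, 120), "Average"),
    ((18, 200), "Old"),
]


def age_bin(years: int, days: int) -> str:
    """Return the age bin for a player aged `years` years, `days` days on draft day."""
    age = (years, days)
    for upper_bound, label in CUTOFFS:
        # Tuples compare years first, then days, which matches calendar age.
        if age < upper_bound:
            return label
    return "Very Old"
```

A player drafted at 18 years, 100 days old would land in the Average bin under these cutoffs.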

Here are the results:

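| Group | # Players | XP | DW | Return |
|---|---|---|---|---|
| Very Young | 169 | 386.31 | 482.26 | 24.84% |
| Young | 169 | 405.98 | 453.05 | 11.59% |
| Average | 170 | 390.68 | 418.58 | 7.14% |
| Old | 168 | 407.65 | 282.65 | -30.66% |
| Very Old | 170 | 370.42 | 249.07 | -32.76% |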

And here it is in graph form:

As you can see, there is an almost shockingly smooth progression in the data. Very Young players, as a whole, return 25 percent more value than expected by their draft slots. Young and Average players also return positive value, whereas Old and Very Old players return substantially less value than expected. The gap between the youngest and oldest groups of players isn’t quite as large by this method as what I measured in Part 1 yesterday—the youngest group returns about 86 percent more value than the oldest group as opposed to 117 percent. It’s still an enormous difference.
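The Return column is simply actual value divided by expected value, minus one. The short Python sketch below recomputes it from the XP and DW totals in the table and then compares the youngest and oldest bins; the names `BINS` and `pct_return` are illustrative, not part of the original study.

```python
# (XP, DW) totals for each age bin, copied from the 1965-1996 table.
BINS = {
    "Very Young": (386.31, 482.26),
    "Young": (405.98, 453.05),
    "Average": (390.68, 418.58),
    "Old": (407.65, 282.65),
    "Very Old": (370.42, 249.07),
}


def pct_return(xp: float, dw: float) -> float:
    """Return on a group of picks: actual Discounted WARP relative to expected."""
    return dw / xp - 1.0


returns = {name: pct_return(xp, dw) for name, (xp, dw) in BINS.items()}

# How much more value per unit of expected value the youngest bin
# produced compared with the oldest bin (as a multiple).
youngest_vs_oldest = (1 + returns["Very Young"]) / (1 + returns["Very Old"])

for name, r in returns.items():
    print(f"{name:<10} {r:+.2%}")
print(f"Youngest vs. oldest multiple: {youngest_vs_oldest:.2f}")
```

Comparing the multiples (DW divided by XP) rather than subtracting the percentages is what produces the "86 percent more value" figure.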

This difference does not appear to have changed over the years. Here’s the same data as above but limited to players drafted in the first 16 years of our study, from 1965 to 1980:

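| Group | # Players | XP | DW | Return |
|---|---|---|---|---|
| Very Young | 86 | 197.87 | 261.67 | 32.24% |
| Young | 88 | 223.69 | 249.44 | 11.51% |
| Average | 86 | 224.57 | 277.59 | 23.61% |
| Old | 88 | 211.91 | 179.16 | -15.45% |
| Very Old | 86 | 216.94 | 135.91 | -37.35% |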

And here’s the data from players drafted in the last 16 years of our study, from 1981 to 1996:

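| Group | # Players | XP | DW | Return |
|---|---|---|---|---|
| Very Young | 82 | 179.42 | 218.79 | 21.94% |
| Young | 83 | 185.98 | 190.93 | 2.66% |
| Average | 81 | 167.11 | 124.57 | -25.46% |
| Old | 84 | 185.16 | 146.36 | -20.95% |
| Very Old | 82 | 168.39 | 101.20 | -39.90% |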

The data in each half of the study is not quite as smooth, which isn’t surprising given that the sample size is half as large. In the first half, Average players are a better value than Young players, while in the second half, Average players return less value than Old players. But otherwise, both halves of the data show the same thing: the younger the player, the better the return on investment.

And if you compare the Very Young players with the Very Old players, you’ll notice that the advantage enjoyed by the youngest set of players is greater in each half of the data than in the data as a whole. From 1965 to 1980, Very Young players return 111 percent more value than Very Old players; from 1981 to 1996 they return 103 percent more value.
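As a worked check of those two figures, divide each group's actual value by its expected value (DW over XP from the two tables above) and then take the ratio of the youngest group to the oldest:

$$
\text{1965–1980:}\quad \frac{261.67 / 197.87}{135.91 / 216.94} \approx \frac{1.3224}{0.6265} \approx 2.11
\qquad\qquad
\text{1981–1996:}\quad \frac{218.79 / 179.42}{101.20 / 168.39} \approx \frac{1.2194}{0.6010} \approx 2.03
$$

A ratio of 2.11 corresponds to 111 percent more value, and 2.03 to 103 percent more.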

That seems counterintuitive, that the advantage enjoyed by young players is greater in each half than in the study as a whole. But there’s a reason for this, which is something that isn’t very well known: high school players are getting older over time. This isn’t something limited to baseball; a better way of putting it is that high school students are getting older over time. There is a societal trend towards holding back children from starting school early. Whereas 40 years ago, parents frequently tried to get their soon-to-be-five-year-old child—whose birthday might be in October or November—into kindergarten, today’s parents frequently will hold back their five-year-old—whose birthday might fall in July or August—until the following year. There is a growing belief in educational circles—the data on this is controversial—that kids who are among the oldest in their class do better academically than those who fall on the youngest end of the spectrum.

And in fact, the average high school player drafted in 2010 is roughly three months older than the average high school player drafted in 1965. (Many thanks to Diane Firstman for her help in mining that data.) When I broke the data into two halves above, the age cutoffs for Very Young, Young, etc., were about six weeks higher for the draft group from 1981 to 1996 than they were for the draft group from 1965 to 1980.

This is why, I believe, the results we get from pooling the data for all players from 1965 to 1996, without regard to the year they were drafted, might actually underestimate the advantage younger players have. Derek Jeter and Jason Kendall would not have ranked among the 10 youngest high school hitters from 1965. But when teams are drafting, what matters is the draft pool in front of them, and both Jeter and Kendall were among the five youngest high school hitters in 1993. As draft classes get older as a whole, the youngest players in each class get older as well—but so do the oldest players, giving the youngest players the same age advantage they've always had.

Ultimately, we’re splitting hairs here. We can safely say that the youngest 20 percent of high school hitters in any particular year will return, on average, about double what the oldest 20 percent of high school hitters will.

If you would prefer a graphical, as opposed to mathematical, view of the data, this graph takes the scatter plot from yesterday’s article but separates players into two groups—players who were still 17 on draft day vs. players who were 18 or older:

There are two best-fit lines on the graph, and the red one—corresponding to 17-year-olds—tracks significantly above the black one. You’ll probably notice that while only 33 percent of the players in the study were still 17, a preponderance of the red dots (indicating the 17-year-old players) sit above the best-fit lines.

We can sum up all the data above by performing a second linear regression, this time making a player’s age a variable along with his pick number. If we do so, here is the formula we get:

Expected Return = 21.30 – (1.17 * Age) + (11.14/SQRT(PK))

First off, the p-value for the age variable is very low at just .0063. This means that there is less than a one percent chance that we would get data like this if there weren’t an actual correlation between age and expected return. This is a statistically significant result.

Secondly, we can now estimate to what degree teams should be drafting younger players higher than they already are. If Player A is exactly one year younger than Player B, and they were both selected with the same pick in the draft, Player A should be expected to return an additional 1.17 Discounted WARP over his career. Because the value of draft picks does not go down in a linear fashion, we can’t say that one year of age is worth exactly X number of picks in the draft—X changes depending on where you are in the draft.

We can say that, using the above formula, 1.17 Discounted WARP is roughly the difference between the expected values of picks #24 and #100. In other words, a 17-year-old player drafted #100 overall has as much expected value as an 18-year-old drafted #24. If a player who might look like a third-round pick on talent alone happens to be a full year younger than his draft class, he ought to be considered a late-first-round pick.

That is a massive, massive impact. One year of age is the difference in the expected value of pick #25 and pick #11. It’s bigger than the difference between pick #5 and pick #8. And remember, this is even after adjusting for the fact that teams—at least some teams—may already be taking age into consideration and drafting younger players earlier than they would otherwise. They clearly don’t take age into account enough.

Even a six-month difference is meaningful. The difference in value between a player born in, say, October and in April is the difference in value between the #100 pick and the #43 pick, or the difference between the #30 pick and the #18 pick.
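To make the pick comparisons above easy to reproduce, here is a small Python sketch built on the regression formula. It assumes age is measured in years; the helper `equivalent_pick` is a name introduced here for illustration. It solves for the draft slot at which an older player has the same expected return as a younger player taken at a given pick.

```python
import math

# Coefficients from the 1965-1996 regression quoted in the article.
INTERCEPT = 21.30
AGE_COEF = 1.17    # Discounted WARP lost per additional year of age
PICK_COEF = 11.14  # multiplies 1 / sqrt(pick number)


def expected_return(age: float, pick: int) -> float:
    """Expected Discounted WARP for a player of a given age (years) and pick number."""
    return INTERCEPT - AGE_COEF * age + PICK_COEF / math.sqrt(pick)


def equivalent_pick(pick: int, years_younger: float) -> float:
    """Pick at which a player `years_younger` years OLDER matches the expected
    return of the younger player taken at `pick`."""
    # The older player must make up the age penalty through a higher draft slot.
    target = PICK_COEF / math.sqrt(pick) + AGE_COEF * years_younger
    return (PICK_COEF / target) ** 2


print(round(equivalent_pick(100, 1.0)))  # one year younger at #100
print(round(equivalent_pick(100, 0.5)))  # six months younger at #100
print(round(equivalent_pick(30, 0.5)))   # six months younger at #30
```

Because the pick term falls off with the square root of the pick number, the same age gap is worth far more draft slots late in the draft than early, which is why no single "one year equals X picks" figure exists.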

It’s hard to overstate the importance of this. I can’t say that major league teams have ignored age completely when drafting players, but age has clearly been subordinate to present talent, and this study argues strongly that this has been a mistake. If Player A grades out slightly better than Player B, but Player B is 6 or 12 months younger than Player A, teams have been drafting Player A first, and they should have been drafting Player B.

As this data set ends with the 1996 draft, it is quite possible that the edge towards younger players has diminished if some teams have privately done their own research and realized the bonanza to be had in younger high school hitters. In order to study whether this was true or not, I performed an abbreviated study of high school hitters drafted from 1997 to 2003.

For this seven-year span, I calculated Discounted WARP in the same way as above, with the exception that I only looked at the first eight years after the draft (this way, even players drafted in 2003 had a full eight years of data through 2011). This is an incomplete measure of a player’s value, since we’re cutting off every player’s contribution after the age of 27, but it’s the best we can do at this point.

As I did with the data set from 1965–1996, I used linear regression to come up with a formula to estimate a player’s DW based on his pick number. That formula was:

XP = (6.39/SQRT(PK)) + .04
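As a rough illustration of how that formula feeds the group comparison, the Python sketch below computes the expected value for a single pick and the return for a group of players. The function names and the two-player example are hypothetical; only the coefficients come from the formula above.

```python
import math


def expected_value(pick: int) -> float:
    """Expected eight-year Discounted WARP for a pick, per the 1997-2003 formula."""
    return 6.39 / math.sqrt(pick) + 0.04


def group_return(picks: list[int], actual_dw: list[float]) -> float:
    """Combined actual DW of a group relative to its combined expected value."""
    total_xp = sum(expected_value(p) for p in picks)
    total_dw = sum(actual_dw)
    return total_dw / total_xp - 1.0


# Hypothetical two-player group: a #1 pick who produced 8.0 DW
# and a #100 pick who never reached the majors.
example = group_return([1, 100], [8.0, 0.0])
print(f"{example:+.1%}")
```

Summing expected and actual value across the whole group before dividing, rather than averaging each player's individual return, keeps the many players with zero DW from distorting the result.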

I then grouped the 176 players in this study into five groups by age, from the youngest 20 percent (those who were younger than 18 years, 15 days old) to the oldest 20 percent (those who were at least 18 years, 263 days old). Here are the results:

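| Group | # Players | XP | DW | Return |
|---|---|---|---|---|
| Very Young | 35 | 49.10 | 64.55 | 31.47% |
| Young | 35 | 56.48 | 69.32 | 22.73% |
| Average | 35 | 40.32 | 50.22 | 24.55% |
| Old | 36 | 41.00 | 25.71 | -37.29% |
| Very Old | 35 | 40.09 | 17.19 | -57.12% |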

According to the data, it appears that the importance of a draft pick’s age has, in fact, changed over time… but not in the direction you’d expect: the advantage enjoyed by young players increased dramatically from 1997 to 2003. The average return from the youngest 20 percent of draft picks during this span was more than triple the return of the oldest 20 percent.

If those numbers are hard to wrap your mind around, let’s go back to looking at anecdotes. From 1997 to 2003, 22 high school hitters drafted in the Top 100 were at least 18 years, 293 days old. Just two of them reached the majors: Sergio Santos, who only made it after he converted from shortstop to reliever, and Jorge Padilla, who got 25 below-replacement-level at-bats for the Nationals in 2009, when he was 29 years old. None of the other 20 players sniffed the majors, including a #4 overall pick (Corey Myers).

Meanwhile, among the 22 youngest high school hitters drafted in that span were Daric Barton, Carl Crawford, Grady Sizemore, Adam Jones, and Brandon Phillips, none of whom were drafted in the top 25 picks. Crawford was taken #52, Phillips #57, and Sizemore (who, granted, got \$2 million to sign) #75.

Here’s the data from 1997 to 2003 expressed in chart form:

The yellow dashed line is the best-fit line for all the players in the study. The median age of the players in the study was about 18.4 years old, so the players were split into two groups: those younger than the median (represented by green squares) and those older than the median (represented by blue x’s).

You don’t need linear regression to see that the green squares are floating to the top of the graph, while the blue x’s tend to hug the zero line. The best-fit green line for the younger players is dramatically higher than the best-fit (blue) line for the older players. (The blue x at the top of the chart, by the way, is David Wright, who at 18.45 years old was barely above the median age.)

Much like I did with the data from 1965 to 1996, I performed a linear regression for the 1997 to 2003 data that included a player’s draft status and his age as variables. The formula I got was this:

Expected Return = 19.96 – (1.08 * Age) + (5.97/SQRT(PK))

What you’ll notice is that the coefficient for a player’s draft pick number (5.97) is much lower than it was in the study from 1965 to 1996 (11.14). This isn’t surprising, because in the more recent data set, we’re only looking at how they played in the first eight years after they were drafted instead of the first 15, so their expected return should be lower.

But by comparison, the coefficient for a player’s age (1.08) is hardly changed from the previous formula (1.17). What that means is that, relative to where the player was drafted, his age had a significantly greater impact from 1997-2003 than it did from 1965-1996.

The data from 1965 to 1996 suggested that a player drafted #100 overall could be expected to perform as well as a player one year older who was drafted #24 overall. But from 1997 to 2003, the impact of age was so great that the 17-year-old player drafted #100 was as valuable as the 18-year-old player drafted #13 overall.
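The #13 figure can be checked directly from the 1997 to 2003 formula. For the 18-year-old to match the 17-year-old taken at #100, his pick term has to make up the 1.08 age penalty:

$$
\frac{5.97}{\sqrt{p}} = \frac{5.97}{\sqrt{100}} + 1.08 = 0.597 + 1.08 = 1.677
\quad\Longrightarrow\quad
p = \left(\frac{5.97}{1.677}\right)^{2} \approx 12.7
$$

The same comparison shows why age matters more in the later data: the ratio of the age coefficient to the pick coefficient rises from $1.17 / 11.14 \approx 0.105$ in the 1965 to 1996 formula to $1.08 / 5.97 \approx 0.181$ here.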

The conclusion is clear: at least as recently as 2003, the baseball industry as a whole massively underrated the importance of age in drafting high school hitters and massively undervalued high school hitters who still needed their parents’ permission to sign their contract. While we simply don’t have enough data to evaluate more recent drafts, Mike Trout and Jason Heyward are two powerful data points in support of the notion that the advantage towards younger high school hitters in the draft is still there, and teams ignore it at their own peril.

Additional studies are needed to determine whether a similar edge towards younger players exists with pitchers or at the college level; if it does, it is almost certainly a smaller one. But even a smaller edge is worth exploiting. There are fewer and fewer market inefficiencies remaining in the post-Moneyball era, and they usually require a hell of a lot more research than simply finding out a player’s date of birth. Implementing this evidence into an organization’s draft preparation is free and painless and ought to have a significant impact on where players are selected.

In the 2011 draft, we had the rare circumstance where two highly-touted high school players, drafted close together, were widely separated by date of birth. With the #5 overall pick, the Kansas City Royals took the first high school hitter in the draft, world-class tools goof Bubba Starling. Three picks later, the Cleveland Indians took the second high school hitter off the board, Florida shortstop Francisco Lindor.

Dozens of articles were written on Starling and Lindor leading up to the draft, but to the best of my knowledge, not one of them made mention of this simple fact: while Starling was born on August 3, 1992 (he actually turned 19 before the signing deadline), Lindor was born on November 14—November 14, 1993. Lindor is more than 15 months younger than Starling and will be younger at the end of next season than Starling was on the day he was drafted.

There are many reasons to think that Starling, despite his advanced age, will meet the formidable expectations placed on him. And speaking as a Royals fan, I hope he does. But if these numbers are even close to being correct, the younger player should have been drafted first. It wouldn’t be the first time.

An expanded version of this article will appear in the forthcoming book Extra Innings: More Baseball Between the Numbers from Baseball Prospectus.

ootpbb
10/14
Bryce Harper fits the "Very young" category when he was drafted last year. Coupled with his 'historical' talent, it sure looks like he has the chance for a very productive career.
philly604
10/14
Very interesting study and something that I've noticed anecdotally over the years, but let me pick out two statements that seem quite contradictory to me.

"Secondly, we can now estimate to what degree teams should be drafting younger players higher than they already are. If Player A is exactly one year younger than Player B, and they were both selected with the same pick in the draft, Player A should be expected to return an additional 1.17 Discounted WARP over his career."

"That is a massive, massive impact."

OK, and also this one:

"It’s hard to overstate the importance of this."

The latter two statements are based on what seem to be very large and therefore important difference in percentage above and beyond expected draft pick return. That method, imo, has significant disadvantages in that the expected returns are so low that very small differences lead to dramatic changes in return on investment, but on the playing field those differences are actually quite small.

And by quite small, I mean like 1.17 WARP over a career. Now I know someone will say at ~5M/WARP that's over 5M dollars and that's not small at all.

And that's fine, but only if we believe that WARP (or WAR or whatever) precisely measures player value down to two decimal points.

If I told you you had to pick between two players' careers and over the course of those careers they were separated by 1.17 WARP would you still say that the differences were "massive" and "could not be understated"?

I don't think so. Whether we're at the utility infielder level (4 vs 5.17 WARP) or the star level (60 vs 61.17 WARP), there just isn't that much difference. Those differences are not beyond the noise in the measurements themselves. It would be quite likely that BP, FanGraphs and B-Ref would be in disagreement as to who had the better career simply based on their individual implementation of their comprehensive win value stat.

Interesting study and I do think there is something real here, but the unfortunate dependence on percentages of very small expected production has led to small on field differences being magnified.
BurrRutledge
10/14
A couple thoughts:

I think you missed a point about the discounting of the WARP in this analysis. A 1.17 Discounted WARP is different than 1.17 WARP. Though I agree with you that this appears to be a small difference over a 15 year career. Not sure what this would mean to actual WARP over that timeframe, but it's more (much more) than 1.17.

For the sake of the analogy below, let's assume that the actual difference in the players WARP might be 2.4 (pulled this out of my hat, but go with me for a moment).

Let's say that knowledge of this inefficiency had impacted a team's draft several years ago, without other teams catching on to their draft strategy. Let's also posit that at any given time today, 10 of the players on their roster today had been drafted by the organization. Which specific ten players are on the roster may change over time as players leave for free agency, suffer injuries, etc.. However, over the next fifteen years, and all else being equal, the aggregate of the 10 drafted players would produce 240 more wins than the competition.

That's an average advantage of 16 wins a year. I think that could be important at an organizational level.

Nice job, Rany. I love learning something new about baseball.

BurrRutledge
10/14
Of course, I just magnified the effect by 10 in my caffeine-deprived state. A 2.4 actual WARP x 10 players over 15 years would naturally translate to 1.6 win difference each year. While not as bowl-you-over important as what I wrote above, that's still an important impact.

At the value of \$5m per win, that's \$8m value each year. That's a big chunk of a team's payroll.
belewfripp
10/14
Awesome set of studies, Rany - really great stuff and a fascinating read.

To philly - if I understand this all correctly, the discounted WARP applies to each year of the respective players' careers. As Rany stated in Part 1, the gap closes a little bit each year, but the point is that, over the 15-year career that Rany is using as his framework, the younger player is enjoying a discounted WARP advantage each year.

That means the difference in quality - and return on investment - for the younger player is substantially greater than just 1.17 discounted WARP.
SaberTJ
10/14
More amazing stuff, Rany. I am now even more pumped my Indians drafted Lindor!
ScottBehson
10/14
It strikes me that, since most of the value we are attributing to younger players is a result of a handful of them having MASSIVE careers (Griffey, etc), that statistical methods comparing the means among age groupings may not be the best way to quantify this. Perhaps some logistic regression or some other transformation would smooth out the data (or also conducting some outlier analyses).

That being said, this study is very well-done, and has some important implications. I fear that some implications may be over-stated based on the nature of the data-set (most cases are near zero, with several considerably above the median).
ccseverson
10/16
A concern of mine as well since Griffey and ARod couldn't have been drafted higher than #1 anyway so it's somewhat irrelevant how much they've exceeded expectations for an aggregate study.
batts40
10/14
(Scurries to find Javier Baez's DOB)

Dang it!
tbunns
10/14
It's possible that I'm missing something...
...but the last table showing 1997 to 2003 high school players proves that the league has caught up to the idea of drafting high school batters for age. By your percentages, it looks like the XP and DW data were switched. You probably want to correct that.
ranyj
10/14
Yeah, you're right - the table headings are reversed for XP and DW. Thanks for pointing that out - I'll see if we can get that corrected.
bornyank1
10/14
Switched.
gaughan
10/14
The Lindor/Starling bit will be interesting to see play out. But I wonder if we'll have another neat little case study in the four CFs drafted late 2nd round this year....

#75 Rays, Granden Goetzman DOB: 11/14/92
#79 Cards, Charlie Tilson DOB: 12/2/92
#81 Redsox, Williams Jerez DOB: 5/16/92
#84 Reds, Gabriel Rosa DOB: 7/2/93

I know, I know. Sample size, US vs. PR schools, etc. But Rany's work here shines a new light on viewing drafts and creates another level of interest in how the drafts unfold. Thanks Rany.
kringent
10/14
I love this set of articles - some of my favorite pieces ever on BP, easily.

And while I know this theory can't work every time, I can't help thinking of a fairly recent scenario where it doesn't hold up well.

In 2007, the 2nd and 3rd picks were Mike Moustakas and Josh Vitters, respectively - both high ceiling prep 3B. With some differences, yes, but as close to an apples to apples comparison as we're likely to get. Yet while Vitters was Very Young, Moustakas was Very Old. And I can't imagine there's any team that would prefer Vitters to Mous at this point - acknowledging that there's still some non-zero chance for Vitters to break out.

Offered in the spirit of constructive dialogue.
ranyj
10/14
Funny you should mention that - the whole Moustakas/Vitters choice was the seed for this hypothesis in my head 4 years ago. The Royals were going to take Vitters until the morning of the draft, and I thought it was strange that in the public discussion of which player, the fact that Vitters was almost exactly one year younger than Moustakas never came up.

Obviously, it appears Moustakas was the better choice - although Vitters is still young enough to have a say in that. But again, this is a general principle, and there will always be specific exceptions.

kringent
10/14
Thanks for the response, and thanks again for the great series. Looking forward to whatever future installments you have up your sleeve, whether here or elsewhere.
timber
10/14
Curious question for the Royals fan in you, Rany: If all this leads you to concerns about Bubba Starling, how does it make you feel about Eric Hosmer, who was also Very Old for his draft class?
kantsipr
10/14
Help me out here, because I'm not sure I understand all the statistical interactions. First, in part 1, you wrote, "I also “zeroed out” any seasons in which a player generated negative WARP. Given that most draft picks don’t reach the major leagues at all, it would be misleading to penalize a player who was good enough to reach the majors for having a negative-WARP season, relative to a player who might never have gotten out of rookie ball." This seems to imply that anyone who didn't make the majors just got zero WARP? If I'm following what you did correctly, the conclusion of the study is more that players who were young relative to their peers and who made it to the majors performed better than those peers once they started to be productive. That's not quite as strong a statement. Did they take longer to adjust once they made it to the majors? It seems like this produces something of a systematic bias.

I'm also not sure I follow the rationale for the discount factor selected. It seems to me that either it should be evaluated for the player's entire career or for the time the player remains under club control, since that is what the value of the draft pick is.

As I said, I may just not be following it completely.
IvanGrushenko
10/14
It amuses me that an 18 year old can be called "very old". Oh, and this is the most awesome article I've seen here in years.
TheRedsMan
10/14
Should we be asking how appropriate it is to use the average value approach here? That is to say, it seems important to have a good understanding of whether "very young" players tend to be better than expected across the board or if there is simply a slightly more frequent occurrence of "hitting a home run" the young you go.

The average may be a smooth progression, but it could be that the expectations generally hold true for 95% of the population and it's merely the types of outliers in each group that move the average.
ranyj
10/14
I think that's a very valid point, and a very real possibility that it's the rare outlier who is moving the needle here.

But is that a bad thing? As Kevin Goldstein likes to remind us, teams aren't drafting for role players; they're drafting for stars. If the underlying reason behind these findings is simply that a younger player has a 5% higher chance of becoming a star player, that's reason enough to draft him, I think.
tbwhite
10/14
But aren't you assuming that teams draft to maximize the value of each pick, and not the overall value of their draft ? And I suspect that teams may well be optimizing the overall value of their draft, and not the value of each pick.

What I mean is that perhaps older players while having less average value also exhibit less variance in their value, they are more predictable. If younger players are more boom or bust, teams might fear completely missing on a Top 100 pick, and prefer high floor, lower ceiling players in the early rounds. Then in the later rounds when the cost of completely missing on a guy isn't as high, they go for the younger, boom or bust types.

You can't really judge the quality of returns without an assessment of the risk incurred. So, yes younger players offer higher returns on average, but perhaps it is simply because they are riskier, in which case there might not really be a market inefficiency. There's only a market inefficiency when you can show that you can generate better returns without taking on extra risk.
crperry13
10/15
HAH. I caught you in a gross mistake. I think it's pretty clear the Astros are drafting for role players. So eat crow, Mr. Jazayerli.

Great series, by the way. Am enjoying the read.
cjrhgarmon
10/14
I agree. This is my big pet peeve with most sabrmetrics. There is too much emphasis on average results and not enough on the distribution of results. For instance, it is common knowledge that junk bonds and penny stocks earn more on average than T-bills and blue chips. That doesn't mean there's a massive inefficiency that hedge funds have yet to exploit. You can't just look at average return. Risk also matters a lot! How risky are the very young HS draft picks relative to the older HS draft picks?

On a related point, I think there might be a sample-selection problem here as well. The first article stated that 10% of the draft picks were discarded because there was no DOB info for them. The hypothesis was that they flamed out too soon, so their careers didn't progress to the point where someone might care to note their DOB. Isn't it possible that the very young draft picks are more likely to flame out than the older picks, so that the part of the sample that was discarded was disproportionately young? If that's true, it would create a positive bias on the average return estimates for the young (i.e., make the average return for the young look larger than it really is).
ScottBehson
10/14
Thank you, you better explained what I was stating in a comment about 15 comments above this.
andygamer
10/14
It will be interesting how soon the drafting will correct itself to minimize this inefficiency, in much the same manner as College player inefficiency has been apparently corrected.
garrioch13
10/14
Great stuff, Rany. I wonder what the result would be if you would only use the data that doesn't include the high end outliers that are two deviations higher than average. How about if you would use signing bonus compared to overall pick? Now I will have numbers spinning in my head all day.

I think there needs to be a qualification to usage of this data and you hit on it:
I can’t say that major league teams have ignored age completely when drafting players, but age has clearly been subordinate to present talent, and this study argues strongly that this has been a mistake.

Present talent. If a player's talent is requisite of his age, then this is not an issue. Is Bubba Starling's talent level equal to what it should be considering he is older than his competition? The fact that he is raw at the plate and already an age that some players are in full season baseball has to make you question his future success.

There are so many angles this kind of study could expand into but the fact that there is statistical proof that confirms that age relative to competition is very important is huge.

Thank you.
ScottBehson
10/14

If a younger-for-their-year player is good enough to be drafted, overcoming the bias towards older players in terms of quality/quantity of instruction received to that point, they will tend to perform better once the quality of development and instruction becomes equal for all.
10/14
Awesome study. The results make sense on an intuitive level if you think about it like using this scenario: a MLB team really loves two similarly talented players in big high schools in SoCal: Player A born July 1, 1993 and Player B born July 1, 1992 are both drafted in 2011. Player B, being a 18-year-old Senior, has an additional year of development and puts up monster offensive numbers as a HS OF (let's say a slash line of 14/50/.525). Player A is a 17-year-old Senior and puts up less than monster offensive numbers (8/40/.450) and plays a similar defensive OF as Player B with similar peripheral skills. Scout goes to CrossChecker who goes to Scouting Director who goes to GM and says, we can get Player B for \$1M in 1st 50 picks or we can get Player A for the same. Which one do you choose if it could mean your job? Human nature would indicate it's more likely that the safer pick goes first, if for no other reason than to be able to justify it later. Most teams take Player B first and hope Player A is there next round - which drives the study you've created. I think it will take time before teams take Player A before Player B (and paying him more). I see them paying more attention to Player A but not to the point of reversing the order - yet. We'll see.
kdringg
10/14
I agree with most of the other comments - great series! Nice work all around. I can't wait to see this with pitchers as I have been a huge Kershaw fan and have enjoyed following his career versus some guys who came in with him that were college starters. I've never been able to buy the whole "college players are better than HS players" debate.
mwashuc06
10/14
A good HS example is Taijuan Walker. He was drafted at 17, and he is the same age as many of the top pitching prospects out of HS from the 2011 draft.
NYYanks826
10/15
Rany, I know I'm a bit late to the party here, but I just want to say how fantastic these articles have been.

With all of the studies that have been done on baseball, especially in recent years, the fact that you have unearthed a market inefficiency this significant speaks volumes about how much thought and work you have put into this study.

I can't wait to read more on this subject, and see how the post-2003 draft picks pan out.
10/15
Would be interesting to see if this age effect exists at all among college draftees.
Dodger300
10/16
One important point was never addressed in the article, which is that lots of guys will never have an opportunity to be a young draft pick.

If you are born in October and start school at age 5, you are by definition going to HAVE to be an old draft pick. But that certainly wouldn't mean that no one born in October can ever be a great ballplayer.

I remember reading years ago in one of Bill James' Baseball Abstracts that the great ones make it to the majors when they are young. They might not set the league on fire at first, but they make it there young.

Mantle, Yount, Bonds, Griffey, etc.

So Hosmer may have been an "old" draft pick, but he made it young to the big leagues, and acquitted himself quite well. My hunch is that the latter will be much more predictive of his career path than the former.
ranyj
10/17
I want to address this comment because I think a lot of people are thinking along these lines, and I want to clear up any misconceptions.

The problem with players born in October is NOT that they are old draft picks. Actually, let me rephrase that: there is NO PROBLEM with players born in October.

The problem is that teams are drafting players born in October TOO EARLY. The problem is that they see a player born in October, and a player born the following May, and they don't account for the fact that the October player has 7 additional months of physical maturity.

The problem isn't that a player born in October can't be any good, or can't even be worth a #1 overall pick. Eric Hosmer is an October player, and obviously that pick looks great right now. The problem is that teams are drafting TOO MANY October players, and not enough May players, in the early rounds of the draft.

The age at which a player reaches the major leagues is more predictive than the age at which he is drafted? Of course. But on draft day, teams don't have the luxury of knowing which high school players are going to make the majors within 3 years. Looking at a player's date of birth is another tool teams should use to maximize their chances of finding those players.
Dodger300
10/19
Thanks for the clarification, Rany. I find it helpful when you state it that way.
10/17
As a teacher/coach for 30 years, I can say that I find the kids who are YOUNGER and were not held back are better students and often better baseball players than those who were held back. I have no data, but it is an observation made over 30 yrs of teaching. I think it is because the younger child must work harder at the beginning to keep up, and this becomes a habit where the more mature child does not have to work as hard early and this too becomes a habit.
Schere
10/17
Wow, Rany...wow! This is great stuff, and the effect is so large that these are likely to be merely quibbles. You've got the interns, though, so I'll ask you:

Risk - Overall, it seems to me that you're lacking a risk measurement (variance comes to mind). Are the younger players more varied in their output? It doesn't seem so, but this would be important.

Cost - Does it cost more to sign & develop these guys? Do fewer of them sign?

Draft position as a measure of consensus value estimate - We all know that some kids fall in the draft because their demands are known to be extravagant, or whatever. Possibly the young sample is over-weight on guys who fall due to signability and/or get way above-slot bonuses? I guess that ties back into the cost question, above. And maybe I should read your 2005 article.

Is capping the downside at zero unrealistic? Probably, but given the scant data you'd have to work with in terms of the resources spent on a player before he's out of baseball, I don't know what else you'd do here, exactly...but the absence of any cost/risk in the calculation could be distorting.

Discount rate...I think 8 is probably too high, but on the other hand you should probably cut the analysis off before year 15, when player salaries are set at market rates after 8-10? years. There may be great production in year 12, but you're likely to have paid through the nose for it.

Super article.
jnossal
10/25
Anybody remember when BP touted polished college players and mocked the scouts who preferred to go with high-risk high school hitters?