Prospectus Idol Entry: Top Draft Picks: Where They Came From and Where They’re Going

June 7, 2009

What do you know about what the baseball draft produces? Most fans probably have some sense of the history of their favorite team’s high profile draft picks. For instance, I’m a Phillies fan, and most Phillies fans know that many of their current stars, such as Cole Hamels, Chase Utley, and Jimmy Rollins, are the result of the team’s drafts. They also probably know enough to lament draft picks such as Jeff Jackson (picked ahead of Frank Thomas in 1989) and Ryne Sandberg (drafted in 1978 but then traded for Ivan DeJesus after only 13 games with the Phils). Fans might also know some random tidbit about other teams, such as the Mariners’ successes with Ken Griffey Jr. and Alex Rodriguez or the Dodgers’ drafting Mike Piazza 1,390th in the 1988 draft and getting a Hall of Famer. But beyond that, do you know much about what the draft as a whole produces?

Because I fall into the same boat described above, I have been planning on researching the draft and the minor leagues for some time. Given that this week’s topic is “Below the Big Leagues”, now is as good a time as any to start.

Without much readily available data, I researched the first two rounds of the last 31 amateur player drafts, compiled attributes that I felt were important, and then asked a series of questions that I found interesting.

Like the start of any good research project, this initial analysis will create more questions than answers, but will serve to jumpstart future work. What’s below is a discussion of some basic trends I’ve learned about the players drafted in the first two rounds since 1978.

WHO MAKES THE MAJORS?

The first question I wanted to ask was how frequently highly drafted players actually make it to the majors. The majority of amateur players drafted will never set foot in a major league dugout. However, aren’t first and second round picks different?

In total, 51% of first and second round picks make the majors. Consider the graph below. The blue line represents the percentage of first round draft picks each year that eventually made the majors (played at least one game), and the pink line represents the same for second round picks.

Two things are pretty clear to me from this data. One is that first round selections do make the majors more than second round selections. It would be fair to say that to some extent teams do know what they are doing with these draft picks!

The other thing that is clear is that fewer and fewer first round draft picks ever end up making the major leagues. Does that mean that teams are getting worse at identifying talent? I don’t think so. Instead, the number of international players has grown rapidly during this time frame, resulting in more major league spots being taken up by non-drafted players. Interestingly, the trend is only downward for first round picks and not for second round picks. It would be worth checking how this pattern holds for later rounds.

WHO IS SUCCESSFUL IN THE MAJORS?

Next I went through the number of picks who achieved some level of success in the majors, rather than simply debuting at some point and fading away. I set up a pretty minimal definition of success: accumulating a WARP3 of 10.0 in their careers. Think Travis Lee (10.1) and Bruce Chen (9.1), players who got some regular playing time over an extended period. I went through each set of five draft picks (picks 1-5, 6-10, etc.) and tabulated how many made the majors and how many had a WARP3 of 10.0 or above. Here is that chart:


Picks:  Made MLB  WARP3>10
 1-5      122        62
 6-10     105        43
11-15      88        31
16-20      97        34
21-25      97        25
26-30      75        15
31-35      86        20
36-40      62        17
41-45      64         7
46-50      71        18
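
To make the buckets easier to compare, the counts can be turned into rates. The following is a minimal Python sketch, not part of the original research. It copies the counts from the table, and it assumes (this is not stated in the table) that every bucket holds 5 picks in each of the 31 drafts, or 155 drafted players per bucket. The names `BUCKETS`, `DRAFTED_PER_BUCKET`, and `rate` are made up for illustration.

```python
# Counts copied from the table: (pick range, made MLB, career WARP3 of 10+).
BUCKETS = [
    ("1-5", 122, 62),
    ("6-10", 105, 43),
    ("11-15", 88, 31),
    ("16-20", 97, 34),
    ("21-25", 97, 25),
    ("26-30", 75, 15),
    ("31-35", 86, 20),
    ("36-40", 62, 17),
    ("41-45", 64, 7),
    ("46-50", 71, 18),
]

# ASSUMPTION: 5 picks per bucket in each of the 31 drafts (1978-2008).
DRAFTED_PER_BUCKET = 5 * 31


def rate(part, whole):
    """Return part / whole as a fraction between 0 and 1."""
    return part / whole


if __name__ == "__main__":
    print("picks  reached MLB  WARP3>=10 (of drafted)  WARP3>=10 (of MLB)")
    for picks, made, good in BUCKETS:
        print(
            f"{picks:>5}  "
            f"{rate(made, DRAFTED_PER_BUCKET):11.1%}  "
            f"{rate(good, DRAFTED_PER_BUCKET):22.1%}  "
            f"{rate(good, made):18.1%}"
        )
```

The last column (successes divided by players who reached the majors) needs no assumption about bucket size, so it is the safer one to lean on.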

It does seem like teams are pretty good at identifying the top players in the draft. In fact, 28 of the last 31 first overall picks have made the majors, and 19 have WARP3s of at least 10.0. After that, while players with better chances seem to be drafted earlier, there is not a huge difference between getting the 11th pick and getting the 25th.

DID MONEYBALL HAVE AN EFFECT ON THE DRAFT?

Few books have changed baseball the way that Michael Lewis’ “Moneyball” changed the game in 2003. In this book, Lewis documents the success of the Oakland A’s, as they used economic theory to win with a small payroll. It has been well documented that teams put more value on OBP in a post-Moneyball world. However, another thing that Lewis discussed in detail was that in the draft, the A’s targeted college players more than high school players, since college players had a higher chance of making the majors. What I found most interesting of all of my early results is the following: since 2003, there has been a sharp drop in the rate at which high school players are drafted in the first two rounds.

From 1978-2002, teams drafted players out of high school in 51.3% of their first two rounds of picks. Suddenly, in 2003, only 40.3% of players drafted in the first two rounds came out of high school, and from 2003-2008, only 41.5% of players drafted in the first two rounds came directly from high school.

There is no way to prove that this change was because of Moneyball. However, statisticians approach this problem by doing what is called a t-test. A t-test asks the following question: what is the probability that we would see such a significant change if this was nothing other than good old randomness? The answer to this question here? 0.003%! The chart below plots the percentage of players drafted in the first two rounds that were high school students in each set of six years from 1979-1984 up to 2003-2008.

If something did change in MLB front offices as a result of Moneyball, there was good reason: from 1978 to 2008, 59% of players drafted in the first two rounds out of college made the majors, and only 42% of high school players.

The counter-argument to the Moneyball idea of drafting college players is that high school players have higher upside and are more likely to succeed while in the majors. But research shows that’s not true for the first two rounds. Of the college players drafted in the first two rounds, 17% of them reached a career WARP3 of 10.0, and only 12% of high school players did. It seems pretty clear that the A’s must have known what they were doing.

HOW OFTEN DO TEAMS TRADE AWAY MINOR LEAGUE SUPERSTARS-TO-BE?

In 2008, Phillies pitcher Joe Blanton became the second pitcher since 1977 who was traded midseason and won a World Series game. (Can anybody name the first?) For all of the rampant trade rumors and discussions about making a move to get over the hump, it seems like the ultimate goal of those moves is rarely achieved. Of course, the Blanton example is a limited one, but it paints a pretty clear picture of how rarely midseason moves work out for contenders.

One might take that to mean that contending teams are recklessly trading star prospects midseason for supposed difference makers who never materialize. Trades like the Mets’ famous surrender of Scott Kazmir for Victor Zambrano stick out in our memories, but I decided to take a methodical look at how often teams actually made this mistake. It turns out that it is nearly as rare as those difference makers making a difference.

Of the 2052 players in the study, 1041 of them made the majors. Of those, only 109 players were traded and then debuted with a different team than the one that had drafted them. Of that group of 109, only 19 accumulated a WARP3 of 10.0 in their careers. As it turns out, for all the fans who scream at GMs for trading away the farm system, rarely do the GMs trade away impact prospects.
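
For readers who want the rates behind those counts, here is a small Python sketch. It only divides the four numbers quoted in the paragraph above; the constant and function names are invented for illustration.

```python
# Counts quoted in the paragraph above.
DRAFTED = 2052                 # players in the study
MADE_MLB = 1041                # reached the majors
DEBUTED_ELSEWHERE = 109        # traded, then debuted with another team
DEBUTED_ELSEWHERE_WARP10 = 19  # of those, career WARP3 of 10.0 or more


def share(part, whole):
    """Return part / whole as a fraction between 0 and 1."""
    return part / whole


if __name__ == "__main__":
    print(f"Reached the majors:          {share(MADE_MLB, DRAFTED):.1%}")
    print(f"Debuted for another team:    {share(DEBUTED_ELSEWHERE, MADE_MLB):.1%}")
    print(f"...and reached WARP3 of 10:  "
          f"{share(DEBUTED_ELSEWHERE_WARP10, DEBUTED_ELSEWHERE):.1%}")
```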

FURTHER THOUGHTS

This is obviously just a starting point. It takes a long time to seriously look at a topic such as this (and even longer to actually gather data!), but this is a good start. The interplay between the emergence of international stars and the success of early draft picks is certainly worth looking into further, and I will continue to study this. Looking at subsequent picks after the first two rounds will be interesting as well. Furthermore, it will be interesting to see if teams are more successful at drafting high schoolers now that they are more selective in doing so. That will take some time to study, but the information will eventually be available as the recent crop of high school draftees grows up. Naturally, the value of a trade of a star for a prospect is one of the hottest issues in the majors today. It is not useful to look at anecdotes to condemn or exalt a trade, but looking at this type of information more clearly will help determine where the trade market goes.


Matt Swartz


wcarroll

6/07

Once again, Matt really writes well and does great research. He's on topic, clear, and is the class of the field.

Except that a couple of his conclusions seemed a bit off. I immediately asked Rany Jazayerli, whose seminal work on the draft just a few years ago seemed to contradict this in places. Rany said, "I wouldn't say he contradicts my series - the main finding of my work was that the high school/college pendulum swung the other way from 1992 to 1999, but since he's looking at the data all the way back to 1978, his conclusions that college players make for better draft picks are valid."

I wonder whether Matt has seen Rany's pieces and whether he really re-invented the wheel here. Part of that is the fault of BP and the sabermetric community at large, in really not having one central comparative database of the work out there. Regardless, this is another solid piece and Matt's not letting off the gas after lapping the field last week.


kgoldstein

6/07

There's a lot of good stuff here, and I do think this is clearly one of the better pieces this week. I do think you should have asked some deeper questions, like, if the A's really set some standard by taking college guys, why are they taking more high school guys now than they did then?

Nonetheless, solid research, really well thought out, and well executed.


ckahrl

6/07

Interesting, to be sure, and as Matt warns up front, he delivers a column that creates add-on questions, and that goes to whether or not he bit off more than he could chew in one column, when I think what we'd all love to see is for him to do something that expands on Rany's work *and* digs deeper into the engaging pastiche of topics that he glides over here.

From among his elements, I especially liked the second topic (the "how good?" bit), and I had a problem with the brief bit on so-called Moneyball-inspired draft strategy, since there isn't any reference to the economic factors that come into play. But then he comes up with a compelling, substantive bit of info in his Blanton-related point about the quality that teams deal away.

In short, while I was disappointed that he went the grab-bag route, it's clear he's got plenty of goodies to dig out and develop. I would love to see more substantive treatments of any of these featurettes.


daiheide

6/07

Swartz is one of the best contestants, I think, and he shows again that he's got a sharp eye for interesting questions. He includes a lot of valuable information, here, and I'm interested in seeing what other conclusions he might draw on the basis of the research he's started here.

However, the piece seems more like a series of notes - like a draft - than a polished piece of B-Pro quality. I realize that that's because he's trying to tackle a huge topic. But he might have focused the piece a bit better.


rocket

6/08

I agree, some good stuff there, but a bit of a scattershot approach. I would have preferred to see you tackle just one of those in a bit more detail, and leave plenty of follow-up questions open.


strupp

6/07

More importantly, now I have to research who the other pitcher was.


drawbb

6/07

Without checking yet, I will guess Jeff Weaver with the 2006 Cardinals.


josh7798

6/07

Wasn't Weaver a FA signing? I'm going to try to guess without looking it up too. I'll guess David Cone with the Yankees in '96?

Oh, by the way, best article I've read so far. As always Matt is one of the best, and is emerging as the favorite, IMO.


drawbb

6/07

Weaver was acquired in a trade for Terry Evans after being DFA by the Angels. Truthfully my first instinct was Cone with the 1992 Blue Jays, but then I remembered all 4 of their wins in that World Series were credited to Key and Ward.


swartzm

6/08

I guess I'll confirm the answer-- it was in fact Jeff Weaver in 2006. It's interesting how there hasn't been a single top of the rotation guy who has been traded and won a WS game.


thesonofhob

6/08

While there hasn't been a single top of the rotation pitcher traded in the middle of the season who then went on to win a WS that season, there are a couple who come to mind who won them the very next season. The aforementioned David Cone was traded to the Yankees during the '95 season, and was instrumental in their World Series title in '96, and of course then again in '98 and '99. Curt Schilling was traded to Arizona in 2000 and was of course extremely instrumental in Arizona's championship in 2001.

So trading for an ace may not get you a World Series this year, but if you expect to keep him around it may not be the worst of ideas. In other words, trading for Jake Peavy could very well be a really good idea, especially if all the Padres are looking for is a package similar to what the White Sox were going to give up.


beeker99

6/08

If Joe Torre had left Denny Neagle in Game 4 to face Mike Piazza in 2000, instead of pulling him 1 out short of 5 innings so that Cone could pitch, then Neagle would be another answer.


jivas21

6/07

The only thing I could think about as I read this was - had he read Rany's work from a few years back? Or other research work on the draft? In my opinion Matt's work thus far in the "Idol" competition has been stellar, but I wonder if he realized that this data set (or comparable sets) has already been analyzed in all of the ways he has initially done here; as noted above, I'd love to have seen him identify an unanswered question and at least attempt to add new information to the conversation.

That said, I haven't read all of the other pieces as of yet, and this may still prove to be one of the best this week.


swartzm

6/07

Yeah, I would love to have had a chance to see Rany's work in advance. I didn't start subscribing to BP until after that series came out, which is a shame because it's right up my alley. I don't know if he could have given me his data set or anything or if it's available online anywhere, but it sounds like that might have helped me ask more questions and drive things further. If nothing else, glancing through the articles, it seems like we came to a lot of similar conclusions.


sde1015

6/08

Out of curiosity, does one not get access to the archives when one subscribes to BP?

I had a similar thought to others when reading this that it seemed similar to Rany's work. You're looking at a different sample and reached different conclusions, so I thought it was a good article, but any follow up would definitely benefit from reading Rany's series if you have access to it.


swartzm

6/08

Yeah, I spent a good part of my afternoon reading through the series, which is terrific. I hadn't seen it before, and obviously that would have helped.


BurrRutledge

6/08

Like others, I was immediately reminded of Rany's work. Without going back over Rany's series, I suspect that your research is not an exact duplicate. That said, it certainly could have informed some of the questions that you asked and researched. Perhaps there's a topic later in the contest that will allow you to revisit and expand on this...


swartzm

6/08

Yeah, it's not an exact duplicate at all, though there is some overlap, especially at the beginning. I highlighted different stuff too. I probably could have expanded the Moneyball section because my whole point in that section was that Moneyball changed drafting strategies, which obviously wasn't something Rany had a large enough sample size to say conclusively in 2005 when he did the study. I spent a lot of time trying to test the assertion that College picks were on average better at reaching the majors and better at succeeding there, which Rany had done. Without spending so much time on that, I could have actually researched the Moneyball adjustment in much more detail. The trading prospects part of the article wasn't in his work either, so that was new too. I opted not to expand that section only because of the word count and the fact that I thought at the time that my other research was original and deserved more words devoted to it.


acmcdowell

6/07

I liked this piece overall although I'm not entirely comfortable with some of the analysis Matt did. When comparing multiple groups to one another, shouldn't he be using an analysis of variance (ANOVA) instead of a t-test? And why break the data into 6 year groups? And finally, although he attempts to address the argument that high school draftees have higher upside, is a WARP of 10 really appropriate? You don't draft a high school player in the first round and call him a "success" if he becomes Bruce Chen or a Travis Lee, you're looking for guys who can put up a 5 or 6 win season.


swartzm

6/07

ANOVA vs t-test: I was comparing two samples, so a t-test is really just a kind of ANOVA and it works here, I believe. That said, I treated the standard error as sqrt(p(1-p)/n) where n was the 473 players drafted in the first two rounds between 2003-2008. Instead, I should have treated it as sqrt(p(1-p)*(1/n1+1/n2)) to account for the 1578 players drafted from 1978-2002. It doesn't really change the results any (t=3.7 vs 4.2 so p=.02% vs p=.003%), but still.

I broke the data into 6-year groups only for the graph. Otherwise, it would have looked too messy due to year-to-year variance.

I recognize that WARP of 10 seems arbitrary, but really, I needed a number that a large enough sample of players achieved, something I could get quickly in the time frame I had to generate the data set, and I made an assumption that is probably reasonable-- if the percentage difference in rate of making the majors at all for college vs high school is the same as the percentage difference in rate of accumulating 10 WARP for college vs high school (which it was: (59%-41%)/41%=44% and (17%-12%)/12%=42%), then the ratios were probably pretty similar for 20 WARP, 30 WARP, etc. With more time, I would have liked to have done this more seriously, as it seems Rany Jazayerli did quite well in his series several years ago.
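
The relative-gap arithmetic in that last paragraph is easy to check. Below is a minimal Python sketch (the function name `relative_gap` is made up here). It uses the rounded percentages quoted in the comment, and it also tries the 42% high school rate given in the article body, since the two figures differ by a point.

```python
def relative_gap(college_rate, high_school_rate):
    """How much higher the college rate is, relative to the high school rate."""
    return (college_rate - high_school_rate) / high_school_rate


if __name__ == "__main__":
    # Rates quoted in the comment (rounded percentages).
    print(f"Reaching the majors: {relative_gap(0.59, 0.41):.1%}")
    print(f"Reaching WARP3 10:   {relative_gap(0.17, 0.12):.1%}")
    # The article body quotes 42% rather than 41% for high school players.
    print(f"Reaching the majors (42% HS): {relative_gap(0.59, 0.42):.1%}")
```

Because the inputs are rounded to whole percentage points, a one-point change moves the result by a few points, so the two gaps should be read as roughly equal rather than precisely matched.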


swartzm

6/07

As a few people mentioned, I did consider going into detail on some of the economic theory behind the A's original plan to draft mostly college guys, and their latest switch to drafting more high school players, but I thought I'd gone to the economic theorist well a few times already in BP Idol and I didn't want to overplay my (invisible) hand.


ckahrl

6/07

Who doesn't like a nicely played Adam Smith joke? Which leads me to my point--speaking only for myself, I'm not complaining when an author plays to his or her strengths effectively, as you have.


Oleoay

6/08

A realist, a hedonist and an economist take the hotel shuttle to a bar downtown...

The realist says, "The shuttle stops running in an hour so I don't want to get too drunk, but we can stay for one beer."

The hedonist says, "Aw heck it's a Friday and I want to party harder, I'll get a beer and a shot."

The economist says, "I'll pay for the first round. You each can get a beer and a shot on me."

After the first round, the realist is so happy he returns the favor by buying the economist another drink and a shot.

The hedonist is so happy he's buying drinks for everyone for the rest of the night.

Finally, the economist is happy because he drinks the rest of the night for free and knows the realist budgeted enough money to pay for a cab ride back to the hotel.


Oleoay

6/10

From the -3 feedback, it looks like a few people didn't like my economist joke... or maybe it just wasn't all that good ;)


llewdor

6/15

I liked it. But I work with dozens of economists.


sweptaway3641

6/07

One of the issues is the composition of the draft class. In any given year, the strengths and weaknesses of the draft class are going to be different. Last year was an exceptionally strong year for college bats, this year is a strong year for college arms. Next year could be a big year for prep bats, or prep arms, or college bats. Some teams will always take a college player, some will almost always take a prep player, some teams will take the best available player.

I think it's tough to say "college players are a better bet" in a vacuum, because it all depends. In some drafts, the 10th best college guy might be rated below the 10th best prep guy. What a player does after he is drafted is exhibit B, but you draft a guy based on what he's done, and then project what you think he'll do, that's exhibit A. Lots of guys have had decorated college careers, only to fall flat in the majors against the consensus opinion, just like lots of prep guys destroyed high school pitching, looked like the next Mickey Mantle, and couldn't make it out of AA. The draft in baseball, more than in any other sport, seems to be a lottery of sorts. The teams with the best scouts and the best owners (i.e., willing to spend) generally seem to get the most out of the draft, but even the best teams make high profile mistakes due to the nature of the beast.

It should also be said that the "Moneyball philosophy" is to target undervalued assets and traits. If teams had valued college players as X prior to the A's 2002 draft, and then valued college players as Y after, it makes sense for the A's to adjust their thinking. It's just like a value investor like Warren Buffett who says the best time to buy stocks is when others are fearful and panicking. Likewise in baseball, maybe the value of prep prospects to the A's changed when others started to value college guys more.


swartzm

6/07

Yeah, I agree with you on all points.

The reason I did do the chart in clumps of six years is specifically because I figured (and you confirm) that some years will be particularly strong in one group, like college arms, but then not the next.

I considered talking about the A's changing their approach in response to the change in valuation of high school draftees (as in the 8th best HS player used to already be drafted by the 20th pick, but now is not and is a better bet than the 13th best college player), but as I mentioned in another comment, I didn't want to overplay the valuation theory stuff after I'd done so much economics in my first few articles.


SkyKing162

6/07

Great stuff, and if Matt's open to sharing the database he compiled, a lot of us would be really excited. I'm definitely looking forward to the second, third, and fourth installments, as he digs deeper into the analysis.

One potential bias I wonder about is organizations' desires to push draft picks to the majors just because of where they were selected. For example, given the same minor league performance, perhaps top ten draft picks are allowed to see the big leagues more often than second round draft picks. Heck, this might not even be a bad bias, as there could be an innate skill in the top ten draft picks that makes them more likely to succeed with the same minor league performance (such as being younger, or a better ability to kick it into another gear against top of the line competition.)


SkyKing162

6/07

I'll add, since this was the last article I read, that I voted for all but two articles this week. Now, that might hurt the top of the line competitors, perhaps, but I'm hoping they're safe, and I really felt this set of articles was the best yet.


swartzm

6/08

I'd certainly be willing to share my excel file, but it doesn't have other rounds yet, and it's kind of a messy and large file, because I was trying to gather information on a bunch of different things in a short time frame and have some information recorded for some years, but not others.

If anybody does want the excel file, I'm happy to share it. I don't want to get spammed, but for those who don't have it, you can go to www.econ.upenn.edu, go to graduate students, and click on my name to get my email address.


molnar

6/07

Being drafted out of high school is a binomial variable. No t-test.


swartzm

6/07

What should I have done? The first sample was 1578 with 51.3% high school students. The second sample was 473 with 41.5% high school students. Total was 49.0% high school students. In the article I just did (.513-.415)/sqrt(.513*(1-.513)/473) but in another comment I did (.513-.415)/sqrt(.490*(1-.490)*(1/473+1/1578)). Either way, it's significant way past the 1% level. Is there something else you would have done?
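
Both formulas in this comment can be reproduced with the Python standard library. The sketch below is an illustration, not the original spreadsheet work: it treats the statistic as approximately standard normal (reasonable at these sample sizes) and converts it to a two-sided p-value with `math.erfc`. The function names are made up, and the inputs are the rounded figures quoted above.

```python
import math


def two_sided_p(z):
    """Two-sided tail probability of a standard normal at |z|."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_single_sample(p1, p2, n2):
    """Article version: standard error from p1 and the later sample size only."""
    se = math.sqrt(p1 * (1 - p1) / n2)
    return (p1 - p2) / se


def z_pooled(p1, n1, p2, n2):
    """Comment version: pooled proportion, both sample sizes."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


if __name__ == "__main__":
    # Figures quoted in the comment: 1978-2002 vs 2003-2008.
    p1, n1 = 0.513, 1578
    p2, n2 = 0.415, 473
    for label, z in [
        ("single-sample SE", z_single_sample(p1, p2, n2)),
        ("pooled SE", z_pooled(p1, n1, p2, n2)),
    ]:
        print(f"{label:>16}: z = {z:.2f}, two-sided p = {two_sided_p(z):.4%}")
```

Run as written, the two statistics come out near 4.26 and 3.74, in line with the 4.2 and 3.7 quoted in the thread; either way the p-value is far below 1%.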


JayhawkBill

6/07

+1, molnar. Yours is a non-trivial point.

Thumbs up, Matt. Your article wasn't as good as last week's, but your next-to-last paragraph rocked. Great job.


swartzm

6/07

Thanks, but is what I put wrong? Whenever it's binomial, I do [p*(1-p)*(1/n)] or [p*(1-p)*(1/n1+1/n2)] or [p1(1-p1)/n1+p2(1-p2)/n2], depending on what the question is. Is that not a t-test? Or is that not what I should do?


tkniker

6/08

Two things on this that I'm wondering about:

1) The 51.3% from 1978 - 2002 doesn't seem to jibe with your chart. From the chart it seems that only one of the bars is above 51% (the most recent), but the first two are well below 49%. The only way that 1978-2002 could be 51.3% is if there were lots more trials in 1997 - 2002. Is this because of the expansion of supplemental picks recently?

2) Matt is right in his use of t-test to assume the two are from a similar group, though I have typically found these t-tests to usually fail when there are lots of trials. For example even hairline differences when there are 10,000 trials always seem to suggest they are different. The one concern that I have is that by using the t-test you have assumed that each draft pick is independent of every other draft pick. That may be a pretty big assumption in drafting. It may work in random phone calls for polling, but I would bet that whoever picked 4th in a draft, who they pick may be highly dependent on who was picked 1st, 2nd, or 3rd. Likewise there may be some dependence of what a team picks in the 2nd round (and/or supplemental rounds) based on the player they picked in the 1st round.


swartzm

6/08

Correct, more draft picks in later years due to expansion and supplemental picks. There were two sets of six years below 51% and two above.

I see what you're saying about independence, but when I get a t-stat around 4, and my 2003-2008 sample was only 473 picks, that couldn't really change the effect much. No one team gets that many picks to make a huge difference, and there's some negative dependence when it comes to the fact that if the best high school player was picked with the first pick, the second pick is LESS likely to be a high school player. I'm guessing the effect is even stronger.


tkniker

6/08

The way I would do something like this is that we have two samples of lots of Bernoulli trials, and we are trying to determine if these two samples have the same probability of success (a high-school player).

Sample 1: mean = .513, std.dev of estimate error = .01258; SQRT(.513*(1-.513)/1578)

Sample 2: mean = .415, std. dev of estimate error = .02266; SQRT (.415*(1-.415)/473)

So to compare two sample means, our NULL hypothesis is that the difference is zero. To do this our mean is .513 - .415 = .098, and our std dev of our estimate error is SQRT(.01258^2 + .02266^2) = .025915. Therefore the t-statistic is 3.78, which would reject the null at any reasonable confidence level, so yes, there is strong evidence (in agreement with his conclusions) that these are not from the same rate, which suggests that clubs are valuing college players more highly than high school players.

The real interesting question though is if the pendulum has gone too far, i.e., is there roughly a "right" percentage which suggests that the pendulum has found the optimal equilibrium, i.e., is the reason the A's are now drafting more high school players that there is now evidence that everyone is overvaluing college players?
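
The unpooled arithmetic in this comment can be checked with a few lines of Python. This is only a sketch of the calculation as described here (each sample keeps its own variance, and the two variances are added before taking the square root); the function names are invented for illustration.

```python
import math


def proportion_se(p, n):
    """Standard error of a sample proportion p estimated from n trials."""
    return math.sqrt(p * (1 - p) / n)


def unpooled_z(p1, n1, p2, n2):
    """Difference in proportions divided by the unpooled standard error."""
    se_diff = math.sqrt(proportion_se(p1, n1) ** 2 + proportion_se(p2, n2) ** 2)
    return (p1 - p2) / se_diff


if __name__ == "__main__":
    se1 = proportion_se(0.513, 1578)  # 1978-2002 sample
    se2 = proportion_se(0.415, 473)   # 2003-2008 sample
    print(f"se1 = {se1:.5f}, se2 = {se2:.5f}")
    print(f"combined se = {math.sqrt(se1**2 + se2**2):.6f}")
    print(f"z = {unpooled_z(0.513, 1578, 0.415, 473):.2f}")
```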


dpowell

6/08

Matt, great article. I don't think it's worth obsessing over the correct standard error here (I was fine with what you initially did) but since we are, I'll chime in. I think Tim makes a good point that the draft picks are not independent. If Strasburg doesn't go #1, he likely goes #2 and definitely goes in the first 2 rounds. The fact that a college player is picked #1 isn't that important if he would've been picked in the first 2 rounds both pre- and post-Moneyball. The underlying experiment, then, is...was there a "shock" that caused more college players to be drafted in the first 2 rounds post-Moneyball? This doesn't really affect the top players. Basically, the observation level is really just the year. You could also consider adjusting for "clustering" by team as well. And then getting the right standard error becomes a paper in itself (get variance when cluster by year, get variance when cluster by team, get variance when cluster by team-year...right variance is sqrt[(1)+(2)-(3)]). Honestly, I'm not even sure that's right. I'm fine with the original stat you gave us.

(Note to Matt and other contestants: I worry that everyone's going to start worrying about using any stats work since the comments seem to jump on small details every time (I'm guilty of this as well), but I'd like to encourage you to just do it anyway.)


JayhawkBill

6/08

Matt, my understanding is that normal distribution is an underlying assumption for a t-test, and that binomial distribution approximates roughly, but does not truly model, normal distribution.

If you know statistics better than I do, which is certainly possible because my fields are engineering and business, not statistics, refute my points. This is a competition, I've nothing at stake, and I am cheering for you: if my understanding is imperfect, please educate me.


dpowell

6/08

Law of Large Numbers takes care of this issue. Binomial converges to normal distribution.


dpowell

6/08

What the heck? Indicator variables can have a standard deviation. That means they have a t-test and a p-value. Why do you think this is wrong? I've never known anyone who thought this.


sde1015

6/08

But he's testing the proportion of the total that was drafted out of high school, not whether each individual pick is one that was drafted out of high school.

Besides, whether the underlying distribution of interest is binomial, bernoulli, or anything else, he's got a large enough sample size that surely the central limit theorem applies, right? A simple difference-of-means t-test like Matt used is more than fine.
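
One way to see how well the normal approximation holds at this sample size is to compare it with the exact binomial tail. The sketch below is an illustration under stated assumptions: it takes n = 473 picks and a null high school rate of 49% (the overall rate quoted earlier in the thread), and asks how likely 196 or fewer high school picks would be (196 is roughly 41.5% of 473). The function names are made up, and only the standard library is used.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1)
    )


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with a continuity correction."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return normal_cdf((k + 0.5 - mean) / sd)


if __name__ == "__main__":
    n, p, k = 473, 0.49, 196  # ASSUMED null rate and cutoff, see text above
    exact = binom_cdf(k, n, p)
    approx = normal_approx_cdf(k, n, p)
    print(f"exact binomial tail: {exact:.6f}")
    print(f"normal approximation: {approx:.6f}")
    print(f"ratio exact/approx:   {exact / approx:.3f}")
```

Note that this is a one-sided, one-sample check, so its numbers are not directly comparable to the two-sample statistics discussed above; it only illustrates how close the binomial and normal tails are with a few hundred trials.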

Reply to sde1015

anderson721

6/07

Just a guess... John Tudor?

Reply to anderson721

anderson721

6/07

Well, not quite. Traded mid-season, started a WS game, left with an injury after going 1.1 innings. Answer to a different trivia question: Who is the only starting pitcher to be pulled from a WS game while throwing a no-hitter?

Reply to anderson721

strupp

6/08

Did we ever get an answer to the first one? Jeff Weaver seems right I guess.

Stoopid trivia.

Reply to strupp

DrDave

6/08

My immediate reaction was "hasn't even read the recent BP work on this topic". Whether that's true or not, giving that _impression_ is a major failure. No cookie.

Reply to DrDave

markpadden

6/08

Comments like this will do nothing but inhibit future innovation. Just because a topic has been addressed before does not mean it should be ignored going forward.

Reply to markpadden

DrDave

6/08

Of course not. But the further discussion should be *aware* of the earlier discussion, and build on it or refine it or refute it.

Reply to DrDave

blcartwright

6/08

Regarding the decreasing percentage of 1st rounders who make the majors - I think it's dependent on the length of the round. We would normally think of the 1st round as allowing each team to have one pick, or 30 in total. Round 2 would be 31-60. With compensation picks this is not true. Players drafted as low as 50 have been classified as 1st round picks, diluting the mean talent level of the round. I would be curious to see if calling 1-30 round 1, 31-60 round 2, etc, regardless of the real life designation, would change the round 1 slope.

Reply to blcartwright

swartzm

6/08

That's a good point too. I was thinking about the extra length coming from extra teams but that would mean extra roster spots, too, so I was comfortable leaving it out. I forgot about supplemental picks for some reason even though I was thinking about them in other contexts.

Reply to swartzm

NathanJM

6/08

It is also possible that MLB careers are longer, or simply that the graph isn't time-bound. There will naturally be fewer players who have been in the big leagues from more-recent drafts simply because they haven't yet gone all the way through the system or had as many chances to fill in for injured regulars. There is also increased incentive to hold off on starting the MLB service time clock. Or teams may be getting better at managing injuries, reducing the need for short-term replacements. It's definitely something that needed further investigation and I'm not sure the graph has any value being presented by itself. I would have preferred he either go into greater depth on this or omit it entirely and give more attention to the other subjects.

Reply to NathanJM

swartzm

6/08

I agree with your point about some players not making the big leagues yet, but I stopped the graph at 2003 specifically because of these issues.

Reply to swartzm

csferry

6/08

First off, I enjoyed this article thoroughly. One question (not at all a criticism, just a curiosity) arose from this excerpt:

"The counter-argument...is that high school players have higher upside and are more likely to succeed while in the majors....Of the college players drafted in the first two rounds, 17% of them reached a career WARP3 of 10.0, and only 12% of high school players did."

High school players are drafted for upside, college players for (presumably) higher degree of safety. So, how do those HS draftees compare to the college ones? That is, while a smaller percentage has reached the majors, have they done so while achieving their supposedly higher upside? Or has chasing upside in 18 year-olds proven to be a fool's errand?

Reply to csferry

swartzm

6/08

My thinking is that since college players made the majors 44% more often than high school players and college players also reached 10.0 WARP 42% more often than high school players, the upside is probably still higher for them. Looking through Rany Jazayerli's stuff, as the judges mentioned in the comments section, I see that he had a similar conclusion though he saw that effect erode around the 1990s as teams started making signing out of high school more lucrative. I guess that plays into my initial entry pretty well.
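
The "42% more often" figure can be checked against the 17% and 12% rates quoted from the article. A small Python sketch of that arithmetic follows; the helper name is made up, and the inputs are the rounded percentages, so the result is only approximate.

```python
def relative_increase(rate_a, rate_b):
    """How much more often A happens than B, as a fraction of B's rate."""
    return rate_a / rate_b - 1.0


# Rounded rates from the article: share of picks reaching 10.0 career WARP3.
college_rate = 0.17
high_school_rate = 0.12
warp_gap = relative_increase(college_rate, high_school_rate)
print(f"College picks reached 10.0 WARP3 about {warp_gap:.0%} more often")
```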

Reply to swartzm

BurrRutledge

6/08

Matt, I think what you've identified is that you are more likely to get a 'slightly above replacement level MLB player' by drafting college players rather than high school players. By definition, that will not build a championship team. It's good to have these guys around to fill out your roster, but you really want a team of 'stars' to capture a pennant.

The question then, is will college vs. high school strategy be 'more successful' in that regard? If I remember correctly from Rany's research, he concluded that the first coupla picks were far more likely to have 'star success' (my term, not his), regardless of whether they were high school or college draftees. Every other consideration after the draft slot was not statistically significant. (That's my recollection, and I hope somebody will correct me if I'm off base).

If my recollection is correct, my suggested research would be whether you can identify 'star' potential through a high school/college correlation for the rest of the draftees, assuming you're not picking in the top coupla slots each year like the Rays. Example: if you've got the eleventeenth pick in the first round of the 2009 draft tomorrow, are you more likely to get a 'star' by drafting from high school, or from college? What about pitcher vs. position player? Anything else that might catch an emerging star to propel your club during the 2012-2015 seasons? If you can identify those characteristics, well... there's folks who would pay for that information.

Reply to BurrRutledge

swartzm

6/08

I copied and pasted a couple things from his Parts II and III below. It seems that outside of the first few picks at the beginning of the first round, you got more upside going the college route (though he did point out later on that a lot of the advantage in drafting college players eroded in the 90s).

My goal in doing it the way I did was that I only had so much time to collect information, and I figured that if I used the qualification of making the majors and the qualification of becoming a reasonable regular (10.0 WARP3), and college players were equally more likely to meet either qualification, then chances are the distribution of talent was such that I had made a good guess. According to Rany's study, it seems that if I had grouped the same time frame together using a large WARP3 cutoff, I would have come to the same conclusion.

"Draft Rule #5: In the first three rounds, not only are college players about 50% more likely to reach the major leagues than high-school players drafted in the same slot, they produce approximately 55% more value over the course of their careers. This advantage is persistent at every point after the #1 pick."

http://baseballprospectus.com/article.php?articleid=4042

"                          High School   College

Never reached majors:             459       281
WARP Between 0 and 1:              58        66
WARP Between 1 and 5:              63       101
WARP Between 5 and 10:             30        47
WARP Between 10 and 20:            41        56
WARP Between 20 and 30:            27        43
WARP Between 30 and 40:            16        23
WARP Between 40 and 50:             6        15
WARP Between 50 and 60:             2         8
WARP Between 60 and 80:             6        11
WARP Greater than 80:               4        13

Total:                            749       715"

http://baseballprospectus.com/article.php?articleid=4064
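
For anyone who wants to turn the quoted table into rates, here is a short Python sketch. The counts are copied from the table as quoted; the dictionary keys and variable names are invented for the illustration. One caution: the rows as quoted add up to 712 and 664, not the stated totals of 749 and 715, so some category was presumably left out of the excerpt. The sketch uses the stated totals as denominators and treats the "10 and 20" row and above as a stand-in for the 10.0 WARP3 cutoff.

```python
# (high school, college) counts per row, copied from the quoted table.
rows = {
    "never": (459, 281),
    "0-1": (58, 66),
    "1-5": (63, 101),
    "5-10": (30, 47),
    "10-20": (41, 56),
    "20-30": (27, 43),
    "30-40": (16, 23),
    "40-50": (6, 15),
    "50-60": (2, 8),
    "60-80": (6, 11),
    "80+": (4, 13),
}
totals = (749, 715)  # stated totals, used as denominators

ten_plus = ["10-20", "20-30", "30-40", "40-50", "50-60", "60-80", "80+"]
hs_10 = sum(rows[key][0] for key in ten_plus)
college_10 = sum(rows[key][1] for key in ten_plus)

hs_share = hs_10 / totals[0]
college_share = college_10 / totals[1]

# Sanity check on the excerpt: the listed rows do not reach the stated totals.
listed_hs = sum(hs for hs, _ in rows.values())
listed_college = sum(college for _, college in rows.values())

print(f"High school 10+ WARP: {hs_10} of {totals[0]} ({hs_share:.1%})")
print(f"College 10+ WARP: {college_10} of {totals[1]} ({college_share:.1%})")
print(f"Rows listed: {listed_hs} and {listed_college}")
```

These shares come from Rany's sample and time frame, so they should not be expected to match the 12% and 17% figures in the article exactly.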

Reply to swartzm

BurrRutledge

6/08

Woot, there it is. Thank you!

Reply to BurrRutledge

Oleoay

6/08

I think this is the first article I read from Matt that wasn't original. It was good, it had solid writing and it had depth, but it did seem apparent to me that he hadn't read other works on the subject. Sometimes it is good to tackle a subject with a fresh look, but it is also important to find the "state of the field" as well.

When I was reading the various charts and graphs and analyses, other thoughts were coming to my mind since I had already seen that kind of analysis before. The graph of first and second rounders seemed to indicate to me that there is less difference in 1st round and 2nd round talent than there used to be. While it is very possible that international signings have kept more draft picks from reaching the majors, I was wondering if advancements in medicine/surgery/training and longer career life in general had had an impact. I also didn't quite buy the argument that high school players have more potential than college players; I think they would just appear to, since there is less data on their performance against solid competition and, if they are pitchers, their arms might be "fresher".

Overall, a good, solid article. I also realize it was Matt's first foray into the topic and I shouldn't expect to be wowed every week. So, another good, solid thumbs up.

Reply to Oleoay

thesonofhob

6/08

I think a lot of the college vs high school draftee debate could really center a lot on how an organization values its minor league development. College hitters and pitchers are obviously further developed than their high school counterparts, and thus require less development in the minor leagues.

High school players on the other hand are more potential and projection than certainty. So while it's harder for scouts to try to figure out who can actually make it from high school to the majors, if you really believe in your organization's player development, maybe you are willing to take some chances on "toolsy" players who are all potential because you trust your minor league coaches and trainers to mold them into actual major leaguers.

I got to thinking about this when I was reminded of the NFL draft, and how over at Football Outsiders there was a lot of talk that while the Pittsburgh Steelers often nail the draft, that doesn't mean that their scouts are necessarily better, but instead they let their rookies and younger players sit on the bench and develop, so that when they are called upon to start in their third or fourth year they often surprise and succeed.

I'd imagine that a baseball organization could function the same way, get the right instructors and coaches down in the minors who know how to take certain tools and help prospects turn those tools into actual skills and results at the major league level.

Just looking at college pitcher vs high school pitcher, in theory I really like the idea of drafting the high school pitcher and then letting him spend the next three to four years down in the minors under organizational control in the organizational training program learning from organizational coaches and instructors. The college pitcher on the other hand will probably be raced through the minors where he only needs to prove that he can handle that level's hitters, while not receiving all that much training and development.

Reply to thesonofhob

tkniker

6/08

Actually, this brings up a point I've always wondered about. As an example, let's take the Pirates and Royals (at least the Royals before 2003).

There seem to be very few players that they've drafted who reached the majors compared to others (i.e., it's not like I scope other teams' rosters and say -- oh, that's a player the Royals drafted, the way I would with the Rangers).

Is it simply that the Royals and Pirates are picking the wrong players, or is it that they pick the right players but their early development of minor leaguers is so bad that they "ruin" players with potential? Now THAT would be a great article.

Reply to tkniker

Oleoay

6/08

It seems some organizations tend to do better with certain kinds of players and not as well with others. The Twins seem to be a good example of an organization that spits out a ton of pitchers who have control and players with good defense.

Reply to Oleoay

TonyRiha

6/08

In the most simple sense, this article captures the spirit of the competition... Excellent work !

Reply to TonyRiha

irablum

6/08

I loved this article until he started talking about midseason trades. That's basis for its own article, and irrelevant to the main issue (unless you are going to talk about how teams can get value from their draft picks by trading the prospects for major leaguers, which is not what he was talking about).

Reply to irablum

jkaplow21

6/08

You know, for an article on a topic that is clearly not your strongest, you still did a fine job.

Reply to jkaplow21

hessshaun

6/08

Not sure how we got from HS vs. college picks to deadline deals.

Good nonetheless.

Reply to hessshaun

BurrRutledge

6/08

Yeah, a little more of a transition might have helped there... or, perhaps, a topic for another article.

Reply to BurrRutledge

gaucho2101

6/08

I liked the article overall. The college vs. high school data, the percentages of those drafted in the sample that made the majors, and their success rate were interesting. However, I felt the discussion of mid-season trades for would-be stars was an unnecessary diversion, since it was no longer about the draft and there were more draft-related questions that could have been addressed.

Reply to gaucho2101

hotstatrat

6/08

Overall, an interesting but somewhat disappointing entry from last week's star:

Starting off with a quibble: " But beyond that, do you know much about what the draft as a whole produces?" That's a bit of an awkward lead into your article - and isn't a vivid picture of what this is going to be about either.

Not a quibble, but a question inspired by this article: Bill James pointed out that college players have more success in the Majors than high school draftees almost 20 years before Michael Lewis. Why did Lewis apparently effect this change and not James? It is interesting that, judging from the bar graph, there was a bit of a dip in high school picks immediately after James first exposed this fact. Why didn't that trend continue?

Back to quibbling: Matt keeps repeating his mini conclusions that teams know what they are doing in the draft. I wince each time he does that. It is not worth noting even once. What we are measuring is the degree to which they know what they are doing.

Trivia try: Without looking it up or reading the comments I guessed Don Sutton - Brewers '82, but then I thought, no, what about David Cone - of the Blue Jays '92 or '93 teams? (Reading the comments, I see someone mentioned Cone's work with the Yankees, but all of that success mentioned came in years after the trade. I inferred that the question pertained to immediate help. Yeah, Tudor is a good guess.)

Wow, this essay took a steep turn into an entirely tangential area (trading of top picks), leaving the original course begging for further exploration. Matt ties it back with a fourth "gee whiz, those GMs know what they are doing" comment.

I sympathise with Matt's final conclusion. Yes, it takes time to do a worthwhile study. He isn't getting paid a wage to live on in order to have time to do that. But, that and the list of unanswered questions was a most dull conclusion.

Reply to hotstatrat

hotstatrat

6/08

Hey, Tudor '88 and Cone '92 are both right! They both won W.S. games those years after August trades. Either I'm misreading Baseball-Reference or they're wrong or you get a point off for being wrong, Matt. But, thanks for the overall nice article - even if it was a little below my lofty expectations.

Reply to hotstatrat

hotstatrat

6/08

P.S. Sutton won an ALCS game, but not a WS game in '92.

Reply to hotstatrat

hotstatrat

6/08

er, I mean '82. (My how time flies.)

Reply to hotstatrat

ckahrl

6/08

That's OK, I can still see Sutton wheeling and dealing from the mound in my mind's eye. A bit too much about that '85 A's team made me want to put them out, of course...

Reply to ckahrl

swartzm

6/08

No-- neither pitcher won WS games the year they were traded midseason.

The winners of '88 WS were: Pena, Hershiser twice, Honeycutt, Belcher.

The winners of '92 WS were: Glavine, Smoltz, Ward twice, and Key twice.

Reply to swartzm

swartzm

6/08

I agree that this article was not as good as last week's-- if only every topic could play into my hands like that, I'd be in great shape. This was a brand new topic for me, but a fun one and I learned a lot. The criticism about the last point being tangential seems either too little or understated as I went the grab-bag route for the article, gathering tons of data quickly, and running a series of tests. Each segment of the article was intended to be a different tangent. The article was supposed to create questions as one should not reach conclusions in three days on a new topic.

Michael Lewis' conclusion affected the league more than Bill James' for a number of reasons. For one thing, when we talk about this concept-- of new information becoming available and the market adjusting-- we have the Efficient Market Hypothesis in mind. The EMH essentially states that if information comes out that says that there is an undervalued commodity, those who get this information will immediately jump upon the market and drive the inefficient competitors out of business. Baseball is not a free market system. There are 30 teams and there is no entry. In essence, this leads to market inefficiencies that can persist over long periods of time, unlike with other markets. The 30 teams do not often compete with each other for fans, and draft picks are allotted more or less equally among the teams. For that reason, Bill James stating something that maybe one or two teams noticed would not necessarily move the market. Michael Lewis stated it on a much larger stage, and it pushed GMs to adjust to keep their jobs.

Reply to swartzm

Oleoay

6/08

I realize the constraints you are under and as I said, I don't expect to be wowed every time. It'd be nice if people had strengths or previous experience in each topic... but that was also part of the conditions of the contest itself.

Besides, at this point, each contestant has a body of work and overall I think you are in great shape. You're the first person to break the 50% thumbs up barrier and that won't be forgotten soon. Meanwhile, those who looked shaky at the beginning have been improving... maybe not to the point where they can become the BP Idol, but to be able to add "Published writer on Baseball Prospectus" to a resume _and_ to learn better techniques has got to be a real benefit. As I said, I hope quite a few people get hired by BP when this is all said and done, and those that don't just might latch onto sites elsewhere.

Reply to Oleoay

NathanJM

6/08

Probably because Michael Lewis' book received more mainstream coverage and was more likely to be read not only by front offices but by fans who would pressure front offices. Simply breaking the idea first doesn't mean a whole lot unless the right people read it and buy into it.

Reply to NathanJM

Oleoay

6/08

I think the concept of fan pressure is a bit overrated. It may affect attendance by 10-20k in a year, but that's about it. Only if a franchise totally nukes itself like Montreal (English-only broadcasts, etc) or does a fire sale are things affected. A good chunk of the reason is probably because, though a fan can gripe, his or her voice is generally not loud enough to reach the other million or so people who are buying tickets. Even then, winning a World Series has a way of reversing any "fan pressure" or fan dissatisfaction.

Reply to Oleoay

jpkand

6/08

Nice article. One thing rubbed me the wrong way when reading:

"For all of the rampant trade rumors and discussions about making a move to get over the hump, it seems like the ultimate goal of those moves is rarely achieved"

I don't think the "ultimate goal" is appropriately defined as winning a WS game. How about started a playoff game?

Reply to jpkand

rbooth9

6/08

Great job, Matt. The article was intriguing and thought-provoking from start to finish.

Reply to rbooth9

caprio84

6/08

Solid article; kept my interest. Would have closed with the answer of "who was the other pitcher?" to complete the piece.

Reply to caprio84

jdseal

6/08

A couple of times there is an implication that teams "got it right" when higher picks had more success. But success was admittedly defined by a pretty low bar. I think the causation may be reversed...the player may have gotten his cup of coffee (or WARP>10) because of his high draft place (and concomitant investment) as much as because of his talent and performance after being drafted.

Reply to jdseal

swartzm

6/08

WARP3>10 is not a cup of coffee. VORP>10 could be in extreme cases, but not WARP3. You would need several years of sustained success to put up that much WARP3. That said, of course investments affect the ability to get a cup of coffee, but it would take a mediocre player nearly a decade to get that much WARP3.

Reply to swartzm

aardvark

6/09

This is actually something that I was thinking while reading the article, particularly the "WHO MAKES THE MAJORS?" section. To conclude that "teams do know what they are doing with these draft picks!" because they make the majors at a higher rate misses the fact that they probably get more opportunities.

Reply to aardvark

rickfman

6/08

Matt can write. Matt can crunch numbers.

However, Matt picked a topic that has been covered by others in a lot of depth. I think that Baseball America (as well as Rany) did a lot of work on this topic (albeit by using a bunch of less valid statistical measures of success).

I just do not see why one should applaud a less well done version of a Draft Analysis. Artists do not receive plaudits for copies of others' work (even as a training piece), and sorry Matt, neither should you.

Reply to rickfman

Oleoay

6/09

Just because something is "less well done" doesn't mean it isn't well done or that it is automatically bad. I would bet even Bill James had tread over ground others had covered.

Reply to Oleoay

jtrichey

6/09

I have liked all of Swartz' pieces, but I thought this was the least interesting of his work thus far.
