You don’t have to be a rocket scientist to understand the nuances of baseball…. But it helps!

I began my quest for an opportunity to analyze and write about baseball by doing the obvious: getting a degree in aerospace engineering from Princeton University and doing my senior research on annulled magnetoplasmadynamic thrusters. After a glorious summer job as an award-winning (just ask Daktronics) scoreboard programmer and operator for the Trenton Thunder in their inaugural season, I took the next logical step and went to the Massachusetts Institute of Technology to get a Ph.D. in Operations Research, a program of study that focuses on the practical application of statistics, probability and optimization theory for improved decision-making (hey, we’re getting closer).

The reason I would make a great contestant on Baseball Prospectus Idol, and the next great baseball analyst, is that along with my passion for baseball (and for understanding it better through quantitative analysis), my job with an elite operations consultancy has trained me to use the tools of operations research to find insights quickly in a mountain of data and to present those insights in compelling ways.

On a personal note, I live with my wife and two-year-old twin sons north of Boston. Every day I struggle with the decision to raise my boys as Royals fans like their father (probably leading to daily playground pummeling) or accept the inevitable: they WILL become (gulp!) Red Sox fans.

Does Organizational Depth Really Matter?

Going into spring training 2009, the Kansas City Royals had nineteen of the twenty-five roster spots set. Over the next six weeks, dozens of threads and thousands of posts on the Royals’ chat boards discussed the eleven remaining candidates for the last six spots. As the April 5th roster deadline approached, name-calling and hurt feelings increased exponentially. Similar verbal assaults were being hurled on the message boards for the other 29 teams as well. As I watched this, and sometimes entered the fray, I wondered: do these seemingly minor decisions really make a difference, given that the five “losers” of the Opening Day roster sweepstakes will likely contribute significant time anyway?

I looked at the amount of playing time (measured by plate appearances for position players and innings pitched for pitchers) of each team in 2008 to determine what kind of organizational depth is needed by a typical major league team. Further, I used VORP to determine the value of organizational depth to the team’s performance. To focus on how organizational depth affects championships, I categorized teams into one of three buckets: playoff teams, teams that were above .500 but missed the playoffs (contenders), and below .500 teams (second division).
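
As a rough sketch of the bucketing rule just described (the function name is hypothetical, and the treatment of a team that finishes at exactly .500 is an assumption, since the three buckets above do not say where such a team belongs):

```python
def classify_team(wins: int, losses: int, made_playoffs: bool) -> str:
    """Place one team-season into one of the three buckets used in this piece."""
    if made_playoffs:
        return "playoff"
    if wins > losses:
        # Above .500 but missed the playoffs.
        return "contender"
    # Below .500. Assumption: an exactly-.500 team also lands here.
    return "second division"


# Hypothetical records, for illustration only.
print(classify_team(97, 65, True))
print(classify_team(86, 76, False))
print(classify_team(75, 87, False))
```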

On average, 46.1 players (22.9 pitchers and 23.2 position players) stepped on the field for each major league team, the low being the White Sox with 39 players, the high being the injury-riddled Padres with 59 different players. Playoff teams used 43.1 players, contenders used 44.7 players, and second division teams used 48.8 players.

As a next step, I labeled whether each player was on the Opening Day roster, making a few subjective adjustments to arrive at an “Effective Opening Day Roster,” which is what the organization would likely have had if everyone were healthy. For example, I assumed that had they not been on the disabled list, John Lackey and Josh Beckett would have been on the Opening Day roster for the Angels and Red Sox, respectively.

Position Player Depth

For position players, I categorized the top nine players (measured by plate appearances) that could effectively fill out the lineup card as starters (for the NL, the ninth player is considered the main pinch-hitter), and the remainder as bench players. Next I labeled the next four players (who were within the organization as of Opening Day) as the first tier of replacements, assuming that many of these players “just missed” being on the roster. All other players that saw game action were labeled as fillers.
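
The labeling can be sketched roughly as follows. This is a simplification under stated assumptions, not the exact procedure used for the study: the `Batter` fields and the function name are hypothetical, starters are taken from the effective Opening Day roster, and the subjective check that the nine starters can "effectively fill out the lineup card" by position is skipped.

```python
from dataclasses import dataclass


@dataclass
class Batter:
    name: str
    plate_appearances: int
    on_opening_day_roster: bool   # the "effective" Opening Day roster
    in_org_on_opening_day: bool   # in the organization as of Opening Day


def label_position_players(players):
    """Label each batter as starter, bench, replacement, or filler."""
    labels = {}
    by_pa = sorted(players, key=lambda p: p.plate_appearances, reverse=True)

    # Opening Day roster: top nine by PA are starters, the rest are bench.
    roster = [p for p in by_pa if p.on_opening_day_roster]
    for rank, p in enumerate(roster):
        labels[p.name] = "starter" if rank < 9 else "bench"

    # Off the roster: the four busiest players already in the organization
    # are first-tier replacements; everyone else is filler.
    others = [p for p in by_pa if not p.on_opening_day_roster]
    in_org = [p for p in others if p.in_org_on_opening_day]
    first_tier = {p.name for p in in_org[:4]}
    for p in others:
        labels[p.name] = "replacement" if p.name in first_tier else "filler"
    return labels
```

A fuller version would need position data to enforce the lineup-card constraint, which is where most of the manual judgment comes in.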

              Playoff Team     Contender Team   Second Division Team
                PAs/VORP          PAs/VORP          PAs/VORP
Starters      4,598/189.0       4,376/190.4       4,243/141.8
Bench           806/  9.4         727/ -4.7         538/ -4.1
Replacements    546/ 15.7         679/ 11.1         776/  8.8
Filler          127/  1.9         138/ -2.0         340/ -4.8
Total         6,077/216.0       5,921/194.7       5,897/141.7

The key insight is that organizational depth does seem to separate the teams playing in October from those that merely contend. The VORP of starters on playoff teams versus contenders is essentially the same (189.0 vs. 190.4). The extra 20+ VORP comes from organizational depth, mostly from the bench players. For second division teams, simply not having the horses in the starting lineup is what keeps them below the .500 barrier.
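
The arithmetic behind that "extra 20+ VORP" can be checked with a short sketch. The numbers are copied from the table above; the variable and function names are my own, and because the inputs are already rounded to one decimal, the sums can differ by a tenth or so from figures computed on unrounded data.

```python
# VORP by group, copied from the position player table.
position_vorp = {
    "playoff":         {"starters": 189.0, "bench": 9.4,  "replacements": 15.7, "filler": 1.9},
    "contender":       {"starters": 190.4, "bench": -4.7, "replacements": 11.1, "filler": -2.0},
    "second division": {"starters": 141.8, "bench": -4.1, "replacements": 8.8,  "filler": -4.8},
}


def depth_vorp(groups):
    """Sum the VORP of everyone who is not a starter."""
    return round(sum(v for k, v in groups.items() if k != "starters"), 1)


depth = {team: depth_vorp(groups) for team, groups in position_vorp.items()}
depth_gap = round(depth["playoff"] - depth["contender"], 1)
bench_gap = round(
    position_vorp["playoff"]["bench"] - position_vorp["contender"]["bench"], 1
)

print(depth)      # non-starter VORP per bucket
print(depth_gap)  # playoff minus contender, non-starters only
print(bench_gap)  # the part of that gap that comes from the bench
```

On the table's numbers, the bench alone accounts for well over half of the non-starter gap between playoff teams and contenders.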

Another thing that pops out of the data is that the better the team, the fewer plate appearances go to first-tier replacement players, and the smaller the difference in VORP between the Opening Day bench players and the first-tier replacement players. Is this because of a lack of injuries, or is it because the front office made the right decision at the beginning of the year?

Starting Pitching Depth

For only a few organizations is the five-man rotation set in stone at the beginning of Spring Training. For some organizations, even the #3 slot is uncertain. As we will see, for teams where even the #3 slot is uncertain, the players should start booking their October fishing trips early.

For the average organization, only about 120 of a team’s 162 starts are made by the five designated starters from Opening Day. Two notable exceptions in 2008 were the White Sox and the Rays, both playoff teams, whose starting fives each started 153 games. Almost half the teams (14) used a #9 starter. The average team in 2008 had its starts allocated as follows:

Depth Chart Position       Games Started
Opening Day 5-man rotation   120
#6 starter                    15
#7 starter                     9
#8 starter                     6
#9 starter                     3
#10 starter                    3
Swing men in bullpen           6
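
As a quick sanity check on the allocation above, here is a minimal sketch (the variable names are mine) confirming that the rows add up to a full 162-game schedule and showing what share of starts the Opening Day five cover:

```python
# Average 2008 starts by depth chart position, copied from the table above.
games_started = {
    "Opening Day 5-man rotation": 120,
    "#6 starter": 15,
    "#7 starter": 9,
    "#8 starter": 6,
    "#9 starter": 3,
    "#10 starter": 3,
    "Swing men in bullpen": 6,
}

total_starts = sum(games_started.values())
opening_day_starts = games_started["Opening Day 5-man rotation"]
depth_starts = total_starts - opening_day_starts
opening_day_share = opening_day_starts / total_starts

print(total_starts)                 # 162
print(depth_starts)                 # 42
print(f"{opening_day_share:.1%}")   # share of starts made by the Opening Day five
```

In other words, roughly one start in four goes to someone outside the Opening Day rotation.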

Laying out the VORP of each rotation spot shows where playoff teams separate themselves from the rest of the pack.

Starting        Playoff    Contender   Second Division
Rotation Spot   Team VORP  Team VORP    Team VORP
 1                50.9       49.1        32.5
 2                31.5       29.8        23.1
 3                27.7       16.9         1.3
 4                21.2       15.2         0.4
 5                11.3        5.6        -1.1
 6                15.0        3.5         7.8
Total            157.5      120.2        63.9

As we would expect, a playoff team has better quality at every starting position in the rotation. However, when compared to the contenders, the #1 and #2 slots are almost identical in quality. The separation begins to occur at the #3 slot, and continues through the rest of the depth chart. Specifically, of the 37.3 VORP difference between playoff teams and contenders, only 3.5 comes from the #1 and #2 slots, while 33.8 comes from slots #3 to #6.
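
The front-end versus back-end split can be reproduced with a short sketch. The names are hypothetical; the back-end figure is taken as the table's printed Total minus the two front-end slots, which is an assumption on my part because summing the rounded slot values directly drifts by a tenth or two.

```python
# VORP for rotation slots #1 and #2, plus the printed Total row, from the table above.
front_slots = {
    "playoff":   [50.9, 31.5],
    "contender": [49.1, 29.8],
}
printed_total = {"playoff": 157.5, "contender": 120.2}


def front_and_back(team):
    """Return (front-end VORP, back-end VORP) for one team bucket."""
    front = round(sum(front_slots[team]), 1)
    back = round(printed_total[team] - front, 1)
    return front, back


p_front, p_back = front_and_back("playoff")
c_front, c_back = front_and_back("contender")

front_gap = round(p_front - c_front, 1)
back_gap = round(p_back - c_back, 1)
total_gap = round(printed_total["playoff"] - printed_total["contender"], 1)

print(front_gap, back_gap, total_gap)
```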

The data suggests that second division teams have a typical #2 starter as their staff ace and another #2 or #3 in their second rotation slot, then essentially replacement-level pitchers from #3 to #5. The Royals’ 2008 starting rotation is a perfect example: staff ace Gil Meche (really a #2), Greinke (an emerging #2), and then a significant drop-off to Bannister, Bale, and Tomko. An interesting note is that the #6 starter is significantly more valuable than the #3 starter for second division teams. This likely signifies either health issues or organizational frustration at the ineffectiveness of the back end, which is “solved” by forcing an up-and-coming prospect into the starting rotation (see Armando Galarraga or Greg Smith).

Reliever Depth

In 2008, all 30 major league teams used at least 11 relievers. Typically, a team’s workhorse reliever averaged 82.5 innings (with an occasional start or two in there) and the 11th reliever averaged about 20.2 innings pitched. Roughly half of the teams used 15 relievers, with the extreme outlier being the Padres who used 20 relievers in 2008.

I divided relievers into 4 groups: top 4 relievers as measured by innings pitched (Group 1 – top relievers) on the Opening Day roster, the remaining relievers on the Opening Day roster (Group 2 – bottom relievers), the 4 relievers with the most innings pitched not on Opening Day roster (Group 3 – first-tier replacements) and all other relievers (Group 4 – the fillers):

Reliever Group  2008 Playoff Team  2008 Contender  2008 Second Division
                   IP/VORP            IP/VORP         IP/VORP
Group 1         279.0/67.1         264.2/51.7      258.0/37.1
Group 2         115.0/12.5          95.1/ 6.0       83.1/ 0.0
Group 3          71.1/ 4.0         104.2/12.1      140.1/21.8
Group 4           8.1/ 0.1          27.2/-0.5       49.2/-6.4
Total           473.2/83.6         492.1/69.1      531.1/53.6

The data suggests that the front line players (starters and front end of the rotation) for playoff teams and contending teams are nearly identical. However, the Opening Day bullpen of a playoff team is much better at every position than the contender’s bullpen. The relatively large VORP and higher innings pitched of the contenders’ Group 3 relievers (first-tier replacements) suggest that many of the contending teams are still trying to find the right relievers throughout the season. As with the position players, the cause-and-effect relationship behind the playoff teams’ light use of first-tier replacement relievers is uncertain. Is this because the playoff teams are lucky and don’t need to use as much depth because of fewer injuries, or has the team identified the right players in spring training so that they don’t need to make changes due to ineffectiveness?
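
One note on reading the reliever table: the innings columns use the usual box-score shorthand, where the digit after the decimal point counts thirds of an inning (outs) rather than tenths, so 71.1 means 71 1/3 innings. A minimal sketch of adding innings in that notation (the helper names are my own) shows that the Total row follows from the four groups:

```python
def ip_to_outs(ip: str) -> int:
    """Convert box-score innings such as '71.1' (71 1/3) to a count of outs."""
    whole, _, frac = ip.partition(".")
    thirds = int(frac) if frac else 0
    if thirds not in (0, 1, 2):
        raise ValueError(f"not valid innings notation: {ip!r}")
    return int(whole) * 3 + thirds


def outs_to_ip(outs: int) -> str:
    """Convert a count of outs back to box-score innings notation."""
    return f"{outs // 3}.{outs % 3}"


def add_innings(values) -> str:
    """Add innings given in box-score notation."""
    return outs_to_ip(sum(ip_to_outs(v) for v in values))


# Groups 1 through 4 for each bucket, copied from the reliever table.
print(add_innings(["279.0", "115.0", "71.1", "8.1"]))    # playoff teams
print(add_innings(["264.2", "95.1", "104.2", "27.2"]))   # contenders
print(add_innings(["258.0", "83.1", "140.1", "49.2"]))   # second division
```

Innings are passed as strings so that "71.1" is never mistaken for the decimal number 71.1.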

Putting it all together

Just comparing playoff teams to the contenders, the organizational depth looks like:

                                  Playoff Teams  Contenders  Difference
Starters                              189.0        190.4        -1.4
All Other Position Players             27.0          3.7       +23.3
Starting Rotation - Front End          82.4         78.9        +3.5
Starting Rotation - Back End           75.1         41.3       +33.8
Opening Day Relievers                  79.6         57.7       +21.9
Organizational Depth Relievers          4.1         11.6        -7.5
Overall                               457.2        383.6       +73.6
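
The Difference column, and how much of the overall gap sits outside the front line players, can be recomputed with a short sketch. The numbers are copied from the table; the names are hypothetical, and treating only the lineup starters and the front end of the rotation as "front line" follows the definition used in the reliever discussion above.

```python
# (category, playoff VORP, contender VORP), copied from the summary table.
summary = [
    ("Starters", 189.0, 190.4),
    ("All Other Position Players", 27.0, 3.7),
    ("Starting Rotation - Front End", 82.4, 78.9),
    ("Starting Rotation - Back End", 75.1, 41.3),
    ("Opening Day Relievers", 79.6, 57.7),
    ("Organizational Depth Relievers", 4.1, 11.6),
]
FRONT_LINE = {"Starters", "Starting Rotation - Front End"}

diffs = {name: round(playoff - contender, 1) for name, playoff, contender in summary}
overall_gap = round(sum(diffs.values()), 1)
front_line_gap = round(sum(d for name, d in diffs.items() if name in FRONT_LINE), 1)
depth_gap = round(overall_gap - front_line_gap, 1)

for name, diff in diffs.items():
    print(f"{name:32s}{diff:+6.1f}")
print(f"{'Overall':32s}{overall_gap:+6.1f}")
print(f"front line: {front_line_gap:+.1f}, everything else: {depth_gap:+.1f}")
```

On these numbers, nearly all of the playoff teams' edge over the contenders comes from outside the front line.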

So going back to the original question: do the initial decisions of the front office in setting the Opening Day roster really make a difference? On one hand, there’s no question that the losers of the Opening Day sweepstakes will likely see significant playing time. On the other hand, the data suggests that the better teams may be the ones that get the decision right the first time.


I was wondering this very question last week. Glad to see someone read my mind and answered it!
I know we're not voting on these, but this one is my favorite of all the applicant's first articles.
Nice work on this piece, Tim. I'm not sure if I would've been so quick to arrive at the same conclusions as you, but this is a good blend of simple math and analysis.
I liked the article a lot, nice read and I was able to pull some of my own conclusions from the tables. I'm just not sure I agree with your conclusions about the importance of Opening Day roster selections. There's not a lot of difference between players 21-25 and 26-30. It's the lack of quality in the top 20, determined in the off-season, that has the biggest effect on a team's won/loss record. I see the Playoff Teams and the contenders being very close in quality of their starting players, but the contenders lack bench depth. The contenders and the also-rans have similar bench strength, but the second division teams lack depth in their starting players.
Basically, playoff teams versus contenders look the same in their top 10 or 11 players (starting 9 and their top 2 starting pitchers). Their becomes some separation in the bench, starters 3-5, and almost all the relievers (this was a little surprising that the relieving core is consistently better).

One other interesting point that I couldn't put in because of word limit was on bench quality. It's typically not up and down the bench quality. It usually is that one "bench" player who had a 20.0 VORP and 3 below replacement level players versus all below replacement level players for the other teams.
Nice article, Tim.

Have someone with some English chops give your stuff a thorough look before you post it.

Data is plural. "Data suggest" is correct.

In the comment to which I'm replying, "their" should be "there."

Having said that, I agree with the comments that think this is a fine column. I just don't want you to distract people with little things that detract from the good content.

Good luck in the Idol competition!
Because of time commitments, and not re-checking KG's 2nd post, I handed this in on April 14th, which had me a bit rushed.

My goals on this article would be to clean it up (and catch some of those grammatical issues), plus I think I could have made the conclusions a little tighter.
I've only read about half of the entries, but this was probably my favorite, so nice job. It was a creative approach to an interesting question, and the data offered some interesting numbers.

In a different context this info would probably be better suited to a two or three part article, so that you could elaborate more on the data and its implications, but I understand the imperative to dazzle for a BP Idol entry. Clearly, it worked.

Sorry to do this, but I'm going to pile on about the writing. Some of the flaws are noticeable, and it sort of weighs down the overall effort. I'd hate to see it deep six your chances. The good part is that this is organized very well: here's my problem, there are three ways you can analyze the problem, here are those three ways, and here's where I tie it all together. That makes it very readable, and helps keep people engaged in where you are going with it. I know from experience that structuring an argument is not as easy as it sometimes looks, so it is a significant plus that you can do this.

But things like this sentence: "For only a few organizations is the five-man rotation set in stone at the beginning of Spring Training" made my eyes bleed. That reads like it passed through babelfish first. Also, the opening para didn't really get good/interesting until you posed the question that framed the rest of the analysis. The entries I've read all seemed to struggle with finding a compelling opening, and since the first few lines might be your only chance to engage voters, falling down on this is not going to help the cause. In most cases, including this one, the entrants could just trust their stuff and dive right in, instead of spending a few lines taking a stab at what they hoped was an interesting way of (finally) introducing the central question or point.

I wouldn't assume that voters are going to be indulgent enough to let the writers spin their wheels for two or three sentences before they shift into gear. Some people will read every word of the entries because they have to (hi Will), and some will read every word because they want to be good voters, but for the rest of the apathetic masses, you need to give them a reason to want to keep reading. If the first 100 words are weak, some people might not stick around for the rest.

Good luck the rest of the way.
Thanks for the advice. As I have re-read my entry a half-dozen times since posting, there are some sentences that also make me cringe. For the next rounds, I will have another pair or two of eyes help me avoid those sentences, plus I'll personally do a few more critical read-throughs.

As for content, I believe that this article asks more questions than it answers. If there was an article 2 (and maybe 3), some of the additional questions I'd like to explore are:

a) Bench quality -- it was usually one solid bench player as opposed to 4 solid bench players. Why?
b) Similar to bench quality, the quality of first-tier replacements seems "one-player" heavy. For example, the Royals had one of the top first-tier groups, but really it was solely Mike Aviles's surprising year and four replacement level players.
c) Given time, I would have loved to match up lost days to the disabled list with the benches to see if this perceived better bench was truly a result of better/luckier decisions at the start of the season or was just a matter of health.
d) Expand the analysis to an entire decade to see if the results for 2008 held for a time frame from 1999 - 2008.

Lastly, I can understand your point of "spinning wheels" for the first 100 words. I guess that is more a question of writing style. If I was writing a more "academic" style research piece, I would not have bothered. My perception of BP is that it is more fan-oriented than a strictly SABR research type site. I was hoping that the first paragraph would be more engaging. I feel that I'm a fan first, and a baseball analyst second. For some readers, this style will be wasted words, for others I hope it helps them connect.
Great article. Let me throw this assertion out there: content is everything. Yes, proper grammar and spelling are important, but let's face it, it's really more icing on the cake. If we're evaluating raw talent here, then I think you have to look at how fresh the idea is and how well supported it is by evidence. This guy's got the "raw stuff" to write well. Everything else can be learned, or edited.
Is the organizational skill of selecting good bench talent, back end starting pitching, and relief pitching something that varies directly with FO talent? Can it be repeated year after year? If one takes away payroll and its influence, is there any trend attributable to FO talent remaining?

Tim, you've written a good article, one clearly in the top half of the selected submissions. I'm just not sure of your conclusion: I don't know if you've discovered decision-making talent, luck, or payroll.
This article is what Jeff Euston's article could have been.

I love the article, so keep in mind the following ideas are things I thought of to make it even better.

I think an additional gradation on team quality would be helpful since there can be little fundamental difference between an 82-80 team and an 80-82 team, yet the 82-80 teams are compared to all .500+ teams and 80-82 teams are compared to inferior teams like the Nationals. It also might be interesting to use third order wins or Pythagorean records instead of straight W-L to divide your teams off.

I question a bit the division of relievers based on innings pitched. Injured quality relievers, particularly injured closers, might make Group 2 look better than it actually is.

I like how you point out that organizational depth in position players is worth 20 VORP and I think that is a great insight. Sometimes though, you need to be blunt and indicate that 20 VORP is roughly equivalent to 2 extra wins and how those two wins turned a contender into a playoff team.

And I won't rehash the comment about proofreading. Really, I won't.
Really enjoy the article! Nice work.
I'm adding my judging comment to each article:

Kniker, Tim -- 8. This article reminds me of Keith Woolner, which is pretty damned high praise. He's not as good a writer yet, but he's certainly good and the process where he gets places is very solid. I worry a little bit that he's all research and he'll get squashed by some of the themes, but I can't wait to see him try.
Shoot. This is really a bad sign. I was hoping that I would remind Will of Rany Jazayerli. Time to re-tool!
1. Hopefully your sons will be saner than us and have very little interest in professional sports. My son is.

2. Where do you get historical opening day roster info? I was looking for that actually on my first attempt at this contest.

3. I agree with your own self assessment that your conclusion could be improved.

4. Mathematically, I'll defer to you, but don't you think you need more than one year's worth of data? I'll admit the evidence is fairly strong as it is - and putting it together in one week, I'll excuse this possible weakness.

5. Overall, I enjoyed this. It was well presented and worth knowing.
1. Hopefully, however, they were running around today in Pittsburgh Penguins jerseys (a desire of their uncle), so I'm not hopeful.

2. There was a download. I did a Google search on 2008 opening day roster, and it was about the tenth link down. I didn't do a complete verification that it was 100% accurate, but it was correct for the few teams I knew.

3. Yep (see #5)

4. Ideally, I would do all of the 2000's to see how well this held. However, it was probably 20 hours of manually categorizing Opening Day players, first-tier replacements, etc. If this were at the level of an article in the BP annual, I would have spent the time to make this data more secure. I think of this more as an exploratory piece to see what popped out of one year.

5. The main thing I would improve is to highlight at the end what I thought were the key insights throughout the piece (and a little better conclusions). Oh yeah, and have someone proofread it. Writing has always been something that I've had to put in more effort than most. Research and analysis has always been the "piece of cake" thing for me.

Based on Will's comments above, I'm anxious as to some of the themes (if I even get that far!)
Tim, when I read your bio, I was afraid I was about to feel really stupid, like I often do when reading the really advanced statistical analysis on this site (it's my first year as a subscriber, and I'm still learning). However, you wrote a really interesting article that made it easy to follow your thoughts and the math involved. Great job!
Thanks for your kind words. Actually, I'm just not that thoughtful about things, so I don't go into as great detail as some other sabermetric writers.

This is one of the main reasons I'm working in industry now as opposed to academia. Personally, I've always liked getting a good general idea and insight, but not getting caught up in the weeds. A good 80/20 man as they say!
I can relate to that. I like minutiae and detail, but it has to fit in with the overall picture and suggest expanded themes for future areas of investigation.
Funny, I ran a Daktronics board for two summers with the Hudson Valley Renegades. Didn't win any awards, though, and didn't submit anything this good to the contest, either (didn't submit anything at all, in fact). Good job. I liked it.
Those score boards can be fun. True story. The new field in Trenton (1994) was having some drainage problems all summer long, so the morning before a 4th of July doubleheader, they get some company to essentially drill holes throughout the playing field.

Of course what happened is that these num-nuts completely shredded the fiber-optic cable that went from the pressbox to the outfield scoreboard.

To make a long story short, for both games of the doubleheader they were only able to jury-rig a direct connection a few feet long to the actual line score (no graphics), so I sat on top of a ladder underneath the scoreboard keeping the balls/strikes/outs and line score. Since I couldn't even see the scoreboard from my vantage point, I listened to the game on the radio, and Tom McCarthy (the announcer) and I had a few code words to let me know if I had made a mistake on the ball/strike count so I could correct it. Best summer of my life.
I thought this was a fantastic article as well. It's my favorite so far. If I can figure out how to navigate the internets to cast a vote for Tim to advance, I will do that.

My theory, after watching about a decade of Twins baseball, is that there are about a dozen teams that only care about winning. These teams tend to be the upper tier teams. These are the teams that will allow their best prospects to make the team out of spring training, service time be damned.

The other 17 teams (Marlins excluded) will try to construct a roster that will win about 82 games on average and with some luck, can win 90 or more. Those teams do, in fact, know who their better players are, but will not promote them until late May/early June to keep their service time down. These teams will even do this with relievers and utility infielders - I'm pretty sure the Twins did it this year with Jose Mijares.

That's just one man's theory as to why you see second division teams get more production out of mid-season callups.
I had two questions after reading the article.
1) Wouldn't second division teams be worse just by using more of their prospects on a trial basis? They know that they are not contending and attempt to evaluate during the season.
2) Since most teams have '30-man' rosters where they can cycle the 5 guys who were cut at the end of camp on to their rosters as needed, doesn't that make the last roster cuts irrelevant? There are always a few players that make a team based on their hot spring training stats even though they have no track record of success in the majors.
In response to 2), that was the main question I was asking. When I looked at playing time, most teams do have the 30 - 32 man roster, where each team cycles in the 5 - 7 on the outside.

The interesting thing is why did the playoff teams give 240 PAs to the group that started the season on the roster versus the first tier of replacements, while the second-division teams gave the 240 PAs to the players who were on the outside looking in? It likely comes down to one of two issues: Playoff teams were luckier and had fewer injuries and didn't need to go into the first-tier as much, OR the bench simply performed better and didn't need to go into the first-tier due to poor performance. By matching up disabled list days, we may be able to uncover some of this.
There is a third and just as likely scenario; after all, it's probably a combination of them all. Some of the starters on the second division teams were so bad they were discarded in favor of the first-tier players.

Most non-pitcher benches these days are filled with utility players and back-up catchers. When a starter is injured, sent down, or DFAed, isn't it as common for a player to be called up from the minors to replace him in the starting line-up as it is to promote a player from the bench?
If a starter is injured/sent down/DFAed, I imagine that what most likely determines who starts is their salary unless it's a really hot prospect. Why pay $1 mil for a bench veteran without getting some value for it? Besides, a bench veteran's primary job is to back up a starter. A rookie's prime job is to get as many at bats as possible to learn and sitting on a major league bench just doesn't do that... besides there's been more of a trend it seems (or maybe I'm wrong) to give minor league vets a cup of coffee.
In the whopping sample size of 2 this year for the Royals (in terms of main injuries/ineffectiveness), we've had the following:

-- Gordon goes down, Super-sub Teahen moves from starting 2B back to original position of 3B, 2B sub Callaspo assumes starting job.

-- Aviles is ineffective (later confirmed as an injured forearm), sub Bloomquist takes over the starting job, and AAA player Luis Hernandez joins the roster (actually he came up to replace sub Pena), but is really relegated to a game or two per week.

My gut feeling (and this could be another analysis) is that the bench player (assuming he has the skill set for the position) becomes the starter, while the call-up becomes the new bench player. Now if the bench player is ineffective, but in the limited time the call-up performs well (see Mike Aviles last year), it's possible that the call-up will become the new starter, and the bench player continues on the bench.
When doing any analysis, the conclusions you form are largely based on the assumptions you make at the beginning, and the model you build. I'm not sure the different tiers you have chosen are the best model, but I believe they work just fine as a quick comparison. It seems you understand that a more in-depth analysis would require a better look at the tiers you have created, which is good to see.

Overall, this was my favorite article of the ten finalists. Congratulations on making it, and I look forward to more fun/interesting/insightful pieces from you.
Thanks for the comments. Agreed with you on many aspects. I could see doing single, larger articles on each group (a 2500 - 3000 word piece on just position players, another on starter, and another on relievers).

One good point that you make is that the categorization drives a lot of this, whether it be the categorization of the players or the teams. Something that would require more depth would be breaking out relievers based on the inning/leverage situation they pitch. Closers in one, 7th & 8th inning setup in another, long men, then mop-up guys. So many different ways to go with this.
Far and away my favorite piece out of the bunch. I wish there would be four more pieces written off of this one piece. I would like to see this torn down and rebuilt because certain sections of this are fascinating to me, and I just enjoy thinking about it.
Reading them in order, this article is head and shoulders above the 6 preceding. Truly a Baseball Prospectus piece. Congrats!