
Welcome

Hello, new readers! Welcome to our new series, Fantasy Baseball Do-It-Yourself (DIY).

Do you love fantasy baseball, but find that those annual magazines and websites don’t actually help? Did you long ago leave stodgy rotisserie fantasy behind for more cutting-edge leagues? Are you willing to spend a little time learning to use Excel to give yourself a competitive advantage in your league? No? Well, you can stop reading now as long as you vote for me. But if the answer is “yes,” we here at Fantasy Baseball DIY are going to show you how to build your own expert analysis, optimized for your league’s scoring system.

What are you going to need to build this statistical dream house? The requirements are:

• Raw Materials
: Access to the vast array of baseball information out there, especially player projection data (a Baseball Prospectus subscription will more than suffice).

• Tools
: Excel (or Access) is enough to do the trick.

• A Little Know-How
: Some basic data manipulation skills, such as how to use Pivot Tables and the VLOOKUP function in Excel. If you don’t have them yet, consider an off-season workout regimen to prepare for 2010.

• Desire
: A willingness to get your hands dirty, programmatically speaking.

Key Decisions

Throughout the course of the season, the fantasy owner makes three key decisions:

1. Whom do I draft?
2. On a given night, who is active and who is reserve?
3. What mid-season roster moves do I make?

While all are important, I’ve been able to win leagues simply by drafting a better team at the beginning and making as few mid-season roster moves as possible. Similarly, I’ve seen owners botch the draft so badly that no amount of mid-season wheeling and dealing could get them out of the hole they had dug. At Fantasy Baseball DIY, we will cover all three questions, but for the first few installments we will focus on the draft.

League Context

While the techniques that we present can be adjusted to match the scoring and settings of your league (hey, that’s the point of Fantasy Baseball DIY), I’m going to use my favorite league as an example. I play in a non-keeper head-to-head league of 20 teams, followed by a three-week playoff for the top six teams. We select 24 players on draft day, but, on any given night, the active roster consists of 10 position players (one at each fielding position plus two utility players) and 7 pitchers (two must be starters and two must be relievers). As for scoring, there are six offensive categories (R, HR, RBI, SB, OBP, SLG) and six pitching categories (W, SV, K, WHIP, ERA, K/BB). The selection of players is a live draft, not an auction.

Marginal Positional Value

As most fantasy baseball owners know, the key concept of player valuation is determining a player’s marginal positional value (MPV) compared to other players at the same position. Although Albert Pujols has better overall numbers, his MPV (compared to other first basemen) is lower than that of the top players, like Hanley Ramirez, at the traditionally weak offensive positions (SS, 2B, or C).

The first step to calculating a player’s MPV is to estimate the relevant stats for all likely drafted players in the upcoming year. By being a Fantasy DIYer, you can use whichever projection system you like (PECOTA, CHONE, Marcel, ZiPS, an average of many systems, etc.). There’s no need to create your own projection system (unless you want to). After all, Bob Vila doesn’t make his own bricks. He just uses quality raw materials from a supplier he trusts. Personally, I like combining two BP data sets, the downloadable PECOTA projections and the depth charts, which refine the expected playing time.

Note to Baseball Prospectus: a great addition for next year would be a downloadable file of the plate appearances and innings pitched used in the depth charts, preferably with the HoweID key for each player for easy joining with PECOTA projections.
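In Excel, joining projections to playing time is a VLOOKUP job. For readers who think better in code, here is a minimal Python sketch of the same idea. The player IDs, field names, and numbers are invented for illustration and are not real PECOTA or depth-chart values. Scaling per-PA rates by the depth-chart playing time is also an assumption about how the two data sets would be combined.

```python
# Hypothetical per-PA rates from a projection file, keyed by player ID.
projections = {
    "p001": {"name": "Player A", "hr_per_pa": 0.045, "r_per_pa": 0.14},
    "p002": {"name": "Player B", "hr_per_pa": 0.020, "r_per_pa": 0.12},
    "p003": {"name": "Player C", "hr_per_pa": 0.030, "r_per_pa": 0.10},
}

# Hypothetical playing-time estimates from a depth chart, using the same key.
# Player C has no entry, the way a VLOOKUP would come back #N/A.
depth_chart_pa = {"p001": 600, "p002": 450}


def scale_to_playing_time(projections, depth_chart_pa):
    """Join the two sources on player ID and scale rates by expected PA."""
    rows = []
    for pid, proj in projections.items():
        pa = depth_chart_pa.get(pid)  # exact-match lookup, like VLOOKUP(..., FALSE)
        if pa is None:
            continue  # no playing-time estimate, so leave the player out
        rows.append({
            "id": pid,
            "name": proj["name"],
            "PA": pa,
            "HR": round(proj["hr_per_pa"] * pa),
            "R": round(proj["r_per_pa"] * pa),
        })
    return rows


players = scale_to_playing_time(projections, depth_chart_pa)
```

Keying both sources on one ID column is what makes the lookup reliable; matching on player names tends to break on duplicates and spelling differences.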

Baseline Calculation

The first step in determining MPV is to calculate a baseline for each position. To do this, we average the projections of every player who:

• is eligible at that position and
• is projected to get over 200 plate appearances.

The table below shows the baseline numbers we projected for each position for 2009:

```
Position   PA    R   HR  RBI  SB   OBP   SLG
1B        519   69   20   75   3  .358  .472
2B        461   58   10   50   9  .337  .412
3B        445   56   15   60   5  .340  .448
SS        441   53    8   45  10  .331  .395
C         382   43   11   47   2  .333  .419
RF        470   62   15   61   7  .345  .450
CF        460   62   12   52  15  .339  .425
LF        466   62   17   63   8  .345  .458
Avg       453   57   13   56   8  .341  .435
```
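The averaging step can be sketched in a few lines of Python. Only the eligibility rule and the 200 PA cutoff come from the text above. The player rows are made up, and counting a multi-position player in every pool he is eligible for is an assumption.

```python
from collections import defaultdict

MIN_PA = 200  # cutoff from the article: more than 200 projected plate appearances

# Hypothetical projections; "pos" lists every position a player is eligible at.
players = [
    {"name": "A", "pos": ["SS"], "PA": 600, "HR": 12, "SB": 20},
    {"name": "B", "pos": ["SS", "2B"], "PA": 400, "HR": 8, "SB": 10},
    {"name": "C", "pos": ["2B"], "PA": 500, "HR": 10, "SB": 6},
    {"name": "D", "pos": ["SS"], "PA": 150, "HR": 2, "SB": 1},  # under the cutoff
]


def positional_baselines(players, stats=("PA", "HR", "SB")):
    """Average each stat over the players eligible at each position."""
    pools = defaultdict(list)
    for p in players:
        if p["PA"] > MIN_PA:
            for pos in p["pos"]:  # assumption: a player counts in each pool
                pools[pos].append(p)
    return {
        pos: {s: sum(p[s] for p in pool) / len(pool) for s in stats}
        for pos, pool in pools.items()
    }


baselines = positional_baselines(players)
```

In Excel, a Pivot Table with position as the row field and an average of each stat does the same job, once the sub-200 PA players are filtered out.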

Based on a team that has each position player plus two utility players (which I fill in with an average player), my baseline team stats are:

```
              PA    R   HR  RBI  SB   OBP   SLG
Team Total  4552  578  132  565  74  .341  .435
```
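As a sanity check, the team line can be rebuilt from the positional table: add up the eight positions, then add two average players for the utility slots. The Python sketch below uses the rounded values printed above, so it lands within a couple of units of the article's totals rather than matching them exactly. The RBI figure comes to 565, which is also the value consistent with the Team+Branyan and Team+Wright rows later on.

```python
# Counting-stat baselines copied from the positional table above.
STATS = ("PA", "R", "HR", "RBI", "SB")
POSITIONS = {
    "1B": (519, 69, 20, 75, 3),
    "2B": (461, 58, 10, 50, 9),
    "3B": (445, 56, 15, 60, 5),
    "SS": (441, 53, 8, 45, 10),
    "C": (382, 43, 11, 47, 2),
    "RF": (470, 62, 15, 61, 7),
    "CF": (460, 62, 12, 52, 15),
    "LF": (466, 62, 17, 63, 8),
}
AVERAGE = (453, 57, 13, 56, 8)  # the "Avg" row
UTILITY_SLOTS = 2


def team_baseline():
    """Sum one player per position plus two average utility players."""
    totals = {}
    for i, stat in enumerate(STATS):
        starters = sum(row[i] for row in POSITIONS.values())
        totals[stat] = starters + UTILITY_SLOTS * AVERAGE[i]
    return totals


team = team_baseline()
```

The rate categories (OBP and SLG) are not summed this way; they have to be re-weighted by playing time.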

Examples of MPV Calculation

MPV is calculated by determining how a given player, if inserted into the baseline roster, changes the team’s final statistics in each category. Let’s take two examples, both third basemen, to show how the MPV calculation works: David Wright and Russell Branyan. Based on the PECOTA projections and depth charts from late February (my draft this year was in early March), the projections for the two were:

```
Player     PA   R   HR  RBI  SB   OBP   SLG
Branyan   463   61  26   75   6  .335  .491
Wright    694  119  32  106  20  .400  .538
```

By inserting Branyan or Wright into the baseline team’s lineup in place of the “average” third baseman, we can calculate the baseline team’s improved totals. As our more astute readers will notice, these are “first principles” calculations similar to those behind stats like MLVr and VORP, but geared to the scoring of the league rather than to run production in the major leagues. The table below shows the new team totals with Branyan inserted (first row) and with Wright inserted (third row). The second and fourth rows show the percent improvement in each category over the baseline.

```
                        R   HR  RBI   SB  OBP*  SLG*  MPV
Team+Branyan          581  143  580   75  .340  .441
Branyan Improvement    1%   8%   3%   1%   -2%    7%   18

Team+Wright           639  149  611   89  .350  .450
Wright Improvement    11%  13%   8%  20%   14%   18%   84
```
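The substitution itself is simple arithmetic: subtract the average third baseman, add the new player. The Python sketch below uses the rounded figures printed in the tables, with the baseline RBI total taken as 565 (the sum of the positional baselines plus two average players). Re-weighting OBP by plate appearances is an assumption about how the rate stats were combined. SLG is left out because it needs at-bats, which the tables do not show.

```python
# Baseline team, average third baseman, and the two players, from the tables.
TEAM = {"PA": 4552, "R": 578, "HR": 132, "RBI": 565, "SB": 74, "OBP": 0.341}
AVG_3B = {"PA": 445, "R": 56, "HR": 15, "RBI": 60, "SB": 5, "OBP": 0.340}
BRANYAN = {"PA": 463, "R": 61, "HR": 26, "RBI": 75, "SB": 6, "OBP": 0.335}
WRIGHT = {"PA": 694, "R": 119, "HR": 32, "RBI": 106, "SB": 20, "OBP": 0.400}

COUNTING = ("PA", "R", "HR", "RBI", "SB")


def swap_in(team, out, player):
    """Team line after replacing `out` with `player`."""
    new = {k: team[k] - out[k] + player[k] for k in COUNTING}
    # Assumption: team OBP is the PA-weighted blend of its players' OBPs.
    times_on_base = (
        team["OBP"] * team["PA"]
        - out["OBP"] * out["PA"]
        + player["OBP"] * player["PA"]
    )
    new["OBP"] = round(times_on_base / new["PA"], 3)
    return new


with_branyan = swap_in(TEAM, AVG_3B, BRANYAN)
with_wright = swap_in(TEAM, AVG_3B, WRIGHT)
```

With these inputs, HR, RBI, SB, and OBP match the table. Runs come out two higher than the printed 581 and 639, presumably because the published baselines are rounded.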

The MPV column is simply the sum of the percent increases across the scoring categories, since in this league each category counts equally. We also calculate each player’s MPV as if he were inserted into a utility slot. As the draft proceeds, if all of a player’s eligible position spots get filled, he will likely become a utility player. When that happens, his MPV relative to the average utility player is the better ranking.

Note that the percent improvement for OBP is measured against a reasonable floor of .275 rather than against zero: the change in team OBP is divided by the baseline’s margin over .275. For Wright, that is (.350 - .341) / (.341 - .275), or about 14%. We do the same for SLG, using a floor of .350. I find that this puts the rate categories on a scale similar to the counting categories.
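Putting the percent improvements and the rate-stat floors together, the MPV score can be sketched as below. The team lines are copied from the tables, with the baseline RBI total taken as 565. Rounding each category to a whole percent before summing is an assumption, chosen because it reproduces the printed scores of 18 and 84.

```python
# Team lines from the tables above (baseline RBI taken as 565).
BASELINE = {"R": 578, "HR": 132, "RBI": 565, "SB": 74, "OBP": 0.341, "SLG": 0.435}
WITH_BRANYAN = {"R": 581, "HR": 143, "RBI": 580, "SB": 75, "OBP": 0.340, "SLG": 0.441}
WITH_WRIGHT = {"R": 639, "HR": 149, "RBI": 611, "SB": 89, "OBP": 0.350, "SLG": 0.450}

# Floors for the rate categories, as described in the text.
FLOORS = {"OBP": 0.275, "SLG": 0.350}


def pct_improvement(cat, new, base):
    """Whole-percent gain over the baseline, measured above the category floor."""
    floor = FLOORS.get(cat, 0.0)  # counting categories use a floor of zero
    return round(100 * (new - base) / (base - floor))


def mpv(new_team, baseline=BASELINE):
    """Sum of per-category improvements; every category counts equally."""
    return sum(pct_improvement(c, new_team[c], baseline[c]) for c in baseline)


branyan_mpv = mpv(WITH_BRANYAN)
wright_mpv = mpv(WITH_WRIGHT)
```

A league that values some categories more than others could multiply each term by a weight before summing; equal weights match the league described here.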

Personally, I also note if the MPV for a player is heavily weighted by one single category, such as SB for hitters or SV for pitchers, where the majority of points scored come from just a single player or two. A player like Jacoby Ellsbury may be overvalued, because most of his benefit will come from stolen bases. If your team also has another stealing threat (like Jose Reyes), the additional benefit of Ellsbury is not as great.

After calculating the MPV for every player, we create a sorted list for each position. Each position list includes all players who are eligible at the position or are likely to become eligible. For example, Russell Branyan was not eligible at 1B to start the season, but it was pretty clear in Spring Training that he would be the starting first baseman. Next to each name, we put the player’s MPV at that position followed by his MPV in a utility role. When it is our turn to select in the draft, we simply look at the top of every position list and select the player with the highest MPV on any of them. See the table below for an example of the top of the lists at a few positions this year.

```
1B                    2B                     SS
Pujols,A.  (76,91)    Utley,C.    (59,52)    Ramirez,H. (104,91)
Berkman,L. (44,58)    Kinsler,I.  (58,52)    Reyes,J.   ( 98,91)
Howard,R.  (39,53)    Roberts,B.  (51,44)    Rollins,J. ( 72,59)
Cabrera,M. (36,50)    Phillips,B. (43,37)    Tulowitzki ( 31,18)
```
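The selection rule can be sketched in Python using the numbers from the table. The `taken` and `filled` arguments are hypothetical bookkeeping for players already drafted and for positions already filled on your own roster. Switching to the utility MPV once a position is filled is an assumption based on the earlier description of the utility score.

```python
# (player, positional MPV, utility MPV) rows copied from the table above.
BOARD = {
    "1B": [("Pujols,A.", 76, 91), ("Berkman,L.", 44, 58),
           ("Howard,R.", 39, 53), ("Cabrera,M.", 36, 50)],
    "2B": [("Utley,C.", 59, 52), ("Kinsler,I.", 58, 52),
           ("Roberts,B.", 51, 44), ("Phillips,B.", 43, 37)],
    "SS": [("Ramirez,H.", 104, 91), ("Reyes,J.", 98, 91),
           ("Rollins,J.", 72, 59), ("Tulowitzki", 31, 18)],
}


def best_pick(board, taken=(), filled=()):
    """Return (score, player, slot) for the best remaining player.

    Assumption: players at a position listed in `filled` are scored by
    their utility MPV instead of their positional MPV.
    """
    best = None
    for pos, rows in board.items():
        for name, pos_mpv, util_mpv in rows:
            if name in taken:
                continue
            score, slot = (util_mpv, "UT") if pos in filled else (pos_mpv, pos)
            if best is None or score > best[0]:
                best = (score, name, slot)
    return best


first_pick = best_pick(BOARD)
```

For example, `best_pick(BOARD, taken={"Ramirez,H."}, filled={"SS"})` scores the remaining shortstops by their utility numbers, so Reyes is compared at 91 rather than 98.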

There are two distinct decision-making processes in the draft. We have discussed the first: the preparation before the draft, which lends itself to sabermetric-type analysis. The second is the set of adjustments you make as the draft unfolds. Those adjustments are more like a poker game, which is about reading your opponents, understanding what they are trying to accomplish, and tweaking your own strategy to compensate. If you’ve done good prep work, the number of tweaks you need to make will be minimal.

Next Time on Fantasy Baseball DIY

By looking at the counter in Microsoft Word, it seems that I’m out of words today. Please keep an eye out for the next installments of Fantasy Baseball DIY, where we will:

• Describe the MPV calculation for pitchers
• Adjust the MPV calculation to produce maximum bid prices for auction leagues
• Incorporate team health reports and Beta values from the PECOTA projections in our MPV calculation so that we understand the risk in each of our scoring categories.

Goodbye, and hope to see you next time at Fantasy Baseball DIY.

wcarroll
5/31
Pivot tables are basic? I imagine some people will flip over this article, but that's a subset of a subset. For someone like me -- and even at BP, I have a hard time believing it's a minority -- this article does nothing for me and I'm lost by the time he says "VLOOKUP." Tim wrote something that was interesting to him and I think he ignored the vast majority of the audience by selecting this topic.
kgoldstein
5/31
I have to disagree with Will a bit. I can't do Pivot Tables or VLookups either, but I still understood the process explained and feel like I could do this without using them -- it'd be sloppy, but it'd work. I thought there was some interesting stuff, but (and maybe it's just me), I'm still not convinced that standard addition/subtraction/multiplication/division properly values elite players. Because of their scarcity, a bell curve seems to make more sense.
ckahrl
5/31
While I take Will's point as far as technical expertise, I think there are several things that shine through: the fun that Tim's having in writing these pieces comes through loud and clear, and a writer having fun while making an argument (or at least giving that impression) is someone who can make it as someone who lives or dies by the keyboard. Second, whatever the level of technical difficulty, following his method was decidedly *not* difficult, and his results supported and illustrated his argument really very well. Third, what's wrong with a high-end complicated suggestion for Fantasy content on BP.com? The generalist's market is well populated by a lot of columnists, some very good. What Tim does here is really take things up a notch and show what is possible by setting the bar a bit higher.
daiheide
5/31
For the second week in a row, Will Carroll complains about an article that includes some technical detail (but, frankly, not all that much). That's a fine position to have, but it seems strange coming from a staff member at Baseball Prospectus, which is known for - and frequently submits its willing readers to - pretty sophisticated and technical statistical analysis.

Tim's piece is, to my mind, very good. He explains a *very* valuable fantasy tool clearly and with ease. Especially for fantasy players who are relatively new to the game, this article has the potential to have a pretty dramatic effect on how they approach a season. It's hard to see how that doesn't meet the goal of the week's competition. While the more technically naive among us will have to do some work to make use of Tim's method, it's not too much work: given the tools available, calculating percentages (basically) is easy.

So, I don't get Will's problem: Tim compliments BP readers by assuming they're smart, but he doesn't go so far as to assume they're degree students in a statistics program. As I understand, that's been the tone BP has been trying to strike for as long as it has been around. What's the complaint?
rjblakel
5/31
This. Bring on more articles that challenge the reader. Most serious fantasy players use an Excel spreadsheet to render/manipulate their personal rankings... I love reading about different techniques to help formulate them.

Thumbs up.
rbross
6/01
Seriously. I love Will's regular column, his enthusiasm for baseball, and the fact that he nearly always responds to both comments and emails.

But it's a bit discouraging (and puzzling) to see him so hell-bent on criticizing every article that is even the least bit statistically sophisticated. I'm all for analytical diversity at BP, but I actually think there is currently a shortage of top-shelf statistical analysis among the regular contributors to the website and this contest could be a great opportunity to hire someone who can make up for that deficiency.

For this piece: I don't like the tone at all but the analysis is great. When I considered entering this competition, I thought that I'd write a similar piece on how I use various spreadsheets for my fantasy drafts. Tim, you've vindicated my decision to not write it! Not only do you articulate things much better than I ever could, you provide new tools that I'll definitely try to use in the future.

Thumbs up.
wcarroll
6/01
I want to find someone who can take those amazing stats and make them understandable to someone like me. I'm not stupid, but I can't do math, don't use Excel, and when someone starts out an article with "get out your spreadsheet", I'm out of there. As I said, there's an element of people that will love this and for that I'm very glad. It's just not for me. Remember that judging is just an opinion. You're the ones voting.
Oleoay
6/01
Fantasy baseball kind of revolves around math and comparisons/rankings (which are often easier to analyze in spreadsheet form). Even casual fans will usually come up with a draft list of some sort and a list requires some way to sort/rank the players.

For those who only draft people from their local team or their favorite players, then neither this kind of analysis nor method will work for them. But then again, the people who draft along those lines probably won't come to BP to read about better drafting techniques.

Of course, the real draft wizards will also bring a copy of Will's latest UTK and a list of those he says are TRIPping. :) The nice thing about that is all I have to look for is the number of days the player's theorized to be out and whether the injury issue will continue to affect the player's performance.
fsumatthunter
6/01
Will,

I absolutely hear what you are saying. I am not the most stats driven person. I have to work my way through research articles. However, I think you have really taken the non-stats analysis stance thus far to a new extreme. Any type of math you have seemed to make a comment against. This seems odd to me on BP, just for the record.
wcarroll
6/01
I disagree. I would compare my criticism here to what I said about Brian Cartwright's article from last week. There's a large (even at BP) audience that simply can't do this and has no interest in doing it. I can't calculate VORP or EqA, but I understand the concepts and can read the lists. With Brian's last week, I thought a reader had to work too hard to grasp a "basic" concept. With Tim's, it was that most people wouldn't even start trying. I don't think any article should by definition alienate a majority of readers.

That said, I liked the writing on both articles. I just think there's a 'sweet spot' like what Matt Swartz did this week. It's a VERY technical article and yet is readable for someone like me.
warmsox
6/01
Will, I've got your back on this. I use Excel and Access nearly every day at work and can do the things he mentioned but I totally turned off on the opening paragraph.

I'm probably more of a casual reader than the majority of the BP reader base but this article missed the mark with me since I'm looking for something that is more easily digestible and that I don't have to spend time working toward the conclusion away from the actual article.
dpowell
6/01
Yeah, this comment is ridiculous. To kind of relate this to American Idol...say this were a general talent show and this was "music week." The next Whitney Houston (or whatever, pick someone who's a talented singer) performs and nails her performance. Will's comment would be, "Great singing. But I like drums." More ridiculous when you consider that the audience watching the show really likes singing. This analogy isn't working so I'm going to stop. (Also, I'm not suggesting Tim will develop a drug addiction and lose his ability to write good articles.)

This was a great article. I had never really considered putting together my own fantasy program until this article and it gave me a useful framework to use. I also probably wouldn't have used Excel since I typically use other statistical programs, but Tim told me the exact commands that I would use in Excel. I'm much more likely to use Excel now. Should he have detailed the VLOOKUP command? Definitely not. I'm happy to research that myself. There really wasn't anything complicated in the article so I don't understand the complaint.
wcarroll
6/01
The better analogy would be if there was a great opera singer on American Idol. Would it work? Maybe. Some would like, some wouldn't, but I'd say that opera fans < pop music fans, but that they're probably more passionate and less fickle. Saying that Tim's written a great article for a small audience isn't a bad thing.
tkniker
6/01
Well if you go more with Britain's Got Talent, then even though that show is geared more to pop culture, a good opera singer (Paul Potts) won the first year.... so, Will, there may be HOPE for me!
Oleoay
6/02
On that note, I think Diversity deserved to win.
fsumatthunter
6/02
Well, in that case let's bring in Bill Plaschke for the competition. Because people who think RBI is great > people who think OBP is great.
Oleoay
6/02
And the alligator always eats the larger cookie.

beitvash
6/02
That's a good analogy, Will.

However, you do realize that BP is kind of the "opera" of baseball analysis, right? I understand your point that articles need to be accessible, but I pay for a BP subscription instead of reading a free si.com/espn.com because I want something more. Heck, I love your injury column, but that information certainly isn't "pop," know what I mean?
Oleoay
6/02
This will take things on a completely different tangent, but it seems most sabermetric analysis comes from Americans... I'm kind of curious if research is being done by native Dominicans, Cubans, Japanese, etc.
hotstatrat
6/02
Don't forget Canadians. (I happen to be a dual American/Canadian.) I am regular on a couple of Yahoo Group forums on Scoresheet Baseball (http://sports.groups.yahoo.com/group/mcscoresheet/messages) and (http://sports.groups.yahoo.com/group/scoresheet-talk/messages). I'd comfortably say a third of the brainiest contributors are Canadian despite being a country one tenth as large with one thirtieth the number of Major League teams. Most of the minor league teams seem to have disappeared as well. What happened to Calgary, Edmonton, London, and Ottawa?

Perhaps the Japanese and Spanish-speaking sabermetricians aren't reaching us because they aren't writing in English.
Oleoay
6/02
I'd like to say I remembered Canadians but counted them as having a voice since they have an MLB team... but that'd be hindsight 20/20 rewriting history stuff. Sorry for forgetting Canadians.

As far as the other leagues go, I would think people from other countries who follow sabermetrics would've picked up a bit of English. I also acknowledge that the math itself is pretty universal regardless of native language.
hotstatrat
6/03
Perhaps there are cultural and economic factors, too. Latinos tend to be more action orientated (or so I generalize based on my cultural prejudice). They would rather play baseball than read and ponder the mathematical possibilities in it. More significantly, the Cubans, the Dominicans, and to a lesser degree the Mexicans and Venezuelans may not be able to afford the computers needed to keep up with the Sabermetric explosion.

The Japanese are probably way ahead of us on computers, but look at baseball differently than we do, generally. To them, baseball is more of an audience participatory event. Although, they are enthusiastic strategists. Perhaps, Japanese just don't make time in their lives for such hobbies as studying baseball stats.

I don't know, I still think it is more of a language issue mixed with it being a cultural issue. People are people. Bill James started the Sabermetric explosion here in North America thanks to that SI article on him in 1982(?). I'm sure there are thousands of young fantasy players and Sabermetric baseball fans who never read him, but his popularity perfectly coincided with the exponential growth of Rotisserie leagues. That snowballed thanks to USA Today, then everyone's access to websites and the ease of website building. At least, that's my take on history. The Japanese are generally xenophobic. The Sabermetric snowball started in the U.S. from an American who writes in English. It may just not have rolled over the Japanese, yet.

Either that, or their work is in Japanese (or Spanish, Dutch, Korean, or Mandarin) and we haven't read it.
SkyKing162
6/01
I'm a stats guy and I'd like to read more of these style articles, too. I'd say, however, the winner of this contest doesn't have to fit that mold, just like every author doesn't have to fit that mold. Another reason why there will likely be (and should be) more than one "winner".
llewdor
6/01
I hate Excel. I do most of my data analysis in Access (I'm a professional database guy).

tkniker
6/01
Hey, personally, I'm not wild about either Excel or Access, but I figured Excel is a bit easier if one is not familiar with either, or at least Excel has the greater exposure.

Given that for a fantasy draft we only need to evaluate under 2000 players, I'll take the quick mark-up of Excel. However, if we're doing some Retrosheet work, then Excel is going to be useless due to its size limitations, and I'll pop over to Access.

Thanks for the compliments!
Oleoay
6/01
Excel 2007 goes to 100k rows... how much size is needed?
tkniker
6/01
Well in Retrosheet for 2008 alone, there were about 140,000 plays, so if one wanted to do an analysis on the 2000s, you're already talking about 1,200,000 rows needed.
jrathkey
6/02
Actually, Excel 2007 has 1,048,576 rows to be exact. Most aspects of Excel are now only limited by the memory capacity of your computer. But using anywhere near all of the rows/columns in Excel 2007 would be very impractical and would bring most home PCs to their knees.
Oleoay
6/03
In theory, you could condense an inning or maybe even a game's worth of play-by-play data into one cell's worth of comma-delimited-type string, then parse it out. That might save some space.
schlicht
6/02
Well Will, this IS supposed to be the Fantasy week, and success in fantasy baseball requires that you be able to do two things: adequately project players' performances and be able to evaluate the relative contributions of players (pitcher vs batter; SS vs 1B, etc) to your team.
Tim's contribution addresses the 2nd of these.

Many of us are always on the lookout for ways to place overall weighted values on players.

I've developed a number of Excel-based algorithms over the years to address issues just like the ones he raises.
Tim's approach seems quite useful to me and I give him the thumb!!
SkyKing162
5/31
Not only does this article show people how to create their own fantasy rankings (an underrated aspect to victory in leagues with unique settings), it explains the basics of what they're all about. One of my favorites this week.

Side note: Pivot tables are WAY easier to create and use than most people realize. Find 15 minutes, a dataset you're interested in and someone to walk you through them. You'll be thankful you do.
ssimon
5/31
What I would really like to see -- either from a BP Idol contestant or a BP author/intern -- is a *tutorial* on how to do the kind of advanced analysis we love reading about. Tell us how to create a Retrosheet Database and manipulate it using Excel.

A step-by-step article for people who haven't been to math grad school would be really helpful.

Since I haven't had that kind of training (though I'd like to), I agree with Will that Tim's article lost me at the VLOOKUP line. I read through to the end and appreciated Tim's insight. But telling readers that "you can do it" without telling them *how* to do it is, imho, a mistake.
prospero14
5/31
I agree. I'm familiar with the idea of marginal positional value, and familiar with systems like KUBIAK and PFM that compute it. What I don't know how to do is use Excel to compute these values for me. This article didn't tell me how; he just threw around some technical terms.

Still, he did make the idea of MPV quite clear, and that's good.
tkniker
5/31
Thanks for the comments. I admit that on some levels this was a bit of a "shot in the dark" in that I don't always have a great feel for what is the technical proficiency of the BP readership.

As with "This Old House" or the dozens of other home improvement shows, there are sometimes things they show me that I just don't have the proficiency to do given my current set of skills; however, it does give me ideas of what skills will be helpful for me to develop in the near future. My use of the terms Pivot Table and VLOOKUP (which would typically be around Lesson 9 in a 12-part Beginning How to Use Excel course) was not simply a throw-away to make me look smart or whatever, but a way to ground the article so readers could gauge whether what I was proposing was in their skill set or not.

An article (or a few articles) on Excel for the Fantasy Owner would be great, but for the purposes of this contest (in terms of a 1500 word-limit) I tried to shoot for where I thought the middle would be in terms of technical proficiency. I may have missed my mark.

With that said, with a few days of work, I could easily write a quick step-by-step tutorial of how to do what I did in this article. Maybe offline somehow (or through this comment section), people can put down an e-mail address and I'll put something together and send something off for people who are interested.
ckahrl
5/31
Tim,

If you'd be willing to do a service to everyone in the community who could use the assist, just throw me your tutorial, and I'll run it in Unfiltered, where it'll be published in the free and clear for anyone to use, and where you'll be free to add on comments and respond to (perhaps inevitable) questions. Not that I can speak for the audience at large, but I would definitely appreciate it if you were game for something like this.

Christina
pokeysplayers27
5/31
I second this.
Oleoay
5/31
I third this.
eneff1
6/01
Part of me wants to fourth this, but would that be fair to the rest of the competitors? Don't get me wrong--Tim is clearly something special and I think it's becoming obvious that he's going to go deep in this competition. But he's going to be getting extra exposure. Just like the rest of the Idols, he was given a word limit on fantasy. I hate to nitpick, but I at least want that factor to be considered, in case it hasn't been.
rbross
6/01
As much as I'd like to see Tim's tutorial, I think I'm going to have to agree with eneff1 here. However, is anything (formatting? family obligations?) stopping him from posting it in the comment section?
Oleoay
6/02
Nope it's not fair. But I really don't care either. I want whoever wins to be active on this site. Some finalists aren't even responding to feedback. Meanwhile, others are doing additional research on their articles and providing great insight on the threads of other competitors. I like the additional commentary Brian provides, for example (and he's not the only one). I'd also think that if other finalists had a great idea for a tutorial or an article, BP would give them that opportunity as well.
tkniker
5/31
Due to family visiting this week, it's going to be tough enough to get next week's article (assuming I move on) and my regular job done; however, I'd be happy to get something cranked out on this by mid-June.

I'm completely game for writing a tutorial on this, plus attaching a Basic Excel spreadsheet with the Pivot Tables and VLOOKUPs etc that is at least set up initially.
Oleoay
5/31
If you want some VBA help, Tim, I can assist with that.

Keep in mind that a standardized sheet might be hard because the input data can vary from one database to another... for example, you might set your database to vlookup to home runs from the PECOTA download, but a different database might have home runs in a different column.
JayhawkBill
5/31
I'll bone up on Pivot Tables and VLOOKUPS and give you a thumbs up. You're aiming for the high end of the BP community with regard to statistical analysis, and, even if I don't normally use those Excel functions, I want a writer capable of writing at this level to win.
Severe
5/31
This is actually just what I wanted to read, though I would appreciate more detail on the specifics in the author's future editions. If you work with Excel for a day job like me, but have yet to really apply it to do interesting things for baseball analysis, this is worthwhile stuff.
Oleoay
5/31
I'm a bit of an Excel wizard, so I'll admit my bias and say I didn't get stumped with pivot tables and vlookups.

Very minor quibbles:
Many people consider Pujols a top three or top pick in almost any format. I think you needed to elaborate on why you think he is not a top three pick or else you'll lose a lot of readers.

It might have helped the word count to just send the "Note to Baseball Prospectus: " to customer service. Either way, it's not a good idea to mention a word count as being a limiting factor.

Since some statistics are more scarce than others, I think you could've discussed the option of weighting MPV by applying a multiplier to a category like stolen bases.

I know one of the finalists got some negative comments for using a title and subtitle to start up an article theme but I didn't mind it there, nor here.

A bigger quibble
I'll admit I'm a bit confused which "team" you are using to compare Branyan/Wright.... I assume it's the "Team Total 4552" line, but then I don't see how the difference between a team+wright OBP of .350 and a Team Total OBP of .341 comes out to a "14%" difference. Is that a .009/.350 or something?

Personally, I found the article easy to understand and think I would've understood the methodology behind MPV even if I didn't know Excel. I think the tone of the writing was strong, fun, and a pleasure to read. I like how you indicate that databases would be downloaded. I would even recommend that you could create a mock spreadsheet to attach to the article that people could download. That way, you spend more time discussing MPV without scaring people off with Excel. Even then, the concepts discussed were easy to understand and most of the examples (besides the math on the Wright/Branyan section) were well presented. I also like that you defined the kind of league you used based on style, roster size, and categories used.

Another great job, and another thumbs-up from me.
tkniker
5/31

On the "word count", not sure if you are referring to my closing in the article or to one of my comments. Obviously, as my closing, I was trying to mimic a TV show when they always (annoyingly) say "Well, by the clock on the wall, it seems we are out of time". As for one of my comments, I was simply referring to the fact that also including an Excel tutorial is going to be a 5000 word thing (also with lots of screen shots to help). I was just saying more of "That's a completely different kettle of fish" than a 1500 word article.

Once again, thanks for your comments, and I can't wait to see your updated Idol Hit List on Tuesday...
Oleoay
5/31
It was my birthday yesterday and after a fun Rockies game sandwiched between rounds of drinking and karaoke, then waking up circa 7am MST to do some work-related work, I took a nap.

The actual word count line I was referring to was "By looking at the counter in Microsoft Word, it seems that I'm out of words today." but I can see what you mean. I think you can get away with that a bit easier once the competition is over, but in the context of a competition, it almost comes across as an apology. As I said though, with your further clarification, I understand what you were driving at.

The Hit List thing was fun and I'll probably do it again :) I'm glad you liked it.
hotstatrat
6/01
What happened to that Hit List comment? I thought it was placed at the end of the "Meet the Finalists" (http://www.baseballprospectus.com/article.php?articleid=8885) or the Round Two Heading article (http://www.baseballprospectus.com/bpidol/) where there are no longer any comments.
Oleoay
6/01
It was linked to the Round 1 The Basics introduction, which doesn't have a direct link anywhere really except from the archives.

http://baseballprospectus.com/article.php?articleid=8942
tkniker
6/01

The Team+Wright or Team+Branyan comparison is to the baseline team (the Team Totals line). I really could have made that a bit clearer.

So the next question is how is .009/.350 = 14%? It's not, but as I said, I'm comparing to a "zero point" of .275 (another thing I could have made a little clearer) so it's really more like (.350 - .341) / (.341 - .275) = 13.6% ~ 14%.
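For readers who want to check that arithmetic, here is a minimal Python sketch of the calculation as described in this comment. The function name `pct_improvement` is made up for illustration and is not something from the article's spreadsheet.

```python
def pct_improvement(with_player, baseline, zero_point=0.0):
    """Gain over the baseline team, as a share of how far the
    baseline already sits above the chosen zero point."""
    return (with_player - baseline) / (baseline - zero_point)

# Team+Wright OBP of .350 vs. baseline team OBP of .341, zero point .275
obp_gain = pct_improvement(0.350, 0.341, zero_point=0.275)
print(f"{obp_gain:.1%}")  # 13.6%

# The same gain measured from a zero point of 0 is much smaller
naive_gain = pct_improvement(0.350, 0.341)
print(f"{naive_gain:.1%}")  # 2.6%
```

The second call shows why several readers arrived at roughly 3%: they measured from zero rather than from .275.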

Based on this comment and those of ryneestabrook, I definitely could see some benefit to maybe rezeroing the whole thing compared to a baseline of some type of replacement. I'm always a little reluctant to change as this has served me very well over the last few years.
Oleoay
6/01
Thank you for clarifying. I do agree with ryneestabrook in that the replacement level definition wasn't quite clear. How informative is it to set a "zero point" of .275 unless Cesar Izturis is the only player left on the board? Especially since there isn't a "zero point" for some counting stats like runs scored etc. that any full-time starter just sort of naturally accumulates. Also, by just doing the calculations from a "zero point" of 0, it's easier to track fantasy value fluctuations from year to year without adding that extra step.
vandorn
6/01
I'm not sure an elaboration on Pujols was necessary. Articles about "who should be taken in the first three picks" are beaten to death in the fantasy world. The real work is in filling out the roster with quality players, and this article told you how to do that.
Oleoay
6/01
I agree that articles about who should be taken in the first three picks are quite common. My thought along those lines is that since this is an evaluation system, it might add additional credibility if Pujols not being ranked #1 was explained a bit to further explain how the system works.
vandorn
6/01
I should have made that clearer. Ideally an elaboration on Pujols's lack of relative value would be good, but in an article with a word limit it would come in exchange for leaving something out. And I like the stuff he did include.
Oleoay
6/02
I liked the stuff he included too. But I do see your point. That's another reason why I was suggesting the word count limits come off at least one of these weeks, but there was an indication that people are already having problems getting around to reading every article.
mattseward
5/31
Some criticism

As far as I can tell, much of the difference in the comparison between Wright and Branyan came from at bats. Clearly, in a daily league, even a replacement level player filling those spots for 200 ABs would have greatly boosted the rate stats.

Secondly I do have a problem with using this value as the only tool as first of all you do not necessarily maximise the value of your team by taking the best ranked player and it would be better to use this system to tier the players and look where the drop off from one player to another is which isn't necessarily the same as the one with the highest score.

Overall though I thought the stats were fine and followed through well but I did feel that perhaps more effort could have gone into what happens if you take say Josh Hamilton over Ian Kinsler say for the rest of the draft to give people a flavour of how it was used in practice.

So some criticism but I enjoyed the article and voted it up
tkniker
6/01

But those 200 ABs are important. Even in my league where you have 7 bench players, I like to fill these up with extra starters so for a week where I'm slipping on wins or Ks I can rotate in a lot more starters.

If I've got two position players and one is going to get 200 fewer ABs, which shows up as a decrease in rates, that's possibly one more position player forced to be a 3B (or 1B) on the bench. I guess I'm someone who really likes his bench to load up on pitchers.
DLegler21
5/31
Good stuff. I started doing this type of analysis myself when I first learned spreadsheets, then lacking any knowledge of vlookups or pivot tables. I guess I had assumed much of the BP constituents had done the same. The skills I honed doing this for my hobby have somewhat led me to my current profession (Finance) and prove to be valuable skills used on a daily basis there.

For the uninitiated, pivot tables allow a very quick summary of data based on fields you choose. Vlookups are a way to link two datasets together when you have fields in one that you want to use with fields from another, or if you simply want to combine the data. Both are relatively simple to use and worthwhile to learn if you have any inclination to do statistical analysis - they will make the task much less daunting.

I use pivot tables in particular frequently to help in my Diamond Mind league - help decisions around lineup construction (both who to start and in what batting order), what opposing hitters to pitch around, and pinch hitting and bullpen decisions. Extremely valuable tools.
mhmosher
5/31
Simply perfect. GREAT work and much appreciated by a player in a less-than-traditional league.
ryneestabrook
6/01
Not a thumbs-up for me, because of a flaw in the "baseline-calculation" section. You reference that league context is important, and totally ignore it in your baseline calculations, preferring instead to reference every player to the average (qualifying) player. You instead should be looking at a league-specific replacement level.

As a quick example, consider the value of a full-time but poor-hitting catcher in either a one-catcher or a two-catcher 12-team league. In the one-catcher league, this catcher should have very little if any value, as the replacement-level catcher in this league is probably full-time (or close) and similarly useless without his chest protector. In a two-catcher league, this player is relatively valuable, as whoever you can find on the waiver wire is likely a part-time or reserve catcher.

This issue will also come to a head when there are differences in the variances of performance across positions, when one player is hands-down above the rest at his position, and a variety of other situations. The rest of the article was good, but if a fantasy player followed your advice in any league where the average (200 PA+) player was too far off from that league's replacement level, they'd just plain draft the wrong players.
tkniker
6/01
I see your point, but I do have to respectfully disagree.

#1) If you draft a catcher, then I suggest that you immediately switch the MPV you look at for the remaining catchers from the one compared to catchers to the one compared to average players.

#2) In your specific example of catcher, one of the things that happens is that a lot of the players that come to the top are going to be your better catchers, so that you don't get into the position of having to choose the full-time, poor-hitting catcher. For example, in my 20-team league, I had the 7th and the resulting 34th pick. At the 34th pick, I noticed catchers were not being taken, so I was choosing McCann because his MPV was so high. Once I took McCann though the remaining catchers dropped significantly because I started using their MPV compared to the average utility player.

#3) So I guess if you think the problem is qualifying versus replacement level, your recommended fix (which is pretty simple) is to change the qualifying PA so that the number of players who make up the qualifying list is roughly equal to the number of players who would be drafted? Am I correct in thinking that?
ryneestabrook
6/01

This is of importance when comparing players across positions (within any position, there is *relatively* little impact). Pretend it's late in a draft, and I only have two spots to fill: 2B and catcher. Whichever position I fill last should be almost exactly replacement level, so assume that player/position has no value.

Whatever valuation system I use should tell me who to draft next; the best available 2B or the best available catcher. Regardless of the number of catchers each team must roster, the 2B ratings won't change. If I'm in a one catcher league (and thus have not drafted a catcher yet), the choices at catcher should have much higher MPVs than the choices in a two catcher league (where I've already drafted one of my two catchers). The problem occurs that MPV doesn't move with the number of players on rosters in each league, and thus doesn't provide a true zero point to compare players across positions.

Regarding your third point, I think the use of the mean is part of the problem. In a 10 (or 12) team league, replacement level for any given position should be the 10th (or 12th) best player at that position, with caveats made for utility and other flexible position spots. I'd either move to a non-parametric approach (i.e., using ranks), or at least adjust the MPVs for each position such that a player with no value at each position (10th best player at a position in a 10 team league, assuming no players at that position get used in a UT spot) has a value of zero.
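One way to picture the rezeroing described above is the Python sketch below. It is only an illustration under stated assumptions: the function name, the player labels, and the MPV numbers are all invented, each player is assumed to have a single position, and utility spots are ignored.

```python
def rezero_by_position(mpv_by_player, position_of, n_teams):
    """Shift each position's MPVs so that the n_teams-th best player
    at that position ends up with a value of exactly zero."""
    by_pos = {}
    for player, mpv in mpv_by_player.items():
        by_pos.setdefault(position_of[player], []).append(mpv)

    replacement = {}
    for pos, values in by_pos.items():
        ranked = sorted(values, reverse=True)
        # If fewer than n_teams players are listed, fall back to the worst one.
        replacement[pos] = ranked[min(n_teams, len(ranked)) - 1]

    return {p: mpv - replacement[position_of[p]]
            for p, mpv in mpv_by_player.items()}

# Hypothetical 2-team league with made-up MPVs
mpv = {"C1": 10, "C2": 4, "C3": 1, "2B1": 12, "2B2": 9, "2B3": 8}
pos = {"C1": "C", "C2": "C", "C3": "C", "2B1": "2B", "2B2": "2B", "2B3": "2B"}
adjusted = rezero_by_position(mpv, pos, n_teams=2)
print(adjusted)
```

In this toy data the top catcher ends up ahead of the top second baseman after rezeroing (6 vs. 3), even though his raw MPV was lower (10 vs. 12). That is the cross-position effect being argued for.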
tkniker
6/01
Okay, I'm seeing your point a little more clearly. However, I do have one quibble point, which is if we are worrying about your last 2 or 3 players and they are at replacement, and as you said yourself "They have no value", then is this decision really going to make or break your draft? At this level, aren't the selections a bit more hit or miss, which you can fix on the waiver wire in a few weeks as the season shakes out?

My personal experiences (maybe yours are different) have been the mid rounds are the key, with an occasional issue in the top few rounds (when someone chases numbers and not MPV). In the first 3-4 rounds, you're almost always drafting solid players, so it's not a big issue. The only bad decisions I've seen in the first 3-4 rounds are when someone panics when the top SPs are off the board and they reach for a mid-level SP that isn't worth it. However, the mid rounds are HUGE because that's when some good ranking system tied to your league's specific scoring is worth its weight in gold. This is when the lesser owners draft names or numbers, but aren't thinking as much about position.

The one issue I have with replacement in fantasy league which is a little different than for VORP is a matter of sequence.

There is a little bit of a chicken and egg here, NO? How do I know who is the 10th best catcher in the league necessarily, until I do SOME type of valuation.

Also let's take an example of a 2B to determine the replacement level, but which is the replacement level, the poor hitter with 18 SB but has a .310 / .375 OBP SLG (a la Willie Bloomquist) or the one who has 2 SB but is more like .325 /.400 or a little pop. I could see how you designate "replacement" level having just as many problems.

I guess one could fix it with taking the 10th best R of 2B, the 10th best RBI, the 10th best SB, etc.
ryneestabrook
6/01
The same problem applies to any across-position valuation, from the last picks of the draft to choosing Hanley Ramirez over Albert Pujols. I certainly agree that leagues are typically won in the middle rounds (although you can lose leagues in the first rounds). The example I attempted to lay out pertains just as well to players at the top of the draft as at the bottom. Russell Martin has very different values in one and two catcher leagues, and comparing him to outfielders and shortstops is very dependent on what you think the replacement level for each position is. Your method doesn't account for league depth.

There is a chicken and egg issue, because we have to deal with multiple categories for each player. We have to project arrays (vectors) of data for each player into a single (scalar) value, which includes at least two components: how the different statistical categories relate to winning, and what the baseline or replacement level of each category is for each position. As you suggested, you can't really figure one out without the other.

Your Bloomquist example is a good one. I see two options to get around this, though there are certainly more. Both start with a preliminary valuation. The first is to use the MPV of the replacement level player at each position. Then adjust your MPVs such that freely available players have a value of zero, tweak to your heart's content, and go!

Another is to use the preliminary valuation simply to find the replacement level players, and come up with a new valuation based on who those players are. I typically get around the Bloomquist problem by using a kernel smoother; instead of taking the 10th best player, I'll use a weighted average of the 7th-13th best, for example. If you properly value each category (i.e., put them on a scale such that moving 1 unit in runs or SB equals x wins or x points), it will end up not mattering; Bloomquist and the slow but better hitter will have exactly the same value, so either can be used.
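A rough Python sketch of that smoothing idea follows. The comment does not say which weights are used, so the triangular weights here (heaviest on the 10th best, tapering to the 7th and 13th) are an assumption, as are the function name and the sample values.

```python
def smoothed_replacement(values, center_rank=10, half_width=3):
    """Weighted average of the players ranked around center_rank.

    Weights are triangular (an assumption): 1, 2, 3, 4, 3, 2, 1 for
    ranks 7 through 13 with the defaults."""
    ranked = sorted(values, reverse=True)
    total = 0.0
    weight_sum = 0.0
    for rank in range(center_rank - half_width, center_rank + half_width + 1):
        if 1 <= rank <= len(ranked):  # skip ranks that do not exist
            weight = half_width + 1 - abs(rank - center_rank)
            total += weight * ranked[rank - 1]
            weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no players fall inside the smoothing window")
    return total / weight_sum

# Made-up preliminary values for 20 second basemen: 20, 19, ..., 1
prelim = list(range(20, 0, -1))
replacement_level = smoothed_replacement(prelim)
print(replacement_level)  # 11.0
```

With evenly spaced values the smoothed level matches the 10th best player exactly. The smoother only changes the answer when the player who happens to sit 10th is an outlier relative to his neighbors.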

I've liked your responses, and your original entry. I made the point originally because a reader might have assumed this method adjusted for league context because you mentioned league context is important. Good luck in the contest.
Oleoay
6/01
Well, if it's late in a draft, there's almost no reason to not have a 2B and a catcher. I know some philosophies completely punt catching, but even then, you're basically saying "is my last round flier better than anything I can get on the waiver wire"? That's where MPV would come into play.

If you went through an entire draft and did not get a 2B or catcher, then the general MPV does not matter as much. What matters more is the highest MPV at each position that you have a need in.

And yes, the baseline might be a bit low if all players with at least 200 PA are counted, but that also depends on how deep your league is. If it's a shallow 5x5 casual league, the 200 PA types wouldn't have enough MPV to outrank a full-time player because the full-time player will generate more counting stats and thus, more MPV.

As an addendum, I might have to patent the usage of "quibble" :)
jtrichey
6/01
Thumbs up but barely. I loved his 1st 2 pieces which puts this one over the edge. This article finishes strong and really hammers the topic home nicely.
strupp
6/01
I really liked this article. And I'm really disappointed in Will's analysis of it.

For a lot of us, BP was an introduction to sabermetrics, and taught a lot of us to think differently, expand our thought processes, realize that there were more than a few ways to do and/or analyze something, etc. We didn't have Bill James, because James wasn't writing as much anymore, but we did have BP, Neyer, Shandler, etc.

I didn't get everything on the technical side. But I did realize I'd be able to learn, and it made it seem not so scary. That gave it a thumbs up on its own.
hotstatrat
6/01
I am a Scoresheet guy, not a Rotisserie/category type fantasy baseball player. However, this article kept me quite interested all the way through. I have never heard of Pivot Tables or VLookup, but had only a little trouble following this article and think it may well be a sensible approach to valuing players – except I do have a few qualms.

It requires a fair amount of work to stay on top of your necessary updates. I'm not sure how practical that is, but if your Pivot Tables make it a cinch, then I guess I should sign up for those lessons.

The part I didn't understand is where you get your percent improvement over the baseline. I didn't come up with the same percentages as you. For example, calculating Team-Wright's improvement in OBP: T-W's .350 - baseline .341 is .009. .009 / .341 is .026 or a rounded 3%. You say 14%.

You also say, "Note that the percent improvement calculation for OBP is compared to increasing the OBP over a reasonable low-level of .275." No, I do not see that either. .350 - .275 = .075 which is 27% of .275. That's not 14% either. How about .075/.341? No, that's 22%.

The other problem I have with this essay is that your baseline is a composite of average starters for your league rather than the bottom of the possibilities. That may well be more suitable, but you do not explain why. Most of us are used to looking at Value Over Replacement, so to do otherwise requires an explanation.
molnar
6/01
I missed it the first time through too, but this is explained in the article:
"Note that the percent improvement calculation for OBP is compared to increasing the OBP over a reasonable low-level of .275. We do the same for SLG percent as a percent increase over .350. I find that these help put these rate categories on a similar scale to the counting categories."
(and it is explained again in the comments above too)

Clearly the rate categories need to be treated differently than the counting categories, since a guy who isn't playing doesn't hurt your rate stats. This is one way to address that - maybe not particularly elegant, but certainly some thought was given to the issue. However, there is a LOT of space in the article given to cutesy stuff that could have been used for explanations instead, and this is one point that needed an example such as that given by the author in the comments.

The author took a big risk I thought in terms of the scope of the article - "player evaluation" is a lot to take on in 1500 words, compared to say "here's some guys who might be undervalued in your league, or not". But ultimately it worked.
dpowell
6/01
Great article, Tim. Obvious thumbs up. One possibility for future improvement (I realize that your main point here was not to come up with the perfect measure): I think you might be able to get a more useful number than the MPV one you suggested. Fantasy players don't really care that they added 10% (on average) to their categories because 10% means different things in different categories. Do you think using standard deviations for each category (and then averaging) makes more sense?
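As a sketch of what the standard deviation approach could look like, here is a small Python example that converts each category to z-scores and averages them per player. The function name, player labels, and projections are invented, and the sketch assumes every category is one where higher is better (a category like ERA or WHIP would need its sign flipped first).

```python
from statistics import mean, pstdev

def average_z_scores(projections):
    """projections maps player -> {category: projected value}.
    Returns each player's mean z-score across all categories."""
    categories = list(next(iter(projections.values())).keys())
    center = {}
    spread = {}
    for cat in categories:
        column = [line[cat] for line in projections.values()]
        center[cat] = mean(column)
        spread[cat] = pstdev(column)  # population standard deviation

    result = {}
    for player, line in projections.items():
        # Categories with zero spread carry no information, so skip them.
        scores = [(line[cat] - center[cat]) / spread[cat]
                  for cat in categories if spread[cat] > 0]
        result[player] = mean(scores) if scores else 0.0
    return result

# Hypothetical three-player pool with made-up projections
pool = {
    "Slugger": {"HR": 30, "SB": 5},
    "Average": {"HR": 20, "SB": 10},
    "Speedster": {"HR": 10, "SB": 30},
}
values = average_z_scores(pool)
for name, z in sorted(values.items(), key=lambda item: -item[1]):
    print(name, round(z, 3))
```

Because every category is measured in units of its own spread, ten extra steals and ten extra home runs no longer count the same just because both are "ten".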
bsolow
6/01
I've been working on this problem for an unrelated research project for quite a while. The problem is that the large fantasy baseball institutions are either unable (ESPN) or unwilling (Yahoo) to provide detailed statistics for fantasy leagues in this or past years regarding average values in various categories for various teams. The numbers you see cited by "experts" (i.e. "100 SB should be enough for at least 3rd place in the category") are completely baseless insofar as they come from guess-timates derived from the 5 or 10 leagues the author plays in and bothers to check.

The proper way to do this would be, for example, to take a random number generator between 1 and 300,000 (or better, the number of Yahoo leagues in existence) and extract a sample of a few hundred leagues. Then, observe the stats each team has in each category and use those to estimate a distribution for a given year (I'd assume they're all distributed ~ normal with different parameters). Average those parameter values across a few years to get a reasonably more robust picture (although you may decide to keep it only last year, for example, to account for various changes in league context) and estimate an ordered probit regression of points in each category on the raw statistics from that category. A statistics package such as STATA would then allow you to compute partial effects on scoring for a representative team.

There are a number of problems with this approach, though, and I don't know if there's a good way to deal with them. First, converting stats into rotisserie points is a non-linear function (going from 0 SB to 1 doesn't help you at all, but going from 70-71 may have a relatively larger impact on the probability of gaining a point) and in order to understand the partial effects of a given player, you have to make assumptions about what the rest of your team will look like, which is, of course, what the valuation formula is supposed to tell you in the first place. Furthermore, this sort of exercise is completely invalid if you're constructing the estimated distribution from leagues different than the one you play in, for fairly obvious reasons. Finally, the marginal effects can be inaccurate if people in your league play unconventional strategies (such as punting categories) since the distributions you estimated in the first step wouldn't take those into account.
BurrRutledge
6/01
I don't know my elbow from VLOOKUP, but I still liked the submission. I'm wondering how much of this information is already available to us through BP's extraordinary PFM tool. And in case it's not obvious, that is a compliment, not a criticism. Thumbs up.
TonyRiha
6/01
Remember, it's a total vote count as a go-forward. I vote thumbs up for my six best so that what I consider to be the 9th best article has a lesser chance of sneaking past the fourth... Decent read, thumbs up for TK this week...

psugator01
6/01
i've read almost all of the articles so far but have yet to post a comment. As a fantasy player who's competed in multiple leagues for the past six years or so, this article is exactly what i'm looking for. i would like to see something attached here that allows me to input my players into this format. i'm pretty illiterate when it comes to this stats stuff but i understand exactly what he is trying to do and i know for a fact that this is something that could help me in every single one of my fantasy leagues. it really takes the guessing out of decision-making.

excellent job and i look forward to your future pieces.
timoseppa
6/01
This was the guy with the poor choice of title last week, right? (Dave P. - How do we access the previous rounds' articles?) "Welcome"? Eck, the writing style is grating. The humor misses the mark for me. I don't care how good the content may be, I'm not going to want to read this.
josh7798
6/02
This is one of the best articles this week by far. I actually would have assumed that most fantasy players who have been playing and winning for a while would already be using spreadsheets. I'm a little shocked that most are not. Even though I use a lot of my own spreadsheet programs, I can all but guarantee that I will be incorporating some of this into them next year. Thank you Tim.

That being said, I'm going to offer some advice: You have to have a better opening. The opening of this piece reminds me of the first few lines of a pamphlet that would get handed to me at a work conference. Don't say things like "Welcome", and "Hello New Readers". It just doesn't work. Granted a lot of BP readers will look past a weak opening and judge you on overall content, but not all of them. Just ask Byron.
Oleoay
6/02
The opening could've been better, but the overall writing style and tone made up for it I thought. I'm not sure if the Byron comparison is accurate because Byron's second article had a lot of typos... but it's hard to tell at this stage of the game whether people are voting more on general readability or content. Either way, I like what Tim's doing so far in both those categories.
beitvash
6/01
I love this article. I completely disagree with Will that it has too much math/stats. The concept itself was interesting and I didn't find the analysis difficult to follow at all. The fantasy world is really missing this kind of article. There are way too many "Pick up Aaron Hill from your free agent pool" articles and not enough DIY articles. Great work and thumbs up!
vandorn
6/01
First of all, great article. It took things I try to do each season (usually unsuccessfully) and approached them from a perspective I wasn't considering, and explained it really well to boot.

But I do have one question. In calculating the position list you put Branyan at third but also put him at first because he was slotted there by the Mariners for 2009. Branyan isn't going to be drafted at both positions--you can only draft him once. And since he's likely going to be drafted as a third baseman, why should he be included in the 1B list at all? Including him in both spots treats him like two separate players for the analysis.

Branyan may be a bad example, since the position difference between first and third is close enough that he could be moved to first depending on how the draft proceeded. But what about guys like Russell Martin or Pablo Sandoval? Do they get included at third base even if they're never going to play there on anyone's team?
crperry13
6/01
Excellent article Tim. I've done a lot of these spreadsheet analyses myself for my leagues, but never hit on the idea of using percentage of stats. I always found myself adding an arbitrary points total and got tripped up on "calculated" stats.

I would be curious about your take on value in keeper leagues, based on growth, attrition, and decline rate. My two major leagues are long-term keeper with prospect drafting, and I have had a heck of a time assigning value to this sort of thing.
chasschlaack
6/01
Pivot tables *are* basic. Easily one of the better articles of the week.
6/02
To follow up further: the analysis used in this article is, in fact, pretty basic for anyone with experience with Excel. I have no idea why the author even mentions, "pivot tables," as they are in no way needed to do the analysis presented. It is very basic stuff he discusses, and honestly, I get the feeling VLOOKUP and Pivot Tables were mentioned chiefly to dress up the simplicity of the analysis.
shankweather
6/01
The day I discovered VLOOKUP is the day I went from competing every year in my league to winning (or placing second) every year. Anyone who doesn't use Excel is not a serious fantasy player.
tkniker
6/01
Thanks for the most recent comments on this.

An analogy I can come up with is Golf. Tons of people play golf. There are those who are perfectly content (and I've been one of them for many years) to just head out to the golf course and swing the wrenches. Then there is the next level of player who likely subscribes to a golf magazine or two, and goes out to the driving range a few times a year. Beyond that, there are those who actively take lessons at the course or at a place like GolfTec (this is the year I decided to be a bit more into Golf and went down this step).

Definitely, my goal with this article was to try and appeal to what I think may be a significant (though admittedly, not all-inclusive) group of fantasy leaguers who are in the middle ground of wanting something a little more than the hundreds of magazines and standard websites offer. Heck, the #1 (maybe even sole) reason that I started my BP subscription was for the "one-stop shopping" aspect of collecting some usable data to build the better mousetrap of customized fantasy analysis, because I truly was annoyed with most of what was out there.

As opposed to looking at the negative of "this may alienate some readers who don't have the skill set," I think articles like this could be a positive gateway of saying "For those who want better analysis tools for your fantasy teams -- We have the data you need, we have the instruction manual you need [hence an article series like Fantasy DIY], and we are helping you make what we provide even better (Matt Swartz's article)"

Just my two cents
tkniker
6/01
Or maybe let me be a little more clear.

In a similar analogy to golf, people do progress, and some don't like to just stay at the same level. To have something like this article and ONLY at this level wouldn't necessarily be a smart strategy for BP, but if they had content at multiple levels of fantasy owner sophistication (some that appeals to the general masses, but other content that appeals to more skilled owners), it seems like this would make it more likely to keep people on the subscription rolls as their skill level and sophistication progress.
wcarroll
6/01
Yes, I completely agree with this statement. I don't think it's as big a group as many think, but it's enough of a niche. Heck, there's a large group of people who don't care about injuries!
Oleoay
6/02
Overall, I like to think people buy a subscription to BP to learn. Learning implies being exposed to concepts that you're not familiar with or attempting things you have not done before.
6/02
Alexander
6/02
mafrth77
6/02
For the record, in a points league that counts walks, the difference, based on PECOTA, between Pujols and the number two 1B was about the same as the difference between Cabrera and the tenth best 1B (Morneau).
Oleoay
6/02
That's not a real good "for the record" comment since some leagues count walks for different amounts of points, others subtract strikeouts, and yet others give points for things that don't deal with walks such as doubles, triples, etc.
edanddom
6/02
I am a bit late to the comments party here, but I'll chime in quickly. Perhaps it is just me, but I think that Tim's introduction is misleading in terms of what he actually covers. It is almost as if he set out to write a very comprehensive article, started writing a draft, but then as soon as he saw his word count closing in on the maximum, he slapped on the "oh well, I guess I'm done now" conclusion and turned in his work.

I think that the base content itself is pretty good for readers who are new to valuation, but as a stand-alone piece of work, it is lacking a smoothly-flowing intro > body > conclusion structure that properly sets the reader's expectations, delivers against them, and leaves them with something tangible to "do on their own" (as promised).

All that said, Tim showed enough potential with this that he earned my third of three thumbs up votes for this week. Even though he explained only a small fraction of what a person REALLY needs to know about the DIY process, he at least convinced me that he understands the DIY process well enough to potentially serve as an instructor on the topic. As someone who has played this role (informally) elsewhere, I know that mastering the process yourself is only part of the job – being able to fully break down the custom valuation components and clearly communicate the technical how-to is the only way to wean a reader off of pre-published or systematically-generated dollar values. I would be curious to read a more in-depth "training" from Tim on this.
CLloyd24
6/02
I have two big arguments against this system as a useful fantasy player evaluator.

First, as another reader pointed out, it fails to establish a proper replacement-level player. It simply compares players to those at the same position who are projected to have 200 ABs. Where does this number come from? Is it right? The answer is no. To see why it's wrong, let's use a simple example of a league that only counts HRs. Using Branyan and Wright as above, Wright is projected to have 32 HRs to Branyan's 26. If we used 200 ABs as our baseline and 30 hitters qualified, let's say the average 3B would get 15 HRs. This would make Wright 113% better than the average 3B, and Branyan isn't too far behind at 73%. In relative terms, Wright is about 50% better than Branyan. But this is a 20-team league, so there should only be 20 3Bs (ignoring the UT/CI spot). So if we now only use the top 20 3Bs, we should probably expect an average of 20 HRs. Now Wright is still 60% better than average, but Branyan is a much smaller 30% better than average. And in relative terms, Wright is now two times better than Branyan. So, as you can see, where you set the replacement value has a great effect on the relative value of players.
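The arithmetic in that example can be checked with a short sketch. It is only an illustration: the 15 HR and 20 HR averages are the hypothetical pool averages from the example, not real projections, and `pct_above_baseline` is a made-up helper name.

```python
def pct_above_baseline(player_hr, baseline_hr):
    """Percent by which a player's HR total exceeds the baseline average."""
    return (player_hr / baseline_hr - 1) * 100


# Projected HR totals quoted in the example above.
wright_hr, branyan_hr = 32, 26

# Hypothetical pool averages: 15 HR for the 200-AB pool, 20 HR for the top 20 3B.
for label, baseline in [("200-AB pool", 15), ("top 20 3B", 20)]:
    wright = pct_above_baseline(wright_hr, baseline)
    branyan = pct_above_baseline(branyan_hr, baseline)
    # The ratio shows how the gap between the two players widens
    # as the baseline rises.
    print(f"{label}: Wright +{wright:.0f}%, Branyan +{branyan:.0f}%, "
          f"ratio {wright / branyan:.2f}")
```

Raising the baseline from 15 to 20 HR moves the Wright-to-Branyan ratio from roughly 1.5 to 2, which is the whole argument in two lines of output.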

The second complaint is that it appears to ignore the effect of plate appearances on OBP/SLG. A player who puts up an OPS of .900 over 600 PA can actually improve a team's OPS more than a player with an OPS of 1.000 over 200 PA, depending on the team's OPS and PAs. Also, the fact that you arbitrarily choose .275 and .350 as baselines for OBP and SLG seems to suggest that there is very little rational statistical basis for this method.
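A rough sketch of the playing-time point, under loud assumptions: OPS is treated here as a simple PA-weighted rate (a simplification, since OBP and SLG use different denominators), the rest-of-roster totals of 5000 PA at a .750 OPS are invented for illustration, and the part-timer's missing 400 PA are left unfilled. `team_rate_with` is a hypothetical helper, not anything from the article.

```python
def team_rate_with(rest_pa, rest_rate, player_pa, player_rate):
    """PA-weighted team rate after adding one player to the rest of the roster."""
    total_pa = rest_pa + player_pa
    return (rest_pa * rest_rate + player_pa * player_rate) / total_pa


# Hypothetical rest-of-roster totals (assumed, not from the article).
REST_PA, REST_OPS = 5000, 0.750

# Full-timer: .900 OPS over 600 PA. Part-timer: 1.000 OPS over 200 PA.
full_timer = team_rate_with(REST_PA, REST_OPS, 600, 0.900)
part_timer = team_rate_with(REST_PA, REST_OPS, 200, 1.000)

print(f"600 PA at .900:  team OPS {full_timer:.3f}")
print(f"200 PA at 1.000: team OPS {part_timer:.3f}")
```

With these made-up numbers the full-timer lifts the team to about .766, against about .760 for the part-timer, so the lower rate over more plate appearances wins. Different roster totals can flip the result, which is the "depending on the team's OPS and PAs" caveat.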
Oleoay
6/02
Maybe I'm wrong, but I assumed that plate appearances of a player's depth chart projection were compared against the baseline plate appearances by position, then that performance was prorated.

As far as 200 AB, you'd have to be in a very deep fantasy league (or maybe, dealing with rookies) to find someone worthy of a fantasy roster spot that wasn't projected to get 200 AB. Even a generic pinch runner will end up with about 200 AB. Go any lower than 200 AB, and I imagine the overall MPV difference would be minimal or zero with respect to a "replacement level" fantasy player that'd be on a waiver wire.
CLloyd24
6/03
"As far as 200 AB, you'd have to be in a very deep fantasy league (or maybe, dealing with rookies) to find someone worthy of a fantasy roster spot that wasn't projected to get 200 AB. Even a generic pinch runner will end up with about 200 AB. Go any lower than 200 AB, and I imagine the overall MPV difference would be minimal or zero with respect to a "replacement level" fantasy player that'd be on a waiver wire."

That's my point: using his method, he is comparing Branyan and Wright to a bunch of players that won't even be rostered on anyone's team. Compared to a 200 AB player, Branyan is almost as much a stud as Wright. However, as you reduce the player pool to the proper size for the league, the difference between Branyan and Wright becomes more pronounced; Wright is still a stud, but Branyan would simply be average.
Oleoay
6/04
You may have a point here in that the baseline for plate appearances might have to be adjusted based on the number of active slots on a roster...
Oleoay
6/04
Let me clarify... if you have five more active slots like a CIF, two utils and two active catchers etc, then the baseline might need to be increased for total team plate appearances. That would also diminish the impact of someone who accumulates less than a full season's worth of at-bats.
tkniker
6/03

The .275 and .350 are somewhat arbitrary, but I believe finding the right levels is a complicated enough process that, for someone who is just beginning to create their own analysis system, it is something to be addressed later on.

I disagree with your other comment, because when you insert a player into your roster and remove the "average player," the method takes into account the fact that a .900 OPS over 600 PA will have a greater impact than a 1.000 OPS over 200 PA, since one does calculate what the new OBP and SLG will be.
Oleoay
6/03
So I'm correct that your system prorates based on more/fewer PA than the baseline?
CLloyd24
6/04
So your system does account for differences in PAs? That's good.

Oh and in reference to this, "Personally, I like combining two BP data sets, the downloadable PECOTA projections and the depth charts, which refine the expected playing time.

Note to Baseball Prospectus: a great addition for next year would be a downloadable file of the plate appearances and innings pitched used in the depth charts, preferably with the HoweID key for each player for easy joining with PECOTA projections." I can't believe no one pointed it out above (I meant to yesterday), but BP already does this for us.

Here you go: http://www.baseballprospectus.com/fantasy/pfm/index.php?raw

Since the Depth Charts were updated on 06/01, these projections run from then through the rest of the season. It's too bad the raw PECOTA projections themselves couldn't be updated. Does BP still really think C. Young is going to hit 62 R, 17 HR, 59 RBI, 13 SB, .269 from here on out?
Oleoay
6/04
I flamed PECOTA's A-Rod projection and he opened the season with an injury that knocked him out for a month... with any kind of projection system, though, there are those who meet or fail to meet their projections, or have variance between their 20%-90% projections.
CLloyd24
6/04
Young's 10th percentile projection triple slash line was .225/.293/.386; right now people would love if he was even at that since he's put up a brutal .172/.216/.299 so far.

My point is that BP has updated their depth charts and playing time projections for the season but not their actual projections for playing ability. I would say that C. Young has solidly established a true talent level significantly below PECOTA's projections.

I'm finding it surprising that BP is unable to update their projections in-season. For $20 a year I expect more features than I can get for free at a site like FanGraphs (which DOES offer in-season updated projections based on ZiPS). I'm just really not sure if I'm going to renew my subscription next year if they're unable to keep up with features that are free and just as good, if not better, at sites like FanGraphs, Hardball Times and LastPlayerPicked.
hotstatrat
6/06
FanGraphs is wonderful (although their writers aren't as good as BP's, they have no team tracker, and probably none of those projections are as reliable as PECOTA during the off-season - and the other sites are even less comparable), but is FanGraphs making any money? At some point, they will likely go out of business, start having annoyingly intrusive ads, or start charging. If readers abandon BP, they will be stuck with what is left. I don't think BP is prohibitively expensive for their continued support in the face of their competition.
CLloyd24
6/09
"their writers aren't as good"

That's debatable.

"they have no team tracker"

Not true. Theirs even includes multiple projection systems, including in-season projections. http://www.fangraphs.com/blogs/index.php/testing-some-stuff http://www.fangraphs.com/blogs/index.php/my-team-now-with-projections

"probably none of those projections are as reliable as PECOTA duing the off-season"

PECOTA may once have been the best, but the other systems are becoming just as good, if not better. http://www.hardballtimes.com/main/article/so-how-did-tht-projections-do/

"but is FanGraphs making any money?"

Is that relevant? Until a few months ago they didn't even bother to have ads. It seems to be more of a labor of love than a source of income, and they are doing quite an excellent job. I haven't seen any indication that they are going to fold shop.

Ultimately, I've enjoyed BP, mostly just for their PECOTA projections and their PFM, but if they aren't able to step it up this year, I'm gone, and I imagine a bunch of others will be as well.