Don't forget that it goes both ways. While the robot's team can't sign a bunch of all-hit players for 35 games a year, the opponent can just call up their best fielders from throughout the organization and try to hold the robot's team to three runs. Every time the robot starts, he'll face a team with three center-fielders and the two or three best shortstops in the organization. While the robot's team can't constantly cycle their roster, an opponent who plays him once or twice can just swap a few players with options for better defenders on the 40-man.
Nice article, by the way.
Different slices of the data can help get around the selection issue, but I think the basic problem is aggregation. Your question is about how players change, so you may want to analyze change directly by somehow accounting for each player's level of ability. If you get a sample such that selection issues don't matter (either by selecting the right group or by using weights), that would work. There are two other ways that allow you to use the whole sample.
The sophisticated way to do this is via a mixed-effect (random-effect, HLM, repeated-measures ANOVA, etc) model, so that you account for individual differences by estimating an initial level or "intercept" for each catcher. Then, you can evaluate the effects of repeated PAs in terms of deviations from one's initial level.
The simpler way is simply to turn the statistics you have into change scores. In the example I gave, Catcher A's .340/.330/.320 line would become .340/-.010/-.010. That would yield the following table:
Catcher  PA 1-3  PA 4   PA 5
A        .340    -.010  -.010
B        .300    -.010
C        .260
Mean     .300    -.010  -.010
This method could still fall victim to some selection problems if there are individual differences in how catchers change, and players who change a certain way are inclined to play more often. However, it does allow you to answer your questions about change.
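Here is a rough Python sketch of the change-score idea, using the three made-up catchers from my mock-up. The function names are my own, and treating each change as "this bucket minus the previous bucket" is my reading of the .340/-.010/-.010 line, so take it as an illustration rather than a finished method.

```python
# Batting averages by plate-appearance bucket (PA 1-3, PA 4, PA 5).
# None marks a bucket the catcher never reaches in this toy league.
observed = {
    "A": [0.340, 0.330, 0.320],
    "B": [0.300, 0.290, None],
    "C": [0.260, None, None],
}

def change_scores(line):
    """Keep the first bucket as the baseline; later buckets become
    the change from the previous bucket."""
    scores = [line[0]]
    for prev, cur in zip(line, line[1:]):
        if prev is None or cur is None:
            scores.append(None)
        else:
            scores.append(round(cur - prev, 3))
    return scores

def column_means(rows):
    """Average each column over the catchers who actually have a value."""
    means = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        means.append(round(sum(present) / len(present), 3))
    return means

changes = {name: change_scores(line) for name, line in observed.items()}
means = column_means(changes.values())

for name, line in changes.items():
    print(name, line)
print("Mean", means)
```

The mean row now shows a drop in each later bucket, even though the raw averages for those buckets rise.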
Quick mock-up of an example (it's a gross oversimplification). There are exactly three catchers in the league, and they each get the same number of starts. For convenience, let's assume none of these guys walk (or they all walk at equal rates), so PA=AB.
Catcher A hits .340 in his first three PAs, .330 in his fourth PAs, and .320 in his fifth PAs. He leads off for his team, and gets exactly 5 PAs every game.
Catcher B hits .300 in his first three PAs, .290 in his fourth, and would hit .280 in his fifth. He never gets a fifth plate appearance, as he gets exactly 4 PA a game.
Catcher C hits .260 in his first three PAs. He would hit .250 and .240 in his fourth and fifth+, but he gets exactly three PAs a game.
These three catchers all get exactly the same number of PAs across their first three trips to the plate, so the average BA for the first three PAs is
(.340+.300+.260)/3 = .300
Sparing you the math, the average BA for the fourth PA is the average of players A and B, (.330+.290)/2 = .310, and the average BA for the fifth PA is just player A's, which is .320. If there were no selection, then the averages would run (.300, .290, .280). Put this in a table:
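No selection (every catcher bats in every slot):
Catcher  PA 1-3  PA 4  PA 5
A        .340    .330  .320
B        .300    .290  .280
C        .260    .250  .240
Mean     .300    .290  .280

With selection (what actually gets observed):
Catcher  PA 1-3  PA 4  PA 5
A        .340    .330  .320
B        .300    .290
C        .260
Mean     .300    .310  .320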
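For anyone who wants to check the arithmetic, here is a small Python sketch of the same toy league. The names `true_lines` and `buckets_reached` are just labels I made up, and the equal weighting relies on the assumption above that every catcher gets the same number of starts.

```python
# "True" batting average of each catcher in each bucket (PA 1-3, PA 4, PA 5),
# including the buckets B and C never actually reach.
true_lines = {
    "A": [0.340, 0.330, 0.320],
    "B": [0.300, 0.290, 0.280],
    "C": [0.260, 0.250, 0.240],
}

# How many buckets each catcher actually bats in: A gets 5 PAs a game,
# B gets 4, and C gets 3.
buckets_reached = {"A": 3, "B": 2, "C": 1}

def bucket_means(lines, reached=None):
    """Average BA per bucket. With `reached`, only catchers who actually
    bat in a bucket count toward that bucket's average."""
    means = []
    for i in range(3):
        vals = [
            line[i]
            for name, line in lines.items()
            if reached is None or reached[name] > i
        ]
        means.append(round(sum(vals) / len(vals), 3))
    return means

no_selection = bucket_means(true_lines)
with_selection = bucket_means(true_lines, buckets_reached)

print("No selection:  ", no_selection)
print("With selection:", with_selection)
```

Every individual catcher declines by ten points per bucket, yet the observed averages climb, purely because the weaker hitters drop out of the later buckets.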
You'll see a similar effect when you look at pitcher performance by inning or trip through the order. Starters still going in the 8th or 9th tend to be good or having good days.
I think there's something of a selection problem in your analysis. Good-hitting catchers are more likely to hit higher in the order and less likely to be lifted for pinch-hitters, meaning the 4 and 5+ PA samples have better hitters than the 1-3 sample. You can get around this by using some type of repeated measures model.
The same problem applies to any across-position valuation, from the last picks of the draft to choosing Hanley Ramirez over Albert Pujols. I certainly agree that leagues are typically won in the middle rounds (although you can lose leagues in the first rounds). The example I attempted to lay out pertains just as well to players at the top of the draft as at the bottom. Russell Martin has very different values in one- and two-catcher leagues, and comparing him to outfielders and shortstops depends heavily on what you think the replacement level for each position is. Your method doesn't account for league depth.
There is a chicken and egg issue, because we have to deal with multiple categories for each player. We have to project arrays (vectors) of data for each player into a single (scalar) value, which includes at least two components: how the different statistical categories relate to winning, and what the baseline or replacement level of each category is for each position. As you suggested, you can't really figure one out without the other.
Your Bloomquist example is a good one. I see two options to get around this, though there are certainly more. Both start with a preliminary valuation. The first is to use the MPV of the replacement level player at each position. Then adjust your MPVs such that freely available players have a value of zero, tweak to your heart's content, and go!
Another is to use the preliminary valuation simply to find the replacement level players, and come up with a new valuation based on who those players are. I typically get around the Bloomquist problem by using a kernel smoother; instead of taking the 10th best player, I'll use a weighted average of the 7th-13th best, for example. If you properly value each category (i.e., put them on a scale such that moving 1 unit in runs or SB equals x wins or x points), it will end up not mattering; Bloomquist and the slow but better hitter will have exactly the same value, so either can be used.
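A minimal sketch of that smoothing step in Python, in case it helps. The triangular weights (1, 2, 3, 4, 3, 2, 1) are just one kernel I picked for illustration, the function name is hypothetical, and the player values are made up.

```python
def smoothed_replacement_level(values, rank=10, half_width=3):
    """Weighted average of the values around the `rank`-th best player.

    With rank=10 and half_width=3 this covers the 7th-13th best players,
    using triangular weights that peak at the 10th best (an assumed kernel).
    """
    ranked = sorted(values, reverse=True)
    center = rank - 1  # zero-based index of the rank-th best player
    total = 0.0
    weight_sum = 0.0
    for offset in range(-half_width, half_width + 1):
        idx = center + offset
        if 0 <= idx < len(ranked):  # skip ranks that fall off the list
            weight = half_width + 1 - abs(offset)
            total += weight * ranked[idx]
            weight_sum += weight
    if weight_sum == 0:
        raise ValueError("not enough players to estimate replacement level")
    return total / weight_sum

# Made-up preliminary values with a cliff right at the 10th best player.
values = [30, 28, 27, 25, 24, 22, 21, 20, 19, 12, 11, 10, 9, 8]
raw = sorted(values, reverse=True)[9]
smoothed = smoothed_replacement_level(values)
print(raw, smoothed)
```

The smoothed baseline sits between the players on either side of the cliff, so it depends less on exactly who happens to rank 10th.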
I've liked your responses, and your original entry. I made the point originally because a reader might have assumed this method adjusted for league context because you mentioned league context is important. Good luck in the contest.
Thanks for the reply.
This is of importance when comparing players across positions (within any position, there is *relatively* little impact). Pretend it's late in a draft, and I only have two spots to fill: 2B and catcher. Whichever position I fill last should be almost exactly replacement level, so assume that player/position has no value.
Whatever valuation system I use should tell me who to draft next: the best available 2B or the best available catcher. Regardless of the number of catchers each team must roster, the 2B ratings won't change. If I'm in a one-catcher league (and thus have not drafted a catcher yet), the choices at catcher should have much higher MPVs than the choices in a two-catcher league (where I've already drafted one of my two catchers). The problem is that MPV doesn't move with the number of players on rosters in each league, and thus doesn't provide a true zero point to compare players across positions.
Regarding your third point, I think the use of the mean is part of the problem. In a 10 (or 12) team league, replacement level for any given position should be the 10th (or 12th) best player at that position, with caveats made for utility and other flexible position spots. I'd either move to a non-parametric approach (i.e., using ranks), or at least adjust the MPVs for each position such that a player with no value at each position (10th best player at a position in a 10 team league, assuming no players at that position get used in a UT spot) has a value of zero.
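To make the zero-point adjustment concrete, here is a hedged Python sketch. The function name, the six catchers, and their preliminary values are all invented for illustration, and it ignores utility spots entirely.

```python
def rebase_to_replacement(values, n_teams, slots):
    """Shift one position's values so the last rostered player
    (rank n_teams * slots at that position) is worth exactly zero."""
    ranked = sorted(values.values(), reverse=True)
    replacement_rank = n_teams * slots
    if replacement_rank > len(ranked):
        raise ValueError("not enough players at this position")
    baseline = ranked[replacement_rank - 1]
    return {name: round(v - baseline, 3) for name, v in values.items()}

# Made-up preliminary values for six catchers in a tiny 3-team league.
catchers = {"C1": 5.0, "C2": 3.0, "C3": 2.0, "C4": 1.0, "C5": -1.0, "C6": -2.0}

one_catcher = rebase_to_replacement(catchers, n_teams=3, slots=1)
two_catcher = rebase_to_replacement(catchers, n_teams=3, slots=2)

print(one_catcher)
print(two_catcher)
```

The same catcher (C3 here) is worth nothing in the one-catcher league but clearly above zero in the two-catcher league, which is the league-depth effect a fixed average baseline misses.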
Not a thumbs-up for me, because of a flaw in the "baseline-calculation" section. You reference that league context is important, and totally ignore it in your baseline calculations, preferring instead to reference every player to the average (qualifying) player. You instead should be looking at a league-specific replacement level.
As a quick example, consider the value of a full-time but poor-hitting catcher in either a one-catcher or a two-catcher 12-team league. In the one-catcher league, this catcher should have very little if any value, as the replacement-level catcher in this league is probably full-time (or close) and similarly useless without his chest protector. In a two-catcher league, this player is relatively valuable, as whoever you can find on the waiver wire is likely a part-time or reserve catcher.
This issue will also come to a head when there are differences in the variances of performance across positions, when one player is hands-down above the rest at his position, and a variety of other situations. The rest of the article was good, but if a fantasy player followed your advice in any league where the average (200 PA+) player was too far off from that league's replacement level, they'd just plain draft the wrong players.
Great idea. April 2 isn't bad for pitching, but the hitting gets a little weak. Hughie Jennings plays first base, his secondary position, out of deference to Luke Appling.
Name, Pos, Birthyear, Career, OPS+/ERA+/HOF
Luke Appling, SS, 1907, 1930-1950 (112, HOF)
Hughie Jennings, 1B, 1869, 1891-1918 (117, HOF Manager)
Reggie Smith, CF, 1945, 1966-1982 (137)
Pete Incaviglia, LF, 1964, 1986-1998 (104)
Bobby Avila, 2B, 1924, 1949-1959 (104)
Bill Sample, RF, 1955, 1975-1986 (98)
Hector Cruz, 3B, 1953, 1973-1982 (81)
Howard Wakefield, C, 1884, 1905-1907 (88)
Steve Hosey, RF, 1969, 1992-1993 (75)
Dennis Hocking, IF/OF, 1970, 1993-2005 (69)
Al Weis, 2B, 1938, 1962-1971 (59)
Frank Boyd, C, 1868, 1893 (90)
Milt Ramirez, SS, 1950, 1970-1979 (31)
Don Sutton, SP, 1945, 1966-1988 (108, HOF)
Billy Pierce, SP, 1927, 1945-1964 (119)
Ed Siever, SP, 1875, 1901-1908 (117)
Tommy Bond, SP, 1856, 1874-1884 (111)
Jon Lieber, SP, 1970, 1994-2008 (103)
Al Nipper, SP, 1959, 1983-1990 (94)
Gordon Jones, RP, 1930, 1954-1965 (94)
Earl Johnson, RP, 1919, 1940-1951 (96)
Mike Gallo, RP, 1977, 2003-2008 (106)
Tom Johnson, RP, 1951, 1974-1978 (115)
Curt Leskanic, RP, 1968, 1993-2004 (115)
Dick Radatz, RP, 1937, 1962-1969 (122)