Thanks, guys. I'm going to tackle disability insurance in the coming weeks, which should be fun as well.
Yes, federal taxes are ignored, as every player would be in the highest tax bracket. Good point with the Canada-US fed differences, though. Something to look into for sure.
Not really. CK looked more at the trade itself while I always find discussing goals and risk more interesting.
They didn't give up all-stars, but many have deemed the package to be more than was given up for Marcum or Greinke, which makes me consider this to be reactive instead of proactive. This really boils down to what you consider the Cubs' goals to have been. You seem to be suggesting that the Cubs opted for the middle ground between sticking with what they have and going all-in. That is perfectly viable and potentially what they tried to do. I just personally believe a better course of action would have been to utilize their depth to add Garza AND something.
Right, and that is my point here. That core wasn't one Matt Garza away from really competing. So you go for Garza to team him with Marcum, and you go for Orlando Hudson over friggin' Blake DeWitt. You build a team really capable of winning, rather than bringing in just one of the 4-5 needed pieces.
This is a very good point, and Cliff Lee is a great recent example. The Phillies traded Carrasco, Donald, Marson, and Knapp to get him and Francisco. The Mariners then traded Aumont, Gillies, and Ramirez to get him from the Phillies. You're probably right that a similar package could be had, which brings up my favorite topic -- risk. Was it worth the risk that this CAN'T happen? As in, was the potential Garza brings to the Cubs for contention less than the risk that, if not, they couldn't replace the original prospects surrendered? I'm not sure. But that doesn't really help with regard to the central theme here, which is that it's fine to go all-in, and pretty cool when it happens, but going just for Garza doesn't equal going all-in. The move is definitely justifiable, so I'm not arguing that at all. I am arguing that they either didn't need to make the move, or needed to/need to make other moves to support it.
Wells' ERA in 2009 was 3.05, his SIERA was 4.33. His ERA in 2010 was 4.13.
Yes, athletes get W-2s and the accounting departments of the teams take care of the various withholding rates for the jurisdictions.
This is what I have advocated for a few years. I mentioned before in an article of mine recently that if we wiped away the compensation system, relievers would benefit the most and teams wouldn't have to worry about giving a 1-yr deal to a fickle reliever. They wouldn't have to offer extra years as a means of potentially extracting more value to add to the return so the loss of their picks isn't material.
Once you assemble the rates for the cities and states, send it my way. email@example.com. I'll forward you mine when done. Prob work on it in the next few days.
Yeah that could be very interesting. In fact, I bet once I have all the state and local withholding rates in tow I could come up with an algorithm based on a team's schedule to compute it each year.
Right -- it might not be a dollar for dollar credit due to this, or in Sosa's case, because of wonky rules. It will reduce the burden but in many cases not wipe it out completely.
Haha, I was dreading that question. But I think maybe it would be a fun exercise if I went through it all: took the Phillies' schedule, dissected it, figured out the tax rates of all the states and municipalities involved, came up with a rough estimate, and then compared it to the Yanks and Rangers. Unfortunately, with tax season starting for me, well, this week, I don't know if I'd have time. Maybe it can be piecemeal and I'll throw an Unfiltered up.
Thanks, Scotty. I try to make myself as available as possible and that's why I always give out my e-mail address. If there is ever a specific topic someone is curious about or research they would like to see done, I'm always willing to have a go at it.
Is there anything else tax or business related, pertaining to baseball, anyone is interested in me writing about?
And even if you didn't, on your PA-40, you would be able to deduct the taxes paid to Virginia to reduce your PA tax liability.
It's not a choice of the player; it is specific to the statutes of the state and cities involved. I believe the duty days method is used much more often than the games method. But if a guy was on the bench he would most likely still be taxed. It's just a different denominator in the equation. If it's a three-game series with a travel day, it would be (4/220)*rate in duty days, but (3/162)*rate in the games method.
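To make the different denominators concrete, here is a minimal sketch of the two allocation methods. The 220 duty days and 162 games come from the example above; the salary and the tax rate are made-up numbers for illustration only, and real statutes add wrinkles this ignores.

```python
def allocated_tax(salary, rate, units_in_jurisdiction, season_units):
    """Tax owed to one jurisdiction: share of the season spent there x salary x rate."""
    share = units_in_jurisdiction / season_units
    return salary * share * rate


# Hypothetical inputs (not taken from any real contract or statute)
SALARY = 10_000_000
RATE = 0.03

# Three-game series with one travel day
duty_days_tax = allocated_tax(SALARY, RATE, 4, 220)  # duty days method
games_tax = allocated_tax(SALARY, RATE, 3, 162)      # games method

print(f"duty days method: ${duty_days_tax:,.2f}")
print(f"games method:     ${games_tax:,.2f}")
```

With these inputs the two methods land close together, because 4/220 and 3/162 are nearly the same fraction. Which one applies depends on the jurisdiction, not the player.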
I'd love to hear more about this. Feel free to type more here or e-mail me: firstname.lastname@example.org.
I think we might now be getting into quantifying a player's comfort level with a team or city ;-). I'd sure be comfy if it means saving $45 million.
One thing I think would be interesting is to see what the taxes looked like for someone who was traded/signed/released multiple times in a year. It must be a headache on top of a migraine.
Yeah -- entertainers are just like athletes where we know when they are in certain areas due to tour dates. The biggest problem seems to be synergy across the various states and municipalities.
It all depends on the specific statutes of the states or cities involved. I admittedly did not review all of them else I would have gone insane, but it seems that as long as the player is getting paid, it would count as a duty day. If the player was not getting paid while on the DL then it is likely a duty day would not be accrued.
I guess I'm just skeptical that if he had 2.7 WARP in 24 starts that another 10 or 11 would have pushed him above 6.5 or 7.
Why, you're just taunting me into a Seidnotes on this, aren't you?
Yeah, well you're a nerd.
Well Downs really isn't worth talking about all that much, though, given that he seems to be in that elite reliever mold. But if it's a 2nd rounder, we have that valued at $5 million. So the 3/$15 is 3/$20. At around $5 mil/win through MORP, Downs would need to produce four wins over the life of the contract to be a breakeven. From 2007-10 he produced 8.8 wins. Barring an injury, this should be a good deal. It's when guys like Crain sign 3-yr deals that get screwy.
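The breakeven arithmetic above can be sketched in a few lines. All three inputs ($15 million contract, $5 million for the pick, $5 million per win) are the figures from this comment; the function name is just for illustration.

```python
def breakeven_wins(contract_millions, pick_value_millions, millions_per_win):
    """Wins a player must produce for the total cost (salary plus lost pick) to break even."""
    total_cost = contract_millions + pick_value_millions
    return total_cost / millions_per_win


# 3 years / $15 million, plus a 2nd-round pick valued at $5 million, at $5 million per win
wins_needed = breakeven_wins(15, 5, 5)
print(wins_needed)  # 4.0
```

Against the 8.8 wins he produced from 2007-10, four wins over three years is a low bar, which is why the deal looks fine barring injury.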
Well see this is where the entire idea of evaluating these contracts gets tricky, and I'm glad you brought it up. It is just not anywhere near as easy as saying Player X produced Y WARP which is worth Z Money. He was paid B Dollars, so deal is (good/bad) depending on if B is greater or less than Z. Using the strategy you mentioned above would certainly have an impact, because the pick associated with Beltre would actually lessen that which is associated with Downs. If we say the 3rd rounder is worth around $1 million, then Downs would have a contract worth $20 million to the Angels, not the $27 million (his $19 mil plus $8 mil for the first rounder).
One of my goals for the off-season is to work on the glossary. I'd like to have a short definition, an extended definition, examples of the stat in action, and maybe a link or two to articles that feature it.
To me what it boils down to is that if it's a free agent who will cost compensation picks in return, you really need to be sure of the level of production you can get. And that is just virtually impossible with relievers. And then the guys who do have established levels of performance are going to be costly anyways and likely expensive enough to the point that value would be unlikely to be recouped.
Gracias! My spreadsheet is at home -- will take a look when I get in later tonight. There were like 45 in just 2007-08 alone, so it's probably around like 65 for 2007-10.
Note that as of 6:50 AM EST the multi-year table is just showing the 2006 offseason guys. The 7 guys signing during the 2007 offseason will be added shortly. Apologies.
For every dollar you go over you owe two dollars. And the Phillies will be dangerously close, if not over, with Joe Blanton. So that would be a major reason why they would try to move him. If they eat $8.5 mil of the $17 mil and send him elsewhere, then the AAV would be $4.25 mil for this year and next, I believe, which would lower their payroll for luxury tax purposes.
Yeah, that could be fun. I'll have to look into it a bit more.
Both... and your homework is late.
Last year, when they were getting Halladay, their original course of action was to seek suitors for Blanton. They didn't find any that would take enough of the salary, so they traded Lee instead, since everyone would pay him $9 million. Now, all the sources feeding reliable reporters like Heyman, Rosenthal, et al. have indicated that moving Joe Blanton is a desired course of action to clear some salary from their payroll.
They are obviously better with him because he is just better than Kendrick or Worley, but THEY seem to feel saving $8.5 mil over the next two years is of the utmost importance, which does seem kind of silly when you break the bank to build this team.
Oh I don't think they have to, but THEY are acting like they have to, so that's where everything is coming from. Including Blanton helps them greatly, as Joe is like a 2.5-3 win pitcher when healthy (last year he was hurt for a while). If I had my druthers, Blanton would stay. I just don't see them keeping him around.
I was joking about the Blanton better than Kendrick part. Your points are all valid -- I'm just telling you what THEIR rationale is with moving Blanton.
Well... duh. The reason they want to move Blanton is that Kendrick makes under a million bucks and Blanton stands to make $8.5 mil/yr. They want to avoid having too high of a payroll, and in order to do that they would like to move either Blanton or Ibanez. Since it's doubtful anybody would take Ibanez, Blanton becomes the odd man out.
And as for dumping millions into a couple of marquee pitchers, it's four. Four out of the five spots in the rotation are manned by legitimate #1 pitchers. This isn't like signing Lee and Halladay and then having Kendrick, Vance Worley, and Antonio Bastardo.
I'm an accountant. One of my former Professors was actually Controller of the Phillies from 1997-03 and told us about how when interleague play first started things were really crazy because teams were playing in cities they had never before played in and getting all of the tax withholdings applied was a pain in the butt. He said Rico Brogna called him one night at like 2 AM because they had accidentally withheld way too much from one of the AL cities.
They are not going to expect to get prospects of worth in return. They have little leverage in the situation because it is obvious they need to dump his salary. This was known from the time they were first reported to be in on Lee. The best type of trade for them would be one where the team assumes the max amount of the salary. But this is not going to be a trade where they get anything of value in return aside from salary relief.
Unless of course Blanton suddenly becomes the most attractive option available and teams start to bid, offering to pay 3/4 AND give prospects.
This will most likely end up being something like one of the following scenarios:
1) Blanton is traded to Team X for nothing in return but Team X pays $13 of the remaining 17 million
2) Blanton is traded to Team X for Prospect A who will be forgotten in two years, and Team X pays $8.5 of the remaining 17 million.
No -- and I say that because I don't think he will be moved until closer to the season. When Pavano signs, Zambrano goes to the Yankees, and depending on what happens with Garza, Blanton might actually become the most attractive SP on the market. If I had to guess, I don't know, he seems like the kind of guy the Padres or Brewers would like.
Any of the Phillies four is better than Madison Bumgarner. I am also not too confident that Jonathan Sanchez will continue his emergence as a dominant force. I don't know why someone like Lincecum is considered less likely to get injured. If you're going to comment, you gotta' bring it, dude! Don't give me "but I mark the top 3 in SF less likely to get injured or regress"... tell me WHY.
Oh, it definitely is. The front four for the Phillies is better than both of those teams', but Kyle Kendrick is nowhere near Zito or whoever is the Red Sox' fifth starter. It just depends how much better R2C2 is than the others.
I'll take that bet.
That is what sets them apart. They are an example of how the qualitative aura only enhances their actual numbers. The difference between them and these 2011 Phillies is the idea that those Braves pitchers grew together. Yes, Maddux was a Cub, but that trio came into their own and peaked at the same time. The Phillies assembly feels more like a Yankees team assembling Clemens, Wells, etc.
Right -- what I meant is that acquiring Smoltz as a minor-leaguer still created the illusion that he was "raised" by the Braves, so while he was not homegrown, it was nowhere near the same thing as trading for Halladay, Lee, or Oswalt.
And he is not -- which is sort of the point. For 1 yr and $5 mil, and the fact that you can guarantee with more certainty than every other pitcher in that #4-#5 range what he will produce, it's a great move.
Yes -- yes he is.
Keep in mind that he is still very young. I'd imagine if you looked at his comps as of the end of the 2009 season it was a much better list. What hurt him is the mediocre 2010 season. This upcoming year will be very telling. If he puts up a .280/.345/.450, he is more common than special and then he is signed to a deal that would pay him a lot of money in 2014 or 2015 to, if that career trend continues, be Raul Ibanez.
No, these articles do not neglect to acknowledge the monetary value of intangibles. I mentioned in the article that most of Jeter's value now is tied up in intangibles. I didn't feel the need to wax poetic on these intangibles for three paragraphs. How much do you suppose they are worth? Because it isn't on the line of $12-$15 million, which is what it would need to be to justify a $22 million or more per year salary.
One interesting aspect of your point would involve whether or not this has occurred--if it has occurred before--more with new GMs taking over a team. It might be that Kevin Towers and his evaluators are not high on Upton, while Byrnes and his staff were very high on him. So Towers coming in and floating the idea would be different than Byrnes doing the same thing.
Zach, I hate saying "it depends" but it really does. It depends on where the Diamondbacks view Upton's true talent. If they think he is more like the 2009 player and are just trying to sell high to maximize the return, they are going to look for cost-controlled players already producing at the major league level, such as a Dan Hudson type, as well as players who will be able to make an impact in the majors quite soon. If they think he is closer to his 3-yr 2008-10 numbers, but are trying to maximize the return the same rule would apply. But if they think he is closer to the 2008-10 aggregate and are really trying to unload him, then the return can be lesser because they have more incentive to trade him.
The platter of players you mentioned is intriguing, but I don't know that the Orioles would value Reynolds much.
One thing I had heard was a deal for Willingham who might then be locked up for a bit, to take over for Ibanez in LF. Similarly, Carlos Quentin's name has popped up. The Phillies are going to use this year as a gap-bridger most likely. Brown will start in RF and be platooned -- though I bet he plays more against lefties than people think -- and that platoon partner would likely be someone they consider capable of starting in 2012 when Ibanez is gone. That is, assuming the world is still around in 2012.
Pagan was too noteworthy. Both he and Torres had good years that people knew about.
Morse would be a good candidate. But would he bump off the guys I have in the outfield above?
cubfan131, I had Byrd originally on the list, but he WAS an all-star last season and he has had success in the past with Texas, so his performing well isn't too surprising.
Yatchisin, Johnson doesn't qualify because he got off to such a hot start that most fans knew he was having a good year. I was looking for guys who were elite while people thought they were merely great, or who had very good seasons under the radar.
See I don't think Hampton will, just because in addition to having 0 effectiveness, he also doesn't stay healthy. It's not like Mark Prior, where you take the calculated risk because if he can be healthy, you might really luck out. With Hampton, what's the potential upside? Kyle Kendrick-like performance?
Well, HOW is that possible??
Cavebird -- in the first part of that blurb I mentioned how he wasn't good defensively. The comment about him replacing Burrell or Ramirez was a shot at Cabrera. As in, I was saying he should not play 130 games anywhere unless he is replacing putrid fielders like those guys, who would make him look like Carl Crawford.
npb - you probably misread what the table shows. That table of five shows the only players who:
a) Had a three-year span with a .320+ TAv
b) That dropped off by 30+ pts in the fourth year of the span
c) Who then had a TAv GREATER than their 3-yr span, in the fifth season
Wynn had a .323 TAv in the fifth season, which was just below his .325 in the initial three-year span. There were five players who met a), b), and c). Wynn was someone who came close, and if I had extended that table to be say 5 pts less and up, he would show up.
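For anyone who wants to see the filter spelled out, here is a rough sketch of criteria a), b), and c). TAv is expressed in points (.325 becomes 325) to keep the comparisons exact. Wynn's .325 and .323 are from above; his fourth-year figure is NOT given here, so the 290 below is a placeholder, and the function and parameter names are made up for illustration.

```python
def meets_criteria(span_pts, fourth_pts, fifth_pts,
                   floor=320, min_drop=30, min_gain=1):
    """Check the three table criteria, with TAv in points (.325 -> 325).

    a) three-year span TAv of .320 or better
    b) fourth-year TAv at least 30 points below the span
    c) fifth-year TAv above the span (min_gain=1 means strictly greater)
    """
    a = span_pts >= floor
    b = span_pts - fourth_pts >= min_drop
    c = fifth_pts - span_pts >= min_gain
    return a and b and c


# Wynn: .325 span and .323 fifth season; 290 is a placeholder fourth-year value
print(meets_criteria(325, 290, 323))               # False: fifth year below the span
print(meets_criteria(325, 290, 323, min_gain=-5))  # True once "5 pts less and up" is allowed
```

Loosening `min_gain` to -5 is the "5 pts less and up" extension mentioned above, which is the only change needed to bring Wynn onto the table.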
R.A. -- I mentioned in the article the players had to have 350 PA.
I can't tell if this is sarcastic or not!
Jesse, I'd advise you to read my article "Gauging True Talent," as I discuss how these PECOTAs are run. In short, yes, it includes the current season numbers, and these are not simply the pre-2010 PECOTAs being used.
When I've written 80 of these already and mentioned it, there comes a time when it makes no sense to continually mention that unless we're talking about 57/43 or greater, it's very close.
Exactly -- we have his current year's numbers, and so normally the idea would be we regress that heavily back to the mean, but I don't know if I feel comfortable doing that. He might be a case where a heavily regressed version is technically correct, but clinically wrong.
After re-running with Berkman and not Teix, the numbers are essentially the same.
Why did you put quotes around 4 significant figures? That's the real question.
Hey, if you know how to code that in, I'm ready to listen!
I completely agree, prs130, however Manuel made the move because Sanchez was pitching, not because he thought along the same lines as us. I fully expect it to be Polanco-Utley in Game 3.
But it's not very relevant. The entire purpose of my article the other day, and the Playoff PECOTAs, is to show that regardless of what happened in 500 PA from April to September, our expectations of player performance are derived elsewhere. Did Huff have a better year than Howard? Yep. But is Howard's true talent level higher? Yep. All that we should care about right now, going into a playoff series, is where the player's true talent level falls, NOT what he did this season.
Tynan -- this is why you don't base anything off of "based on performance this season" and this is why true talent levels are of much greater importance.
Because he's a catcher? Just kidding, Javier Lopez made 77 appearances, 30 of which lasted 4+ batters. He did not qualify as a LOOGY. He is just a good left-handed reliever. Not all lefty relievers are LOOGYs.
Dude, I rechecked the numbers like 10 times when I saw he ranked second. He actually ranked first across 2007-08 as well. I think the issue is that we as fans have trouble reconciling the context. So Romero has certainly had his moments, but he held lefties to a very, very low line this year, a .217/.323/.277 line, and it just so happened that the same lefties he faced did much better against other lefty pitchers.
Both good ideas. Zebs, one reason is we made some slight corrections along the way in terms of our treatment of platoons and HFA, so between that and lineup changes the results moved. As far as why the projections were "off" I don't know that they were wrong, in the clinical sense. A projection system provides a basis of what should happen if everything goes according to plan, so to speak. Why did the results differ? Likely has much less to do with the projections and more to do with players over- or underperforming. For instance, no sane system would peg Derek Lowe as capable of pitching the way he did... but he did it.
MWSchneider, exactly. Saying that the Reds led the NL in runs and that the Phillies had a mediocre TAv is misleading. The Reds aren't facing Kyle Kendrick, they're facing H2O. Additionally, the Phillies aren't starting Wilson Valdez, Mike Sweeney, Domonic Brown, or Greg Dobbs. Their lineup is intact. All of these things MUST be factored into gauging true talent in playoff matchups, and are far too frequently ignored.
Mike, it certainly could have an effect. I was just playing devil's advocate for a bit. But I honestly think, when push comes to shove, it'd be a lot of work for minimal gain, especially given how much we would have to regress. The sample of guys pitching on short rest is very substantially smaller than other splits, which fogs the situation.
Mike, it isn't already done, but here is the question I pose to you: what would you even do along those lines? I suppose we could find out the difference between 3 and 4 days of rest for the league and then regress Lowe's personal splits heavily in that direction, but then how reliable are those splits, given that mostly only the top of the line pitchers even throw on short rest. I guess I'm saying it is conceivably capable of being done, but I'm not sure about how much more accurate it would make the numbers.
Being fixed. There are 4 teams that have done it.
Okay, thanks guys, will run the updated numbers soon and update the article.
Jesus, READ THE DAMN ARTICLE! Kidding... but not really.
It's not my standards, so take that back. It's that both fall into unique circumstances that fall out of the scope of this. Wilson went from reliever to starter, so obviously he doesn't fit any of the above categories. Lewis doesn't qualify because there is nothing from last year to compare to the current year numbers.
Perhaps next week I can do a Seidnotes on the best pitchers who had something like a max of 80 innings in YR1 with 5 or fewer starts, who then made 20+ starts the next season, to find out which reliever turned starter had the best season.
Thanks for the numbers. The issue, however, as I've tried to get across in all of these articles, is that using 2010 numbers by themselves is not the accurate thing to do when gauging expectations right now. Even though it FEELS right, and it does, the better bet is the true talent level of the player, which we can find through this PECOTA process. I'm thinking we'll have Hanigan and Nix's PECOTA lines, adjusted for everything we adjust for, quite soon. But Hanigan is not a .300/.405/.429 hitter, and he certainly isn't that vs. Oswalt.
The issue is that lineups don't get posted until closer to game time and it isn't optimal from a production standpoint for us to wait until they are posted. I'd expect that Hanigan and Nix would actually make the Reds look worse, though.
As I wrote today, there should be a new rule: you cannot complain about balls and strikes when your OBP is under .305.
Additionally, it applies park factors. So for the Rays, for instance, in Game One their bats were expected to do somewhat well against Lee given that the Trop was the fifth hitter-friendliest park this year and they were home.
Yes, as was mentioned in the prior two articles, the playoff PECOTA takes into account the batter-pitcher matchups as well as home field advantage.
Good call. Everything else still applies. I suppose this is what happens when you work 24/7 on PECOTA playoff updates and then have other articles to write, haha. But everything else still stands, so just pretend Holliday isn't there... just like we'll pretend he touched home plate in 2007.
I don't think that has been programmed in yet.
As I mentioned, Bell and Wilson are different from the others, because they went from good to great, not terrible to mediocre.
Yeah, the only thing we need to do is refine the process. Colin, John and I pretty much worked 24 hours on these but now we have it down pat, so each day we'll have the projected lineups, the slash lines, the odds of each team, and blurbs from Colin and me on what we're seeing in the data.
The way this works is that the batter has a projected rate in an event, and so does the pitcher. Maybe Halladay has a projected rate of getting outs 67.2 percent of the time, and Jay Bruce has a projected rate of making outs 62.4 percent of the time. Using the Odds Ratio formula, which I described in my article "Pujols and the Simulation Gauntlett," the combination of those two would give Bruce a 64.6 percent shot of actually being out. Even though his own rate is lower, the fact that he's facing a tougher pitcher pushes it up. That is then repeated for all of the events.
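Here is a minimal sketch of that calculation, assuming the common form of the odds ratio method: batter odds times pitcher odds, divided by league odds. The exact version used in the simulation isn't spelled out in this comment, so treat the formula as an assumption. The league out rate of 65.1 percent is also an assumed value, since no league figure is given here.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio_matchup(batter_rate, pitcher_rate, league_rate):
    """Combine batter and pitcher rates for one event, relative to the league rate."""
    combined_odds = odds(batter_rate) * odds(pitcher_rate) / odds(league_rate)
    return combined_odds / (1.0 + combined_odds)


LEAGUE_OUT_RATE = 0.651  # assumption: not stated in the comment above

# Bruce makes outs 62.4 percent of the time; Halladay gets outs 67.2 percent of the time
bruce_vs_halladay = odds_ratio_matchup(0.624, 0.672, LEAGUE_OUT_RATE)
print(round(bruce_vs_halladay, 3))
```

With that assumed league rate the result lands near the 64.6 percent quoted above, between Bruce's own rate and Halladay's. The same function would be called once per event type.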
I vehemently disagree with your face.
I believe that would be called a super-cycle.
Oh come on, it doesn't REALLY make me angry. Half this article is facetious.
This is certainly further proof of that.
Klipzlskim, I believe you have just given me an idea for a future Seidnotes.
Okay, table fixed. Apologies for that.
Additionally, the players in the final table were inputted wrong -- they are actually from the first table. The 8 players for the last table are being fixed momentarily.
Yeah, I phrased it wrong. Jackson and Young had two triples, a homer, and a single, but no double.
David Bell's cycle nearly made my head explode. I couldn't believe it was happening. Even after I saw him on third base I couldn't believe he tripled.
I should say, not tied for the last spot, tied for the spot AFTER the spot that's last in the AL, which was a three-way tie.
Buchholz was tied for that last spot. He was 38th or so in IP which really brought him down.
It really depends on the analyst. I don't like starting incomplete research. I don't mind developing methodology and storing it on my computer, but I'm probably not going to publish something like what you described until we have reliable data that I feel comfortable using.
I think the only thing Manuel would do differently is only let H2O throw about 5 innings. You're just not going to see Halladay go 8 innings in a meaningless game.
Unfortunately, we really can't without some type of data. We would need a uniform velocity reference and an unbiased way of selecting the pitchers to study. Just picking four or five guys we remember and cobbling together velocity info we get from TV -- which comes from production assistants in the stands -- isn't terribly reliable.
This is on most PITCHf/x analysts' queues of things to research over the next 10 years.
Oh, it certainly COULD happen. My opinion is that, barring substantial changes it won't. Now there is certainly enough opinion here to help sway me to the dark side, but this all detracts from the main point of my article, which is that comparing someone's win totals today to Cy Young's or to others from his era doesn't make any sense.
One point not necessarily against Verlander but still up in the air, which I mentioned, is what he'll be like when his stuff inevitably fades. I don't think he'll be throwing 96 mph as a 36+ year old, so it'll depend on how he adjusts, assuming he is still dominant enough to be within reach.
It was a joke. My focus on the decision was very evident in the article. I think maybe some people didn't read the article and then commented about how the offense was the downfall, which is hindsight analysis for a different article.
No way! I'd say they are two of the least likely. Too tough to tell given the injury risk with their styles and how, with Verlander, we don't know how he'll be when he can't throw hard. With Lincecum, he'll need a good offense to win games.
It might not have changed anything. But I'm not going to simply assume that under different circumstances the Braves still would have scored 4 runs in the series. Maybe they would have but I just don't know. Perhaps that detracts from the main point/question which focuses on what Cox decided before the series, but the intent is still there. It boils down to, did he wisely not use Hudson in a game he was likely to lose anyway, or was it odd to not give Hudson the best chance of at least taking one game from the Phillies?
Of course the offense killed them. But when someone posts a comment claiming I engaged in pure hindsight analysis when my article focuses on the decisions made prior to the series, fun Eric goes away and mad Eric surfaces.
But Mussina didn't get to 300. I fail to see how the fact that he met specific conditions suddenly rejects the null hypothesis. If you're saying that we're bound to see some pitcher stay effective for 18-20 seasons and durable enough to throw 180+ innings then you're probably right, but even that doesn't guarantee 300 wins, as in the case of Mussina. So we'll see.
Additionally, to those who feel that the offense is the only part that brought them down, I would argue that if the Braves went with Minor-Hanson-Hudson, perhaps things go differently. I'm very butterfly effect-y, and to say that they would automatically score 4 runs with different pitchers is just like saying a team lost a run because a player didn't tag from 2nd to 3rd when the next batter hits a deep fly ball that would have scored him had he been on third. Hate when announcers make that mistake.
Just for the record, I really hate when comments start with some type of a sarcastic or condescending "Umm." Make the points, don't be an arse.
Z-Scores are a good tool. I actually used them in my book's last chapter to describe how to compare one era to another, so I'm on-board with that. The comparison between Webb and Helton helps illustrate one of my points. In your table Webb is at 3.68, Helton at 3.67, yet Webb set the record for raw totals by hitting 8 more than Helton. So contextualizing everything helps us better understand what has actually happened across the history of the game and might be a better way than applying era filters.
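For anyone unfamiliar with the idea, here is a bare-bones sketch of how a z-score puts two eras on the same footing. The league samples below are invented purely for illustration (they are not Webb's or Helton's actual league contexts), and the function name is arbitrary.

```python
from statistics import mean, pstdev


def z_score(value, league_values):
    """Standard deviations between a player's total and his league's average."""
    return (value - mean(league_values)) / pstdev(league_values)


# Invented league samples for two different eras (hypothetical numbers)
low_scoring_era = [10, 12, 14, 16, 18]
high_scoring_era = [14, 17, 20, 23, 26]

# The raw total is higher in the second era, but it stands out less from its league
z_old = z_score(20, low_scoring_era)
z_new = z_score(28, high_scoring_era)
print(round(z_old, 2), round(z_new, 2))
```

The larger raw total does not automatically win once each season is measured against its own league, which is the whole case for contextual records.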
No it didn't. It sent your face back to the 12th dimension. BURN.
Yeah this comment feels so out of place, haha.
All's I'm saying is it's unlikely aside from a few guys who got their start prior to the shift away from starts and innings, or who are freaks like Halladay. You're right that someone could be great for 18-20 years, or be durable and effective enough on a winning team that 15 wins is normal from age 38-41, but I just don't see it. I really think we're going to see staggered rotations where you have a #1 to #3 starter making 34 each, a #4 making 32, a #5 at 25 and a #6 at 18-20, with others filling in at various spots due to injuries, before we see guys start to make more starts in a season.
MWSchneider, I don't know off the top of my head, but I would think that the level of talent required to hit .400 now is much greater than in 1941, and so hitting .390 today would be more impressive than .400 in 1941, but again that's why I'm advocating contextual records. .400+ BA is obviously awesome, but getting to the same mark today isn't the same as it would have been 69 years ago.
R.A. -- I'll need a little more than your word that it'll happen again ;-). Tell me WHY. What has led you to believe, with the current style of play, that it'll happen? To me, it seems that starting pitchers are going to go for fewer games in the future as a means of saving them. Maybe 26-27 starts becomes the norm and we see 6-man rotations as well. Everything seems to be gravitating away from what made 300+ wins possible.
First point -- just as a means of further restricting the sample. No real reason aside from more guys had 150 than 175 and I wanted to weed out the guys with 250 PA from those with 400 PA. Second point -- we don't yet have TAv splits. It's something we're actively working on, but until then, it's Raw TAv, or it's OPS.
I'll have a followup this week addressing Alomar, Raines, Posada, and Mantle.
For what it's worth, a decline Chipper at .265/.385/.430 is still very valuable to a team, so even if he stuck around at production similar to that, it's still better than what Rose and Murray were doing at the end of their careers.
Well, one race could be slash stat triple crown, like what Mauer went for last season. I'm sure someone will seem like they'll get the Triple Crown. Maybe the pitching triple crown?
No problem. Hopefully some crazy stuff will happen to give us one more. I'd like to use the simulation for similar articles into next year. Any ideas? I feel like weekly updates on something would be interesting.
You guys with the H/R and BB/K splits are just begging for a CarGo-centric Seidnotes column.
MisterJohnny, I linked to articles I previously wrote that explain the background of this particular simulation. I specifically set it up to alleviate concerns similar to yours. If you don't want to put much stock in it, that's your prerogative, but it was set up to account for wide variations. There are runs where Pujols faces an inordinate amount of #5 or worse starters, just as there are times when CarGo sees 70% aces or #2s. Anything can happen.
Don't confuse this with the playoff odds simulation, however, as they are not the same thing. So, thanks for the concern, but I assure you it's already accounted for.
PECOTA had his 90th percentile as .312/.385/.550 with 25 HR and 20 SB, which he has surpassed so far. Bautista's 90th percentile was .275/.373/.494 with 25 HR. So... I'd say Bautista's is much more improbable. I think most understood CarGo would be a very good player, if not a star, whereas Bautista went from journeyman to Dunn.
Greensox, you'd be surprised how tough it is to blow a 7 game lead late in the season. That's why the Mets situation in 2007 was so nutty.
It's basically making a row of every PA for every player, then numbering each row, which seems like it should be easy but actually takes a very long time, and then joining the table to itself when both events are the same and the PAs are in order, and then counting. So for something that should be easy it's fairly annoying. But I'll do something in the off-season that has all of the streaks.
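For comparison, here is a rough sketch of a different way to count streaks: a single pass over the PA log that measures each uninterrupted run, rather than the numbered-row self-join described above. It is not the script used for the articles. The function name, player labels, and event codes are hypothetical, and it assumes each player's PAs are already together and in chronological order.

```python
from itertools import groupby

def longest_streaks(plate_appearances):
    """Return {(player, event): longest run of consecutive PAs with that event}.

    plate_appearances: iterable of (player, event) tuples, grouped by player
    and sorted chronologically within each player.
    """
    best = {}
    # groupby starts a new group whenever the (player, event) pair changes,
    # so each group is exactly one uninterrupted streak.
    for key, run in groupby(plate_appearances):
        length = sum(1 for _ in run)
        if length > best.get(key, 0):
            best[key] = length
    return best

# Hypothetical PA log; names and event codes are made up.
log = [
    ("player_a", "H"), ("player_a", "H"), ("player_a", "OUT"),
    ("player_a", "H"), ("player_a", "H"), ("player_a", "H"),
    ("player_b", "BB"), ("player_b", "OUT"), ("player_b", "OUT"),
]
streaks = longest_streaks(log)
print(streaks[("player_a", "H")])
```

Because this reads the log once instead of joining the table to itself, the work grows with the number of PAs rather than with the number of row pairs.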
See Murray is interesting. Yes, he has 500 HR and 3000 H, and while I consider him an all-time great, he really didn't have that many awesome years. A lot of his numbers came from still being given so many PAs later when he was through. Comparing him to Chipper:
- In TAv, Chipper .305 to Murray .294
- In WARP3, Chipper 75.0 to Murray 72.7
- WARP3 after age 30, Chipper 43.7 to Murray 27.3, and Murray had basically 3.5 more seasons
- Seasons w/3.5+ WARP, Chipper 11 of 16 to Murray 12 of 21
So Murray is a Hall of Famer, but I'm finding it difficult to determine why, aside from arbitrary raw tallies like 3000 H and 500 HR, one would consider him a better player than Chipper, who played a tougher position.
Unfortunately, streaks take a LOOOONG time to run. I don't know if you remember an article I wrote over a year ago discussing some odd streaks, but it was like a 900 word article that took me 2 weeks to do because it took so long to actually run the scripts.
Meurso -- I was thinking the same exact thing. I'm not sure how I would go about doing it -- maybe like John does the playoff odds?
Brock, you're a man after my heart with that request!
Ron Villone, 2001: 12 GS, 0 w/6
Elmer Dessens, 2004: 10 GS, 0 w/6
Chien-Ming Wang, 2009: 9 GS, 0 w/6
Scott Elarton, 2007: 9 GS, 0 w/6
Ed -- in the Wild Card era that might be the record, but I wasn't looking at that. I was intrigued by the idea of guys going perfect in a specific season, which Buehrle didn't do. And he also didn't miss by just one start either. In 2004, he went 33/35. There was the 5.2 IP start you mentioned and also a 2 IP start prior. In 2005, he went 31/33, and both of those games were 5.2 IP. So from 2004-05, Buehrle made 68 starts, lasted 6+ IP in 64 of them, and in three of the other 4, went 5.2 IP.
I mentioned rain-shortened games in the article as one of the reasons for why it's so rare for a streak like this to occur. It's out of the pitcher's control, but it happened. We could account for it, and maybe that would give another pitcher or two a 1.000 percentage, but I think it's interesting in the same vein of Ripken and Gehrig's streak wherein neither got hurt enough to come out of a game, just like Myers and Schilling haven't had to leave a game prior to the 6th due to something like weather. Remember I'm not necessarily gauging how GOOD a season was, just the % of 6+ IP games.
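A small sketch of the "% of 6+ IP games" calculation, for anyone who wants to check the Buehrle counts. The one trap is that box-score innings are not decimals: 5.2 means five innings plus two outs. The function names are hypothetical, and the "6.0" entries are placeholders for starts the comment above only describes as 6+ IP; only the four short starts (5.2, 2.0, 5.2, 5.2) come from that comment.

```python
def ip_to_outs(ip):
    """Convert box-score innings notation ('5.2' = 5 innings + 2 outs) to outs."""
    whole, _, frac = str(ip).partition(".")
    extra = int(frac) if frac else 0
    if extra not in (0, 1, 2):
        raise ValueError(f"invalid innings notation: {ip}")
    return int(whole) * 3 + extra

def share_of_long_starts(starts, min_ip="6.0"):
    """Return (starts meeting min_ip, total starts, share)."""
    threshold = ip_to_outs(min_ip)
    long_starts = sum(1 for ip in starts if ip_to_outs(ip) >= threshold)
    return long_starts, len(starts), long_starts / len(starts)

# "6.0" is a placeholder for every start that lasted 6+ IP (actual lengths unknown here).
starts_2004 = ["6.0"] * 33 + ["5.2", "2.0"]
starts_2005 = ["6.0"] * 31 + ["5.2", "5.2"]

long_count, total, share = share_of_long_starts(starts_2004 + starts_2005)
print(long_count, total, round(share, 3))
```

Comparing outs rather than the raw notation keeps the threshold check in whole numbers, with 5.2 IP counted as 17 outs against the 18 needed.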
Hmm -- that's strange. You're right. Everyone else is correct, though, which makes it strange. I probably just input the wrong name. I'll look into it.
For a while, certainly longer than I've been alive (24 yrs). It comes from RBI sounding like ribeye, which is a steak.
Mike, yeah the BABIP would be addressed in the ROS projections.
After re-running with the Infante Clause in place, the numbers really don't change that drastically. Infante wins the title about 7,200 times instead of 6,100, which is certainly significant, and the Trip Crown percentage drops from 12.03 or ~15 (with the more optimistic Votto projection) to around 7-8 percent, but the same proportions are intact. It occurs a significant non-zero amount of time with Pujols achieving the feat upwards of 90 percent of the time relative to it being achieved.
Okay thanks. Good to know. Isn't this game great? I'll see if I can program it in.
No, it is set up so that if his total falls short of 502, his results are not stored. As far as I know you need the 502 PA to qualify. If the hitless-AB thing is an aspect of the rule that I am unaware of, that is interesting to program in.
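For reference, a minimal sketch of how the hitless-AB adjustment could be programmed in. As I understand the batting title rule, a hitter who falls short of the PA requirement is charged a hitless at-bat for each missing plate appearance, and he still wins if the adjusted average leads the league. The function name and the sample stat line are hypothetical; 502 is the threshold mentioned above.

```python
QUALIFYING_PA = 502  # threshold cited in the comment above

def title_average(hits, at_bats, plate_appearances, required_pa=QUALIFYING_PA):
    """Batting average for title purposes.

    A hitter short of the PA threshold is charged one hitless at-bat
    for every missing plate appearance; qualified hitters are unchanged.
    """
    shortfall = max(0, required_pa - plate_appearances)
    return hits / (at_bats + shortfall)

# Hypothetical stat line (NOT a real player): 150 H in 440 AB over 480 PA.
raw = 150 / 440
adjusted = title_average(150, 440, 480)  # 22 hitless AB added
print(round(raw, 3), round(adjusted, 3))
```

In a simulation, this would mean storing the adjusted average for a short-PA run instead of discarding the run outright.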
Dan W - I believe I wrote the article you're talking about with Mauer. And I can do something similar for the Triple Crown too. Maybe for early next week I'll run these guys through my simulation and see where they come out.
It's supposed to say "specialist," not "player," for Dunn. I meant that with Howard on the DL for so long, Dunn is the only HR specialist who will end up with 40 HR, though Pujols and Votto are likely to get there as well, even though they are more well-rounded.
Meurso -- because he hits 40 home runs every season. Literally.
I don't think your criticism fits this article. My goal was to identify the best pitchers who have not garnered much attention this year, and in order to do that I had to first establish which pitchers have gotten the most attention. Usually there is a strong correlation between attention and talent. Overall, I think these lists identified the top ten in each league -- this had nothing to do with end of season awards, but your comment is certainly noted for the future.
Oh I know -- I just find it disappointing that it's ALL people were discussing. Maybe it's because I just don't care about steroids, and I love the game even if people used drugs that aided their numbers. To me, things aren't tainted or not tainted.
Surfdent, perhaps it's my Seidnotes column for next week, but if you're really curious, we have a stat here called FRA -- Fair Run Average -- which adjusts ERA to account for a league average number of bequeathed runners scoring. That way nobody gets helped or penalized because someone strands or allows runners to score.
Realistically, it isn't a good in-season measure of performance. The problem is that, in order to make a point, certain analysts will berate others who use W-L, instead of opening a civil conversation that can actually help advance the field beyond a small percentage of the population.
The way I see it is that you're never going to convince 100% of the people you try to convince that W-L doesn't matter. But I think the best way to have a conversation about it is to really dig deep and discuss what the stat tells us compared to what we THINK it tells us, and then take it from there.
For instance, many think that W-L tells us the number of good games vs. the number of bad games, and so a 16-8 pitcher is better than a 12-12 pitcher; the former had more good games compared to bad games. But then we have the issues of a) no-decisions don't imply mediocrity and b) cheap wins and tough losses are lumped in.
One problem is that many fans don't realize that they think W-L describes what I wrote above -- once they do, then it is much easier to sway the argument elsewhere. Unfortunately, I blame much of the stats-laden community for W-L still being so prevalent, because there are a good number of people who would rather make fun of a beat writer (who has tons and tons of pull and readers) for citing a W-L record than open a civil conversation regarding the stat.
StatFreak, yes, that is a common reaction. However, I think it would be much better if, when in a similar situation, you ask those friends why they feel the way they do as opposed to issuing them a barrage of reasons why they are wrong. You might find that these friends aren't too far off of your way of thinking, even if neither party realizes it.
Sure thing, ncassino! Will work on that for next week at some point.
Scott, I think you might have an interesting point with Buehrle. It's not that I, as a stathead, overlooked him. I strongly considered him, but at the end of the day didn't include him. To me, Buehrle and Carlos Zambrano fall into the same category of being decent for a very long time, while never dominant or ever really in the top three or four in their league in a given year. That was what I was aiming for here. However, in an upcoming article I will be discussing him, as I think he has a chance to be this era's Kevin Brown -- a very good but forgotten pitcher.
If I were to extend this into more tiers, they would get in somewhere. But when we discuss the most dominant pitchers of this era they are not anywhere in the conversation. They have been useful and are fun to watch, but would you really feel comfortable including Moyer and Wakefield in a list with Halladay-CC-Johan-Oswalt-Hudson? I wouldn't.
The three you list at the end were considered, but ultimately didn't make it through. Zito wasn't truly dominant early on and his lackluster performance at the beginning of his Giants tenure hurts. I'm as big a Vazquez fan as there is but I put him in the Zambrano-Buehrle category. And color me skeptical Webb is ever an effective MLB starter again. In 5-6 yrs, who knows, but I don't think they quite fit right now.
Right -- they have several Yuniesky-clones, who hit for high averages, but don't walk, don't slug, and don't steal. But still, the extreme disconnect is incredibly interesting.
No. I just checked the link out and he discussed the same topic but looked more into why it's happening. My line of inquiry involved whether or not it's happened before.
Alex, you're absolutely correct, but for a different topic, perhaps one I'll discuss in the future. Here it was showing that, if we are looking to evaluate the K and BB information for a pitcher, the differential has a stronger relationship with something we use to assess value than the K/BB ratio. Of course K and BB aren't the only inputs to value, but since peripherals are used so frequently to gauge performance, it is imperative to ask the appropriate question instead of using the wrong tool to get an answer.
When determining what tool to use in a situation, it depends on your goal. For most, if the idea is to use a K and BB-based metric, the goal is to use whatever is more likely to help determine value. These are subsets, for sure, but that doesn't mean we should blindly choose. It isn't the difference between choosing OBP over BA, but maybe equivalent to using ISO instead of Raw HR.
Dan, I said in the article to assume the Astros kick in $8-10 mil to make these work.
Right -- I actually had to add in the total IP for pitchers because, like Saberhagen, Sheets was incredible in the 105-115 IP span, but he didn't pitch that much. So I wanted to find guys who were at this IP threshold at around the mid-point of the season.
I do not see anything supremely different with how he is pitching. However, and this is a big however, what will be very interesting in several years, when we have a ton of data, is a study of how pitchers who incorporate a pitch and throw it a lot allocate their pitches compared to before they started throwing it. That way we can see if pitchers change beyond throwing the new pitch. For now, it appears that the use of a two-seamer is what has changed for Weaver, as it's leading to more grounders and serving as another "off" pitch, even if it is similar in speed to the fastball.
What I'm saying is that the curve has increased SLIGHTLY, not enough for me to be comfortable saying it has a material effect on his improvement, especially when we consider that he's barely throwing the pitch in 2-strike situations. So if I had to allocate, on a broad level, the reasons for his improvement (meaning if we were just looking at repertoire and not your second point), I'd say 90% two-seam, 10% curve.
I'll have to investigate the other points.
You're seeing a very slight increase in curve usage, but an even slighter increase with 2 strikes, so it seems he is throwing it more to set the Ks up, not to actually make them happen. The two-seamer is where the ticket's at, Sir Bergstrom.
You got it! Is there anything else you would like us to cover? Ideally we want to be able to put together something like this for the glossary that I am currently working on, so we have a handy-dandy resource, but to do that properly we need to know what you want to know!
Yeah -- there is definitely a "gentleman's agreement" and when I said incredibly rare, I meant like, never happened.
Tommy, situations like this are detrimental in the sense that, if the Braves or Mets were to do that, it's likely the Phillies would go out and claim anyone and everyone those teams put on optional waivers. It is incredibly rare for a player to be claimed on optional waivers, which is why this type of move is used as a "loophole" of sorts.
D1, Louis, feel free to ask us whatever questions are still looming. These are very confusing topics, for sure, and Jeff and I want to make sure we are as clear as possible.
The one-line description of the article was a bit too harsh given my tone -- I distinctly said that this was the toughest possible decision for a manager, and that I empathize with the man even if I, from afar, disagree with the end result. Regardless of whether or not he has thrown higher pitch counts, throwing 30 pitches above that, a high concentration of which were early in the game -- which my study previously showed was detrimental the next time out -- is not optimal. There are certainly reasons to leave him in and to take him out -- not everyone will agree. It's interesting to discuss given that not every situation is the same, but I'm in no way panning Hinch for the decision; rather I'm using it to discuss a few different areas of interest that arose.
David, that's precisely one of my points. Walking 8 batters is NEVER good, regardless of whether or not 0 hits were given up. In fact, one can comically but semi-seriously opine that nobody got a hit because he walked everyone! He essentially is risking injury to throw close to the worst no-hitter of all time.
Matthew, that's why the train situation helps to illustrate this. Either way, he is going to be "wrong" but it's about choosing what is less wrong to the individual making the decision. For Hinch, diverting the track to the one with just one person was equivalent to leaving Jackson in, and that's fine. As I said, it's an impossible situation and an incredibly tough decision.
As you said, removing him has several short-term drawbacks: fans and media go crazy, Jackson is mad, maybe Ryan Reynolds gets another tattoo on his neck because he's so angry, while lifting him would potentially prevent something more drastic from a long-term point of view. A big issue is that, while people SAY they like to live/manage for the long-term, their actions are short-term based.
Pavano was included in the article Matt and I wrote about big ERA-SIERA disconnects, and his ERA this year is very close to what SIERA suggested.
Hendrickson was at 3.90 and Sonny at 3.95. All three of them were of the high ERA but lower SIERA mold, though none of them pitched a full season, per se, at around 125-130 innings. Ideally, we could find several comps and find similarities, but it really boils down to Vazquez-itis for me, where the sequence of events is not helpful to run prevention, while the actual events, sans-context, suggest performance should be better -- as in, if they are capable of generating the events at these rates, the sequences should even out. Occasionally, they don't, as with Vazquez.
The problem I'm having is that I'd have to really reduce the ERA threshold to find a bunch of more comparable seasons, and nobody has ever been around 4.90+ for two straight years despite meeting the criteria in the article. It's annoying -- as if I feel part of my career relies on friggin' Ricky Nolasco!
We can't really do that because batted ball data is needed and it isn't reliable prior to 2003 (and if you ask Colin Wyers and me, it's barely that reliable now).
Bob -- why has it been less effective in your estimation?
0.28 runs per nine innings is not material when we're talking about someone expected to be in the low-mid 3's but who is actually at around 5.00. Do you have anything to offer aside from insults?
Yep -- and SIERA pegged Nolasco at 3.06 last season. All of the estimators suggested he would improve markedly this season.
1.0 mph in velocity does not constitute a material decrease, and throwing 2.5% more splitters does not constitute a material uptick.
David, thanks -- numbers are being fixed. Not sure how they came out incorrect.
Right... it's not entirely luck. Luck is certainly a factor, but the problem with using LD/GB/FB to tell the tale is that those are 3 buckets, each of which owns a subset of buckets. His LD% could be the same, but if the velocity of the ball off of his bat is reduced, then it's irrelevant. There is a lot more going on with McLouth that isn't easily explainable, his sudden inability to hit a fastball being one of them.
If you only look at numbers it is easy to think luck/bad luck. I choose to compare numbers to the thoughts of scouts and video I've seen. Luck is certainly a factor but it is not the only factor. McLouth is doing something differently this year, which may or may not be related to vision impairment.
The worst thing for a business, either a major league baseball team or a retail industry, or a service industry, etc, is to act without proper self-evaluation or goals. The Marlins came into this year thinking they could contend, but realistically without the talent to do so. But the talent needed to win in the future will likely still be around when that time comes to pass, so the proper course of action would be to give those guys playing time to see if they belong. Right now the Marlins are an 83-79 team hoping that they can overperform their way into 88-90 wins, which just isn't very likely. To me, if you're projected as an 83-79 team, it's better to go 80-82 while giving prospects major league playing time than it is to go 85-77 in keeping everything the way it is and still missing the playoffs.
I'm 24 years old.
My thing is that I doubt Ross is around next year -- and you aren't going to learn enough about Maybin in 100 PA. This is the mistake the Reds made with Homer Bailey. You have to let these guys play. Maybin hasn't been impressive but he hasn't exactly been handled that well. Sure, he just might be a bust, but it is too premature to treat him like that if you're the Marlins organization.
Perhaps I should have phrased it as, he doesn't stand out as a fielder. Plus, in a corner OF spot, being very solid doesn't always make you a great overall fielder. But the purpose of the article was clearly not to write about Brett Carroll. I think you're probably right as far as Maybin being demoted, though this piece was about what I believe is best for them, not what they will likely end up doing. I think constantly playing for it now is going to hurt the guys who are going to be around when they inevitably win their next unexpected World Series.
I don't know -- haven't watched it since the era of Limp Bizkit's "Nookie" so I'm just assuming nothing has changed.
Matt and I explicitly stated in the articles, when discussing why estimators are better predictors than actual marks, that the main reason is due to the inputs stabilizing quickly and being more persistent year-to-year. That's one of the only reasons estimators carry predictive value -- the way they generate said value is to use what is likely to be seen the following year.
Michael, I've seen much crazier things happen in 40 games at a position, so while I appreciate the comment and perspective of a Marlins fan, I have enough trouble really believing an entire year of defensive data, let alone the equivalent of a quarter of that. A few scouts I've spoken to echoed the same sentiment I wrote in the article.
Simon... it wasn't mentioned primarily because Hudson wasn't there LAST year either. Ryan Roberts was their everyday 2B, and there is no way the difference between Roberts and Johnson is causing anyone's ERA to fluctuate wildly.
Brian, it's beef with how the situation has played out, and how things like this happen all the time. I cringe when I hear trade requests/demands because it hurts the team. The Griffey scenario may be a bit extreme, but KGJ really put the Mariners in a terrible position at the turn of the decade when he demanded a trade and essentially refused to go anywhere other than Cincinnati.
Oswalt might not be like that, but he was going to be a tough player to trade given the no-trade clause and his salary, before factoring in a public desire to move on. I hope, for the Astros sake, that doesn't hurt them that much, because it would be terrible for their one really valuable piece to net "meh" prospects in return. But then again, even "meh" is an upgrade over what they have.
Your intuition is likely derived from seeing him in Toronto for a bit. As Matt noted above, a HBP on the wrist from AJ Burnett required two surgeries, which kept him out of lineups forever, but he was a former first round pick and a top prospect that has finally put his injuries behind him. On top of that, he may be 31 years old, but his deal is going to be relative to Jason Bay and Matt Holliday, and when you factor in fielding and baserunning it's not hard to believe he is better than both of them.
The stats you mentioned are misleading as well. He has only hit more than 25 HR once, sure, but he hit 24 in 2008 so your cutoff point is invalid. He has never had 100 RBI? So what? He had 99 last year, and the more important number is his percentage of baserunners knocked in, especially considering he bats behind Ryan Howard, who routinely knocks in 140.
He may never have won a Gold Glove but... come on, Rowen... you know better than that. Has never stolen more than 20 bases? The man has stolen 20 in each of the last 2 seasons, both 20-20 seasons. You can choose to slice it any way you would like, but your intuition is a bit off here.
I'd be willing to wager a good amount of money that he signs for almost double what you suggested.
I didn't create them or tout them, or express feelings either way on the matter -- this was simply an article discussing what components have been discussed. That being said, it seems the idea is that someone who had an SB against rate of 15/30 would have a 50% rate, which looks worse than someone who had an SB against of 30/100, a 30% rate. But the former allowed fewer attempts. In essence, yours and Tango's posts are correct that it matters little, but this is one of the first things people discuss when you talk about catcher defense and it was worth noting due to that and that alone.
It would be incredibly difficult!
We'll get into this more next week, but this is something I definitely have interest in. Also throw in there lead runners thrown out on sac bunts, which is an aspect of catching not often mentioned. If it's runner on 2nd, 1 out, and a bunt is laid down, and Yadier throws to third, think of how the momentum of a game and the probability of scoring changes between 1st and 3rd, one out, and 1st, 2 out.
While this is entirely my guessing, I'd expect that managers calling the game would be essentially reserved to, like, Scioscia when Bobby Wilson is behind the plate, where the manager used to catch for a living and the guy behind the plate now is inexperienced. I can't imagine managers calling games is that widespread. Maybe in certain situations, but not all game.
Where did I say he was younger? I simply said three generations, as in three people in different areas of the game; a young guy, a veteran, and a catcher-turned manager. Seriously, nitpicks like this are just a waste of everyone's time.
Except that I wrote it :-).
Did you even read the article? I said that this has occurred so far, which got me thinking about the idea in general. Absolutely nothing here says Moyer has this ability. Absolutely nothing here says this IS an ability.
That's a good idea and something I might just look at. For a guy like Berkman, if we're saying he's at 1.051 vs .856 or something like that, and the normal platoon split is -.130, then we can make a point estimate that he'd be somewhat worse than that, making the expected performance around .900. So being .856 when expected to produce .900 would mean it might be worth a try. Thanks for that tip -- I'm going to work with that.
Yes, this is Raw TAv, which is much easier to calculate. Numbers like TAv/EqA start out being in this OPS-esque range and are then normalized with a slew of factors down to the .260 as average level. Unfortunately it's not that easy to calculate for splits.
TGisriel, I've used REqA quite a bit in articles, and that's exactly what RTAv is -- just with the True Average name and not Equivalent Average, since we previously stated our intention to rename EqA to TAv.
Yeah this is the one option people seem averse to mentioning but once you say it out loud it makes all the sense in the world. It ties into #2 as well if you think about it -- why not get Dukes or someone like that to replace Bautista. Either way you pay Wells all that money and still have holes elsewhere. He's an average-ish player, which has value, but the real issue is the contract (duh!) which restricts how they fill other holes.
My line of thinking exactly!
He's so bad he doesn't even make THIS team. In fact, he'd be a bench player on this team that we'd eventually try to ship to Matt Swartz's no-turnover roster.
No. Something must've been lost in the translation a bit from my original. I've fixed it here.
SIERA had his efforts at 3.92 last season.
Well, I've got ACL, so I can see what pops up!
Plus, there's a difference between real and fantasy value. I certainly mentioned fantasy owners shouldn't overlook him, but don't mistake what Marc's doing with his rankings -- three stars for fantasy purposes doesn't mean #3 pitcher. It means that, of the pitchers likely to be drafted by fantasy teams, Blanton is mid-pack, which still puts him in really good territory. Additionally, not everyone at BP has to agree -- this isn't groupthink. I might be higher on Blanton than others. It doesn't mean anyone is right or wrong, and I'd encourage everyone not to expect the writers here to always agree.
Good point -- last year, he was projected for a 4.59 ERA in about 165 IP. His 4.05 ERA in 195 IP was somewhere between the 75th and 90th percentiles; we'll call it the 82.5th percentile. This year his projection calls for a good number of innings and a 4.00 ERA, so it definitely believes what happened last year was real. Further evidence is his DT card, which shows the equivalent strikeout rate to be high, and that factors in league differences.
That table shows Blanton's rates when you remove PAs by opposing pitchers, to show that not everything in terms of his improvement was a byproduct of getting to face pitchers so much more often.
It was very difficult to not put a joke similar to that in that very sentence.
Just checked Murcer -- the problem with using him as a comparable is that in YR5, David Wright was 0.60 SDs below the mean for his league/year. Murcer, while meeting the criteria for the first 4 years, was only 0.11 SDs below the mean in that fifth year. So from a raw tally standpoint, yes, Murcer had a good amount of HRs and then fell off in the fifth year, but from this method, which is used to normalize league/year differences, his fifth year was actually right in line with the average whereas Wright was a good amount below the average.
Murcer wouldn't have shown up here because he had 5 years of good power before the dropoff and the database query involved finding four years. I'm going to run through everything again this weekend to see if I can find 5 yrs then dropoff like Murcer.
It definitely seems that way. When I started formulating this I had to battle whether I wanted to discuss potential reasons for the dropoff or if it had ever occurred before. Obviously, the latter won out so I didn't spend much time with my Dr. Gregory House cap on, but I am certainly in the camp that he did something different last season. Perhaps he psyched himself out as to the dimensions of the stadium and tried to go to the opposite field more often. Or maybe it just sort of happened. It's just fascinating that it happened to begin with. Unfortunately, what we'd really need to know in this case is what happens to players who start pushing the ball more often in the years following that season and I'm not sure we have accurate enough data for that yet.
Glad you enjoyed it. I of course never said that Dusty Baker's post-suckiness success was concrete evidence that Wright will improve. It was more tongue in cheek: hey, one other guy had this happen and he got better, so boom! There are several potential factors for the dropoff, but I wanted to focus more on the historical precedents aspect, which hadn't yet been tackled, partly because searching for the reasons can drive one nuts.
Richie, I meant the "Post Reply" button doesn't work on my web browser at work.
Can't respond directly when at work, but Richie, I did include that. I mentioned that only a handful of the guys even played into a 6th and/or 7th season (or didn't amass 300+ ABs in those seasons) and only a couple of them regained power. Which tells us little because the sample is so small and even less because the sample shrinks the further out we extrapolate.
Yep, I mentioned this above as in my queue.
One of the articles I have in my queue is to repeat something I did last year with ERA and FIP, but with SIERA and ERA. Basically, that article looked at guys with big gaps in their ERA and FIP halfway through the year, and how they fared the rest of the year.
SIERA will be on the site for the season, so no worries there.
Exactly -- the post was to rectify the mistake on my part that showed xFIP testing so poorly. SIERA is still SIERA, though we're going to continuously refine the metric.
Bingo. As we continue to improve the metric and develop new tests and such, this will be referenced as will the results. This isn't a one-off or anything, but a way to get it across for everyone.
In article four we did these comparisons -- not sure what else you're looking for, or if you're thinking of something differently, but click the link in the article.
Yes, I noticed this too when the SIERA glossary rollover was mentioned above and couldn't figure out why I couldn't read the whole thing. Definitely something to work on.
You're right... we should add in more paragraphs.
The BB and IBB discussion is one Matt and I had for a long, long time, but we ultimately felt that the difficulty in differentiating the types of IBBs muddied the waters and for now just felt more comfortable using the term in its current state. But it is definitely something we were conscious of throughout this process.
I think that the glossary entry was saying more that he described it in the most detail in that particular article, given he explained it elsewhere, but I agree, sounds odd. Thanks for pointing it out.
He may not be making the squared GB term negative if (GB-(FB+PU))/PA is negative.
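A minimal sketch of the sign issue raised here, assuming the term is built from the net ground-ball rate exactly as written above. The function name and the sample counts are made up for illustration, and this covers only the squaring step, not the published SIERA formula.

```python
def signed_square_gb_term(gb, fb, pu, pa):
    """Square the net ground-ball rate while keeping its original sign.

    Hypothetical helper: a plain square would turn a fly-ball pitcher's
    negative net rate into a positive number.
    """
    rate = (gb - (fb + pu)) / pa
    return rate * abs(rate)


# Made-up counts: a fly-ball pitcher and a ground-ball pitcher.
fly_ball_term = signed_square_gb_term(gb=100, fb=150, pu=50, pa=500)
ground_ball_term = signed_square_gb_term(gb=200, fb=80, pu=20, pa=500)
```

With a plain square both pitchers would get the same positive value, which is the problem the comment points at.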
Over the next four days Matt and I have articles going really in-depth into a number of topics, one of which is testing SIERA against FIP, xFIP, QERA, tRA, ERA-Park, ERA, you get the picture. So just hold tight. We're going to have the data in the articles themselves too.
You know, Matt and I never even realized that it was like Ruben's last... never mind, my straight face just left.
These are both interesting thoughts I will work to explore next week, as perhaps the 50/50 is skewing things and the 25 vs 75 quartiles could prove more telling.
Exactly -- I specified in the article that risk in this forum dealt with a team not being able to reach its goal. For a team signing someone very volatile, the risk deals with performance and how he could be boom or bust. With someone very consistent but not necessarily that good, the risk deals with the performance not being that good despite the lack of volatility.
I'll Unfilter something along these lines over the weekend, looking for anyone who made a 2.5+ win jump from Y1 to Y2, regardless of GS, and seeing how many of them made 20+ GS the next year, and of THOSE guys, how many were above average. The numbers above are still valid to some extent, but perhaps not as far as determining the likelihood he falls flat on his face.
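The filter described above can be sketched roughly as follows. The record fields (`y1_wins`, `y2_wins`, `y3_gs`, `y3_above_avg`) and the sample rows are hypothetical placeholders, not real player data, and "above average" is assumed to be precomputed as a flag.

```python
def breakout_followups(pitchers, min_jump=2.5, min_gs=20):
    """Count pitchers at each stage of the filter.

    Returns (jumpers, jumpers with enough starts the next year,
    of those, how many were above average).
    """
    jumpers = [p for p in pitchers if p["y2_wins"] - p["y1_wins"] >= min_jump]
    stayed = [p for p in jumpers if p["y3_gs"] >= min_gs]
    good = [p for p in stayed if p["y3_above_avg"]]
    return len(jumpers), len(stayed), len(good)


# Made-up rows purely to exercise the function.
sample = [
    {"name": "A", "y1_wins": 0.5, "y2_wins": 3.5, "y3_gs": 30, "y3_above_avg": True},
    {"name": "B", "y1_wins": 1.0, "y2_wins": 4.0, "y3_gs": 12, "y3_above_avg": False},
    {"name": "C", "y1_wins": 2.0, "y2_wins": 2.5, "y3_gs": 31, "y3_above_avg": True},
    {"name": "D", "y1_wins": 0.0, "y2_wins": 2.5, "y3_gs": 25, "y3_above_avg": False},
]
counts = breakout_followups(sample)
```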
You can interpret the table however you like -- it's there just for that. Regardless, only five guys surfaced, and as we whittle away at the individual skill-sets of those five, there are even fewer comps in this area.
Right, perhaps it should have been phrased better: guys who were able to stay on the mound for the requisite sample sustained their performance. It wasn't a lock that they would stay on the mound, but if they did, they were increasingly likely to remain above average.
You're absolutely correct if the overall goal is to specifically investigate just Garland and Pineiro, but this is more about using them as archetypes to discuss which teams (and which team situations) should be looking into guys with certain skill-sets. Garland is considered by many to be Mr Consistency while Pineiro is the crazy volatile Tracy Jordan character from 30 Rock.
This presents another interesting idea, as Pineiro makes little sense for the Cardinals given the previous signing of Penny, as if the team acknowledged "Hey, we're going to have a volatile/risky pitcher, but we think Penny's high end on his distribution is greater than Joel's." Pineiro may have made more sense if right now they were looking at Carpenter, Wainwright and Mitchell Boggs, not the former two, Penny, Boggs and whomever else.
Thanks, and I do agree that Garland doesn't represent risk in the traditional sense, but this is perhaps more closely associated with investment risk, in that the consistent stocks may not drive you batty but they also might not help you reach your goal. And that's what I'm trying to discuss here -- a team has a goal, and anything they do that doesn't truly help them reach that goal is risky, in some way, shape or form. The risk of Garland isn't that a team is far from the playoffs but that, for a team on the cusp, he isn't going to put you over the top. Now, if he isn't "the answer" and is just another signing -- like if the Mets signed BOTH Garland and Pineiro -- the risk is mitigated.
No, the opportunity cost is what builds the risk profile for someone like Garland. He's risky because of the opportunity cost AND because, while consistent, he just might not be good enough to get you where you need to go.
And the Nats signed Marquis, a similar starter to these guys, all things considered. The opportunity cost comes into play -- could the $7.5 mil doled out to Marquis have been better spent on the draft or the Latin American markets to get the Nats some potential superstars? For teams in their situation that's worth exploring. You might have given me another idea.
That's the next article ;-), specifically discussing a former Brewer.
Rowen, stop making so much sense all the time. Gosh!
Precisely what a nerd would say!
This entire thread reminds me of perhaps my favorite Simpsons exchange ever:
Database: Uh, excuse me, Mr Simpson, on the Itchy and Scratchy CD-ROM, is there a way to get out of the dungeon without using the wizard key?
Homer: What the hell are you talking about??
I STILL cannot beat Mario: The Lost Levels. It's impossible. And yes, I have an NES/SNES in my room... which is above my mother's basement.
If the implication is that I am not cool, well, well... :-(.
Right. Reverting to movies, if I put in Phantoms with Ben Affleck but it stinks, I can quickly remove it for The Phantom, with Billy Zane. If that stinks, I can put Phantom of the Opera in, and if that stinks, Batman: Mask of the Phantasm. The low risk moves help a team avoid being stuck with sub-replacement level players, as you said, because they can unload someone without feeling like they wasted money.
I'll be perfectly honest, I had no idea Condrey had even signed with the Twins, so nothing here was meant as a slight to anyone. I would assume Condrey was the picture because he represents the type of low-risk player that would be included in a study like this.
Eighteen, exactly (the last line). The Yankees might sign Bruce Chen in a Sergio Mitre type of role, if that. The Royals would sign Bruce Chen in the hopes he becomes the 4th starter. It depends on the teams, and one of my next articles will actually focus on how teams would view players, either of this ilk or of the more standard free agent variety.
Yep, I replied to it already, two comments above this. Cut and pasted below:
Schere - I didn't say they were inefficient, but that they look closer to that area than under WARP3. Still efficient with rWAR, but close to inefficient.
For some reason here at work the Post Reply button isn't working, but Patrick, I appreciate the comment although the article was not written as a direct response to comments in the SotP article. One of my goals moving forward has always been to, more often than perhaps I did this past year, tell a story using statistics, not about statistics, although there are certainly occasions where a statistic is the driving force behind an idea, perhaps to prove or disprove a common creed or myth. And I've never been afraid to use an outside statistic -- nothing felt awkward at all. I do believe that personalizing helps develop a stronger bond with the audience, however, and I bet a whole bunch of readers don't even know about my movie background, which would be a fun discussion perhaps for another day.
Jeff, Joey Gathright has no relative value.
Dave, I will say that we set it up to test the future as well as backcasting to avoid this type of issue, so we know how the general formula works on the dataset it was derived from as well as how it works on a subsequent year or two. But of course, you're correct in that we'll need a few more years of forwardcasting.
Yep, and when Matt and I put our articles up, you'll see how it fares favorably to the other ERA estimators.
Exactly right. When Matt and I set out to work on SIERA, the goal was to fix a problem in a stat currently offered on the site, which was causing issues in specific areas that were, as Colin mentioned, over- or underestimating pitchers. It was not a matter of us simply saying, "hey, let's change one little thing and market it as a brand new stat" like you might see in pharmaceuticals. If that were the case we would have simply said we fixed one quirk in QERA. But when so many have come to rely on the accuracy of certain metrics, it is really important to present that which is most accurate. Sure, Johan Santana will still look better than Matt Belisle, but the metric will more accurately model what Santana and Belisle contribute within their control.
Yeah that is a concrete suggestion in that these pitchers may be in a new role while not necessarily changing their approach, meaning they throw harder but aren't exactly focusing solely on a pitch or two, rather utilizing their whole arsenal.
That is why I used FRA in addition to ERA. FRA, which is FAIR_RA, is a stat we probably don't use enough here. It is RA (all runs, not just earned) but with the league average rate of bequeathed runners scoring, so it normalizes for such situations. If Joe Blanton loads the bases with no outs, leaves, and JC Romero strands all the runners, Blanton's ERA wouldn't go up, but his FAIR_RA would.
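A rough sketch of the idea, not the site's actual FAIR_RA calculation: the departing pitcher is charged with the runs that scored while he pitched plus a league-average share of the runners he left on base. The function names and the 0.3 scoring rate are assumptions for illustration only.

```python
def fair_runs(runs_while_pitching, bequeathed_runners, league_score_rate):
    """Runs charged to a pitcher under the assumed FAIR_RA-style rule.

    Bequeathed runners are charged at the league-average rate at which
    such runners score, regardless of what the reliever actually did.
    """
    return runs_while_pitching + bequeathed_runners * league_score_rate


def fair_ra(total_fair_runs, innings):
    """Scale fair runs to a per-nine-innings rate."""
    return 9 * total_fair_runs / innings


# The Blanton example: bases loaded, he leaves, the reliever strands everyone.
# 0.3 is a placeholder league rate, not a real figure.
blanton_charge = fair_runs(runs_while_pitching=0, bequeathed_runners=3,
                           league_score_rate=0.3)
```

Under this sketch Blanton is charged with something even though no run crossed the plate, which matches the direction of the example above.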
Yeah, that would be included in any expected value approach. If RBIs occur 4% of the time with runners on third and two outs, then such a runner knocked in would be counted differently than if the runner was on third with one out.
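One simple way to picture an expected value approach is to credit the batter with the outcome minus the baseline rate for that situation. This is only a sketch under that assumption: the 0.04 is the hypothetical figure from the comment, the 0.50 for one out is made up, and the lookup table and function name are illustrative.

```python
# (base, outs) -> assumed share of plate appearances that produce the RBI.
EXPECTED_RBI_RATE = {
    ("3B", 2): 0.04,  # hypothetical rate quoted in the comment
    ("3B", 1): 0.50,  # made-up rate for contrast
}


def rbi_above_expected(base, outs, drove_in):
    """Credit for one plate appearance: actual result minus expectation."""
    actual = 1.0 if drove_in else 0.0
    return actual - EXPECTED_RBI_RATE[(base, outs)]


two_out_credit = rbi_above_expected("3B", 2, drove_in=True)
one_out_credit = rbi_above_expected("3B", 1, drove_in=True)
```

The same RBI is worth more in the situation where it was less likely, and a failure costs less there.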
I don't know, but your comment is surely foreshadowing of the future.
It's a Simpsons quote from when Homer was playing Poochie on the Itchy and Scratchy show and had a list of character demands.
Richie, that's good to know. I'll have to check his book out.
You're correct in a lot of what you wrote, but the purpose of the piece is to figure out why personal opinions can clash with fielding metrics. Putting butts in the seat is irrelevant here. Victorino appears like he should always have a great UZR because he puts in so much effort, but the effort overstates his defensive output from a technical standpoint, not one of economic value.
The system should be louder, angrier, and have access to a time machine.
Delta in this instance means the difference, or change, from one to the next. So when I say REqA Delta I am referring to the REqA of the individual in relation to the average for the league for that split.
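As a small illustration of that definition, the delta is just the player's figure minus the league figure for the same split. The function name and the sample values are made up.

```python
def reqa_delta(player_reqa, league_reqa_for_split):
    """Difference between a player's REqA and the league average for that split."""
    return player_reqa - league_reqa_for_split


# Made-up numbers: a positive delta means better than league in that split.
example_delta = reqa_delta(0.290, 0.260)
```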
That is in my list of splits I'd like to generate, along with baserunners/no baserunners.
Yeah, I hear you on that. But rest assured this will be resolved. In the meantime, since I'm in the process of putting together an Excel manual for baseball, I'll throw out there that you can use web queries to make your life easier. Click in the cell, go to import external data, from web, and then navigate to the page, make sure all players are showing, import the table, and where that year of data ends, run a new web query for the year prior. When finished, you can sort it and such however you want, and if it is done in the midst of the season it will update current year data everyday.
Okay, did not realize that. Honestly, my gast is flabbered over it. Will do everything I can.
That is already there. Go to any report, say Pitcher-Standard. Click the dropdown for Year. 2009 will be the first year you see at the top, but you should notice that the scroll down is not entirely at the top. ABOVE 2009 is (All).
Right, and extenuating circumstances like this are going to persist for several players, meaning we'll need to make careful adjustments for them. But velocity off the bat will cover a larger spread of the league as a better proxy than hits will, I think we can agree on that.
I meant to say it isn't as tough as it is time-consuming, so it's definitely going to get in there and soon, along with all handedness information, and more splits to boot.
Dave, I would agree that hit/out could serve as a decent proxy FOR NOW, but certainly not when more data is available, assuming it becomes available. Castillo could smash line drives that are recorded for outs and get bloop hits. For me, the idea of how hard the ball was hit is much more important regardless of the outcome.
Peter, more like you can have a guy utilizing a sound strategy who just isn't a good hitter. So he might not be biased in any direction when making mistakes and might make fewer mistakes than the field, but when he has what we are calling positive responses--like swings in the zone--the results aren't solid.
I would imagine that anything helpful in any capacity for components of our site will be added moving forward.
Dave, I mentioned in the post that the one glaring omission right now for batters is their actual handedness. Same for the pitcher handedness splits. Quick fix and something that will CERTAINLY be implemented. Additionally, we are going to get EqA up there instead of OPS. Calculating EqA for Splits is tough but we're working on it.
As a further point, if every non-foul ball that Castillo contacted was hit really, really hard, regardless of whether those balls were fielded or not, he would be utilizing a suboptimal strategy because he makes contact all the time, most of those contacted balls are put in play, and he hits them really hard. Home runs are hard hit balls but not all hard hit balls are home runs.
No, this is incorrect. You need velocity off of the bat or a means of gauging how well the ball is hit because not all line drives are hit hard, so LD% could overestimate it, not all ground balls are hit weakly, so GB% could underestimate it there as well, and you in no way need to hit a ton of HR to hit the ball hard.
Brian, the article this week discusses contact. Castillo is well above average in contact in and out of the zone, but even that isn't enough since we need to know what TYPE of contact he's making. So sit tight!
Not a problem. I'm actually more concerned with the contact aspect than redefining the strike zone. It seems the goal would be to treat swings as their own test: true pos = in zone contact, true neg = out of zone contact, false pos = in zone whiff, false neg = out of zone whiff. Then to run through the same process for swings and factor that into the results found through this overall process.
That way, we get an idea of sensitivity towards making contact; players with lower swing sensitivities--those more prone to in zone and out of zone whiffs--are the ones with suboptimal strategies IF they have a response bias in the method proposed in this article that is skewed towards swinging too often. It would tell us that they swing too often AND those swings don't amount to anything.
On the flipside, those with high sensitivity marks for swings and a response bias above 1.0 in THIS study wouldn't be penalized as much. Vladdy would fall into this category. His response bias is through the roof with an average or so sensitivity but his strategy is still sound since he connects on most of his swings.
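The four-way swing tally described above can be sketched like this. The labels follow the comment; the summary at the end (contact rate in and out of the zone) is a simple stand-in chosen for illustration, not the sensitivity or response-bias calculation from the article, and the sample swings are made up.

```python
from collections import Counter

# (in_zone, made_contact) -> label, using the mapping given in the comment.
LABELS = {
    (True, True): "true_pos",     # in-zone contact
    (False, True): "true_neg",    # out-of-zone contact
    (True, False): "false_pos",   # in-zone whiff
    (False, False): "false_neg",  # out-of-zone whiff
}


def tally_swings(swings):
    """swings: iterable of (in_zone, made_contact) booleans, one per swing."""
    return Counter(LABELS[s] for s in swings)


def contact_rates(tally):
    """Return (in-zone contact rate, out-of-zone contact rate) on swings."""
    in_zone = tally["true_pos"] + tally["false_pos"]
    out_zone = tally["true_neg"] + tally["false_neg"]
    in_rate = tally["true_pos"] / in_zone if in_zone else None
    out_rate = tally["true_neg"] / out_zone if out_zone else None
    return in_rate, out_rate


# Made-up swings: 10 in the zone (8 contact), 4 out of the zone (3 contact).
swings = ([(True, True)] * 8 + [(True, False)] * 2
          + [(False, True)] * 3 + [(False, False)] * 1)
tally = tally_swings(swings)
rates = contact_rates(tally)
```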
Gordon, exactly! Right now we just have 2008 and 2009 but something in my queue is a comparison. Did Luis Castillo have a response bias of 0.5 in 2008 and 0.1 in 2009, etc? Did Player X see a drastic improvement in results that his improved signal detection process indicated?
Very, very true. The ideal, it seems, would be to extend this to contact, breaking the test outcomes into their own test. So, a test solely on swings, where the desired result is contact on a swing. As the sample sizes of PITCHf/x grow, the Counts can definitely be broken down deeper, and I'm really curious about the curveballs, changeups, sliders, etc since that could conceivably help pitchers realize they can throw a curveball out of the zone more often perhaps to a certain hitter given his propensity to chase them.
Except that Wade had a proposed Howard for Kip Wells deal, so it's not like he didn't try...
Is he still talking about bacon?
I guess you have not been reading this series, but Perry and I are friendly. He has been helping out with my studies here, as mine are essentially expanding upon what he is doing, using PITCHf/x data and such.
If you read the previous article on the crossover effect I mentioned just that, and how we were going to start by exploring changeups and fastballs.
Just to clarify, the point here in case it was confusing, is that changeups are most successful when they are perceived to be around 13-15% slower than the fastball perception. To achieve this, the pitches can be sequenced in a number of fashions based on location. You never want to do the same thing too often, but if you are going to use FA-CH, it's more successful to find a way to get 13-15% separation in perception, either if you average that separation on the gun or can achieve it through location deltas.
R.A., there is nothing here that says sequences need to be the same, and I honestly thought it was just a given that sequences should never be the same. The point is that pitches can be a variety of speeds depending on where they are thrown, and they can seem faster or slower and therefore be more successful based on the speed separation.
Yeah, I briefly touched on that. As the separations grow larger and larger in perception, the hitters can understand that a different pitch is being thrown. Based on the results, it seems that point is when the changeup is in excess of 17 perceived mph slower than the perceived fastball mph. And on the other side, when the separation is below 8 percent, it looks just like a fastball apparently since the results either favor the hitter or are around 0.00-average.
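A small sketch of how the perceived separation could be checked against the bands mentioned in this thread (below 8 percent, and the 13-15% window). The percentage is assumed to be measured against the perceived fastball velocity, the function names and sample speeds are made up, and the upper cutoff is left out because it is quoted above in mph rather than percent.

```python
def perceived_separation_pct(fb_perceived, ch_perceived):
    """Percent by which the changeup is perceived slower than the fastball."""
    return 100.0 * (fb_perceived - ch_perceived) / fb_perceived


def classify_separation(pct):
    """Bucket a separation using only the percent thresholds from the thread."""
    if pct < 8:
        return "reads like a fastball"
    if 13 <= pct <= 15:
        return "target window"
    return "outside target window"


# Made-up perceived speeds in mph.
good = classify_separation(perceived_separation_pct(90.0, 77.4))
flat = classify_separation(perceived_separation_pct(90.0, 84.0))
```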
I did a study last year on pitch counts here and found essentially that a) it was number of pitches in an inning, not a game, that mattered more (throwing 35-40 in the first and racking up 100 is much more stressful than throwing 17 for 9 innings), and b) it was power pitchers affected more. Finesse and Neutral guys did not see their stuff deteriorate as much in that or the next start.
Richard, no apology necessary, you just seemed like you were picking up steam ;-) and I wanted to slow that locomotive. The stance may have changed but it just seems to me like something with relatively little reward in relation to the risk. But the big thing I wanted to get across here in the thread is that we focus too much on the what and the who and not the why, which is more important.
Essentially, the risk greatly outweighs the reward in this instance given that it's fantastic that PITCHf/x is free right now and I'm not going to exert energy into something that could potentially alter that.
You're misunderstanding everything that has been said and putting words in my mouth. Not once did I say I would not report something interesting with the data, or anything involving pitchers. In fact, reading the comments, I don't see how any of this could be misconstrued as dealing with anything other than umps. Interesting findings, of course, but since there was some hesitance or suggested restraint with regards to the data being used by analysts to evaluate umpires, I, myself, am not going to walk that fine line.
PITCHf/x data should not be taken for granted. It's amazing that we have this for free. If it's something the powers that be were touchy about, count me in as someone who is not going to try and make them angry.
Yeah this is what should be discussed - the WHY - and finding out how to do that, which shouldn't be tough, per se, but very time consuming. I'll be honest, though, I know there was a bit of an unspoken hope that analysts wouldn't focus their time and efforts on umpire valuations. I don't know if that has changed a whole lot, but after being at the PITCHf/x Summit this summer I got the sense it is still the same. This line of research came up because of how atrocious they have been so far but I probably will not continue along these lines. Not to say I don't consider it valuable information, but I don't necessarily want to p*ss the wrong people off.
And in addition to that, HOW they are grading them. I can't imagine--or at least I would hope--more goes into it than a straight, this is the zone, how many did you miss, etc.
Based on other comments, it seems he isn't a fan of my work. That's fine, I'm not going to win over every reader. The respect I get from all the other readers who understand how hard I work to produce consistent and quality work is more than enough for me.
Yeah, Richard, you can certainly determine frequency of missed calls based on location, as we have the location parameters as well as the result of the pitch, but the definition of zone is less concrete. An umpire's zone is probably a lot like a park factor in that it's more meaningful over a rolling span... unfortunately PITCHf/x is still in its infant stages, so the real questions are a) what is the frequency of missed calls by zone, and b) WHY are they missed?
The second part is one that people, for whatever reason, don't seem to care about. I, personally, want to know WHY certain calls are missed, not if they are or not, because if we know the reason we can either alter the training of umps or re-train current ones.
If it's movement of certain pitches, where the catcher originally sets up, how the ump sets up, etc these are the questions to ask. Unfortunately we cannot answer them all yet.
That's a good idea. I'd also like to explore how movement affects umps. How many times have you seen that two seam fastball tail over the plate but it isn't called? Or the sharp curve that fools both catcher and ump? I bet there is something there. It's just a fine line when evaluating umps that I'd like to cross slowly.
Yes, the latter. It was extended out to be in balls territory, with a bit of weighting to clear up any bias with some guys seeing more balls or in zone pitches than others as well as some umps being behind the plate more often than not. It's not a 100% perfect way to determine the size of a zone but it does the trick for now.
No, no it isn't at all. The umps who blew the major calls were experienced veterans. Being proactive is always better than being reactive. Being reactive is okay as long as you react the right way. This is that third tier - wrong reaction.
If you search archives for my articles here, I've written three in the past month discussing perceived velocity and how this changes the game, using both the flight time of the pitches with actual examples and the location of the pitch.
I don't doubt your line of thinking. This isn't an area in which I'm an expert at all, but I'm going to be speaking to some optometrists and specialists in this area to find out their thoughts. I think, too, expectations loom large. Facing Verlander you expect to see a faster pitch than when facing Moyer and so on.
Exactly. We're so caught up with the radar readings that we are trained to think that a 92 mph fastball followed by an 82 mph changeup is devastating due to the differential. But if they were sequenced in a fashion that essentially borrows from Peter to pay Paul, the pitches are of similar velocity. Much of it is intuition, but the idea that a pitcher can raise eye levels, "mix" speeds, go in and out, but throw 2 pitches with the exact same perceived velocity in a row is incredibly interesting.
It's not always about changing speeds on every pitch because movement is key as well. This is more of just an introduction to the ideas of crossing over and throwing at risk pitches. You want to throw all of your pitches effectively, but you want to sequence them in fashions that don't carry risk. Going fastball-changeup isn't always going to produce a velocity spread, which is really the major point.
Alan Nathan is a friend of mine and he has been extremely helpful with regards to my research on perceived velocity thus far. So yes, double recommendation to go to his site.
The hitter is perceiving the average velocity over a flight time. The flight time depends on where he picks it up, which can go hand in hand with where the pitcher releases it (closer to home like Chris Young, further like Ian Snell). If a pitcher hides his pitches well, the batter won't pick it up until later, meaning the perceived velocity for him will be greater, since he is tracking the ball for a lesser distance. The inverse is also true. There have been a few studies in this regard already that are going to be recapped in my perceived velocity series here and applied to 2008-09 PITCHf/x data.
It is without question relevant, regarding the time it takes to cross the plate. As I recently showed, Ian Snell releases the ball further from home, making his velocity measured at 55 feet overstated. In actuality, his fastballs, based solely on flight time, were closer to 88 than 91.5. The batter recognition is a big part as well--and I have an article or two lined up on that--but the flight time is definitely important. If someone like Snell sequences poorly like he did in the PA I diagrammed, is already throwing at a slower perceived velocity, AND allows hitters to pick his pitches up moreso than others, well then it's no wonder someone like him struggles, YET gets more opportunities because of the radar gun reading and his age.
I began to study this in the first of my perceived velocity articles. With Chris Young, however, be sure to read the follow-up Unfiltered post where I correct his data from the original article.
Matt and I may share the commonality of being Phillies fans, but I did not write this article :-).
Yes, just to clear something up that was surfacing here, the hitter is reacting to the average velocity, which is (Start_Speed+End_Speed)/2. So in the Blanton pitch described above, at 93 and 85, the hitter is reacting to an 89 mph pitch basically.
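The formula above as a quick sketch, using the Blanton pitch from the comment. The function name is just for illustration.

```python
def average_velocity(start_speed, end_speed):
    """Average of the release speed and the speed at the plate, in mph."""
    return (start_speed + end_speed) / 2


# The Blanton pitch described above: 93 out of the hand, 85 at the plate.
blanton_avg = average_velocity(93, 85)
```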
However, that doesn't affect any of what the perceived velocity articles I've been writing discuss, like location or flight time or actual distance from release to plate.
The average velocity just creates a new equilibrium wherein pitchers don't really gain advantages over anyone else. Chris Young releasing the ball from 53.7 feet and not 55 gives him an advantage, just like Ian Snell releasing from 55.9-56.1 instead of 55 makes him appear to throw slower.
I have no idea what sort of Red Bias you're talking about, especially to me given I'm a Phillies fan, but think of what I said along the lines of NFL instant replay. Nothing in the process of my research for this analysis was concrete enough in the Angels' favor to make me feel any differently than I did about the Yankees against any of their potential opponents. If you're arguing my diction or grammar, I'll just concede right now since I'd rather go watch baseball.
Yes, but this was what, seven years ago? Things change, and Scioscia would be wise to actually test Molina now. Blindly accepting that someone you molded almost a decade ago is still Pudge-in-his-prime-esque is a mistake. Nichols Law, people. Molina may still be a great defensive catcher but if Scioscia plays into the Yankees' hands here and hamstrings his team because of an idea without proper evidence it could be costly.
Make him throw to first for a pickoff or try to nab a baserunner. He's surely better than Posada but that doesn't make him Yadier Molina.
I picked the Phillies to win 3-1 in the NLCS preview. The Phillies won 3-1. There may be some odd decisions here from time to time--like Milwaukee over Philadelphia--but not everyone here hates the Phillies. In fact, Matt S and I are pretty hardcore Phils fans.
Well, that's because I'm a Phils fan! I even turned to my dad and said the ball would go into left before it happened. Just so predictable. The guy looks like Travis Lee making the plays now but just simply cannot throw with any consistency. Makes no sense. And I agree, he looks less lost at the plate. The results may not have drastically changed but the "process" has, so to speak.
He is certainly athletic, chasing down foul balls, stealing bases... heck, if he was more decisive yesterday he had an EASY steal off of Ubaldo... but maybe he and Perlozzo can work on throwing this off-season as their defensive work this previous off-season has really paid dividends.
Yes. Platoon splits simply aren't that reliable for individuals unless observed over many, many, many PA. Two years' worth of platoon data is likely going to be a tad less than the equivalent of one entire season. So, it may be on principle, but the principle is not unfounded at all.
Yes. I requested a game of Lincecum's as well as Lidge, to potentially see if Lidge not only has a lower real velocity but a lower perceived velocity. Stay thirsty, my friends!
I can see them making a push to sign someone like Erik Bedard, as they might be desperate to the point of not offering him as incentive-laden of a deal. Maybe a big overpay for Harden.
More likely though would be bringing in Wolf+Garland.
Yes and no. His peripherals portend more success moving forward, so in that regard he doesn't belong with Backe, Moehler, Hampton, Ortiz, but it doesn't mean he wasn't bad this year, and this is a fallacy too many people fall prey to. Peripherals better than the actual simply means we would expect performance to improve next year, but it does not automatically transform a 5.50 ERA into a 4.00 or anything like that. So, ultimately Paulino does not belong with those guys in that sense, but as I said in the article, any of those guys would have been fine as a fifth starter--true for Paulino as well--but not as 3/5 of a rotation at any given time.
I e-mailed Schwimer when I began my research but never heard back. No way Happ is 94-96 though. Location doesn't add that much. What Happ MAY be doing though is hiding the ball from view for a longer period of time so the batters cannot track it, and he might release his pitches from the same plane, in addition to utilizing location, but I cannot see that getting him higher than maybe 92-93, especially based on the results I found here.
According to Dr. Nathan, the plane has less to do with it than I thought. So the likely issue with Young deals with the system being off for that series, and the new results from his games will be used in further studies on here. It'll likely produce a delta of 3-3.5 mph instead of 6. However, as we'll discuss next week, location could make it appear +6 mph.
From Dr. Nathan:
"I used to do a simple demo for my introductory courses. I would fire a steel marble horizontally from the top of a table. Simultaneously, I would release another steel marble from the same height with zero velocity. The two marbles hit the floor at the same time. In both cases, the initial vertical velocity was the same (0) and the height above the floor was the same. They experienced exactly the same motion y(t) in the vertical direction."
I'm planning on continuing this line of research throughout the rest of the year. Quality over quantity. I have enough other interesting things to research in between if they take longer than anticipated, but there will certainly be something on this next week, and most likely another the week after.
To me it always just seemed like he wanted another GM job and never really bought into what Houston had in mind, instead merely going through the motions.
Chris, it was admittedly weird to hire Wade in the first place, and as I'm sure most people know, I'm a Phillies fan, so I experienced him as a GM for several years. He isn't awful, but he isn't the guy to really right a ship under very tight constraints. He was basically given a team with a poor farm system and made it slightly poorer through moves to improve the big-league club marginally, and their run last year that defied logic didn't help. The Astros' problems lie with McClane, a man much more successful than I am, clearly, but who has demonstrated a knack for not making the right calls as an owner of a baseball team.
Yeah I'm not sure why you guys are discussing Wade when literally my 2nd or 3rd line mentions how he is not entirely at fault for the issues here. Sure he hasn't done anything particularly fantastic, but his moves are not what have sunk the Astros. To get big returns, he would have needed to trade Oswalt and Berkman a few years ago, when he wasn't even the GM, etc.
And as a further aside, it might not be the distance that is important. I agree nobody is releasing inside of 50 feet, but rather that the plane of his delivery decreases the time to plate... not necessarily the straight y-distance from release to home, but the plane of the release.
Mike, as I mentioned a few comments above, the data was off for the game I was sent of the Padres, so when I get the new video I'm going to compare. That entire Angels-Padres series was off by 1-3 mph for each pitcher. The calculations are also bound to experience some error in frame counting, which could mean Young was throwing 86 with a perceived 89-90, but not enough of an error would exist to bring it all the way down to the 85-86 he actually threw. Same goes for Bell. The others all checked out.
Everything you guys are mentioning is stuff I'm interested in exploring down the road, data and feasibility permitting, but we're starting slow and working our way up. When we get into the attention zone it's going to be more of an explanation of handling pitches and the game theory of batters and pitchers. Then perhaps getting into the more advanced aspects like platoon splits and perception and type of hitter vs. attention zone.
Josh, that is the sort of thing to look at down the road. Right now the focus is more on the perceived velocity itself.
Just found one slight issue - the Padres game I was given was a bit off that day as all of the Padres and Angels pitchers had lower velocities on average than their seasonal lines. If all of them did, then it leads me to believe that Young and Bell still posted their quoted perceived velocities but that the deltas should go from 6.7 and 5.2 to more like 3.7 and 2.2, which makes a bit more sense. All of the other pitchers' results stand.
The issue with Happ as presented here just means that the plane he creates when pitching and his actual release point makes the pitch look very slightly slower. There are other ways to increase the perception of velocity, which we will get into in the coming weeks. Next week we'll look at location, and soon thereafter how deception in the release of a pitch plays in. I'm personally not willing to accept that all of Happ's success is fluky, when he seems to be deceiving hitters somehow.
If we all get on them it just might happen! Haha, but seriously, it's something I discussed with them, and it is DEFINITELY very useful to teams, so it might take some time but it could be in the cards. Those guys can do anything.
Exactly my thoughts. He has a huge downward, jumpy windup that may allow him to release closer to home plate and allow for less flight time.
Something I'll be discussing a bit down the road is where the hitters pick the ball up and how that works for pitchers in terms of deceptiveness.
That is the very first thought that came to mind. If he was throwing 98-99 and was THAT long, sheesh, it could have looked 102-103.
Burr, when it comes out next year, I'll definitely run through the 2009 data and point out guys who may experience regression due to the PW and PWO marks.
I'm not defending anyone or anything. Based on Year 1 results, Jackson definitely has outplayed Joyce. However, Joyce was more for the future, and the team was content with the Gabe Platoon. Their season did not go downhill because of this trade. It went downhill because of the confluence of events I mentioned above, predominantly with players counted on to perform at certain levels falling WELL short.
You also cannot assume that he'd put up the same numbers. My biggest pet peeve with announcers is when they assume that each event is independent of all others. For instance, say you have a runner on 2nd, 1 out, and a guy hits a single up the middle and there is a question about whether or not 3B coach should send him. If he sends him and the runner is thrown out, and the following batter singles, 98% of announcers will say something like "..and if he was held at third, that single would have scored him..." as if the subsequent single was a guaranteed event that had nothing to do with the prior event.
Jackson is having a good year and he very well may have had a good year with the Rays but we cannot automatically add in his Detroit performance to the 2009 Rays and conclude they would be right there.
Iwamura has an option on his contract, as does Crawford and Zaun. Popular opinion is that the latter two will definitely be back--barring a trade involving Crawford--while Iwamura might not be. After all, Aki's injury paved the way for Zobrist's emergence.
All of my database work is done in MySQL or Oracle, however as I recently began learning simulations, Excel was a natural way to start. And seeing as how this serves sort of as a tutorial, Excel seemed like a more natural fit.
Yeah and if the article was about bad offenses with good records, they would certainly be atop the list. This was more of the worst offenses to potentially make the playoffs. Of course the Giants are terribly unlikely since publication but at the time they were 2 back.
I appreciate the tip as I am always looking to improve as a writer, but maybe send me an e-mail next time and leave the comments thread for an actual discussion of the article.
The last one. The stats were not taken from the same points, so the sim would be bound to produce different results. Back then Pujols trailed I think by 20-25 points and when I ran this it was 30-35 points.
I agree. Let him play. Like now. Right now.
Or rather I should say that OBI% isn't treated as a constant but rather as a range. As in if Pujols knocks in 20% of others on base, his minimum RBIs in a season might be 10% OBI relative to the OBP events of preceding hitters + home runs.
And, to add further, because of the variability in the HR and SLG of the hitter, RBIs were not a constant. The OBI% and average OBPs of preceding hitters were constants, but as the HR and SLG total changed, so too did the player's RBI total.
Yeah, it's explained above even though RBIs aren't in the pictures. RBIs were modeled as a function of the batter's HR, OBI%, SLG%, and the average OBPs of 3 preceding hitters.
Yeah, me too. I thought it all went great. I might have been a featured guest but I definitely enjoyed hearing Neal talk as opposed to my own voice.
I had a blast. I love how, even though many of us have never met, we didn't miss a beat in person. Cannot wait to attend more events, even if I did have to drive 5.5 hours to get there!
I think we're also conditioned to expect managers and GMs to react immediately to the extreme outliers, so when they don't it's even more surprising.
Personally, I feel that describing it like that is easier to read than listing the break angles per the PITCHf/x data I have and taking 5-6 paragraphs to explain what it means and how it compares.
The 40 IP as a minimum is there because I wanted to get guys with similar playing time as Lidge. The guys with terrible WXRLs that have fewer innings clearly were not given more duty. These 40+ IP guys were continually tossed out there.
Henneman had a 0.476 WXRL that season, so in spite of the ERA, he still pitched relatively well.
Well I can't speak to his 2004 dominance... 14.93 K/9!!!... but I know last season was anything but easy for me as a fan. He may have gone 41-41 and 7-7 in the playoffs but he was NOT anywhere near automatic and he had plenty of "Mesa Saves" in which it teetered on the brink of a blown save loss but unlike Mesa, Lidge managed to escape unharmed.
Yeah, Lidge is generally a good reliever, it's just amazing how he has gone from the best to the worst in a matter of a year - not his actual talent levels but his actual numbers.
It could be. I can't remember if it was Lidge or not, but I thought I remembered hearing a story in which Berkman or one of the Astros brought it up to Lidge after he left. It happened to some Astros guy that then left the team and I have a feeling it was Lidge.
But the dropoff from 96 to 93 is important too, because Lidge's slider dips out of the zone a lot. For instance, the table I showed above, which extends out of the zone a bit, suggests that his fastball could theoretically be called a strike about 57-58% of the time, just 50-52% for the slider. If hitters are adjusting to his slider dropping out of the zone, guessing fastball, they can now stay on it more.
Bring the draft to PNC Park!
I want to wait until a large enough sample is gathered. Maybe next season at this time.
drawbb, as for the first question, it's sort of tied up in everything. Last season there were several more luck-based indicators that helped him out, such as a 0.26 HR/9. Several of the stats we consider to not necessarily be sustainable have gone to the opp. extreme this year. Plus, as Phillies fans can attest to, it wasn't as if last season involved easy, stress-free saves. He managed to "luck" his way out of plenty last year.
For the second point, perhaps I needed to elaborate, but in my own personal studies I've seen less of a relationship between marginal changes in location to results than I have with velo or movement.
Good call! Thanks.
If it entices anyone else, I'll be making the 4.5-5 hr drive from Philly to PNC Park as well.
Yeah. We allow the drafting of minor leaguers and such. Strasburg was a 7th-8th round pick this year. So next season I'll have Neftali for about 20-25 innings.
He's on my strat team... believe me I know about him.
Evan, I write two versions, one for BP and one for ESPN. The ESPN articles are generally 800-1000 words, and I find I am far too verbose and too interested in digging deep into a problem for that length, so in order to appease everyone I decided to write a longer version for BP.
That's certainly a factor. However, from the Red Sox standpoint, the talent to acquire him isn't likely to contribute to their playoff chances this season, and Epstein seems to be viewing this as a we'll-worry-about-this-season-now-and-next-season-later sort of deal, where the only cost on his radar is the salary to Wagner.
One of the issues holding things up right now is that Wagner wants his option for next year to go away, affording the opportunity to go wherever he wants. The Red Sox cannot guarantee they won't exercise it, making him wary of waiving the no trade. In arbitration, I'd have to imagine Wagner would be getting somewhere between $7 and $11 mil.
You can... I explained this in the previous Unfiltered posting. When you go to the Stats page, there is a drop down menu before the four headings. Click the drop down menu and choose from the various reports in there.
Once you do so you will be taken to a page wherein you can mix and match and pick only the stats you want.
Hit enter and then the report will generate. At the end of the URL you will see likely a 5-7 digit number, called a CID. Save that CID or bookmark your report and it will be saved forever.
These are all things I would like us to correct, but the Team Reports consist of the appropriate data available in the custom reports feature. The stats you mentioned are not currently in the team reports category list. Once added, they will of course be added to the reports. As I mentioned, this is a start.
One of the issues is that we have two different reports: Batter Season and Batter Team Year, both of which are in the drop down menu. The former includes overall performance while the latter includes performance on a team. It's usually 90% the same, but for instances like Sabathia/Harden/Blanton last year as examples, the data on Batter Team Year will have one entry for Cle and one for Mil, or Oak/Chi or Oak/Phi. What we should be able to do is include the team and age and just put a ---- for players with multiple teams that year.
My feeling on the matter was that, even though the Phillies never had any intention of trading for Halladay AFTER acquiring Lee, the fact that they COULD still pull off Ricciardi's request meant the Tribe didn't get enough. However, Phils were close to getting Sabathia last year so Shapiro likely knew what he wanted and spruced the deal up a notch.
What makes four? Counsell isn't in this game. I said three alumni in the game, not the league. No additional Seidpoints.
I'm still tinkering with the heat maps but the issue there is that the ranges aren't necessarily comparable, and if the same ranges were used, the second map would be predominantly the same color. But it's definitely something to work on in the future, since that was my one fear.
Seid-points will be reconciled at the end of the quarter, but good job!
Heck, if he could become a .480 slugger that would be fine and dandy, but he's struggling to reach even .430. I'll look into the pull-data.
And I have an article or two here about Cliff Lee and the historical ramifications of such a vast GB rate shift. Jumping 10 or so percent as he did is really rare in the game, BUT the pitchers that did so generally held steady over the next few years, suggesting it is more real than fluke.
The funniest part is that to do a fun but not groundbreaking article like this takes about 4-5 days of database work given how long the scripts run--and I have a pretty fast computer. Yet my previous article, showing that ERA is equally predictive as FIP in the second half of a season for guys with big ERA-FIP discrepancies took about 35 minutes to code.
Well, I'm an Accountant. I'm in grad school to take some courses to help my eligibility for the CPA exam down the road and I'm also concentrating in Finance to hopefully sit for the CFA at some point.
I don't necessarily want to do anything, just have the titles for my e-mail signature.
Yep, as I stated in the article, the correlations for the group without the big discrepancy were 0.23 ERA1-ERA2 and 0.34 FIP1-ERA2. So FIP is a better predictor for "normal" pitchers, but ERA shouldn't be written off for the "abnormal" guys.
As far as the other point, look at Carlos Zambrano's career.
Very interesting observation. I'm the opposite, having watched all of Cain's starts but not much Jurrjens, and I can say with certainty that he has gotten out of jams more than he has prevented jams from occurring, which isn't the best way to be successful, but he is doing things differently out of the stretch that appear to be more real than fluky.
Bobby, while I do agree xFIP has merits over FIP and that QERA, which uses the better K% and BB% rather than K9 and BB9, potentially has merits over both, the reason for using FIP is based on how people these days tend to look at a player's ERA and FIP and base a conclusion on his season strictly on the heels of his FIP-ERA discrepancy.
Right, I'm with you. But I do want to stress that a major point here is simply not to overlook ERA for these pitchers. Regardless of the conditional importance, which is certainly important, the data here shows that ERA should not be disregarded for pitchers outperforming FIP for "this long."
Yeah, time permitting I'll run them through and see what surfaces.
But in case anyone gets to the comments, let me reiterate that the major takeaway here, in case it is confusing in the article, is NOT that FIP is a bad predictor in any way for these guys... but rather that when we see a pitcher with an ERA 1+ run below his FIP through the first half of a season, the data from 1996-2008 indicates that the ERA is equally as effective as FIP at predicting ERA in the second half. This phenomenon is not observed in the larger sample of pitchers excluding these guys.
Yeah, we're on the same page, except I never said FIP is a bad predictor for that group... in fact I said sort of the opposite, that the finding is that ERA is actually just as good for those pitchers.
FIP is at 0.34-0.35 for the overachievers and the greater sample, but the ERA correlation is much higher for the overachievers, equivalent to the FIP correlation.
So in no way is FIP a bad predictor for these guys, but we shouldn't automatically write off their ERAs either.
Plus, per your second point, it's the SAME group that had 3.34 in the first half and 4.60 in the second half. So in the first half they had 3.34 ERA with an SD of 0.83... and in the second half, the same group had a 4.60 ERA with an SD of 1.42, which helps explain why the aggregate average is right in line with FIP, even though the ERA has equal predictive value.
The SDs are only a minor part of this. The major takeaway is that, from at least 1996-2008, and I'm in the process of running this back across a whole lot of seasons, when a pitcher has an FIP a lot higher than his ERA at the halfway point, ERA is equally predictive as FIP in terms of second half performance.
One thing you're missing when you're making the comparisons is something clearly stated in the post, being that the pre-season PECOTAs being used in the in-season projections ARE NOT the original pre-season PECOTAs.
What we do is translate the original PECOTA to 2009 specifications, including the offensive levels of the league and the park factors and such. It may not account for all of the gap you found in certain players but it is something to keep in mind; for instance, Mauer's original projection called for an adjustment from a .388 OBP/.436 SLG to a .402 OBP/.457 SLG... is it really that much of a stretch to think someone projected to be at .402/.457 (what his projection would have been if we suspected Minnesota would have a higher park factor, etc) would be projected to go .416/.504 from here on out, especially given the incredible first half?
There is a difference between in-season and full-season. For instance, full season weights all the full season data with comparables and things like that, while in-season does its best "guesswork" based on whatever we input, and since 2009 is weighted the heaviest it is going to be skewed towards the numbers this year. Just because Greinke has a 2.97 ERA projected down the stretch doesn't mean he goes into 2010 with a 2.97 PECOTA.
I actually didn't even see it, apologies. Unfortunately I'm not much of a fantasy guy though I can certainly see the utility to be gained by such an addition.
Basically, it boils down to reconciling what we thought his talent level was prior to the season with what we know about the player now. This true talent level is somewhere in between the 2006-08 weighted data and the 2006-09 re-weighted data with 2009 being the heaviest, with the ultimate weighting reliant on how far we are into the season, so when we are 160 games in, Mauer's talent level will be almost entirely his 2006-09 weighted and translated performance.
Sky, the past performance is included in the original PECOTA as you said, however, the weighting changes and that is what the in-season estimates attempt to gauge. For instance, at the beginning of the season, Mauer was at .307/.388/.436, based on say 2008 weighted at X, 2007 at Y and 2006 at Z. We update the pre-season based on current factors, which doesn't alter the pre-season weights, but just updates one component that translates the original weighted result.
That is then weighted against the translation of 2006-09 data, with new weights, IE 2009=1.0, 2008=0.5, 2007=0.25 and 2006=0.125, so the past performance is incorporated from a weighting standpoint. It IS already included in the original PECOTA but this past data is re-weighted, with 2009 being the heaviest.
If you just did pre-season PECOTA and performance right now, you would essentially be over-weighting the current performance, saying that the 2006-08 performance is worth 42% with the 2009 worth 58%, whereas the current performance needs to be regressed to the prior track record a bit as a means of estimating true talent, which helps us avoid over-weighting the current performance or ignoring prior, re-weighted performance.
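To make the re-weighting step concrete, here is a rough sketch of a normalized weighted average using the season weights quoted above. The inputs are Mauer's raw batting averages mentioned elsewhere in this thread, so treat them as illustrative only: the real process uses translated lines, and whether it also weights by playing time is not stated here (this sketch assumes it does not), so the output will not reproduce the published figures.

```python
# Season weights as quoted in the comment above: the most recent season counts most.
WEIGHTS = {2009: 1.0, 2008: 0.5, 2007: 0.25, 2006: 0.125}

def reweighted_rate(rate_by_season, weights=WEIGHTS):
    """Normalized weighted average of a seasonal rate stat (assumption: no playing-time weighting)."""
    total_weight = sum(weights[season] for season in rate_by_season)
    weighted_sum = sum(weights[season] * rate for season, rate in rate_by_season.items())
    return weighted_sum / total_weight

# Raw (untranslated) batting averages quoted in this thread, used only as illustrative inputs.
mauer_avg = {2009: 0.373, 2008: 0.328, 2007: 0.293, 2006: 0.347}

share_2009 = WEIGHTS[2009] / sum(WEIGHTS.values())  # 2009's share within the re-weighted piece
mauer_reweighted = reweighted_rate(mauer_avg)

print(f"2009 share of re-weighted data: {share_2009:.1%}")
print(f"re-weighted average: {mauer_reweighted:.3f}")
```

Note that this is only the re-weighted 2006-09 piece; it still gets blended with the translated pre-season PECOTA according to how far into the season we are.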
Correct, so you would add 22 to whatever Pujols currently has and that's what his end of season numbers are expected to be. An addition I'd like to implement is current numbers atop rest of season projections, with an updated end of season total beneath.
I addressed that above - it's something that needs fixing. The GS and IP are full-season whereas everything else is rest of season. I'd imagine it will be addressed shortly.
I misspoke, apologies. The 2009 stats are being translated, however since it is the 2009 environment they will output practically the same results. The issue with Mauer's stats, which I'll update in the article, had more to do with the timing of the translation last run and the stats selected.
1) Yes, definitely updated regularly, not a once per season sort of thing.
2) Yep, current record is the starting point here.
3) That is something we're working on, thanks for the catch.
4) The number of starts would be full season, which needs to be weighted by the amount of time left. In this case, around 40%, so that would be, what, 3 starts for Kennedy from here on out? Might not get there but he would be more likely to be called up than, say, Kei Igawa.
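As a quick sketch of the proration in point 4, the full-season total just gets scaled by the share of the season remaining. The full-season starts figure below is a hypothetical input chosen for illustration; only the roughly 40% remaining share comes from the comment above.

```python
def prorate(full_season_total, remaining_share):
    """Scale a full-season projection down to the portion of the season still to be played."""
    return full_season_total * remaining_share

full_season_starts = 8      # hypothetical full-season projection, not a real PECOTA value
remaining_share = 0.40      # "around 40%" of the season left, per the comment above

starts_left = prorate(full_season_starts, remaining_share)
print(f"projected starts from here on out: about {round(starts_left)}")
```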
Which is exactly what I wrote in the article - THEY are the market inefficiency teams are exploiting.
In case anyone isn't sure where to find it, here is a link to Joe Mauer's page - http://www.baseballprospectus.com/pecota/mauerjo01.php
Beneath his picture is a big, blue bolded link titled Projected Playing Time. Beneath that is a short, 3-line or so table with his playing time percentage and rates/raw totals projected from here on out.
This is where it can be found on the individual cards. Relative to the team depth charts, click on the link for Projected Playing Time and you will be transported to the team page, with playing time estimates and projections from here on out.
It would actually be better to show actual results atop the rest-of-season updates, and then perhaps the total result combining what has happened and what is expected to happen, to show that Mauer, with 15-16 HR, projected to hit 9 more, would finish at 24-25.
Because he values different things is why he would pay more than others would settle for. The best example came through a convo I had with Dave Cameron, where he likened it to a vegetarian walking into a burger joint. The vegetarian will scan the menu and find a $9 tofu patty. Nobody else there would pay that much just to eat fake meat, especially when their burgers might only cost $5, but to the vegetarian, it is what he wants, so the price doesn't seem high in comparison, since he has no use for the regular burgers.
You're slightly off course, which is likely skewing your conclusion. The current season DEFINITELY plays a large factor as it carries the most weight in the area currently accounting for 58%.
The part accounting for 58% is comprised of 1.0*translated 2009 performance, 0.5*translated 2008 performance, 0.25*translated 2007 performance, and 0.125*translated 2006 performance. And by translated, I mean that those seasonal lines are translated to what they would look like with the factors inherent in 2009.
That then carries the weight of team GP/162, with 1-(team GP/162) being the weight for the original, translated PECOTA. For Mauer, based on his 2006-09 translations of .339/.426/.538, and his translated pre-season PECOTA of .313/.402/.457, he is expected to hit .328/.416/.504 over the rest of the season.
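Here is a small sketch of that final blend, for anyone who wants to check the arithmetic. The slash lines are the ones quoted above; the games-played figure of 94 is my assumption, picked because 94/162 is roughly the 58% mentioned earlier, and the function name is just for illustration.

```python
def blend_projection(reweighted_line, preseason_line, team_gp, season_games=162):
    """Weight the re-weighted 2006-09 line by team GP/162 and the translated pre-season line by the rest."""
    w = team_gp / season_games
    return tuple(
        round(w * current + (1 - w) * prior, 3)
        for current, prior in zip(reweighted_line, preseason_line)
    )

reweighted_2006_09 = (0.339, 0.426, 0.538)   # AVG/OBP/SLG, translated 2006-09 (from the comment)
preseason_pecota = (0.313, 0.402, 0.457)     # AVG/OBP/SLG, translated pre-season PECOTA (from the comment)
team_gp = 94                                 # assumption: 94/162 is about 58%

rest_of_season = blend_projection(reweighted_2006_09, preseason_pecota, team_gp)
print(rest_of_season)
```

With those inputs the blend comes out to .328/.416/.504, matching the rest-of-season line quoted above.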
Precisely... and on the flipside, if they were a playoff contender, when taking on a multimillion dollar salary might make more sense, you go for someone like JJ Hardy and not Yuniesky Betancourt.
I'm not necessarily giving him the benefit of the doubt but rather trying to provide some insight into why some of these moves are continuously made. I don't know that any of us could do a better job as an MLB GM... it isn't nearly as simple as running a fantasy team. Do we here at BP have a better understanding of what correlates to winning? Seems so, but that and running an MLB organization are not interchangeable.
I'm also not saying the Betancourt deal isn't awful, but again, trying to show how someone in DM's shoes could see the positives and not the negatives. I wholeheartedly agree that paying his salary is a negative, when cheaper options can fill the void for the remainder of the season. Granted, I'm a Phillies fan and have been rewarded with winning seasons this decade, but I remember the pre-Jimmy Rollins era, when they routinely lost 90+ games.. and I honestly didn't care if it was 66-96 or 63-99, they stunk either way. Upgrading from Pena to YB is something a team on the precipice of the post-season might do to increase their odds, not something a 67-95 team does in the hopes of going 69-93.
Not exactly, and that's basically the point of the article: understanding does not equal automatic implementation. He understands OBP and stats of that nature, most likely of course, but prefers to put more weight on what the scouts see and certain statistical areas that he "grew up" with. Understanding SHOULD result in the proper utilization of information, but in this case there is a fairly vast gap between the two, which is a major issue in KC.
I'm not "arguing" anything, but rather trying to provide some perspective for those who are so quick to yell at him for being statistically inept or dumb from an evaluative standpoint, when, say, seven or eight years ago there was MUCH more weight placed on scouting than the numbers.
When you say that he continues to bring in slow, non-OBP hackers who can't run or defend, I wholeheartedly agree, but from his point of view they look good, because he is using the ancient techniques passed down by his forefathers. He sees that they don't make errors and ignores that the lack of errors is from a lack of range. He sees the low strikeout totals but bypasses low walk totals, etc.
I agree that he has done a poor job and I repeatedly mentioned that I am not defending him. It is irresponsible to not upgrade your evaluative techniques when all of the other teams in baseball have.
And that is sort of the point. Yuni is certainly an upgrade over Hernandez/Pena, but that speaks to the ineptitude of the prior starters more than it does to the abilities of Betancourt.
There are certainly players that are worth flyers in order to "fix", but Yuni doesn't fit that bill given that seemingly everything has been tried before... and failed.
Many on TV predicting the Royals to be fantastic this year fell prey to exactly what you mentioned. They saw the good and not the bad, and mistook an active offseason for a productive offseason. I remember writing at the time how I couldn't believe that people legitimately thought this team had a fighting chance, and that they were in no way comparable to the Rays at all, but when you actively seek something you tend to ignore the downfalls.
Just like Moore looks at Yuniesky and sees "good hands" or something like that, he neglects to see poor range.
I don't disagree with you at all... actually, I'll flat out agree with you since it is a pet peeve of mine when people use the reverse instead of what they mean.. "I didn't hate the movie," instead of "I liked it."
I feel like the type of inefficiencies are different now than before, certainly, and that it isn't a matter of digging deep in one area to find them, but rather the knowing where to look might not be as cut and dried anymore. For instance, around the time of Moneyball, newer generations were introduced to the virtues of OBP and other stats they had glossed over because they never knew to look for them, not because they actively despised them. Now it's kind of strange to think of a specific way of thinking as a market inefficiency.
When Howie Long is the second bad guy in command, you know you have a hit!
While you may be well-versed in it, others might not be, and some others may find it useful to learn that for their own "experiments."
Rowen, I agree... now trade me Chase Utley. Is there such a thing as Strat-tampering?
In all seriousness, being receptive and opening discussions is of paramount importance to growing as an organization. Otherwise, an echo chamber is created wherein no new ideas can develop. This is what teams like the Royals and the Bavasi-led Mariners were essentially criticized for.
With the second question first, are you referring to weighted regression towards some form of an expected BA based on the number of different balls in play hit? As in, if Mauer hit .347, .293, .328 from 2006-08, instead of adding in the 380-1173 (his weighted prior three seasons), find his EXPECTED HITS given the number of each batted ball multiplied by the 0.73, 0.24, 0.15 expected values, and divide that by the 1173?
That would certainly be a worthwhile study, as your second paragraph hit the nail on the head, or whatever that expression may be: Mauer MIGHT be a true .325-.330 hitter right now but the regression approach isn't going to agree until he gets more and more ABs at such a high level.
Or if Rodrigo Lopez can continue to be serviceable, Pedro could potentially be a bullpen pitcher who warms up alongside Jamie Moyer; if Moyer isn't facing the Marlins, or struggles early on, quick hook, go to Petey.
Right, I'm not debating that at all. He very well COULD be a true talent .325-.330 hitter, but in going through the appropriate regression to the mean methodology we arrive at .318 right now. I don't have as big an issue with Nate's Bayesian approach last year as some others did, but this is a different approach. No matter what approach you use, however, you are going to get something well below 1 percent.
There was a ton of debate about this on Tango's blog when Nate's article came out last year, but ultimately I feel that the regression approach utilized here is the most accurate method. But that is one of the great things about the numbers - you can adjust them in this case and re-calculate. I am much more comfortable in saying he is now a true talent .318 hitter with a .373 clip than using the PECOTA as a prior with a Bayesian approach. But it sort of boils down to sample size taste - you might be convinced that with his 2006-08 seasons and current numbers that Mauer really is a .330 hitter, which he very well may be. I might be more inclined to regress him to the mean more than simply accept the .330 as gospel.
Please read the article, not the little one sentence synopsis, before commenting.
Matt's system doesn't really have anything to do with the type of ball being hit as it records positioning and angle more than anything else. He does keep track of the batted ball, however, and the play result, and he showed in his presentation how you can use that to determine the probability that, say, a grounder will be converted into an out relative to how far the fielder has to range to his left and right. Additionally, you can track the probability of out conversions based on balls in play hit at certain angles. But that is more of a post-production calculation. Within the game, he captures initial and final position.
Sure is... I have some interesting things in the works with the data. Now that our samples are growing larger, more avenues, previously roadblocked, are becoming available.
To me, the idea is like that riddle that Ivy League students/grads struggled to solve while kindergarteners figured it out quickly. Not to say Matt Thomas is a kindergartener (he's twice my age!) but that most of us look for so many advanced ways to go about evaluating players that we occasionally lose sight of some simple concepts.
One of our discussions involved trying to determine interesting ways to implement the data. Right now it's all very raw, but the TV aspect could very well be an avenue they explore.
To elaborate on the Bloomquist/Bruntlett front, here are the positional adjustments per 162 games:
C: 12.5 runs
1B: -12.5 runs
2B/3B/CF: 2.5 runs
LF/RF: -7.5 runs
SS: 7.5 runs
This is based on what would happen if you put Bloomquist in at a position. The average SS would be 7.5 runs better per 162 games than Willie. However, the average LF/RF would be 7.5 runs worse than Willie.
So when I say that a +10 1B is actually 20 runs worse as an overall fielder than a +10 SS in 162 game spans, it is based on historical data of fielding at several positions and the difficulty in playing certain positions. The adjustments are essentially a way to level the playing field, literally, a baseline for comparisons.
The positional adjustments are based off of what would happen if you were to plug in, say, Bloomquist or Bruntlett at any position, using data from players who have historically played multiple positions. We can argue semantics but a +10 1B is not equivalent to a +10 SS. The actual UZR and + - is relative to positional peers but we need to use some form of positional adjustment to level the playing field.
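To make the arithmetic concrete, here is a minimal sketch of how the adjustments above could be applied to a rating measured against positional peers. The function name and the straight-line proration by games are my own assumptions for illustration, not part of UZR or the plus/minus system.

```python
# Positional adjustments in runs per 162 games, copied from the list above.
POSITIONAL_ADJUSTMENT = {
    "C": 12.5, "1B": -12.5, "2B": 2.5, "3B": 2.5, "CF": 2.5,
    "LF": -7.5, "RF": -7.5, "SS": 7.5,
}


def adjusted_fielding_runs(position, runs_vs_peers, games=162):
    """Add the positional adjustment (prorated by games, an assumption)
    to a rating that was measured against positional peers."""
    return runs_vs_peers + POSITIONAL_ADJUSTMENT[position] * games / 162


first_baseman = adjusted_fielding_runs("1B", 10)
shortstop = adjusted_fielding_runs("SS", 10)
print(first_baseman, shortstop, shortstop - first_baseman)  # -2.5 17.5 20.0
```

The 20-run gap is just the difference between the SS and 1B adjustments, since both players were +10 against their peers.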
Must not qualify.
What it meant is that even though they saved the same amount of runs, when we normalize for difficulty, the shortstop is a MUCH better fielder.
Nope, this is actually a reasoned response to the uproar caused by an apparently controversial call. I didn't read the story on ESPN or subsequently venture there for any story upon hearing of Cohen's claim, which was apparently uttered or implied over the air. Something controversial happened, and I felt it was worth looking into.
Forgot to mention, the initial called strike was a ball but not THAT far outside.
23 of the 373 in this game, just 1 called strike (this controversial pitch).
Ah, okay, let me check and get back to you.
As I said, the 373 total is only balls+called strikes. So, if 32 were called strikes, 341 were balls. This is only for RHB, as well.
If he gets to 199, then we'll talk.
No problem... CV is a really good stat to employ in something like this, I just chose to filter it off to include different plateaus.
Okay, here we go... of guys with an average of at least 20 HR/season in the five year span, here are the lowest coefficients of variation:
1) Fred Lynn (1983-87): 0.02
2) Cal Ripken Jr (1983-87): 0.03
3) Paul O'Neill (1993-97): 0.05
4) Adam Dunn (2004-08): 0.06
5) Mike Schmidt (1982-86): 0.06
6) Frank Thomas (1993-97): 0.06
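For anyone who wants to run the same kind of check, here is a rough sketch of the coefficient of variation calculation with the 20 HR/season filter. The home run totals are made up for illustration, and using the population standard deviation (rather than the sample version) is an assumption on my part.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean; lower means steadier totals."""
    return pstdev(values) / mean(values)


def qualifies(values, minimum_average=20):
    """Apply the 'at least 20 HR/season on average' filter."""
    return mean(values) >= minimum_average


# Hypothetical five-year home run totals, not any real player's line.
steady = [23, 22, 23, 24, 23]
streaky = [12, 35, 18, 30, 20]

for label, totals in (("steady", steady), ("streaky", streaky)):
    if qualifies(totals):
        print(label, round(coefficient_of_variation(totals), 2))
```

Both made-up players average 23 homers, but the steady one has a far lower CV, which is the whole point of ranking by it.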
Hmm, that would certainly work. Will plug it in when I get home and see what comes up.
Gagne is the only person since 1954 to have 3 straight seasons with identical IP tallies, at 82.1.
If this were intended to be hardcore analysis or groundbreaking research, yes, but really the Statman's Notebook is just a column designed to have some fun with numbers. Filtering it to different constraints works for me though your proposal certainly works in a different setting. Like I'm not saying Lynn's SD is more impressive than Thomas's... I treated them as separate.
Right, and these days, with 12-man pitching staffs and one of the four bench spots reserved for the backup catcher, and another seemingly for a fourth outfielder, teams need to be very careful with platoons because of the roster space issue.
That would honestly be evened out by teams throwing their top specialists at him much more often now. Or making sure to use a lefty reliever when he comes up when they might otherwise not. Plus, there aren't many elite LHP starters. And Howard usually plays vs. Johan.
Essentially, to take into account above propositions, you don't need to use data to decide who to platoon. If legitimate scouts are indicating a player struggles in certain circumstances, that is likely more accurate than small samples of data.
HOWEVER, the point I'm making here is that many of those opining that Howard should be platooned point to his drastic self-split, which is not the appropriate context. If the Phillies are trying to decide which player to platoon, they should be factoring in defense, what diminished playing time might do to the player against OHP, and which of their players is both easily platoonable from a personnel standpoint as well as a viable candidate given their ability to hit SHP compared to the rest of the league.
So basically, you don't need to use data or comparisons to decide who to platoon... but if you're going to use data, use the appropriate context, which is some form of the league against same-handed pitching, not individual splits.
Quite honestly, the LH-LH and RH-RH were eerily similar. Thus why I combined them, there was really no difference. LH-LH would have a .754 and then RH-RH would be at .757 or something like that. You COULD adjust for position and such but the idea of showing this data was merely to indicate that the appropriate context ISN'T the player himself but the league.
All sorts of scouting and makeup issues come into play, though. For instance, would Howard NOT playing everyday, as he is used to, adversely affect his awesomeness vs. righties? Players are not strat cards and while the numbers are more accurate in many areas, we're both in agreement that sample sizes prevent accuracy in this area and that scouting is probably much more important.
I threw 81 mph my junior year of high school but went all Ankiel and was relegated to 1B duty... haven't thrown in a while but would love to face LaRue.
Richard, questions like this are why we need a uniform system. It isn't as if there are a set of rules out there. I might answer No to your question while others might answer Yes. I hate that. Give me the robotic data output on BIP!
Yep, on my to do list as well. And when the samples are larger, success rates by bat speed, vertical launch, horizontal launch, etc. I think your question touches on the illusions of fielding. For instance, Shane Victorino runs 100 mph for a ball that Carlos Beltran easily glides to. Both made the play but the perception is that Victorino made the better play because he sprinted. In reality, Beltran's superior range meant he could simply get to it more easily. In your example, not getting to the ball makes it look like a liner; after all, how does a non-Adam Dunn outfielder not get to a flyball?
Byron, yeah, I have the data in my database but it is only for April at the present moment. I don't want to use it until more is available, but the spray and launch angles as well as the bat speed can certainly clue us into which pitchers allow the slowest speed off the bat, which specific pitches do so, etc.
Exactly, and Brian and I discussed a specific example this weekend, in which one of the LaRoches hit a blooper that Gameday recorded as a line drive. I want to eradicate any type of misinterpretation. While I do like the human aspect of the game in certain areas, Brian and I, and I'm sure many of you, can agree this is not one of those areas.
Not just that - as I mentioned above, we can not only differentiate once and for all between liners and flyballs, we can further differentiate between various types of grounders, flyballs, and liners. A weak hit dribbler is going to have a lower expected value than a scorched one-hopper, but as is, everything is lumped together.
OK, as promised, here are the deltas for pitchers by year, keeping in mind that their OPS allowed from 1999-2008 is equal to the OPS numbers from 1999-2008 posted by the hitters.
So it seems here that things are in fact much more even, and if anything, things are tipping in favor of the hitters, as they made the elite pitchers deviate from the mean more than pitchers made them deviate.
Jared, I'll likely be posting the numbers in question at some point tonight or tomorrow, either here in the comments or perhaps in Unfiltered, but to give a short answer, the advantage isn't clear - both hitters and pitchers deviate by similar amounts when facing elites rather than non-elites.
The real conclusion I wanted to be taken away is that if you have a choice between 1,200 PA of a player over the last 3 seasons or 10-25 PA in a specific batter-pitcher matchup, always take the 1,200 PA.
Yeah, I ran into a bit of a word count issue, not necessarily getting in everything I wanted to. The ultimate method would involve comparing the Pitcher's OPS vs Elite/vs Everyone to the Batter's OPS vs Elite/vs Everyone to see which had a larger spread. The likely answer is that they are equivalent and there is no advantage.
Next week's article deals with line drive rates - you read my mind.
Will look into it and see what I can find re: hit sprays.
I don't think the actual spot, per se, in the batting order is as important as his role. On the Giants, it was Bonds and then Feliz, whereas on the Phillies you have Howard, Utley, Ibanez, Rollins, Werth, and Victorino who are all superior hitters.
No worries, but I do like when people read everything before commenting, as you essentially just re-wrote my final paragraphs, haha.
And, as Kinanik noted, I mentioned this in the piece ;-).
Can't remember the exact date, but it was in June 2001, in Philly, Marlins in town. I was 15 years old, and a huge Phillies fan. I think Robert Person was pitching.
Anyways, it was the 9th inning, and the Phillies were up by a run, with Jose Mesa in to close the game. With a runner on first, Luis Castillo (when he was REALLY fast) laid down a bunt, and Scott Rolen fielded it and was able to turn a double play, by far the most amazing thing I've seen in person.
I literally mentioned in the article that the non-independentness could skew the results.
Yes, over the course of a season, hitters face varying types of accuracy and strategy amongst pitchers. Overall, these varying types tend to even out, but on a per-game basis, can change the probability of anything.
In fact, I mentioned this in the article, in that determining the probabilities here isn't going to be 100% accurate given that they are derived on the principle that each event is independent of others. If Feliz is facing Greg Maddux, he is going to be MUCH less likely to walk than if facing Oliver Perez. If he faces Kershaw the first time and learns his stuff, that second time through the lineup, Feliz may be more likely to walk based on his prior experience. Nobody is debating that, but this isn't necessarily an analysis of how or why Feliz has done what he has done, just an experiment to see the likelihood of him producing these types of numbers, given the last 8 years of data we have for him, constituting a pretty consistent level of production based on his low SD.
In a large enough sample, the normal distribution is a very solid approximation of the binomial distribution. To determine if the sample is large enough, I use the idea that all data within a set falls within 3 SD. For Feliz, everything is within 2 SD, which is a bit nutty, so the point here was to compare the two. The binomial distribution is more accurate, at 2.4%, but since the sample is large enough and everything falls within 2 SDs, we could have stopped with the normal distribution of 2.5%. If this were just a short-term explanation or determination of the probability, we could have stuck to the normal distribution, but I got into explanation mode and wanted to show both.
Thank you! When I was in school, I found that relating everything to baseball helped me really relate to the material, and I have no doubt the same would apply to some of your students. I'll search for the one I wrote on Z-Scores, T-Tests, SDs, etc and see if I can link it to you as that might be a good one too.
Something I might be doing in the next few months is going over how to do just this. There is already a really great tutorial out there on how to CREATE a Retro DB, from Colin Wyers, so what I might do is link to that and then, maybe once a month, go over some MySQL jargon to teach how to run some queries and get some data.
Joe, last part first, yes, 0.055^4 gives us the same answer as BINOMDIST(4,4,.055,FALSE), but the answer it gives us, 0.0000092 (I rounded up) is the probability in a SPECIFIC game, not any game. Over a 150-game season it is more likely to occur at least once than it is likely to occur in a specific game. 1-0.0000092 = 0.9999908. 0.9999908^150=0.9986. 1-0.9986=0.0014, or 0.14% chance that in some game over 150, Feliz would walk 4 times in 4 PA.
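Here is the same calculation as a short sketch, using the .055 walk probability that goes into BINOMDIST. It leans on two simplifying assumptions: exactly four PA in every one of the 150 games, and every PA independent of the others.

```python
walk_rate = 0.055   # per-PA walk probability, as passed to BINOMDIST
pa_per_game = 4     # assumes exactly four PA in every game
games = 150

# Probability of walking in all four PA of one specific game.
p_single_game = walk_rate ** pa_per_game

# Probability that it happens in at least one of the 150 games:
# one minus the chance that it never happens.
p_at_least_once = 1 - (1 - p_single_game) ** games

print(f"{p_single_game:.7f}")    # 0.0000092
print(f"{p_at_least_once:.4f}")  # 0.0014
```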
As far as the other points, yes, this is all dependent on the idea that he is the same player. This may or may not be true. If his true walk rate is closer to the 7.2% last year than the 5.2% the years before, everything changes, but I'm not convinced he is that much different.
Nope! I don't even know which discussion you're referring to, either, but I might go check that out. This was inspired from actually watching Feliz, with my own eyes, walk four times. My jaw dropped.
From Michael Wolverton, himself, the guy who created SNVA:
"The Support-Neutral pitching stats are designed to measure the value of a start in terms of how much it adds or subtracts from the team's chance of winning. Using situational scoring tables and some basic laws of probability, I calculate the probabilities that a pitcher's start will lead to a W or an L for him, as well as a win or a loss for his team. When totaled over all of a pitcher's starts, that gives us the three SN measures:
* Support-Neutral Wins and Losses (SNW/SNL) -- a starter's expected W/L record, given the way he pitched in each game and assuming that he had league-average support from his offense and his bullpen.
* Support-Neutral Value Added (SNVA) -- the number of games the starter is worth to an average team in the standings, over (or under) what a league average starter is worth."
I believe I used SNVA (Support Neutral Value Added) in that article as the barometer of pitching success, since it's one of those catch-all win-based statistics.
This seems like something PITCHf/x could help with in some due time. What makes one 88 mph guy Maddux and the other Ortiz? Probably a lot to do with location and movement, as well as selection. Everyone gets there differently. One guy might have pinpoint location and average movement, yet he achieves similar results as a guy with poor location but ridiculous movement.
Then we get into all sorts of discussions about whether or not consistency matters. I wrote something here in either Feb or March looking at whether or not staying consistent in various stats (IE - using standard deviations) mattered with regards to overall production, and it didn't. A guy can be flaky and still be just as productive, overall, as the most consistent pitcher out there.
Your idea is interesting, though, reverse engineering. Find the most consistent pitchers with regards to strand rates via the ICC. However, we would have to incorporate some form of baserunners allowed. For instance, a pitcher with a 77% strand rate for four straight years who also has a 1.50 WHIP is going to have worse numbers than a pitcher with a 72% strand rate for four years and a 1.20 WHIP.
The second one... as the data I found showed, the higher quality pitchers had an ICC of 0.44, which is very significant and a much higher level of stability than anyone else. This DOES NOT mean, as I mentioned in the article, that they always post higher rates, but rather that they are more likely to sustain whatever rate they hover around. So if you see a guy like Halladay put up a 77% strand rate, he isn't an automatic lock to regress to the mean given that ace pitchers tend to be more stable year to year in their strand rate and therefore fluctuate/regress less.
No, the correlation breakdowns wouldn't be the same. For instance, when I run the correlation of the entire group, and then switch to just K/9 >= 5.0, the program doesn't know it's being compared to the entire group. It merely tests the strength of the new dataset.
The overall correlation was 0.28, which isn't that strong, but definitely suggests something is there in terms of as K/9 increases for this whole group, the LOB% increases.
However, as the K/9 minimum gets incorporated and increases, the sample sizes become smaller, and we see that the relationship is more random... that guys with K/9s above 9.0 could have high strand rates but there isn't anything there to suggest they always do or don't.
As for the last point, we can run regressions to see what sticks out but I personally think it's pointless because correlation doesn't equal causation. What if we run a regression and find that foul flyouts are a big factor? We don't even know if that's sustainable and yet we're supposed to treat it as the sole determining factor for strand rates. Know what I mean?
I'm personally satisfied with the conclusions found above, that higher quality pitchers, simply put, have more control over their strand rate and can sustain it more than others. And that these higher quality pitchers get to where they're going in different fashions, without necessarily a few unifying factors.
I agree with the counterintuitiveness but it really just goes to show there is more to stranding runners than strikeouts. For instance, as I wrote in the piece, FIP had a -0.39 correlation to LOB, whereas ERA was at -0.78, so the controllable skills seemingly have less to do with it. Then again, groundball rates also had little to do with it.
I had better respond or else I might not get Richard's vote ;-)...
I took all pitchers with 15+ games from 2004-08, ran correlations on overall K/9-LOB, and then broke it down into the segments, >= 5.0, >= 6.0, >= 7.0, etc, since that seems like a more accurate way than 5.00-5.99, 6.00-6.99.
>= 5.0: 0.274
>= 6.0: 0.249
>= 7.0: 0.235
>= 8.0: 0.180
>= 9.0: 0.118
So it actually goes in reverse, when the minimum gets higher and higher. When we include all pitchers from 5.0 K/9 and up the correlation to LOB is almost identical to the overall correlation, which makes sense given that most MLB pitchers fan over 5 per 9, but when we increase the minimum to higher strikeout rates, the relationship with strand rate dwindles.
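If you want to try the cumulative-minimum approach yourself, a sketch along these lines would do it. The (K/9, LOB%) pairs are invented for illustration and the helper names are mine; the real study used every pitcher with 15+ games from 2004-08.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_by_minimum(rows, minimums):
    """rows holds (k_per_9, lob_pct) pairs; each floor keeps everyone at or
    above it (>= 5.0, >= 6.0, ...) rather than slicing into 5.00-5.99 bins."""
    results = {}
    for floor in minimums:
        subset = [(k, lob) for k, lob in rows if k >= floor]
        ks, lobs = zip(*subset)
        results[floor] = pearson(ks, lobs)
    return results


# Hypothetical (K/9, LOB%) pitcher seasons, for illustration only.
rows = [(4.2, 68.0), (5.1, 70.5), (5.8, 69.0), (6.4, 72.0), (7.0, 71.0),
        (7.7, 74.5), (8.3, 72.5), (9.1, 75.0), (9.8, 73.0)]

for floor, r in correlation_by_minimum(rows, [5.0, 6.0, 7.0]).items():
    print(f">= {floor}: {r:.3f}")
```

Each subset is tested on its own, so the correlation for one floor knows nothing about the group it was carved out of.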
When DIPS was first formulated, the walk rate, strikeout rate, and home run rate were found to be the only really stable components for a pitcher. Walks and strikeouts had a correlation around 0.50-0.65 if I recall correctly, and home runs were around the 0.30 mark.
As far as the NL deviating, I meant the standard deviation across all years: for the NL from 1954-2008, the SD of strand rate was under 1 percent. I can see where the confusion would arise from that. I didn't mean "of the time" per se, just that the SD for strand rate in the NL from 1954-2008 was under 1%, meaning it barely deviated from the mean.
No, I actually phrased that wrong. Even if we adjust it, the results stay virtually the same, though.
The IP/season is a good point, but I'm at work until 4 and cannot check that until around that time. I would venture a guess that yes, of course guys like Halladay, Johan, Carpenter, etc average more innings than Adam Eaton. But then it's a chicken/egg situation - is the decreased volatility because they log more innings, or do they log more innings because of the decreased volatility?
I will also look into the second paragraph, probably using stdev of FIP, finding the standard deviation of the whole group as well as the SD of the individual deviations; IE - if the SD of the whole group is something like +- 0.30, but the SD of individual SDs is +- 0.08, then we can partition the pitchers that way: SDs of 0.21 or lower would be consistent, 0.22-0.38 medium, and > 0.38 inconsistent.
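As a rough sketch of that partition, using the illustrative 0.30 and 0.08 figures from above as defaults: the pitcher names and FIP marks are hypothetical, and population SD is an assumption.

```python
from statistics import pstdev


def consistency_band(pitcher_sd, group_sd=0.30, sd_of_sds=0.08):
    """Bucket a pitcher by how his own SD compares to group_sd +/- sd_of_sds."""
    low = round(group_sd - sd_of_sds, 2)   # 0.22 with the defaults
    high = round(group_sd + sd_of_sds, 2)  # 0.38 with the defaults
    if pitcher_sd < low:
        return "consistent"
    if pitcher_sd <= high:
        return "medium"
    return "inconsistent"


# Hypothetical yearly FIP marks, not real pitchers.
fip_by_pitcher = {
    "Pitcher A": [3.40, 3.55, 3.35, 3.50],
    "Pitcher B": [3.10, 3.60, 3.90, 3.30],
    "Pitcher C": [4.80, 3.70, 4.90, 3.90],
}

for name, fips in fip_by_pitcher.items():
    sd = pstdev(fips)
    print(name, round(sd, 2), consistency_band(sd))
```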
So your number was off as well? Kind of ironic.
I inherited a Strat team this year and did some decent wheeling and dealing, while picking well in the dispersal and annual drafts, and my team is geared for the future. My minor leagues/farm system: Adam Jones, Madison Bumgarner, Neftali Feliz, Pedro Alvarez, Andrew Miller.
I'll have my own in-depth look at strand rate this week.
Will run those numbers over the weekend and see what I get. Who knows, might be an interesting piece in and of itself. From working with the numbers quite often, though, the strand rate doesn't seem to be any different than other regression-based stats, like BABIP, in which certain guys do exhibit some control, but players for the most part regress to the mean.
Well, Cain isn't going to sustain a 90% strand rate, but people often forget that regression works both ways. It's very likely that he'll also experience regression in allowing baserunners since they're currently hitting like Pujols against him with the bases empty. Right now, more runners + awesome strand rate = low ERA. By season's end it'll likely be more strikeouts+fewer runners+lower strand rate=low, but not as low as before, ERA.
I consider Cain to be in a similar boat to Scott Kazmir in that they have the tools to be true #1 pitchers but they won't ever reach that point, leaving them great #2 pitchers. When I think of a #1, I think of Halladay, Johan, Hamels, Sabathia, Lincecum... Cain doesn't come to mind, but he is definitely a top-tier #2.
Haha, nice catch. I didn't even realize that!
Sure is, that's a good one, a bit tougher but definitely in the realm of possibility.
Sounds good to me.
Okay, so far I have:
1) DP for pitchers
2) Hitless streaks for those at .300 or better 05-07
3) Homerless streaks for those avg 25 or more 05-07
Yeah I knew that one was coming, haha. I'll add that to the list... perhaps the biggest HR droughts in 2008 for players who averaged 25+ HR from 2005-07?
Won't be tough... just takes about a half of a day to do the streaks for a particular stat, so I was hoping to get a few and then maybe put up an Unfiltered post with the desired streaks.
So yeah, I essentially confused myself a little bit, acting as if there were subsets--thus the reasoning for the overlap line--when in fact there couldn't have been any overlaps due to no subsets.
Dr. Dave, that's the answer to mountainhawk's question: I did not include any subsets. Streak-finding in databases is a tough, multi-step process without any sort of Perl script, so the process involved flagging each PA with a 1 if the particular stat occurred and a 0 if it did not, and then counting the PAs between consecutive 1's.
So, for Molina, the stretch of interest was 9/3-9/23, meaning he got a walk on 9/3, and then did not walk for 70 PAs until again doing so on 9/23. Taking the subsets wasn't of interest.
That would be akin to saying Zimmerman not only had a 30-game hit streak, but a 25 gamer, etc.
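Outside of a database, the same idea is only a few lines. This is a sketch rather than the query I actually ran: it takes one 0/1 flag per PA in chronological order and reports only the full stretch between two occurrences, never a subset of it. Ignoring open-ended stretches at the start or end of the list is an assumption.

```python
def longest_gap(events):
    """events: one flag per PA, in order (1 = the stat occurred, 0 = it did not).
    Returns the most consecutive 0s found between two 1s."""
    longest = 0
    last_occurrence = None
    for index, flag in enumerate(events):
        if flag:
            if last_occurrence is not None:
                longest = max(longest, index - last_occurrence - 1)
            last_occurrence = index
    return longest


# A walk, then 70 walkless PA, then another walk, then a short gap.
plate_appearances = [1] + [0] * 70 + [1] + [0] * 3 + [1]
print(longest_gap(plate_appearances))  # 70
```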
Or you could always just yank him in Strat as soon as the goings get tough... or follow Rowen's prescribed plan above.
No I don't care! Take him out no matter what!
Exactly... that's what I meant by hypnotics. He looks like he SHOULD be able to get through the frame but he won't, and history confirms this. I agree with your rule... I don't care if he has a 2-hit shutout going, if he lets 2 guys on in the sixth, pull him.
Right, and that's exactly why I didn't do them, haha, but I'll definitely see if something is there... righties who dominate righties but are not closers and have usage patterns similar to LOOGYs.
Burr, do you mean LOOGYs who ONLY faced lefties in like 75% of their appearances?
Yes, it certainly was. Re-read. I classified LOOGYs as anyone who, in a particular season, faced 3 or fewer hitters in 50% or more of their appearances. Non-LOOGYs = everyone else.
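In code, that classification rule looks something like the sketch below. The input is assumed to be one pitcher-season's batters faced in each appearance, and the function name is mine.

```python
def is_loogy(batters_faced_per_appearance, max_batters=3, min_share=0.5):
    """True if 3 or fewer hitters were faced in 50% or more of the appearances."""
    if not batters_faced_per_appearance:
        return False
    short_outings = sum(1 for faced in batters_faced_per_appearance
                        if faced <= max_batters)
    return short_outings / len(batters_faced_per_appearance) >= min_share


# Hypothetical seasons: batters faced in each appearance.
print(is_loogy([1, 2, 3, 5]))  # True, 3 of 4 outings were short
print(is_loogy([1, 4, 5, 6]))  # False, only 1 of 4
```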
I know this is a pretty vague comment ;-).
The first measure is STS/pitches but the second is STS out of STS+CalledK+Fouls. The reasoning deals with the distribution of non-BIP strikes. Some may get more whiffs, others may get higher percentages of fouls than anything else. Measuring it relative to swings is certainly valid, it just isn't what I was intending to look at.
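To show the difference between the two denominators, here is a small sketch with made-up pitch counts. I am reading STS as swinging strikes and CalledK as called strikes.

```python
def swinging_strike_rates(swinging, called, fouls, total_pitches):
    """Return (STS per pitch, STS share of all non-BIP strikes)."""
    per_pitch = swinging / total_pitches
    per_non_bip_strike = swinging / (swinging + called + fouls)
    return per_pitch, per_non_bip_strike


# Hypothetical pitcher: 1,000 pitches, 100 whiffs, 150 called strikes, 150 fouls.
print(swinging_strike_rates(100, 150, 150, 1000))  # (0.1, 0.25)
```

Two pitchers with the same per-pitch whiff rate can land in very different places on the second measure if one of them gets most of his strikes on fouls.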
What sort of reference/links did you have in mind?
kmbart, let me start by saying the PFZ is not factored into the flight trajectories. The fields of interest for the trajectories are x0, y0, z0, vx0, vy0, vz0, and ax0, ay0, az0. x0 is the horizontal position of the ball at its initial release point and z0 is the vertical initial release. y0 measures the distance from which the ball is released. In the dataset the standard is 50 ft though some analysts prefer to backtrack everything to 55 ft. The ax0, ay0, az0 fields measure acceleration of the pitch in feet per second squared in three dimensions, and the vx0, vy0, vz0 fields measure velocity in feet per second. To find the start_speed, or what we know as velocity, you take the square root of the sum of the squares of vx0, vy0, and vz0.
To calculate the horizontal trajectory we return to formulas probably seen in high school/collegiate physics but thrown by the wayside as soon as the final exam ended:
Horizontal Trajectory = x0+vx0*t+0.5*ax0*(t*t), where t = the time interval. The other trajectories are calculated similarly, except you substitute y and z for x in each instance.
So when you see my above flight pattern for Santana, the horizontal was excluded because you cannot see that from the first base view... from that view, the vertical trajectory was most important. Note that the formula with the y's instead of x's and z's is used regardless, but for first base views we use it in conjunction with the z's. Looking from an angle to see horizontal trajectory we would use the x's and y's, not the y's and z's.
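For anyone who wants to plug numbers in, here is a minimal sketch of the speed and position formulas. The pitch values are hypothetical, and the feet-per-second to mph conversion is something I added, since the velocity components are in ft/s while the speed people quote is in mph.

```python
from math import sqrt

FPS_TO_MPH = 3600 / 5280  # feet per second -> miles per hour


def speed_mph(vx0, vy0, vz0):
    """Magnitude of the initial velocity vector, converted from ft/s to mph."""
    return sqrt(vx0 ** 2 + vy0 ** 2 + vz0 ** 2) * FPS_TO_MPH


def coordinate(p0, v0, a0, t):
    """Constant-acceleration position: p0 + v0*t + 0.5*a0*t^2."""
    return p0 + v0 * t + 0.5 * a0 * t * t


# Hypothetical pitch, not taken from any real PITCHf/x record.
pitch = {"x0": -2.0, "y0": 50.0, "z0": 6.0,
         "vx0": 6.0, "vy0": -130.0, "vz0": -5.0,
         "ax0": -10.0, "ay0": 28.0, "az0": -18.0}

t = 0.2  # seconds after the initial measurement point
# First base view: distance from the plate (y) paired with height (z).
side_view = (coordinate(pitch["y0"], pitch["vy0"], pitch["ay0"], t),
             coordinate(pitch["z0"], pitch["vz0"], pitch["az0"], t))
print(round(speed_mph(pitch["vx0"], pitch["vy0"], pitch["vz0"]), 1), side_view)
```

Swapping the z fields for the x fields in the pair gives the overhead view of horizontal trajectory instead.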
Yeah, I would agree... it has to be very tough to run a company when people tend to leave for teams or other work with great frequency. Kind of like managing a minor league team with stud prospects. When they get the callup there is really nothing you can do.
Glad you have enjoyed my work here. I have loads of fun writing and analyzing and it's always wonderful to hear that your hard work is making some sort of a difference to readers.
No problem, glad you enjoyed it.
What I'm basically saying is that if we see a righty with an average PFX_X of -5.5 inches on his fastball and a lefty with an average PFX_X of +5.5 inches on his fastball, they threw with the same amount of horizontal movement, each relative to a pitch thrown at the same velocity with no spin. The only reason for the sign discrepancies is that the pitches came from opposite sides of the plate, with the right of the catcher/ump being positive and the left being negative.
As I mentioned in the article, the movement components measure how much the ball moved relative to a pitch thrown at the same velocity with no spin. So when Santana threw with 7.1 horizontal inches, about two inches less than a 20 oz. water bottle, that's compared to a ball with no spin at that velocity. However, in the data, lefties have positive horizontal movement values on fastballs and righties have negative horizontal movement values on fastballs. It isn't that they are doing things differently but rather that the initial point of release comes from opposite sides of the plate relative to the catcher and ump.
I took the average PITCHf/x data for each of his pitches and plugged it into a spreadsheet I have that calculates the position of the pitch based on the average data along its course to home plate, factoring in the position of initial release, the acceleration, and the feet per second form of velocity at various different intervals.
Essentially it's showing that if a ball is released at Point A at Time 0 with Acceleration B and Velocity C, then at Time 0.025 it will be at Point B... at Time 0.050 it will be at Point C, etc, all the way up until it crosses/passes home plate.
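The spreadsheet logic can be sketched as a loop that steps forward 0.025 seconds at a time until the ball reaches the plate. The pitch values are hypothetical, and treating y = 1.417 ft as the front edge of home plate is an assumption about the coordinate system, not something stated above.

```python
def flight_path(pitch, step=0.025, plate_y=1.417, max_steps=400):
    """Return (t, x, y, z) rows at fixed time steps, stopping at the first
    row where the ball has reached or passed plate_y."""
    points = []
    for n in range(max_steps):
        t = n * step
        # Same constant-acceleration formula applied to each axis.
        x, y, z = (pitch[axis + "0"]
                   + pitch["v" + axis + "0"] * t
                   + 0.5 * pitch["a" + axis + "0"] * t * t
                   for axis in "xyz")
        points.append((round(t, 3), x, y, z))
        if y <= plate_y:
            break
    return points


# Hypothetical pitch, not taken from any real PITCHf/x record.
pitch = {"x0": -2.0, "y0": 50.0, "z0": 6.0,
         "vx0": 6.0, "vy0": -130.0, "vz0": -5.0,
         "ax0": -10.0, "ay0": 28.0, "az0": -18.0}

for t, x, y, z in flight_path(pitch):
    print(f"t={t:.3f}  x={x:6.2f}  y={y:6.2f}  z={z:5.2f}")
```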
The data can be found through mlb.com, at this URL: http://22.214.171.124/components/game/int/year_2009/. If you don't have database experience you can find individual game data there. If you want to just take a look at the data, Dan Brooks has a wonderful website that can be found here: http://www.brooksbaseball.net/pfx/.
You would be surprised how far 1,500 words can go... I consider myself to be one of the hardcore stats-guys on this website and my articles tend to range from 1,400-1,800 words each week.
It's not that hardcore analysis cannot be done in 1,500 words or less but rather that it is extremely difficult to convey methodology, results, meaning, and context in that range. It is certainly possible but it takes a lot of hard work.
The major reason these two rates were chosen is because they were the most discussed relative to his turnaround. Plenty of articles were written last year about Lee's great season, including one by myself, Normandin and Goldstein, where the gb rate and bb rate were highlighted.
What is really interesting, though, is that if only 10 names were returned while probing for drastic changes in BB and GB rates, what happens if the K rate is added in? I'll check later, but maybe Lee is the only person since 1954 to increase GB rate by at least 8%, decrease bb/9 by at least 1.5 and increase K/9 by at least 0.5 in the same year.
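As a sketch of what that three-way filter would look like, assuming the GB rate is stored as a fraction so that 8% means 0.08 in percentage-point terms. The field names and both seasons are hypothetical.

```python
def matches_rate_shift(prev, curr, gb_gain=0.08, bb9_drop=1.5, k9_gain=0.5):
    """True if GB rate rose, BB/9 fell, and K/9 rose by at least the given
    amounts between two consecutive seasons."""
    return (curr["gb_rate"] - prev["gb_rate"] >= gb_gain
            and prev["bb9"] - curr["bb9"] >= bb9_drop
            and curr["k9"] - prev["k9"] >= k9_gain)


# Hypothetical back-to-back seasons for one pitcher.
year_one = {"gb_rate": 0.35, "bb9": 4.0, "k9": 6.0}
year_two = {"gb_rate": 0.46, "bb9": 2.0, "k9": 7.0}
print(matches_rate_shift(year_one, year_two))  # True
```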
Thanks for the data mining, Greg, that's interesting! Also, in response to your previous comment, it isn't that I drew the wrong conclusions, but rather that I highlighted the very limited group relative to the sample. What I meant was that in the timespan of 50+ years investigated, the number of pitchers that experienced these rate shifts and then sustained them is incredibly small... relative to the number of instances in which 95+ IP were logged back to back.
As you noted, the rising percentages signal that it was easier for these guys with the shifts to sustain the rates but my overall point is more along the lines of the guys who did so constitute an incredibly small portion of the greater whole. I can see how it would be confusing when I noted that even rarer was the number of pitchers who sustained the rates, but I really meant that relative to the overall sample not just the guys experiencing the rate shifts.
That's certainly a good point. That's honestly the next step here. First it's important to determine how common such rate shifts are and then evaluate their importance relative to performance. Ultimately, though, so few pitchers in the grand scheme of the sample have experienced rate shifts as drastic as Lee's last season that even if we wanted to determine what made some successful and others not, the results would be invalid because of the small sample.
As I just responded above, comparing Lee to his more effective seasons is fine... there is nothing wrong with that... but the goal here wasn't to strictly compare 2008 Cliff Lee to 2007 Cliff Lee.
The goal was to research how often pitchers have experienced such drastic rate shifts in the same season and then how they have fared in subsequent seasons. Lee was just the baseline of the idea since it happened to him so recently and many chalked up his dominant season to such an incredibly low BB rate and much higher GB rate.
Well you're combining different things there... the point here was to investigate how pitchers with rate changes have historically fared... it's essentially independent of Lee. The idea was derived from Lee's shift from 2007 to 2008 but the goal here wasn't necessarily to investigate Lee as much as other pitchers with similar rate changes... just centered around someone who famously experienced such shifts last season.
Nope, doesn't explain them and I didn't intend to try... it's just two starts. After four starts in 2008, Sabathia had 18 IP, 27 H, 32 ER, 16 BB, 16 K. He turned out just fine. This data in conjunction with last week's piece really shines light on two major things:
1) Groundball pitchers follow different rules so even if rates are sustained, several factors can worsen one of their performance lines in a given year.
2) Reducing a UBB/9 and increasing a GB_Rate as drastically as Lee did last year is incredibly rare and nobody other than Daniel Cabrera, since 1954, has been able to sustain both rates in the following season.
And thanks for the compliments!
Yep, my DB has Lester going from 0.35 to 0.49 in GB Rate from 2007-08 and from 4.43 to 2.78 in UBB/9. However, his 2007 innings weren't high enough to qualify for the study. Despite that, he too might be someone to keep an eye on based on the extreme rarity of pitchers sustaining such shifts.
I'll probably end up tracking this. I have a followup in the works, possibly for next week, looking more into this data as well as his minuscule walk rate. Maybe it's just me but I find it incredibly fascinating that a groundball rate spike like Lee's is not only rare from 1954-present, but that so few pitchers were able to consistently sustain the rate while facing a decent number of hitters.
Interesting idea. Your last paragraph definitely represents the primary ideas surrounding these pitchers. But think of a guy like Matt Clement, who missed many bats but also kept the ball on the ground. He had the reputation of a power-type pitcher. Reputations currently peg power pitchers as strikeout guys with hard fastballs and finesse pitchers as grounder guys who avoid walks. There are certainly cases where this just isn't true.
Nope... was just testing you to INsure you were paying attention.
I guess Billy Idol is screwed! Eric Idle will be safe.
Joe, look below, as I posted that data here in the comments.
Well, it would actually be the other way around... this was done as Data on Day 4 - Data on Day 5, meaning velocity was up on an extra day of rest for each pitcher. Though with the samples being 22, 57, and 72 respectively, I too would doubt the results as being overly significant. Definitely something to examine in the future.
Burr, the issue there is that these guys only made like 1-2 starts on short rest and it isn't even worth reporting. The more interesting data would be the paired samples deltas, which I posted here in the comments section as it compares the 4 day vs 5 day data for each individual pitcher and then takes the weighted average.
Okay, so I just re-ran the numbers based on the suggestions and got the following deltas from 4 day to 5 day:
Hard-Throwers: -0.13 velo, 0.07 horizontal, 0.24 vert
Medium-Throwers: -0.11 velo, -0.14 horizontal, -0.03 vert
Soft-Tossers: -0.05 velo, 0.21 horizontal, 0.04 vert
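For anyone who wants to reproduce the paired-samples idea, here is a minimal sketch of how a weighted delta like the ones above could be calculated. It follows the Day 4 minus Day 5 convention, so a negative number means velocity was up on the extra day of rest. The `PitcherSplit` class, the `pitches` weight, and the sample numbers are all hypothetical; this is not the actual query or weighting used for the study.

```python
from dataclasses import dataclass


@dataclass
class PitcherSplit:
    name: str
    velo_4day: float  # average velocity on 4 days of rest (mph)
    velo_5day: float  # average velocity on 5 days of rest (mph)
    pitches: int      # hypothetical weight, e.g. pitches in the smaller split


def weighted_paired_delta(splits):
    """Weighted mean of each pitcher's own (4-day minus 5-day) difference."""
    total_weight = sum(s.pitches for s in splits)
    if total_weight == 0:
        raise ValueError("no pitches to weight by")
    weighted_sum = sum((s.velo_4day - s.velo_5day) * s.pitches for s in splits)
    return weighted_sum / total_weight


# Made-up numbers, for illustration only.
sample = [
    PitcherSplit("Pitcher A", 93.0, 93.2, 300),
    PitcherSplit("Pitcher B", 94.5, 94.5, 100),
]
delta = weighted_paired_delta(sample)
print(f"weighted 4-day minus 5-day velo delta: {delta:.2f} mph")
```

Pairing each pitcher with himself before averaging keeps a group's delta from being driven by which pitchers happened to get the extra day.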
Ah, okay, I misunderstood that point. I'll calculate that out now and report it in a few minutes.
Which is pretty much why I suggested skipping the 3-day and comparing the 4 day and 5 day data, as in each group these two areas featured the same pitchers, making the data more meaningful. Of course the short rest will be skewed by the better pitchers... but the sample sizes were irrelevant there to begin with so there was no need to mention that.
Jamey... 3 days is short rest... and I literally mentioned that nothing should be gleaned from short rest in any of the three groups, and that the more interesting discussion is on standard rest (4 days) and one extra day (5 days). In fact that's the bulk of the article.
Do you mean a breakdown based on batted ball types? Like those who miss bats vs. those who pitch to contact? It's definitely doable. What I'd like to do is get a couple more ideas like this and put together an article dissecting the suggestions.
Yep, I own a copy, and Bill and I have spoken quite a bit before. Last we spoke he was working on another book about mammoth home runs.
Marc and I did a player profile on Quentin last year and my major point in the performance evaluation is that Quentin was very lucky with his home runs, very similar to Youk.
Exactly... there is something different about pressure, and we shouldn't be suppressing humanistic elements of the game but rather using them in conjunction with the stats. Like I wrote above, do closers have different pitch data in/out of save situations? Yes, and they should be expected to because the approach is different. In non-save they might be more prone to throw in the zone, nibble less, etc.
But I don't want us to say they do "worse." They pitch DIFFERENTLY, which may lead to different results.
As am I... as am I. For now, it's more of a supplement to analyses rather than the definitive form of analysis. In several years' time, we will be able to do many of the same things we have with other data, such as aging curves and studies along those lines.
At the risk of being rude, did you even read this article? I literally went over each of your points and fully acknowledged that this is just a start, nowhere even near the full study. Of course we need more data. This has 41 pitchers in 1 season... we need much more, but I mentioned that at least 4 times throughout.
I DID look at individual pitcher splits, and then calculated the weighted results, and to ensure that guys like Borowski weren't hurting the rest of the sample, I partitioned it further.
And I have a whole paragraph here designated to discuss how strength of opponent needs to be incorporated.
This interests me as well, but alas we do not have enough data. Let's promise to revisit this in 5 yrs!
That is, 72% fastballs vs 74% fastballs.
Definitely good stuff for me to look up. I did find that the overall pitch selection discrepancy was minimal: 72% in save, 74% in non-save. As far as pitch data based on rest... well... you may have just given me an idea for an upcoming article ;-).
As with all aspects of baseball, outliers will exist even when the majority represents something else. The idea is that better teams will have more opportunity for save situations... doesn't necessarily mean it is always true, or even that the teams will employ the designated closer in all of the games... just that more opportunity generally exists.
Nate, definitely a valid point. I had that as a factor but didn't want to discuss ALL factors hurting the studies. Yours is definitely valid, though, as it can muddle up the save situation stats.
I hope he pitches until the age of 75 and wins 450 games.
We would have to see Nate's take on it to know if we're definitely discussing the same things. In the article I'm referring to what JayHawk termed pure knuckleballers, the guys who threw solely that pitch, which is much less taxing on the arm than, say, a Nolan Ryan heater. Plus, the knuckleball is a specialty pitch, so the seasons of Moyer, Ryan, and John are more impressive to me given that they used more common repertoires.
Watching Moyer against the Marlins is always a treat. He'll throw over to first 9 times and get Hanley Ramirez anxious to the point that he feebly pops up.
In all honesty, you shouldn't expect every single writer at a particular site to think the same way. It isn't groupthink here. Plus you're comparing two very different things here: staying healthy and staying effective. In no way was I suggesting that Niekro and Hough were effective, just that throwing a knuckleball isn't as severe as throwing hard fastballs, curves, sliders, etc, so their ability to pitch at age 45+ isn't as impressive as Ryan's, John's, or Moyer's.
Oh, it's definitely what he is, but I don't really see the need for the title. Rich Dubee does a very solid job as their pitching coach, but Moyer is invaluable in this regard. Even if Moyer regresses from last season's production, his mentoring and instruction to these young players adds greatly to his value to the team.
I mentioned in the article that there is no Pitch F/X data with which to compare this season... the dataset didn't emerge until last year, in incomplete form, and Petey only made 5 starts anyways, way too small of a sample.
His velocity has gone from around 89 to 86 since 2005, though, his repertoire has not changed much (also mentioned in the post), and his zone percentage has dropped from 59% to 50%.
All I can say is that you are MUCH higher on him than I am. As I said, the 13-yr old inside of me who saw his dominant seasons would love to see it again, but I just cannot see him doing any more damage than the Schmidt/Estes/Stults combo... and Manny took care of tickets... I can't see hordes of fans flocking to go watch Pedro pitch anymore as he hasn't really been marquee in a few years.
I can see the Dodgers signing him but realistically he isn't that big of an upgrade over Estes/Schmidt/Stults.
Pedro is projected at 1.9 WARP in 110 innings. Schmidt and Estes combine for 1.5 WARP in about 135 innings, and Stults is projected at 1.4 WARP in 110 of his own innings. If we say that Stults hits his 110 innings and the remaining innings from that fifth spot are from either Estes, Schmidt, or a combination, the overall added wins total at the very least equals Pedro.
Add in that Schmidt is another perpetually injured player being overpaid and signing Martinez doesn't get you any guarantee of 30 GS/200 IP or anything along those lines that would merit spending a good $5 mil or so for a very marginal upgrade, or potential downgrade.
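To put rough numbers on that comparison, here is a quick back-of-the-envelope sketch using the projections quoted above. The big assumption, which is mine and not part of the projections, is that WARP accrues linearly with innings, so the Schmidt/Estes projection can be scaled down to a partial workload.

```python
# Projections quoted in the comment above: (WARP, innings).
PEDRO = (1.9, 110)
STULTS = (1.4, 110)
SCHMIDT_ESTES = (1.5, 135)


def break_even_innings(target, base, filler):
    """Innings of `filler` needed on top of `base` to match `target` in WARP.

    Assumes WARP scales linearly with innings, which is a simplification.
    """
    gap = target[0] - base[0]
    filler_rate = filler[0] / filler[1]  # WARP per inning
    return gap / filler_rate


innings_needed = break_even_innings(PEDRO, STULTS, SCHMIDT_ESTES)
print(f"Schmidt/Estes innings needed to match Pedro: {innings_needed:.0f}")
```

Under that assumption, Stults plus roughly 45 innings from Schmidt and/or Estes gets you to Pedro's projected total.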
The 0.03 was for season total SNVA and Flake, and I definitely agree that there could be some selection bias. However, by restricting the sample each year to 150+ IP, it isn't as if I was lumping guys with 7 starts in with those with 30+ starts, so it is vastly reduced.
They could, but they could also be the guys that wash out of the league and are forced to take minor league deals with non-roster invitations.
The idea is that pitchers may be consistent one year but inconsistent the next. Sure there are studs who are consistently awesome every year, but by and large, these players are few and far between, so it isn't necessarily contradictory... it's just that when you think of it you're probably immediately thinking of counterpoints like Sabathia, Halladay, etc.
There are some parks with screwy calibrations but there aren't as many as in 2007 and I do my best to normalize all of these things, so whenever you see/hear something is awry with the Pitch F/X data, know that anything you read of mine has some sort of correction, including pitch classifications.
Exactly, Clubber. Floyd is the poster-child for someone who saw vast shifts in his stats without a change in approach. Danks became a different pitcher which is why any shifts he experienced are "more real" than luck-based. He may see slight regressions in a couple areas but he looks to be the real deal. I wish I nabbed him in my Strat draft last week!
Not anywhere in this entire article did I say he is as good of a pitcher as Halladay. I merely pointed out that their cutters are very similar in terms of movement. And it is mentioned that a regression in HR/FB may very well occur given the league average of 10-11% but that if his cutter is for real, and all indications point in that direction, he could definitely keep that rate below 10-11%. On top of that, even if it regresses to 11%, the vast reduction in flyballs will still limit the home run count.
The issue is that the change in approach and addition of a tasty new pitch caused the statistical shifts, which is different from other pitchers who post terrific numbers without any real differential in pitch data but changes in luck-based indicators.
It could be, but given the enormous sample size I tend to think that things even out, and a selection effect like you described would only exist in a smaller sample, like the study I did last year about those who threw a ton of pitches in one inning. In that study, it definitely was a selection bias in the sense that those whose fastballs required more effort to throw (harder, faster velocity) saw a sharp decrease in average velo in their grouping because it included a 3 mph range, and those on the low end of the range ended up using it more.
And actually, Tommy John was classified as Finesse from 1963-1974 and 1976-1989, so he may have done a few things differently pre and post-surgery, but his classification remained the same. Since I find this stuff fascinating, let's look at some others:
Greg Maddux: finesse pitcher entire career except for 1992, 1995, and 1998.
Roger Clemens is pretty interesting, as he was Neutral in 1984-85, Power 1986-89, Neutral 1990-92, Power 1993-2006, and then Finesse in 2007.
Interesting. For the record, I do have everyone's classification for every year of their career from 1950-2008. Here is Moyer's:
So he might not be the best example because he was always a finesse pitcher. Here is Frank Tanana:
So he was all over the place to start his career but then settled in as a finesse pitcher.
Your idea is interesting though I think it depends on what you consider finesse/power pitchers to be. The < 24% BB+K% makes sense, in my eyes, because finesse pitchers throw to contact. They aren't going to walk or fan many. Power pitchers pitch less to contact. Because of this definition, we get guys throwing > 94 mph as finesse (Bobby Jenks, for one), and some guys throwing < 90 as power pitchers, like Chris Young. What would your profile look like? With such variance in the individual components, what would a finesse Pitch F/X profile look like?
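As a rough illustration of how a BB+K% classification like this can be applied, here is a minimal sketch. The 24% finesse cutoff is the one mentioned above; the 28% power cutoff and the use of batters faced as the denominator are assumptions on my part for the sake of the example, so treat the function and its parameters as hypothetical rather than the exact definition behind my database.

```python
def classify_pitcher(walks, strikeouts, batters_faced,
                     finesse_max=0.24, power_min=0.28):
    """Label a pitcher-season as Finesse, Neutral, or Power by BB+K%.

    finesse_max comes from the discussion above; power_min and the
    batters-faced denominator are assumed for illustration.
    """
    if batters_faced <= 0:
        raise ValueError("batters_faced must be positive")
    rate = (walks + strikeouts) / batters_faced
    if rate < finesse_max:
        return "Finesse"
    if rate > power_min:
        return "Power"
    return "Neutral"


# Made-up season lines, for illustration only.
print(classify_pitcher(walks=40, strikeouts=100, batters_faced=800))
print(classify_pitcher(walks=60, strikeouts=150, batters_faced=800))
print(classify_pitcher(walks=80, strikeouts=220, batters_faced=900))
```

Note that nothing in the calculation looks at velocity, which is how a hard thrower can land in the finesse group and a sub-90 guy in the power group.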
I correct anything that seems fishy with regards to the algorithm for that very reason.
Matt, there are definitely finesse pitchers with 4-seamers and power pitchers with 2-seamers, but for the most part you're correct. Though this point wasn't really missed as I stated that vertical movement is much more telling for power pitchers than horizontal. And yes, it's horizontal/vertical. Since the classifications between 2-seamers and 4-seamers are still somewhat hazy in the data it is more accurate to discuss the actual movement as opposed to what the pitches are called.
And there are a number of power pitchers who throw under 90 mph and finesse pitchers throwing above 92 mph, so it isn't as if every single pitcher in the respective group carried the same attributes.
I also think you misread something as I never said the movement data was trending in negative directions for power pitchers. I said that the movement data trended in OPPOSITE directions. As the pitcher groups threw harder, their horizontal movement decreased and vertical movement increased.
Thanks, gw. Glad you enjoyed it.
I agree there is a difference between the Moyers and Madduxes and the Kennedys. What actually interests me more is the pitchers with reverse numbers. As in, the guys who throw below 89 mph classified as power pitchers and above 92 classified as finesse.
I think that if I were to specifically discuss finesse pitchers I would separate them into a couple groups and determine what makes a specific finesse pitcher successful. This study was more about quantifying the differences between power and finesse.
All I want to know is whether or not Denny Neagle or Omar Daal won the matchup shown in the newspaper picture.
Haha, well put. I could see this season going a variety of ways for Cabrera. I can see him getting hurt early on and missing most of the year, I can see him starting strong and fading just like last season, or I can see him succeeding all season long, like Lohse, signing an overpaid deal following the season when everyone thinks he turned a corner, and then proving to be inconsistent once more.
Frank, I admittedly have not seen enough of them to give an opinion. I'm a Phillies guy so I tend to primarily watch NL games. Cole Hamels is a pretty slick fielder who always ends up in prime fielding position. I always thought Jair Jurrjens looked pretty good in that regard, too. Another favorite of mine is Shawn Hill, though with his delivery it is unlikely he will ever consistently log over 20 GS in a season.
I can't tell if that's sarcasm or not.
Frank, Maddux is the gold standard for me, with Kenny Rogers, Mussina, and Mark Langston close behind. Jamie Moyer is pretty good at ending in the proper position as well.
Thanks for the feedback, guys. I'll likely be following up on this one next week, looking at Cabrera more in-depth. I mentioned his decline from 2006-08 here but didn't discuss it outside of calling Perez the better player and more sound investment. The idea here was more to look at how they both differ from the rest of the league.
Aaron/YYZ, exactly. Cabrera also had a pretty solid 2005 season when he struck people out and induced plenty of grounders. Because these two have shown flashes of brilliance, everyone will do their best to fix flaws.
As a Phillies fan, I think Wigginton would likely be the most productive of that bunch, but I feel like The Millar Experience might be worth the small fee.
I think what we're all agreeing with is that it's harder to blame Dusty, contextually, for bringing in Harang, because nobody other than Cueto was available. The bulk of the blame goes to him for burning the bullpen up so quickly and then putting Harang in for 4 innings.
And even if you feel Baker is absolutely blame-free for the entire game, how do you allow Harang to pitch three days later?
It's not as if the Reds were in a Sabathia-Brewers situation.
Derek, exactly the point. You cannot let your players know this is a meaningless game, but two cellar-dwelling teams at the end of May is a meaningless game. You simply, as a manager, cannot mortgage your ace and up and coming stud to win that game.
And my sentiments exactly about the long reliever criticism. I'm not saying every team needs to have ONLY one. Just looking at the Phillies last season, Clay Condrey is their mopup guy, Ryan Madson has previously pitched loads of innings, Chad Durbin is used to being a starter, and you still could have used Kyle Kendrick as a long reliever when Madson became the setup guy. These are 3 different guys who could have alternated each game.
Peter, very well put. When I set out to write this, the first thing I wanted to make sure of was that Dusty was fairly represented. And I still feel I could have been a bit more sympathetic towards him given the circumstances. He micro-managed the game and burned his whole bullpen up through 12 innings, which was definitely avoidable if either Fogg/Bray went deeper, or someone like Fogg was merely kept as a backup plan. This game should serve as a reminder to do your best to keep one guy available at all times, or to carry a 6th starter/long reliever whose job is to make spot starts or long relief outings.
I remember a long game a couple of years ago between the Phillies and Mets where Ryan Madson pitched about 7 relief innings. He gave up a game-ending homer, but he saved a ton of arms.
Hands, exactly my sentiments, and it is in there though not as explicitly as your comment. The error really occurred when Baker burned his whole bullpen after the 12th inning, and had no other relievers to bring in.
After that, he had to go to a starter. Cueto would have started the next game, so he had the most rest, and hadn't logged a ton of innings up to that point. Going with Cueto for a max of 2 innings and then potentially to Harang for 1-2 innings might have been excusable given the circumstances, but to throw your ace out there on 2.5 days rest, to throw 4 innings, and then start him 3.5 days later is asinine.
Well, think about what happens if Jeff suddenly lays off some outside pitches in 2009.
By virtue of not chasing, pitchers will theoretically have to throw more in the actual strike zone or else he will walk. This would be the first step, but he still has to prove himself capable of handling pitches in the zone, too. Since he is swinging less at better pitches, it isn't just pitches out of the zone giving him trouble. I've seen plenty of Frenchy ABs during which he swung and missed pitches seemingly down the middle.
Laying off junk is of paramount importance but does not offer a similar guarantee of results as, say, Ryan Howard, who mashes balls in and around the zone, but can't hit anything outside.
Rawagman, go to Google, type in Eric Seidman, Matt Cain. I'm sure you will find a plethora of articles. I kind of have a baseball fetish for Cain ;-).
Given the contract Jenkins signed this year as well as Werth's ability to pummel lefties but not dominate righties, the Phillies wanted to platoon them, hoping that Victorino in centerfield and Werth/Jenkins in rightfield would be more productive than Rowand in centerfield and Victorino in rightfield.
When Werth still manhandled lefties but was just OK against righties halfway through the season, you can kind of understand, with the above logic, why he still wasn't consistently starting.
When Jenkins went down and Werth caught fire, there was no way anyone could keep him out of the lineup. Additionally, from comments Gillick made on local tv shows, he had wanted Werth in the lineup everyday even sooner. Regardless, he is there now, and will be a full-time starter next year, when he should prove to be an extremely valuable fantasy player, even though he will be more well-known.
Shoewizard, that is exactly what I was referring to. If Ethier THINKS he is going to see better pitches because, as kcboomer says, it makes so much intuitive sense that he SHOULD, then his confidence is likely going to go up, and he is going to be better at working the count, perhaps, getting to see better pitches more often. Statistically, Ethier is not really seeing anything different, but if he thinks he is then that does say something. He has definitely stepped his game up in the last couple of months but I feel that to credit solely Manny for this surge is inaccurate.
Slingerland, I can see him being good next year, but not THIS good. The HR/FB of 4% is not sustainable. When that inevitably regresses, even if to just 7-8%, it will result in a lower strand rate, higher ERA, and higher FIP. Then again, he could shock us all like he did this year. His controllable skills look better than they are due to this very low HR rate, but for all we know, he could start fanning more hitters to counteract that.
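To show how much a HR/FB regression alone can move things, here is a minimal sketch using the commonly cited FIP formula, (13*HR + 3*(BB+HBP) - 2*K)/IP plus a league constant. The stat line below is made up for illustration and is not his actual line, and the 3.10 constant is an assumed placeholder since the real constant changes from year to year.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching; `constant` is an assumed league value."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def fip_at_hr_per_fb(hr_per_fb, fly_balls, bb, hbp, k, ip):
    """FIP if the given share of fly balls left the yard."""
    expected_hr = hr_per_fb * fly_balls
    return fip(expected_hr, bb, hbp, k, ip)


# Hypothetical season: 200 IP, 200 fly balls allowed, 50 BB, 5 HBP, 150 K.
line = dict(fly_balls=200, bb=50, hbp=5, k=150, ip=200.0)
low = fip_at_hr_per_fb(0.04, **line)   # the unsustainable 4% rate
high = fip_at_hr_per_fb(0.08, **line)  # after regressing to 8%
print(f"FIP at 4% HR/FB: {low:.2f}")
print(f"FIP at 8% HR/FB: {high:.2f}")
print(f"difference: {high - low:.2f}")
```

With this made-up line, doubling the HR/FB from 4% to 8% adds roughly half a run of FIP without any change in walks or strikeouts.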
Oops, off by one year. His 2000-2002 was very solid, as are his 2005-2008. In 2003 he stunk, plain and simple. In 2004, he was much better than 2003, but injuries played a big part in keeping his numbers down.
Alexei, it actually hasn't been up and down. He has been extremely solid from 2004-2008, and had good years prior to 2002. His 2002 season was terrible, though, and his 2003 was injury-plagued. He doesn't hit for a high batting average, but look at his slash line and OPS from 2004-2007 and this year as well, and tell me it isn't consistent.
Ohh, just you wait fellers...
mars2001, the fusion of St. Louis medical with Oakland scouting would be... well, it would be like a mega-front office!
Derrek Lee is an interesting case, bdoublegeez... from 2005-2007, his LD/GB/FB were virtually identical, but his decline in power had to do with a gradual dropoff in HR/FB... 24% in 05, 15% in 06, 13% last year.
Then, this year, it is down to 11.8%, yet his FB% has dropped from 38% to 33%; he's hitting fewer home runs per fly ball, and hitting a much lower percentage of flyballs, a combo that can't be positive.
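A quick sketch of why that combo compounds: home runs per batted ball are roughly FB% times HR/FB, so both drops multiply together. This assumes the 38% FB% applies to last year's 13% HR/FB and that FB% is measured against all batted balls; the function name is just for illustration.

```python
def hr_per_batted_ball(fb_pct, hr_per_fb):
    """Share of batted balls that leave the park: FB% times HR/FB."""
    return fb_pct * hr_per_fb


last_year = hr_per_batted_ball(0.38, 0.13)   # 38% FB, 13% HR/FB
this_year = hr_per_batted_ball(0.33, 0.118)  # 33% FB, 11.8% HR/FB
relative_drop = 1 - this_year / last_year

print(f"last year: {last_year * 100:.2f} HR per 100 batted balls")
print(f"this year: {this_year * 100:.2f} HR per 100 batted balls")
print(f"relative drop: {relative_drop:.0%}")
```

Neither change looks huge on its own, but together they work out to roughly a one-fifth drop in home runs per batted ball.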
His BB rate is the lowest it has been since 2004, but his K-rate is the lowest it has been, well, ever. He is also swinging more than the last few years, increasing his contact on pitches out of the zone, which would explain the dropoff in his walk rate.
I'm not sure where the power went, but it has gone somewhere. He is still effective, but his .824 OPS is lower than Casey Blake's and equal to Cody Ross's, two players you'll likely never hear mentioned alongside Lee.