1. Pitch Counts from the "Old Days"
Pitch counts are not a new phenomenon. During my freshman season of high school baseball some 34 years ago, my job was to count our pitchers' pitches. Since I received exactly one plate appearance that season (and struck out on three pitches against a Division I recruit), I had to do something to justify having a uniform and getting out of class early for road games. Our coach never really paid any heed to the pitch counts. I think he just wanted to keep me occupied. Regardless, I have always had a thing for pitch counts and have often wondered how many pitches were thrown back in the early stages of the major leagues, when most teams had just two or three starting pitchers. Think back to the 1883 and 1884 seasons. Pud Galvin threw a combined 1,292 2/3 innings for Buffalo in those two years, while Providence's Old Hoss Radbourn (who has developed a hilarious Twitter feed in his reincarnation) logged 1,311 innings. I realize the game was different then and pitchers didn't throw as hard or put as much torque on their breaking pitches. Still, I'd love to know exactly how many pitches Pud and Old Hoss threw in those two seasons. Then again, my arm might start hurting if I learned the answer. —John Perrotto

2. Minor-League PITCHf/x Data
One of my favorite baseball tools to become readily available in recent years is PITCHf/x (thanks, Dan Brooks and Harry Pavlidis). Being able to track the velocity of a pitcher's fastball and the movement and whiff rates on his secondary pitches has resulted in me spending countless hours perusing PITCHf/x data. I'm fascinated by this information, much like I'm fascinated by prospects and their development. Bringing the two together, and making the information available to the public, would be tremendous. Seeing that a right-handed starting pitcher with a sub-2.50 ERA in the Double-A Eastern League throws a four-seam fastball at an average velocity of 84 mph would go a long way in explaining why said pitcher isn't plastered prominently on top prospect lists.

Yes, PITCHf/x data for minor-league players would probably result in a new type of crazed fan: the type who would ask our new leader in prospect coverage how a prospect could earn a plus grade on a breaking ball that has such a low whiff rate, a la the “how can a guy with bad stats be a top prospect?” question. But that's okay, because I'm not responsible for answering those questions. (I kid, of course.) I believe this information will become available in the future (hopefully the near future), but until then, I'm left anxiously waiting. —Josh Shepardson

3. Historical Pitcher Throwing Programs
I wish we had access to the throwing programs of the past, so we could learn how pitchers in modern baseball history kept their arms in shape. We would benefit if more were known about the days that pitchers did not get the ball, what they did between starts, and how their pitching coaches and field managers regulated their throwing, if at all. Keep in mind they were operating with four-man rotations in an era with smaller pitching staffs, in addition to a 154-game season. Did they play long toss? When did they throw BP, how much, and why? Flat ground versus mound work? While I realize that data was unlikely to have been kept as formally as it is by today's pitching coaches, the current game would benefit if a panel of medical experts, trainers, and pitching coaches analyzed the information. Imagine having the notes from a great pitching coach like Johnny Sain or Rube Walker, and what it could mean for the sport. —Dan Evans

4. Player Perception of Time
Someone out there is reading this in the middle of a regularly scheduled and oh-so-boring Wednesday meeting. Don't worry, we won't tell. You might hate that meeting, but it probably marks out something important, the halfway point to Friday afternoon. It's a way to mark time. Humans are good at that. We are creatures of cycles, and in general, the world is built around the cycle of the Monday to Friday, 9-5 job. If you've had a bad week, there's a natural endpoint to it, and you can get them the next time around. It's a natural human reaction to segment things out like that. It's protective. The fact that it's Monday doesn't really mean you can hit the reset button, but in your head, you can.

What happens when you have a job that doesn't have that sort of 9-5 regularity… like a baseball player? I often wonder how players segment off time. Probably the most salient marker of a new period of time beginning is the three-game series. Afterward, you switch cities and opponents. But even the little markers in a week don't happen. You play the same game every day. It's not like you have a standing "I get to play second base" day on Thursdays. I wonder how players handle that lack of landmarks and whether they create artificial ones (or whether creating these at the team level would help). It's nothing that a few well-formed questions asked of a few players couldn't answer. But, of course, while there's plenty of data on what happens on the field, there's no portal for asking a player what's going on between his ears. If there were, though, imagine the advances that could be made in understanding how players perform. —Russell A. Carleton

5. Bat Speed for Every Pitch
Things I imagine could be accomplished if we measured the bat speed of every swing by every player: Scout like freaking geniuses; identify players' decline phases with more precision; understand the aging curve better, and identify more specifically the swing types that age worse; measure the effect of each pitch type and pitch location on bat speed, and delve into the pitch-sequencing effects, as well; isolate the batter's performance in an at-bat, which is to say, draw a clearer distinction between the pitcher's role in a home run and the batter's role in a home run; measure the player's swing during a slump to determine whether the slump is artificial (random fluctuation) or physical; charts; graphs; GIFs; Mike Trout. I don't talk to scouts a ton, but when I do, bat speed comes up more than anything else. It's the hitting equivalent of fastball velocity. As long as we aren't measuring bat speed, we aren't measuring one of the most important parts of baseball. —Sam Miller

6. Reliable Radar Readings on "Fastest" Pitchers Ever
Who's the fastest pitcher of all time? It's not an answerable question. Denton True Young was called "Cyclone" (shortened to Cy) because he threw so fast. But he couldn't throw faster than The Big Train, Walter Johnson's contemporaries claimed. Or was Rube Waddell faster than both of them? Did those three throw faster than Dazzy Vance or Lefty Grove or Lefty Gomez? Did they throw harder than Bob Feller? Did Bob Feller top 100 mph? Could he throw harder than Sam McDowell or Nolan Ryan or Justin Verlander? And where does a young Satchel Paige factor in?

It's tempting to say that modern pitchers must throw harder than their ancient colleagues, who didn't have the benefit of modern nutrition, training, medicine, mechanics, or the general trend of human beings getting bigger and stronger. But that doesn't account for the possibility that one or more of these old timers could have been some genetic freak and pitching savant.

But ultimately, we can't know who the fastest pitcher ever really was because we don't have reliable velocity readings, only subjective descriptions, for much of the 20th century. If I could have only one stat, it would be PITCHf/x or, at the very least, reliable radar gun readings from every pitcher in history, so we would know how Johnson matched up to Paige matched up to Feller matched up to Ryan. And so that I would finally know if Steve Dalkowski was actually faster than them all. —Michael Bates

7. PITCHf/x for Luis Tiant
I wasn't born when the 1975 World Series took place. I first watched the games a few years ago, after buying the 1975 World Series Collectors Edition DVD box set by A&E. I remember at some point, while watching Game Six, turning to my father and saying: "Well, maybe Tiant has already thrown this pitch in one of the previous outings." I was obviously exaggerating, but we really had the feeling that Luis Tiant had not thrown two pitches of the same kind among the 300-plus he delivered in his three starts. He mixed speeds, locations, delivery angles, and motions. In Baseball Eccentrics, teammate Bill Lee described one of Tiant's characteristic deliveries: "My head stopped exactly seven times […] First I looked up, then I looked down; then my jaw pointed toward second base, then toward third. Then it pointed toward center field, then behind my back—and then, just before I throw the ball, I pointed it upstairs to where Mrs. Yawkey was sitting."

As a data analyst living in 2012, I'd be thrilled to have what Sam Miller asks for, but I would also love to see my friend Harry Pavlidis having to classify a few seasons of El Tiante's one million different pitches. —Max Marchi

8. True Distance of Home Runs Hit in the 1990s and Earlier
Remember the great home run race in 1998? Mark McGwire and Sammy Sosa and everyone else were cranking home runs at a ridiculous pace, and we all loved it. But it wasn't just the number of home runs that we loved—it was the majesty of the home runs as well. Big Mac and company weren't merely hitting balls over the fence, they were crushing them. It wasn't uncommon to hear of yet another 500-foot blast from McGwire or Ken Griffey Jr. or Frank Thomas or so many other great home run hitters. Glancing through old newspapers, one September 1997 article mentions the "fourth 500-foot home run of the season" for McGwire. Another article mentions "four consecutive 500-foot home runs against the Cubs" by a quartet of Rockies players. The blasts may not have happened every day, but they were certainly a normal event of that wacky offensive time.

Since 2006, ESPN's Home Run Tracker (also known as Hit Tracker Online) has been calculating the true distance of every home run hit in the majors, using speed off the bat, angle of launch, and other physical properties. In those seven years and (roughly) 33,500 home runs, can you guess how many home runs have actually traveled 500 feet? One. That's right. This one. With all the great home-run hitters we've seen since the White Sox won the World Series (Prince Fielder, Adam Dunn, Alex Rodriguez, Albert Pujols, Manny Ramirez, Barry Bonds, Miguel Cabrera, Giancarlo Stanton, and so many others), only one has been able to reach 500 feet. Even with steroids, smaller ballparks, crappy pitchers, juiced balls, the league's deal with the devil, and once-in-a-generation talents like Big Mac, I still find it very, very unlikely that so many 500-foot home runs were hit in such a short period of time. That's what you get when you have to rely on press box estimates in an era when chicks dig the long ball. If we could get true Hit Tracker-style home-run distances from the past, we might have a better idea of just how powerful that era was.

At the very least, we'd finally be able to know just how far the likes of Mickey Mantle, Dave Kingman, and Josh Gibson could hit the ball. —Larry Granillo

9. Biomechanical Video Analysis of Current and Former Players
I'd love to see biomechanical video analysis of major-league pitchers and hitters, both current and historical. The information could be used to determine when pitchers are tiring in-game, or when a player has developed a bad habit at the plate or on the mound. It would further be fascinating to see the exact physical differences between the way pitchers and hitters play the game now and the way the game was played decades ago. I suspect the differences would be staggering. —Matthew Kory

10. Historical Caught-Stealing Totals
This is the kind of thing that keeps a baseball history geek up at night. Around the turn of the last century, the stolen base was as much a part of the daily game as the walk and strikeout are today. You got on first base, maybe with a bunt single or by reaching on an error, and then, almost without regard to who you were or what the situation was, you took off for second. After that, maybe third. Spectators and scorekeepers kept track of stolen bases meticulously (within their technologically constrained means), and players who stole a lot of bases even in the context of this time were certainly lauded for it; Sliding Billy Hamilton was a huge star, mostly thanks to his baserunning exploits.

Yet through all this, nobody (or at least nobody in power) thought it might be interesting to track the number of times players unsuccessfully tried to swipe a bag. Not for a long time. It was 1920 before caught stealing became an official statistic, and after six years of that, the National League inexplicably decided it just wasn't worth it, dropping the stat again to start 1926. The fact that that year's World Series ended on Babe Ruth's caught stealing still wasn't enough to convince the league that this might be a good thing to know after all; the NL didn't pick the stat back up again until 1951.

So we know the original Sliding Billy stole 100 or more bases four times and 97 another, but we don't know if they came in 110 attempts or 200. We know Ty Cobb led the league in steals six times, but can't really say whether he had a talent for it or just a combination of on-base ability and bravado. We can guess the latter, based on the unofficial caught stealing stats from 1912 and 1914-16, which have him leading the league in CS twice, plus once more (officially this time, at age 40) in 1927; it seems unlikely that he would've had sterling success rates during the 11 years of his career for which the data is still absent.
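To put numbers on that gap, here is a minimal sketch of the arithmetic. The 110 and 200 attempt totals are the hypothetical bounds from the paragraph above, not recorded figures, and `success_rate` is a helper name made up for this illustration.

```python
def success_rate(steals: int, attempts: int) -> float:
    """Return the stolen-base success rate as a fraction of attempts."""
    if attempts <= 0 or attempts < steals:
        raise ValueError("attempts must be positive and at least equal to steals")
    return steals / attempts


# Hypothetical attempt totals for a 100-steal season; the real totals were never recorded.
for attempts in (110, 200):
    caught = attempts - 100
    rate = success_rate(100, attempts)
    print(f"100 SB in {attempts} attempts: {caught} CS, {rate:.1%} success")
```

The same 100 steals describe a runner who is almost never thrown out in one case and a coin flip in the other, which is why the missing caught-stealing column matters so much.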

It becomes less important as you get into the NL's inexplicable holdout of the 1920s, '30s and '40s; nobody was running much then anyway. Still, it's at least a bit odd that in all of Paul Waner's long career, save 10 games he played with the Yankees in his mid-40s, we have no idea how commonly he was thrown out trying on the way to his 104 career SBs. The same goes for Mel Ott and his 89 career steals; his 22-year career happened entirely in the NL, and at no time during it were caught stealings considered a thing by that league. Then there's young Jackie Robinson, one of the most dynamic players the game had ever seen; he led the league twice in steals in his first four years, and we can guess based on his later numbers that he was very efficient, but we'll never know exactly how efficient. It's small in the grand scheme of things, and even in the grand scheme of baseball things, but it's the kind of thing that nags at me much more than I'd usually like to admit. —Bill Parker

11. Bullpen Pitch-Tracking Technology
Some pitchers say the stuff they have in the bullpen is the same stuff they bring to the game. Others rave about how they felt in the bullpen before a bad outing or recall how they had nothing while warming up before the first pitch of a successful start. We don’t know whether what a pitcher throws before entering a game has any predictive power. But if it does, imagine the applications of a bullpen pitch-tracking system: By comparing bullpen readings to established baselines, managers could get an early warning before bringing in a reliever on a day when he doesn’t have it, or decide to switch starters based on bullpen readings before a big game. More practically, perhaps, teams could benefit from an easier way to assess a pitcher’s progress on a throw day between starts or while rehabbing from injury.
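As a rough sketch of what "comparing bullpen readings to established baselines" could look like, the snippet below scores one warm-up reading against a pitcher's earlier bullpen sessions. Everything in it is an assumption for illustration: the velocities are made up, `baseline_z_score` is a hypothetical helper, and the -2.0 cutoff is arbitrary rather than anything a team is known to use.

```python
from statistics import mean, stdev
from typing import Sequence


def baseline_z_score(baseline: Sequence[float], today: float) -> float:
    """Standard deviations between today's reading and the baseline mean."""
    spread = stdev(baseline)  # sample standard deviation; needs at least 2 readings
    if spread == 0:
        raise ValueError("baseline readings are identical; z-score is undefined")
    return (today - mean(baseline)) / spread


# Made-up average bullpen fastball velocities (mph) from earlier sessions.
baseline_velos = [93.1, 92.8, 93.4, 93.0, 92.7]
today_velo = 91.5  # made-up reading from today's warm-up

z = baseline_z_score(baseline_velos, today_velo)
# The -2.0 cutoff is an arbitrary illustration, not an established threshold.
flagged = z < -2.0
print(f"z = {z:.2f}, flag for a closer look: {flagged}")
```

Whether a flag like this predicts anything about the outing that follows is exactly the open question; the sketch only shows that the comparison itself is simple once the readings exist.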

The information might not be equally useful for every arm, and there could be some unintended consequences to making pitching decisions based on pre-game activity. How would pitchers react to the possibility of a Minority Report-style punishment for in-game crimes not yet committed? Would the psychological benefit of knowing with certainty when one is scheduled to start outweigh any advantage that could be gained by going with a (so to speak) hotter hand? Plenty of confounding factors would make it difficult to derive value from the data, but there could still be some signal lurking amidst all the noise.

We don’t need another reason to envy team employees’ access to information, but this is yet another area in which the information gap between industry insiders and outsiders is about to grow. TrackMan—a company that tracks the complete trajectory and spin rate of both pitched and batted baseballs via three-dimensional radar—has already developed a portable product (seen here during a demonstration at the Sabermetrics, Scouting, and the Science of Baseball seminar last month) that it hopes all teams will want to install in their big-league bullpens. According to one source I spoke to, at least one major-league stadium already has one in place. Using the TrackMan system or some other pitch-tracking technology, teams could get a sneak preview of a pitcher’s performance before he throws an official pitch. And seeing just a little bit further into the future than your opponent probably can’t hurt. —Ben Lindbergh

12. Batter Reaching On Error For Historical Players
Reaching on error isn't a huge part of today's game, but it was a massive part of the game historically, especially in the earliest days of baseball; omitting reaches on error could conceivably remove half of a team's baserunners. So we keep track of such things now, when omitting them isn't such a big deal, but we lack the data for the era when it makes a big difference. It would also be nice to have similar data for pitchers so we can better understand how theories like DIPS apply to historical baseball. —Colin Wyers

Very, very cool article.
Yeah, very interesting reading.
Put my vote in for genetic data on healing factor. Knowing who heals faster means more supermen like Pujols.
Good one.
I'd want to know how fast Mickey Mantle really was home to first, because the stories written are just goofy.
Home Run Tracker for MiLB so we can more easily judge power between parks, leagues and levels.
Those home run record assaults and home runs for distance were, IMHO, not the result of PEDs at all at that time but resulted from MLB monkeying with baseball construction, something I believe they did to try to counteract the severe economic effects of the '94 strike, and the owners just sort of fell into being able to use PEDs (and thus, the players) as a scapegoat for the whole thing.
"We would benefit if more was known about the days that pitchers did not get the ball, what they did between starts"

If you knew what Grover Cleveland Alexander did between starts, you would probably puke.

On Hit Tracker, if you had video of the home run and a model of the ballpark set up in Greg R's downloadable spreadsheet, there's nothing to stop you from estimating HR distances from the past. Well, other than MLB making it hard to find those kinds of highlights.
Historical home-to-first times.
I'd like BP to have the kind of detail, flexibility, readability and sortability that baseball-reference does. It's also kind of disjointed with a hit tracker here or a fantasy tracker there. I can't even find platoon splits here either.

Generally, I go to b-ref (or even ESPN) for hitting/pitching/fielding data then do a search on BP for articles.
At least as much statistical data about players in the Negro Leagues as is available for their MLB contemporaries.
Though it would be nice, I'm not sure if it can be done. As a comparison, there isn't as much high school, college, independent league, summer league or minor league data as there is for major leaguers. For example, trying to find pitch count data at anything other than the major league level is very difficult.