January 24, 2012
The Player Popularity Test
Years of talking about baseball have taught me at least two things: it’s dangerous to shout “Francoeur!” in a crowded room, and it’s difficult to gauge a player’s popularity, especially outside of the sabermetric bubble. Unless you work for a club and have access to information on team merchandise and ticket sales (and maybe even if you do), it’s tough to know how high a profile a player has among fans. So how can we decide if a particular player is overrated or underrated, or whether he gets more or less attention than his play on the field might merit? Are we forever doomed to count Google hits?
We can approach this problem in a number of ways. If I were Vince Gennaro, author of Diamond Dollars, I might spend several hundred hours developing a proprietary “marquee value” metric based on social media measurements and other components to assess the off-the-field value of star players. Then I’d write a book and a bunch of articles about it and consult with major-league teams. Well, here’s a blurry picture of me sitting at the same table as Vince Gennaro (also pictured: part of Kevin Goldstein’s fedora). I may look like I have one strangely-shaped eye and don’t trust Cory Schwartz, but do I look like I’m Vince Gennaro? Not particularly. So that’s not what I did.
It’s possible that the answer to this question, along with many others, can be found in the place where baseball information is most likely to be: Baseball Reference. As you’ve probably noticed, every player page on B-Ref can be sponsored for a certain price. If you pay that price, you get your name on the internet and the satisfaction of having supported Baseball Reference. You also get to post a link to your website or a defensive message explaining why you made a questionable life choice like sponsoring Ramiro Pena.*
*I don’t know what Howard Megdal said about Ramiro Pena, but I’m guessing he was right. Just five more months, and this page could be yours!
Here’s how these sponsorships help us: the more hits a page gets, the higher its price. If we know how much a player’s sponsorship costs, we have some idea of how popular his page is, at least among the sizeable subset of baseball fans who look up stats online. With that in mind, I asked Sports Reference czar Sean Forman for a list of the sponsorship prices for all active players, which he was kind enough to provide.
It’s easy enough to tell whose pages draw the most traffic. Would you believe Albert Pujols, Derek Jeter, and Alex Rodriguez? Yeah, you probably would. What I’d really like to know is whose pages cost much less or more than one would expect, given how good at baseball they’ve been. Which players have put up numbers without anyone noticing? And which have drawn the most attention without producing on the field?
I could have eyeballed the list and picked out some prices that seemed strange. (Ian Stewart costs the same as Zack Greinke?) But this is BP, so instead, I decided to do something quasi-scientific. Armed with the information from Sean, I ran a regression to determine the relationship between page price and WARP.* Presumably, both what a player has done and what a player has done lately affect how often people look at his page, so I included both career WARP and a weighted average of WARP over the past three seasons as variables. (As it turned out, career WARP was more predictive of page price, with a 0.76 correlation between the two.) Of course, there are plenty of important variables I didn’t include—market size, whether a player has been signed, traded, or arrested recently, whether he has a memorable nickname—but WARP gets us most of the way there.
*Everything sounds more scientific if you say you “ran a regression.” Additional science points are awarded for saying “my model” instead of “this half-assed thing I did in Excel.”
Once I ran the regression (a complex process that involved clicking “Data-->Data Analysis-->Regression-->OK”), I had an equation I could apply to each player’s career and weighted three-year WARP to get a predicted page price. (If you’re curious, the two WARP figures together explained 63 percent of the variation in page price.) For some players, the predicted price and actual price matched perfectly. For others, they came close. The model valued the most expensive player, Albert Pujols, at $3600 ($3602, to be precise, but the page prices increase in $5 increments). The actual cost of his sponsorship is $3560, which translates to something like four million page views if this outdated explanation of the sponsorship system is still accurate. By the way, if you have $3560 to burn (and who doesn’t, amirite?), Pujols’ page is available.*
*For now, anyway. Last year, the Emir of Kuwait agreed to give every Kuwaiti citizen $3560 to celebrate the 50th anniversary of the country’s independence. So far, not a single citizen of Kuwait has blown the whole handout on an Albert Pujols B-Ref page sponsorship, but you have to figure it’s coming.
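For readers who would rather not click through Excel menus, the regression described above can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not the spreadsheet actually used: the toy numbers are made up, the function names (`fit_price_model`, `predict_price`, `r_squared`, `round_to_increment`) are hypothetical, and NumPy’s least-squares solver stands in for Excel’s Regression tool.

```python
import numpy as np

# Made-up toy inputs for illustration only; these are NOT the real
# Baseball Reference prices or real WARP totals.
career_warp = np.array([80.0, 60.0, 25.0, 10.0, 2.0, -1.0])
recent_warp = np.array([7.0, 3.0, 4.0, 1.5, 0.5, -0.2])  # weighted three-year WARP
actual_price = np.array([3000.0, 1500.0, 600.0, 200.0, 50.0, 20.0])


def design_matrix(career, recent):
    """Columns: intercept, career WARP, weighted three-year WARP."""
    return np.column_stack([np.ones(len(career)), career, recent])


def fit_price_model(career, recent, price):
    """Ordinary least squares fit of page price on the two WARP figures."""
    coefs, *_ = np.linalg.lstsq(design_matrix(career, recent), price, rcond=None)
    return coefs


def predict_price(coefs, career, recent):
    """Apply the fitted equation to get a predicted page price per player."""
    return design_matrix(career, recent) @ coefs


def r_squared(actual, predicted):
    """Share of the variation in price explained by the fitted model."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - np.mean(actual)) ** 2)
    return 1.0 - ss_res / ss_tot


def round_to_increment(values, step=5.0):
    """Round to the nearest price increment ($5 in the article)."""
    return step * np.round(np.asarray(values) / step)


coefs = fit_price_model(career_warp, recent_warp, actual_price)
predicted = predict_price(coefs, career_warp, recent_warp)
print("R^2 on toy data:", r_squared(actual_price, predicted))
print("Rounded predictions:", round_to_increment(predicted))
```

With real inputs, `r_squared` would correspond to the “63 percent of the variation” figure, and `round_to_increment(3602)` turns the model’s $3602 for Pujols into the $3600 quoted above.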
On a $/WARP basis, Juan Castro for $45 seems like the worst deal you can get. Castro has a career WARP of -7.5 and a three-year weighted WARP of -0.5, which means that according to my simple model, Sean should have to pay you $257.13 to sponsor him. (Sean, if you’re reading, I’m willing to make that sacrifice.)* But since the lowest possible price for an active player appears to be $10, I set that as the minimum.
*If $45 for Juan Castro’s Baseball Reference page sounds like a lot, what would you say to $7.9 million for Castro himself? That’s how much major-league teams paid him during his 17-year career. Sure, he played a lot of positions and probably did all the little things, like advancing runners and capturing, spaying, and neutering stray ballpark cats after batting practice. He also averaged only 64 games and 168 plate appearances per season, so clearly even his own teams did their best not to play him. Still, with a career .207 TAv and below-average fielding and baserunning stats, he must have been several wins above replacement clubhouse presence to stick around that long. He’s currently paying off his debt to the Dodgers as a special assistant to the GM.
Here are the players whose page prices are lowest as a percentage (shown in parentheses) of what we’d expect given their production. In addition to actual and predicted price, I’ve included actual and predicted “POP+,” a very silly stat that works like ERA+ or OPS+, but for popularity instead of performance. (If you want to be a stickler about it, page views don’t equal popularity, as some visitors to Barry Bonds’ page could probably attest. So maybe POP+ isn’t the most accurate term. On the plus side, it’s alliterative, so I’m sticking with it.)
Elmer Dessens (Actual Price: $30 (6.3%), Predicted Price: $475) (Actual POP+: 28, Predicted POP+: 208) I’m not the first one to notice that Dessens might not be getting his due. This author nicknamed him “Unknown” because “he will work in just about any pitching capacity the Mets put him in, and yet he always flies under the radar.” Dessens is now so under the radar that he last pitched in Mexico, and his nicknamer has stopped writing about the Mets and started writing about more cheerful topics, like which players have recently died.
Doug Davis (Actual Price: $50 (7.2%), Predicted Price: $700) (Actual POP+: 47, Predicted POP+: 305) Davis spent time in three organizations last season. You’d think all that moving around might have led to a few more B-Ref lookups, but evidently the internet didn’t want to see much more of him than the Brewers, Cubs, or White Sox did.
Scott Rolen (Actual Price: $220 (8.7%), Predicted Price: $2530) (Actual POP+: 207, Predicted POP+: 1108) Hey, I like looking at Inge’s low batting averages as much as the next guy. I’m guilty of visiting Jonathan Sanchez’s page to make sure I didn’t imagine his walk rate. And who doesn’t enjoy ogling Adam Jones’ league-leading 2011 sac fly total? But Rolen is a potential Hall of Famer, or at least I thought he was until I realized it costs more to sponsor Kelly Shoppach’s page than his. Rolen would have more WARP than Barry Larkin if he retired today, but something tells me he’ll need a lot more than three ballots to get the call to Cooperstown. Hall of Fame voters have been hard on third basemen, and Rolen might end up as the spurned successor to Ron Santo, especially if Brandon Inge hits the ballot at the same time and steals the spotlight.
Melvin Mora (Actual Price: $110 (9.0%), Predicted Price: $1220) (Actual POP+: 103, Predicted POP+: 535)
Saul Rivera (Actual Price: $15 (10.3%), Predicted Price: $145) (Actual POP+: 14, Predicted POP+: 64)
Brett Tomko (Actual Price: $60 (10.4%), Predicted Price: $575) (Actual POP+: 56, Predicted POP+: 252)
Mark Hendrickson (Actual Price: $45 (11.0%), Predicted Price: $410) (Actual POP+: 42, Predicted POP+: 179)
Julio Lugo (Actual Price: $80 (11.5%), Predicted Price: $695) (Actual POP+: 75, Predicted POP+: 304)
Ronnie Belliard (Actual Price: $50 (11.6%), Predicted Price: $430) (Actual POP+: 47, Predicted POP+: 189)
Livan Hernandez (Actual Price: $170 (11.6%), Predicted Price: $1470) (Actual POP+: 160, Predicted POP+: 643)
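The numbers attached to each name above come from simple arithmetic, which can be sketched as follows. The function names are hypothetical, and the POP+ formula is an assumption: the article says only that it “works like ERA+ or OPS+,” so this sketch scales a price so that the average page scores 100. The average price itself is not given in the article, so it is left as a parameter.

```python
def apply_floor(predicted_price, floor=10.0):
    """Clamp a predicted price to the lowest price seen for an active player."""
    return max(predicted_price, floor)


def pct_of_predicted(actual_price, predicted_price):
    """Actual price as a percentage of the model's predicted price."""
    return 100.0 * actual_price / predicted_price


def pop_plus(price, average_price):
    """ASSUMED formula: 100 means an average-priced page, like ERA+ or OPS+."""
    return 100.0 * price / average_price


# Juan Castro: the raw model output is negative, so the $10 floor applies.
print(apply_floor(-257.13))                # 10.0

# Elmer Dessens: $30 actual against $475 predicted.
print(round(pct_of_predicted(30, 475), 1))  # 6.3
```

A page priced at twice the average would score 200 under this assumed formula, and a page at half the average would score 50.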
Other notable players whose sponsorship prices are no more than 15 percent of their predicted values include Mike Cameron, Bobby Abreu, and Javier Vazquez, a trio of players whose talents have sometimes gone unappreciated outside of the sabermetric community and inside the state of New York. The highest price in this 15-percent-and-under group belongs to Miguel Tejada at $275, who just beats out Nick Punto at $270, ties Brett Gardner at $275, and is left looking up at Ivan Nova and Freddy Garcia at $280. Lesson learned: there are a lot of Red Sox and Yankees fans on the internet.
You know who else is unpopular? Pitchers. Pitchers, as a group, have a 75 POP+. Only catchers are less popular among position players, which sort of makes sense unless you’re the type who likes to look up passed balls. Help me out here: Are pitchers inherently less enjoyable to watch? Do we identify with them less closely because for most of us, standing on a mound only reminds us of what we can’t do, while being in a batting cage can be more fun with less talent? Do we look them up less often because they appear in fewer games? Do we just like looking at big offensive numbers? The most popular positions are the corners and DH, which tends to suggest that we all dig the long ball.
Before we conclude, let’s take a look at the players whose high sponsorship prices don’t match their production. All but one of these guys received the minimum predicted price and POP+:
Anthony Rizzo (Actual Price: $235 (2350%), Predicted Price: $10) (Actual POP+: 221, Predicted POP+: 4) My model is perplexed that a guy with a single season and a negative WARP under his belt would cost as much to sponsor as Jon Lester and Brandon Phillips. Little does my model know that Rizzo is a promising prospect who was recently traded. More importantly, it doesn’t know that Jed Hoyer set Rizzo’s B-Ref page as his homepage so he could celebrate having acquired him or fantasize about acquiring him again every time he opens his browser. And general managers probably open their browsers a lot.
Jeff Mathis (Actual Price: $150 (1500%), Predicted Price: $10) (Actual POP+: 141, Predicted POP+: 4) We’ve created a monster, and it’s not named Jeff Mathis or Mike Scioscia. By writing article after article about how much Scioscia overvalued Mathis—no matter how amusing or insightful most of them are—we’ve collectively done exactly what we derided: made Mathis seem more important than he actually is. At some point this winter, Gregg Zaun speculated on Twitter that Mathis would break out offensively as a Blue Jay, now that he’s finally free of Scioscia’s directives to focus on his defense. If that happens, we can keep blowing up the Mathis sponsorship bubble. Until then, though, it might be time for a Mathis moratorium.
Bryan LaHair (Actual Price: $130 (1300%), Predicted Price: $10) (Actual POP+: 122, Predicted POP+: 4) Look, I wouldn’t blame Bryan LaHair if he spent a few days after the Rizzo trade refreshing his B-Ref page over and over and hoping to see a full 2012 season appear beneath those 69 plate appearances from 2011. I don’t think anyone would. And I’m not saying that’s definitely how this happened. Just that if it were, I wouldn’t blame him. Let’s move on.
John McDonald (Actual Price: $125 (1250%), Predicted Price: $10) (Actual POP+: 117, Predicted POP+: 4)
Brandon Wood (Actual Price: $100 (1000%), Predicted Price: $10) (Actual POP+: 94, Predicted POP+: 4) We might not like to admit it, but most of us love to rubberneck. Brandon Wood’s Baseball Reference page is the ocular equivalent of a 10-car pileup. Want to see what a zero OPS+ or a career average under .200 looks like? They’re only a click away. Go on, you know you want to.
Greg Halman (Actual Price: $95 (950%), Predicted Price: $10) (Actual POP+: 89, Predicted POP+: 4) Sometimes baseball experiences tragedies much more serious than a prospect who doesn’t pan out. Halman was killed last November. Naturally, the news made him better-known in death than he had been in life. Unfortunately, we’ll never know whether his play would have sent his page price even higher.
Jo-Jo Reyes (Actual Price: $95 (950%), Predicted Price: $10) (Actual POP+: 89, Predicted POP+: 4) The last few years haven’t been kind to Reyes. On the plus side, it costs as much to sponsor him as it does to sponsor Brandon Beachy and Jonny Venters, who were much more successful. If he’d won a little more often, that might not be true.
John Bowker (Actual Price: $95 (950%), Predicted Price: $10) (Actual POP+: 89, Predicted POP+: 4) Bowker’s inflated page price reeks of desperate Giants and Pirates fans searching for a source of offense. Either that, or the NPB team that acquired him does most of its research via B-Ref.
Chris Davis (Actual Price: $120 (936%), Predicted Price: $15) (Actual POP+: 113, Predicted POP+: 6) Those 48 games in the PCL were impressive, but if Josh Johnson and Shin-Soo Choo cost $120, Chris Davis has no business being $120, too. Then again, Johnson and Choo might be even more surprisingly priced at that level. When you play in a small market and miss most of the last season, these things happen.
Tyler Chatwood (Actual Price: $90 (900%), Predicted Price: $10) (Actual POP+: 85, Predicted POP+: 4)
Also on the overpopular side: Armando Galarraga (sort of pitched a perfect game), Eugenio Velez (went 0-for-2011), Charlie Morton (briefly became Roy Halladay), Andy Marte (more rubbernecking), and Rob Johnson (I gave up trying to explain him for New Year’s).
We don’t know how any of these page prices translate to the costs teams care about. But at least we have one new way to settle an old debate.