Category: Baseball

Great Expectations: the Roberto Aguayo story

By Jack Graham

Among other things, the Tampa Bay Buccaneers have been plagued by the kicking struggles of rookie Roberto Aguayo throughout the 2016 season. Through 9 games, Aguayo has made only 9 of 14 field goals, a rate of 64% that qualifies him as the least accurate kicker in the league. Of course, these lackluster numbers would not typically be grounds for an interesting story, except that Aguayo also happened to be the Buccaneers’ second-round draft pick. While selecting a kicker so early in the draft is not unprecedented (the Oakland Raiders drafted Sebastian Janikowski in the first round in 2000), it is incredibly rare. It is not controversial to say that Aguayo has not met the lofty expectations imposed on him by his draft status, but it is also natural to wonder: how well would Aguayo have to perform in order to justify such a high draft pick? And then, based on his college performance, was it reasonable for the Bucs to expect him to meet this standard?

Roberto Aguayo

A Historical Review of ESPN Power Rankings

By Ben Ulene

It’s September again, and with the major wintertime sports starting their 2016-2017 regular seasons, sports fans across the U.S. get to participate in the annual tradition of poring through expert predictions – including various outlets’ preseason power rankings.

While Vegas odds and other betting markets offer a general take on how various teams may stack up, there is something satisfying about reading power rankings reports. With written blurbs for teams that function as justifications for their rankings, sites like ESPN and Bleacher Report impart a qualitative aspect to the numbers that may mirror, or perhaps even spark, debates among fans across the country. And like professional odds-makers, many sporting news outlets update rankings as the season goes along – bookending the year with a final set of rankings leading into the playoffs – which provides a mechanism for determining how “predictable” any given regular season is.

For ease of use, I analyze ESPN’s Power Rankings in this article. With rankings from the first to last week of the regular season across all four major U.S. sports, ESPN’s rankings let us perform cross-sport comparisons to test the accuracy of predictions by different sets of “experts,” as well as dive deeper into individual sports to determine the predictability of different teams and seasons.

 

Cross-Sport Analysis:

Upon analyzing the past six years of rankings across MLB, NBA, NFL, and NHL, one thing is apparent: The NBA is consistently the most predictable league, and by far. As a few scatterplots show, in the NBA, Week 1 Power Rankings are much more predictive of regular season success than in any other sport:

Here, a straight diagonal line would represent perfect prediction accuracy for every team; while the NBA plot is far from perfect, it still seems to be noticeably less random than the other plots.

In fact, as a two-sided t-test shows[1], the inter-league difference in predictability (measured by the average absolute difference between each team’s beginning ranking and its final ranking) is statistically significant between the NBA and every other league, but insignificant among the other three. In other words, the predictabilities of MLB, NFL, and NHL regular seasons cannot be shown to differ from one another, but the NBA is more predictable than all three:

League 1   Mean Difference   League 2   Mean Difference   p-value
NBA        4.33              MLB        6.94              1.151e-07
NBA        4.33              NHL        6.54              6.178e-06
NBA        4.33              NFL        7.52              3.242e-09
MLB        6.94              NFL        7.52              0.3327
MLB        6.94              NHL        6.54              0.4735
NFL        7.52              NHL        6.54              0.101
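
The comparison in the table can be sketched in a few lines of Python. This is only an illustration, not the code behind the numbers above: the rank lists are made-up placeholders, the function names are hypothetical, and it assumes Welch's unequal-variance form of the t-test with a normal approximation to the p-value (reasonable for samples as large as n=180, but possibly different from the exact test the author ran).

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def abs_rank_changes(first_week, last_week):
    """Absolute difference between each team's first and last ranking."""
    return [abs(a - b) for a, b in zip(first_week, last_week)]


def welch_t_test(sample_a, sample_b):
    """Two-sided Welch t-test; returns (t statistic, approximate p-value).

    The p-value uses a normal approximation to the t distribution,
    which is only reasonable when both samples are large.
    """
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    t = (mean(sample_a) - mean(sample_b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Placeholder data (NOT ESPN's rankings): six teams per league.
league_a = abs_rank_changes([1, 2, 3, 4, 5, 6], [2, 1, 3, 5, 4, 6])
league_b = abs_rank_changes([1, 2, 3, 4, 5, 6], [5, 6, 1, 2, 4, 3])
t_stat, p_value = welch_t_test(league_a, league_b)
print(f"mean A = {mean(league_a):.2f}, mean B = {mean(league_b):.2f}")
print(f"t = {t_stat:.2f}, approximate p = {p_value:.4f}")
```

With real data, each sample would hold one absolute difference per team per season, which is how the sample sizes of 180 and 192 in the footnote arise.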

 

Why then – besides the unlikely explanation that ESPN’s NBA analysts are that much better than their analysts for other sports – is there such a drastic difference in ranking accuracy? Season length is what first comes to mind, but upon closer inspection cannot be the primary cause; not only is the NBA season easier to predict than the equally-long NHL season, but there is a noticeable lack of statistical difference between predicting the MLB (162 games) and NFL (16 games) seasons.

There are, however, a few possible explanations:

1) Fewer, more impactful players: The NBA mandates that teams carry 14 players at any given time, as opposed to the NHL’s 20, MLB’s 25, and the NFL’s 53, giving star players – who are usually apparent at the beginning of the season – more of an impact on results.

This is magnified by the nature of the game: Only in the NBA do the most impactful players consistently play for more than three quarters of the game, making it easier for talented teams to rise to the top. Starting pitchers in baseball typically play only once every five games; hockey superstars may see only 20 minutes on the ice in a game. Even in the NFL, elite quarterbacks are only on the field for offensive snaps, and the comparatively high injury rate makes it difficult to predict even the best teams.

2) Higher-scoring games: The high number of possessions in NBA games, compared to the other three sports, may mean that there is less of a risk that any given NBA game is determined by chance. NBA teams possess the ball over 90 times on average in any given game[2]; even in baseball, teams will rarely send up more than 40 batters in a game[3]. Therefore, skill differences have more opportunities to manifest themselves in basketball, while mistakes can have a larger impact in the other big sports with fewer possessions.

3) Injuries: Since hockey and football are high-contact sports, teams in the NHL and NFL are much more susceptible to being gutted by injuries midseason than NBA teams. Even in baseball, a non-contact sport, pitchers – perhaps the most important players on their teams – are highly susceptible to season-ending arm injuries that can tank a team’s season.

 

Seasonal Analysis:

Taking the league-based differences into account, it is no surprise that six of the seven most predictable seasons in our sample are in the NBA – with the 2013 NBA season clocking in with a tiny 3.47 average difference between beginning and end rankings.

Additionally, the NBA is unique in its consistency: while other leagues like the NHL vary wildly from season to season in predictability, the NBA has stayed relatively constant. The NBA has been the most predictable league in each of the past six years, while no league has been the least predictable in back-to-back years: the NFL was least predictable in 2010 and 2013, MLB in 2011 and 2015, and the NHL in 2012 and 2014.

 

Team Analysis:

Just as interesting are the numbers for different teams across sports. Not surprisingly, the Miami Heat was the most predictable team in the country over the past six years – buoyed by four years of their LeBron-powered “Big Three.” But the top of the list is not just dominated by consistently good teams – bottom-dwellers like the Philadelphia 76ers, the Edmonton Oilers, and the Houston Astros (bad until 2015) also lead in predictability. Perennially unpredictable teams like the Minnesota Vikings and Boston Red Sox dominate the bottom of the list, as well:

[Figure: predictability by team]

To conclude, what can this data tell us about predictability in American sports as a whole? For a large majority of teams across the big four U.S. sports, not a ton – moving four spots up or down in rankings can be the difference between making the playoffs and missing out. And assuming the randomness in professional sports doesn’t change anytime soon, it is probably safe to say that expert accuracy will continue along this trend for years to come.

[1] I compared the absolute value of first week–last week differences, with n=180 for every league except the NFL, which had n=192 (since the NFL has 32 teams).

[2] http://www.basketball-reference.com/leagues/NBA_stats.html

[3] https://www.teamrankings.com/mlb/stat/at-bats-per-game

 

Dear Rob Manfred, The Millennials Are Leaving

By Max Kaplan, “The voice of the millennial sports fan”

We millennials are losing interest and it’s not our fault.

We can’t sit through another 4-hour MLB game with 11 pitching changes and 15 walks.

We groan every time a batter steps out of the box to re-adjust his batting gloves for the third time since the last pitch. Or when the pitcher starts pacing around the mound, fondling the rosin bag.

We think “get on with it” when the manager takes a full minute to decide whether to challenge a play and then challenges it – and the fans are gifted another 3-minute stoppage.

It’s not the 10-9 slugfest that’s the problem – it’s the 3-2 game that takes 3.5 hours where nothing happens.

In a game earlier this month, “Make Baseball Fun Again” Bryce Harper faced 27 pitches and didn’t swing at a single one. Great…

The baseball establishment mocks and shames the millennials for not watching the game the “right way.” They patronize our short attention spans and our “addiction” to social media. They say we don’t “respect” the game’s tradition.

I played baseball ’til high school and have attended over 200 MLB games across 23 different stadiums in my life.

Rob, you need me and my friends – maybe not this year, but we are your future revenue stream – and I’m telling you, it ain’t looking good.

My observed reality: college students would rather watch the English Premier League (or literally any other sport) than an unwatchable baseball game on TV.

We think baseball is getting more boring and guess what? We’re right.

Baseball Boredom Index (BBI)

Everyone knows that MLB games are getting longer and longer. But there is also way less stuff happening.

I created a new statistic, called the “Baseball Boredom Index.” Or BBI for short. It is extremely easy to understand. The BBI is how many minutes you have to wait, on average, until something happens in a baseball game.

Let’s say an “action event” is a ball in play, or a stolen base attempt. This is a low bar for excitement. It includes sacrifice bunts, dribblers to 1B, and pop-outs to SS.

How long do you have to wait between these action events? Over three minutes! That’s a full commercial break between every single moment of ‘action.’
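
As a rough sketch of the arithmetic, the BBI can be computed as below. The article does not spell out the exact formula, so this assumes it is simply total game time divided by the number of action events; the function name and the sample game are hypothetical.

```python
def baseball_boredom_index(game_minutes, balls_in_play, steal_attempts):
    """Average minutes of waiting per action event.

    An action event is a ball in play or a stolen base attempt.
    """
    action_events = balls_in_play + steal_attempts
    if action_events == 0:
        raise ValueError("no action events recorded")
    return game_minutes / action_events


# Hypothetical game (not real data): 3 hours, 52 balls in play, 3 steal attempts.
bbi = baseball_boredom_index(game_minutes=180, balls_in_play=52, steal_attempts=3)
print(f"BBI = {bbi:.2f} minutes per action event")
```

Under this definition a league-wide BBI would use season totals for minutes and events rather than a single game.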

And it has trended up ever since the dawn of the game. The last three seasons have been the slowest in MLB history. As teams incorporate sabermetrics, we are seeing record-level strikeout totals, leaving fewer balls in play and more pitches per game.

Mr. Manfred, I leave you with a bold challenge. The gauntlet has been thrown. Bring us back to 2.5 BBI. 1985 is not that long ago.

The pace of play changes in 2015 led to slightly shorter games, and a lower Baseball Boredom Index. The pitch clock experiment in the Minor Leagues proves we can cut another 10-15 minutes from time of game. It’s a start, but not enough to keep our attention. Please hurry!

Rob, the writing is on the wall. You will lose the attention of the millennials (and everyone else) unless more progress is made. In fact, I just got three text messages, a snap, six tweets, four fb notifications since you started reading this so this article is now over. Bye.

Max Kaplan, “The voice of the millennial Sports Fan”, is a graduating senior at Princeton University Engineering School majoring in Operations Research and Financial Engineering. Max’s “Curse of the Home Run Derby” article hit the front page of Yahoo.com in 2011. He has appeared on NFL.com and NFL Network. His favorite sport used to be baseball.

Ode to the Great Bambino

How the Best of the Best Performed Relative to Their Time Period

By Keith Gladstone

Only the best players of a given era are inducted into the National Baseball Hall of Fame in Cooperstown, from classic names like Babe Ruth and Lou Gehrig to the most recently elected, Mike Piazza and Ken Griffey Jr. Since the MLB era tainted by PEDs saw unthinkable, sky-high hitting totals, the question of who deserves a seat in the Hall of Fame is open for debate. The great differences between eras alone can distort our interpretation of the game’s statistics, so in this article I will introduce a method of comparison.

I did an analysis to normalize the career HR totals of all Hall of Famers based on their historical era. Babe Ruth held the career home run record at 714 upon retiring in 1935. Hank Aaron shattered the record almost 40 years later, but what does this actually mean? Was Hank Aaron better than Babe Ruth?

I calculated a new statistic to measure a player’s HR performance relative to the era in which they played. I call it the “Home Runs to Benchmark Ratio.”

HR to Benchmark Ratio = Annual Career HR Average / HR Era Benchmark

  • A ratio of 1 means the player was an average home run hitter in his own era.
  • A ratio of 2 means the player hit twice as many HR as the average player.
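
The ratio defined above can be sketched in code as follows. The function name is hypothetical, and while the 714 career total comes from the article, the season count and the era benchmark passed in here are illustrative assumptions rather than the values used in the actual analysis.

```python
def hr_to_benchmark_ratio(career_hr, seasons_played, era_benchmark):
    """Annual career HR average divided by the HR benchmark for the era."""
    annual_average = career_hr / seasons_played
    return annual_average / era_benchmark


# Career total from the article; seasons and benchmark are assumed for illustration.
ratio = hr_to_benchmark_ratio(career_hr=714, seasons_played=22, era_benchmark=5.0)
print(f"HR to Benchmark Ratio: {ratio:.2f}")
```

A player who averages exactly the benchmark gets a ratio of 1, matching the interpretation in the bullets above.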

Pitching dominated the game in the “Dead Ball Era,” which ended upon the emergence of Babe Ruth and the Bronx Bombers in the 1920s. 714 HR in an era when the average player hit only 100 HR in a career underscores how impressive Ruth’s prowess was.

The Home Run to Benchmark Ratio rankings below confirm this, with Babe Ruth miles above the rest, followed by other classic Yankee heroes Lou Gehrig and Joe DiMaggio. Stunningly, Hank Aaron does not even crack the top ten. His ratio is 2.48, leaving him 26th overall. The HR performances of The Great Bambino, The Iron Horse, and The Yankee Clipper relative to their contemporaries show just how incredible they must have been to watch.

MLB HOF All-time HR Rankings – Normalized by Era

Rank  MidYear  Name              Career HR  HR to Benchmark Ratio
1     1924     Babe Ruth         714        7.13
2     1931     Lou Gehrig        493        4.76
3     1944     Joe DiMaggio      361        4.45
25    1964     Harmon Killebrew  573        2.49
26    1965     Hank Aaron        755        2.48
27    1950     Ted Williams      521        2.43
28    1935     Earl Averill      238        2.37
29    1960     Mickey Mantle     536        2.33
30    1975     Johnny Bench      389        2.31

 

Below is a graph of career HR per game against the average HR per game in that era. Players who appear above the line, toward the top left, have higher ratios. Babe Ruth is the top-left point.

[Figure: career HR per game versus era-average HR per game]

Appendix

The following assumptions were made for data collection and analysis:

  • Player performance is symmetrical over time with a peak in the middle of the player’s career
  • League averages are decent estimates of the “benchmark” over which a player could measure
  • This analysis will consider the modern era (Hall of Famers whose careers occurred mostly after 1900) and those with career batting averages above 0.250
  • Since Hall of Famers had relatively long careers, their statistics are reliable estimates of their abilities

Using the “middle year” as a barometer for a player’s peak

Since the number of players in this dataset is so large, we need a simplified way to capture a player’s top-performing year. For this analysis, we can take the player’s career totals and divide by the number of years played to get a yearly average for the player, and measure this average against the benchmark for the year (selected as the middle year of the player’s career). While this analysis is therefore not perfectly rigorous, it still serves as a useful method for comparing players from different eras. Put another way, the performance benchmark in 1995 should be similar enough to 1997, and the benchmarks in the 1990s are different enough from those in the 1920s that a benchmark a few years off wouldn’t be a significant issue.
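
A minimal sketch of the middle-year lookup described above might look like this. The function names and the league averages are hypothetical, and rounding the midpoint down to a whole season is an assumption; the article does not say how half-years are handled.

```python
def middle_year(first_season, last_season):
    """Middle year of a career, rounded down to a whole season (assumed)."""
    return (first_season + last_season) // 2


def benchmark_for_player(first_season, last_season, league_hr_by_year):
    """Look up the league benchmark for the player's middle year."""
    return league_hr_by_year[middle_year(first_season, last_season)]


# Hypothetical league averages (HR per player-season); not real data.
league_hr_by_year = {1923: 4.1, 1924: 4.3, 1925: 4.6}

# A career spanning 1914 through 1935.
mid = middle_year(1914, 1935)
print(mid, benchmark_for_player(1914, 1935, league_hr_by_year))
```

Because neighboring years have similar benchmarks, the choice of rounding rule should matter little, which is the point the paragraph above makes.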

Data Sources

The Mets’ World Series offensive collapse was inevitable

By Ben Ulene

After this year’s World Series ended in a Game 5 comeback win for the Royals, plenty of questions remain about what caused the Mets – who almost nobody [1] predicted would go home after just five games – to lose so quickly. While sloppy defense certainly contributed to their collapse, an even bigger liability was their offense, which only managed a meager 7 extra-base hits in the series.

Should we be surprised that the same team that had excelled at the plate during the NLCS, putting up 21 runs in a four-game sweep of the Cubs [2], could only manage 10 runs over their four losses to Kansas City? Probably not; as the statistics show, the Mets not only came into the World Series with a historically weak offense, but they also were up against a Kansas City bullpen that dominated games like perhaps no other bullpen before.

2015 Mets Offense
Statistic  Value  All-Time Rank (out of 202 World Series teams since 1914)
BA         .244   200th
SO         1290   201st
R / Game   4.22   177th
OPS+       97     181st

 

First, the Mets’ offense, for a pennant-winning team, had been weak throughout the regular season. The team’s .244 regular season batting average was the fourth-worst of any World Series team since 1914; on top of that, their 1,290 regular season strikeouts were more than any other pennant-winner aside from the 2013 Red Sox (who more than compensated with a .277 regular season team average).

The Mets’ regular season mark of 4.22 runs per game was also the third-lowest of any World Series team in the last twenty years, and the only two teams to score fewer played each other (the 2014 Royals and Giants).

Perhaps most strikingly, the team’s OPS+ for the season – a statistic that measures a team’s OPS (on-base percentage + slugging percentage) relative to the rest of the league, with 100 being the league average – was 97, putting it below average in the big leagues this year. Only 23 other teams have ever made it to the World Series with an OPS+ of 97 or lower; of those, only 9 managed to win the series, and none since the 1997 Florida Marlins.

All in all, this was not an offense that anybody should have expected to put up huge numbers against any pitching staff in the World Series.

2015 Royals Bullpen
Statistic  Value    All-Time Rank (out of 202 World Series teams since 1914)
Innings    539 2/3  1st
ERA        2.72     22nd
BAA        .214     8th
K/BB       2.63     6th
WHIP       1.13     12th
tOPS+      78       4th
sOPS+      80       29th

 

The Mets weren’t just facing any ordinary pitching unit in the World Series, however, but rather one with a historically dominant bullpen for a World Series team.

Not only did the Royals’ bullpen hold opposing batters to a .214 average during the regular season, the 8th lowest for any pennant-winning club, but it also posted a 2.63 strikeout-to-walk ratio, the 6th best regular season mark for a World Series team. The bullpen also maintained a 2.72 ERA during the regular season, the lowest for any World Series team since the 1990 Oakland A’s.

More complex statistics also reflect the dominance of the Royals’ bullpen. Its tOPS+ against – which reflects opposing hitters’ OPS relative to how they hit against starting pitching – was 78 (the 4th lowest for a World Series bullpen), making the Royals’ bullpen one of the best all-time at shutting down opposing offenses mid-game. And the bullpen’s sOPS+ against – which reflects opposing hitters’ OPS relative to the average OPS of hitters across the league – was 80, highlighting the bullpen’s excellence at shutting down hitters entirely.

While all of these numbers are impressive, what will go in the history books is how manager Ned Yost used his bullpen, which was a lot. The Royals’ bullpen pitched 539 2/3 innings this season, more than that of any other pennant-winning team in history. It’s not surprising that winning teams generally pitch their bullpens less than average, since more bullpen innings generally signify bad starting pitching; in the Royals’ case, however, their bullpen was just really effective.

During the World Series, Royals relievers pitched 23 2/3 innings, compared to their starters’ 28 1/3. Take away Franklin Morales’s 6th inning implosion in Game 3, and the numbers are staggering: 1 run and 14 hits in just over 23 innings (an ERA of 0.39), with 4 walks and 30 strikeouts. And given just how dominant those relievers had been all year – and how susceptible to offensive slumps the Mets had been – the Royals’ dominant and decisive showing might just have been a foregone conclusion.
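
For readers who want to check the arithmetic, the ERA and strikeout-to-walk figures can be reproduced with a short sketch. ERA is earned runs allowed per nine innings; the helper names are hypothetical, and using exactly 23 innings for the "just over 23 innings" figure is an assumption.

```python
from fractions import Fraction


def parse_innings(text):
    """Convert baseball innings notation such as '23 2/3' to a Fraction."""
    return sum(Fraction(part) for part in text.split())


def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return float(9 * earned_runs / innings)


# Total relief innings in the series, as quoted in the article.
relief_innings = parse_innings("23 2/3")
print(f"relief innings: {float(relief_innings):.2f}")

# One run over an assumed 23 innings, plus 30 strikeouts against 4 walks.
print(f"ERA: {era(1, Fraction(23)):.2f}")
print(f"strikeouts per walk: {30 / 4:.1f}")
```

Using fractions keeps thirds of an inning exact, so partial innings do not introduce rounding error.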

[1] http://espn.go.com/mlb/playoffs2015/story/_/page/playoffs15_worldseriespredictions/espn-experts-make-their-world-series-predictions

[2] http://www.baseball-reference.com/postseason/2015_WS.shtml

There is No Place Like Home

By Jeffrey Gleason

Nine weeks into the NFL season, no teams remain unbeaten. This could’ve actually been said after eight weeks, after seven weeks, and after six weeks as well. Week 5 was the last time an unbeaten team remained, when both the Cardinals and Bengals were sitting at 3-0.

However, after these same nine weeks, five teams remain unbeaten at home. The Patriots, Broncos, Eagles, Packers, and Cardinals have yet to lose on their own turf.

Home field advantage is a phenomenon that gets a lot of traction in sports. Experts often use it to justify their predictions and betting lines usually reflect the perceived advantage of the home side. However, people often generalize home field advantage with a “one size fits all” approach, acknowledging its presence, but assuming it displays a constant impact across different situations.

With five unbeaten NFL home teams and the recent impetus of a road team finally winning Game 7 of the World Series (the Giants topped the Royals on October 29th to capture their third championship in five years), I was interested in how home field advantage was quantitatively different in different situations. How does it vary across sports? Do both good teams and bad teams experience the same advantage? Is it magnified in the postseason? What about differences in earlier eras? These are the questions I set out to resolve.

The MLB Division Series Should Be 1,101 Games Long

By Max Kaplan

The baseball playoff system is messed up. It’s a statistician’s worst nightmare. As both an Angels diehard and a statistician, I have descended into despondency.

After six months and 162 games of baseball, a 5-game coin flip decides the fate of the eight playoff teams. The Los Angeles Angels, considered by many to be the best team in baseball and considered by most to be a better team than the Kansas City Royals, were knocked out in only three games after leading the league with 98 regular season wins. That’s three games – the same length as the common regular season sweep.

I’m going to try to “fix” the randomness and unfairness of a short playoff series. And by doing so, I hope to resurrect the Angels 2014 World Series hopes.

How many games would we need in a playoff series to be fairly confident that the better team moves on? According to my calculations below, that number is 1,101.
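
One way to frame the question is sketched below. It is not the author's calculation, which is not shown in this excerpt: it assumes independent games with a fixed single-game win probability for the better team, and the function names, the 55% win probability, and the 95% confidence target are all hypothetical choices. Different inputs give very different series lengths.

```python
from math import exp, lgamma, log


def series_win_probability(p, n_games):
    """Chance the better team wins a best-of-n series (n odd, 0 < p < 1).

    Winning the series is equivalent to winning a majority of n games
    if all n were played, so this sums the upper binomial tail.
    Logarithms avoid overflow for long series.
    """
    q = 1.0 - p
    total = 0.0
    for wins in range(n_games // 2 + 1, n_games + 1):
        log_pmf = (lgamma(n_games + 1) - lgamma(wins + 1)
                   - lgamma(n_games - wins + 1)
                   + wins * log(p) + (n_games - wins) * log(q))
        total += exp(log_pmf)
    return total


def shortest_series(p, confidence, max_games=5001):
    """Smallest odd series length at which the better team advances
    with at least the requested probability, or None if none is found."""
    for n_games in range(1, max_games + 1, 2):
        if series_win_probability(p, n_games) >= confidence:
            return n_games
    return None


# Hypothetical inputs: a 55% single-game favorite and a 95% target.
print(f"best-of-5: {series_win_probability(0.55, 5):.3f}")
print("games needed:", shortest_series(0.55, 0.95))
```

The closer the two teams are in true strength, the longer the series has to be, which is why evenly matched playoff teams push the answer into the hundreds or thousands of games.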

MLB Unveils Field Tracking System at Sloan Sports Analytics Conference

By Patrick Harrel

A few years ago, NBA teams started installing the SportVU system in their stadiums to get proprietary player tracking data and an edge over the competition, a decision that cost them $100,000 a pop. In the run-up to the 2013-14 campaign, the rest of the league caught up, making the tracking system standard and releasing the data to the public. Today at the MIT Sloan Sports Analytics Conference, Major League Baseball released their plan for a counterpart system, unveiling a player tracking system of their own.

This system has been in the pipeline for a while, with a pilot setup being deployed at Citi Field last year. This season, the system will expand to three stadiums, with all 30 MLB ballparks receiving the technology for 2015. Major League Baseball has been making a push to improve their technology in recent years, with PITCHF/x being released to the public years ago, giving us greater access to detailed pitch data.

Quite simply, the system looks beautiful. Check out this sample video the MLB released of Jason Heyward making a game-winning catch against the Mets last year.

Ultimate Zone Rating and Total Zone Rating have advanced the field of defensive statistics, but they have their problems, as they struggle with defensive shifts and do not differentiate between a high fly ball and a more looping strike. The idea with those systems is that over a large sample those variations balance each other out, but this new player tracking system will give teams and fans much more tangible evidence to determine if someone is a quality defender or not.

The biggest question will be how much of this data the MLB will hoard for themselves. PITCHF/x has been available in the public domain for years, so one can hope they will follow their own precedent (and the NBA’s) in releasing the data to the public. The possibilities for meaningful research are simply endless.

Breaking Down the Breakdown: David Murphy

by Jay Hashop

“I’m a big believer in Michael Young. And if the ship sinks, I’ll still be on it.”
– Ron Washington, August 2012

The S.S. Ultimate Professional officially sank on October 5th, 2012, when the Texas Rangers lost the AL Wild Card game to mighty Joe Saunders and the Baltimore Orioles. Coming off a strong 2011 season in which FanGraphs credited Young with 3.5 fWAR (FanGraphs wins above replacement), the Rangers’ super-utility player struggled all season at the plate and in the field, ending the season at -1.6 fWAR as one of the worst everyday players in Major League Baseball. Contributing significantly to his collapse was the complete lack of power Young displayed in 2012, when he posted his lowest season marks in both home runs (8) and isolated slugging (.093) in over a decade. Additionally, Young’s batting average on balls in play (BABIP) dropped to .299 from the .367 he recorded in 2011. The breakdown in Young’s game was so severe that general manager Jon Daniels paid the Phillies 10 million dollars to take Young in exchange for a middle reliever and a bullpen prospect in case Young had permanently lost the ability to play at least replacement-level baseball.

While Young fizzled, his teammate David Murphy sizzled on his way to accumulating more fWAR in 2012 (3.9) than he had from 2009 through 2011 (3.7). Murphy finally appeared to have conquered the left-handed pitching demons that had forced him into a platoon-like role for much of his career, and the Rangers showed confidence in Murphy by naming him the everyday left fielder going into 2013. A .433 BABIP against left-handed pitchers on only 60 balls in play served as cause for concern about steep regression, but Murphy at least appeared to be a sufficient corner outfield option.

Buyer Beware: Two players to avoid in MLB Free Agency

By: Patrick Harrel

MLB free agency is upon us, and with it come players moving teams, crazy contracts, and MLB writers scrambling to get the latest rumors out of team executives. In the coming weeks, teams will start signing players, and as salary figures are tossed out, heads will spin.

Overpaying is sometimes just the cost of doing business in the MLB, a league without a salary cap, but often, that overpaying can be a killer blow to a franchise. In 2006, as the Astros were trying to put together another team that could go deep into the playoffs after reaching the World Series in 2005, they spent $13 million on Woody Williams and $100 million on Carlos Lee. Williams was released in spring training the following year, Lee hamstrung the Astros payroll for the next six seasons, and the Astros bottomed out to be the worst team in baseball for three seasons in a row.

Today, we discuss a pair of veteran free agents that teams should stay away from if they want to avoid the fate the Astros fell victim to in the winter of 2006.