By Gene Li
Nowhere is the concept of the “Big 3” more relevant than in basketball. Because the game is more star-dominated than football, soccer, and most other team sports, NBA games are often determined by the performance of a few players who can deliver offensive firepower. NBA fans often view their team’s success as driven by its top three players. Just last season, we saw the trio of Stephen Curry, Klay Thompson, and Draymond Green of the Golden State Warriors face off against LeBron James, Kyrie Irving, and Kevin Love of the Cleveland Cavaliers. Historic “Big Threes” include the infamous James-Wade-Bosh trio in Miami and the Duncan-Parker-Ginobili Spurs offense that won four championships over 13 years. But just how much of a team’s performance can be attributed to its top three players?
A Historical Study
by Aqeel Phillips
With just a few weeks left in the regular season, some of us are left without much to root for anymore. HEAT fans remain optimistic in the surprisingly competitive battle for the first seed, and Suns, Mavs, and Grizzlies fans are biting their nails short in hopes that their teams can grab a playoff spot. However, a good percentage of us basketball fans now realize we have little left to root for (or if you’re a Sixers fan like me, you realized in about August), and are just waiting to see the final playoff seedings and end-of-season awards before the playoffs get underway. Besides the MVP, one of the most notable awards each year is the Scoring Title. Last season, we were treated to a thrilling ending as the battle for the Scoring Title came down to the wire between Kevin Durant and Carmelo Anthony.
This season, Kevin Durant, aka the Slim Reaper, has made things less interesting, currently scoring 32.2 points per game (PPG) to second-place Melo’s 28.0 PPG. Durant is on pace to be the first player to average 30 points since he did so himself in the 2009-10 season. The NBA has seen a notable drop in scoring lately, a trend that started when hand-checking rules were instituted in the early 2000’s and continued as many teams embraced sharing the ball in order to find open looks, namely threes, rather than relying on singular scorers. Durant’s current season widens eyes at first glance; averaging 4 points more than his closest competitor will do that. But I find that PPG by itself doesn’t tell the full story. Elgin Baylor averaged over 38 points in 1961-62, but that was over 50 years ago in a completely different league. So who had the most impressive season: 2014 Durant? 1962 Baylor? 2006 Kobe? We’ve witnessed plenty of monstrous seasons, and this study examines them in relation to the rest of the league at the time to contextualize the simple PPG marks.
League Scoring Average (Season)
To get a better comparison between scoring performances, we can divide a player’s PPG by their minutes per game (MPG) to see how they’re scoring relative to the opportunities they’re given. This is especially useful in calculating a league-average scoring mark. We don’t want end-of-bench players who average 0.6 PPG to drag down the entire league scoring average, most importantly because they outnumber the talented, 20+ PPG scorers in the league. Dividing PPG by MPG for each player across the league levels the playing field, and also accounts for the possibility that in any given season the league as a whole played significantly more or fewer bench/low-scoring players for whatever reason (for example, in the ‘60s there were far fewer players in the league and more minutes and points to go around).
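As a minimal sketch of the per-minute calculation described above, the snippet below divides PPG by MPG. The PPG figures are the ones quoted earlier in this article; the MPG values are round placeholders chosen only for illustration, not actual season data.

```python
def points_per_minute(ppg: float, mpg: float) -> float:
    """Return points scored per minute played (PPG divided by MPG)."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    return ppg / mpg

# PPG values come from the article; MPG values are illustrative placeholders.
players = {
    "Kevin Durant": {"ppg": 32.2, "mpg": 38.0},
    "Carmelo Anthony": {"ppg": 28.0, "mpg": 39.0},
}

for name, stats in players.items():
    ppm = points_per_minute(stats["ppg"], stats["mpg"])
    print(f"{name}: {ppm:.3f} points per minute")
```

A player who scores more while playing fewer minutes ends up with the higher PPM, which is the effect the comparison is meant to surface.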
For reference, here are the Points Per Minute values for the current league leaders in scoring:
(For those wondering about a full list of the league leaders in PPM, see the appendix)
In terms of points scored per time played, you can see that Durant is not just scoring at an average rate while playing more minutes, he is scoring more efficiently than the players below him on the list (shown by a higher PPM value than his competitors). It’s interesting to note that Melo averages more minutes than Durant, but Durant makes much better use of his time, scoring-wise, than Melo (Durant is also more efficient with his shot attempts – averaging 20.7 field goal attempts per game to Melo’s 21.5). This gives more evidence to Durant’s case for “best scorer in the league” – not only does he have the sheer output, but he also has the efficiency.
Next, we’ll calculate the average PPM value for the entire league, and compare each individual player to that average, to see how much better they score than the average replacement.
Unlike other studies I’ve done, I haven’t artificially subtracted out all of the players that aren’t contributing much (<20 MPG, <30 GP in previous articles), as using PPM should even out all contributions.
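The comparison to a league average can be sketched as follows. The article does not spell out whether the league figure is a simple mean of player PPM values or a minutes-weighted one; this sketch assumes the minutes-weighted version (total points over total minutes), and the three-player "league" is entirely hypothetical.

```python
def league_ppm(players):
    """League-wide points per minute: total points over total minutes.

    players: list of (ppg, mpg) pairs. Summing before dividing weights each
    player by minutes played, so a deep-bench player barely moves the result.
    Per-game figures are used directly, which ignores games played.
    """
    total_points = sum(ppg for ppg, _ in players)
    total_minutes = sum(mpg for _, mpg in players)
    return total_points / total_minutes

def relative_scoring(ppg, mpg, league_avg):
    """Ratio of a player's PPM to the league PPM (1.0 means league average)."""
    return (ppg / mpg) / league_avg

# Hypothetical league: a star, a starter, and an end-of-bench player.
league = [(30.0, 40.0), (12.0, 30.0), (0.6, 3.0)]
avg = league_ppm(league)
print(f"league PPM: {avg:.3f}")
print(f"star vs. league: {relative_scoring(30.0, 40.0, avg):.2f}x")
```

Under this assumption no minutes or games cutoff is needed, because low-minute players carry very little weight in the total.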
By Neil Rangwani
This time of year means a few things in the world of sports: March Madness highlights take over ESPN, baseball stadiums start to fill up, and Knicks fans await their inevitable disappointment.
This NBA season looks remarkably competitive: the top of the league is crowded with legitimate contenders. The defending champion Heat and the Pacers, although sliding a bit recently, look to be the favorites in a weak East, while the Thunder, Clippers, and an extremely hot Spurs team each look like they could win the West.
In order to take a closer look at the playoff picture, we wanted to rank teams according to a metric that took into account various facets of a player’s game, so we decided to calculate a team equivalent of Player Efficiency Rating (PER). We took a relatively simple approach, since PER encompasses a number of basic statistics.
Introducing Weighted Player Efficiency Rating (WPER)
Using data for each player over the past four NBA seasons, we weighted each player’s PER by their playing time as a fraction of their team’s total playing time in order to account for a player’s actual usage. Then, we found each team’s Weighted Player Efficiency Rating (WPER) by summing the values for each player on each team.
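A minimal sketch of the weighting described above, assuming each player's weight is his minutes divided by the sum of minutes across the roster. The roster values are hypothetical.

```python
def team_wper(roster):
    """Sum each player's PER weighted by his share of the team's minutes.

    roster: list of (per, minutes) tuples for one team.
    """
    total_minutes = sum(minutes for _, minutes in roster)
    if total_minutes == 0:
        raise ValueError("team has no recorded minutes")
    return sum(per * minutes / total_minutes for per, minutes in roster)

# Hypothetical roster: (PER, total minutes played).
roster = [(25.0, 3000), (15.0, 2000), (10.0, 1000)]
print(f"WPER: {team_wper(roster):.2f}")
```

Because the weights sum to one, WPER stays on the same scale as PER, so a team of league-average players would land near the league-average PER.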
by Aqeel Phillips
Halfway through the current NBA season, fans have celebrated and lamented the position of their teams as the contenders and lottery teams separate themselves from the pack. On the flip side, NBA stat geeks have begun universally celebrating as the SportVU player tracking system has filled up with an ample pool of data and now possesses a respectable sample size. More than 41 games into the season, we can not only start to project playoff seeding and start pondering matchups, but also begin to accept players’ performances so far as an expectation of how they will finish the season (barring injury or possible team-afflicting swaps at the trade deadline). SportVU allows us to take a deeper look at these performances, past the simple statlines of points, rebounds, and assists, and really get our hands dirty in finding out what makes each team and player special.
To start, I’d like to revisit my previous article with a few revisions. A reader pointed out that the passing player’s free throws were not being subtracted from the team free throws, so players like LeBron James and Russell Westbrook benefitted from taking many free throws. In addition, it appears that Assist Percentage is a more helpful stat to use than Assist Rate for calculating free throws. The former is simply a percentage created by the amount of field goals assisted by a player out of the total team field goals made, while Assist Rate is a more involved metric that counts assists versus possessions in a game. Lastly, player minutes need to be factored in as well. Team points from free throws are tallied over the entire game, but a player is only on the court for a fraction of the game to assist on those free throws. As a result, we need to multiply the team free throws per game by the fraction of the game that a player is on the court.
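The three revisions can be combined in a rough sketch. The exact formula lives in the previous article and is not reproduced here, so the form below is an assumption: scale the team's free-throw points by the share of the game the player is on the court, subtract the player's own free-throw points, and credit him with his assist percentage of the remainder. The function name, parameters, and sample numbers are all hypothetical.

```python
def estimated_assisted_ft_points(assist_pct, team_ft_pg, player_ft_pg,
                                 player_mpg, game_minutes=48.0):
    """Rough estimate of free-throw points per game created by passing.

    Assumed form, not the author's exact formula:
    assist_pct * (team_ft_pg * on_court_share - player_ft_pg)
    """
    on_court_share = player_mpg / game_minutes
    # Free-throw points by teammates while the player is on the floor.
    teammates_ft = max(team_ft_pg * on_court_share - player_ft_pg, 0.0)
    return assist_pct * teammates_ft

# Hypothetical inputs: 50% assist percentage, 20 team FT points per game,
# 4 of them scored by the passer, who plays 36 of 48 minutes.
print(estimated_assisted_ft_points(0.5, 20.0, 4.0, 36.0))
```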
Here is a comparison of my formula (specified in previous article) compared to the concrete data that SportVU provides this season, using this season’s data rather than the 2012-13 data I used previously.
The formula has its flaws; specifically, it tends to overestimate the number of free throws catalyzed by a player’s passing. For example, the formula assumes that Chris Paul’s ridiculous 53.8% assist percentage also applies to the number of free throws shot while he is on the floor. The formula projects him to catalyze 5.8 FTs per game, while NBA.com reports that he only catalyzes 0.9 per game (almost the full difference between his projected points and his contributed points). Overall, I believe it still gives a fairly good projection of how many total points a player is contributing, and I think it can still be a valuable tool for getting a picture of players’ contributions in seasons before SportVU was available.
(Note: AST+ is not available for this season, so I was forced to calculate it myself. A full explanation can be found after the conclusion of the article).
Introducing Passing Efficiency
SportVU has been tracking two pieces of player data never readily available before: Passes per Game and Points Created by Assist per Game (as mentioned previously). The points are a combination of passes leading to two-pointers, threes, free throws, and passes leading to assists (“Hockey assists”). To get a picture, here are the current top five in Passes per Game and Points Created by Assist per Game (which is desperately in need of a fancy acronym).
by Aqeel Phillips
With the new SportVU advanced statistics that the NBA officially introduced at the beginning of November, I’ve been most intrigued by the passing statistics now at the disposal of fans. It’s been well known among stat-heads for a while that Assists are a flawed metric for measuring a player’s contribution to their team. They simply serve as a tally with no weight to them: a cross-court pass to an open player in the corner yields the same number of Assists as a pass inside to a big man who does most of the heavy lifting by skillfully posting up. Though some public websites track the number of assists that lead to three-pointers as opposed to deuces, there is still no stat that accounts for passes that lead to free throws, and passers are robbed of assists that they should rightfully receive when a play ends in a shooting foul. SportVU will be tracking these statistics, but I’m too impatient to wait for the season to progress and the SportVU sample size to increase sufficiently, so I set out to enumerate the contributions of passers from last year’s NBA season.
The Three-Pointers: Creating Valuable Shots
Let’s start by reminding ourselves of the Assist leaders from last year:
As stated previously, these assists merely serve as a tally of passes a player completed that led to field goals. We can gain a better picture of each passer’s contributions by taking a peek at a lesser-known statistic called Weighted Assists (shorthand AST+, courtesy of Hoop Data), which weights assists on three-pointers as 1.5 times as valuable as assists on regular field goals. From AST+, we can easily calculate the number of points from field goals that a player produced per game by multiplying their AST+ value by two.
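The arithmetic behind that conversion can be sketched as below. The function names and the sample assist split are illustrative assumptions; only the 1.5 weight and the multiply-by-two step come from the description above.

```python
def weighted_assists(two_pt_assists: float, three_pt_assists: float) -> float:
    """AST+: assists on threes count 1.5 times as much as assists on twos."""
    return two_pt_assists + 1.5 * three_pt_assists

def points_from_assists(ast_plus: float) -> float:
    """Field-goal points created per game: AST+ multiplied by two."""
    return 2.0 * ast_plus

# Hypothetical passer: 6 assists on two-pointers and 2 on three-pointers.
ast_plus = weighted_assists(6, 2)
print(ast_plus, points_from_assists(ast_plus))
```

The result matches counting the baskets directly (6 x 2 + 2 x 3 points), which is why doubling AST+ works.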
By: Patrick Harrel
In the quest for advanced statistics capable of accurately quantifying defense, NBA analysts have always faced an uphill battle. Unlike on offense, where success has easily quantifiable measures, readily available defensive statistics came nowhere close to establishing how effective a player was on the floor. If a player blocked a lot of shots, he was often lauded as a tremendous defender, but what if those blocks came at the cost of missed rotations and wide open layups on failed attempts? Until very recently, we couldn’t dream of answering a question like that comprehensively.
When the NBA announced this year that they would be making the SportVU data available to the public for the 2013-14 season, the news was met with raucous applause from all circles involved with basketball. Writers loved it, fans loved it, and statisticians, who had always only been able to make educated guesses about certain factors, adored it. At Princeton Sports Analytics, we are going to make the data more accessible to you in a bi-weekly column, with each entry dedicated to a specific aspect of what is going on in the NBA.
If you are unfamiliar with SportVU, it is a system that is now installed in all 29 NBA arenas that tracks the movement of all 10 players on the court, the 3 referees, and the ball, and automatically generates an incredible amount of data about the various outcomes on the floor. It tracks average speed of every player, how many touches any given player gets per game, and much more.
Today, we’re going to discuss the ability to better quantify defense. Specifically, we will look at who have been some of the surprisingly poor interior defensive players this season. SportVU measures how well players defend inside by charting every shot attempt that an offensive player takes when a defender is both within five feet of the basket and within five feet of the offensive player. It then measures what percentage of shots the defensive player allows to be made under these conditions.
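To make the definition concrete, here is a small sketch of how such a percentage could be computed from shot-level rows. The dictionary keys and the sample rows are hypothetical and do not reflect SportVU's actual data format; treating "within five feet" as less than or equal to 5.0 is also an assumption.

```python
def rim_fg_pct_allowed(shots, max_distance_ft=5.0):
    """Share of qualifying shots that were made against a defender.

    shots: list of dicts with hypothetical keys 'defender_to_basket_ft',
    'defender_to_shooter_ft', and 'made' (bool). A shot qualifies when the
    defender is within max_distance_ft of both the basket and the shooter.
    Returns None if no shot qualifies.
    """
    qualifying = [
        s for s in shots
        if s["defender_to_basket_ft"] <= max_distance_ft
        and s["defender_to_shooter_ft"] <= max_distance_ft
    ]
    if not qualifying:
        return None
    return sum(1 for s in qualifying if s["made"]) / len(qualifying)

# Made-up tracking rows for one defender.
shots = [
    {"defender_to_basket_ft": 3.0, "defender_to_shooter_ft": 2.0, "made": True},
    {"defender_to_basket_ft": 4.5, "defender_to_shooter_ft": 4.0, "made": False},
    {"defender_to_basket_ft": 12.0, "defender_to_shooter_ft": 3.0, "made": True},
    {"defender_to_basket_ft": 2.0, "defender_to_shooter_ft": 1.5, "made": False},
]
print(rim_fg_pct_allowed(shots))
```

A lower value means the defender allowed fewer makes on the shots he contested near the basket.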
By: Patrick Harrel
Kobe Bryant recently changed his Twitter avatar to a simple image of the numbers “1225,” an obvious nod to ESPN’s respective predictions for the Lakers’ performance in the West and Kobe’s performance this season in comparison to his NBA counterparts. He, along with Laker Nation, was appalled to see both ranked so poorly. The NBA Rank methodology may be a bit primitive, with each voter rating every NBA player on the list on a 1-10 integer scale, but the ranking nonetheless reflects a reality: Kobe is likely to regress after rupturing his Achilles tendon.
But how much will he regress? Dr. Douglas Cerynik and Dr. Nirav H. Amin of Drexel University did some research into Achilles ruptures in their paper Performance Outcomes After Repair of Complete Achilles Tendon Ruptures in National Basketball Association Players, and shed some light as to just how difficult it is to come back from an Achilles tear. Of the 18 players they looked at, 7 were never able to return to NBA action, 3 returned for just one season, and the remaining 8 would go on to play 2 or more seasons.
And of those players that did return, their performance suffered drastically, especially in their first season back. Among the 11 players who returned to the NBA, PER (player efficiency rating) decreased by an average of 4.57 points in the first season back. In the second, it decreased by 4.38 points. Even after controlling for age and other confounding variables, the study reports both figures as statistically significant, the first with a p-value of .038 and the second with a p-value of .081 (the latter only at the 10% level).
If you are unfamiliar with PER, it is an attempt at an all-encompassing rating system that sets the league average at 15. An All-Star typically has a PER in the range of 21 or above, and an MVP will be in the 27-30 range. Last year, Kobe had a PER of 23.10. If his PER fell by the mean decrease seen in the study of 4.57 in 2013-14, it would be 18.53, or .07 points worse than Samuel Dalembert’s PER last year. When Kobe is compared to the mediocre center the Mavericks just signed as a stopgap to please the fan base in Dallas, he suddenly doesn’t seem so intimidating.
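The projection above is simple subtraction, sketched here for both post-injury seasons. It assumes each average decrease is measured from the pre-injury baseline, which the summary of the study does not state explicitly.

```python
def projected_per(baseline_per: float, mean_decrease: float) -> float:
    """Project a post-injury PER by subtracting the study's mean decrease."""
    return baseline_per - mean_decrease

KOBE_BASELINE_PER = 23.10    # last season's PER, from the article
FIRST_SEASON_DROP = 4.57     # mean decrease, first season back
SECOND_SEASON_DROP = 4.38    # mean decrease, second season back

first = projected_per(KOBE_BASELINE_PER, FIRST_SEASON_DROP)
second = projected_per(KOBE_BASELINE_PER, SECOND_SEASON_DROP)
print(f"first season back: {first:.2f}, second: {second:.2f}")
```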
By Avi Cohen
All basic sports statistics need to be simple enough for the regular sports fan to comprehend quite readily, allowing them to understand the basics of how a player or team performed without actually watching the game. As a result of this simplicity, most are pretty flawed in some way or another when taken out of context. For instance, a typical stat line in basketball reads points, rebounds, assists – sometimes including steals and blocks. Obviously, some of these reflect a player’s performance better than others. But on the whole, the majority of these stats can be contextualized with other statistics or by effectively watching game action. But it just seems that this is not the case with rebounds.
PPG can be contextualized with FG%/FGA and USG rates. The same could be said for assists. Few people assume that lots of steals and blocks equate directly to good defense, though they certainly help. And yet, rebounds exist within their own category. They are in limbo between offense and defense, essentially a loose-ball statistic. Granted, it could be argued that defensive rebounds are a component of playing good defense, while offensive rebounds contribute to your team on offense.
Nevertheless, when we say someone is a good rebounder, we only really look at rebounding numbers. Maybe some will bring up rebound rates to seem smart, but that really is just controlling for minutes/game. Evaluating the offensive and defensive talents of a player often requires multiple statistics in order to come to a conclusion, and yet we typically rely on only the one or two readily available ones when judging rebounding prowess.
All in all, rebounds are all about securing the possession for your team. An offensive rebound gives the team an additional opportunity that they otherwise wouldn’t have had, and as such, conveys more of the in-game contribution. But defensive rebounds? Not so much the case. Many teams box out their men specifically with the intention of allowing their designated rebounder to grab the board. The most obvious example of this is Jason Kidd on the Nets during the early 2000’s. All too often everyone else would just box their men out, let Kidd grab the board and immediately go into the fast break. Kidd was certainly a more than competent rebounder, but his numbers were highly inflated by the system implemented by the coaching staff during his time as a Net. There are plenty of other factors that need to be taken into consideration as well. If a big man’s teammates are bad perimeter defenders, he will be forced to commit to help defense more often, resulting in missed rebounding opportunities. Additionally, less offensively talented players may not be deemed a threat and, left unguarded, have no one boxing them out. The loose-ball period between the shot release and the moment possession is secured with a rebound is much less structured, and as such, much more difficult to quantify in a simple manner.
It is true that there aren’t many good and simple ways to evaluate defense. Steals and blocks are the only basic defensive statistics available to evaluate defensive contribution, but as mentioned earlier, few seriously equate those two with good defense. However, when discussing rebounding ability, there is almost no discussion beyond the sheer number of boards a player brings down.
Considering that rebounds can really be narrowed down to just securing possession, there needs to be a new method for evaluating presence on the glass: some sort of system that weighs offensive rebounds more heavily than their easier, less contested defensive counterparts, while also taking into account the number of times you allow your man to grab offensive rebounds. It’s certainly a considerable challenge to take on, but considering the advances of sports video analysis software, it’s definitely not as difficult as we’d imagine.
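Purely as an illustration of the shape such a system might take, the sketch below scores a player's rebounding with a heavier weight on offensive boards and a penalty for offensive rebounds conceded to his man. The weights and the function itself are hypothetical placeholders, not a proposed or validated metric.

```python
def possession_rebound_score(off_reb, def_reb, off_reb_allowed,
                             off_weight=2.0, def_weight=1.0,
                             allowed_penalty=2.0):
    """Toy possession-based rebounding score.

    Offensive rebounds are weighted more heavily than defensive ones, and
    offensive rebounds allowed to the player's own assignment are subtracted.
    All three weights are arbitrary placeholders.
    """
    return (off_weight * off_reb
            + def_weight * def_reb
            - allowed_penalty * off_reb_allowed)

# Hypothetical per-game line: 3 offensive boards, 8 defensive, 2 allowed.
print(possession_rebound_score(3, 8, 2))
```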
Come back over the coming months as we attempt to tackle this challenge.
By Max Kaplan
This is part 3 of my March Madness bracket series. In part 1, I showed that Florida was the best team to pick to win it all. In part 2, I explained how to choose the rest of your Final Four depending on your pool size and skill.
First, I’d just like to express my frustration at a fellow Princeton publication: The Princeton Tiger. While we may all be able to relate to this list of March Madness excuses, number 5 (“I’m no sheep”) is precisely the best strategy to win your pool. Actually, it is the very misconceptions of people like this that make the strategy of choosing undervalued picks possible.
Now on to the rest of the bracket.
March Madness pools are set up so that each round is nominally worth the same number of points. This is not quite true in practice. If you correctly choose the national champion, by definition you correctly chose that team in every earlier round too, effectively doubling the importance of every subsequent round. This is precisely why the first two posts focused on picking the Final Four, where most brackets are won and lost.
However, there are two instances where this may not be the case. First, maybe no one picked any of the Final Four teams correctly (see 2011). Second, you could play with rules that give extra points for upsets. Sure, everyone wants to be ahead after two rounds and to have chosen this year’s VCU, but a much simpler strategy leads to an optimal first two rounds. The following is the best strategy ALWAYS but it probably won’t change the final outcome unless…
Everyone’s Bracket Gets Destroyed
When everyone’s bracket gets destroyed (while somewhat rare), you just need a few more wins than everyone else to walk away with the prize. So here’s my advice: pick all the favorites. Yes, all of them.
You: But Max, didn’t you just tell us to choose undervalued upsets?
Max: Yes, choose an undervalued winner, runner-up, and semifinalists depending on how big your pool is. But unless points are given for upsets, choose the favorite in every other game.
In my bowl confidence column from January, I discovered that people tried to pick upsets to differentiate themselves from the pack. In the end, I found that if you just chose the favorites and ranked them by how much of a favorite they were, you would end up in the 90th percentile without doing anything.
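The favorites-only confidence strategy can be sketched in a few lines: pick every favorite, then hand out confidence points in order of how heavily each one is favored. The games and win probabilities below are hypothetical.

```python
def confidence_picks(games):
    """Pick every favorite and rank picks by how strongly each is favored.

    games: list of (favorite, underdog, favorite_win_prob) tuples.
    Returns (favorite, confidence_points) pairs, where the heaviest
    favorite receives the most points and the slimmest favorite gets 1.
    """
    ranked = sorted(games, key=lambda game: game[2])
    return [(favorite, points)
            for points, (favorite, _, _) in enumerate(ranked, start=1)]

# Hypothetical slate of three games.
games = [
    ("Team A", "Team B", 0.55),
    ("Team C", "Team D", 0.80),
    ("Team E", "Team F", 0.65),
]
print(confidence_picks(games))
```

No knowledge of the teams is required; the only inputs are the consensus probabilities, whether taken from betting lines or pick averages.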
The same concept applies to March Madness. People think that they are smarter than the seeding. They choose upsets to get ahead of the curve. If you get two out of five major upsets (like Florida Gulf Coast and La Salle), it feels like a win. But you could have gotten 3 out of 5 if you had chosen all the favorites, and favorites have a higher chance of winning subsequent rounds too.
Case in point:
Are you serious? Without upset points, there is no reason to believe Florida and Florida Gulf Coast are almost even money.
This year, if you had chosen only the higher seeded teams in the first two rounds, you would have 44 points (multiply by 10 for ESPN’s point system). That would be good enough for about the 87th percentile. If you also account for games where the betting favorite was actually the worse-seeded team (e.g., Minnesota), you could do even better. This is how you differentiate yourself in the case of Upset City. Choose the favorite.
Now, you may ask: why doesn’t this strategy work for the entire bracket? In short, it does. By choosing the favorite in every game, you guarantee yourself a very high percentile. However, unless you are in a very small pool, you will not win. One person will get lucky, hit more upsets than misses, and win. But choosing the favorite in every game is probably the best strategy for a very small pool. Of course, seed may not indicate the favorite in the Elite Eight and later: Ohio St would have been favored over Gonzaga.
Strategy: Choose the betting favorite all the way up to the Elite Eight.
The Upset Points Pool
This is my favorite type of pool, and it is the easiest one in which to gain an advantage; as you saw above, the all-favorites approach alone could put you in the high 80s by percentile in regular pools. Yahoo’s upset points rule is as follows: you get the regular number of points for every round PLUS the difference of the seeds if you correctly choose an upset.
For example, if you correctly choose an 8 seed over a 9 seed, you get 1 point. If you correctly choose the 9 seed, you get 2 points (1+1). Therefore, to break even in the long run, the 9 seed only needs a 33% chance to win to make it a worthwhile gamble. The same comparison applies to every other first-round upset.
Breakeven probability of an upset to make it worthwhile:
9 seed – 33%
10 seed – 20%
11 seed – 14%
12 seed – 11%
13 seed – 9%
14 seed – 8%
15 seed – 7%
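The break-even logic of the 9-seed example can be written out for every first-round matchup. An upset pick pays the round's points plus the seed difference with probability p, while the favorite pays the round's points with probability 1 - p; setting the two expected values equal gives p = r / (2r + d), where r is the round's points and d the seed difference. This sketch assumes a first-round value of 1 point, as in the example.

```python
def breakeven_upset_prob(favorite_seed, underdog_seed, round_points=1):
    """Upset probability at which picking the underdog matches the favorite.

    Solves p * (round_points + bonus) == (1 - p) * round_points,
    where bonus is the seed difference awarded for a correct upset pick.
    """
    bonus = underdog_seed - favorite_seed
    return round_points / (2 * round_points + bonus)

# First-round matchups pair seeds that sum to 17 (8 vs 9, 7 vs 10, ...).
for underdog in range(9, 16):
    favorite = 17 - underdog
    prob = breakeven_upset_prob(favorite, underdog)
    print(f"{underdog} seed over {favorite} seed: {prob:.0%}")
```

Any underdog whose true chance of winning exceeds its break-even figure is a positive expected-value pick under the upset-points rule.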
Under almost all circumstances, these are worthwhile bets regardless of the teams playing. However, this is a simplified version because it isolates the first round. As you reach later rounds, the seeds become closer together (usually high seeds) and the rounds become more valuable regardless of seed. Because of this, it isn’t prudent to choose upsets for the entire bracket.
So where is the cutoff? A full survival analysis would lead to the exact answer. But you are already so ahead of the curve, you have already reached a point of diminishing returns. The point should probably be before the Elite Eight because choosing a correct team would net you 7 points and the upsets are probably not likely enough to make up for that.
A good rule of thumb is to pick your Sweet 16, then have every other high seed be upset in the first round. If you have two low seeds playing each other in the second round, choose the one who was more likely to win the first game. You will make a killing in the first round. Guaranteed.
Strategy: same Final Four, only betting favorites for Elite Eight and Sweet 16, only upsets for first 2 rounds
In my final article in the series, I will talk about the probability of the perfect bracket and actually comb through this year’s numbers to see how well my strategies performed (will perform).
Please share your thoughts in the comments below, especially if you have any ideas about what we should cover in the future. And please like us on Facebook too.
By Max Kaplan
A couple of months ago, I wrote an article that showed how to perform better in bowl confidence pools than 90% of all participants with one simple strategy: following the consensus of the nation (using either Yahoo averages or betting lines).
To be successful, you didn’t need to know who was playing or who was better. You didn’t need to pick upsets and you didn’t even need to predict the outcome.
Here, I will try to find a similar strategy for college basketball.
The basic underlying theory should apply to March Madness as well. You should not try to predict the outcomes of the games to maximize your chances. Logically speaking, why would you be able to predict the outcomes of basketball games better than the millions of other brackets? ESPN alone has 8,145,000 entries. And guess what? Most of them think they are above average too. Now that the first (second?) round is done, about 95% still have their national champion pick alive and probably feel even more confident. Unless you have insider information (and those coaches, athletes, and others that do are not allowed to gamble by NCAA regulations), there is no reason to think you can win your pool with a game-by-game approach.
In each pool, there is only one winner. Presumably, the whole point of making a bracket is to win the pool for bragging rights, since I would never in a million years dream of even thinking of doing anything in the proximity of gambling underage. Therefore, you should play your opponents as opposed to your own bracket, much like in poker.
In essence, you can neither control nor accurately guess 67 probabilistic results, but you can adjust your predictions relative to your competition. You just need to find where others are making probabilistic blunders.
For example, I am very knowledgeable about the Big Ten and the Ivy League because I have a rooting interest in the two. However, the biggest gaffe (and the most common mistake) that people can make is overrating their own teams. This is the case for an alma mater, the league that it plays in, and, more generally, popular teams that you see on television (Duke, North Carolina, etc.). It is because of this that I consciously chose fewer Big Ten teams to succeed than I would have preferred. In a pool of Duke grads, it wouldn’t be very smart to choose Duke to go all the way.
Now you may ask, so what? How can this help me win?
Below is a chart of all teams with a greater than 1% chance of winning according to betting lines (historically, betting lines are very good at predicting outcomes). It shows the difference between the percentage of people who chose each team to win the national championship and the probability implied by the betting lines (adjusted so that the percentages add up to 100), which reveals which teams are undervalued and overvalued.
7 out of the 8 favorites to win the title are overvalued. People like choosing favorites. The lone exception is Florida, which has 7 to 1 odds but was chosen by only 2.7%. The betting lines have Florida as 3rd, while the bracket entries have Florida as 9th. This is why I chose Florida to win it all.
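The adjustment and comparison can be sketched as follows. "X to 1" odds are converted to a raw probability of 1 / (X + 1), so 7 to 1 is 12.5% before adjustment; the raw values are then rescaled to sum to 100%, and the public pick share is subtracted. The three-team field and its numbers are hypothetical, and a positive gap is read as undervalued.

```python
def implied_probability(odds_against: float) -> float:
    """Convert 'X to 1' odds into a raw implied win probability."""
    return 1.0 / (odds_against + 1.0)

def normalize(raw):
    """Rescale raw probabilities so they sum to 1 across the field."""
    total = sum(raw.values())
    return {team: prob / total for team, prob in raw.items()}

def value_gap(odds, pick_share):
    """Adjusted betting probability minus public pick share, per team."""
    adjusted = normalize({team: implied_probability(o)
                          for team, o in odds.items()})
    return {team: adjusted[team] - pick_share[team] for team in odds}

# Hypothetical field: odds against winning, and share of brackets picking each.
odds = {"Team A": 1.0, "Team B": 2.0, "Team C": 2.0}
pick_share = {"Team A": 0.60, "Team B": 0.30, "Team C": 0.10}

for team, gap in value_gap(odds, pick_share).items():
    label = "undervalued" if gap > 0 else "overvalued"
    print(f"{team}: {gap:+.1%} ({label})")
```

Raw probabilities from a sportsbook usually sum to more than 100% because of the bookmaker's margin, which is why the rescaling step is needed before comparing against pick percentages.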
Below are the same percentages sorted by Yahoo % instead of betting line. The top 7 most popular picks on Yahoo are all overvalued. This may be because we round up high percentages. Notice that Louisville is both the consensus best team and most overvalued. There is a very high chance that a 1 or 2 seed wins the tournament. Because of this, we assume that there is almost no chance that anyone else can win it. However, the chances are more than the 13% that is given by Yahoo users.
Winning your pool almost always comes down to choosing the winner. While Florida is still not the favorite to win the tournament, you are trying to outdo your peers. For instance, if you chose Louisville, you would still most likely need to correctly guess the other finals team as well as 3 or 4 Final Four teams in order to win the pool. Winning your pool suddenly shifts from choosing 6 games correctly (round of 64 to finals) to 11 or more. Choosing an undervalued team increases your odds of winning a pool even if it means you are less likely to choose the champion.
If the betting data existed publicly for every round (and please comment below if you can find something like this), this would be the best strategy to fill out the entire bracket and we would likely end up with a very high national percentile.
I will continue this search for the winning bracket later this week, where I will look beyond just choosing the national champion.
3 Implications of Choosing an Undervalued Team like Florida
1. The betting line gave Florida an 8.7% chance to win it all. I needed the betting line to be fairly accurate in order to compare it with the picks on Yahoo. In reality, I could have used Nate Silver’s or any other “expert” model as the baseline. But historically, betting lines have been the best. The reasoning is as follows. If someone could consistently create a model that could forecast games better than the betting line, they would just bet on that team and make boatloads of money. The line would shift until it is no longer profitable. Oh, and they wouldn’t publicly release the profitable model either. Free Market 1, Experts 0.
2. Notice that it is much easier to lose money choosing the wrong team than it is to make money choosing the right team. Louisville was -15.7% and Indiana was -6.3%. Just like in poker, it is much much easier to squander your money than it is to profit.
3. So choosing Florida (+6%) gives you an 8.7% chance of being in the top 2.7% of brackets. With all else being equal, this choice alone gives you an average national percentile close to 56%. However, choosing an underdog is a risky strategy and it leads to more disparate outcomes. You will win your pool more than average but also be far less likely to finish in other high percentiles (below 97th). It is somewhat an all or nothing strategy. However, winning should be the only outcome that matters.