By Neil Rangwani
Each year, as the NBA season kicks off, the “hot hand” debate (or, according to Wikipedia, the hot hand fallacy) resurfaces – are streaks of made shots indicative of a player getting hot, or are they just random occurrences? Here at Princeton Sports Analytics, we’re not happy discussing this with just anecdotal evidence (I mean, did you see Steph last night?), so we did some analysis(!). It turns out that (surprise) the data show that NBA superstars Steph Curry, LeBron James, and James Harden don’t get hot any more than a coin that lands on heads 5 times in a row does.
The argument that shot streaks are random is based on probability. Any time an event (with two outcomes) is repeated many times, streaks are bound to occur. To study whether NBA shot streaks are random, or if players have disproportionately long “hot” and “cold” streaks, I used shot-by-shot data from the 2014-2015 season (from nbasavant.com) and applied a geometric distribution framework.
The geometric distribution is a probability distribution that is used to model repeated trials of events that have two distinct outcomes, each of which occurs with a constant probability. NBA shots roughly fit these requirements – there are clearly two outcomes (makes and misses), and I’ll assume that a player’s season-long field goal percentage is the “true” probability that they make any given shot.
Using the geometric distribution, we can model the number of made shots in a row. Essentially, a streak means that a player makes a certain number of shots in a row and then misses the next. Mathematically, this is the probability of making k shots in a row (pk) multiplied by the probability of missing (1-p)k.
P(X = k) = pk * (1 – p)
Next, I applied this framework to the shot-by-shot data from last season for Steph Curry, LeBron James, and James Harden. Using the data and the geometric distribution, here’s the expected and observed shot streaks.
Here’s a visual version of the same data:
While it seems that the observed and expected distributions are close, we can actually quantify whether they are. We’ll use a chi-squared goodness-of-fit test to tell us whether the observed data fits the geometric distribution.
The p-value of a chi-squared test tells us the probability that the observed values are from the theoretical distribution. These p-values strongly suggest that the observed shot streaks are well explained by the geometric distribution.
Bringing this back to basketball… what we get from this analysis is that shot streaks match what we’d expect if they were truly random. Even NBA superstars don’t “get hot” – instead, since they tend to have higher field goal percentages in general, they naturally have longer streaks of made shots.
One thing to keep in mind, however, is that the model doesn’t account for timing. It’s totally possible that a player’s shot streaks are geometrically distributed overall, but when you isolate playoffs or overtime games, as examples, they tend to have longer streaks than expected. I mean, did you see LeBron in that Pistons game?