The Statisticator: May 2012

Wednesday, May 30, 2012

Home-court advantage in the NBA

We have heard the concept countless times and almost take it for granted.

Home-court advantage. "They should win the next since they're playing at home." "They managed to steal one on the road."

But doesn't it all come down to Team A versus Team B? If A is a better team it should win no matter the location. The court has the same dimensions, the baskets are identical, the shots have the same likelihood of falling in. It's not like being server or receiver in a tennis game.

Or is it? There have actually been quite a few studies of home-court advantage that attempt to tease out the external factors that could cause it. Many potential causes have been put forward: the home crowd of course, cheering when the home team gains momentum; the fact that players on the home team can sleep at home instead of in a hotel; familiarity with the locker rooms and the facilities in general. These are mostly psychological explanations that are difficult to measure accurately.

Or just consider the distractions when shooting free-throws:

I don't have a degree in psychology, so I will tackle the question from the data point of view and try to see how playing at home can impact the game's outcome.

Data

I looked at all NBA games (regular season only) for the past two years. I didn't go further back as other factors could have come into play such as the differences in team composition.

Methodology

For each team, and for the league in general, I computed the empirical probabilities of winning at home and on the road. The values can be computed directly from a two-by-two table of win/loss VS home/away. I chose to approach the problem via a logistic model, which provides exactly the same point estimates but also provides confidence intervals, which in turn allow me to determine whether the homecourt advantage is significant or not.
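To illustrate why the two approaches agree, here is a minimal sketch; it is not the code used for this post. With a single home/away indicator the logistic model is saturated, so its fitted probabilities are simply the two-by-two table proportions, and the home coefficient is the log odds ratio of the table. The Wald standard error, the 95% level and the function name `home_advantage` are assumptions made for illustration. The win counts are back-calculated from the Denver and Dallas percentages in the appendix, assuming 82 home and 82 away games over the two seasons.

```python
import math


def home_advantage(home_wins, home_games, away_wins, away_games, z=1.96):
    """Summarise a win/loss VS home/away table as a one-variable logistic model would."""
    home_losses = home_games - home_wins
    away_losses = away_games - away_wins
    p_home = home_wins / home_games
    p_away = away_wins / away_games
    # Home coefficient of the logistic model = log odds ratio of the table.
    log_odds_ratio = math.log((home_wins * away_losses) / (home_losses * away_wins))
    # Wald standard error (an assumption: the post does not say which interval was used).
    # Note: this breaks down if any cell of the table is zero.
    std_err = math.sqrt(
        1 / home_wins + 1 / home_losses + 1 / away_wins + 1 / away_losses
    )
    ci = (log_odds_ratio - z * std_err, log_odds_ratio + z * std_err)
    return {
        "p_home": p_home,
        "p_away": p_away,
        "delta": p_home - p_away,
        "log_odds_ratio": log_odds_ratio,
        "ci": ci,
        # Significant when the interval for the home coefficient excludes 0.
        "significant": ci[0] > 0 or ci[1] < 0,
    }


# Counts back-calculated from the appendix, assuming 82 home and 82 away games.
denver = home_advantage(67, 82, 36, 82)  # 81.7% at home, 43.9% away
dallas = home_advantage(57, 82, 55, 82)  # 69.5% at home, 67.1% away
print(f"DEN delta={denver['delta']:.3f} significant={denver['significant']}")
print(f"DAL delta={dallas['delta']:.3f} significant={dallas['significant']}")
```

Under these assumptions the interval for Denver excludes zero while the one for Dallas does not, in line with the Significance column of the appendix table.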

Results

If an NBA team plays against another NBA team, how does the probability of victory change depending on where the game is played?

With a simple model where the only variable is where the game is played, I found that for the NBA in general, playing at home adds 19.8 percentage points of winning probability (59.9% VS 40.1%). This was a very significant uplift. There we have it, homecourt advantage is very present in the NBA.

Does this +20% hold for all NBA teams?

Here the variability is much greater, ranging from +37.8% for the Denver Nuggets (81.7% at home vs 43.9% away) to only 2.4% for the Dallas Mavericks (69.5% at home VS 67.1% away).
Full team-by-team table is at the end of the post.

Does any team play better away than at home?

No, all teams play better at home, even if by only a little such as the Dallas Mavericks where the uplift is only 2.4%.

Is homecourt advantage statistically significant for all teams?

Homecourt advantage was significant for all teams except 8: Dallas Mavericks (+2.4%), Miami Heat (+3.7%), Boston Celtics (+9.8%), Philadelphia 76ers (+9.8%), Oklahoma City Thunder (+11.0%), Sacramento Kings (+11.0%), Houston Rockets (+13.4%), New York Knicks (+13.4%). These are not necessarily all good or all bad teams, simply teams that are about as good (or bad) away as at home.

Closing thoughts

Going back to Denver and its overwhelming homecourt advantage (+37.8%) compared to the second-best advantage of +32.9% for the Los Angeles Clippers, one might wonder if the altitude isn't the Nuggets' biggest fan...

In a later post I will explore how the model can be tweaked to take rest time between games into account. Does more rest improve your winning probability?

Appendix: Full table

Team	Home win %	Away win %	Delta %	Significance
DAL	69.5%	67.1%	2.4%	No
MIA	65.9%	62.2%	3.7%	No
BOS	69.5%	59.8%	9.8%	No
PHI	46.3%	36.6%	9.8%	No
OKC	69.5%	58.5%	11.0%	No
SAC	35.4%	24.4%	11.0%	No
HOU	58.5%	45.1%	13.4%	No
NYK	50.0%	36.6%	13.4%	No
MIN	26.8%	12.2%	14.6%	Yes
CLE	57.3%	40.2%	17.1%	Yes
LAL	78.0%	61.0%	17.1%	Yes
POR	68.3%	51.2%	17.1%	Yes
UTA	64.6%	47.6%	17.1%	Yes
ORL	76.8%	58.5%	18.3%	Yes
PHO	67.1%	47.6%	19.5%	Yes
NBA	59.9%	40.1%	19.8%	Yes
CHI	73.2%	52.4%	20.7%	Yes
NJN	32.9%	11.0%	22.0%	Yes
ATL	70.7%	47.6%	23.2%	Yes
DET	46.3%	23.2%	23.2%	Yes
MIL	61.0%	37.8%	23.2%	Yes
SAS	79.3%	56.1%	23.2%	Yes
MEM	64.6%	40.2%	24.4%	Yes
TOR	50.0%	25.6%	24.4%	Yes
NOH	63.4%	37.8%	25.6%	Yes
WAS	42.7%	17.1%	25.6%	Yes
IND	57.3%	26.8%	30.5%	Yes
CHA	63.4%	31.7%	31.7%	Yes
GSW	53.7%	22.0%	31.7%	Yes
LAC	53.7%	20.7%	32.9%	Yes
DEN	81.7%	43.9%	37.8%	Yes

Wednesday, May 23, 2012

NBA: Spurs VS Thunder

Time for a playoff update now that the two contenders for the Western Finals are known.

Everybody wants to see Spurs VS Heat, but before then comes the Thunder hurdle, and despite the Spurs' impressive performance up until now, this is definitely going to be a tough challenge.

The method is exactly the same as in my previous post (which correctly identified Thunder beating the Lakers in 5 as the most likely scenario!). Entering the latest numbers into the model, here's what came out:

Probability of Spurs winning the series: 55.6%

Series breakout:

Winner	Number of games	Probability
Spurs	4	6.6%
Thunder	4	5.3%
Spurs	5	15.8%
Thunder	5	9.5%
Spurs	6	14.3%
Thunder	6	16.9%
Spurs	7	19.0%
Thunder	7	12.7%

The three most likely scenarios are:
Spurs in 7 (19.0%), Thunder in 6 (16.9%) and Spurs in 5 (15.8%).

For the overall playoffs, the latest numbers suggest the West as championship favorites for now:

NBA team	Champion Probability
SAS	34.4%
OKC	25.7%
MIA	21%
BOS	12%
IND	4.6%
PHI	2.4%

Verdict in the upcoming weeks!

Thursday, May 17, 2012

Lakers - Thunder Series

This post is actually an expanded comment to Sekou Smith's Hang Time Blog on nba.com concerning the Lakers - Thunder series.

This series is one everybody has been waiting for since the start of the season.

Experience VS youth.
Kobe VS Kevin.

A blowout in game 1.
An incredible comeback in game 2.

What's in store for the next 2 + X games?

Five of nba.com's experts on Sekou's blog gave their predictions after 2 games: one says 4, two say 5, and two say 6.

But what do the stats say?

I recently updated my model from the last two posts (here and here) in two ways. First, homecourt advantage is now incorporated (another post is coming soon on this topic, namely how we can quantify it, whether all teams have a significantly higher probability of winning at home than on the road, and which teams have the greatest delta between home games and away games). Second, each series now gets more detail: not only the probability of one team winning it, but also the breakout of how many games the series will take.

Which is exactly what I did here for the Lakers - Thunder series.
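The post does not show the model itself, so the following is only a sketch of one way such a breakout can be computed. It assumes games are independent, that team "A" holds homecourt advantage in a best-of-seven series with a 2-2-1-1-1 home pattern, and that A wins each home game with probability `p_home` and each road game with probability `p_away`. The function name, the parameters and the probabilities used in the example call are all hypothetical, not the values behind the table below.

```python
from collections import defaultdict

# True where team A (the team with homecourt advantage) plays at home.
# Assumed 2-2-1-1-1 format.
HOME_PATTERN = (True, True, False, False, True, False, True)


def series_breakdown(p_home, p_away, wins_a=0, wins_b=0, pattern=HOME_PATTERN):
    """Return {(winner, number_of_games): probability} for a best-of-seven series."""
    states = {(wins_a, wins_b): 1.0}  # probability of each live series score
    outcomes = defaultdict(float)
    for game in range(wins_a + wins_b, 7):
        p = p_home if pattern[game] else p_away  # P(A wins this game)
        next_states = defaultdict(float)
        for (a, b), prob in states.items():
            for a2, b2, q in ((a + 1, b, p), (a, b + 1, 1 - p)):
                if a2 == 4:
                    outcomes[("A", game + 1)] += prob * q
                elif b2 == 4:
                    outcomes[("B", game + 1)] += prob * q
                else:
                    next_states[(a2, b2)] += prob * q
        states = next_states
    return dict(outcomes)


# Hypothetical per-game probabilities, with team A already leading 2-0.
for (winner, n_games), prob in sorted(series_breakdown(0.7, 0.5, 2, 0).items()):
    print(f"{winner} in {n_games}: {prob:.1%}")
```

Starting from a 2-0 lead, the trailing team cannot win in 4 or 5 games, which is why those rows are absent from a table computed after two games.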

And now for the results:

Winner	Number of games	Probability
Thunder	4	22.2%
Thunder	5	30.4%
Lakers	6	5.8%
Thunder	6	17.2%
Lakers	7	9.5%
Thunder	7	14.9%

So Thunder in 5 is actually the most likely scenario, followed by Thunder in 4 and in 6. Overall, if you're a Lakers fan you should feel depressed, with the Lakers having only a 15.3% probability of facing the Spurs. But I have to admit that I haven't factored the Kobe-back-against-the-wall variable into my models :-)

Let me know your thoughts!

Tuesday, May 15, 2012

2012 NBA Playoffs: Updated forecasts

What a first round this has been!

Things were rather quickly expedited in the East, including the surprising elimination of the #1 team Chicago Bulls, surprising until we saw the following video at least:

Meanwhile, the West was really the wild wild west and gave us some thrilling comebacks and two stressful game sevens.

Chicago was the favorite to win the Championship after the first two games of the playoffs with an estimated probability of victory of 17.9%. Its elimination has freed up some room but for whom?

Oklahoma City, San Antonio and Miami were the runners-up, and while the names of the next three teams haven't changed, their order has:

NBA team	Champion Probability
MIA	21.5%
OKC	21.2%
SAS	19.3%
BOS	10.2%
LAC	7.9%
IND	7.3%
PHI	6.6%
LAL	6%

However, the results are slightly biased for now, in the sense that Miami and Oklahoma City won their round 2 openers whereas San Antonio still hasn't played Game 1 against the Clippers. If it were to win, it would jump right back to the first spot with a probability of 24.6% of clinching the Larry O'Brien trophy, more than 3 percentage points ahead of Miami and Oklahoma City.

More updates at the end of round 2!

Monday, May 14, 2012

The Johnny Depp / Tim Burton collaboration

I don't think anybody could have remained oblivious to the new Dark Shadows movie coming out:

Yet another Johnny Depp / Tim Burton collaboration. It seems those two have been in the movie business forever! Edward Scissorhands, Sleepy Hollow, Charlie and the Chocolate Factory, Alice in Wonderland, now this!

So this begs the question: why? What do I mean "why"? Well, do the two just really like working together, or have they both determined that their partnership was mutually beneficial in terms of the quality of the movies created together?

I pulled IMDB data for Johnny Depp and Tim Burton separately, focusing only on Johnny Depp as an actor and Tim Burton as a director (did you know he was in the list of actors for M.I.B. 3???), and labelled the movies as either "Common movies", "Johnny Depp only" or "Tim Burton only".

Collaboration VS Solo

Here is a graph summarizing the IMDB ratings of the movies for each of the three categories:

The plot above is called a boxplot and is a quick way to compare sets of data. The dark bold horizontal line is the median, and the gray rectangles represent the 25%-75% interquartile range, meaning that only 25% of movies have a rating greater than the top of the rectangle, and only 25% have a rating less than the bottom of the rectangle. The dashed lines (called "whiskers") give an idea of the spread of the most extreme values.

For instance, we see that the median value for "Common movies" is around 7.5, and the data is rather concentrated (no movie had a rating better than 8, and none worse than 6.5).

Now comparing to the movies Johnny and Tim did solo, we see that while their joint work did not produce their best-rated movies (8.2 with Platoon for Johnny, and 8.4 with Vincent for Tim), it definitely limited risks with no movies worse than 6.5, whereas at least 25% of the movies Johnny or Tim did by themselves got worse than 6.5.
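For readers who want to reproduce the numbers behind a boxplot, here is a minimal sketch. It assumes the common convention that whiskers extend to the most extreme values within 1.5 times the interquartile range of the box, with anything beyond drawn as a separate point; plotting tools differ slightly in how they compute quartiles. The function name and the ratings are made up for illustration and are not the IMDB data used in this post.

```python
import statistics


def boxplot_summary(values):
    """Median, quartiles, whisker ends and outliers under the 1.5 x IQR convention."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # most extreme values still inside the fences
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up ratings, for illustration only.
made_up_ratings = [6.5, 6.9, 7.2, 7.5, 7.6, 7.9, 8.0]
print(boxplot_summary(made_up_ratings))
```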

What about gross revenue?

Another comment about the boxplot: the round circles represent 'outliers' in the sense that they are values way beyond the spread observed in the data.

For "Common Movies", the outlier is Alice in Wonderland which generated just over a billion dollars, despite being, ironically, their worst-rated movie together at 6.5!

For Johnny Depp, the data is quite interesting: all his solo movies seem to have generated less than 150 million dollars, except four which made 4 to 7 times that amount. No surprises here, all four are Pirates of the Caribbean. I'm sure Johnny Depp's bank account is looking forward to the fifth installment!

As for Tim Burton, the movies he did with and without Johnny Depp have very similar profiles.

Ratings and box office revenue can be combined in a scatterplot:

On a side note, it is interesting to see from the above graph the relationship between IMDB rating and revenue for the Johnny-Tim collaboration (blue dots): the greater the revenue, the lower the rating!

Evolution over time

The natural follow-up question is how this collaboration fits with the historical trends for both Johnny Depp and Tim Burton.

In terms of ratings, the collaboration had a significant impact on movie quality for Johnny Depp at the beginning of his career but very little in recent years (which is what the boxplots hinted at earlier), whereas the impact was essentially insignificant for Tim Burton (though it's interesting to note that five of Tim Burton's last six movies were with Johnny Depp).

From a revenue perspective, the collaborative Alice in Wonderland generated as much as the Pirates of the Caribbean series for Johnny, whereas that same movie and Charlie and the Chocolate Factory were Tim Burton's two biggest revenue-generating movies.

Closing conclusions

If I were to summarize the previous findings in one sentence, it would be that Johnny Depp is currently repaying Tim Burton for having made him known in Hollywood early in his career with highly rated movies (Edward Scissorhands and Ed Wood) by starring in two huge hits (dollar-wise).

All this being said, it will be very interesting to see how well Dark Shadows performs and how it fits in with the current trends...

Friday, May 11, 2012

Dominion: Optimal "Big Money" strategy?

In a previous post I gave a quick overview of the rules of the board/card game Dominion.

Because there are so many different and attractive action cards, it can be very tempting to purchase them, especially those allowing you to draw and play even more action cards. However, a very simple yet efficient strategy at Dominion is called "Big Money" and essentially ignores all the action cards.

Big Money strategies

"Big Money" consists of purchasing only treasure cards and Provinces. Assuming a two-player game with 8 Provinces in play, I will look at how many turns it takes to buy 4 Provinces. As a hand consists of 5 cards and the only treasure players start with is copper, the maximum starting hand value is 5, so silvers and golds will have to be bought before the first Province can be.

The algorithm for "Big Money" can be written as:

if hand value >= 8, buy Province
otherwise, if hand value >= 6, buy gold
otherwise, if hand value >= 3, buy silver
otherwise, do nothing

But some variations exist. Indeed, in Dominion it is usually important to have a high money density, to maximise the value of a 5-card hand. So although coppers are worth 1 and cost nothing to buy, it would be foolish to gain as many of these as possible, as the value of your hand would be capped at 5 and make higher purchases impossible. So going back to the variations, we ideally want as many golds as possible to raise the average value of a 5-card hand. But what about silvers? We need to buy at least one silver in order to buy a gold (4 coppers + 1 silver = 6, the cost of a gold). But if the player has multiple turns with a hand of 3, 4 or 5, should silvers always be bought, or are they going to bring the average hand value down? Wouldn't it be worth skipping those turns and waiting to buy gold instead?
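The money density argument can be made concrete with a little arithmetic: the average value of a 5-card hand is 5 times the average coin value per card in the deck. The sketch below assumes the standard starting deck of 7 coppers and 3 Estates (which are worth no money) and compares the effect of adding one copper, one silver or one gold. The function name is made up for illustration.

```python
def expected_hand_value(deck, hand_size=5):
    """Average coin value of a random hand: hand size times the deck's money density."""
    return hand_size * sum(deck) / len(deck)


# Coin values of the assumed starting deck: 7 coppers (1 each) and 3 Estates (0 each).
start = [1] * 7 + [0] * 3

print(expected_hand_value(start))        # 3.5
print(expected_hand_value(start + [1]))  # one extra copper
print(expected_hand_value(start + [2]))  # one extra silver
print(expected_hand_value(start + [3]))  # one extra gold
```

A silver is worth 2, so it raises the average as long as the average coin value per card is below 2, which is the case early in the game.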

I therefore created variations of Big Money, depending on variable k which is the maximum number of silvers the player will buy. If the player already has k silvers and has a turn with 3, 4 or 5 in money, the player will not buy a new silver:

if hand value >= 8, buy Province
otherwise, if hand value >= 6, buy gold
otherwise, if hand value >= 3 and [less than k silvers in deck + hand + discard pile], buy silver
otherwise, do nothing
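The rules above translate into a short simulation. The following is a sketch under stated assumptions rather than the code behind the results in this post: a solitaire game, a starting deck of 7 coppers and 3 Estates, a fresh 5-card hand each turn, and the discard pile reshuffled whenever the deck runs out. The names `turns_to_provinces` and `mean_turns` are made up, and far fewer games are simulated than the 100,000 used for the results below.

```python
import random

# Coin value of each card; Estates and Provinces are worth no money.
VALUE = {"copper": 1, "silver": 2, "gold": 3, "estate": 0, "province": 0}


def turns_to_provinces(rng, silver_cap=None, target=4):
    """Play capped Big Money alone and return the turn on which the target is reached."""
    deck = ["copper"] * 7 + ["estate"] * 3
    rng.shuffle(deck)
    discard = []
    silvers = provinces = turns = 0
    while provinces < target:
        turns += 1
        hand = []
        while len(hand) < 5:
            if not deck:  # deck exhausted: reshuffle the discard pile
                deck, discard = discard, []
                rng.shuffle(deck)
            hand.append(deck.pop())
        coins = sum(VALUE[card] for card in hand)
        bought = None
        if coins >= 8:
            bought = "province"
            provinces += 1
        elif coins >= 6:
            bought = "gold"
        elif coins >= 3 and (silver_cap is None or silvers < silver_cap):
            bought = "silver"
            silvers += 1
        discard.extend(hand)
        if bought is not None:
            discard.append(bought)
    return turns


def mean_turns(silver_cap=None, n_games=2000, seed=0):
    """Average number of turns over n_games simulated games."""
    rng = random.Random(seed)
    return sum(turns_to_provinces(rng, silver_cap) for _ in range(n_games)) / n_games


print("no cap:", mean_turns(None))
print("cap at 1 silver:", mean_turns(1))
```

Because the draw and reshuffle details are assumptions, the averages from this sketch need not match the figures reported below exactly.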

Analysis results

And now for the long-awaited results. Let us consider 9 different "Big Money" variants, capping silvers at 1, 2, ..., 7, 8 respectively, plus one with no capping. I've displayed the number of turns it took to buy 4 Provinces, based on 100,000 simulations in each case.

Apparently, capping is not a good idea, especially for very small values. For larger caps (7, 8), it is unlikely the limit will even be reached! But just to confirm let's take a closer look excluding the first two caps:

Looking at the mean number of turns in each situation:

Capping at 1: 31.14 turns on average
Capping at 2: 21.63 turns on average
Capping at 3: 19.47 turns on average
Capping at 4: 18.25 turns on average
Capping at 5: 17.55 turns on average
Capping at 6: 17.13 turns on average
Capping at 7: 16.91 turns on average
Capping at 8: 16.83 turns on average
No capping: 16.80 turns on average

So, when applying "Big Money", don't think twice: go ahead and buy the most expensive treasure you can!

In the next post we will take a look at some characteristics of "Big Money". How much Gold will you end up with? How much money in total? On which turns can you expect to purchase your first three Provinces?

There is a lot of literature about "Big Money" and my objective is not to repeat what can easily be found elsewhere, but to use "Big Money" as a simple benchmark for other strategies I would like to model, explore and compare.