Monday, July 8, 2013

Visualizing the quality of a TV series

I love TV series. Like many people, my first addiction was Friends. Only 20 minutes long and 4 jokes a minute, it was easy to fit into your schedule.

Then came the TV series revolution, with more sophisticated scripts: each episode became less of an independent unit and more one of the many building blocks in the narration of increasingly complex plots. Skipping episodes was no longer an option, especially given the cliffhanger at the end of the previous episode.
The first such series I remember were 24 and Lost.

I'm not going to list all my favorite shows, nor try to compare them or do any advanced analyses. I just wanted to share some visuals I created allowing quick summarization of the quality of a series, episode after episode, season after season.

Here is the visual for Friends:
The blue line displays the IMDB rating of the episodes within each season - notice the breaks between seasons and the gray vertical bars delimiting them. There is a lot of data represented, so I've added some summary stats at the very top of each season:

  • the number at the very top is the average IMDB rating for all the episodes of the season. Its color, on a red-green scale, compares the seasons to each other: the seasons with the highest averages are green, those with the worst ratings are red.
  • the arrow under the season average represents the evolution of ratings within the season: if episodes tend to get better and better, the arrow points upwards and is green; if they get worse, it points downwards and is red. If ratings remain roughly flat over the season, a horizontal orange arrow is displayed.
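For readers curious about the mechanics, here is a minimal sketch of the season-summary logic in Python (pandas + matplotlib). The `episodes` table and its `season`/`episode`/`rating` columns are hypothetical placeholders for whatever episode-level IMDB export you have; the real visuals were not necessarily built this way.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: one row per episode.
episodes = pd.DataFrame({
    "season":  [1, 1, 1, 1, 2, 2, 2, 2],
    "episode": [1, 2, 3, 4, 1, 2, 3, 4],
    "rating":  [8.3, 8.1, 8.4, 8.6, 8.0, 7.8, 7.7, 7.5],
})

fig, ax = plt.subplots()
x0 = 0  # running x offset, so seasons are separated by a visible break
for season, grp in episodes.groupby("season"):
    x = x0 + np.arange(len(grp))
    ax.plot(x, grp["rating"], color="steelblue")   # per-episode ratings
    ax.axvline(x0 - 1, color="lightgray")          # gray bar delimiting seasons

    avg = grp["rating"].mean()                     # season average (the number on top)
    slope = np.polyfit(np.arange(len(grp)), grp["rating"], 1)[0]  # within-season trend
    arrow = "↑" if slope > 0.01 else ("↓" if slope < -0.01 else "→")
    ax.text(x.mean(), episodes["rating"].max() + 0.2, f"{avg:.1f} {arrow}", ha="center")
    # (the red-green coloring of the averages across seasons is omitted for brevity)

    x0 = x[-1] + 2                                 # leave a gap before the next season

plt.show()
```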

Revisiting the Friends visual, we observe spectacular consistency in ratings over the ten-year period. The summaries on top let us see through the noise: seasons 8 and 9 were among the worst in average episode rating, as well as the only two where ratings declined over the season. But the tenth and final season was the highest-rated one, with a strong positive trend leading up to the series finale.

Although not a huge fan, I had to look at The Simpsons' incredible run. The visual highlights the surprising drop and plateau of the season average starting around the ninth season.


Now looking at some of my favorite series:
I have to agree here that the first seasons of 24 were the best. I also love the way ratings evolve in the final season, when people had very low expectations and thought it would be the same old jack-bauer-aka-superman-prevents-nuclear-attacks-every-thirty-minutes but soon realized this was Jack Bauer on a revenge rampage without any rules.


Breaking Bad is the perfect example of the TV series that keeps getting better, throughout the seasons and throughout the years. Can't wait for the second half of the fifth season!


The last few seasons of Dexter aren't rated as high as the first ones, but ratings remain high.


I loved the first season of Prison Break, but clearly it should have been a one-season series: you just couldn't top the suspense and originality.


Game of Thrones started really high but just kept getting better à-la-breaking-bad. The seesaw effect in the third season is rather impressive!




Wednesday, July 3, 2013

Hollywood's lack of originality: "Let's make a sequel!" (Part 3, yes I realize the irony)

As indicated by the title, this is part 3 of the analysis of movie sequels.
In the first post I described the IMDB data used for the analysis and shared some preliminary statistics on the distribution of number of movie installments in movie series.
In part 2 I focused on comparing IMDB ratings for the original movie and its sequel.

This post looks at series with 3 or more installments.





The sequel has a sequel!!!

Looking at multiple installments is a little tricky. Can I compare the average IMDB rating change between installments 1 and 2 with the change between installments 2 and 3? Probably, but only if I look at the same sample. Let me explain. To compute the change between installments 1 and 2 I might have 1000 series to look at (so 2000 movies). But when looking at the change between 2 and 3, my sample will be smaller: I will no longer have all the series that stopped at two installments. Is that so much of an issue? It could be if there is what is called a "lurking third variable". Perhaps moviemakers only make a third installment when the second wasn't too bad, so series with only two installments could be biased: they are the ones where the second installment did so terribly that no third was ever made. So if we really want to compare the drop-off between 1 and 2 with the drop-off (I am assuming it is another drop-off!) between 2 and 3, we should restrict the analysis to series with at least three installments.

So there are a few things we might want to look at:
1) the average change between installments 1 and 2 for all series with exactly 2 installments
2) the average change between installments 1 and 2 for all series with 3+ installments
3) the average change between installments 2 and 3 for all series with 3+ installments

Comparing 1) and 2) will give us an idea of whether third installments are favored for series whose second installments didn't do too badly. Comparing 2) and 3) will allow us to compare the 1 -> 2 and 2 -> 3 effects, as computed in the sketch below.
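Here is a minimal pandas sketch of these three computations, assuming a hypothetical `ratings` table with one row per movie and columns `series_id`, `installment`, and `rating` (not the actual code behind the numbers below):

```python
import pandas as pd

# Hypothetical data: one row per movie.
ratings = pd.DataFrame({
    "series_id":   [1, 1, 2, 2, 2],
    "installment": [1, 2, 1, 2, 3],
    "rating":      [7.0, 6.1, 8.0, 7.3, 7.0],
})

# Pivot to one row per series, one column per installment number.
wide = ratings.pivot(index="series_id", columns="installment", values="rating")

two_only   = wide[wide[2].notna() & wide[3].isna()]  # exactly 2 installments
three_plus = wide[wide[3].notna()]                   # 3 or more installments

print((two_only[2]   - two_only[1]).mean())    # 1) change 1 -> 2, exactly-2 series
print((three_plus[2] - three_plus[1]).mean())  # 2) change 1 -> 2, 3+ series
print((three_plus[3] - three_plus[2]).mean())  # 3) change 2 -> 3, 3+ series
```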

Here are the results:


Series length   Sample size   From installment   To installment   IMDB rating difference
All series          606              1                  2                 -0.87
Exactly 2           410              1                  2                 -0.92
3 or more           196              1                  2                 -0.78
3 or more           196              2                  3                 -0.33

So it does seem that series with exactly 2 installments had a larger 1 to 2 installment drop-off (-0.92) than those with 3 or more installments (-0.78); however, another (unpaired) t-test revealed that the difference was not significant. Budget and revenue probably weigh more heavily than IMDB ratings when Hollywood decides whether to make a third movie, but I suspect there is still some correlation between rating and revenue (no worries, a future post will address this!).
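For reference, such an unpaired t-test could be run along these lines with scipy; the drop-off values below are made up for illustration:

```python
from scipy import stats

# Hypothetical per-series drop-offs (installment 2 rating minus installment 1 rating).
drops_exactly_two = [-1.1, -0.8, -0.9, -1.0]  # series with exactly 2 installments
drops_three_plus  = [-0.6, -0.9, -0.8, -0.7]  # series with 3+ installments

t_stat, p_value = stats.ttest_ind(drops_exactly_two, drops_three_plus)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # a large p means no significant difference
```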

Another interesting finding is that the drop-off from 2 to 3 is much smaller than from 1 to 2. This can make sense: 
1) First of all, an IMDB rating can only go so low; you can't keep losing a full rating point every time.
2) It is safe to assume the original movie was seen by a rather diverse crowd, whereas the second might have been seen mostly by the first movie's fan base, and those who saw the third were very likely much the same people. In other words, a much bigger overlap is to be expected between the audiences of the second and third installments than between those of the first and second, which translates into more similar ratings.




More than 3?

For more than three installments, the sample starts to shrink quite rapidly, which is why I went for a more visual approach. I normalized all first installments to a score of 100 in order to track the evolution of all subsequent installments. This graph will allow us to shed some light on some hypotheses from the previous section: ratings go down, but at a slower pace, and might eventually converge as the same hard-core fan base rates the later movies.
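The normalization itself is a one-liner; here is a tiny sketch with made-up numbers, reusing the wide one-row-per-series layout from above:

```python
import pandas as pd

# Hypothetical wide table: one row per series, one column per installment number.
wide = pd.DataFrame(
    {1: [8.0, 7.5], 2: [6.9, 6.0], 3: [6.3, 5.2]},
    index=["series A", "series B"],
)

# Scale each series so that its first installment scores 100.
normalized = wide.div(wide[1], axis=0) * 100

print(normalized.mean())  # average normalized rating per installment
```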



A slightly more confusing plot, where trends are harder to establish but which nonetheless conveys both the initial rating drop-off and the sharp decrease in series length, is the spaghetti plot, where each line represents the evolution of a given series' rating, installment after installment.
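A minimal sketch of such a spaghetti plot, again with made-up normalized ratings:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical normalized ratings (first installment = 100), one row per series;
# NaN marks installments that were never made.
normalized = pd.DataFrame(
    {1: [100.0, 100.0, 100.0], 2: [86.0, 80.0, 91.0], 3: [79.0, 69.0, None]},
    index=["series A", "series B", "series C"],
)

for name, row in normalized.iterrows():            # one faint line per series
    plt.plot(row.index, row.values, color="gray", alpha=0.5)

plt.plot(normalized.columns, normalized.mean(),    # average trend on top
         color="black", linewidth=2, label="average")
plt.xlabel("Installment")
plt.ylabel("IMDB rating (first installment = 100)")
plt.legend()
plt.show()
```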



From the previous two graphs, it appears that the decrease we observed over the first few installments continues until the fourth installment, which is typically a series' all-time low, with installments 5 and 6 usually bouncing back surprisingly. However, as shown in the spaghetti plot, the sample size is severely reduced and it would be dangerous to draw any strong conclusions.

It also seemed that quite a few series ended on surprisingly good ratings, sometimes the best or second best after the first installment:
  • The Rocky series was strictly decreasing from the start: 8.0, 6.9, 6.3, 6.2, 4.7, but finished with 7.2
  • The Rambo series had a similar pattern: 7.5, 6.0, 5.2, and... 7.1!
  • The Harry Potter series had ratings grouped between 7.2 and 7.6 for the first 7 installments, but the eighth and final one finished at 8.1
Could it be that for these series a real effort was made to finish on a strong note? Stallone probably put more thought into the script during the 16-year wait for the last Rocky, whereas the first five came out an average of 3 to 4 years apart. Same for Rambo: a 20-year gap before the last one, compared with roughly three-year gaps between the first movies.

Also, in the cases mentioned above, the last installment was usually announced as such, in which case ratings might also be inflated by fans saddened to witness the final installment.




Closing thought

What have we learned? The primary conclusion is that the common belief that sequels do worse than the original is definitely valid. Although not proven here, the most likely explanation is that Hollywood doesn't care about making terrible movies as long as they generate a profit; more importantly (since good movies are a priori more likely to generate greater profits than terrible ones), they want to involve as little risk as possible. A guaranteed one-million-dollar profit is better than a 50/50 chance of either making three million or losing one.