Friday, May 08, 2009
The Big Money's interactive tool measures whether critics affect box-office receipts.
By: Chris Wilson | Slate.com
A week before the much-anticipated movie Watchmen came out, my colleague Chadwick Matlin and I were wondering whether we could predict how well it would do at the box office based on the lukewarm reviews it received. The movie was virtually certain to be the highest-grossing movie that weekend, given the tremendous hype and the lack of credible competitors debuting at the same time, but our theory was that it would take a major hit in sales in subsequent weeks as word of how bad it was got around. (I later made the mistake of seeing the movie, and I'm inclined to agree.)
To test this theory, we decided to gather box office info from lots of recent movies and measure the percentage of their total gross revenue they made in the first weekend. The theory was that well-reviewed movies would have a low percentage, since people would keep attending well after the opening weekend; overhyped bad movies, meanwhile, would quickly fizzle. To help visualize the data, I built this tool in Flash to plot every movie we looked at on two axes: percent of total gross from the opening weekend (vertical) and average review score on Metacritic.com (horizontal). If our theory was correct, most dots would group along the diagonal line from the top left to the bottom right, a basic inverse correlation. I also colored the dots by genre, in case that was important.
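For readers who want to try the calculation themselves, here is a minimal sketch in Python. It is an illustration only, not the code behind the Flash tool: it computes each movie's opening-weekend share of total gross, then a Pearson correlation between review score and that share. The function names and the three movies are hypothetical, and the numbers are invented placeholders chosen to produce a perfect inverse correlation.

```python
from math import sqrt


def opening_share(opening_gross, total_gross):
    """Percent of a movie's total gross earned in its opening weekend."""
    if total_gross <= 0:
        raise ValueError("total_gross must be positive")
    return 100.0 * opening_gross / total_gross


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Invented placeholder data (gross in millions), NOT real box-office figures.
movies = [
    {"title": "Movie A", "score": 30, "opening": 50.0, "total": 100.0},
    {"title": "Movie B", "score": 60, "opening": 30.0, "total": 100.0},
    {"title": "Movie C", "score": 90, "opening": 10.0, "total": 100.0},
]

scores = [m["score"] for m in movies]
shares = [opening_share(m["opening"], m["total"]) for m in movies]
r = pearson_r(scores, shares)
print(f"correlation between review score and opening share: {r:.2f}")
```

A value of r near -1 corresponds to dots hugging the top-left to bottom-right diagonal; a value near 0 corresponds to a jumble.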
At first, the dots looked like a big jumble that betrayed no correlation at all. But since there appeared to be a more ordered pattern among just the action movies, I added a feature that allowed users to toggle genres on and off using the check boxes at the top. Lo and behold, when you hid every genre except action, a rather distinct correlation emerged in just the direction we predicted.
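The genre toggle amounts to filtering the points before looking for a trend. The Python sketch below shows that idea under stated assumptions: the `filter_by_genre` helper, the data layout, and every number are hypothetical, invented so that one genre shows a clean inverse correlation while the other shows none. It does not reproduce the tool's actual data or code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def filter_by_genre(points, enabled_genres):
    """Keep only the points whose genre is currently switched on."""
    return [p for p in points if p["genre"] in enabled_genres]


def correlation(points):
    """Correlation between review score and opening-weekend share."""
    return pearson_r([p["score"] for p in points],
                     [p["share"] for p in points])


# Invented placeholder points: review score and opening-weekend share (%).
points = [
    {"genre": "action", "score": 30, "share": 50.0},
    {"genre": "action", "score": 60, "share": 30.0},
    {"genre": "action", "score": 90, "share": 10.0},
    {"genre": "comedy", "score": 30, "share": 20.0},
    {"genre": "comedy", "score": 60, "share": 40.0},
    {"genre": "comedy", "score": 90, "share": 20.0},
]

action_only = filter_by_genre(points, {"action"})
comedy_only = filter_by_genre(points, {"comedy"})
r_action = correlation(action_only)
r_comedy = correlation(comedy_only)
r_all = correlation(points)
print(f"action: {r_action:.2f}, comedy: {r_comedy:.2f}, all: {r_all:.2f}")
```

With every genre switched on, the combined correlation in this toy data is weaker than the action-only one, which is the kind of masking effect the check boxes are meant to expose.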
In the course of collecting all the data for this tool, we ended up with a lot of raw numbers (total revenue, revenue per theater, and so forth) that didn't make it into the original graph. But one of my favorite aspects of data-driven journalism is that, when there's too much information to process, you can build a simple way for readers to parse through the numbers on their own and draw their own conclusions. So I added the dropdown menu at the top left, which lets users choose which dataset they want to see represented on the y-axis. (Because this was developed in ActionScript, I could make use of the "tween" function that gives you that elegant transition from one dataset to the next.) We didn't find any strong trends in the other datasets, but that doesn't mean someone else won't.
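The transition between datasets can be thought of as interpolating each dot's y-value from its old position to its new one over a series of frames. The Python sketch below illustrates that idea with plain linear interpolation. It is an assumption-laden stand-in, not ActionScript's tween function, which also supports easing curves; the `tween_frames` name and the sample values are hypothetical.

```python
def tween_frames(start, end, steps):
    """Return steps + 1 frames moving each value linearly from start to end.

    start, end: equal-length lists of y-values (one per dot).
    steps: number of intervals; the first frame equals start, the last equals end.
    """
    if len(start) != len(end):
        raise ValueError("start and end must have the same length")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    frames = []
    for i in range(steps + 1):
        t = i / steps  # progress from 0.0 to 1.0
        frames.append([a + (b - a) * t for a, b in zip(start, end)])
    return frames


# Invented placeholder y-values for two dots under two different datasets.
old_y = [0.0, 10.0]
new_y = [10.0, 0.0]
frames = tween_frames(old_y, new_y, 2)
for frame in frames:
    print(frame)
```

Drawing each frame in turn makes the dots glide to their new positions instead of jumping.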