Using Google search trends to predict box office performance is something that, weirdly, feels like it should be easier than it is. This is because if you are at all interested in movies and analytics, you’ve at some point read the Google whitepaper Quantifying Movie Magic with Google Search. And if you’ve tried to replicate these findings yourself, on a broad library of movies, you’ve likely come to develop a love-hate relationship with it because… it’s just not that simple.
We think about leading indicators for a movie's success a lot, and naturally, Search is a big part of that. But the bigger consideration is the overall mix of signals, of which Search is just one. This is where the Google whitepaper is both inspirational and frustrating. It lays out some really great methodology and concepts, but it also uses a hand-picked pool of movies and proprietary data unavailable to general users, and its intent is to sell Google as an advertising platform to movie marketers, not to generally understand and predict movie performance.
This is a topic we’ll be getting deeper into in the coming weeks, especially as we add FLIQ forecasts to the app. But with the release of a standout performer like Black Panther, it’s a great time to take a retrospective look at how it was tracking coming into box office, and what that tells us about the metrics used by FLIQ – and an opportunity to better understand and refine them.
Black Panther Search Activity
If we look at the Intent category of FLIQ search tracking, we see that Black Panther is actually the highest-scoring movie we’ve ever tracked in the week leading up to release. And if we look at the top 30 rank list in this category, it does track intuitively with (1) high performers in general, and (2) surprise hits. In the latter category, we see It, Deadpool, Beauty and the Beast, Wonder Woman, etc. All of these are huge titles without a doubt, but they are also all titles which substantially exceeded their revenue forecasts.
We also see that while combined Search tracks pretty closely with Intent, both General and Trailer search track quite a bit differently, which speaks to the underlying complexity of this data.
First, a note about how we track and compile these scores. Part of the problem with Google Trends is that it’s somewhat limited: it provides a rating from 1 to 100 that is normalized to your specific query results. That is, if you separately look up search activity for Black Panther (overperformed at the Box Office), for Blade Runner 2049 (underperformed), and even for Going In Style (died on arrival, gruesomely), you cannot directly compare these results to each other. Notice that the high mark of the trend in these charts is always 100:
You can look up multiple terms together, and then the results Google Trends provides do scale relative to each other. That information is a lot more helpful, and we can see just how drastic the difference is:
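The effect of that per-query rescaling is easy to demonstrate with made-up numbers. The sketch below (entirely synthetic volumes, not real Trends data) scales two hypothetical movies' search volumes the way Trends does: queried separately, both peak at 100 and look similar; queried together, the real gap between them appears.

```python
def scale_to_100(series):
    """Rescale a list of raw volumes so its maximum becomes 100,
    mimicking how Google Trends normalizes each query independently."""
    peak = max(series)
    return [round(100 * v / peak) for v in series]

# Hypothetical raw weekly search volumes for two movies (synthetic).
big_hit = [2000, 5000, 12000, 40000]
small_movie = [50, 120, 300, 800]

# Queried separately, both peak at 100 and their shapes look comparable.
print(scale_to_100(big_hit))      # [5, 12, 30, 100]
print(scale_to_100(small_movie))  # [6, 15, 38, 100]

# Queried together, everything is rescaled against the joint maximum,
# so the 50x difference in actual volume becomes visible.
joint_peak = max(big_hit + small_movie)
print([round(100 * v / joint_peak) for v in small_movie])  # [0, 0, 1, 2]
```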
But this is just the first layer of complexity. Not only do we want to compare against MANY more movies; people also often search for a movie using different terms, and for different reasons, and we want to compare this information across movies on the same timeline relative to their Box Office release. So how do we make sense of this complexity and make the data useful?
FLIQ Search Signals
Our approach is to use a reference term. We’ve found some phrases and terms for which search volume has been consistent over the course of the last 15 years. So when we get the search activity for any particular movie, we query that activity together with the activity for a reference term. We can then compare search results from different queries as long as they used the same reference term.
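The reference-term idea can be sketched in a few lines. The example below is a simplification with synthetic numbers (the target score and the reference series are assumptions, not FLIQ's actual values): each query includes the movie plus the reference term, and every query is rescaled so the reference term's score lands at the same fixed value, which makes movie scores from different queries comparable.

```python
REFERENCE_SCORE = 10  # assumed target score for the reference term

def normalize_against_reference(movie_series, reference_series):
    """Rescale a movie's Trends scores so the reference term's mean
    score equals REFERENCE_SCORE, putting all queries on one scale."""
    ref_mean = sum(reference_series) / len(reference_series)
    factor = REFERENCE_SCORE / ref_mean
    return [v * factor for v in movie_series]

# Query 1: a blockbuster dominates, squashing the reference term's score.
blockbuster = normalize_against_reference([20, 45, 100], [2, 2, 2])
# Query 2: a small movie, so the reference term looks relatively large.
small_movie = normalize_against_reference([3, 5, 8], [40, 40, 40])

print(blockbuster)  # [100.0, 225.0, 500.0]
print(small_movie)  # [0.75, 1.25, 2.0]
```

After rescaling, the two movies sit on a shared scale even though they came from separate queries, which is the property the reference term buys us.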
We also use variations of the movie name when searching; sometimes a familiar or franchise name is much more common than the full title, e.g. Harry Potter vs. Harry Potter and the Half-Blood Prince. These results are combined together.
Finally, we break the search into categories like you see here, where Intent Search covers combinations like [Title] + ‘Tickets’ or [Title] + ‘Showtimes’. Trailer Search includes any trailer-related search terms like ‘Teaser’, ‘Preview’, etc. And General covers all other title searches that don’t fall into Trailer or Intent.
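As a rough illustration, the bucketing above might look like the following. The keyword lists here are assumptions for the sketch, not FLIQ's actual term lists:

```python
# Assumed keyword lists -- illustrative only, not FLIQ's real ones.
INTENT_TERMS = {"tickets", "showtimes"}
TRAILER_TERMS = {"trailer", "teaser", "preview"}

def categorize(query, title):
    """Bucket a search query for a given title into a signal category
    based on the words left over once the title itself is removed."""
    extra = query.lower().replace(title.lower(), "").split()
    if any(word in INTENT_TERMS for word in extra):
        return "Intent"
    if any(word in TRAILER_TERMS for word in extra):
        return "Trailer"
    return "General"

print(categorize("Black Panther tickets", "Black Panther"))  # Intent
print(categorize("black panther teaser", "Black Panther"))   # Trailer
print(categorize("Black Panther cast", "Black Panther"))     # General
```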
Calculating just the correlation between these metrics and two key outcomes (total revenue and opening week), we see a strong and clear relationship. The metrics are in general more predictive of opening week, which makes sense just due to proximity – total revenue is also shaped by how long a movie stays in theaters and what happens in that timeframe.
We also see Intent tracking the best out of any metric, which is encouraging and makes sense intuitively. And we see Trailer Search being the weakest predictor.
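The correlation calculation itself is just Pearson's r between a pre-release score and a box-office outcome. A toy version on entirely made-up numbers (not FLIQ's data or results):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical pre-release Intent scores and opening-week revenue ($MM)
# for five made-up movies -- purely illustrative inputs.
intent_scores = [500, 120, 80, 300, 40]
opening_week = [202, 60, 20, 95, 18]

print(round(pearson(intent_scores, opening_week), 3))
```

In practice this is computed per signal (Intent, Trailer, General, combined) against each outcome, which is how the ranking of signals described above falls out.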
There is definitely predictive power in the search data, but it’s substantially more difficult to work with in practice than you’d think at first blush. This is one of the challenges FLIQ is meant to solve: bake in the complexity and provide easy-to-read, digestible figures. What we end up with are clean summaries like these:
You can see from those two charts the story behind Black Panther’s and Blade Runner 2049’s Box Office performance, from a search-as-predictive-signal standpoint.
As mentioned at the top, this is all part of an array of signals FLIQ tracks to make ongoing performance predictions that we’ll be making available in the app soon. But each of these individual signals is important and nuanced and requires a methodology unto itself – which is where we would love to get feedback on how others think about this information.
Check out the app here to see how all the upcoming releases are tracking using these metrics!
©2017 FLIQ.AI - All Rights Reserved