I hit
the pause button roughly one-third of the way through the first episode
of “House of Cards,” the political drama premiering on Netflix Feb. 1.
By doing so, I created what is known in the world of Big Data as an
“event” — a discrete action that could be logged, recorded and analyzed.
Every single day, Netflix, by far the largest provider of commercial
streaming video programming in the United States, registers hundreds of
millions of such events. As a consequence, the company knows more about
our viewing habits than many of us realize. Netflix doesn’t know merely
what we’re watching, but when, where and with what kind of device we’re
watching. It keeps a record of every time we pause the action — or
rewind, or fast-forward — and how many of us abandon a show entirely
after watching for a few minutes.
I personally hit the pause button — I was checking on my sick son, home
from school with the flu — but if enough people pause or rewind or
fast-forward at the same place during the same show, the data crunchers
can start to make some inferences. Perhaps the action slowed down too
much to hold viewer interest — bored now! — or maybe the plot became too
convoluted. Or maybe that sex scene was just so hot it had to be
watched again. If enough of us never end up restarting the show after
taking a break, the inference could be even stronger: maybe the show
just sucked.
legally delivered via the Internet than on physical formats like
Blu-Ray discs or DVDs. The shift signified more than a simple switch in
formats; it also marked a major difference in how much information the
providers of online programming can gather about our viewing habits.
Netflix is at the forefront of this sea change, a pioneer at the
intersection of Big Data and entertainment media. With
“House of Cards,” we’re getting our first real glimpse at what this new
world will look like.
For
at least a year, Netflix has been explicit about its plans to exploit
its Big Data capabilities to influence its programming choices. “House
of Cards” is one of the first major test cases of this Big Data-driven
creative strategy. For almost a year, Netflix executives have told us
that their detailed knowledge of Netflix subscriber viewing preferences
clinched their decision to license a remake of the popular and
critically well-regarded 1990 BBC miniseries. Netflix’s data indicated
that the same subscribers who loved the original BBC production also
gobbled down movies starring Kevin Spacey or directed by David Fincher.
Therefore, concluded Netflix executives, a remake of the BBC drama with
Spacey and Fincher attached was a no-brainer, to the point that the
company committed $100 million for two 13-episode seasons.
“We
know what people watch on Netflix and we’re able with a high degree of
confidence to understand how big a likely audience is for a given show
based on people’s viewing habits,” Netflix communications director
Jonathan Friedland
told Wired in November.
“We want to continue to have something for everybody. But as time goes
on, we get better at selecting what that something for everybody is that
gets high engagement.”
The strategy has advantages that go beyond
the assumption of built-in popularity. Netflix also believes it can
save big on marketing costs because Netflix’s recommendation engine will
do all the heavy lifting. Already, Netflix claims that 75 percent of
its subscribers are influenced by its suggestions of what
they will like.
“We don’t have to spend millions to get people to tune into this,” Steve Swasey, Netflix’s V.P. of corporate communications,
told GigaOm last March.
“Through our algorithms we can determine who might be interested in
Kevin Spacey or political drama and say to them, ‘You might want to
watch this.’”
And maybe we will. Early reviews for “House of
Cards” are promising. It will be fascinating to find out how many people
gorge themselves on all 13 episodes this upcoming weekend. (Netflix
data shows that’s how we like to consume our TV series now — in great
gulps and marathons — so that’s how it will give them to us.) But one
does end up wondering: What will the Big Data approach mean for the
creative process? If Netflix perfects the job of giving us exactly what
we want, when and how will we be exposed to things that are new and
different, the movies and TV shows we would never imagine we might like
unless given the chance? Can the auteur survive in an age when computer
algorithms are the ultimate focus group? And just how many political
dramas starring Kevin Spacey can we stand, anyway?
The scope of
the data collected by Netflix from its 29 million streaming video subscribers is
staggering.
Every search you make, every positive or negative rating you give to
what you just watched, is piped in along with ratings data from
third-party providers like Nielsen. Location data, device data, social
media references, bookmarks. Every time a viewer logs on he or she needs
to be authenticated. Every movie or TV show also has its own associated
licensing data. The logistics involved with handling every bit of
information generated by Netflix viewers — and making sense of it — are
pure geek wizardry.
Netflix
doesn’t just know that you are more likely to be watching a thriller on
Saturday night than on Monday afternoon, but it also knows what you are
more likely to be watching on your tablet as compared to your phone or
laptop; or what people in a particular ZIP code like to watch on their
tablets on a Sunday afternoon. Netflix even tracks how many people start
tuning out when the credits start to roll.
Correlating the raw
numbers of Kevin Spacey fans who also like David Fincher and have a
fondness for British political dramas is just the beginning. Netflix
knows enough about what you are watching to judge specific aspects of
content as well. Last summer senior data scientist Mohammad Sabah
reported at a conference
that Netflix was capturing specific screen shots to analyze
in-the-moment viewing habits, and the company was “looking to take into
account other characteristics.”
What could those characteristics
be? GigaOm’s report of the Sabah presentation speculated that “it could
make a lot of sense to consider things such as volume, colors and
scenery that might give valuable signals about what viewers like.”
Netflix chief content officer Ted Sarandos
has said
that all that data means that Netflix has a very “addressable
audience.” Unlike the traditional broadcast networks or cable companies,
Netflix doesn’t have to rely on shoveling content out into the wild and
finding out after the fact what audiences want or don’t want. It
believes it already knows.
Of course, data-centric decisions don’t guarantee hit-making success. Kevin Spacey’s participation isn’t bulletproof (see
“Fred Claus”)
and even David Fincher can’t claim a perfect record. (“Alien 3,”
anyone?) Netflix’s ambition to challenge HBO as a destination for
quality original programming will require fabulous craftsmanship to go
along with the Big Data filters. All the Big Data in the world can’t
rule out, once and for all, the possibility of a bomb.
But that
goes without saying. The interesting and potentially troubling question
is how a reliance on Big Data might funnel craftsmanship in particular
directions. What happens when directors approach the editing room armed
with the knowledge that a certain subset of subscribers are opposed to
jump cuts or get off on gruesome torture scenes or just want to see blow
jobs? Is that all we’ll be offered? We’ve seen what happens when news
publications specialize in just delivering online content that maximizes
page views. It isn’t always the most edifying spectacle. Do we really
want creative decisions about how a show looks and feels to be made
according to an algorithm counting how many times we’ve bailed out of
other shows?
For years Netflix has been analyzing what we watched
last night to suggest movies or TV shows that we might like to watch
tomorrow. Now it is using the same formula to prefabricate its own
programming to fit what it thinks we will like. Isn’t the inevitable
result of this that the creative impulse gets channeled into a pre-built
canal?
It’s certainly possible to overstate the case here. One
could argue that Netflix’s strategy is only a slightly more
sophisticated version of what’s already been in place for, well,
forever. We wouldn’t be seeing teenage vampires or zombies every time we
turn on the TV if the money that bankrolls the content creation
business hadn’t already decided that’s what we want to see. Actors who
have the fortune to appear in hit movies or TV shows get more parts to
play. So what else is new?
But there’s a level of specificity made
possible by Big Data that suggests we’re headed into new territory.
“House of Cards” is just one symptom of a society-wide shift. The Obama
campaign used the same kind of number crunching to target voters with
more accuracy than any political campaign had ever accomplished before.
Online advertisers are also gathering vast amounts of detailed
information about us from our smartphones, our Facebook likes and our
Google searches.
The sheer amount of data available to crunch is
already phenomenal and is growing at an extraordinary rate. Last summer,
at a panel discussion that included several significant players in the
emerging Big Data universe, Michael Karasick, a V.P. at IBM Research,
estimated
that there is “a thousand exabytes of data on the planet anywhere.” An
exabyte is one quintillion bytes, or one billion gigabytes. That’s a lot of
ones and zeroes all by itself, but the mind-boggling part of the
equation is that Karasick predicted that just two years from now there
will be 9,000-10,000 exabytes of data on the planet.
The companies
that figure out how to generate intelligence from that data will know
more about us than we know ourselves, and will be able to craft
techniques that push us toward where they want us to go, rather than
where we would go by ourselves if left to our own devices. I’m guessing
this will be good for Netflix’s bottom line, but at what point do we go
from being happy subscribers to mindless puppets?