In this era of Big Data, there is little that cannot be tracked in our online lives—or even in our offline lives. Consider one new Silicon Valley venture, called Color: it aims to make use of GPS devices in mobile phones, combined with built-in gyroscopes and accelerometers, to parse streams of photos that users take and thus pinpoint their locations. By watching as these users share photos and analyzing aspects of the pictures, as well as ambient sounds picked up by the microphone in each handset, Color aims to show not only where they are, but also whom they are with. While this kind of service might prove attractive to customers interested in tapping into mobile social networks, it also could creep out even ardent technophiles.
Color illustrates a stark reality: companies are steadily gaining new ways to capture information about us. They now have the technology to make sense of massive amounts of unstructured data, using natural language processing, machine learning, and software frameworks such as Hadoop, which distributes the storage and processing of very large data sets across clusters of computers. Messy data of this kind, long relegated to data warehouses, is now the target of data mining. So is the information generated by social networks, such as user profiles and posts. Its quantity is staggering: a recent report from the market intelligence firm IDC estimates that in 2009 stored information totaled 0.8 zettabytes, the equivalent of 800 billion gigabytes. IDC predicts that by 2020, 35 zettabytes of information will be stored globally. Much of that will be customer information. As the store of data grows, the analytics available to draw inferences from it will only become more sophisticated.
It’s no wonder that there are calls for corporations to create positions such as chief privacy officer, chief safety officer, and chief data officer, or that American and European legislators have been considering several kinds of privacy measures. In one bipartisan effort, Senators John McCain and John Kerry have proposed the Commercial Privacy Bill of Rights Act of 2011, which aims, in part, to restrict what online companies can do with customer data. Senator Jay Rockefeller has proposed his own piece of legislation, the Do-Not-Track Online Act of 2011. The European Union’s Article 29 Working Party is addressing similar concerns.
In the private sector, the Digital Advertising Alliance has sought to get ahead of such rule-making by introducing its own privacy framework to assure the security and safety of customer information. Its Self-Regulatory Program for Online Behavioral Advertising comes on the heels of several incidents: Epsilon’s admission that hackers gained access to customer information from clients such as Citigroup, Target, and Walgreens; Sony’s revelation that its PlayStation platform failed to safeguard the account information of up to 100 million customers; and Apple’s confirmation that it uses an unencrypted file stored in iTunes accounts to track movements of individual iPhone users in the physical world.
For all the privacy concerns, the online economy creates enormous value by using customer information. In 2009, according to an ad industry study cited by the Wall Street Journal, the average price of an untargeted ad online was $1.98 per thousand views. The average price of a targeted ad was $4.12 per thousand, more than twice as much. We used to measure the success of websites as if they were portals, by how much traffic they could muster. Now we measure them as social networks, by how much they know about their users. This is why Wal-Mart recently acquired Kosmix, a Silicon Valley startup that filters and finds meaning in vast streams of Twitter messages. Other retailers, along with digital players such as Facebook and Yahoo, are using the technology of another startup, Cloudera, to sort through enormous quantities of behavioral information compiled over years (sometimes decades) in search of insights based on patterns that only machines can fathom. Intelligence generated in these ways can lead to better games from companies like Zynga and better advertising from your favorite brands. David Moore, the CEO of 24/7 Real Media, argues that when an ad is targeted properly, “it ceases to be an ad; it becomes important information.”