The information management challenges of data monetization

“On the Internet, no one knows you’re a dog” said the famous New Yorker cartoon, back when it seemed that online interactions were inherently anonymous. Today, of course, the Internet knows not just that you’re a dog, but what breed you are, what you ate for breakfast, and when you were neutered. In retrospect, the funniest thing about that New Yorker cartoon is its view of a world that quickly changed.

Data managers are still grappling with this change. Much Web advertising regulation, for example, is still premised on the notion that cookies are anonymous. That premise is what justifies sharing cookies without the consumer consent that businesses need before they can trade in Personally Identifiable Information (PII) such as names and addresses.

But there are many ways to tie cookies to known individuals, a process that often includes “consent” consumers don’t know they’ve granted. Other theoretically anonymous identifiers such as device IDs and IP addresses can also often be connected to PII. And research has shown that even less specific information, such as a collection of taxi trips or a combination of birthdate and ZIP code, is often enough to identify specific individuals.

Regulators are catching on. When the FCC and U.S. Congress recently voted to let broadband suppliers share details of their customers’ Internet and TV activities without permission, their justification wasn’t that such information was anonymous: it was that Google and Facebook could already monetize similarly personal information. European regulators, starting from stronger pro-privacy premises, are moving towards treating cookies and other identifiers as PII precisely because they recognize they can easily be tied to individuals.

Paradoxically, regulations intended to give consumers more control over their data actually force companies to be more thorough about assembling everything they know about each individual. After all, that’s the only way you can let people review and correct or delete their information. Similarly, “right to be forgotten” rules require companies to identify all published information about a person so it can be effectively flushed down the memory hole.

The loss of anonymity isn’t all bad. It lets marketing, sales, and service departments build a more complete picture of each customer. This lets those departments tailor their actions to each customer’s needs. Customers now expect such personalization and many refuse to do business with a company that doesn’t provide it. While consumers generally value privacy in the abstract, most will happily trade personal information for concrete benefits such as price discounts.

The social and business impacts of losing anonymity are intriguing, but data managers face more immediate challenges in managing the transition. Key implications include:

Separation of anonymous and identified customer data

Some customer data will remain anonymous for the foreseeable future – for example, new cookies that have not yet been matched to a known individual. So long as regulations allow different uses for such data, companies will need to keep it separate from identified data. Some companies may choose not to link some information with individual identities so they can use it more freely. Companies may put the different classes of data in different databases and isolate them to ensure there’s no leakage. Or they may manage rights and permissions at the attribute level, an approach that’s more demanding but arguably more efficient. Either way, they’ll need mechanisms to convert unknown to known identities when the information changes, and to convert known to unknown based on customer requests, rule changes, or discovery of false matches.
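
As a rough illustration of the conversion mechanisms described above, the sketch below keeps attributes keyed by cookie and records cookie-to-customer links in a separate table. The class name `ProfileStore`, its methods, and the sample identifiers are all hypothetical; this is one possible arrangement under those assumptions, not a description of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileStore:
    """Hypothetical store: data stays keyed by cookie, links live separately."""
    cookie_data: dict = field(default_factory=dict)  # cookie_id -> attributes
    links: dict = field(default_factory=dict)        # cookie_id -> customer_id

    def record(self, cookie_id, attrs):
        # New observations are always stored against the cookie itself.
        self.cookie_data.setdefault(cookie_id, {}).update(attrs)

    def link(self, cookie_id, customer_id):
        # Unknown -> known: the cookie has been matched to an individual.
        self.links[cookie_id] = customer_id

    def unlink(self, cookie_id):
        # Known -> unknown: customer request, rule change, or false match.
        self.links.pop(cookie_id, None)

    def anonymous_view(self):
        # Only cookies with no link may be used under "anonymous" rules.
        return {c: a for c, a in self.cookie_data.items() if c not in self.links}

    def identified_view(self, customer_id):
        # Merge the data of every cookie currently linked to this customer.
        merged = {}
        for cookie_id, attrs in self.cookie_data.items():
            if self.links.get(cookie_id) == customer_id:
                merged.update(attrs)
        return merged


store = ProfileStore()
store.record("cookie-1", {"last_page": "/pricing"})
store.link("cookie-1", "cust-42")    # data now appears only in the identified view
store.unlink("cookie-1")             # and returns to the anonymous view
```

Because the merged profile is computed from the link table rather than copied, undoing a false match in this sketch is a single deletion and leaves nothing behind in the identified view.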

New methods to identify customers

Customer expectations will require companies to assemble unified profiles across different sources, whether the company wants to or not. This will often require relying on external vendors to provide advanced matching technologies and additional data. For example, cross-device matching relies extensively on data collected from publishers and other sources. Data systems and processes will need to be modified to incorporate these new components.

New uses for identified data

At one time, marketing systems such as Customer Data Platforms worked mostly with identified data, while advertising systems such as Data Management Platforms worked mostly with anonymous data. But advertising is making increasing use of identified information to select ad audiences, set bids on individual impressions, and present the best content to each individual. This implies changes in the ad systems and the data platforms that support them – often including requirements for ultra-quick real time access, which older systems couldn’t offer. The much-ballyhooed convergence of marketing technology and advertising technology is largely driven by the fact that advertisers now need the highly personalized, individually-targeted programs that marketing systems have been running for decades.

Greater controls and security

Widely publicized breaches have already raised the security of customer data to the top of many data managers’ priority lists. But richer, better organized customer data will be an even more valuable prize for attackers. At the same time, broader use of identified data by advertising, marketing, and service systems means customer information must be made more rather than less accessible. Regulatory pressures, especially outside the United States, will impose new requirements for proving consent, managing different attributes separately, keeping audit trails of actual use, providing data access by customers, and documenting adequate security processes. So it’s a safe bet that demands for better data controls will increase.
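
To make the consent and audit-trail requirements concrete, here is a minimal sketch in which every read of a customer attribute is checked against recorded consent for a stated purpose and logged whether or not it succeeds. `ConsentLedger`, the purpose names, and the sample profile are assumptions made for illustration; real regulatory requirements will be more detailed than this.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConsentLedger:
    """Hypothetical consent registry that also logs every access attempt."""
    consents: set = field(default_factory=set)     # {(customer_id, purpose)}
    audit_log: list = field(default_factory=list)  # one entry per attempt

    def grant(self, customer_id, purpose):
        self.consents.add((customer_id, purpose))

    def revoke(self, customer_id, purpose):
        self.consents.discard((customer_id, purpose))

    def read(self, profiles, customer_id, attribute, purpose):
        allowed = (customer_id, purpose) in self.consents
        # Log the attempt either way, so actual use can be shown later.
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "customer_id": customer_id,
            "attribute": attribute,
            "purpose": purpose,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"no consent for purpose {purpose!r}")
        return profiles[customer_id][attribute]


profiles = {"cust-42": {"email": "ann@example.com"}}
ledger = ConsentLedger()
ledger.grant("cust-42", "service")
email = ledger.read(profiles, "cust-42", "email", "service")  # permitted and logged
```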

Allow for shades of gray

There’s no longer a clear line between known and anonymous identities; nor is there always absolute certainty about whether two identifiers refer to the same person. Data managers will need methods to share sensitive information in the least privacy-compromising ways, for example by revealing only the minimum information required for a particular task (e.g., a yes-or-no flag for “over 18” rather than an exact birthdate, or hashed identifiers rather than actual names and addresses). They’ll need to provide different versions of “complete customer data” based on different types of identity matches (for example, including questionable matches to avoid duplicates on marketing lists, but allowing only near-certain matches when making loan decisions). When they do create different customer views based on different inputs and matching rules, they’ll need to help users understand the differences and use each view appropriately.
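
The ideas in this section can be sketched in a few lines of code: an “over 18” flag in place of a birthdate, a hashed identifier in place of an email address, and a confidence cutoff that varies by purpose. The function names, the shape of the match records, and the threshold values are all hypothetical, and the keyed hash is a design choice made for this example rather than anything prescribed here.

```python
import hashlib
import hmac
from datetime import date


def is_over_18(birthdate: date, today: date) -> bool:
    # Share only the yes/no answer, never the birthdate itself.
    years = today.year - birthdate.year
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1  # birthday has not yet occurred this year
    return years >= 18


def hashed_identifier(email: str, secret_key: bytes) -> str:
    # Keyed hash (HMAC-SHA256) of the normalized email; the key is an
    # assumption of this sketch and would have to be agreed with the partner.
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


# Illustrative cutoffs only: loose for list de-duplication, strict for lending.
THRESHOLDS = {"marketing_dedupe": 0.60, "loan_decision": 0.99}


def matches_for(purpose: str, candidate_matches: list) -> list:
    # Return only the identity matches confident enough for this purpose.
    minimum = THRESHOLDS[purpose]
    return [m for m in candidate_matches if m["confidence"] >= minimum]


candidates = [
    {"id": "cust-42", "confidence": 0.995},
    {"id": "cust-77", "confidence": 0.70},
]
marketing_view = matches_for("marketing_dedupe", candidates)  # both records
lending_view = matches_for("loan_decision", candidates)       # cust-42 only
```

A keyed hash is used here because an unkeyed hash of an email address can be reversed by hashing a list of known addresses and comparing the results.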

Don’t be creepy

Customers may broadly assume your company knows everything about them, but they can still be surprised at the data presented in specific situations – especially if that data is wrong. It’s really up to the marketing, sales, and service departments to use customer information most effectively, which often means acting on data without showing the customer you have it. But it can’t hurt for data managers to press gently to avoid unnecessary collection and distribution of detailed customer information. Customers will be happier and company risks will be reduced. Everybody wins.
