JAN 25, 2013 7:56am ET

Interview

Do You Need Big Data Governance? Maybe.


Wouldn’t an issue like data quality raise a similar conflict?

When you get into data quality now, you’re looking at different things, like how to deal with streaming data that’s flowing in, and that’s a different kind of data quality than most people in that field have dealt with. In some examples you are trying to match multiple feeds from different sensors, maybe a temperature sensor and a motion sensor. You might expect the temperature sensor to respond 10 times a second. For some reason you lose three seconds, and that’s potentially a data quality issue. In the book I talk about temporal alignment and the rate of arrival. It’s a different implementation of data quality, though things like metadata still apply. If you’re thinking about clickstream analytics, which is big data, how do you define a unique visitor to a website? How do you define a session, one that is closed or one that is returned to while open? I found many governance issues in that vein that may not have been considered.
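As a rough illustration of the rate-of-arrival check described above, the sketch below flags stretches where a sensor feed goes quiet for longer than expected. It is not taken from the book. The function name `find_gaps`, the millisecond timestamps, the `tolerance` factor and the sample feed are all assumptions made for this example.

```python
from typing import List, Tuple


def find_gaps(
    timestamps_ms: List[int],
    expected_hz: float = 10.0,   # assumed rate: 10 readings per second
    tolerance: float = 1.5,      # assumed slack before a delay counts as a gap
) -> List[Tuple[int, int]]:
    """Return (last_seen, next_seen) pairs where readings arrived too far apart.

    Timestamps are assumed to be sorted, in milliseconds.
    """
    expected_interval_ms = 1000.0 / expected_hz
    threshold_ms = expected_interval_ms * tolerance

    gaps = []
    # Compare each reading with the one that follows it.
    for prev, curr in zip(timestamps_ms, timestamps_ms[1:]):
        if curr - prev > threshold_ms:
            gaps.append((prev, curr))
    return gaps


# Hypothetical feed: one reading every 100 ms for a second,
# three seconds of silence, then readings resume.
readings = list(range(0, 1001, 100)) + list(range(4000, 4500, 100))

for start, end in find_gaps(readings):
    print(f"No readings between {start} ms and {end} ms")
```

How much delay counts as a gap is itself a policy choice, which is why `tolerance` is a parameter here and not a fixed value.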

In your book you seem to use dictionaries and metadata as the connecting point of where these things can be aligned. Is that a kind of overlay or abstraction as opposed to an attempt to conform the data?

Yes, exactly, and if you want to align a customer’s Twitter feeds with their master record, you still have to define what a customer is. You think about whether customers are prospects or active clients just like in any other system.

What are some of the unknowns in big data governance companies need to manage before they take their experiments out of quarantine and into production?

First, you are right, I haven’t seen a lot of companies ready to integrate their big data governance policies with the rest. There are just so many things that need to be understood first, which is why governance is there in the first place. If I work in credit, can I use your Twitter account to make a loan decision? If I am in collections, can I use Facebook info under the Fair Debt Collection Practices Act? You definitely have to start writing policies by jurisdiction. The state of Maryland and others now have policies that don’t allow employers to use social data to pre-screen candidates. There are concerns that a lot of social media contains protected information like age, race, gender or sexual orientation. You cannot consult social media and later claim you didn’t discriminate with that knowledge.

It reminds me of some of the unintended consequences marketers have experienced using analytics against customer records that backfired after they dug too deeply into a person’s history.

I think that’s a similar challenge for big data because so many regulations are evolving. There’s also reason to worry about reputational backlash if you cross a line with social information that is also deemed personal. I advise clients to be conscious in both the regulatory and reputational areas but remind them big data comes in many types they can take advantage of. That can also be a problem when you start integrating multiple types of data and focus a lot of analytical power that can push the edges of privacy, but again, that’s where governance comes in. It will be interesting to follow how people in privacy and legal departments will have their own take on data governance and risk.

Jim Ericson is editorial director of Information Management, a SourceMedia publication. You can reach him at Jim.Ericson@sourcemedia.com. Follow him on Twitter at @jimericson.
