The Forrester Muse
APR 17, 2014 5:45pm ET

Big Data Quality: Certify or Govern?


We've been having an interesting conversation, with clients and internally, about the baggage associated with data governance. As much as we (the data people) try, the business thinks it is necessary, but the commitment, participation, and application of it are considered a burden worth avoiding. They wonder, "Is this really helping me?" Even CIOs roll their eyes and have to be chased down when the data governance topic comes up. They can't even sell it to the business.


Comments (3)
Good points in the article. I am glad to see that the data governance and data quality tags are used for this article, as they are interrelated, IMO. There seems to be an industry-wide rush to dump ungoverned and uncleansed data into Big Data stores and let data scientists make sense of, and correct or ignore, non-conforming data. This is not so bad for machine-generated data, as it would be consistently non-conforming/ungoverned/unclean, but integrate or combine data from multiple sources, then add other enterprise/Web sources, and it all becomes overwhelming. Cybersecurity analytics are a good example of this mix. Our company takes a different approach anyway, and I do not know how appropriate it is to this discussion, but we independently index and query data, Big (Hadoop) or otherwise, in a virtualized and federated manner. We cleanse, transform and standardize data as it is indexed, usually using the same schemas as the sources, and then map it to standard data models that can look like relational databases, Big Tables or ontological models. The point is that we take care of poor data governance and data quality without changing source data, and then, optionally, apply the same algorithms to results data that is read from source systems after queries are executed on indexes. At some point, someone has to pay the piper on data governance and data quality: as the data enters the source, as it is copied to a data warehouse or Big Data store (ETL), in the data store (ELT), or, if using the schemaless unstructured approaches typical of Big Data, manually by a data scientist. The involvement of a data scientist prevents BI/analytics from being end-user-oriented, thereby, IMO, diminishing the value proposition.
Posted by Gavin R | Friday, April 18 2014 at 1:40PM ET
I think a "data audit" for compliance with data quality, and certification of it, can drive the data governance point home to the business. I think it is audit and compliance that drive the initiatives that, according to the user community, are not driven by growth. But I think people have to be educated about data governance, and small implementations, process by process and standard by standard, would help them see the value. Also, if the driver's manual were not there, could I still get a driver's license? Even with common sense, don't we need the rules (best practices in DG) for "eventless" governance of information? In a tool-driven, automated world, something has to govern even the tools, doesn't it?
Posted by H R | Monday, April 21 2014 at 11:17AM ET
H R: I agree that education is crucial around data governance. Using every opportunity to do so reinforces the best practice. What we need to be careful about is getting bogged down in the process. The process for data and tools is the role of IT. However, in a business-led data governance effort, the process has to be oriented around value - risk mitigation and business strategy. The issue to date is that even when the business is involved and 'leading', they often look and operate closer to a shadow IT shop than a business unit. This creates stagnation and limited effectiveness - not to mention no real commitment from sr. leadership. Positioning data governance in terms of certification (and I think it goes beyond an audit) is more than a marketing ploy; it changes data governance efforts fundamentally from data form factor to data value. You treat data not as an artifact of the business but as a core unit of the business.
Posted by Michele G | Tuesday, April 22 2014 at 10:39AM ET