The problems of master data management sometimes seem to be intractable.

It is difficult to keep master data synchronized with business reality, deduplication of records that exist for the same entity instance is a perennial issue, and the quality of non-key data attributes often cannot be determined with certainty. The traditional MDM paradigm has been to take data produced in transaction applications and try to automate its integration and cleansing.  Such "transaction adaptation hubs" have brought the promise of  clever algorithms that could do all the work to arrive at a single golden record for master data.

Alas, this has not turned out to be so. The algorithms in traditional MDM hubs have done well, but in many cases they have not done well enough. One response has been to add functionality to these hubs that permits data stewards to detect and correct data errors. However, this creates additional issues, such as when the same data is not corrected at the same time in the corresponding upstream transaction applications.

Another approach has been to separate the production of master data from its distribution. Central hubs remain for distribution purposes, but master data is created in specialized environments. The specialized environments are like farms where crops of master data are grown, and the hubs are akin to markets, where the master data is taken once it is production-ready. This pattern works reasonably well when the master data is about specialized, high-value entities, such as institutional clients in brokerage businesses. However, there can still be problems, such as failing to detect changes in attribute values. This approach is also very difficult to scale when there are many entity instances to be mastered, such as in retail banking.

Who Owns the Data?

Once we recognize the problems in the traditional approaches to MDM, what can we do about them?  First of all, we have to ensure that data quality is high when it is first captured by the enterprise. However, are we taking this idea far enough?

Let us ask a question about customer data: Who owns it?  If I ask an IT person, he or she will likely identify one or more business users. But this is a problem.  To the IT mindset, “data owner” typically means one or both of two things:

(a) A data owner is someone who has some form of governance responsibility for the data. It is almost never explicitly stated what such responsibility involves. “Owner” is an analogy; the “owner” is expected to look after the data as if he or she were in possession of it. However, such an expectation is often unrealistic. How can such an “owner” of customer data always be expected to know if the data is right or wrong?

(b) A data owner is somebody who can give IT requirements so that IT can do its job. In other words, in data-centric development projects, “data owner” is a mere substitution for the “business sponsor” in the old systems development lifecycle.

Let us ask the question again: Who really owns customer data? We propose that, in reality, each customer owns it, not any one person in the enterprise. If we truly subscribe to this viewpoint, it has profound implications for MDM.

Data Owner Driven Architecture and Governance

Many data managers might think this is fairly obvious, and it is. Who knows the customer’s basic information better than the customer themselves? Who finds out earlier about changes to this information than the customer involved in said changes? Yet, if it is so obvious, why do our MDM architecture and governance processes seem to be based on different paradigms? MDM hubs that capture data from transaction applications are not capturing data directly from the customer. By definition, these hubs cannot be updated with customer data unless the customer is involved in a transaction.

More importantly, especially in light of evolving laws about data management, who “owns” the information about a customer, if not the customer themselves? Surely, Malcolm Chisholm owns the information about Malcolm Chisholm and Fabio Corzo owns the information about Fabio Corzo. And if it is true that a customer owns his or her own information, what does this mean for how an enterprise should treat a customer? Furthermore, with ownership comes responsibility. What are the individual customer's responsibilities when it comes to their own information?

Clearly, some of these issues will take many years to sort out. In the meantime, enterprises can begin to consider what it might mean for their MDM architecture and governance processes if they were to take seriously the idea that customers are the true owners of their own data.

If this were an accepted idea, it is difficult to see how the implementation of a transaction adaptation hub would apply this principle. An enterprise would want to find ways to get as close to the customer as possible, in terms of customer data management. If a customer were able to directly manage their data, there would be far less need for probabilistic trust and survivorship rules, which exist to guess which of several conflicting source values is correct. Why should customers not be able to provide demographic information about themselves, or update life events directly?
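To make the contrast concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular MDM product: the `AttributeValue` record, the trust scores, the source names and the `survive` and `resolve` functions are all hypothetical, and the survivorship rule shown (highest trust, then most recent) is just one simple example of the kind of rule a hub might apply.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class AttributeValue:
    """One candidate value for a master data attribute (hypothetical model)."""
    value: str
    source: str    # hypothetical name of the system that supplied the value
    as_of: date    # date the value was captured
    trust: float   # hypothetical trust score assigned to the source, 0..1


def survive(candidates: Sequence[AttributeValue]) -> AttributeValue:
    """Hub-style survivorship: pick the highest trust, then the most recent."""
    return max(candidates, key=lambda c: (c.trust, c.as_of))


def resolve(candidates: Sequence[AttributeValue],
            owner_value: Optional[AttributeValue] = None) -> AttributeValue:
    """Owner-driven resolution: a value asserted by the customer wins outright.

    Survivorship is only a fallback for when the customer has said nothing.
    """
    if owner_value is not None:
        return owner_value
    return survive(candidates)


# Two transaction applications disagree about a customer's address.
candidates = [
    AttributeValue("12 Elm St", "billing", date(2011, 3, 1), 0.8),
    AttributeValue("98 Oak Ave", "crm", date(2012, 6, 15), 0.6),
]

# The hub trusts billing more, so the older address survives.
hub_choice = survive(candidates)

# The customer confirms the address through a (hypothetical) self-service portal.
owner = AttributeValue("98 Oak Ave", "customer_portal", date(2012, 7, 1), 1.0)
owner_choice = resolve(candidates, owner)

print(hub_choice.value)
print(owner_choice.value)
```

In this sketch the survivorship rule keeps the older address because it came from the more trusted system, while the owner-driven path simply takes the customer's word. The rule is not useless, but it is reduced to a fallback for attributes the customer has not maintained.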

Current State and Looking Forward

Many enterprises reason that traditional architecture and governance approaches are the best option, given the decades of organic growth in data architecture. But why? The growth of the Internet and the rise of social media have brought people into direct contact with enterprises, in terms of data, and have shown that individuals are quite willing to maintain their own profiles. Of course, there are issues to be worked out; people are not always accurate in describing themselves and may be sloppy in the way they maintain their personal data. However, we believe this overall approach is superior to what we have today.

We have focused on the example of customer data here, but this concept can be applied to other master data subjects. Find the true owner of the data, rather than the owner of a data store; let them manage the master data; and adjust architecture and governance to fit this approach.
