MDM and Next-Generation Data Sources
New data sources and real-time analytics continue to change the IT landscape as never before. Big data and other new information management technologies now allow organizations to uncover relationships that have long been buried in their data stores.
Now, top master data management expert Aaron Zornes, chief research officer of The MDM Institute, sees another new trend emerging: master relationship management.
Zornes, who talked to Information Management as he was preparing for the October 20-22 MDM and Data Governance Summit in New York, explores the area of master relationship management and warns that, despite the promise of data management innovations, the basics of data governance and master data management can’t be neglected.
Information Management: Master data management is a well-established industry that continues to evolve. What excites you about MDM today, and what’s on the horizon?
Aaron Zornes: Well, like all technologies, at least information technologies, they never sit still, they’re constantly evolving, morphing and, in particular, they need to evolve to address the next generation of data that’s beginning to flood companies. We’ve all heard the words big data, but let’s be more specific, let’s talk about mobile data, GPS coordinates, location awareness data, social interactions - you know, who you network with and what’s the value of your network of associates, that sort of relationship data. Likewise, all the transactionalization of society, I mean every time you go somewhere and buy something, that’s captured. Likewise, every time you visit a store or website, all that data is captured, and that’s all data that needs to be cleansed and related to somebody and some thing. The next generation of data related to mobile, cloud and social is taking place at a much faster pace, so the clock has been sped up. And so, increasingly, people expect real time. But it’s not just capturing real-time data about various people and things, it’s also serving up real-time actionable insights or analytics as a result of that data. It’s one thing to be flooded with data; it’s another thing to be able to make sense of it and then be able to act on it or make recommendations for a human or another system to act on it. Big data by itself is just a lot of noise, so you need big data analytics to make sense of it. In order to have big data analytics be effective, to take advantage of all these new sources of next-generation data, you want to have MDM and be able to do what’s called identity resolution to tell you who is talking about what. What’s next on the horizon is being able to accommodate the next generation of data sources and not just be able to clean them up but be able to analyze them. 
Increasingly you’ll find analytics are required to make sense of the big data flood, and likewise, the analytical information that comes out, once it’s determined as a probability, needs to be shared as master-type data, as part of that profile or view about that person or that company or that thing. And again, master data management cleans it up, organizes it, publishes it and shares it accordingly.
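To make the idea of identity resolution concrete, here is a minimal Python sketch. It is not how any particular MDM product works: the function names, the record fields and the choice of a normalized email address as the matching key are all assumptions made for illustration. Real identity resolution typically relies on fuzzier, probabilistic matching across many attributes.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Hypothetical matching key: trim whitespace and lowercase the address."""
    return email.strip().lower()


def resolve_identities(records):
    """Group records from different sources that share a normalized email.

    Each group stands in for one "master" profile of a person.
    """
    profiles = defaultdict(list)
    for record in records:
        profiles[normalize_email(record["email"])].append(record)
    return dict(profiles)


# Illustrative records only; sources and fields are invented for this sketch.
records = [
    {"source": "crm", "name": "Pat Lee", "email": "Pat.Lee@example.com"},
    {"source": "web", "name": "P. Lee", "email": " pat.lee@example.com "},
    {"source": "mobile", "name": "Sam Roy", "email": "sam.roy@example.com"},
]

profiles = resolve_identities(records)
for key, matched in profiles.items():
    print(key, [r["source"] for r in matched])
```

In this toy data, the CRM and web records collapse into a single profile because their email addresses match once normalized, while the mobile record stays separate.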
With all the new sources and types of data, how are the mega-trends of big data, social, mobile and cloud driving change in the data governance landscape?
In the data governance landscape, we still don’t really have data governance of our classical MDM data under control yet. And so just baby steps at the moment. We’re just beginning to get products that can look at your traditional MDM sources and help you manage that, and even that’s wild and wooly because if you look at a typical large company with 14,000 or so databases on average or some wild number like that, to be able to understand the landscape of where all the data is and who manages what, who’s responsible for what, and where is the redundancy and where’s the best source, that sort of data governance by itself is quite a mammoth project to take on. Not that we want to document everything, but to be able to just organize ourselves so we can execute on classical MDM. And if we’re talking about next-generation MDM, the next-generation data sources, it gets even more wild and wooly because a lot of the data sources are out of our control. They belong to LinkedIn, Twitter, Facebook, Google Plus, Xing, to all these other social networking systems, not to mention the third parties that own that GPS or location awareness data like the credit card merchants, likewise the cell phone providers. All that information is owned and managed by other people. So you don’t necessarily have the vested authority to manage it to the degree you would like as an IT organization. Now, what’s happening to meet that need is there’s a number of brokers or third-party go-betweens that are cleaning up [data] to make it more available to the consumers of that information. Increasingly, those people will help us get our arms around, if you like, the sourcing, the cleansing, and the validation, verification, certification, etc. of this flood of next-generation data.
When you look back to the very beginning of the MDM Summits, in North America that was eight years ago, what surprises you about the way that the events and the industry have evolved in that time span?
Well, I look at the formal job titles of people in particular because it’s important for us to understand which people are doing MDM or influencing MDM and data governance, and who has the decision-making and purchasing power, and who’s doing these systems, who benefits from them and who’s developing and implementing them. So what’s been a very big surprise is the flood of job titles that have evolved over the last couple of years to address MDM and data governance. It is very, very remarkable the number of people who have data governance in their job title and, likewise, the number of people who have master data or MDM in their job titles. If you were to look at any of us consultants who do IT consulting in general, and if you look at the number of people that have MDM or master data or data governance in their job title, it’s easily half of the attendees at our conferences who have that formal job title. Those job titles did not exist a couple of years ago, and so that’s a great indicator of the uptake or historical documentation.
And what about the technology?
I would say also what’s remarkable is the new architectures that have been coming along. Just like the database itself evolved into distributed database, semantic database, object database, all these different variants, likewise MDM is permutating into a number of different architectures like cloud MDM and federated MDM, cloud being simply that you’re hosting your MDM processes in the cloud and/or your master data in the cloud. And then likewise the federated architectures: how do we have a global company like Apple or IBM or HP unify their systems and share master data across those large distributed IT and data centers and centers of applications that are deployed out there? And then there’s a requirement now also to unify or repatriate all the different permutations into yet another type of hub. So we have these classic MDM hubs (some people like to be pejorative and call them legacy MDM, but they’re not legacy, they’re current, contemporary, they’re doing the heavy lifting in companies), like the IBM MDM, Oracle and SAP MDM hubs. Then you’ve got some of the newer cloud MDM hubs, and then you’ve got the need to bring that all together into a super hub or hub of hubs or uber hub; there are various names for it. And so the idea, again, is how to repatriate, re-weave, bring back together the various permutations of MDM data and MDM sourcing. And then further, there’s a set of new database technologies for some of the large Web-scale type companies. So if you look at Cisco, Facebook, Twitter, LinkedIn, these sorts of companies are using something called graph database. And graph database is wildly different from relational database, and it’s very good at managing relationships, which are horribly hard to do in a relational database except for the simplest of relationships. So graph databases are a very interesting technology that is coming, and in particular that’s enabling the next generation of MDM, which we’re now calling master relationship management.
Because what we’re really interested in is the relationships between the data. How many of this does that citizen or consumer own? What are the relationships between this consumer and these other consumers as a key opinion leader and these other people? What is the relationship between this doctor and these other doctors as a key influencer? What is the relationship between this bad guy and these other bad guy groups?
So master relationship management is something that is very difficult to do with current database technology and the current MDM platforms, but the graph database provides that possibility, and it’s being proven out already by a certain number of vendors. But meanwhile it’s not that easy to change wholesale from one database technology to another because it’s like pulling the rug out from underneath. [But] there’s a lot of work to be done across the planet in terms of just the basics of MDM. People always ask what’s new and what’s wild and science fictiony, but the basics are fundamental -- you gotta do it.
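The kind of relationship questions Zornes describes can be illustrated with a small in-memory graph. This is only a sketch in plain Python, not a graph database and not any vendor's product: the class name `RelationshipGraph`, its methods and the sample doctors are hypothetical, and "key influencer" is reduced here to the simplest possible measure, the person with the most direct connections.

```python
from collections import defaultdict, deque


class RelationshipGraph:
    """Toy undirected graph of people; illustrative only."""

    def __init__(self):
        self.edges = defaultdict(set)

    def relate(self, a, b):
        """Record a two-way relationship between a and b."""
        self.edges[a].add(b)
        self.edges[b].add(a)

    def key_influencer(self):
        """Return the person with the most direct relationships."""
        return max(self.edges, key=lambda person: len(self.edges[person]))

    def hops_between(self, start, goal):
        """Breadth-first search: fewest relationship hops, or None if unconnected."""
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            person, distance = queue.popleft()
            if person == goal:
                return distance
            for neighbor in self.edges.get(person, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, distance + 1))
        return None


# Invented sample data for the sketch.
graph = RelationshipGraph()
graph.relate("Dr. A", "Dr. B")
graph.relate("Dr. A", "Dr. C")
graph.relate("Dr. A", "Dr. D")
graph.relate("Dr. D", "Dr. E")

print(graph.key_influencer())
print(graph.hops_between("Dr. B", "Dr. E"))
```

Following connections hop by hop is a direct traversal here, whereas in a relational model each additional hop usually means another self-join, which is one reason multi-step relationship questions get hard to express there.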