Graph databases and machine learning will revolutionize MDM strategies
Every organization today is in the “data business,” continuing to collect volumes of information about its customers. When organized effectively, by integrating data from a variety of sources through master data management (MDM), this data can provide important and actionable insights.
Some enterprises are tackling this challenge with a traditional approach to MDM, which limits what can be done and how quickly they can do it. But coming soon to an MDM hub near you are two key enabling technologies that will augment, and in some cases re-invent, MDM as we know it: graph databases and machine learning (ML).
With that perspective in mind, here is more on these two top trends.
Graph databases conceptually threaten MDM because of their ability to simplify complexity. But they are also augmenting MDM and data governance through the user interface and query layers. Here are some implications:
- Simpler modeling of complex relationships yields more human-friendly user interfaces for all concerned. This model agility and extensibility enable users to add new data dimensions, hierarchies and linkages easily and quickly.
- Graph-based querying also simplifies and accelerates the discovery of data relationships critical to “system of engagement” style systems.
- In practice, most purist graph-specific implementations are either (a) unable to handle high transaction volumes in operational mode, or (b) used primarily as adjuncts to operational/transactional MDM hubs in order to cross-walk or analyze across domains.
- Near term, graph databases will continue to enhance the delivery of data mastering.
- Start-ups and others that are not mega vendors are more capable of introducing and leveraging such capabilities. For examples, see Reltio and Semarchy for data modeling and querying (as well as other UI aspects).
- As MDM evolves towards “master relationship management” via graph technology, analytical upstarts from the graph world will increasingly add operational capabilities for performance and robustness.
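To make the “cross-walk across domains” idea concrete, here is a minimal sketch in plain Python rather than in any particular graph database. The records, relationship names and the `cross_walk` helper are all hypothetical, invented for illustration; a real graph engine would express the same traversal in its own query language.

```python
from collections import deque

# Hypothetical master records from several domains, stored as typed edges.
EDGES = [
    ("customer:alice", "PLACED", "order:1001"),
    ("order:1001", "CONTAINS", "product:widget"),
    ("product:widget", "SUPPLIED_BY", "supplier:acme"),
    ("customer:bob", "PLACED", "order:1002"),
    ("order:1002", "CONTAINS", "product:gadget"),
]


def build_adjacency(edges):
    """Index edges by source node so each hop is a dictionary lookup."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
    return adjacency


def cross_walk(adjacency, start, target_domain):
    """Breadth-first search from `start` to every node in `target_domain`.

    Returns a dict mapping each reachable target node to the list of
    relationship names on the path that leads to it.
    """
    found = {}
    visited = {start}
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if node != start and node.startswith(target_domain + ":"):
            found[node] = path
        for relation, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [relation]))
    return found


adjacency = build_adjacency(EDGES)
alice_suppliers = cross_walk(adjacency, "customer:alice", "supplier")
print(alice_suppliers)
```

Adding a new domain or linkage here means appending edges, not redesigning a schema, which is the model agility described in the list above.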
Machine learning is stealing the spotlight in MDM and data governance. Consider:
- Scalability, complexity and agility are only some of the problems increasingly being solved by machine learning.
- Start-ups and others that are not mega vendors are more capable of introducing and leveraging such capabilities. For examples, see Tamr for scalable metadata mastering, Fresh Gravity for GDPR expert system guidance for enterprise-strength anonymization, etc.
More on machine learning for MDM
Traditional MDM has been around since the early 2000s. As data volume has grown and the potential value of analytics has exploded, enterprises seeking to compete on analytics struggle to scale their mastering efforts to the surfeit of available data sources.
Clearly, creating robust data engineering pipelines to unify this data at scale is more important, and harder, than ever. An “agile” approach that uses machine learning can cut the time required for unification or analytics projects (by around 90%) while scaling to more sources than traditional approaches.
Moreover, given the scale of enterprise data, automation is the key to agility and scale. Such enterprise data automation can only be achieved with some human oversight to make sure the results are fast and accurate.
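One common way to combine automation with human oversight is to act automatically only on high-confidence matches and to route uncertain ones to a data steward. The sketch below illustrates that triage pattern under loud assumptions: the thresholds are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for a trained matching model, which it is not.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; in practice they would be tuned on labeled pairs.
AUTO_MERGE_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.60


def similarity(a, b):
    """Stand-in for a learned match score: string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def triage(pairs):
    """Sort candidate record pairs into three queues by match score."""
    decisions = {"auto_merge": [], "human_review": [], "no_match": []}
    for left, right in pairs:
        score = similarity(left, right)
        if score >= AUTO_MERGE_THRESHOLD:
            decisions["auto_merge"].append((left, right))
        elif score >= REVIEW_THRESHOLD:
            # Uncertain: leave the decision to a data steward.
            decisions["human_review"].append((left, right))
        else:
            decisions["no_match"].append((left, right))
    return decisions


candidate_pairs = [
    ("Acme Corp", "ACME Corp"),
    ("Acme Corp", "Acme Corporation"),
    ("Globex", "Initech"),
]
decisions = triage(candidate_pairs)
for queue_name, queued_pairs in decisions.items():
    print(queue_name, queued_pairs)
```

Steward decisions on the review queue could then be fed back as labeled examples, so that fewer pairs need manual attention over time.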
Machine learning enables not just raw data scalability but also human process scalability. While we all know to invest in active, integrated data governance for the long-term sustainability and return on investment of MDM, most of the currently marketed classic data governance tools do not exist as integrated solutions and also lag in “ML-guided” stewardship.
Over the next several years, I expect expert systems (self-learning, etc.) to work side by side with human experts to facilitate, advise, correct and promote best practices in data governance and stewardship, and in more data-related tasks for IT pros.
While I expect most MDM vendors to deliver classical data governance over the next 6-18 months, I also project that the innovative best-of-breed data governance software players will focus on machine learning as a competitive differentiator. Specifically, mega vendors (including IBM, INFA, Oracle and SAP) are focused on delivering data governance capability in 2018-19, with resulting partner chaos as every solution provider turns increasingly to machine learning acquisitions and partnerships to bolster its governance capabilities.
Clearly, machine learning will augment (more than replace) MDM and data governance to provide increased agility and scalability. Areas where machine learning will be applied include data discovery and mapping, entity resolution, relationship discovery and mapping, taxonomy and ontology, and governance and stewardship.
The bottom line? Organizations should plan now to realize economic value and competitive differentiation from investments in MDM plus graph databases plus machine learning over the next two to five years.