Multidomain master data management can provide organizations with substantial value and measurable revenue and margin benefits. My previous article, “Multidomain Master Data Management for Business Success,” discussed how mastering multiple data domains and managing their interrelationships will help companies realize shorter cycle times, reduced (or controlled) costs, and better forecasting, planning, and transactional and analytical outcomes.
This article defines three different architectural styles for deploying multidomain MDM, discusses the pros and cons of each approach and provides examples of which use cases might favor one style over another. 
Organizations that are embarking on multidomain MDM should realize that, no matter which architectural style they choose, the process requires the management of both simple and complex master data objects. A simple master data object is an instance of a master data domain that exists by itself. A complex data object is a combination of detailed data about several master data objects that exist interdependently, and the data regarding the associations and relationships between them. Before a multidomain MDM system is deployed, master data comprising the simple and complex objects are managed independently, and often inconsistently, in various systems of record, which is what typically drives an organization to create a single version of the truth.
Let’s look at an example of a retailer of home building supplies. The company’s supplier data are complex data objects. In addition to corporate and contact information, each supplier also has master data on the products that it sells. The company also has data about volume purchase agreements, discounts and volume targets, which are facts about the relationship between the products and the supplier providing them. If the retailer is sourcing windows, it can get the same window from several sources. One supplier may give better terms if the retailer is willing to wait a few days and order a larger quantity of products at once. Another may deliver the windows when ordered, but at a higher price. A third may deliver the windows in a scheduled shipment and offer the most favorable payment terms. In this example, price is not a fact about the windows themselves, but rather about windows from a particular supplier and location, in specific quantities with given lead times and shipping methods. Price isn’t an attribute of the company, product or location master data objects themselves, but of the combination of those master data within the composite data object.
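The window example can be sketched in a few lines of Python. This is a minimal illustration, not a reference data model: the class names, field names, identifiers and prices are all hypothetical, and location and shipping method are left out for brevity. The point it shows is that price and lead time sit on the record that links a supplier to a product, not on either simple object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supplier:
    """Simple master data object: exists by itself."""
    supplier_id: str
    name: str


@dataclass(frozen=True)
class Product:
    """Simple master data object: exists by itself."""
    product_id: str
    description: str


@dataclass(frozen=True)
class SupplyTerms:
    """Facts about the supplier-product relationship, not about either object alone."""
    supplier_id: str
    product_id: str
    unit_price: float
    min_quantity: int
    lead_time_days: int


def cheapest_terms(terms, product_id, quantity, max_lead_time_days):
    """Return the lowest-priced terms that satisfy the order constraints, or None."""
    candidates = [
        t for t in terms
        if t.product_id == product_id
        and quantity >= t.min_quantity
        and t.lead_time_days <= max_lead_time_days
    ]
    return min(candidates, key=lambda t: t.unit_price, default=None)


# Illustrative values only: the same window offered by three suppliers.
window = Product("P-100", "double-hung window")
terms = [
    SupplyTerms("S-1", "P-100", unit_price=180.0, min_quantity=50, lead_time_days=5),
    SupplyTerms("S-2", "P-100", unit_price=210.0, min_quantity=1, lead_time_days=0),
    SupplyTerms("S-3", "P-100", unit_price=195.0, min_quantity=10, lead_time_days=14),
]
```

With these made-up numbers, an order of 60 windows that can wait a week resolves to supplier S-1, while an order of 5 windows needed immediately resolves to supplier S-2. The product record is identical in both cases; only the relationship data differs.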

Multidomain MDM Architectural Styles: An Overview


Multidomain master data projects can be designed and implemented using one of three architectural styles: registry, hybrid and centralized. Determining which style is best for an organization will depend on a number of factors including how the data will be used, the number of applications that will rely on the solution, the stability of the ecosystem within which the solution will exist, and specific requirements for transactional throughput, uptime, response time, performance and scalability. Organizations should consult with an experienced enterprise architect to help them decide which architectural style makes the most sense. Enterprise architects have the requisite skills to define the characteristics of the system and to evaluate how specific business needs can and will be addressed by a chosen model today and in the future. 
All three styles have something in common: each has a master data service that is the container for simple or complex master data objects. The styles differ in the extent to which they store data inside these containers. For example, the registry style does not store all the data about the composite objects in the container, but instead stores only key references to the objects from the contributing systems. The actual detailed data is left in the external source systems. A registry system is based on a federated model that brings the data together to complete a master data object only when it is required.
Like the registry model, the hybrid model stores the key references to the composite object locally, but it also copies in all of the attributes required for that composite object from the outlying master data services. With a hybrid style, supplier, product, customer and other master data continue to reside in their external source systems, but a copy of the data from the different master data services is also in the centralized composite. 
The centralized style consolidates multiple domains of master data into one location. In this model, customer, product, location and other master data services are no longer sourced by external systems, but instead from the centralized hub, which contains all of the attributes of the composite objects in one location and acts as the master for all data. 
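The difference in what each container stores can be made concrete with a small sketch. Everything here is an assumption made for illustration: the source systems are plain dictionaries, and the function names, record identifiers and attribute values are hypothetical. The registry entry holds only keys and is resolved by following them, while the hybrid entry also carries a copy of the attributes.

```python
# Hypothetical source systems, each keyed by its own record id.
SUPPLIER_SYSTEM = {"S-1": {"name": "Acme Windows", "city": "Dayton"}}
PRODUCT_SYSTEM = {"P-100": {"description": "double-hung window"}}
SOURCES = {"supplier": SUPPLIER_SYSTEM, "product": PRODUCT_SYSTEM}


def registry_entry(keys):
    """Registry style: hold only key references; details stay in the sources."""
    return {"keys": dict(keys)}


def hybrid_entry(keys):
    """Hybrid style: hold key references plus a copy of the source attributes."""
    copies = {domain: dict(SOURCES[domain][key]) for domain, key in keys.items()}
    return {"keys": dict(keys), "attributes": copies}


def resolve_registry(entry):
    """Assemble the composite on demand by following each key to its source."""
    return {domain: SOURCES[domain][key] for domain, key in entry["keys"].items()}


def resolve_hybrid(entry):
    """Answer from the local copy without contacting any source."""
    return entry["attributes"]
```

Immediately after creation, both entries resolve to the same composite; what differs is where the data lives when the request arrives. The centralized style is not modeled separately here: in this sketch it would correspond to the hub's own dictionaries being the only store, with the outlying systems reading from it.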
Companies need to determine which architectural style best meets their multidomain MDM requirements. 

Registry Models are Leaner but Slower 


A multidomain object implemented following the registry style contains all of the pointers and keys required to get details about the master data from the various source systems. It also includes any associative data about the interrelationships between those objects. The main advantages of a registry model are that it is lightweight (since it only has key data), can be deployed rapidly (because only a small amount of data is being moved around), is the least intrusive and is the most scalable. Registry models are also much easier to keep up to date because, once they are deployed, any additions or changes to attributes or interrelationships between master data can be made fairly easily, since usually only the external master data sources change, not the master itself. With a registry model, when a request for an object is made, the master data service compiles the single version of the truth by using the keys and relationship data contained inside the object to retrieve the details from the various source systems.
The main drawback of a registry model is that queries can suffer from performance problems because the performance of a composite object depends on the speed of the systems or services that feed it. This is a typical “weakest link in the chain” problem. For example, if a product source is overloaded (or unavailable), its slow performance will constrain the performance of the composite multidomain object containing product data.
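The weakest-link effect can be illustrated with a deliberately simplified latency model. The function name and the millisecond figures below are invented for the example, and the model assumes that a composite is complete only when every source has answered: if the sources are queried in parallel the slowest one sets the response time, and if they are queried one after another the latencies add up.

```python
def composite_latency_ms(source_latencies_ms, parallel=True):
    """Estimate the time to assemble a registry composite from per-source latencies.

    A latency of None marks a source as unavailable, which makes the whole
    composite unavailable (None) in this simplified model.
    """
    latencies = list(source_latencies_ms.values())
    if not latencies:
        return 0
    if any(latency is None for latency in latencies):
        return None
    # Parallel fan-out waits for the slowest source; sequential calls add up.
    return max(latencies) if parallel else sum(latencies)


# Made-up figures: the same three sources, before and after the product
# source becomes overloaded.
healthy = {"supplier": 20, "product": 25, "location": 15}
overloaded = {"supplier": 20, "product": 900, "location": 15}
```

Under these assumptions the healthy composite resolves in 25 ms when queried in parallel, and in 900 ms once the product source is overloaded, even though the supplier and location sources are as fast as before.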

Hybrids are Fast but Redundancy can Increase Risks


A hybrid model copies the detailed data about complex objects from source systems into containers. The major benefit of the hybrid model is faster performance, because complex objects already have all of the data they need to satisfy a request under their control. This capability eliminates the need for a hybrid system to federate data from external sources, which positively impacts performance and availability.
One drawback of the hybrid approach stems from data redundancy: every update or change to master data has to be applied in more than one place. Redundancy not only means more work and higher maintenance costs, but also raises the risk of errors and of data values getting out of sync. A second drawback is that changes in attribute characteristics, relationships, semantics, processing or structural rules have to be coordinated between the composite object and the outlying source systems, which makes the hybrid approach more invasive. However, when an organization’s business requirements demand higher performance or lower latency, the benefits of a hybrid approach will usually outweigh the redundancy and maintenance costs, along with the additional risk factors.
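The out-of-sync risk is easy to see in code. The following is a minimal sketch of a drift check between a source record and the copy held in a hybrid container. The function name and the sample records are hypothetical, and a real deployment would more likely rely on change data capture or scheduled reconciliation than on an ad hoc comparison like this one.

```python
def find_drift(source_record, hub_copy):
    """Return {attribute: (source_value, hub_value)} for every attribute that differs.

    An attribute missing on one side is reported with None for that side, so this
    sketch cannot tell a missing attribute from one explicitly set to None.
    """
    drift = {}
    for attr in sorted(source_record.keys() | hub_copy.keys()):
        source_value = source_record.get(attr)
        hub_value = hub_copy.get(attr)
        if source_value != hub_value:
            drift[attr] = (source_value, hub_value)
    return drift


# Illustrative records: the supplier moved, but the hub copy was not updated.
source = {"name": "Acme Windows", "city": "Toledo"}
copy_in_hub = {"name": "Acme Windows", "city": "Dayton"}
```

Calling find_drift(source, copy_in_hub) reports the city attribute as drifted and nothing else. Every attribute copied into the hybrid container is one more value that a check of this kind has to cover, which is the maintenance cost described above.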

Centralized Models are Complete but Expensive 


A centralized model puts all master data inside a single container and results in the outlying systems being dependent upon the central master. In the centralized model’s theoretically pure state, outlying systems may not even store copies of the data, but instead, they get the data as needed from the central master. This approach is like the registry (federated) model turned on its head. Most of the advantages and disadvantages of other single platform systems, such as enterprise relationship management systems, apply to a centralized architecture approach. The main benefit of a centralized model is that all the data ever required to create or evolve a composite object is most likely already present within the MDM container. (The only time this would not be the case is when significant additions are made to the definition of a composite object, a new source is being added to the ecosystem or new data is otherwise being added to the data universe that didn’t exist before.) Another benefit of a centralized approach over the hybrid model is that it is easier to evolve or expand the definition of a composite object because changes only need to be made to one system.
However, there are disadvantages to the centralized approach. The primary drawback is that very few commercial off-the-shelf (COTS) source systems are built to be dependent upon an external master; they are usually self-contained and built to be independent. There can be significant license and maintenance issues in making a COTS system behave according to a centralized model. Contrary to what many may think, this model can suffer from performance problems because it creates a much bulkier platform that may be difficult to scale. In addition, since the centralized model has to respond to all transactions that need master data, it can be difficult to tune and may create a bottleneck or a weak link, instead of a high-performance master. Making centralized systems highly available is challenging because, whenever a change is required of any type of master data, the entire system must be taken offline to perform the maintenance or evolve the platform. Finally, the ROI of a centralized model can take much longer to realize as the system can take years to deliver, compared to six or eight months for a registry or hybrid system.

Choosing the Best Model


The optimal multidomain MDM architectural style, for most organizations, will be either a registry or a hybrid model. Both are easier to scale and perform better with large data sets than a centralized approach. When deciding between the two, the choice will often depend on how rapidly transactions need to be processed, how much the data volumes will grow, the sheer number of contributing systems, the stability of the data models supported and how well external systems will adapt to the MDM style chosen. Enterprises that require real-time availability will most likely choose a hybrid model because of the latency issues associated with the registry approach. If real-time access is not a critical criterion and the master data object is likely to evolve or change over time, then a registry approach will be more efficient and offer greater flexibility.
A centralized approach is best suited for a smaller, stable organization with low transaction and volume requirements. Companies should avoid choosing a centralized approach simply because they can create the entire set of structures inside a single database. For most organizations, a centralized approach is not the best choice, for the reasons cited above.
None of these three styles is right or wrong in and of itself, only in the context of an organization’s present and future business needs. This is why a careful business analysis is critical for identifying which style may be preferred over another. Since there are always compromises, organizations selecting a multidomain MDM architecture should actively involve an enterprise architect in the process to help them think through the business cases that they will be dealing with both today and tomorrow.
Some organizations, depending on which business problems they are trying to solve, may ultimately deploy all three multidomain MDM architectural styles. Each style may be used to solve a different business problem and may be the optimal choice for that specific use case.
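The rules of thumb in this section can be written down as a small decision function. This is only a sketch of the guidance given here, not a substitute for architectural analysis: the function name and parameters are hypothetical, and the order in which the checks are applied is an assumption, since the text does not rank the criteria against one another.

```python
def suggest_style(needs_real_time, model_likely_to_evolve,
                  small_stable_low_volume=False):
    """Map the article's rules of thumb to a suggested architectural style.

    The precedence of the checks is an assumption made for this sketch.
    """
    if small_stable_low_volume:
        # Smaller, stable organization with low transaction and volume needs.
        return "centralized"
    if needs_real_time:
        # Registry latency is the concern, so copy the data locally.
        return "hybrid"
    if model_likely_to_evolve:
        # Changes stay mostly in the sources rather than in the master.
        return "registry"
    # The text does not distinguish further; either style remains a candidate.
    return "registry or hybrid"
```

For example, suggest_style(needs_real_time=True, model_likely_to_evolve=True) returns "hybrid", while the same call with needs_real_time=False returns "registry". Criteria such as data volume growth and the number of contributing systems are not captured in this function and would still need to be weighed separately.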
