A recent study by IT analyst firm IDC estimates that the average company has 49 applications in 14 different databases, which need to be integrated, and typically has no more than 20 percent of its customer data residing in any one location. IDC also reports that over the next three years, the world's data will increase sixfold annually [1]. At most companies, the number of systems continues to grow, not shrink, which exacerbates the problem of data proliferation.
Most business leaders agree that data is a critical strategic asset, yet effective management of information has remained elusive. The core of the problem is an inability to easily share data between systems or make systems work better together. Too often, companies try to solve their interoperability issues by replacing their systems, building numerous point-to-point interfaces between them, customizing them, or trying to scale them to be the single master of highly shared data. These approaches are extremely disruptive and lead to overall brittleness in system integration.
By properly employing a service-oriented architecture (SOA), enterprises can leverage their existing systems while largely leaving them alone, and create a new integration solution for more effective information sharing across disparate applications. A well-designed SOA enables a company to create a consistent, accurate and complete view of its most important data, which can enhance data quality management, improve compliance with internal and government regulations, and provide performance and agility gains. Moreover, because services run on their own layer of infrastructure (an enterprise service bus, or ESB), SOA implementations are easier and more cost-effective than solutions that involve more invasive rip-and-replace, point-to-point integration or customization strategies.
Even when using an SOA-based approach, the ultimate goal of strategic information management cannot be fully achieved unless specific care is taken to understand and manage the underlying data as a strategic asset. Unless special attention is paid to shared data, SOAs run the risk of failure, because the proliferation of bad data can actually lower the overall quality of a company's most critical information. In SOA implementations, data is more easily shared and exchanged between numerous applications, and that means the ripple effect of sharing low-quality data is much greater than ever before. Such issues can have a far-reaching, negative impact on a wide range of corporate activities. For example, if a system updates the wrong customer record with new contact information, it could affect such operations as advertising, customer relations, privacy protection, security, accounting, billing and regulatory compliance. Customers could be billed inaccurately; products and marketing collateral could be sent to the wrong address; or credit histories could contain the wrong information, causing a company to wrongly extend or deny credit. This is the kind of risk that companies simply do not need. Remediation of these problems drives up IT costs and lowers credibility. It is imperative that IT organizations ensure the consistency, accuracy and completeness of information shared in their enterprise.
What's the Answer? A Master Data Service
There are two basic kinds of SOA services: process-centric and data-centric. Process-centric services execute business processes such as authorizing credit cards, processing orders, sending bills, etc. Data-centric services manage the attributes and relationships of data required by process services. A master data service (MDS) is a kind of data service that is responsible for only one thing: managing, in a single place, the uniqueness, integrity and interrelationships of data that matters most.
Each MDS is the authoritative source for a particular type of master data (customer, product, license, location, event, asset and portfolio, to name a few). Master data is generally the most highly shared data in an enterprise and the most critical to meeting its goals. Sometimes referred to as subject areas or master data domains, these are the enterprise's most essential sets of core data, which means they have to be accurate. If master data is inconsistent, it can expose an enterprise to significant risk.
An MDS provides an ideal way to manage data within an SOA environment. Using a hub-and-spoke model, the MDS serves as the integration method to communicate between all systems that produce or consume master data. The MDS is the hub, and all systems communicate directly with it using SOA principles (XML documents exchanged over HTTP or JMS). Systems only have to communicate with one authoritative source for identity matching and data quality management, and they can communicate using an independent standard XML schema. In this way, the MDS enforces a single lingua franca for master data. Participating systems are autonomous in SOA parlance, meaning that they can stay independent of one another and do not have to know the details of how other systems manage master data. This allows disparate system-specific schemas and internal business rules to be hidden, which greatly reduces tight coupling and the overall brittleness of the ecosystem. It also helps to reduce the overall workload that participating systems must bear to manage master data.
Without a hub-and-spoke model, any time a system makes a change to a piece of master data, it should coordinate the change with all other systems within the ecosystem (in practice, of course, systems do not do this). For example, if the enterprise resource planning (ERP) system is responsible for customer billing addresses, not only does it have to perform its job of processing orders, but if it is also to manage data as an asset, it needs to ensure that every system within its universe is up to date. If data quality business rules change, every system must adopt and enforce these rules, instead of just a single data service. Such a solution requires extensive customization, which leads to brittleness and risk, not to mention high costs.
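To put rough numbers on the difference between the two topologies, the short Python sketch below counts the interfaces each one needs. It assumes the worst case for point-to-point integration, in which every pair of systems shares master data directly. The function names are illustrative, and the 49-application figure is simply the IDC average cited at the start of this article.

```python
def point_to_point_links(n_systems: int) -> int:
    """Interfaces needed if every pair of systems exchanges master data directly."""
    return n_systems * (n_systems - 1) // 2


def hub_and_spoke_links(n_systems: int) -> int:
    """Interfaces needed if every system talks only to the MDS hub."""
    return n_systems


if __name__ == "__main__":
    # Compare both topologies for a small, a medium and the IDC-average estate.
    for n in (5, 14, 49):
        print(n, point_to_point_links(n), hub_and_spoke_links(n))
```

Under these assumptions, 49 applications would need 1,176 pairwise interfaces but only 49 connections to a hub. Real estates rarely connect every pair, so the point-to-point figure is an upper bound rather than a forecast.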
An MDS is tailored to manage the enterprise-wide business rules for data quality in one place. It is fine-tuned to be the arbiter of data quality among the systems and services with which it will interact. This eliminates the need for each system to know about the other systems that share master data, making them more autonomous - a key goal of an SOA. With an MDS, each system within an SOA ecosystem needs to be configured only once with the MDS, which then handles the complete exchange of data with all other systems. This makes adding or removing systems much easier. The MDS also mediates structural and semantic inconsistencies between participating systems. For example, one system might refer to a customer as an entity while another might call it a party, yet the MDS knows that the data sets associated with these different schemas describe the same thing, and can therefore enforce a common standard for easy integration. In addition, each business unit in a large enterprise may have different governance practices for enforcing data quality, policies and business rules. An MDS resolves all of these issues so that an SOA service can use data accurately and consistently each time the service is invoked.
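The entity-versus-party example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular MDS product: the system names, the field names, the canonical schema and the `MasterDataHub` class are all hypothetical, and plain dictionaries stand in for the XML documents a real deployment would exchange.

```python
from typing import Callable, Dict

Record = Dict[str, str]

# Hypothetical per-system mappings: local field name -> canonical field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "crm": {"entity_id": "customer_id", "entity_name": "name"},
    "erp": {"party_no": "customer_id", "party_name": "name"},
}


def to_canonical(system: str, record: Record) -> Record:
    """Translate a system-specific record into the canonical schema."""
    mapping = FIELD_MAPS[system]
    return {mapping[f]: v for f, v in record.items() if f in mapping}


def from_canonical(system: str, record: Record) -> Record:
    """Translate a canonical record back into one system's local schema."""
    reverse = {canon: local for local, canon in FIELD_MAPS[system].items()}
    return {reverse[f]: v for f, v in record.items() if f in reverse}


class MasterDataHub:
    """Toy hub: each system registers once, and the hub fans out updates."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Callable[[Record], None]] = {}

    def register(self, system: str, on_update: Callable[[Record], None]) -> None:
        self._subscribers[system] = on_update

    def publish(self, source: str, record: Record) -> Record:
        # Normalize once, then deliver to every other system in its own terms.
        canonical = to_canonical(source, record)
        for system, on_update in self._subscribers.items():
            if system != source:
                on_update(from_canonical(system, canonical))
        return canonical


if __name__ == "__main__":
    hub = MasterDataHub()
    erp_inbox = []
    hub.register("crm", lambda rec: None)
    hub.register("erp", erp_inbox.append)
    hub.publish("crm", {"entity_id": "C-100", "entity_name": "Acme Corp"})
    print(erp_inbox)  # [{'party_no': 'C-100', 'party_name': 'Acme Corp'}]
```

The point of the design is that the CRM never learns the ERP's vocabulary. Adding a third system means adding one entry to the mapping table and one `register` call, with no change to the systems already connected.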
MDS deployments solve many of the common issues of enterprise-wide data management, but the following guidelines should be followed to ensure success:
- Manage data quality rules in one place. An MDS provides the central tooling for validating the completeness and accuracy of the master data it manages. It should apply one consistent set of policies and rules, so that the best information is available to all participating systems. For example, there should be one service to validate postal addresses, ensure all systems are using the same product description, or provide correct product pricing to multiple order-taking applications.
- Control data redundancy. Data housed in multiple applications and databases often contains similar records about customers and transactions. An MDS should be the single place for managing the uniqueness of master data records across all participating systems, even though data is duplicated across (and sometimes within) systems. Rules in the MDS should help resolve identity questions and where and how to obtain the most accurate information. For example, the MDS might use the ERP application for billing addresses and the customer relationship management (CRM) application for shipping addresses.
- Resolve semantic and structural variations. Every application uses a unique schema for managing its own data - that's completely normal. However, in order to share data among disparate systems, it is critical to implement a system for mediating these differences. An MDS should hide semantic and schematic differences and create a consistent standard between applications.
- Require autonomy. Each system that produces or consumes master data should only have to talk to a single MDS for each kind of master data. This relieves each system from having to know the details about other master systems, or share the same set of business rules for managing overall data quality. It also greatly simplifies the process of adding new systems or sunsetting old ones - an MDS makes this an addition problem instead of a multiplication problem.
- Enforce data governance policies. Data governance is the set of policies, procedures, processes, skills and technologies required to manage data as a strategic asset. It is 80 percent business processes and 20 percent technology. Establishing and resolving data quality and business rule control issues between policy administrators of each system or business unit is a difficult process. Data governance administrators must go through the often lengthy and political process of gaining consensus and approval to create and alter policies. An MDS solution helps resolve these tasks by centralizing and streamlining all of the policy administration for capturing new data and enforcing data quality policies.
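The first two guidelines, one place for quality rules and one place for deciding which copy of a record wins, can be illustrated with a small Python sketch. It borrows the article's example of taking billing addresses from the ERP application and shipping addresses from the CRM application. The precedence table, the field names and the "non-blank" quality rule are assumptions chosen for illustration; a real MDS would apply far richer matching and validation.

```python
from typing import Dict, List, Optional

SourceRecord = Dict[str, Optional[str]]

# Hypothetical precedence table: for each attribute, sources in order of trust.
PRECEDENCE: Dict[str, List[str]] = {
    "billing_address": ["erp", "crm"],
    "shipping_address": ["crm", "erp"],
}


def is_complete(value: Optional[str]) -> bool:
    """A deliberately simple quality rule: the value must be non-blank."""
    return bool(value and value.strip())


def master_record(sources: Dict[str, SourceRecord]) -> Dict[str, Optional[str]]:
    """Build one master record, taking each attribute from its most trusted source."""
    master: Dict[str, Optional[str]] = {}
    for attribute, preferred in PRECEDENCE.items():
        master[attribute] = None  # stays None if no source passes the rule
        for system in preferred:
            candidate = sources.get(system, {}).get(attribute)
            if is_complete(candidate):
                master[attribute] = candidate.strip()
                break
    return master


if __name__ == "__main__":
    sources = {
        "erp": {"billing_address": "1 Main St", "shipping_address": "PO Box 9"},
        "crm": {"billing_address": "", "shipping_address": "22 Dock Rd"},
    }
    print(master_record(sources))
    # {'billing_address': '1 Main St', 'shipping_address': '22 Dock Rd'}
```

Because both the rule and the precedence table live in one service, a policy change (for example, trusting the CRM for billing addresses as well) is a one-line edit rather than a change to every participating system.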
Delivering an enterprise-wide, authoritative source of master data is the role of an MDS. It understands all of the characteristics of enterprise-wide data, matches and links data accurately and automatically between all internal and external data services, and enables the noninvasive, nondisruptive delivery of information to existing business systems via an SOA.
An MDS provides seamless interoperability across disparate systems throughout the enterprise, dramatically reduces the complexity and brittleness of IT systems, and increases the scalability of the environment. Because it is decoupled and autonomous, an SOA adapts more easily to new data sources as the business requires, and provides the performance and scale to handle ever-increasing volumes of master data.
Enterprises that are migrating to an SOA and want to reduce costs and complexity, while ensuring the accuracy and completeness of data, are well advised to consider an MDS as part of their overall strategy and technology implementation.
1. John F. Gantz. "The Expanding Digital Universe: A Forecast of Worldwide Information Growth through 2010." IDC, March 2007.