Service-oriented architecture (SOA) seems to be on everyone's mind these days. SOA is not a new concept, but SOA technology has burst onto the IT scene like a rocket. The question is: Will it flare out quickly, or will it power companies to greater profits and other long-term positive results? The answer may depend on how well the companies that choose to convert their current technical architectures to an SOA have put their data management and governance house in order before they build.

SOA - A Foundation

Let's take the time for a little primer on just what an SOA is and what it can do for a company that implements one correctly. An SOA is a component-based, standards-driven architecture that utilizes the Web as a vehicle for delivering services to users. Services are really not that complex. Basically, a service is work that's done on request by software or hardware and that achieves a particular result while being reusable by the same, or other, service requestors.

Sound too theoretical? Let's look at a real-life, nontechnical example of a service. DVDs make an excellent example. The DVD itself is the service. You play it, on request, in one of several different applications: a DVD player hooked to your TV, a portable DVD player or your computer's DVD player. The player offers the service, and the service (the DVD) can be reused or burned onto another DVD, which is akin to asking for another service. Thus, you have an architecture - all your various DVD player components - that delivers services (the movies).

Now for the technical version. Most SOAs that I've helped my clients with have had a layered architectural approach. Figure 1 represents a layered SOA.

Figure 1: A Layered Service-Oriented Architecture


The services and layers in the SOA are loosely coupled. They are combined when called and operate together. Let's take a look at how a typical, layered SOA operates.

Data comes into the architecture via a variety of data collectors and consumers, including distributors and resellers, internal IT systems, business users and consumer retail transactions. The interface services layer is the presentation layer. Users interact with the architecture through this layer, which includes the user interface, enterprise portal, security gateways, and reporting and delivery mechanisms.

The next layer in the SOA is the business services layer. This layer comprises services related to key business processes. It provides services combining logic, data and user interfaces with graphical workflow design templates. Next, the shared services layer provides facilities - such as workflow management, notification and alerts, and Web services - that are common across all components of the SOA.

Perhaps the heart of the SOA, the integration services layer provides a uniform standards-based framework within the SOA. It orchestrates data movement, quality, transformation, enrichment, validation and storage. All the company's business taxonomy and business rules, as well as the meta data repository and technical meta data, are included in this layer.

The deepest layer, infrastructure services, provides hardware, systems operation and networking services to the SOA. This layer is where performance tuning, job management, security and backup/recovery procedures are performed.

I mentioned it earlier, but it's worth mentioning again: all the services and layers of the SOA are loosely coupled, but any service may be called at any given time. Therefore, all the services must operate together smoothly, or incoming data will not be handled as expected. It follows that the SOA has a critical dependence on rules that tell it how to do its job. This is especially true in the integration services layer, particularly when handling data-centric activities, where incoming data is checked for quality, assigned a quality rating and transformed as necessary to fit the company's data standards. This dependence on data rules means having your company's data management strategy - and the execution of that strategy - in place and functioning before you build an SOA.
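To make the idea of loose coupling concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular SOA product: the registry, the service names ("validate", "enrich") and the record fields are all hypothetical. The point is that services are looked up by name and combined only when they are called.

```python
from typing import Callable, Dict, List

# Hypothetical service registry: callers look services up by name at call
# time, so no service holds a direct reference to another.
Service = Callable[[dict], dict]
REGISTRY: Dict[str, Service] = {}


def register(name: str) -> Callable[[Service], Service]:
    """Return a decorator that publishes a function under a service name."""
    def decorator(func: Service) -> Service:
        REGISTRY[name] = func
        return func
    return decorator


@register("validate")
def validate(record: dict) -> dict:
    # Illustrative rule only: an amount must be present and positive.
    return {**record, "valid": record.get("amount", 0) > 0}


@register("enrich")
def enrich(record: dict) -> dict:
    # Illustrative default only: assume USD when no currency is supplied.
    return {**record, "currency": record.get("currency", "USD")}


def call_chain(record: dict, service_names: List[str]) -> dict:
    """Combine services on request, in whatever order the caller names them."""
    for name in service_names:
        record = REGISTRY[name](record)
    return record


result = call_chain({"amount": 120}, ["validate", "enrich"])
```

Because any service can be named in any chain, each one has to rely on the same shared rules about what a record looks like, which is exactly where the data management strategy comes in.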

An Enterprise Data Management Strategy

Most companies I've seen have at least a rudimentary data management process in place for much of their enterprise data. Often, however, the deployment of the process has been fragmented by business unit or departmental divisions. Also, lapses in enforcement of data governance policies and procedures are often created through reorganizations and/or mergers and acquisitions. Before you build or convert to an SOA, you must examine how well your data management process operates at the enterprise level, and address consistent data definitions and ownership across the enterprise.

If your company is like most, there will be issues to address. Once you have assessed your current state, you can begin to fill in the gaps with an enterprise data management (EDM) strategy that will enable you to develop and implement the solid data quality foundation necessary to build an SOA.

The ideal EDM strategy will be built using a framework of appropriate processes, technology, data architecture and standards, organizational controls and governance structure. Figure 2 represents the enterprise data management framework.

Figure 2: Enterprise Data Management Framework


Data Management Processes

The processes most important to your SOA initiative will center on controlling the path of data through the company. Specifically, the processes you should have in place before you build are:

  • Data integration,
  • Data migration,
  • Data maintenance,
  • Data quality assurance and control, and
  • Data archiving.

Each of these processes should be developed at the enterprise level, in the context of object models that reflect leading practices for key processes in your industry. The processes should depict the data use cases along with their activities and data objects. Also, each process should be mapped to all services needed at the intake layer for consumers and providers to assist in technology-enabled transformation of data into business information. Finally, the modeling should be facilitated by market-leading modeling tools or, ideally, via an interactive Web page.

Technology

The technology piece of the EDM framework focuses on the tools necessary to implement the data management processes you've defined. Specifically, you should have top-notch tools in place to profile, synchronize, integrate, consume, clean and transform your company's data, and to manage its data infrastructure.

There are some pretty important requirements for these tools. A top-notch data integration suite should enable the company to manage change across both business performance management and transaction-based systems at all time intervals.

Such a top-notch data integration suite should have the following characteristics:

  • An open design (i.e., the tool will work with multiple platforms and will fit in with an SOA design),
  • Integrated data profiling and data quality capabilities,
  • Complex data transformation and routing capabilities,
  • Reusable components and rules,
  • Unlimited performance with linear scalability,
  • Enterprise meta data management capabilities,
  • On-demand connectivity,
  • Messaging capable of integration with messaging tools,
  • Compliance with industry standards such as XML, EDI, JMS and JCA, and
  • Industry-specific and ready integration capabilities.

Finally, because these tools will be an integral part of your SOA, they should obviously be compatible with other company-standard technologies and tools, such as business performance management, enterprise application integration and enterprise information integration toolsets.

Data Architecture and Standards

While technology is important in implementing a data management framework, the data standards that you develop for your EDM strategy will largely determine the final results of your strategy. Your data architecture provides the blueprint for how your data management processes are implemented and how your toolsets operate.

Think of your company's data architecture and standards as a pyramid. At the bottom is your enterprise data management conceptual model, which defines with broad strokes how your data should flow through the company. Then, from that conceptual model, you develop an enterprise-wide logical data model, complete with subject areas, business entities and their relationships. That's your foundation. Next, you have the company's meta data - or data about data. Then, you have the information exchange model that defines how data is shared across the company, especially when you have data that requires translation from one level to another. Finally, you decompose the enterprise data model into specific use cases that support the business processes.

This model gives you two things:

  1. It provides a complete picture of the flow of data through the company.
  2. Because all aspects of the data spectrum are represented - from use cases, to exchange models, to meta data, to logical data models - you have the ability to implement uniform data standards throughout the enterprise.

Enterprise Data Management Organization

Your enterprise data management organization (EDMO) is the human element of the EDM strategy. You construct an EDMO to manage change and lead the data management efforts. The organizational chart for the EDMO should be a top-down model that gives the EDMO director guidance from the EDMO steering committee. The EDMO includes data stewards responsible for defining data management policies - controlling the flow of information, coordinating information dissemination throughout the company and championing the EDM strategy.

The EDMO director is responsible for implementing policies set by the data stewards. The director is also the overall coordinator for the EDM strategy implementation. If you think of the data stewards as "idea people," the director is the implementation person for all those ideas.

Other important roles within the EDMO are the functional and application data owners. Data owners can be either department heads or knowledge workers responsible for using specific applications. Data owners define requirements for information, confirm the quality and availability of information, and authorize access to information that the IT people need to manage and maintain company data and applications.

Data owners serve as liaisons between the management levels of the EDMO and the data management group. That group supports the data stewards in defining data standards, and it provides architecture support, data maintenance, data delivery, and the data management infrastructure and its administrators.

The main function of such a structured EDMO is to make the next component in the EDM framework, data governance, easier to accomplish.

Data Governance

Moving toward an EDM strategy requires a management framework, including data governance and standards, not just a consistent technical approach and a good toolset. The data governance strategy, as implemented by your EDMO, will span all components of your data architecture. A first-class data governance strategy should focus on instilling a culture of data stewardship and quality throughout the company. It should define policies and empower oversight and procedures for defining data standards.

The strategy, if properly implemented by the EDMO, will be embraced by all your company's information consumers. Company data will be clearly defined and accessible, with defined quality metrics in place to enable it to stay that way. If you implement a comprehensive data governance strategy, data excellence will pervade the company, and your EDM framework will hold up through growth, change, government oversight and any other curve that may be thrown at you.

Reaping the Benefits of Comprehensive Data Management

When you have a functioning data management plan in place, building the SOA and realizing its benefit will be easier and more effective. Let's revisit the SOA to walk through an example of why you need an EDM strategy in place before you build and how the EDM strategy can make the SOA operate smoothly - as it should. Figure 3 highlights data movement through the SOA.

Figure 3: Data Movement through the SOA


Assume a sales transaction comes in through one of your company's distributors. Immediately, the SOA goes to work. The transaction enters the gateway in the interface services layer, using the access control service in that layer and the security service in the infrastructure services layer. The business services layer manages the data's intake and flow, and it engages the workflow engine. The transaction is transferred throughout the service layers as necessary (through the enterprise data integration and data movement services).

Once the data is in the architecture, the integration services layer goes to work. Data quality checking of source data is handled by the data quality services, using guidelines stored in the meta data repository and business rules interacting with the workflow engine. Quality checks should include checking for specific rules, profiling, diagnostics and quality rating assignment. Once the data quality checks are complete, product matching and pricing adjustments are performed.
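As a rough illustration of rule-driven quality checking and rating assignment, consider the Python sketch below. Everything in it is an assumption made for the example: the `Rule` class, the three rules, the record fields and the rating formula (the share of rules passed) are hypothetical stand-ins for whatever guidelines your meta data repository actually holds.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Rule:
    """One named quality rule; check returns True when the record passes."""
    name: str
    check: Callable[[dict], bool]


# Hypothetical rules standing in for guidelines kept in the meta data repository.
RULES = [
    Rule("has_product_id", lambda r: bool(r.get("product_id"))),
    Rule("positive_quantity",
         lambda r: isinstance(r.get("quantity"), int) and r["quantity"] > 0),
    Rule("non_negative_price",
         lambda r: isinstance(r.get("price"), (int, float)) and r["price"] >= 0),
]


def rate_quality(record: dict, rules: List[Rule]) -> Tuple[float, List[str]]:
    """Return (rating, failed rule names); rating is the share of rules passed."""
    if not rules:
        # Without rules there is nothing to rate against.
        raise ValueError("no quality rules defined")
    failures = [rule.name for rule in rules if not rule.check(record)]
    rating = (len(rules) - len(failures)) / len(rules)
    return rating, failures


good = rate_quality({"product_id": "P-100", "quantity": 3, "price": 9.5}, RULES)
bad = rate_quality({"product_id": "", "quantity": 0, "price": 9.5}, RULES)
```

In this sketch the first record passes every rule, while the second fails two of the three and would be flagged as an exception. Note that the function refuses to rate anything when no rules exist.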

Ratings and exceptions are sent to the quality management service for later reporting, and notifications and alerts are sent, via the portal using Web services, for resolution. The quality management and operational data stores are updated, and the operations research and process management functions can access the updated information for reporting and exception handling.

The company's learning database is updated with any corrections, and business rules and meta data repository updates are made. The data is then delivered for analysis and reporting. When processing is complete, the sales data housed in the operational data store is transferred via downstream management, using the data movement service, to the data warehouse.
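The routing described in this walk-through can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the architecture itself: the `Stores` class, the two checks and the field names are hypothetical, and the choice to hold records with open exceptions in the operational data store until they are resolved is my assumption for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stores:
    """In-memory stand-ins for the stores named in the walk-through."""
    operational: List[dict] = field(default_factory=list)
    quality_log: List[dict] = field(default_factory=list)
    alerts: List[str] = field(default_factory=list)
    warehouse: List[dict] = field(default_factory=list)


def find_exceptions(record: dict) -> List[str]:
    # Hypothetical checks; real ones would come from the meta data repository.
    problems = []
    if not record.get("product_id"):
        problems.append("missing product_id")
    if record.get("quantity", 0) <= 0:
        problems.append("non-positive quantity")
    return problems


def process(record: dict, stores: Stores) -> None:
    """Log the quality result, raise an alert if needed, update the ODS."""
    problems = find_exceptions(record)
    stores.quality_log.append({"id": record["id"], "exceptions": problems})
    if problems:
        stores.alerts.append(record["id"])
    stores.operational.append({**record, "open_exceptions": bool(problems)})


def transfer_downstream(stores: Stores) -> None:
    """Move records without open exceptions to the warehouse (an assumption)."""
    ready = [r for r in stores.operational if not r["open_exceptions"]]
    stores.warehouse.extend(ready)
    stores.operational = [r for r in stores.operational if r["open_exceptions"]]


stores = Stores()
process({"id": "T1", "product_id": "P-100", "quantity": 2}, stores)
process({"id": "T2", "product_id": "", "quantity": 2}, stores)
transfer_downstream(stores)
```

After the transfer, transaction T1 sits in the warehouse, while T2 stays in the operational store with an alert raised for resolution.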

With this example, the advantages of the SOA become obvious. Data movement is automatic, synchronized and fluid. Errors are identified early in the process and are sent to the appropriate services for immediate recording and handling. In effect, the SOA "learns" from each transaction, so that future transactions are completed even more quickly. Also, because each service is discrete and the services are loosely coupled, any service can be called on and combined at any time, making the architecture immensely flexible and able to meet most data movement, analysis and reporting needs on an almost real-time basis.

However, with this example, the need for a data management strategy and standards to be in place before you build an SOA also becomes obvious. Because any service may be called on at any time to perform any combination of tasks on any type of data, standards must be in place to enable each service to perform its tasks.

For example, if the data quality rating is to be assigned by data quality services, using standards stored in the meta data repository, it's critical that those standards be in place well before the first piece of data flows through the architecture. Every piece of data must be handled as expected. Every scenario must be anticipated. Therefore, all your rules, quality standards, rating information and exception information must be defined before you begin.
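One way to picture "defined before you begin" is a readiness check that runs before any data is allowed through. The Python sketch below is only an illustration: the three required standards, the repository layout and the data type names are hypothetical, chosen to mirror the rules, rating information and exception information mentioned above.

```python
from typing import Dict, List

# Hypothetical list of standards every data type needs before go-live.
REQUIRED_STANDARDS = ("quality_rules", "rating_scale", "exception_routes")


def missing_standards(repository: Dict[str, dict],
                      data_types: List[str]) -> Dict[str, List[str]]:
    """Map each data type to the standards it still lacks (empty dict = ready)."""
    gaps = {}
    for data_type in data_types:
        defined = repository.get(data_type, {})
        missing = [s for s in REQUIRED_STANDARDS if not defined.get(s)]
        if missing:
            gaps[data_type] = missing
    return gaps


# Hypothetical repository contents: one data type is complete, one is not.
repository = {
    "sales_transaction": {
        "quality_rules": ["has_product_id"],
        "rating_scale": {"pass": 0.9},
        "exception_routes": ["quality_management"],
    },
    "inventory_update": {"quality_rules": ["has_sku"]},
}

gaps = missing_standards(repository, ["sales_transaction", "inventory_update"])
```

Here the sales transaction type is ready, but the inventory update type is reported as lacking a rating scale and exception routes. A check like this, run before the first piece of data flows, turns the advice into something you can verify.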

In short, before you build the SOA, build an appropriate data management strategy that will enable it to operate as it should. Use a framework that enables you to build a flexible, comprehensive data management strategy that integrates process, technical and human elements. If you don't, you're failing to plan and planning to fail.
