As companies seek benefits from service-oriented architecture (SOA), implementers struggle with the fundamental mismatch between the logical data organization that a service should expose, and the diverse and chaotic nature of data in its current physical form. The solution to this dilemma is a data abstraction layer. Data abstraction hides the complexity of the back-end data sources while providing a common information model, which is better organized to meet the needs of the architecture.


The Problem


Enterprises have complex information structures as a result of evolving applications and databases deployed through the years. The typical enterprise runs many dozens of applications and databases, spanning modern and legacy equipment, multiple operating systems and a myriad of data storage systems. The core asset of these systems is data, which reflects every aspect of the business, from participants (customers, partners, employees) to operational data (supplies, products, orders) and financials (revenues, payables, salaries). Data is the lifeblood of the business, and the success of the applications running the business hinges on the accessibility and quality of the data. Without efficient access to data, the entire business suffers.


Managing Data for SOA


While the management of data is nothing new, trends in IT are placing stress on the old techniques used to access it. Adoption of SOA in particular, while showing great promise for increased agility and reuse in the enterprise, calls those techniques into question. The premise of SOA is that processing capabilities are built into callable chunks that roughly align with business operations such as “get auto policy,” “process order,” or “verify credit limit.” Then, processes and applications are constructed by “orchestrating” these services into the required business processes. When this vision is realized, businesses can react to change quickly by simply reorchestrating services rather than succumbing to the traditional six-, 12- or even 18-month development cycles that are common today. However, SOA calls for a new way to organize and manipulate data, as we shall see momentarily.


There are three main characteristics of services that drive the value of an SOA: they must be defined in business terms, loosely coupled and coarse-grained. Specifically, this means:


  • Defining services in business terms helps align IT with business needs, allowing the business to use IT services as well-understood building blocks for business processes. For example, you might define a service as “update customer record” instead of “send MQ update.”
  • Loose coupling is the notion that services should have minimal dependencies on one another. Loosely coupled services generally have a defined “contract” under which they interoperate. One service may change its implementation without affecting the other, as long as it continues to abide by the contract.
  • A service is coarse-grained when it delivers a significant piece of data given a limited interface. A service with a coarse-grained interface intentionally limits flexibility in favor of formalizing and standardizing a well-defined behavior and data set. For example, a service to “get auto policy” might take a single parameter, the policy number, and return the policy document using the company’s standard format.
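
The three characteristics above can be sketched in a few lines of Python. This is an illustrative sketch only, not a prescribed implementation: the names AutoPolicy, AutoPolicyService, InMemoryPolicyService and describe, along with the sample policy data, are hypothetical and do not come from any particular product.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class AutoPolicy:
    """Hypothetical 'standard format' policy document returned to every consumer."""
    policy_number: str
    holder_name: str
    vehicle: str
    annual_premium: float


class AutoPolicyService(Protocol):
    """The contract: one business-level operation taking one parameter."""

    def get_auto_policy(self, policy_number: str) -> AutoPolicy: ...


class InMemoryPolicyService:
    """One possible implementation; consumers never see how the data is stored."""

    def __init__(self, rows: dict[str, tuple[str, str, float]]) -> None:
        self._rows = rows

    def get_auto_policy(self, policy_number: str) -> AutoPolicy:
        holder, vehicle, premium = self._rows[policy_number]
        return AutoPolicy(policy_number, holder, vehicle, premium)


def describe(service: AutoPolicyService, policy_number: str) -> str:
    """A consumer that depends only on the contract, not on the implementation."""
    policy = service.get_auto_policy(policy_number)
    return f"{policy.policy_number}: {policy.holder_name}, {policy.vehicle}"


# Made-up sample data for illustration.
service = InMemoryPolicyService({"AP-1001": ("Pat Lee", "2004 sedan", 812.50)})
print(describe(service, "AP-1001"))
```

Because describe depends only on the contract, InMemoryPolicyService could be swapped for an implementation backed by a mainframe or a database without touching the consumer.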

The need for services to be loosely coupled, coarse-grained and defined in business terms presents several important problems that must be solved for a successful SOA. Somehow, architects and developers must find a way to deal with the complexity and diversity of data in the back-end systems, develop representations of data at a suitable granularity, and define information objects in business terms, like “order,” “customer” and “invoice.”


It is worth noting the distinction between simply exposing data as services and truly building an SOA. Many applications are written without a data abstraction layer. Indeed, modern development environments for Java and .NET make it very easy to bind an application or service directly to a physical data source. While data services can often be quickly developed and delivered in this manner, the resulting service and consuming processes become tightly coupled to the underlying data. Any change in the data or the consuming processes causes a ripple effect through the entire stack.


Managing Complexity with a Data Abstraction Layer


Complexity in data can be managed by creating a data abstraction layer. Data abstraction hides the complexity of data by letting an architect define a new, better organized structure that exists only in middleware. The result is that an application or other service can request data in the well-organized logical format, without regard to the physical layout. As an example, an application may request a customer record from the data abstraction layer. Data is fetched from potentially many data sources, transformed into the agreed logical structure and returned to the calling application.
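
The customer record example can be sketched as follows. This is a minimal illustration under stated assumptions: the two in-memory "sources" (CRM_ROWS and BILLING_ROWS), their field names and the get_customer function are all hypothetical stand-ins for real back-end systems.

```python
# Hypothetical physical sources with inconsistent field names and formats.
CRM_ROWS = {"C42": {"CUST_NM": "ACME CORP", "PHONE_NO": "5551234567"}}
BILLING_ROWS = {"C42": {"acct_balance_cents": 125000, "currency": "USD"}}


def get_customer(customer_id: str) -> dict:
    """Return the agreed logical customer record, hiding the physical layout."""
    crm = CRM_ROWS[customer_id]          # fetch from the first source
    billing = BILLING_ROWS[customer_id]  # fetch from the second source
    # Transform both source layouts into the single logical structure.
    return {
        "id": customer_id,
        "name": crm["CUST_NM"].title(),
        "phone": crm["PHONE_NO"],
        "balance": billing["acct_balance_cents"] / 100,
        "currency": billing["currency"],
    }


print(get_customer("C42"))
```

The caller sees one well-organized record and never learns that the name lives in one system and the balance in another.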


Various types of data abstraction layers are possible, each suitable for a particular purpose. Examples include a virtual relational database that presents a relational model of data queryable using SQL and virtual spreadsheets that present back-end data for easy manipulation in a spreadsheet form. Broadly speaking, a virtual relational database maximizes the flexibility in the types of queries that can be issued, due to its rich query language. To provide data to services, however, the natural implementation of a data abstraction layer is to model the data in XML. A data abstraction layer modeled in XML is an ideal fit for an SOA, as the richness of data defined by XML schema facilitates the creation of coarse-grained and loosely coupled services, defined in business terms.


An XML-based data abstraction layer solves both of our problems simultaneously: we hide the complexity of the back-end systems so that applications can avoid the messy details, and we provide data structures at an ideal level of granularity for an SOA.


Modeling Data Using XML Schema


One of the obstacles to defining a data abstraction layer is the difficulty of properly modeling the data using XML schema. Like any data modeling activity, this task requires input from both technologists and businesspeople. In effect, the business messages that drive operations need to be defined. This is generally done through an internal XML modeling effort, resulting in a set of XML schemas defining pertinent business messages, each of which becomes the subject matter of a service. As an example, a business analyst responsible for order processing knows what data elements are required as input to the order process. Working with a data modeler knowledgeable in XML schema, the analyst defines an appropriate structure and data set representing the order.
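
To make the order example concrete, the sketch below builds an order message as XML and checks that the elements agreed between the analyst and the modeler are present. It is only an approximation: the element names (Order, OrderId, CustomerId, Items, Item) are hypothetical, and the required-element check is a simplified stand-in for validation against a real XML schema, which Python's standard library does not perform.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Hypothetical elements the analyst and data modeler agreed are required.
REQUIRED_ORDER_ELEMENTS = ("OrderId", "CustomerId", "Items")


def build_order(order_id: str, customer_id: str,
                items: list[tuple[str, int]]) -> ET.Element:
    """Build an order message in the agreed logical structure."""
    order = ET.Element("Order")
    ET.SubElement(order, "OrderId").text = order_id
    ET.SubElement(order, "CustomerId").text = customer_id
    items_el = ET.SubElement(order, "Items")
    for sku, quantity in items:
        ET.SubElement(items_el, "Item", sku=sku, quantity=str(quantity))
    return order


def missing_elements(order: ET.Element) -> list[str]:
    """List required child elements that are absent (not full schema validation)."""
    return [name for name in REQUIRED_ORDER_ELEMENTS if order.find(name) is None]


order = build_order("O-7", "C42", [("SKU-1", 2), ("SKU-9", 1)])
print(ET.tostring(order, encoding="unicode"))
print(missing_elements(order))
```

In practice the structure would be expressed in an XML schema document and enforced by a validating parser; the point here is only that the message shape is agreed up front and checked at the service boundary.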


Many industries have defined standard business structures using XML schema (for example, ACORD for insurance, MDDL for capital markets, HR-XML for human resources). These standards have incorporated the “wisdom of the masses” to achieve comprehensive XML-based definitions of business information. Using industry-defined XML schemas has the additional benefit of a well-documented vocabulary. Since these standards are achieved through consensus, the tradeoffs include the potential for bloated messages, and the need to customize elements specific to your business. All things considered, implementing a data abstraction layer based on industry-standard XML definitions leads to quicker implementations and better interoperability.


Implementing a Data Abstraction Layer


Each business message defined using XML schema becomes a contract between a service implementation and its consumers. The data abstraction layer is simply the combination of these services. To that end, anyone who can build a service that conforms to an XML schema can build a data abstraction layer and reap its benefits. Many tools and technologies are available to assist in this effort, and generally the selection criteria shift to secondary considerations, such as ease of use, design-time metadata management and runtime management support.


Data Abstraction Benefits


The potential cost savings from building a data abstraction layer for your SOA fall into a number of categories. First, if you have agreed that SOA in general provides benefits, it is a natural extension to agree that “SOA done right” is the only viable approach. The mandate to deliver data aligned with business definitions requires that logic be defined and exposed in a data abstraction layer. A properly defined logical layer leads directly to increased reusability opportunities. ROI analysis has convincingly shown that using a data abstraction layer provides far less expensive systems in the long run.


Perhaps the greatest benefits accrue from the loose coupling achieved through a well-designed data abstraction layer. Alternative approaches that access data sources directly from applications, however easy development environments make this, lead to tight coupling. A change to the data or the application causes a ripple effect through many systems. In contrast, the loose coupling available through a data abstraction layer provides long-term cost savings by isolating many of these changes to the proper architectural layer. Physical data can be reorganized without affecting applications, as the “contract” between applications and data remains the same. Also, changes to the application can often be accomplished with modifications only to the data abstraction layer implementation. For example, if a new source is added to a customer’s activity history, that new source can be incorporated using the existing formats, extending the implementation of the service without changing the interface contract.
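
The activity history example might look like the following sketch. Everything in it is hypothetical and for illustration only: the ActivityHistoryService class, the two source functions (web_orders and call_center), the entry fields and the sample dates.

```python
from typing import Callable, Dict, List

Entry = Dict[str, str]
ActivitySource = Callable[[str], List[Entry]]


def web_orders(customer_id: str) -> List[Entry]:
    """Hypothetical existing source, already mapped to the agreed entry format."""
    return [{"customer_id": customer_id, "date": "2006-03-01", "event": "web order"}]


def call_center(customer_id: str) -> List[Entry]:
    """Hypothetical new source, mapped to the same format inside the layer."""
    return [{"customer_id": customer_id, "date": "2006-02-15", "event": "support call"}]


class ActivityHistoryService:
    """The interface contract is get_activity_history; sources are internal."""

    def __init__(self, sources: List[ActivitySource]) -> None:
        self._sources = list(sources)

    def add_source(self, source: ActivitySource) -> None:
        # Extends the implementation only; the contract is untouched.
        self._sources.append(source)

    def get_activity_history(self, customer_id: str) -> List[Entry]:
        entries = [e for source in self._sources for e in source(customer_id)]
        return sorted(entries, key=lambda e: e["date"])


history = ActivityHistoryService([web_orders])
before = history.get_activity_history("C42")
history.add_source(call_center)
after = history.get_activity_history("C42")
print(len(before), len(after))
```

Consumers call get_activity_history exactly as before; they simply receive more entries in the same format once the new source is registered.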


Recognizing the complexity of data as it exists in its disparate physical stores, it makes sense to create better logical groupings, which can be exposed through a data abstraction layer. In an SOA context, the logical groupings should be defined with services in mind, maximizing SOA benefits by defining messages aligned with the business vocabulary, and creating loosely coupled and coarse-grained services. Conquering complexity and maximizing SOA value gravitate toward the same solution: using XML schema to define business messages that form the contract between a service and its consumers. Together, the set of services created in this manner constitutes a data abstraction layer optimized for an SOA.
