Data as a Service Emerges

Service-oriented architecture has been a hot topic in both IT and business circles for the past several years. In 2006, 82 percent of companies planned to use SOA to drive increased flexibility in application development, according to InformationWeek.

To many, SOA means deconstructing applications into discrete and reusable services, an abstraction that inevitably extends to the data within application services. The principle of SOA is that applications become simpler to build as the services themselves are standardized. The classic SOA paradigm is to align IT functionality with key business processes, regardless of the underlying applications or systems.

SOA employs Web services to automate and deliver key functions. A Web service automates common business functionality and logic needed by numerous and often diverse applications.

There's widespread agreement among CIOs that Web services can simplify development and ensure processing consistency. With the explosive growth of application systems and the increasing disparity and proliferation of data across those systems, SOA principles are quickly being applied to corporate information. Whether supporting business events or data retrieval, SOA can provide an abstraction layer for programmers, who are often hindered by having to understand the cryptic business rules, calculations and codes that accompany data.

A common example of SOA in action is a Web service that calculates sales tax. An application calls the Web service, passing the product description and the location of a transaction. The Web service contains the complex set of rules to calculate sales tax: is there a local tax, a federal tax or is the product exempt from taxes? The benefit of centralizing this service is that it frees developers from having to learn the rules of a relatively small component function and from having to track ongoing rule changes.
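The sales tax example can be sketched in a few lines of Python. This is a minimal illustration, not a real tax engine: the rate tables, the exempt category, the place names and the `calculate_sales_tax` function are all hypothetical, and a production service would sit behind a network interface rather than a local function call.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical rule tables owned by the service. The values are invented
# for illustration and do not reflect any real jurisdiction.
STATE_RATES = {"STATE_A": Decimal("0.05")}
LOCAL_RATES = {"CITY_X": Decimal("0.01")}
EXEMPT_CATEGORIES = {"groceries"}


@dataclass(frozen=True)
class TaxRequest:
    product_category: str
    state: str
    city: str
    amount: Decimal


def calculate_sales_tax(request: TaxRequest) -> Decimal:
    """Apply the centrally managed rules; callers never see the rate tables."""
    if request.product_category in EXEMPT_CATEGORIES:
        return Decimal("0.00")
    rate = (STATE_RATES.get(request.state, Decimal("0"))
            + LOCAL_RATES.get(request.city, Decimal("0")))
    return (request.amount * rate).quantize(Decimal("0.01"))


if __name__ == "__main__":
    order = TaxRequest("electronics", "STATE_A", "CITY_X", Decimal("100.00"))
    print(calculate_sales_tax(order))
```

The calling application supplies only the product and location. When a rate or exemption changes, only the tables inside the service are touched.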

Data as a Service: What's Different

Data services is a relatively new concept. Business applications haven't traditionally focused on the accuracy and standardization of their underlying data content. Unlike other Web services, which ensure functional consistency and functional accuracy, data services ensure data consistency and data accuracy. Most Web services are independent of data content - the data doesn't influence the operation.

Typical business applications treat data as either active or passive. Active data is the actual content manipulated or processed within the application. Business logic and rules are focused on active data. These rules are limited to the specific application functionality. It is rare that any data is checked for accuracy.

Passive data is the content collected by an application for business purposes but not required for specific processing. When an airline ticket is purchased, for example, the application collects both active and passive data. The active data includes the origin and destination cities and the date of travel, which are all required to calculate the fare. The passenger's name and address are passive data required by the airline, but not by the application. The application doesn't typically ensure integrity or accuracy of the passive data.

The conundrum is that in order to effectively manage their businesses, companies need access to both active and passive data to support additional and diverse business functions. The challenge is twofold: the rules for active data may vary across disparate applications, and no attention is paid to the integrity of passive data.

Traditional systems employing point-to-point or object coding require programmers to know all the nuances and details of the data they touch. This is why companies end up with replicated, mismatched copies of customer addresses. The practical challenge is to apply rigor and logic to data, and to track changes to that data as business gets done, without forcing every developer to learn every rule associated with every piece of data.

Extending Web services to data gives companies well-defined data functions that are available to all applications, irrespective of hardware platform or software vendor. It allows centralized management while supporting distributed processing. Data services offer a reasonable way to decouple the data-centric rules and logic buried within an application. They provide a mechanism for managing and deploying business rules and logic that depend on data values consistently across multiple, disparate applications.

For example, consider an insurance company's call center. A customer who has just purchased auto insurance calls customer service with a question. The agent asks for the customer's name and phone number and enters them into the customer relationship management system. Behind the scenes, a request is fired off to an IBM mainframe application. This requires specialized point-to-point code that handles database access and data conversion and can submit the query and retrieve the data values. In this scenario, two pieces of code have to be written: one on the CRM side, the other on the mainframe side. Since the company has many business applications in need of customer information, the point-to-point environment described above quickly expands, as shown in Figure 1.

Prior to Web services, however basic the functionality involved, every application developer had to be aware of every other application in order to share and move data between them. This is one of the reasons that data as a service holds such significant promise for IT departments focused on efficiencies and cost-cutting.

As we assess the insurer's ability to provide real-time information to businesspeople in a more sustained way, we realize that there is enormous potential not only to drive productivity through the use of data as a service, but also to foster enormous economies of scale in code development. We help the insurer define a new set of data services that extends to a range of applications, both analytical and transactional. The services ensure the consistent deployment of data processing and information access. A simplified version of the service-enabled environment is shown in Figure 2.

In the new environment, the insurer migrates away from its numerous point-to-point connections. The data services environment contains the business rules, logic and access knowledge to deliver the information and processing horsepower back to the requesting applications. The centralized data services platform delivers multiple Web services to numerous business applications.
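As a rough sketch of this arrangement, the Python below shows one central service that owns the access and matching logic, with two consuming applications calling it instead of each writing its own point-to-point code. Every name here (`CustomerDataService`, `find_customer`, `crm_screen`, `billing_lookup`, the record fields) is hypothetical, and the in-memory list merely stands in for the mainframe data store.

```python
import re


def _normalize_phone(raw: str) -> str:
    """Keep digits only, so '555-0100' and '(555) 0100' compare equal."""
    return re.sub(r"\D", "", raw)


class CustomerDataService:
    """Hypothetical central data service; the matching rules live only here."""

    def __init__(self, records):
        # `records` is a stand-in for the back-end system of record.
        self._by_phone = {_normalize_phone(r["phone"]): r for r in records}

    def find_customer(self, name: str, phone: str):
        record = self._by_phone.get(_normalize_phone(phone))
        if record is None:
            return None
        if record["name"].casefold() != name.strip().casefold():
            return None
        return {"customer_id": record["customer_id"],
                "name": record["name"],
                "policy": record["policy"]}


# Two consuming applications share the same service call.
def crm_screen(service, name, phone) -> str:
    customer = service.find_customer(name, phone)
    return f"{customer['name']} ({customer['policy']})" if customer else "No match"


def billing_lookup(service, name, phone):
    customer = service.find_customer(name, phone)
    return customer["customer_id"] if customer else None


if __name__ == "__main__":
    service = CustomerDataService([
        {"customer_id": "C-1", "name": "Pat Lee",
         "phone": "555-0100", "policy": "auto"},
    ])
    print(crm_screen(service, " pat lee", "(555) 0100"))
    print(billing_lookup(service, "Pat Lee", "5550100"))
```

Adding a third application means writing one more caller, not one more connection to the mainframe.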

When Data as a Service Makes Sense

Remember that a Web service is a transactional request, and thus is very well-defined. If an application needs to look up a customer's phone number, the service provider will know where phone number details are located and understand the rules to use to identify the right phone number. Thus, services need to be defined and scoped in advance of their implementation. The three primary drivers for creating data services are complexity, frequency and control.

Circumstances abound where data isn't physically stored but is instead represented by a calculation with fairly complex rules. Credit risk and address standardization are examples. The rules that determine the accuracy of an address, for instance, vary by country and individual location. The rules are broad, complicated and changing. Encapsulating them within a service makes it practical for applications to include credit risk or address cleansing functionality without each application taking on the cost and complexity of the rules.
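To make the address case concrete, here is a small Python sketch of a standardization service that dispatches to country-specific rules. The rules shown are invented placeholders and are not actual postal standards. The function names and the two-country rule table are assumptions for illustration only.

```python
import re

# Hypothetical abbreviation table; real postal rules are far more extensive.
_US_SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def _clean(text: str) -> str:
    """Collapse runs of whitespace, trim, and uppercase."""
    return re.sub(r"\s+", " ", text).strip().upper()


def _standardize_us(line: str) -> str:
    words = _clean(line).replace(".", "").split(" ")
    return " ".join(_US_SUFFIXES.get(word, word) for word in words)


def _standardize_gb(line: str) -> str:
    # Placeholder rule: normalize spacing and case, keep words unabbreviated.
    return _clean(line)


_RULES = {"US": _standardize_us, "GB": _standardize_gb}


def standardize_address(line: str, country: str) -> str:
    """Single entry point; callers do not need to know the per-country rules."""
    try:
        rule = _RULES[country.upper()]
    except KeyError:
        raise ValueError(f"no address rules for country {country!r}") from None
    return rule(line)


if __name__ == "__main__":
    print(standardize_address("12  Main Street", "us"))
    print(standardize_address(" 12 main  street ", "GB"))
```

Supporting a new country or revising a rule changes only the service's rule table, and calling applications are unaffected.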

Likewise, the repeated development and duplication of logic across multiple applications worsens as a company grows. Figure 1 illustrated numerous point-to-point connections written to access the same data. Frequency simply indicates the popularity of data needed across multiple disparate platforms. Examples include customer lookup functionality in which an application needs identification or contact information for a customer. Requiring individual applications to build and maintain their own individual logic is costly and error-prone. A data service is the ideal remedy.

The third driver is control. Protecting privacy and client information is a universal challenge across industries. Ensuring that data security and access rules are consistent and that the data doesn't fall into the wrong hands has become a priority. Combined with frequently changing regulatory and legal environments, maintaining the balance of data access and data protection is only feasible when this functionality is centralized.
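One way to picture centralized access rules is a single policy table that the data service consults on every read. The sketch below is an assumption-laden illustration: the roles, the field names, the masking string and the `read_customer` function are all hypothetical, and a real deployment would also need authentication and auditing.

```python
# Hypothetical policy: which roles may see which customer fields.
FIELD_POLICY = {
    "name": {"agent", "billing", "analyst"},
    "phone": {"agent", "billing"},
    "tax_id": {"billing"},
}

MASK = "***"


def read_customer(record: dict, role: str) -> dict:
    """Return a copy of the record with fields masked per the central policy.

    Fields missing from the policy are masked for every role (deny by default).
    """
    return {
        field: (value if role in FIELD_POLICY.get(field, set()) else MASK)
        for field, value in record.items()
    }


if __name__ == "__main__":
    customer = {"name": "Pat Lee", "phone": "555-0100", "tax_id": "000-00-0000"}
    print(read_customer(customer, "agent"))
    print(read_customer(customer, "analyst"))
```

Because every application reads through the same function, a regulatory change is handled by editing the policy table once rather than patching each application.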

The introduction of data services doesn't contradict the value of Web services. Instead, it extends and expands the scope of code reusability to ensure data reusability. The justification process isn't easy, but as early adopters of data services can attest, it's worth it.
