The Quest for Mainframe Data Access for SOA

Simply SOA

Information Management Online, August 30, 2007

Robert Morris

In today's customer-centric enterprise, regulatory mandates, globalization and corporate mergers and acquisitions are accelerating the demand for real-time information. Yet the vast majority of enterprise information remains buried deep in legacy databases used by applications running on mainframe computers.

Accessing these information-rich wells of mainframe-based legacy data has long been a challenge. Organizations are looking beyond traditional data access approaches to address the issues and potential benefits of legacy integration strategies that tap into mainframe performance, security and programming resources as well as valuable data to deliver real-time results.

First, consider the primary methods of accessing legacy data: screen data capture, process extension and direct data access. Screen data capture is a fast, easy way to define an interface to an existing mainframe application: a "macro" navigates the application and grabs the desired data from the 3270 screen layer. Because it depends on the exact flow and layout of those screens, it is also a very brittle approach and may malfunction or produce unexpected results if the application screens subsequently change.
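To make that brittleness concrete, here is a minimal sketch in Python. The screen contents, field positions and the `scrape_status` function are all hypothetical; a real macro would drive a terminal emulator rather than a list of strings. The capture reads a field from a fixed row and column range, which works until the layout changes.

```python
# Hypothetical 3270-style screen, modeled as fixed-width text rows.
SCREEN_V1 = [
    "ACCOUNT INQUIRY",
    "POLICY NO: A1234567   STATUS: ACTIVE",
    "BALANCE:   001250.00",
]

# The same screen after a layout change: a new REGION field
# pushes STATUS further to the right.
SCREEN_V2 = [
    "ACCOUNT INQUIRY",
    "POLICY NO: A1234567   REGION: NE  STATUS: ACTIVE",
    "BALANCE:   001250.00",
]


def scrape_status(screen):
    """Macro-style capture: read STATUS from row 1, columns 30-35."""
    return screen[1][30:36].strip()


print(scrape_status(SCREEN_V1))  # the intended field
print(scrape_status(SCREEN_V2))  # silently returns the wrong characters
```

Note that the second call does not fail; it returns plausible-looking garbage, which is exactly the kind of unexpected result described above.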

While the process extension method avoids this problem, bypassing the 3270 screen layer and communicating with mainframe programs at the business logic layer, both of these approaches execute the existing business rules on the mainframe to retrieve or update data and, therefore, must have access to the mainframe application logic in order to function.

Each of these has its place in an organization's data access strategy, but in many cases an organization can use direct data access to integrate legacy data into its service-oriented architecture (SOA). Direct data access bypasses existing application programs altogether and directly accesses the legacy data source, eliminating the time-consuming hassle of defining custom APIs.

Direct data access is especially beneficial when building new business logic in an application server or to protect the investment in end-user applications, reporting and business intelligence (BI) tools, by giving them access to legacy data in real time.

The most popular approaches to enable direct data access are Web services and SQL. Of these, the greatest familiarity among programmers is with the use of SQL; virtually any programmer can write standard SQL commands. Thus, an SQL-based direct data access solution will allow any programmer to use virtually any software program to read and update mainframe data sources.
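As a rough illustration of what "standard SQL commands" means in practice, the sketch below reads and updates a record through a generic SQL connection. It uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in so the example runs anywhere; a real deployment would connect through whatever ODBC or JDBC driver the data access product supplies. The `policy` table and its columns are invented for the example.

```python
import sqlite3

# Stand-in for a mainframe data source exposed through a SQL driver.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policy (policy_no TEXT PRIMARY KEY, holder TEXT, premium REAL)"
)
conn.executemany(
    "INSERT INTO policy VALUES (?, ?, ?)",
    [("A100", "Lee", 420.0), ("A200", "Ortiz", 515.5)],
)


def read_premium(conn, policy_no):
    """Read one value with an ordinary parameterized SELECT."""
    row = conn.execute(
        "SELECT premium FROM policy WHERE policy_no = ?", (policy_no,)
    ).fetchone()
    return row[0] if row else None


def update_premium(conn, policy_no, new_premium):
    """Update one row and return how many rows were changed."""
    cur = conn.execute(
        "UPDATE policy SET premium = ? WHERE policy_no = ?",
        (new_premium, policy_no),
    )
    conn.commit()
    return cur.rowcount
```

Nothing in the two functions is specific to the underlying data store, which is the point: the same statements would apply to any source that the driver presents as SQL tables.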

There are two types of direct data access approaches: fully-automated solutions and mainframe adapter-based approaches. The key differentiators between these approaches are speed and cost.

With the data adapter approach, you are provided with an API, around which you must write code that is specific to each data source. This makes the approach error-prone and time-consuming, putting the onus on the user to define and configure each database to be accessed. A specific piece of code (the "adapter") for each data source has to be installed and implemented. This involves creating the database definition, describing both the data and the physical file attributes, and then setting the connection path and operation parameters to make it work.

Clearly, this involves a high degree of familiarity with the mainframe data source on the part of the user - more specifically, familiarity with each and every mainframe data source that is to be accessed. As they say in the shampoo business, "lather, rinse, repeat" - a complete solution requires the administrator to perform the entire coding-based process of data description and connectivity definition for each data source.

Using an automated data access solution, access to target databases is established at the server level, using a simple wizard-based configuration approach. All the administrator has to do is identify the target data source, and the data access solution finds the system and automatically builds the connectivity path.

Here's a good example. Enabling IDMS access can be as simple as setting the IDMS switches using a configuration wizard, which then prompts the user for the IDMS parameters. As a result, database sources are installed, configured, integrated and available for access right away, quickly and easily.

The Power of Predefined "Views"

You might think the differences between an adapter approach and automated data access would be insignificant once the data sources are defined and connectivity paths established. Unfortunately for the users of adapter-based approaches, at this point the work has just begun. For each database queried, the user has to point to the data source and construct the appropriate join logic as part of the query itself.

This requires substantial knowledge of the target databases and is especially challenging when you are working with databases that have very different structures and access methods - for example, relational and nonrelational data sources. If the joins are constructed incorrectly, not only will you get unexpected or inaccurate results, but the subsequent resource drain can actually crash the entire system.
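The sketch below illustrates the join problem on a small scale, and how a predefined view can take the join logic out of the user's hands. It is an assumption-laden example: the tables are invented, and `sqlite3` stands in for the real data sources. Omitting the join condition pairs every customer with every claim; on production-sized tables that row explosion is what drains resources.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE claim (claim_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'Lee'), (2, 'Ortiz'), (3, 'Novak');
INSERT INTO claim VALUES (10, 1, 200.0), (11, 1, 75.0), (12, 2, 900.0);
""")

# Join condition forgotten: 3 customers x 3 claims = 9 meaningless rows.
bad = conn.execute(
    "SELECT c.name, k.amount FROM customer c, claim k"
).fetchall()

# Join written correctly: one row per actual claim.
good = conn.execute(
    "SELECT c.name, k.amount FROM customer c "
    "JOIN claim k ON k.cust_id = c.cust_id"
).fetchall()

# A predefined view captures the correct join once, so later queries
# need no knowledge of how the tables relate.
conn.execute(
    "CREATE VIEW customer_claims AS "
    "SELECT c.name, k.amount FROM customer c "
    "JOIN claim k ON k.cust_id = c.cust_id"
)
via_view = conn.execute("SELECT name, amount FROM customer_claims").fetchall()

print(len(bad), len(good), len(via_view))
```

With three rows per table the difference is 9 rows versus 3; the incorrect result grows with the product of the table sizes, while the correct one grows only with the number of matching rows.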

With today's focus on service-oriented architecture (SOA), many companies assume that the best solution to mainframe data access is via Web services. There are times when Web services are appropriate and times when SQL-based access will offer a better solution. For this reason, it is advisable to implement technology that supports both approaches. In order to determine which to use, there are a couple of qualifying questions you need to ask.

First, ask yourself how big the returned data set will be. Web services tend to fall down in terms of efficiency any time you are moving large amounts of data. This is because the Web service returns the data as an XML document, as opposed to optimized or structured data. As that XML document gets bigger - when a large data set is returned - it becomes highly inefficient to work with, both to transfer across a network and to parse for relevant data. By contrast, JDBC and ODBC drivers deal with large amounts of data automatically and efficiently.
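A quick way to get a feel for the overhead is to serialize the same rows both ways and compare sizes. The sketch below is only indicative: the element names and sample rows are made up, and plain CSV is used as a stand-in for a compact row format, since actual JDBC and ODBC drivers use their own wire protocols.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Serialize rows as an XML document, one element per field."""
    root = ET.Element("policies")
    for policy_no, holder, premium in rows:
        item = ET.SubElement(root, "policy")
        ET.SubElement(item, "policyNumber").text = policy_no
        ET.SubElement(item, "holder").text = holder
        ET.SubElement(item, "premium").text = f"{premium:.2f}"
    return ET.tostring(root, encoding="utf-8")


def rows_to_csv(rows):
    """Serialize the same rows as delimited text (compact stand-in)."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode("utf-8")


# 10,000 invented rows, standing in for a large result set.
rows = [(f"A{i:06d}", f"Holder{i}", 100.0 + i) for i in range(10_000)]
xml_bytes = rows_to_xml(rows)
csv_bytes = rows_to_csv(rows)

print("XML bytes:", len(xml_bytes))
print("CSV bytes:", len(csv_bytes))
```

The repeated opening and closing tags around every field account for most of the difference, and the consumer must still parse the whole document before it can pick out the values it needs.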

The next question is who will use the service? The more removed the user is from the functional area that controls the data, the more likely that Web services will offer the best data access option, requiring little or no knowledge of or guidance regarding the mainframe data sources to be accessed. If you want to give your insurance customer, for example, the ability to enter a policy number, and pull up the account history, you will not need to instruct them on running the underlying routines required to access and collate that data.

In addition, since you can predict that not a lot of data will need to be returned and it is not a case where the user will require additional ad hoc slicing and dicing of data, this is a situation where wrapping the data access as a Web service will work well. In such cases, the user would want to implement technology that makes it easy to use Web services for direct mainframe data access.
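The policy lookup scenario might be sketched as follows. This is a simplified illustration, not a real Web service: there is no HTTP or SOAP layer, `sqlite3` stands in for the mainframe data source, and the `account_history` table and `account_history_service` function are hypothetical. What it shows is the division of labor: the caller supplies a policy number and receives a small XML document, while the SQL stays hidden inside the service.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_history (policy_no TEXT, entry_date TEXT, description TEXT);
INSERT INTO account_history VALUES
  ('A100', '2007-03-02', 'Claim filed'),
  ('A100', '2007-01-15', 'Premium paid'),
  ('B200', '2007-02-20', 'Policy renewed');
""")


def account_history_service(conn, policy_no):
    """Return the history for one policy as an XML string."""
    rows = conn.execute(
        "SELECT entry_date, description FROM account_history "
        "WHERE policy_no = ? ORDER BY entry_date",
        (policy_no,),
    ).fetchall()
    root = ET.Element("accountHistory", policyNumber=policy_no)
    for entry_date, description in rows:
        entry = ET.SubElement(root, "entry", date=entry_date)
        entry.text = description
    return ET.tostring(root, encoding="unicode")


print(account_history_service(conn, "A100"))
```

Because the result is bounded to one policy's history, the XML overhead stays small, which is why this kind of request suits a Web service.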

On the other hand, if you want to expose the data model to give the user the ability to manipulate, select or reorganize the data on an ad hoc basis, and especially, if you have a data access scenario that will return large amounts of data, you will want to take advantage of a SQL-based approach. Although this does require the developer to know how the underlying data is organized, it can be more powerful and flexible than a straight Web service. And the developers who are familiar with the mainframe data sources are also almost universally familiar with SQL.

The optimum approach, then, is one that offers Web services yet remains practical for large data sets. Build SQL-based access where needed and provide access as a Web service for easy deployment within an SOA. In this way, you get the best of both worlds - the flexibility of Web services with the power and performance of SQL. Products are available for such pairing up, so this need not all be done by hand.

While adapter technology vendors are proud to offer support for scores of databases on dozens of platforms, try to find one that makes it easy to effectively implement a data access solution that readily integrates the mainframe with all of the disparate data sources in the typical enterprise. For example, the typical adapter approach requires the installation of server software on each individual platform, everywhere that data resides, just to access it. Once the data is retrieved, the user would still have to integrate the mainframe and non-mainframe data across the organization.

All of this adds up to a big advantage for automated data access over the adapter approach in terms of rapid implementation, learning curves and time-to-benefit.

Robert Morris, GT Software chief strategy officer, is responsible for the planning, integration and marketing of GT Software product solutions to the global market. Morris has an extensive background in application development and integration including experience with CASE methodologies, distributed systems as well as midrange and mainframe environments. He speaks frequently at customer and industry events including Gartner Symposium, Java One, IBM Common, IBM Transaction and Messaging, and IBM CICS and IMS. You can reach him at rmorris@gtsoftware.com.
