The Quest for Mainframe Data Access for SOA
Simply SOA
Information Management Online, August 30, 2007
In today's customer-centric enterprise, regulatory mandates, globalization and corporate mergers and acquisitions are accelerating the demand for real-time information. Yet the vast majority of enterprise information remains buried deep in legacy databases used by applications running on mainframe computers.
Accessing these information-rich wells of mainframe-based legacy data has long been a challenge. Organizations are looking beyond traditional data access approaches to address the issues and potential benefits of legacy integration strategies that tap into mainframe performance, security and programming resources as well as valuable data to deliver real-time results.
First, consider the three primary methods of accessing legacy data: screen data capture, process extension and direct data access. Screen data capture is a fast, easy way to define an interface to an existing mainframe application: a "macro" navigates the application and grabs the desired data from the 3270 screen layer. Because the macro depends on the exact flow and layout of those screens, the approach is also very brittle; it may malfunction or produce unexpected results if the screens subsequently change.
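The brittleness of screen data capture can be illustrated with a minimal sketch. Everything here is hypothetical: the screen is modeled as a list of 80-column text lines, and `render_screen` and `scrape_balance` are invented names, not part of any 3270 product. The "macro" assumes the balance sits on a fixed row and column, so adding a single notice line to the screen makes it return the wrong field without raising an error.

```python
# Hypothetical model of a 3270-style screen: a list of 80-column text rows.
def render_screen(name, balance, extra_header=False):
    lines = ["ACCOUNT INQUIRY".ljust(80)]
    if extra_header:
        # A later application change inserts one extra row above the data.
        lines.append("** NEW NOTICE LINE **".ljust(80))
    lines.append(("NAME: " + name).ljust(80))
    lines.append(("BALANCE: " + balance).ljust(80))
    return lines


def scrape_balance(screen):
    # The "macro": assumes the balance is on row 2, starting at column 9.
    return screen[2][9:].strip()


original = render_screen("J SMITH", "1024.50")
changed = render_screen("J SMITH", "1024.50", extra_header=True)

print(scrape_balance(original))  # the balance field
print(scrape_balance(changed))   # silently reads part of the NAME row instead
```

Note that the second call does not fail; it simply returns the wrong text, which is the kind of unexpected result a layout change can produce.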
The process extension method avoids this problem by bypassing the 3270 screen layer and communicating with mainframe programs at the business logic layer. Both approaches, however, execute the existing business rules on the mainframe to retrieve or update data and therefore must have access to the mainframe application logic in order to function.
Each of these has its place in an organization's data access strategy, but an organization often has the opportunity to use direct data access to integrate legacy data into its service-oriented architecture (SOA). Direct data access bypasses existing application programs altogether and reads the legacy data source directly, eliminating the time-consuming work of defining custom APIs.
Direct data access is especially beneficial when building new business logic in an application server, or when protecting the investment in end-user applications, reporting and business intelligence (BI) tools by giving them access to legacy data in real time.
The most popular approaches to enable direct data access are Web services and SQL. Of these, SQL is the more familiar to programmers; virtually any programmer can write standard SQL commands. An SQL-based direct data access solution therefore allows any programmer to use almost any software program to read and update mainframe data sources.
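As a rough illustration of what SQL-based access looks like from the programmer's side, the sketch below reads and updates a table using only standard SQL through Python's DB-API. It makes several assumptions: the in-memory `sqlite3` database is only a stand-in for a driver connected to a mainframe source, and the `accounts` table, its columns, and the helper names are invented for the example.

```python
import sqlite3

# Stand-in data source: in a real deployment this connection would come
# from an ODBC/JDBC-style driver pointed at the mainframe data source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (acct_id INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "J SMITH", 1024.50), (2, "A JONES", 310.00)],
)


def get_balance(conn, acct_id):
    """Read one value with a plain parameterized SELECT."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE acct_id = ?", (acct_id,)
    ).fetchone()
    return None if row is None else row[0]


def credit(conn, acct_id, amount):
    """Update one row; returns the number of rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE acct_id = ?",
            (amount, acct_id),
        )
    return cur.rowcount


credit(conn, 2, 40.0)
print(get_balance(conn, 2))
```

The calling code contains nothing specific to the underlying data store, which is the appeal of the SQL route described above.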
There are two types of direct data access approaches: fully-automated solutions and mainframe adapter-based approaches. The key differentiators between these approaches are speed and cost.
With the data adapter approach, you are provided with an API, around which you must write code that is specific to each data source. This makes the approach error-prone and time-consuming, putting the onus on the user to define and configure each database to be accessed. A specific piece of code (the "adapter") for each data source has to be installed and implemented. This involves creating the database definition, describing both the data and the physical file attributes, and then setting the connection path and operation parameters to make it work.
Clearly, this involves a high degree of familiarity with the mainframe data source on the part of the user - more specifically, familiarity with each and every mainframe data source that is to be accessed. As they say in the shampoo business, "lather, rinse, repeat" - a complete solution requires the administrator to perform the entire coding-based process of data description and connectivity definition for each data source.
Using an automated data access solution, access to target databases is established at the server level, using a simple wizard-based configuration approach. All the administrator has to do is identify the target data source, and the data access solution finds the system and automatically builds the connectivity path.
Here's a good example. Enabling IDMS access can be as simple as setting the IDMS switches using a configuration wizard, which then prompts the user for the IDMS parameters. As a result, database sources are installed, configured, integrated and available for access right away, quickly and easily.
The Power of Predefined "Views"
You might think the differences between an adapter approach and automated data access would be insignificant once the data sources are defined and connectivity paths established. Unfortunately for the users of adapter-based approaches, at this point the work has just begun. For each database queried, the user has to point to the data source and construct the appropriate join logic as part of the query itself.
This requires substantial knowledge of the target databases and is especially challenging when you are working with databases that have very different structures and access methods - for example, relational and nonrelational data sources. If the joins are constructed incorrectly, not only will you get unexpected or inaccurate results, but the subsequent resource drain can actually crash the entire system.
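The cost of a badly constructed join, and the appeal of a predefined view, can be sketched as follows. This is an illustration under assumptions: `sqlite3` stands in for the real data sources, and the `customers` and `orders` tables and the `customer_orders` view are invented. The first query omits the join condition and returns every combination of rows; the second reads from a view in which the join was defined once, correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'J SMITH'), (2, 'A JONES'), (3, 'B LEE');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0), (13, 3, 60.0);

    -- Predefined view: the join logic is written once, by someone who
    -- knows the underlying structures.
    CREATE VIEW customer_orders AS
        SELECT c.name, o.order_id, o.total
        FROM customers c JOIN orders o ON o.cust_id = c.cust_id;
    """
)

# Hand-written query with the join condition left out: a Cartesian product.
bad = conn.execute(
    "SELECT c.name, o.order_id FROM customers c, orders o"
).fetchall()

# Query against the view: no join knowledge needed by the caller.
good = conn.execute("SELECT name, order_id FROM customer_orders").fetchall()

print(len(bad), len(good))
```

With 3 customers and 4 orders the faulty query returns 12 rows instead of 4. On production-sized tables the same mistake multiplies row counts by orders of magnitude, which is the resource drain the text warns about.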
With today's focus on service-oriented architecture (SOA), many companies assume that the best solution to mainframe data access is via Web services. There are times when Web services are appropriate and times when SQL-based access will offer a better solution. For this reason, it is advisable to implement technology that supports both approaches. In order to determine which to use, there are a couple of qualifying questions you need to ask.
First, ask yourself how big the returned data set will be. Web services tend to fall down in terms of efficiency any time you are moving large amounts of data. This is because the Web service returns the data as an XML document, as opposed to optimized or structured data. As that XML document gets bigger - when a large data set is returned - it becomes highly inefficient to work with, both to transfer across a network and to parse for relevant data. By contrast, JDBC and ODBC drivers deal with large amounts of data automatically and efficiently.
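The size overhead of XML can be seen with a small experiment. The sketch below is only indicative: the 1,000 rows are synthetic, the element names are invented, and CSV is used merely as a convenient stand-in for a compact row-oriented encoding (real JDBC and ODBC drivers use their own wire formats).

```python
import csv
import io
import xml.etree.ElementTree as ET

# Synthetic result set: (acct_id, owner, balance)
rows = [(i, f"CUSTOMER {i}", round(i * 1.5, 2)) for i in range(1000)]


def to_xml(rows):
    """Encode the rows as one XML document, one element per field."""
    root = ET.Element("accounts")
    for acct_id, owner, balance in rows:
        rec = ET.SubElement(root, "account")
        ET.SubElement(rec, "acct_id").text = str(acct_id)
        ET.SubElement(rec, "owner").text = owner
        ET.SubElement(rec, "balance").text = str(balance)
    return ET.tostring(root, encoding="utf-8")


def to_csv(rows):
    """Encode the same rows as delimited text, one line per row."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode("utf-8")


xml_bytes = to_xml(rows)
csv_bytes = to_csv(rows)
print(len(xml_bytes), len(csv_bytes))

# The XML consumer must also parse the whole document to get the rows back.
parsed = ET.fromstring(xml_bytes)
print(len(parsed.findall("account")))
```

Because every field is wrapped in an opening and closing tag, the XML payload grows much faster than the row data itself, and the receiver still has to parse the document before it can use any of it.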