Continue in 2 seconds

Database-Oriented Middleware

Published
  • November 01 1999, 1:00am EST

Database access is a key element to EAI (enterprise application integration), especially data-level EAI. While there was a time when databases were proprietary and difficult to access, there are currently so many solutions for accessing data that there is rarely a problem either retrieving information from any database or placing information in any database. Not only do these solutions make EAI a much easier proposition, they speak directly to the notion that the capability of modern middleware drives the interest in EAI.

As with most things, however, the situation with databases and database-oriented middleware grows complicated. Database-oriented middleware is no longer just a mechanism to "get at" data; it has also become a layer for placing data within the context of a particular common database model or format, known as a virtual database. For example, if data contained in a relational database is to be viewed as objects, the database-oriented middleware can map the information stored in the relational database so it appears as objects to a source or target application. The same thing can be done the other way around ­ mixing and matching models such as hierarchical, flat files, multidimensional, relational and object-oriented (see Figure 1).


Figure 1: Database-oriented middleware allows the viewing of data using any model, no matter how the data is stored.

Database-oriented middleware also provides access to any number of databases, regardless of the model employed or the platform upon which they exist. This is generally accomplished through a single, common interface such as open database connectivity (ODBC) or Java database connectivity (JDBC), both discussed later in this article. As a result, information stored in your Adabas, DB2, Oracle or Sybase databases can be accessed at the same time through a single interface (see Figure 2). By taking advantage of these mechanisms, it is possible to map the difference in the source and target databases to a common model, making them much easier to integrate.


Figure 2: Database-oriented middleware provides access to a number of databases at the same time.

It should be clear from these examples that database-oriented middleware plays a very significant role in the world of EAI, allowing a large number of enabling technologies to process information coming and going from the source or target systems. If a message broker or an application server requires information contained in a database, then database-oriented middleware becomes the logical solution for accessing that information. Many EAI products, such as message brokers and application servers, already contain the necessary database-oriented middleware to access the most popular databases. In fact, most message brokers and application servers come prepackaged with the appropriate adapters to access most relational databases, such as Oracle, Sybase, Informix, etc. Clearly, database access is now a problem solved, with plenty of inexpensive and proven solutions available.

However, it's important to understand the role of database-oriented middleware in the context of EAI in order the get to the larger picture. Let's face it, databases are going to serve as the primary point of integration for most EAI solutions over the next few years; and the mechanism you select to move the information in and out of the databases can make or break your EAI project. What's more, the integration with more modern middleware solutions is essential as well and carries with it its own complexities and opportunities.

What is Database-Oriented Middleware?

Database-oriented middleware provides a number of important benefits, including:

  • An interface to an application;
  • The ability to convert the application language into something understandable by the target database (e.g., SQL);
  • The ability to send a query to a database over a network;
  • The ability to process a query on the target database;
  • The ability to move a response set (the results of the query) back over the network to the requesting application; and
  • The ability to convert a response set into a format understandable by the requesting application.

In addition to these processes, database-oriented middleware must also provide the ability to process many simultaneous requests, as well as scaling features, such as thread pooling and load balancing. All this must be presented along with management capabilities and security features. As in other contexts, the approaches to providing these benefits vary greatly from vendor to vendor and technology to technology.

Types of Database-Oriented Middleware

In short, database-oriented middleware is "all the software that connects some application to some database." Like primitive middleware layers, database-oriented middleware allows developers to access the resources of another computer (in this case, a database server) using a single, well-defined API. While database-oriented middleware appears straightforward in its architecture, there are many products and standards that make up this market, and each accomplishes the task in very different ways.

Although there exist several types of middleware, they are all basically native middleware ­ call level interfaces (CLIs) and database gateways. That is, native middleware is middleware created for a specific database. For example, middleware provided by Sybase to access the Sybase databases from C++ is native database-oriented middleware. Native database-oriented middleware provides the best performance and access to native database features (such as stored procedures and triggers), since the middleware has been created for a particular database. However, once the links to a database have been created using native middleware, major renovations will be required in order to change databases.

CLIs, such as ODBC and JDBC (both discussed later in this article), provide a single interface to several databases. CLIs are able to translate common interface calls into any number of database dialects, as well as translate the response sets into a common response set representation understandable by the application making the request to the database.

Database gateways are able to provide access to data once locked inside larger systems, such as mainframes. They can integrate several databases for access from a single application interface. They can re-map archaic database models (flat files, ISAM, VSAM, etc.), so they appear more traditional and translate queries and information as they move in and out of the database gateway software.

ODBC

ODBC is really not a product but a standard that Microsoft created several years ago just after the Windows revolution. ODBC is a CLI that simplifies database access from Windows (as well as a few other operating systems) by allowing a developer to make a single API call that works with most relational databases, along with a few that don't follow the relational model.

Simply put, ODBC is a translation layer (as is JDBC ). Like all middleware layers, ODBC provides a well-defined and database independent API. When using the API, ODBC utilizes a driver manager to determine which database the application would like to communicate with and load (and unload) the appropriate ODBC driver. As a result, an application using ODBC is database independent. However, if there are any database-specific calls (such as passing SQL directly through to the database or invoking a number of stored procedures and triggers), that application is no longer database independent, since it's bound to a particular database brand. In that case, it may make more sense not to use ODBC, but rather move to a native database middleware layer.

ODBC is currently available in a 32-bit version, and most relational databases have ODBC drivers available. Although ODBC is free, the drivers are not. These drivers can be purchased from the database vendors or through third-party ODBC driver vendors. Most popular application development tool vendors provide database access features using ODBC. In fact, it's the only way Microsoft's Visual Basic and other Microsoft tools can talk to a database.

Does ODBC Hinder Performance?

There is a great deal of debate concerning ODBC's ability to provide performance, to access database features and to provide a stable application deployment platform. While it is true that ODBC had a rough start, failing to provide the performance that developers were looking for, the ODBC of today (generally speaking, since performance is driver dependent) provides high-performance database access.

There have even been instances where ODBC has outperformed the native middleware layer. What's more, ODBC can access most of the native database features, such as access stored procedures and triggers, tracking transaction and recovering from errors ­ albeit without the same degree of control as when using middleware that's native to a particular database.

Bottom line ­ ODBC is good enough for most EAI projects, especially those using Microsoft platforms. ODBC should be considered when operating in a multi-database environment with the need to access several different databases from the same application or integration server (message broker or application sever). ODBC is also a good fit if a database is likely to change during the life cycle of the application (such as scaling to a larger user load).

Using ODBC enables an EAI solution to move from database to database quickly. However, ODBC should be avoided if one is wedded to a particular database or if an EAI solution requires a large number of proprietary database functions.

JDBC

Now an accepted and stable approach, JDBC from JavaSoft was the first standard Java-enabled database API. Functionally equivalent to ODBC, JDBC provides Java developers with a uniform interface to most popular relational databases from most Java-enabled development or application processing environments.

The JDBC API defines a set of Java classes that allow an applet, servlet, Java bean or Java applications to connect to a database. In most cases, such an applet will be one that links back through the network to remote relational database servers, such as Sybase, Oracle or Informix. The native Java JDBC classes, sold or given away by the database vendors, exist with the custom application classes and provide a "pure Java" and portable mechanism for database access. These allow one to link to any database from any platform that supports Java. At least, that's the idea. JDBC also provides uniformed database access for many EAI-enabled middleware products, such as message brokers, application servers and even traditional MOM.

JDBC Java classes allow the developer to use native Java to issue common SQL statements to request information from a remote database, as well as process the result set. Since JDBC is another translation layer, like ODBC, Java applications that employ JDBC are database independent and can access any number of databases through a single JDBC interface. For example, you may gather data from an Oracle database running remotely, update a local Sybase database and delete a record from a DB2 databases running on a mainframe, all from the same Java applications using one common interface ­ JDBC.

Two major layers make up JDBC: the JDBC API and the JDBC driver API (see Figure 3). The JDBC API provides application-to-JDBC manager communications. Developers use this API to access database services using standard Java mechanisms. It is incumbent upon the database vendor to provide the JDBC driver interface. Vendors may also use a traditional ODBC connection through a JDBC to ODBC bridge.


Figure 3: Two JDBC Layers

As alluded to earlier, the drivers are really a group of Java classes (including java.sql.Connection, java.sql.Statement, java.sql.PreparedStatement, java.sql.CallableStatement, and java.sql.ResultSet). When developers want to access a database with JDBC, they can use these classes from the native Java applications, which can link to the database, send a request and process the returning result set.

The java.sql.DriverManager handles the loading and unloading of the proper DBMS driver. The java.sql.Connection interface exposes the database to the developer, representing the connection as a set of objects. The java.sql.Statement interface provides the developer with a container for executing SQL statements using a connection to the database. The java.sql.ResultSet interface exposes the requested data as native Java for processing by the Java applet or application.

The JDBC manager, just as the ODBC driver manager, loads and unloads database drivers as required by the Java applet or application. JDBC supports a connection either to a single or multiple database servers. This means that an applet can connect to the inventory database in the warehouse as well as a public database on the Internet ­ at the same time.

The DBMS supports JDBC through the JDBC driver interface with each driver providing an implementation of the java.sql.Connection, java.sql.Statement, java.sql.PreparedStatement, java.sql.CallableStatement and java.sql.ResultSet classes. What's more, the driver must implement the java.sql.Driver interface for use by the java.sql.DriverManager interface.

If a developer needs to access a database from Java, he or she obtains the java.sql.Connection object directly from the JDBC management layer and the java.sql.DriverManager. The driver managers leverage the URL string as an argument, allowing the JDBC management layer to locate and load the proper driver for the target database. The driver manager performs this "magic" by looking at each driver, finding the only one that can connect to the URL. Sometimes the URL may require a sub-protocol that the driver supports. Once all of this is complete, the driver connects to the DBMS and returns the proper java.sql connect object for accessing the database.

In order for the driver manager to locate the correct driver, each driver has to register with the driver manager using the DriverManager.registerDrive method, invoked from the applet. JDBC, using Java's rather limiting security, is only able to use drivers coming from the local file system or from the same class loader. These are limitations that the tool vendors are overcoming through their own custom JDBC implementations.

Types of JDBC Drivers

JDBC drivers fit into one of four categories: JDBC-ODBC bridge driver, a native-API partly-Java driver, a net-protocol all-Java driver and a native-protocol all-Java driver.

JDBC works with ODBC by providing a JDBC-ODBC bridge to translate JDBC calls to functions understandable by the ODBC API. Although JDBC and ODBC are similar, they take slightly different approaches toward connecting to databases. Therefore, when using this architecture, developers must endure the overhead of a translation layer communicating with another translation layer.

The JDBC-ODBC bridge driver provides Java developers with JDBC access using most ODBC drivers. Not only is this the most flexible method, it is also the most homogeneous. Typically, the ODBC binaries must be loaded on each client machine using the driver. That being the case, it is likely that the Java application will be locked into the Windows platform where ODBC is more native. An easier option may be to access a shared set of ODBC binaries existing on an application server using JDBC. However, the architecture involved is much more complex and adds time to a development project.

The ODBC bridge is a requirement of JDBC if the standard is expected to support the vast majority of relational databases (and sometimes non-relational databases) currently available. This is the result of failure on the part of some of the database vendors and gateways to larger systems to offer JDBC drivers. Eventually, middleware vendors will offer the JDBC drivers, thus eliminating the need to communicate through more than a single translation layer. However, the performance hit will not be as significant as might be anticipated.

A native-API partly-Java driver is a middle-of-the-road approach. This driver converts JDBC calls into calls on the client API for any number of target databases (including Oracle, Sybase, Informix and DB2). However, for this option to be successful, some binary code must be loaded on the clients. As a result, it has many of the same limitations as the JDBC-ODBC bridge driver, but does not require access to ODBC.

The net-protocol all-Java driver translates JDBC calls into a DBMS-independent net protocol, which is again translated into a native DBMS protocol by a server. As a result, this driver can connect its pure Java clients to any number of databases with the database vendor specifying the native protocol employed. This is one of the most flexible of all JDBC solutions. This architecture is typically seen on intranets.

For Internet access, this architecture must support additional security requirements. Database vendors are working to assure that they support net-protocol all-Java drivers.

Finally, a native-protocol all-Java driver directly converts JDBC calls into the network-native DBMS network protocol. This driver architecture provides direct calls from the client to the database server. This architecture is most popular for intranet access, since it not only uses pure, portable Java but, by taking a direct route to the data, also provides the best performance.

Database vendors need to provide for this solution. Most have, or are working on, drivers for this architecture.

Other JDBC Features

The beauty of JDBC goes beyond its ability to link to and retrieve data from remote DBMSs to its robust array of database features. JDBC is able to access BLOBs (binary large objects) ­ handy for moving large binary information to and from the database. There is a data conversion mechanism as well, allowing JDBC to map the data back into Java by converting some SQL types into Java types. In addition, JDBC is able to support threading for pooling database connections, thus providing the source or target application (or integration server) with the ability to operate against the database asynchronously.

The ability to support transactions is a useful feature. Utilizing this JDBC native feature, developers can define a starting point and endpoint in a set of homogeneous or heterogeneous database operations. There are a few options here. Developers can set the JDBC transaction manager to "auto-commit," meaning that each database command is carried out as the applet invokes it. Developers may decide to turn off "auto-commit," allowing them to define several database commands as individual transactions, something that will complete all operations successfully, or put everything back the way it found it. This option allows complete recovery if any of the commands in the transaction fail. For example, an operation that records a sale in three separate databases (e.g., inventory, sales and customer) would be a good candidate for a JDBC transaction since the failure of any of the updates would result in none of them completing. This maintains database integrity. Most popular DBMSs (such as Oracle) provide native transaction features. JDBC simply extends those features to the Java applets or application.

JDBC supports database cursors as well, using the ResultSet. getCursorName() method of JDBC. A database cursor allows the developer to return a cursor name that points to the result set that actually still resides, as a cursor, on the database server. This saves the network the overhead of having to send the entire result set down to the requesting application. Using this feature, the Java applets or applications can move through the data, only bringing the data required across the network. This feature also allows positioned updates and deletes to be invoked. However, the target DBMS must support this feature in order for JDBC cursors to work. Fortunately most do.

There are even more benefits, such as the use of SQL escape syntax. This allows developers to map escape syntax to DBMS-specific syntax. Stored procedures from JDBC can be invoked as well, simply by invoking them from JDBC and passing in the proper arguments. Finally, scalar functions are available, such as ABS(), DEGREES(), WEEK() and DAYNAME().

Java, JDBC and EAI

The use of Java as a development language and architecture for EAI is widespread. At this time, JDBC seems to be filling the enterprise space more than the Web space. That trend will most likely continue as the future brings a Web-enabled, existing corporate database. So, while JDBC will exist on the Web, its real value will be on the inside of the firewall.

In many respects, JDBC is bringing the traditional, complex multi-tiered world of Web-enablement back to the EAI problem domain. With the advent of server-side Java (e.g., EJB, application servers, etc.), JBDC does not have to be client-only anymore. Many tool vendors are employing both RMI (Remote Method Invocation) and JDBC to provide a flexible and complex architecture to solve a number of application problems, problems that include EAI. The tradeoff, however, is complexity and proprietary approaches. When a standard is used in such a way as to make its architecture proprietary, then the value of that standard is diluted. The danger is that JDBC is heading toward that unhappy ending.

OLE DB

OLE DB, referred to by many as the big brother of ODBC, is a specification that defines a set of data access servers capable of facilitating links to any number of data sources. As a result, developers have the ability to manage different data sources as a single virtual database. OLE DB allows access to data using a standard COM interface.

OLE DB gives developers the means to access data that resides in relational databases, documents, spreadsheets, files and electronic mail. Developers, through COM, can easily integrate object-oriented and multidimensional (real cube) databases with OLE DB. When using OLE DB, the database simply becomes a component known as a data provider. Any component that uses a native data format and exposes methods through an OLE DB interface is considered a data provider, including relational databases (using ODBC), an ISAM file, text file, e-mail, Microsoft Word or a data stream (see Figure 4).


Figure 4: OLE DB

The idea here is to create an individual OLE DB component object to deploy additional features that are layered on top of the data providers. These individual OLE DB components are called service providers. Service providers are like query processors in that they allow applications to take advantage of providers that interconnect different combinations of data (homogeneous or heterogeneous). The data, regardless of model (object-oriented, relational, multidimensional, etc.), exists as single view. This solves the relational-bond limitations when using ODBC.

The other side of data providers are OLE DB data consumers, applications written to a single data provider, or generic consumers that work with any number of data providers. For example, Microsoft's Excel, Word and Project can become data consumers.

ODBC remains a piece in the puzzle, but in a diminished role. It is simply a mechanism to communicate with relational databases from OLE DB. Microsoft has an updated ODBC driver manager with an OLE DB provider, making OLE DB compatible with any ODBC accessible database.

So how is OLE DB programmed? OLE DB provides 55 new interfaces grouped into seven object types: DataSource, DBSession, Command, Rowset, Index, ErrorObject and Transaction. An object type is simply a set of methods (interfaces) that an object must expose. For example, developers will define the Transaction Objects using a group of methods that any data consumer can request from a transaction service.

It is likely that few EAI developers will have to deal with the OLE DB interface directly. Microsoft is perfecting a new set of products and development environments that allow developers to build applications with OLE DB hidden behind many easy-to-use layers. This OLE DB interface is analogous to the relationship with ODBC and development tools.

Going Native

In addition to ODBC, JDBC, OLE DB and other database translation interfaces, there are many other native database-oriented middleware products. These are APIs provided by a database vendor or a third-party with access to a particular database. In the past, these were often older C and C++ libraries. Now, most EAI development tools ship native database-oriented middleware with their products.

The advantage of using native database-oriented middleware rather than ODBC, JDBC or OLE DB is the ability to provide high-performance database access, along with the ability to access features native to a specific database. However, using native database-oriented middleware binds the user to that middleware vendor, since the EAI application uses calls specific to that particular database. That's the tradeoff.

Database Gateways

Database gateways (also known as SQL gateways) are APIs that use a single interface to provide access to most databases residing on many different types of platforms (see Figure 5). They are like virtual database middleware products, providing developers with access to any number of databases, residing in environments typically not easily accessible, such as a mainframe. For example, using an ODBC interface and a database gateway, it is possible to access data residing in a DB2 on a mainframe, Oracle running on a minicomputer and Sybase running on a UNIX server. The developer simply makes an API call, and the database gateway does all the work.


Figure 5: Database Gateways

Database gateways translate the SQL calls into a standard format known as the Format and Protocol (FAP), the common connection between the client and the server. It is also the common link between very different databases and platforms. The gateway can translate the API call directly into FAP, moving the request to the target database and translating the request so that the target database and platform can react.

There are a number of gateways currently on the market such as Information Builders' Enterprise Data Access/SQL (EDA/SQL), in addition to standards such as IBM's Distributed Relational Data Access (DRDA) and ISO/SAG's Remote Data Access (RDA).

EDA/SQL

EDA/SQL is a wonderful, general-purpose database gateway for several reasons. It works with most database servers and platforms, bridging many enterprises where there might be dozens of servers running on dozens of different platforms needing to be accessed from a single application ­ perfect for EAI. It uses ODBC as the interface rather than a proprietary API.

EDA/SQL can access more than 50 relational and non-relational database servers and can access all these databases using ODBC. There are several EDA/SQL components including the API/SQL, EDA/Extenders, EDA/Link, EDA/Server and EDA/Data Drivers. API/SQL provides the call level interface (ODBC), allowing the developer to access the EDA/SQL resources. EDA/Extenders are really redirectors of SQL calls, which route the request across the network. EDA/Link provides the network connections by supporting more than 12 communication protocols, and EDA/Server resides on the target database processing the requests on behalf of the requesting application or integration server. Finally, the EDA/Data Drivers, like ODBC drivers, provide access to more than 50 different target databases.

RDA

RDA is not a product. It is a standard for developers to access data. RDA uses OSI and supports dynamic SQL. RDA also allows the client to be connected to more than one database server at the same time. However, it does not support typical transaction-related services and, due to lack of vendor support and its inability to snap into popular EAI development environments, it's no longer relevant for EAI.

DRDA

DRDA is an IBM database connectivity standard that has the support of many database heavyweights such as Sybase, Oracle, IBI and Informix. Like other database gateways, DRDA attempts to provide easy database connectivity between any number of databases operating in multi-platform environments.

DRDA defines database transactions as remote requests, remote units of work, distributed units of work and distributed requests. A remote request means that one SQL request is sent to one database. A remote unit of work means that many SQL commands are sent to one database. A distributed unit of works means that many SQL commands are sent to many databases. However, each command is sent to one database. Finally, a distributed request means that many SQL commands are sent to many databases, and each command can execute on several databases.

While DRDA is a well-defined standard, the fact that DRDA requires that databases comply with standard SQL syntax diminishes the benefit of DRDA to organizations where many different systems run many different databases at different stages of maturity.

Ready for Prime Time

The strongest point in support of database-oriented middleware is that the technology is very mature, well tested and ready for most EAI applications. In other words, database access should not be a major concern for most EAI projects.

Problems that remain to be solved for database-oriented middleware include the ability to make it more scalable. As things stand now, a TP monitor or application server will have to be employed to multiplex the database connections on behalf of the application or EAI solution. Multiplexing (or connection pooling), or the ability to remove the one connection per request restriction from database-oriented middleware, is becoming part of many database-oriented middleware layers, including JDBC and ODBC.

Moreover, as interest is renewed in nonrelational database models, such as multidimensional, hierarchically and object-oriented, middleware is learning how to emulate and translate data from model to model. Today it is possible to view a relational database using the object-oriented model and a hierarchical database as a relational database. These emulation and translation services make EAI a much easier proposition, since they make it possible to map very heterogeneous environments to a common database model and thus provide an easier starting point for integration. Certainly this adds the most value to data-level EAI.

This world won't change much even as EAI grows in popularity, due to the simple fact that there exist solutions to most of these problems. Isn't it nice to come upon an area of technology where there are few problems left to solve?

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access