John would like to thank Nancy Vodicka, XML product manager from DataDirect Technologies, for contributing this month's column.

Corporate IT requirements are increasingly driven by the need to present and exchange data in different formats. The Internet is rapidly becoming the platform for data integration, as companies struggle to move and present information internally and with trading partners. This introduces a major struggle between the data they have - usually stored in a variety of relational databases - and what they need - Web-friendly extensible markup language (XML).

Consider the business processes required for a typical online catalog retailer. They must display up-to-date product details on the Web, take orders online, pass messages to suppliers or shipping departments for fulfillment and interface with accounting and inventory control systems. The primary data storage in these companies is a relational database, but their data must be readily available in a format that is suitable for the various Web interactions. SQL queries have long been used to combine data from relational databases for different purposes and audiences. However, these queries do not easily produce data suitable for multipurpose data exchange and presentation.

While most Web sites use data from relational databases, it rarely looks "relational" on a Web page. The relational data is used to build hierarchical representations and then presented to the user in a visually pleasing view online. Likewise, most Web messages have a strongly hierarchical organization that looks nothing like the relational tables.

XML has emerged as the best option to represent virtually any kind of data for exchange or presentation on a Web site or in a Web message. Web sites convert the XML to HTML using style sheets and Web messages exchange data directly as XML. XML is a text-based format which structures data as an easy-to-maneuver outline. It is easily written, read and interpreted by any program and also easily exchanges and integrates data across applications. Unlike relational databases, XML uses both hierarchy and sequence to structure data, making it a perfect fit for Internet data exchange.

It is clear that developers must use relational data in XML applications, but what are the implementation options? Currently there are four choices: custom coding, proprietary XML extensions from database vendors, SQL 2003's SQL/XML extensions and the XQuery language. All offer the necessary capabilities, but each has very different ramifications for the architecture of software systems.

Custom Coding

The first option is custom coding, which is probably the most widespread approach to integrating XML and relational data. This approach involves writing a program using JDBC or ODBC and SQL queries to retrieve the relational data, using an XML API such as DOM or SAX to create the XML structures and transforming or formatting the XML for display as HTML using something like XSLT.

While custom coding may be both standards-based and portable, it often proves to be an expensive investment for several reasons. First, the code tends to be complex and tedious to write. It is often necessary to represent the data in multiple intermediate formats before achieving the final desired XML structures. Additionally, it is difficult to get this type of code to perform well. And finally, maintaining the code may be time-consuming and difficult since each format change requires new coding, followed by testing cycles.

Proprietary XML Extensions

The next option is to use proprietary XML extensions from database vendors. Relational vendors have long recognized the need for integration with XML and most now provide SQL extensions and other tools to assist their users. These tools vary widely in quality, performance and usability. They are often difficult for programmers to learn and cumbersome to implement. In many cases, the tools only work with a specific version of the vendor's database; therefore, moving to a new version of the database or supporting multiple versions of a vendor's database can be an expensive project. Proprietary extensions provide no portability. They do provide an alternative to custom coding for companies who rely on only one database vendor and generally support only one version of the database.

Using Standard SQL/XML

Another way to create XML from relational data is to use standard SQL/XML. Early on, standards bodies such as INCITS, ANSI and ISO recognized the need for a standard for adding XML support to relational databases and, as a result, added XML publishing functions to SQL 2003. These functions enable the creation of any desired XML structure with the full power of SQL. For an experienced SQL programmer, they are quick to learn and easy to use. This is the simplest tool that solves the problem well, and it fits naturally into existing relational infrastructures. Standards-based, database-independent SQL/XML implementations provide both simplicity and portability.

XQuery Language

The final option for using relational data with XML applications is the XQuery language. The World Wide Web Consortium (W3C), an XML and Internet related standards body, has designed the XQuery language to query XML in the same way that SQL queries relational data. XQuery also works efficiently for XML views of data sources that are not represented as XML. Most relational vendors are implementing XQuery based on SQL/XML views of their relational tables, and some third-party tools can provide XML views of any relational database, allowing data to be combined with XML files or in a program. Likely to become a recommendation soon, XQuery easily combines multiple data sources making it a clear winner for data integration.

In summary, while custom coding, proprietary XML extensions, SQL/XML and the XQuery language are all viable options for using relational data in XML applications, SQL/XML and XQuery provide the most flexible and extensible alternatives. Both are extremely useful for businesses that need a portable way to query their relational data and produce XML. SQL/XML fits well into existing relational infrastructures, and requires little new learning from SQL programmers. XQuery is powerful for data integration and is a better fit for many XML environments.

Increased demand for collaboration inside and outside the enterprise will continue to drive requirements for data integration. IT management will feel the strain if they are unable to deploy flexible and manageable infrastructure for sharing data. Standards-based approaches, such as XQuery and SQL/XML, show the best promise for IT departments seeking to meet their company's evolving business needs.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access