My April column discussed the direction that DBMS vendors such as IBM, Microsoft and Oracle are taking toward integrating business intelligence features into their database products. I began the discussion by reviewing ETL in the database. I'll continue the discussion by reviewing the online analytical processing (OLAP) marketplace and looking at what the DBMS vendors are doing in the area of OLAP. My next column will complete this three-part review by looking at the data mining capabilities of DBMS products.
OLAP has always been a controversial topic. Experts have argued for years about the pros and cons of the various approaches to storing and processing dimensional data and about which approach is the best to use for any given application. Of the many approaches to OLAP, four main options dominate (see Figure 1).
Figure 1: Options for Supporting OLAP
Option 1: Store the data in a relational database (usually using a star or snowflake schema table design) and access it via SQL statements. The relational database may be managed by either a client- or server-based relational DBMS (RDBMS). The benefit of this option is that it avoids the need to purchase a specialized multidimensional database product. This approach, however, has often been criticized for poor performance and for the limited power of SQL as a language for OLAP. In response to these criticisms, RDBMS vendors have made significant improvements over the past two years to the analysis power of SQL and to the performance of star schema manipulation. Any product that provides a relational view of data can support this option. Key vendors include Oracle, IBM, Microsoft, Teradata and SAS.
Option 2: Use a vendor-provided OLAP engine to retrieve data from an RDBMS and perform more complex processing on the retrieved data. The processing to be done is defined using vendor-supplied visual tools or, in some cases, by applications that execute OLAP language statements through a provided API. The OLAP engine may reside on the same operating platform as the RDBMS, may be integrated into the RDBMS or may operate as middle-tier server in a three-tier architecture. Key vendors in this area include Applix, IBM, Microsoft, MicroStrategy and Oracle.
Option 3: Store the data in a multidimensional database (in the form of a multidimensional array or cube) and access it using OLAP visual tools or language statements. The database may be managed by either a client- or server-based multidimensional DBMS (MDBMS). This approach has been popular because the MDBMS can be optimized for OLAP (i.e., the MDBMS incorporates an OLAP engine), which provides good performance, especially when large amounts of memory are available for array processing. This option, however, does not always scale to handle large amounts of data and large user populations. As a result, companies often use this approach for manipulating summarized data that has been extracted from a relational database. Key vendors in this area include Applix, Cognos, Comshare, Hyperion, Microsoft, Oracle and SAS.
Option 4: Store small amounts of data in files on a user's desktop computer and access it via an OLAP engine. Often the data for manipulation is extracted from an RDBMS or MDBMS. To date, this option has been very popular, but the growth in the use of Web-based thin clients means that companies are now looking for solutions that move client OLAP processing to Web-based servers. Key vendors in this area include Business Objects and Cognos.
Options 2, 3 and 4 are often known by the terms ROLAP, MOLAP and DOLAP, respectively. Be warned, however, that these terms are misleading and often abused by vendors. To make matters more interesting, some vendors support a hybrid of ROLAP and MOLAP. This is often known as hybrid OLAP or HOLAP. Key vendors that support HOLAP include Applix, Microsoft and Oracle.
What Are the DBMS Vendors Doing?
Much of the discussion so far is probably familiar to many readers, but it is necessary to review these various OLAP architectures in order to explain the current OLAP support and future directions of major DBMS vendors such as IBM, Microsoft and Oracle.
As shown in the sidebar, Microsoft is now the OLAP market leader with a 24.4 percent share of the market. Microsoft's OLAP support is contained in Microsoft Analysis Services, which is a no-cost component of Microsoft's SQL Server 2000 RDBMS. Microsoft's strategy is to provide SQL Server to handle a tabular view of data and the Analysis Services OLAP engine to handle a multidimensional cube view of data. Analysis Services provides the OLE DB for OLAP API and the MDX language for accessing and manipulating multidimensional cubes. Data materialized through these cubes can be physically stored in relational tables or a multidimensional store. Microsoft Excel and Microsoft Office both provide access to Analysis Services data. Many third-party business intelligence (BI) tools support Analysis Services, including Panorama Software (the company from which Microsoft acquired the original technology for Analysis Services in 1996) and ProClarity.
It is now almost three years since Microsoft shipped the last release of SQL Server and Analysis Services. During that time, Microsoft (together with Hyperion and SAS) has led a multivendor initiative to develop a Web services messaging interface between OLAP and data mining applications and a service provider such as an OLAP engine. This XML for Analysis (XMLA) interface is based on OLE DB for OLAP and the MDX OLAP language (now known as mdXML). The XMLA Council (see www.xmla.org) developing the interface specification now has a membership of more than 25 vendors. The objective of XMLA is to provide an open standard API for accessing and manipulating OLAP and data mining data. Another benefit of XMLA is that it also supports thin Web clients and does not require COM components to be installed on the client like OLE DB for OLAP.
The next major release of SQL Server (code name: Yukon) is expected in the late 2003/early 2004 time frame. Public information about the contents of this release is sparse; but Microsoft has indicated that the release will provide enhanced XMLA support, an improved ETL tool, a new Reporting Services component and Visual Studio for Yukon (which will offer an integrated environment for developing SQL, XML and OLAP applications). These enhancements are likely to increase the pressure on competing BI vendors even more.
Given Microsoft's focus on XMLA, it is worth mentioning at this point that the Java community is developing a J2EE object-oriented interface to OLAP service providers known as JOLAP (see www.jcp.org/en/jsr/detail?id=069). JOLAP provides a standard set of object classes and methods for business intelligence and makes extensive use of the OMG Common Warehouse Metamodel (CWM) specification (see www.cwmforum.org). This raises the question of what the relationship is between JOLAP and XMLA.
Hyperion is a key member of the group developing the JOLAP specification and intends to support both XMLA and JOLAP. Its position is that XMLA is a linguistic interface (much like SQL) that has no preference for programming language or object model and given that JOLAP supports a Web services architecture, there is no reason why a JOLAP environment cannot extract an mdXML statement from an XMLA SOAP message and process it through the JOLAP API. This, in fact, is exactly how the Hyperion support of XMLA and JOLAP will work. SAS is also a member of both the XMLA and JOLAP groups and sees these specifications as key elements of its strategy to provide interoperability between SAS Version 9 and external products.
Oracle's support for OLAP goes back to 1995 when it acquired its Express OLAP product. Since then, Oracle has had mixed success with Express. As shown in the sidebar, Oracle's share of the OLAP marketplace slipped in 2002 to seventh place with a share of 4.7 percent. Oracle hopes to turn this situation around with increased emphasis on its BI capabilities and an enhanced OLAP architecture that incorporates the Express OLAP engine into the Oracle database server. This enhanced OLAP architecture ships with Oracle9i Database Enterprise Edition Release 2 and is a separately priced option.
The OLAP support in Oracle9i provides both SQL and Java-based OLAP APIs for accessing and manipulating dimensional data. The legacy Express SPL language API is also supported. Multidimensional cubes are stored in analytical workspaces, which are managed in an Oracle database using an abstract data type. SQL access to analytical workspaces is done using table functions or relational views. An OLAP meta data catalog is also provided, which is OMG CWM compliant.
No specific OLAP tools are provided with the OLAP option, but existing Oracle capabilities such as PL/SQL, Oracle Reports, Oracle Discoverer and Oracle BI Beans can query and analyze analytical workspaces. The OLAP API is Java-based and supports a rich OLAP manipulation language, which can be considered to be the multidimensional equivalent of Oracle PL/SQL. The proprietary nature of Oracle's OLAP APIs and languages has always been an inhibitor to third-party tools developers, but Oracle has indicated it will support the proposed JOLAP specification. It is unlikely, however, that Oracle will support XMLA, as it views XMLA as a Microsoft-dominated development effort.
The Oracle OLAP option, like Microsoft SQL Server, provides a HOLAP capability. Oracle places strong emphasis on the fact that Microsoft employs two engines (one for relational and one for OLAP) to support a HOLAP environment, whereas Oracle incorporates this into a single database environment. There are obvious benefits to the Oracle architecture in the areas of common interfaces and administration, but it remains to be seen if this architecture can offer the performance and scalability it promises. Discussion with Oracle development managers indicates they are confident that the OLAP option can achieve the same performance as that provided by Express.
Like Oracle, IBM has had mixed success with OLAP. To date, IBM's offering in this area has been the DB2 OLAP Server. Originally, this product provided a middle-tier OLAP engine that enabled applications to store and manipulate dimensional data in either a DB2 relational database or a Hyperion Essbase MDBMS, or both. DB2 OLAP Server applications access dimensional data using the Hyperion Essbase OLAP API and language.
The relational storage option of the DB2 OLAP Server gained limited acceptance, and IBM ceased work on this option in favor of working with Hyperion to add hybrid OLAP capabilities to Essbase and also to develop a System/390 version of Essbase for the DB2 OLAP Server.
A major new OLAP initiative from IBM is known by the code name Aurora. Although Aurora has not been announced, it has created a lot of speculation and confusion in the industry. The confusion is caused by rumors that IBM is developing a DB2- based OLAP engine and a multidimensional database capability. If this were the case, then it would, of course, have a serious impact on the IBM/Hyperion relationship. Some analysts have even suggested that this speculation is beginning to affect Hyperion's stock price.
The IBM Aurora project is not about developing an OLAP engine or an MDBMS. The initiative is instead intended to make DB2 more OLAP-aware and is designed to make DB2 easier to use in conjunction with other OLAP products. Several OLAP vendors (including Hyperion) are, in fact, beta testers for the project.
Aurora contains three main capabilities: dimensional meta data support for DB2, tools for exploiting DB2's relational OLAP support and Web services support for accessing dimensional data managed by DB2.
The Aurora dimensional meta data facility addresses the issue that although many applications store dimensional data in relational tables (usually in the form of fact and dimension tables), the DBMS is not aware of the dimensionality, or OLAP semantics, of this data. Aurora provides a comprehensive meta data facility to define these semantics and an XML import/export capability to enable meta data interchange with external OLAP products. At present, Aurora is not OMG CWM compliant.
Aurora also helps DB2 developers exploit the power of DB2's relational OLAP capabilities (materialized query tables, clustering capabilities for indexing and storing dimensional data, etc.). Much of the Aurora development here will focus on tools and advisors that help developers employ materialized query tables (MQTs) not only to improve performance, but also to create aggregated data and multidimensional cubes for export to third-party OLAP tools.
The third, and last, component of Aurora enables application access to DB2 dimensional data using Web services. This facility will allow applications to issue XML- based queries across a Web-based connection. This support will be based on XML query languages such as XPath, rather than on OLAP languages such as mdXML.
We see then that IBM's strategy is to focus on developing DB2 capabilities for supporting OLAP using relational technology (option 1 outlined earlier in this column) and to enhance its integration with third-party vendors that support OLAP options 2, 3 and 4. Microsoft and Oracle strategies, on the other hand, are to support a hybrid OLAP architecture (options 2 and 3). In the case of Microsoft, multidimensional cubes are processed using MDX and, in the future, mdXML. These cubes can be physically stored in a relational database or a multidimensional store. Oracle supports both SQL and OLAP APIs to multidimensional cubes (analytical workspaces), which are physically maintained in a relational database.
In conclusion, it is important for BI developers and architects to understand the OLAP strategies of the DBMS vendors and to determine how these strategies support and integrate with the BI framework and products being used in their organizations.
Register or login for access to this item and much more
All Information Management content is archived after seven days.
Community members receive:
- All recent and archived articles
- Conference offers and updates
- A full menu of enewsletter options
- Web seminars, white papers, ebooks
Already have an account? Log In
Don't have an account? Register for Free Unlimited Access