As the business environment has become increasingly competitive, the need to use corporate data as a strategic resource has intensified. However, most organizations in today's technology-based businesses are data rich and information poor. Much of the essential information needed to anticipate changing market conditions and customer preferences, forecast future demand for products and services, and develop profitable business plans is locked in various transactional systems, spreadsheets and Web log files. Without the ability to deliver the right information to the right people at the right time, companies cannot stay competitive in today's fast changing economy.
According to Bill Inmon, "The framework provided by the corporate information factory is a blueprint for enterprise information. If you are looking to manage data across the enterprise then you will be drawn to the corporate information factory because there is no other framework. And the corporate information factory has proven its worth in the companies where it has been implemented."1
While Bill Inmon and Claudia Imhoff have defined the corporate information factory (CIF) as a data warehouse-centric framework, it is much more than the data warehouse (DW). The comprehensive nature of the CIF requires that many diverse technologies, such as data integration, security and meta data management, data archiving and near-line storage, data mining, and online analytical processing (OLAP), surround and support the data warehouse.
In fact, the marketplace has shown that just creating an enterprise data repository is not sufficient to support the information needs of today's business world. The META Group research of Global 3000 companies found fewer than 50 percent attributing any business or financial benefit to the DW.2 Instead, the DW spreads data across back-office systems and generically enables other applications focused on delivery of tangible business results.
Figure 1: The Corporate Information Factory
As can clearly be seen from the architecture of the CIF (Figure 1), a fully implemented enterprise-wide framework contains multiple sources of data and information supporting the decision-making process.3 Included in this framework are:
- Corporate Applications: The operational systems that a company uses to actually run the business, such as the ERPs (SAP, PeopleSoft) or other order entry and customer service-type applications. These are the systems that feed transactional data into the corporate information factory.
- Enterprise Data Warehouse: A repository of corporate information arranged for the purpose of data retrieval. The DW is designed to support the bulk extract of data necessary for decision-support activities such as data mining and analysis, modeling and reporting. The data warehouse is also the most common source for data cleansing activities, historical data and for feeding data marts.
- Departmental Data Marts: A departmental or subject-specific data repository. Here, the data is organized to support decision-support activities, but the scope of data in a data mart is much more limited than in a DW. The most common activities using this type of data mart are data mining, exploration and detailed analytics such as OLAP.
- Operational Data Store (ODS): A repository that contains the consolidated current data from around the enterprise, yet stores the data for rapid retrieval similar to transactional applications. The ODS is used as a place where online integrated processing can occur. In addition, the ODS can be used as the back-end source for e-commerce, call center and business activity monitoring applications.
- Decision Support System (DSS) Applications: The analytic applications built to maximize data value for specific business processes (e.g., consolidations, budgeting and planning, performance measurement and business modeling) to provide maximum strategic value to an organization.
OLAP in the Corporate Information Factory
Enterprises today are facing tough competitive markets, and their respective management cycles are constantly driving the need for information at every level. This cycle requires companies to:
- Report on information.
- Analyze information for patterns and trends.
- Model data using the analysis patterns and trends.
- Plan future activities based on the modeled trends.
The timeliness of reporting on business activities has evolved as technologies have improved and competitive pressures have grown over time. Where using monthly, weekly or daily information was once considered acceptable, today's businesses require near real-time information because the possible gains or losses of the previous hour can have a significant impact on a company's bottom line.
The ability to take current information and determine why certain trends or patterns are continuing gives a company a significant advantage. Indeed, historical data is used to identify trends. Current information is used to determine if those trends are occurring or changing. If a company extends this capability to permit the use of these patterns and trends to model future market and company behaviors, its ability to use its information is maximized, and the enterprise is able to move from making short-term tactical decisions to making significant strategic long-term plans.
Tools that allow users to become autonomous in a very short period of time and that require minimal training are the key to unlocking the information contained in an enterprise's storehouse of data. Users must be able to gather information quickly in a variety of different ways to satisfy diverse business requirements.
By deploying tools that minimize the need for end users to understand underlying data structures and minimize the risk of creating runaway queries, an enterprise can safely allow end users access to the information they need with a minimum involvement of expensive information technology resources. Because end users only need to understand how to ask for the information (the access methodologies are masked from their view), they can concentrate on what information they need, rather than where and how to access the information. In addition, because everyone is accessing the same data sources within an enterprise, everyone is using a common data set to make decisions.
One of the most powerful components of a CIF platform is its OLAP tool suite. Users equipped with the proper access methods and powerful OLAP tools will become self-sufficient and provide an enterprise's management team with all of the information required to make intelligent business decisions.
OLAP tools provide an important and complementary function to the other reporting and analytical tools found in the corporate information factory. Best-of-breed OLAP tools provide users the ability to align business dimensionality such as customer and product hierarchies into organized rollup structures that provide meaningful drill-down capability through a dimension as it interacts with other dimensions. In addition, as the integration between the OLAP and relational DSS environments becomes tightly coupled, detailed transaction-level information becomes seamlessly available. End users have the ability to perform detailed data analysis and retrieve transactional detail when anomalies in business operations occur. Users select analysis tools of choice and are no longer burdened with having to understand "where" the data resides. Users become self-sufficient and retrieve information on their own timetable.
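The rollup and drill-down behavior described above can be illustrated with a minimal sketch. The product hierarchy, region names and sales figures below are hypothetical and chosen only for illustration; a real OLAP tool would manage such hierarchies in its own metadata.

```python
from collections import defaultdict

# Hypothetical product hierarchy: each leaf product rolls up to a category.
PRODUCT_PARENT = {
    "laptop": "computers",
    "desktop": "computers",
    "inkjet": "printers",
}

# Hypothetical leaf-level facts: (product, region, sales).
FACTS = [
    ("laptop", "east", 120),
    ("laptop", "west", 80),
    ("desktop", "east", 50),
    ("inkjet", "east", 30),
]


def rollup(facts, parent):
    """Aggregate sales to (category, region) using the product hierarchy."""
    totals = defaultdict(int)
    for product, region, sales in facts:
        totals[(parent[product], region)] += sales
    return dict(totals)


def drill_down(facts, parent, category, region):
    """Return the product-level sales behind one (category, region) total."""
    detail = defaultdict(int)
    for product, fact_region, sales in facts:
        if parent[product] == category and fact_region == region:
            detail[product] += sales
    return dict(detail)


summary = rollup(FACTS, PRODUCT_PARENT)
east_computers = drill_down(FACTS, PRODUCT_PARENT, "computers", "east")
```

Here `summary` holds the category-level view a user would start from, and `east_computers` is the product-level breakdown reached by drilling into one cell of that view.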
MOLAP, ROLAP, HOLAP: It's All OLAP
The value that an OLAP component provides within the CIF architecture is clear. The decision in today's market is not whether to employ an OLAP solution, but rather which OLAP architecture to embrace and implement within the overarching CIF architecture. The two current OLAP architectures are MOLAP (multidimensional online analytical processing) and ROLAP (relational online analytical processing).
All OLAP solutions provide the same basic capability: the ability to view a metric by multiple dimensions. The goal of any OLAP solution is to allow the end user to access the available information looking for trends and exceptions. In almost all cases, once the "pearls" are uncovered, the user needs to look through the detail to determine the makeup of the data in an effort to answer the "why or what" questions that are causing the data anomaly. The main differences between MOLAP and ROLAP products lie with the data storage and access mechanisms.
Multidimensional OLAP products provide their own multidimensional database (MDB). All OLAP data is stored in the MDB, while the underlying detailed information is stored in a relational repository, such as a data mart, and can be accessed on demand.
Figure 2: The Management Cycle
Relational OLAP products sit directly on the relational data store and build multidimensional views of the relational data. Both architectures allow for the twisting and turning of the data in the OLAP "cube," but the MDB is designed to directly support the OLAP data access methods and will provide better performance for limited amounts of data than the ROLAP architecture.
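As a rough illustration of the ROLAP idea, the sketch below defines a multidimensional view directly over a relational table, using Python's built-in sqlite3 module. The table name, column names and rows are assumptions made for the example; no particular ROLAP product works exactly this way.

```python
import sqlite3


def build_rolap_view():
    """Create a hypothetical relational store and a dimensional view over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (product TEXT, state TEXT, quarter TEXT, cost REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [
            ("widget", "NY", "Q1", 10.0),
            ("gadget", "NY", "Q1", 5.0),
            ("widget", "CA", "Q1", 7.0),
            ("widget", "NY", "Q2", 3.0),
        ],
    )
    # The "cube" is only a view: aggregates are computed from the
    # relational rows each time the view is queried.
    conn.execute(
        """
        CREATE VIEW cost_by_state_quarter AS
        SELECT state, quarter, SUM(cost) AS total_cost
        FROM sales
        GROUP BY state, quarter
        """
    )
    return conn


conn = build_rolap_view()
rows = conn.execute(
    "SELECT state, quarter, total_cost FROM cost_by_state_quarter "
    "ORDER BY state, quarter"
).fetchall()
```

Because the view is evaluated against the relational rows on every query, no separate multidimensional store is needed, but every request competes for the same relational resources.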
This performance benefit is derived from the way the data is stored in an MDB. The purpose of OLAP is to quickly look at points of numerical data from multiple dimensions (e.g., cost by time, by state, by product; or counts of responses by cell, by outlet, by promotion). The data value (cost) is stored once in a "measures" dimension in the MDB. The other dimensions in the first example (time, state and product) are intersection access measures. The point in the cube, or matrix, where these dimensions intersect identifies a storage location for a data element. This matrix storage methodology greatly reduces the size of the OLAP cube by eliminating the need for redundant data storage within the cube.
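The matrix storage idea can be sketched as follows. This is a simplified, assumed layout (a dense flat array addressed by dimension coordinates), not the storage format of any specific MDB product; the `Cube` class and its member names are hypothetical.

```python
class Cube:
    """A tiny dense cube: one storage slot per intersection of dimension members."""

    def __init__(self, dimensions):
        # dimensions: mapping of dimension name -> ordered list of members.
        self.dims = dimensions
        self.index = {
            name: {member: i for i, member in enumerate(members)}
            for name, members in dimensions.items()
        }
        size = 1
        for members in dimensions.values():
            size *= len(members)
        # Only measure values are stored; dimension keys are never repeated
        # per cell because the position itself encodes the coordinates.
        self.cells = [None] * size

    def _offset(self, coords):
        """Convert one member per dimension into a position in the flat array."""
        offset = 0
        for name, members in self.dims.items():
            offset = offset * len(members) + self.index[name][coords[name]]
        return offset

    def set(self, value, **coords):
        self.cells[self._offset(coords)] = value

    def get(self, **coords):
        return self.cells[self._offset(coords)]


cube = Cube(
    {
        "time": ["Q1", "Q2"],
        "state": ["NY", "CA"],
        "product": ["widget", "gadget"],
    }
)
cube.set(10.0, time="Q1", state="NY", product="widget")
cube.set(4.5, time="Q2", state="CA", product="gadget")
```

A lookup such as `cube.get(time="Q2", state="CA", product="gadget")` computes the storage location from the coordinates, which is the sense in which the intersection of the dimensions identifies where the data element lives.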
Because the MOLAP approach utilizes this optimized multidimensional database, a MOLAP application may be developed, put into production and made available to the users very quickly without the need to have a dedicated relational schema from which the OLAP engine pulls its data. There is a price to be paid, however. Once you build the MOLAP structure, you cannot drill through to the detail. A side benefit of the MOLAP architecture is that it allows the data warehouse/data mart developers to create "rapid prototypes" for new data marts and/or enhancements to existing data warehouses. New data planned for the data mart or data warehouse can be quickly added to a new or existing MOLAP cube and released to the user community. The users' feedback as to the data quality and suitability for analysis can be gathered and used to validate or improve the data mart design. Once the data mart design is improved, the prototype is thrown away. The detailed data is placed in the data warehouse and then delivered to the data mart cube.
Occasionally it is useful to use MOLAP technology in a nonpermanent prototype mode. In this mode, structures are quickly built, results are obtained and the MOLAP structure is either discarded or the results are woven into a permanent data warehouse. It is improper to build a permanent MOLAP structure in the guise of an impermanent architecture.
The impermanent mode of MOLAP design (the prototype mode) allows for significantly reduced time to market for the end-user business applications when compared to the time necessary to develop and deploy a data mart and then design and develop the ROLAP multidimensional views. A typical MOLAP used in a prototype mode can be developed and deployed in a matter of weeks, taking data feeds directly from the source systems. These sources could be ERP, CRM, legacy or desktop applications. Compare this to the three to six or more months required to design and build a data mart, create the ETL processes to populate the relational store and complete the ROLAP implementation.
Figure 3 is a simple architecture diagram for typical temporary prototype MOLAP and ROLAP implementations.
Figure 3: Temporary Prototype MOLAP and ROLAP Architecture
In both the ROLAP and MOLAP architectures, the data warehouse is an integral part of the overall permanent solution. In the case of ROLAP, the data warehouse is deployed first, with the OLAP cubes being placed on top of the warehouse relational data tables. With MOLAP, the data warehouse can be deployed before or after the OLAP cubes. Temporary MOLAP cubes can be created before the data warehouse is created or after the data warehouse is deployed. When both the OLAP cubes are designed and the data warehouse is in place, then the MOLAP cubes can be populated directly from the data warehouse via an integration layer. This integration layer should also facilitate real-time drill through from the MOLAP cube to the underlying detail data in the data warehouse or data marts.
While ROLAP and MOLAP continue to come closer together, there still remain some basic differences in their approach. ROLAP solutions reside in a relational environment and create aggregated tables and multiple indices that often reside in the same relational spaces as the data marts and warehouse structures that serve as the sources for the ROLAP cubes. As the user community increases and the demands for additional data manipulation grow, the relational space suffers from resource contention.
In contrast, by using a MOLAP solution, the data is pre-aggregated in a separate database environment that replaces the relational aggregation tables of the ROLAP solution. Because the data is organized and indexed for near speed-of-thought retrieval, users spend the majority of their time accessing the MOLAP databases and return to the relational world only when investigation of anomalies requires access to the transaction-level detail constituting the aggregations.
Evolving OLAP into an Analytic Platform
The lines between OLAP and relational data storage and access are blurring with the deployment of the enterprise-wide business intelligence (BI) platform. This enterprise BI platform is a centralized, unified environment that the user community employs to request information from the analytic environment without regard for query technologies or data location (DW, DM, ODS, MOLAP, etc.).
The underlying technology employed by a BI platform is commonly referred to as HOLAP, or hybrid online analytical processing. HOLAP allows for the seamless retrieval of data from an OLAP cube and related information from a relational data store with a single OLAP query.
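A minimal sketch of the hybrid idea follows: a single query function answers from a pre-aggregated cube and, on request, attaches the related detail rows from a relational store. The `orders` table, the `holap_query` function and all data are hypothetical, and sqlite3 stands in for the relational repository.

```python
import sqlite3


def build_detail_store():
    """Hypothetical relational store holding transaction-level detail."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, state TEXT, cost REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "NY", 10.0), (2, "NY", 5.0), (3, "CA", 7.0)],
    )
    return conn


def build_cube(conn):
    """Pre-aggregate cost by state once, as a stand-in for a MOLAP cube."""
    rows = conn.execute(
        "SELECT state, SUM(cost) FROM orders GROUP BY state"
    ).fetchall()
    return dict(rows)


def holap_query(cube, conn, state, with_detail=False):
    """Answer from the cube; fetch relational detail only when asked."""
    result = {"state": state, "total_cost": cube[state]}
    if with_detail:
        result["detail"] = conn.execute(
            "SELECT order_id, cost FROM orders WHERE state = ? ORDER BY order_id",
            (state,),
        ).fetchall()
    return result


detail_store = build_detail_store()
state_cube = build_cube(detail_store)
answer = holap_query(state_cube, detail_store, "NY", with_detail=True)
```

The caller issues one request and does not need to know that the total came from the cube while the detail rows came from the relational store.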
When fully deployed, the BI platform is one of the more important access layers to information contained in the corporate information factory. The BI platform satisfies data requests via a common meta data repository that contains the necessary data mappings and required data presentation descriptors. Based on the presentation information in the meta data catalog, the data will be presented to the user in the most appropriate format: cross-tab report, OLAP access method (twist-and-turn), graphical, etc. Figure 4 shows a basic architecture for the CIF with a BI platform.
Figure 4: CIF Implementation with an Enterprise Business Intelligence Platform
In the future, access to all information in the CIF will be brokered by the BI platform. The end users will select the data required and decide on the most appropriate access tool for their particular needs: OLAP, Q&R, custom-developed application or standard query language. These tools will issue all requests for information to the data access layer of the BI platform. The BI platform will employ its meta data repository and security layer to identify the location of the data and presentation method to be used, and to validate the access rights the user has to the data. The data access layer will then issue the appropriate data requests to one or more data repositories in the CIF. The data access layer of the BI platform will then pass the resulting information back to the users.
In addition, the BI platform can implement a standard security layer (such as LDAP, Windows NT authentication or a third-party authentication server) for security purposes and to provide a central login to many disparate applications. The BI platform then ensures that users are only given access to data for which their security level is appropriate. Depending on the method of implementation, the BI platform could reject the data request outright or process the data request omitting the data elements that the user is not authorized to receive.
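The brokering flow described above can be sketched in a few lines. Everything here is an assumption for illustration: the catalog entries, the access table, the `plan_request` function and the `strict` flag are hypothetical and do not represent any vendor's API. The sketch shows both handling options mentioned in the text, rejecting the request outright or omitting the unauthorized data elements.

```python
# Hypothetical meta data catalog: data element -> location and presentation.
CATALOG = {
    "revenue": {"repository": "molap_cube", "presentation": "cross-tab"},
    "open_orders": {"repository": "ods", "presentation": "list"},
    "salary": {"repository": "data_warehouse", "presentation": "cross-tab"},
}

# Hypothetical security layer: user -> data elements the user may see.
ACCESS = {"analyst": {"revenue", "open_orders"}}


class AccessDenied(Exception):
    """Raised when a request is rejected outright."""


def plan_request(user, elements, strict=False):
    """Group the authorized elements of a request by the repository holding them.

    With strict=True, any unauthorized element rejects the whole request;
    otherwise unauthorized elements are silently omitted.
    """
    allowed = ACCESS.get(user, set())
    denied = [element for element in elements if element not in allowed]
    if denied and strict:
        raise AccessDenied(f"{user} may not access: {', '.join(denied)}")
    plan = {}
    for element in elements:
        if element in allowed:
            plan.setdefault(CATALOG[element]["repository"], []).append(element)
    return plan


request_plan = plan_request("analyst", ["revenue", "salary", "open_orders"])
```

In this sketch `request_plan` tells the data access layer which repositories to query and for which elements; the `presentation` entries in the catalog would then guide how the results are shown to the user.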
In summary, careful consideration should be given to ensure that any CIF implementation contains a strong BI platform capable of providing an integrated set of query and reporting tools as well as tools that enable analysis, modeling and planning. These powerful tools make end users self-sufficient and productive, empowering them to provide management with the data necessary to make good tactical decisions to implement their strategic goals and objectives.
More and more, companies are recognizing this need and are looking to deploy a business intelligence platform that can provide these near real-time data reporting capabilities with the analysis and modeling capabilities. With declining profit margins and stiff competition, companies are quickly moving toward BI platforms that contain integrated suites of tools that are flexible, extensible, provide relatively easy and consistent access methods and can span the enterprise's disparate data sources.
When selecting a BI platform, users should give careful consideration to the suite of tools contained within the platform. The tool suite will determine end-user capabilities, and a suite that is not a good fit for the enterprise will have far-reaching impact. OLAP is certainly one of the many magic keys that unlock the power of the corporate information factory and create useful information.