More than half of all business intelligence projects fail to meet enterprise goals. As many software vendors fail to deliver on their promises, decision-makers at all levels are becoming more skeptical about their purchasing decisions.

The goal of this column is to provide business and IT enterprise managers with a guideline for evaluating BI products that support certain functions. This column also helps an organization identify its specific BI application needs.

Software vendors use expressions such as "benchmark comparisons," "out-of-the-box functionality" and "limited-time discounts" to lure distracted IT and business managers into making impulsive decisions. These unsuccessful BI implementations waste time, money and opportunity. But with careful evaluation of business needs and the use of proven methodologies, decision-makers can successfully evaluate BI products. It's also important to undertake the project with realistic expectations.

Deciding on a BI Solution

Before evaluating BI products, it's important to understand your needs. A single product likely won’t meet all your requirements. Most BI solutions integrate multiple products, and each product addresses one or more components:

Performance scorecard: Performance scorecards track business progress through key metrics. They use financial data, customer information, internal business processes and iterative growth to translate strategy into action.  The scorecards are sometimes based on methodologies such as balanced scorecards.

Key performance indicators:  KPls are performance indicators of the organization as a whole; they show whether the enterprise is meeting its stated objectives.  Different KPIs can be established for different divisions of an organization.

Querying and reporting: Because this is the ability to access information and present it in an easily understandable way, this is one of the most important functions and should be evaluated carefully. Functions include ad hoc queries across multiple data sources, focused business reports, batch-type production reporting, multiple report formats and multiple queries per report. Reporting solutions should draw on the full range of data assets and data sources to deliver a single comprehensive result. Therefore, they should integrate easily into the existing IT infrastructure and provide different levels of sophistication.

Enterprise portals: Enterprise portals are websites that act as gateways to corporate information, enterprise applications, organizational processes and business resource. They are an easy way to centralize and share information across the organization. Enterprise portals let users personalize their view and content, perform efficient searches, tag "best bets," profile documents for better discoverability and receive notifications of changes to their subscribed documents, folders, categories or search queries.

Relational database management systems: RDBMSs store data in the form of interrelated tables. They are designed to easily determine the relationships among data and various elements and provide different ways to write and read data. For instance, a customer buys many products, and a product is purchased by many customers. This is a many-to-many relationship between the customers and products.

Data warehouses: A data warehouse is a collection of subject-oriented databases that support business decisions. Because they organize and store data specifically for querying, reporting and analysis, the design of a data warehouse database differs significantly from the design of an OLTP database. A data warehouse is a “data out” database, where queries and reports are written, such as which routes of an airline are booked over 80 percent capacity. To answer questions like this, a data warehouse application retrieves data from the database. On the other hand, an OLTP system creates data, such as an airline reservation system. It is a “data in” database and thus should be designed differently from a “data out” data warehouse database.

Metadata management: Unlike ordinary data stores, a metadata repository is not designed to store business data for a business application. Rather, it stores contextual information about the business data, or metadata. This difference is crucial and often overlooked. Examples of the contextual information found in a metadata repository include the business data's meaning and contents, governance policies, technical attributes and programs used to manipulate it.

Extract, transform and load: Source data for BI applications comes from a variety of platforms, managed by a variety of operating systems and applications. The purpose of the ETL process and application is to merge data from these heterogeneous platforms into a standard format for BI target databases. A BI system should not be a collection of standalone target databases, each with its own ETL process. It should instead be an integrated data store. For this reason, it's critical to perform common data transformations for all BI target databases only once, and then reconcile that data.

Data cleansing: This is the process of reformatting, reconciling and refining data during ETL. Data cleansing activities include correcting or identifying misfielded names and addresses and occurrences in a field (including blanks and zeros), as well as recognizing shapes of data in a field (such as a 10-digit number XXX-XXX-XXXX for a U.S. phone number and distribution of business addresses versus residential addresses). These activities may be either simple or complex.

Simple data cleansing, or scrubbing, provides basic cleansing of invalid data. This process may involve case correction, standardization, splitting a source data field and removing spaces or special characters. For example, simple cleansing would convert "Mr. Rob Ball Layton" to "Mister," "Rob,” "Ball" and "Layton" and place these fields into four different columns.

Complex data cleansing performs deduplication, data reformatting and data cleansing. This may involve phonetic and nonphonetic fuzzy-matching, as well as user-defined cleansing rules. For example, complex cleansing might determine that "Mr. Rob Ball Layton" is the same as "Mister," "Robert,” "Bell" and "Leighton," and then convert an ETL insert into a target update.

Real-time replication: This is the ability to propagate source data changes to the target system in real time. Real-time replication systems normally use database transaction logs, triggers or time-stamp columns to capture incremental changes in source data and then apply these changes to the target system with minimal latency.

Analytics and OLAP: OLAP systems handle queries required to discover trends based on vast volumes of information. Data is organized into multidimensional cubes, providing better performance than relational tables. The basic unit of a multidimensional cube is called a "measure" and is defined as the unit of data that is being analyzed. Each dimension is divided into units called "members," which are typically organized into a hierarchy. The dimensions and measures that define the cubes in any OLAP system depend on the types of analysis that are important to the enterprise.

Mobile applications: These connect mobile devices to existing enterprise resource planning, customer relationship management and back-end systems. These applications provide mobile workers access to vital corporate systems and information. This creates efficient operational decision-making, increased responsiveness to customers and a decisive competitive advantage.

The selected solution should help the organization develop future applications with lower labor costs (i.e., computer cycles are getting cheaper, while the cost of people is rising). The solution should also connect information across the enterprise; that information represents customers, money, products, policies, employees and partners.

The Selection Process

Typically, selecting one software package first involves evaluating others. For example, selecting an ETL solution also involves evaluating a data cleansing tool and RDBMS interface software. As a result, the selection process for BI products is more involved than many managers expect.

The product selected, especially for querying and reporting, should have the following capabilities:

  • It should be able to access multiple/disparate data sources.
  • It should support open source and should have an open architecture. Open source also facilitates easy trial and evaluation of the software, since you can simply download the product and try it out yourself.
  • The user interface should be highly visual and interactive.
  • The user interface should also support collaborative architecture and should support multiple delivery formats.
  • For growth, it should have predictable and efficient scalability.
  • It should support open APIs and should have 100 percent reliability.

Products satisfying these features will assist you in implementing a successful BI environment today and far into the future.
This is the first in a series of articles by Shaku Atre. Click on the titles to read the other recent articles: "Who in the World Uses Only Words and Numbers in Reports?"; "Who in the World Wants to Stay Locked Up?"; "Who in the World Doesn't Want to Reach for the Clouds?"; "Who in the World Wouldn’t Want a Collaborative BI Architecture?"; "Who in the World Wants More Data?"; "Who in the World Needs a Data Warehouse?"; "Who in the World Needs a Hard Drive?"; and "Who in the World Wants to Just Be Structured?"

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access