The goal of this column is to provide business and IT enterprise managers with a guideline for evaluating BI products that support certain functions. This column also helps an organization identify its specific BI application needs.
Software vendors use expressions such as "benchmark comparisons," "out-of-the-box functionality" and "limited-time discounts" to lure distracted IT and business managers into making impulsive decisions. These unsuccessful BI implementations waste time, money and opportunity. But with careful evaluation of business needs and the use of proven methodologies, decision-makers can successfully evaluate BI products. It's also important to undertake the project with realistic expectations.
Deciding on a BI Solution
Before evaluating BI products, it's important to understand your needs. A single product likely won’t meet all your requirements. Most BI solutions integrate multiple products, and each product addresses one or more components:
Performance scorecard: Performance scorecards track business progress through key metrics. They use financial data, customer information, internal business processes and iterative growth to translate strategy into action. The scorecards are sometimes based on methodologies such as balanced scorecards.
Key performance indicators: KPls are performance indicators of the organization as a whole; they show whether the enterprise is meeting its stated objectives. Different KPIs can be established for different divisions of an organization.
Querying and reporting: Because this is the ability to access information and present it in an easily understandable way, this is one of the most important functions and should be evaluated carefully. Functions include ad hoc queries across multiple data sources, focused business reports, batch-type production reporting, multiple report formats and multiple queries per report. Reporting solutions should draw on the full range of data assets and data sources to deliver a single comprehensive result. Therefore, they should integrate easily into the existing IT infrastructure and provide different levels of sophistication.
Enterprise portals: Enterprise portals are websites that act as gateways to corporate information, enterprise applications, organizational processes and business resource. They are an easy way to centralize and share information across the organization. Enterprise portals let users personalize their view and content, perform efficient searches, tag "best bets," profile documents for better discoverability and receive notifications of changes to their subscribed documents, folders, categories or search queries.
Relational database management systems: RDBMSs store data in the form of interrelated tables. They are designed to easily determine the relationships among data and various elements and provide different ways to write and read data. For instance, a customer buys many products, and a product is purchased by many customers. This is a many-to-many relationship between the customers and products.
Data warehouses: A data warehouse is a collection of subject-oriented databases that support business decisions. Because they organize and store data specifically for querying, reporting and analysis, the design of a data warehouse database differs significantly from the design of an OLTP database. A data warehouse is a “data out” database, where queries and reports are written, such as which routes of an airline are booked over 80 percent capacity. To answer questions like this, a data warehouse application retrieves data from the database. On the other hand, an OLTP system creates data, such as an airline reservation system. It is a “data in” database and thus should be designed differently from a “data out” data warehouse database.
Metadata management: Unlike ordinary data stores, a metadata repository is not designed to store business data for a business application. Rather, it stores contextual information about the business data, or metadata. This difference is crucial and often overlooked. Examples of the contextual information found in a metadata repository include the business data's meaning and contents, governance policies, technical attributes and programs used to manipulate it.
Extract, transform and load: Source data for BI applications comes from a variety of platforms, managed by a variety of operating systems and applications. The purpose of the ETL process and application is to merge data from these heterogeneous platforms into a standard format for BI target databases. A BI system should not be a collection of standalone target databases, each with its own ETL process. It should instead be an integrated data store. For this reason, it's critical to perform common data transformations for all BI target databases only once, and then reconcile that data.
Data cleansing: This is the process of reformatting, reconciling and refining data during ETL. Data cleansing activities include correcting or identifying misfielded names and addresses and occurrences in a field (including blanks and zeros), as well as recognizing shapes of data in a field (such as a 10-digit number XXX-XXX-XXXX for a U.S. phone number and distribution of business addresses versus residential addresses). These activities may be either simple or complex.
Simple data cleansing, or scrubbing, provides basic cleansing of invalid data. This process may involve case correction, standardization, splitting a source data field and removing spaces or special characters. For example, simple cleansing would convert "Mr. Rob Ball Layton" to "Mister," "Rob,” "Ball" and "Layton" and place these fields into four different columns.
Complex data cleansing performs deduplication, data reformatting and data cleansing. This may involve phonetic and nonphonetic fuzzy-matching, as well as user-defined cleansing rules. For example, complex cleansing might determine that "Mr. Rob Ball Layton" is the same as "Mister," "Robert,” "Bell" and "Leighton," and then convert an ETL insert into a target update.









