I have lost count of the number of times I have heard a manager say, "We've got to get a handle on our data quality." It seems as though organizations everywhere are concerned about reducing or eliminating bad data in their systems. But what does bad data really mean? To a large extent, it means that management doesn't trust their reports and, as a result, also does not trust their applications. They don't know for sure which systems or reports are wrong, but they know something isn't right because data values do not match from system to system and report to report. In many ways, it is a problem of perceived or suspected poor data quality as much as actual poor data quality.

Managers (and others in the organization) do not trust the data because it is inconsistent. Often the organization does not have standard customer names across systems or departments. In other cases, the bad data could be mismatched product descriptions between the marketing, production or other departments.

The Business Data Layers Model

This article introduces a straightforward methodology for improving data quality by managing your business data layers. These techniques focus on defining and managing the fundamental subject areas or classes of information that comprise all business data. The levels of information start at the most detailed and become more and more aggregated as the information moves up between layers. Creation, maintenance and control of the data from the bottom up is the solution to getting a handle on data quality.

Data stewardship initiatives can sometimes improve the quality of data at higher levels through more rigorous data entry procedures; however, without adequate management at the lowest levels, data will get duplicated and become out of sync. Even the most rigorous update procedures eventually result in data discrepancies between multiple copies of data. The result is inconsistent data and poor data quality.

Figure 1 illustrates the different business data layers for a generic organization and includes examples for each data layer from a range of industries. This model should be tailored to identify the specific objects for your organization.

Figure 1: Business Data Layers

Foundation Data Layer

At the bottom of the hierarchy are the foundation data objects. Information in the foundation layer is the basis for all business applications in your organization. It is essential to identify these classes of objects and to have corporate-wide systems to manage them. There are typically fewer than 10 fundamental classes of information common to all businesses. Occurrences of these objects must be carefully maintained based on the assignment of unique identifiers and on strict control of the update procedures. This concept is actually quite simple and extremely powerful for managing corporate data. Identify and manage these fundamental objects for all systems across your organization, and your data quality will improve. Data stewardship should focus on these objects first.

Who is your customer? It seems like a simple question, but it is surprising how many organizations do not have a good definition of customer or a good way to uniquely identify them. They have customer files or databases, but they do not have a consistent set of business principles for managing the customer base across the enterprise. The business principles must be defined first, followed by a solid identifier (it is better to create your own identifier than to use a key or code assigned by an outside organization). A computer system can be developed to consistently implement the principles and assign the identifiers. Examples of business principles include:

  • The definition of customer.
  • The specification of all roles another organization may have with your organization. Strive to maintain only one identifier for an organization and not multiple identifiers based on role.
  • The assignment of an official customer name and valid name variations.
  • The selection of a unique identifier for each customer occurrence.
  • Purification processes to ensure uniqueness.
  • The identification of different customer addresses.
  • The definition of corporate locations, levels and hierarchies.
  • The definition of customer classes and groupings.
  • Rules for handling customer name changes.
  • Rules for handling mergers and acquisitions.
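
Several of these principles can be sketched in a few lines of code. The following Python example is only an illustration under stated assumptions: the `CustomerRegistry` class, the `normalize` matching rule and the sample names are all hypothetical, and a real purification process would need far more careful matching than simple case and punctuation folding. It shows an internally assigned identifier, an official name with registered variations, and one identifier per organization regardless of role.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


def normalize(name: str) -> str:
    """Fold case, punctuation and extra whitespace so simple name variations match."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


@dataclass
class Customer:
    customer_id: int                 # assigned internally, not by an outside organization
    official_name: str
    name_variations: set[str] = field(default_factory=set)
    roles: set[str] = field(default_factory=set)   # e.g. "customer", "supplier"


class CustomerRegistry:
    """Hypothetical corporate-wide registry that owns customer identifiers."""

    def __init__(self) -> None:
        self._next_id = count(1)
        self._by_id: dict[int, Customer] = {}
        self._by_name: dict[str, int] = {}   # normalized name -> customer_id

    def register(self, official_name: str, role: str = "customer") -> Customer:
        key = normalize(official_name)
        if key in self._by_name:
            # Known organization: record the new role, do not issue a second identifier.
            existing = self._by_id[self._by_name[key]]
            existing.roles.add(role)
            return existing
        customer = Customer(next(self._next_id), official_name, roles={role})
        self._by_id[customer.customer_id] = customer
        self._by_name[key] = customer.customer_id
        return customer

    def add_variation(self, customer_id: int, variation: str) -> None:
        key = normalize(variation)
        owner = self._by_name.setdefault(key, customer_id)
        if owner != customer_id:
            raise ValueError(f"{variation!r} already belongs to customer {owner}")
        self._by_id[customer_id].name_variations.add(variation)

    def lookup(self, name: str) -> Customer | None:
        customer_id = self._by_name.get(normalize(name))
        return self._by_id.get(customer_id) if customer_id is not None else None


registry = CustomerRegistry()
acme = registry.register("Acme Corp.")
registry.add_variation(acme.customer_id, "ACME Corporation")
same = registry.register("ACME CORP", role="supplier")   # resolves to the same record
```

Because every system would ask the registry for the identifier instead of keeping its own customer file, a second role or a new spelling is added to the existing record and does not create a duplicate.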

Product/service is a universal information building block for virtually all organizations, yet it is rarely managed as a corporate data resource. I frequently see different representations of products managed by separate departments within an enterprise. The result is low customer satisfaction due to potential conflicts in product specifications for the original sales order, the installed products or the billing. Like customer, a consistent set of business principles should be defined for products and services. A common practice is to roll up individual products into different groups or hierarchies based on departmental needs. Naturally, a comparison of sales or production values based on different groupings can render inconsistent results. Make a special effort to understand and coordinate multiple product categories, groupings or hierarchies to provide a consistent enterprise view of product.

In order to conduct business, an organization must have a location. That statement sounds obvious taken on its own; however, many companies have never created systems to manage the various facilities or locations at which they operate. At best, there are various flavors of these systems for different departments. Define your organizational structure, including any areas and divisions, and assign unique identifiers to each physical location or facility. Location hierarchies are important as well. Typical location hierarchies include sales hierarchies that associate sales personnel with corporate regions or sales territories as well as customers. As with the other foundation objects, develop a complete set of business rules for managing your organization's structure and implement these rules consistently across departments and systems.

Employees are not always adequately represented outside of the payroll application, or they are associated with specific transactions only and managed by separate departmental applications. Another common practice is to have separate files by employee role, such as order entry clerk, contract specialist and sales representative. If employees do not perform multiple job functions, this method can work, but as soon as employees take on multiple roles, individuals end up getting strangled in multiple files, in multiple groups, with multiple name variations and keys. That is no way to treat a foundation data object.

Equipment is a broad classification that includes machinery, vehicles, plant facilities and other physical assets. Most companies that rely on equipment have systems to track the purchase, repair and depreciation of the associated assets. However, these applications are not necessarily integrated with the other applications that describe where the equipment is repaired or what products are produced for customers.

Transactional Data Layer

The next layer is the transactional data objects layer. Transactional data objects build on the foundation data objects and track essential business actions, activities or processes (see Figure 1 for examples). Transaction data objects vary based on the type of business, but they are essentially the main functions performed by the business. If the organization is a retailer, the transactions are sales. If the organization is a freight carrier, the transactions involve carrying shipments. If the organization is a health insurer, the transactions include enrolling health plan members and making claims payments. For banks, transaction objects include account deposits and loan payments.

Management of transactional data is the driving force behind the development or purchase of most business applications. A primary characteristic of transaction systems is that they make relationships between the underlying foundation data objects (the sale of products to customers by employees at store locations). It is critical to define and manage the transaction data objects relative to a consistent base of foundation data objects. Take a moment and think about the primary transactions for your organization.
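
As a rough illustration of a transaction defined relative to foundation data, the Python sketch below models the retail sale mentioned above. All names and sample values (`Sale`, `validate`, the lookup tables and their contents) are assumptions made up for this example. The point is that the transaction carries only identifiers of foundation objects and is checked against the shared tables, so it never holds its own private copy of a customer or product.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Hypothetical foundation data: identifier -> official name, one shared table per class.
CUSTOMERS = {1001: "Acme Corp."}
PRODUCTS = {"P-200": "Widget, large"}
EMPLOYEES = {77: "J. Smith"}
LOCATIONS = {"ST-04": "Store 4"}


@dataclass(frozen=True)
class Sale:
    """A transaction relates foundation objects by identifier only."""
    sale_date: date
    customer_id: int
    product_id: str
    employee_id: int
    location_id: str
    quantity: int
    amount: Decimal


def validate(sale: Sale) -> list[str]:
    """Return a message for each foundation reference that does not resolve."""
    checks = [
        ("customer", sale.customer_id, CUSTOMERS),
        ("product", sale.product_id, PRODUCTS),
        ("employee", sale.employee_id, EMPLOYEES),
        ("location", sale.location_id, LOCATIONS),
    ]
    return [f"unknown {kind}: {key}" for kind, key, table in checks if key not in table]


good = Sale(date(2024, 1, 15), 1001, "P-200", 77, "ST-04", 3, Decimal("59.85"))
bad = Sale(date(2024, 1, 15), 9999, "P-200", 77, "ST-99", 1, Decimal("19.95"))
print(validate(good))  # []
print(validate(bad))   # ['unknown customer: 9999', 'unknown location: ST-99']
```

In a database the same idea would normally be enforced with foreign keys; the in-memory tables here only stand in for whatever corporate-wide systems manage the foundation objects.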

Operational Reporting Layer

The third data layer is the operational reporting layer. This layer is used to provide information for managing the business on a daily basis. Operational reporting is usually driven by the transactional systems, and reporting is based on the associated transaction data objects, such as a daily sales report or a production report.

Bad data is usually first observed in the operational reporting layer, often by managers who need to make comparisons between different transactional systems to accomplish their daily job duties. The following simplified example illustrates my point. The production manager is planning his weekly production schedule and looks at the sales report to see what products have been sold and to which customers. The review of this report sounds simple enough, except the sales system and the production system have been developed independently and have different representations of product and customer. In other words, they look at a slightly different slice of the foundation data objects. The production manager needs to call the sales manager and ask what exactly has been sold in order to determine what to produce. These kinds of situations take time and cost money to resolve.
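
The mismatch in this example can be made visible with a simple comparison of the two systems' product lists. The Python sketch below is a hypothetical illustration: the `reconcile` function, the product keys and the descriptions are invented, and it assumes each system can export its products as a mapping from key to description.

```python
def reconcile(
    sales_products: dict[str, str], production_products: dict[str, str]
) -> dict[str, list[str]]:
    """Compare two systems' product lists, each mapping a product key to a description."""
    sales_keys, production_keys = set(sales_products), set(production_products)
    shared = sales_keys & production_keys
    return {
        "only_in_sales": sorted(sales_keys - production_keys),
        "only_in_production": sorted(production_keys - sales_keys),
        "description_mismatch": sorted(
            key for key in shared if sales_products[key] != production_products[key]
        ),
    }


sales = {"P-200": "Widget, large", "P-300": "Gadget", "W-LG": "Large widget"}
production = {"P-200": "Widget (L)", "P-300": "Gadget", "P-400": "Sprocket"}
report = reconcile(sales, production)
print(report["only_in_sales"])         # ['W-LG']
print(report["only_in_production"])    # ['P-400']
print(report["description_mismatch"])  # ['P-200']
```

A report like this only exposes the discrepancies. Deciding whether "W-LG" and "P-200" are the same product is still a business decision, which is why the foundation objects need a single managed source.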

Financial Management Layer

The fourth business data layer is the financial management layer where corporate accounting prepares the books and provides revenue and expense reports to management. The aggregation of data continues in this layer, where individual transactions are accumulated into the general ledger and rolled up into broad revenue and expense categories. The financial management layer usually involves complex systems and accounting rules that interpret the myriad business transactions in order to prepare the monthly and yearly statements. The financial management layer is dependent on the foundation data layer and transactional data layer to supply consistent information about business transactions. Bad data quality caused by poor management of the lower layers only gets worse in the financial reports.

Executive Information Layer

The fifth and highest layer is the executive information and reporting layer. This layer is where high-level corporate officers require information to make strategic decisions. Reports typically come from the financial management layer; however, the operational reporting layer may also provide reporting on sales volumes, production volumes, work order statistics, etc. The relative trust or lack of trust in data is often determined at the executive information layer. As executives ask for successive levels of detail from the underlying systems, they are often dismayed to see contradictory references and values. They want to know which numbers are correct. Unfortunately, none of the numbers may actually be correct.

We have all heard the term stovepipes, or silos, to describe computer systems. These systems have usually been developed independently based on separate requirements, projects and budgets, but they typically access and manage similar data. The result is inconsistent data and poor data quality. The term stovepipe is usually used when someone needs to explain to management why the customer list or sales hierarchy is different between finance and marketing. A typical comment from management is, "It's not surprising that I can't match up my customers with the marketing department because they developed their own stovepipe application."

Every single organization I have seen has these so-called stovepipe applications. They mostly result from computer systems being developed to support individual processes, business lines or departments.

Now, why does this independent development naturally lead to inconsistent data? It is because systems are developed to automate business processes, and the related information exists primarily as transaction-level data. Organizations need systems to track sales, make claims payments or manage work orders. The departments responsible for the business processes sponsor and fund the development of systems they need to manage their business. Because these systems are funded and developed at the transaction level, it is natural that the stovepipes are created relative to the lower-level foundation objects. Each of the transaction systems references only its portion of the requirements for the foundation data objects. Each somewhat different view can have different business rules, identifiers, data stewards and databases. It's no wonder that subsequent cross-system reporting leads to inconsistent results.

Establish project management and budgeting procedures that will fund the development of systems to manage your foundation data objects company-wide. The consistent definition and identification of the foundation objects are the basis of data quality. If you have multiple departments and multiple projects that will develop systems to manage transaction data, you must determine the overall requirements that each of these systems has for the foundation data objects. Then, you must develop systems to manage the foundation objects based on the shared requirements.

The transaction systems must be developed using the same base of data for each foundation object. Prohibit individual departments from developing their own proprietary applications for managing the base objects. Make a special effort to avoid the trap of duplicate and conflicting data entry; the cost savings from eliminating duplicate entry alone can be significant.

The specific strategy for developing systems to manage the foundation objects will vary depending on your organization. Take the time to identify your foundation objects, and then conduct a review of your current systems to make an assessment of which ones have the potential to manage these objects. You should expect to find that some of the systems could be suitable given some level of enhancements. Allocate the necessary budget dollars and make the changes. Then, use these systems as a basis for all of your transaction processing. Remember, managing your data from the bottom up is the key to data quality.
