
DQ Point 4.3: Single Data Sources Increase DQ

Published
  • February 01 1998, 1:00am EST

This is the seventh in a series of discussions of quality guru W. Edwards Deming's Fourteen Points of Quality and their ramifications on data quality. Deming's fourth point of quality, "End the practice of awarding business on price tag alone," contradicts three commonly accepted IT practices that virtually guarantee non-quality data.

In previous columns I examined the first two of three counterproductive IT practices:

  1. Reward project development for on-time/within-budget delivery alone--without a real measure of the quality of the product delivered. This practice actually increases costs while increasing defects. (See November 1997 column.)
  2. Model and build databases based upon the application project requirements alone--without involvement of stakeholders outside of the project scope. This practice leads to fragmented data models that cannot be integrated and databases that cannot be shared, yet appear to be cheaper for the project. Experience has now disproved this: redundancy maintenance has increased complexity and costs, and data quality problems have increased. (See December 1997 column.)

This month I examine the third counterproductive practice:

  3. Capture data close to where it is convenient to a business area and application--without determining if this is the best source of data capture for the enterprise.

This practice stems from the combined paradigms of the industrial-age functional organization and the isolating "systems approach" of application development. These two paradigms have resulted in the notion that every business area and every application should have its own source (supplier) for the information it needs to operate. The presupposition is that we (in our department) cannot trust your data (data that you create in your department). So we create our own information needed to do our job (i.e., function). Because we cannot "control" the information you create in your department or it is in the wrong format or incomplete to meet our needs, we develop our own data sources and databases.

The result of this practice is incredible data redundancy. Worse yet is the fact that the data is created by one of two means, both of which create potentially significant data quality problems. The first option is to develop separate and uncoordinated applications that capture the same type of information, such as customer or product. The second option is to use interface programs to extract--and transform--data from "your" database into "my" database in the way I need to see it. For example, the interface of orders from the sales department may be transformed to go into the fulfillment system, because it does not "need" all information captured by the order system. Both of these options significantly increase the cost of capturing and maintaining data and the complexity and cost of maintaining applications.

Deming's fourth point of quality is the fundamental truth that purchasing or acquisition decisions based on price alone result in decreased quality and higher costs in the long run. If there is no measure of quality, business tends to select the lowest bidder with "low quality and high cost being the inevitable result."1 The quality solution is for the organization to develop a long-term relationship with a single source of supplied goods. This "partnership" provides the basis for consistency of materials resulting in higher quality goods and reduced costs.

Single-Source Data Suppliers

The same principles of quality hold with data acquisition. In business there should be a single source of data of a given type (customer, product, order, etc.). The natural data suppliers--data producers--are those who are responsible for the process at the point of data origination. In other words, data producers who are closest to the point where data becomes known within the enterprise should create it. Those closest to business events that create and update knowledge are the ones best able to capture quality data for the entire enterprise. The enterprise should leverage those data producers as the preferred supplier of choice.

Data should be captured in a single authoritative record-of-reference database in a way that it can meet the needs of all interested knowledge workers. If the enterprise is geographically dispersed and this is not economically feasible, the data may be created in local record-of-origin databases and replicated to an enterprise record-of-reference database to support non-local knowledge workers.
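The replication arrangement just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed design: the names RecordStore, upsert, replicate, chicago and enterprise are hypothetical, and the version number used to decide which copy is newer is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class RecordStore:
    """A hypothetical database holding one type of data (e.g., customer)."""
    name: str
    rows: dict = field(default_factory=dict)  # key -> (version, record)

    def upsert(self, key, record, version):
        """Store the record only if it is newer than what is already held."""
        current = self.rows.get(key)
        if current is None or version > current[0]:
            self.rows[key] = (version, dict(record))
            return True
        return False


def replicate(origin, reference):
    """One-way copy from a record-of-origin to the record-of-reference.

    Data is created only in the origin; the reference store is never
    written to by anyone else in this sketch.
    """
    applied = 0
    for key, (version, record) in origin.rows.items():
        if reference.upsert(key, record, version):
            applied += 1
    return applied


# The local office creates the customer once, at the point of origination.
chicago = RecordStore("chicago_customers")
enterprise = RecordStore("enterprise_customers")
chicago.upsert("C-100", {"name": "Acme Ltd", "city": "Chicago"}, version=1)

first_pass = replicate(chicago, enterprise)   # copies the new record
second_pass = replicate(chicago, enterprise)  # nothing new to copy
```

The point of the sketch is the direction of flow: non-local knowledge workers read the enterprise copy, but nobody re-creates the customer there.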

The enterprise is best served when it establishes a partnership between the knowledge workers and the single source of data. This requires a relationship of trust between knowledge worker and data producer. A trust relationship eliminates the need for every department or business unit to have to create and maintain their own sources of that data.

Unfortunately, most applications are built vertically. Many will create secondary sources of data with interfaces that transform data to fit their specific functional requirements. Others will "hire" their own data suppliers to take information from computer-generated reports and re-enter it into another database. In fact, as much as 70 percent of computer input comes from other computer output, according to Kathryn Alesandrini.2 When that output comes from other companies' computers, this redundant re-entry of data can be eliminated through EDI (electronic data interchange).

However, there is no valid business reason for an organization to have internal personnel re-enter data that is available electronically. The argument may be that they cannot trust the data. Which is precisely the point! Develop data quality in the originating data producers. Then downstream knowledge workers do not need to look for an alternative data supplier or source.

How to Develop Single-Source Data Suppliers

  • For a particular type of data, identify the originating process and data producers.
  • Identify the redundant processes, databases and data producers that are recreating it.
  • Analyze the reasons for the redundant creation and maintenance (inaccessibility, low data quality, incompleteness of attributes, inconsistent definition or domain values).
  • Knowledge worker management should contract with the process owner that produces the data for the level of quality that satisfies their needs. This eliminates the need--and cost--to find a new "supplier" of the data.
  • Based on the root cause, seek to eliminate the problem by:
    • Controlled replication (inaccessibility),
    • Negotiated quality standards and possibly reallocated resources to the originating processes (low data quality),
    • Redefined data models and databases (incomplete attributes),
    • Redefined data definition of attributes (inconsistent data definition), and
    • Planned migration to an authoritative, single source of data (with improved quality).
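
The steps above can be sketched as a small planning routine. This is a minimal illustration, not a definitive method: the function plan_single_source, the dictionary keys and the source names (order_entry, billing_copy, marketing_list) are all hypothetical, and the root causes are assumed to have been determined already by the analysis step.

```python
# Root cause -> remedy, taken from the list above. The dictionary keys are
# labels chosen for this sketch.
REMEDIES = {
    "inaccessibility": "controlled replication",
    "low data quality": "negotiated quality standards for the originating process",
    "incomplete attributes": "redefined data models and databases",
    "inconsistent definition": "redefined data definition of attributes",
}


def plan_single_source(sources):
    """Return (origin_name, plan) for one type of data.

    sources: list of dicts with 'name', 'is_origin' and, for redundant
    sources, 'root_cause' (a key of REMEDIES).
    """
    origins = [s["name"] for s in sources if s["is_origin"]]
    if len(origins) != 1:
        raise ValueError("expected exactly one originating source")

    plan = []
    for s in sources:
        if s["is_origin"]:
            continue
        # Each redundant source gets the remedy for its root cause,
        # followed by a planned migration to the single source.
        remedy = REMEDIES[s["root_cause"]]
        plan.append((s["name"], remedy, "migrate to " + origins[0]))
    return origins[0], plan


# Hypothetical inventory of customer data sources.
customer_sources = [
    {"name": "order_entry", "is_origin": True},
    {"name": "billing_copy", "is_origin": False,
     "root_cause": "incomplete attributes"},
    {"name": "marketing_list", "is_origin": False,
     "root_cause": "low data quality"},
]
origin, plan = plan_single_source(customer_sources)
```

The routine refuses to proceed unless exactly one originating source has been identified, which mirrors the first step: the remedies only make sense once the natural data producer is known.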

Benefits of Single-Source Data Suppliers

  • Decreased costs of data capture and maintenance (eliminate redundancy and complexity),
  • Increased value of the data resource (create once with quality, use many times),
  • More timely information (once captured, all knowledge workers have access),
  • Increased confidence in the data resource, and
  • Increased ability to exploit new opportunities (by minimizing resources to maintain data and applications redundantly, they can be deployed on new opportunity applications and data).

1. W. Edwards Deming, Out of the Crisis, MIT Center for Advanced Engineering Study, Cambridge, p. 32.

2. Kathryn Alesandrini, Survive Information Overload, Business One Irwin, Homewood, IL,
