
Data Quality Assessment for Data Warehouse Design

  • April 20 2000, 1:00am EDT

This month we continue where we left off with source system assessment to delve into a parallel or companion activity – data quality assessment. When undertaking data quality assessment, we must define what it is we hope to accomplish. A crucial first step is laying out a set of ground rules to follow in establishing our framework to deliver value (in this case timely, accurate and consistent data) to our family of business users. Our assessment process must, therefore, focus on these key principles and set quality thresholds to gauge how close our source systems are to providing acceptable data – both in content and timeliness – to the warehouse.

When undertaking a data quality assessment, the end result we need to achieve is to establish an ongoing data quality management process. Therefore, not only do we need an assessment of our data sources, but we also need to establish what will be required to translate this activity into an ongoing business process. Some representative activities include establishing:

  1. The role and responsibility of a data quality administrator.
    a) The administrator is tasked with liaising with the business and technology communities to implement the required recommendations. This role can form part of the responsibilities of a corporate data manager or data administrator.
    b) This person or group is responsible for maintaining the oversight committee meeting agenda and providing all required information for the committee's review and approval.

  2. The role and responsibility of each subject area champion.
    a) Responsible for attending each oversight committee review meeting, recommending action on tabled agenda items and approving changes to existing corporate, conceptual, schema-based meta data.
    b) Responsible for coordinating and completing periodic data quality review audits of his/her subject area and for being involved in any data quality management process from a rehabilitation or data warehousing implementation perspective.

  3. The roles and responsibilities of a data oversight committee, which directs all the identified subject area champions and is administered by the data quality administrator.
    a) The data oversight committee schedules meetings on a regular basis and is responsible for reviewing and approving cross-line-of-business meta data or critical meta data identified by the data quality administrator.
    b) The resulting action of this committee is to certify (approve) a definition structure for implementation within the business and technology communities.

  4. The data owner or knowledge worker is responsible for the day-to-day collection and management of current and new data on behalf of the organization. These people are generally only measured and rewarded along departmental or line-of-business boundaries and do not usually maintain cross-line-of-business responsibilities.

These personnel must be educated and made aware of any changes to data collection or management activities that affect the business processes they perform and the information systems they access.

In undertaking the data quality assessment process, we need to establish a data baseline that spells out what will and won’t be acceptable as minimal data value. This value is determined in terms of the domain of each attribute or field, its business meaning, related or associated definitions or interpretations (homonyms and synonyms) and what corrective action will be required to clean up the data and put it into a stable state. In addition, we will have to determine what type of business data stewardship process will be required as a deliverable of our assessment process in terms of data cleanup. A business data steward or user group assumes ownership of the ongoing data quality auditing and maintenance process (remember, in data warehousing we are going to be doing data scrubbing every day!).
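
As a rough illustration of such a baseline, the sketch below checks each attribute's values against an allowed domain and compares the share of valid values with a minimum acceptable threshold. Everything in it is an assumption made for illustration: the `AttributeRule` structure, the `assess` function, the sample attributes and the threshold values are hypothetical and are not part of any particular tool or of the process described in this column.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Mapping


@dataclass(frozen=True)
class AttributeRule:
    """Hypothetical baseline rule for one attribute (field)."""
    name: str                         # attribute name in the source record
    in_domain: Callable[[Any], bool]  # True when a value lies in the agreed domain
    threshold: float                  # minimum acceptable share of valid values


def assess(rows: Iterable[Mapping[str, Any]],
           rules: List[AttributeRule]) -> Dict[str, Dict[str, Any]]:
    """Return, per attribute, the valid-value rate and whether it meets the baseline."""
    rows = list(rows)
    report: Dict[str, Dict[str, Any]] = {}
    for rule in rules:
        # A missing attribute is passed to the domain check as None.
        valid = sum(1 for row in rows if rule.in_domain(row.get(rule.name)))
        rate = valid / len(rows) if rows else 0.0
        report[rule.name] = {
            "valid_rate": rate,
            "meets_baseline": rate >= rule.threshold,
        }
    return report


# Illustrative domains and thresholds only; real values come from the business.
rules = [
    AttributeRule("state", lambda v: v in {"NY", "CA", "TX"}, 0.75),
    AttributeRule("age", lambda v: isinstance(v, int) and 0 <= v <= 120, 0.9),
]
sample = [
    {"state": "NY", "age": 34},
    {"state": "ca", "age": 51},   # lowercase code falls outside the domain
    {"state": "TX", "age": -1},   # negative age falls outside the domain
    {"state": "CA", "age": 28},
]
report = assess(sample, rules)
for name, result in report.items():
    print(name, result)
```

Attributes that fall below their threshold are the candidates for the corrective action and stewardship decisions discussed above.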

The data quality process (a key output of our assessment) follows a logical flow in support of the data warehousing project(s). From a strategic standpoint, this process entails:

  1. Defining and staffing a data quality core team.
    a) Staffing the core team and defining roles and responsibilities.
    b) Defining the core team technology environment.
    c) Educating and training the core team.

  2. Business intelligence systems business case verification and prioritization.
    a) Subject area definition and business process dependency definition.
    b) Subject area to information system database(s) mapping.

  3. Data quality standards and procedures definition (initial).
    a) Data quality mandate (program charter).
    b) Defining data quality procedures, milestones and deliverables.
    c) Defining data quality standards and enabling technology.

  4. Business users education and awareness setting.
    a) Initial definition of the data stewardship program.
    b) Designing the executive buy-in and steering committee.
    c) Invited business core team member list confirmation.
    d) Business core team training.

  5. Integrating data quality procedures with the project life cycle.
    a) Data warehouse data quality staffing and training.
    b) Implementing the data warehouse development team procedure and technology.

  6. Prioritized subject area data inventorying and assessment, based on the results of business case verification, which will result in one or more of the following deliverables:
    a) Data inventory/data sampling review.
    b) Data rehabilitation strategy.
    c) Data warehouse data staging approach.
    d) Data warehouse data management approach.
    e) Data warehouse user access approach.

  7. Implementing changes to the data quality standards and procedures based on feedback received during the project(s).
    a) Data quality process review and assessment.
    b) Data certification program action plan (or a list of updates to be added to an already in-place program).
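
To make the data inventory/data sampling review in step 6 more concrete, here is a minimal sketch of how a sample of source records might be profiled. It is one possible approach under stated assumptions, not a prescribed method: the `profile` function, the column names and the choice of summary figures (missing count, distinct count, most common values) are all hypothetical.

```python
import random
from collections import Counter
from typing import Any, Dict, Iterable, Mapping


def profile(rows: Iterable[Mapping[str, Any]],
            sample_size: int,
            seed: int = 0) -> Dict[str, Dict[str, Any]]:
    """Summarize a random sample of records, column by column."""
    rows = list(rows)
    rng = random.Random(seed)  # fixed seed keeps the sample repeatable
    sample = rows if len(rows) <= sample_size else rng.sample(rows, sample_size)

    # Inventory every column that appears in at least one sampled record.
    columns = sorted({col for row in sample for col in row})
    summary: Dict[str, Dict[str, Any]] = {}
    for col in columns:
        values = [row.get(col) for row in sample]
        # Treat absent, None and empty-string values as missing.
        present = [v for v in values if v is not None and v != ""]
        counts = Counter(present)
        summary[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "most_common": counts.most_common(3),
        }
    return summary


# Hypothetical source extract used only to exercise the sketch.
source_rows = [
    {"cust_id": 1, "region": "East"},
    {"cust_id": 2, "region": ""},
    {"cust_id": 3, "region": "East"},
    {"cust_id": 4, "region": None},
    {"cust_id": 5},
]
inventory = profile(source_rows, sample_size=100)
for column, stats in inventory.items():
    print(column, stats)
```

Columns with many missing values or unexpected distinct counts are natural inputs to the data rehabilitation strategy listed in the same step.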

Next month we will complete our planning process by delving into the development of business measures to quantify and gauge the success of our data warehouse over time. For a more complete description of this process and deliverables, please contact the author at
