Here are a few questions that are bugging me while designing a data warehouse.

  • November 01 1999, 1:00am EST


Here are a few questions that are bugging me while designing a data warehouse. They concern data cleanliness.

To illustrate, here is a brief description of the problem. This is a data warehouse for the Customs Dept. In the OLTP systems, certain fields are not captured because they are not mandatory. While designing the data mart, I chose a few dimensions, and it is important that they join to the central fact table on properly populated values. Specifically, I have two cases:

  1. Importer information (name, address, etc.) is not captured when the Customs entries are sent. This poses a serious problem. However, I can use clean data from another division, which has all of this information well organized.
  2. That takes care of the first problem, since we have the data. The users also wish to query on the vessel on which the goods arrive. This information is present but is not clean: in quite a few places it is null. I would like to know whether there is a way around this incompletely captured data, or whether we need to present it as it is.

Could you let me know how to tackle such issues?


Larissa Moss's Answer: You are describing a common ailment of organizations that treat data as a byproduct rather than as a PRODUCT. Unfortunately, organizations capture data without involving the downstream knowledge workers (users) who will need the data for their decision-making process, and without considering their requirements. Creating another set of databases, such as data warehouses and data marts, cannot and never will fix the broken procedures for capturing operational data. My advice to you is to cost out this problem (Larry English's new book Improving Data Warehouse and Business Information Quality can help you with the calculations) and to bring it to the attention of your upper business management. This situation, including your valiant attempts to "plug the holes," is costing them far more than they realize.
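Before costing out the problem, it may help to measure how much of the source data is actually affected. The following is a minimal Python sketch of that kind of profiling, not a method taken from the answer or from the book it cites. The record layout, the field names (`importer_id`, `vessel`), and the sample rows are all hypothetical.

```python
def missing_rate(rows, field):
    """Return the fraction of rows where `field` is absent, None, or blank.

    Intended for text fields: any falsy value (None, "", 0) counts as missing.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if not str(row.get(field) or "").strip())
    return missing / len(rows)


# Hypothetical sample of Customs entries; real OLTP extracts will differ.
entries = [
    {"entry_id": 1, "importer_id": "IMP-01", "vessel": "MV Aurora"},
    {"entry_id": 2, "importer_id": "IMP-02", "vessel": None},
    {"entry_id": 3, "importer_id": None, "vessel": ""},
    {"entry_id": 4, "importer_id": "IMP-01", "vessel": "MV Aurora"},
]

# Report the share of entries that cannot join cleanly to each dimension.
for field in ("importer_id", "vessel"):
    print(f"{field}: {missing_rate(entries, field):.0%} missing")
```

A per-field rate like this says nothing about cost by itself, but it gives upper management a concrete number to attach to the capture problem.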
