"Dirty Data"is such a pervasive problem in every company, in every industry! But we have lived with "dirty data" for decades, so why is it such a problem now? Is it because we promise to deliver data warehouses with "clean, integrated, historical data in a short time frame for low cost," yet we are unable to deal with the preponderance of "dirty data" within the framework of this promise? Some data warehouses are failing because the promised "clean," "integrated," "historical" data could not be delivered. Others are failing because the promised "short" time frame and "low" cost were exceeded in the attempt to clean up the data. In other words, they are failing because of the dichotomy of our promise.

How could we, as conscientious IT professionals, have allowed "dirty data" to happen in the first place? How could the users have allowed it to happen? Or are there legitimate, sometimes even justifiable, reasons for the existence of "dirty data" in our legacy systems? And what should we do with it now?
