Why do we get wrong answers when we combine two operational databases that each had perfectly adequate data quality before they were joined? Why do different users of the same database hold totally different views of the quality of its data? Why is it that data, once cleaned, doesn't seem to stay clean even though the records were never modified? Recently a reporter asked me, "Why hasn't the industry solved the data quality problem yet? It seems so straightforward." As with most human endeavors, if you don't understand the root of the problem, your solutions will treat symptoms or, at best, provide only temporary relief. Despite enormous advances in technological remedies, the data quality problem seems more pervasive and tenacious than ever – like bacteria that have grown immune to antibiotics.

We can blame the Internet, the sheer explosion of data volumes, uncontrolled data entry by end users and customers, disparate content from external or poorly documented sources, or the torrent of new business needs that old data must satisfy. We would be right on all counts, but we'd still be missing the key to this dilemma. To paraphrase a favorite British playwright: the fault, dear Brutus, lies not in the data, but in ourselves.
