Data quality assessment has become so much a part of many data quality programs that we assume that everyone knows how to do it well and efficiently. But this is not the case. Done well, assessment is technically difficult and demanding, often conducted under the press of time, while short on budget and long on conflicting management demands. Much can, and does, go wrong. So Arkady Maydanchik’s volume is a welcome addition to the data quality literature. It describes the end-to-end assessment process and each step in a brisk, easy-to-read style.

 

The book has three parts. The first, titled “Data Quality Overview,” consists of three chapters that introduce the subject matter. The first chapter provides a laundry list of data quality problems, including everything from erroneous manual entry to errors introduced during batch feeds. The second summarizes a five-part data quality program. Contextually, data quality assessment is the first of the five. The third chapter provides an overview of the data quality assessment process. One important section discusses the makeup of the assessment project team, a subject all too easy to pass over. Importantly, Maydanchik takes the point of view that data quality assessments should be conducted, or at least led, by IT departments. This is both a strength and a weakness of the book, a point I’ll return to later.

 

The second part consists of five chapters that describe various categories of business rules. This subject almost always receives short shrift, so the careful attention it gets here is welcome. I found the flow of the chapters, from oft-forgotten rules governing the handling of missing values through identity rules, historical data, and attribute dependencies, engaging. I did not cross-reference Maydanchik’s rules with Ross’s (The Business Rule Book: Classifying, Defining, and Modeling Rules, Database Research Group, Inc., 1997), but I completed a quick checklist against the “rules in my head.” All the important rules appear to be there.
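To make these rule categories concrete, here is a minimal sketch in Python. It is not taken from the book: the employee records, field names, and the four checks are hypothetical, chosen only to show one rule of each kind mentioned above (missing values, identity, historical data, attribute dependency).

```python
from datetime import date

# Hypothetical employee records; the field names are illustrative assumptions.
records = [
    {"id": 1, "hire_date": date(2001, 5, 1), "term_date": None,
     "status": "active"},
    {"id": 2, "hire_date": date(2003, 2, 10), "term_date": date(2002, 1, 1),
     "status": "terminated"},
    {"id": 2, "hire_date": None, "term_date": None,
     "status": "active"},
    {"id": 4, "hire_date": date(2005, 7, 15), "term_date": None,
     "status": "terminated"},
]

def missing_value_failures(rows):
    """Missing-value rule: hire_date is required."""
    return [i for i, r in enumerate(rows) if r["hire_date"] is None]

def identity_failures(rows):
    """Identity rule: id must be unique; report the repeated occurrences."""
    seen, dupes = set(), []
    for i, r in enumerate(rows):
        if r["id"] in seen:
            dupes.append(i)
        seen.add(r["id"])
    return dupes

def historical_failures(rows):
    """Historical rule: a termination date cannot precede the hire date."""
    return [i for i, r in enumerate(rows)
            if r["hire_date"] and r["term_date"]
            and r["term_date"] < r["hire_date"]]

def dependency_failures(rows):
    """Attribute dependency: 'terminated' status requires a term_date."""
    return [i for i, r in enumerate(rows)
            if r["status"] == "terminated" and r["term_date"] is None]
```

Each function returns the positions of the failing records, so the same results can feed both error counts and record-level follow-up.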

 

The last three sections in this part are especially important. They lay out a process for actually creating the rules, another one of those nettlesome tasks that people underestimate. I would like to see this material expanded. Maydanchik’s views on how to know when a set of rules is good enough, and on industry-standard rules, might make this onerous task easier.

 

The third part consists of five chapters focused on conducting the assessment. I particularly liked Chapter 12, on measurement. Too many assessments founder on reports of the form, “There were a total of 123,456 business rule failures.” Without interpretation, management is left to say, “So what?” and move on to the next subject. This is the data quality equivalent of the old hospital yarn, “The operation was a success but the patient died.”
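The difference between a raw count and an interpretable measurement can be shown in a few lines. The following Python sketch is illustrative only and is not Maydanchik’s method: the rule names and counts are invented, and the failure rate per rule is just one simple way to give a total some context.

```python
# Hypothetical results: rule name -> (records checked, records failing).
rule_results = {
    "hire_date present": (1_000_000, 1_200),
    "employee id unique": (1_000_000, 300),
    "termination after hire": (40_000, 6_000),
}

def summarize(results):
    """Return (rule, failures, failure rate) tuples, worst rate first."""
    rows = [(rule, failed, failed / checked)
            for rule, (checked, failed) in results.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)

# The bare total that prompts the "So what?" reaction.
total_failures = sum(failed for _, failed in rule_results.values())

# The same data with context: which rule fails most often, relative to
# the number of records it actually applies to.
summary = summarize(rule_results)
for rule, failed, rate in summary:
    print(f"{rule}: {failed} failures, {rate:.2%} of records checked")
```

In this made-up data the rule with the fewest applicable records has by far the highest failure rate, which the single total would hide.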

 

I have two complaints, both minor. First, although Chapter 2 is titled “Data Quality Program Overview,” this volume is IT-centric. It does not, for example, recognize the importance of process and supplier management, perhaps the two most powerful (business) management tools for data quality. Thus, Maydanchik does not discuss how business rules can be built into process design or in-process measurement, or the roles assessment can play in data supplier management.

 

Second, for many teams, selecting the right vendor tool is a big deal, fraught with politics, and complicated by the large number of good offerings and churn in the space. Maydanchik does not provide any criteria or other guidance on this important subject.

 

Of course, the most difficult choices authors make involve what to leave in and what to leave out. Maydanchik’s volume is tightly focused, muting these complaints.
