JUL 1, 2010

Data Definitions: Speaking the Same Language

We're not on the same page.

That's Greek to me.

We're comparing apples and oranges.

These phrases are familiar refrains to IT professionals working with business customers. We must find ways to bridge the communication gap, especially in the data management and business intelligence space where we aim to unlock the value that is often hidden or neglected in business data.

An abundance of tools aid businesses in collecting, organizing, managing and gaining competitive advantages from their data. However, many businesses have foundational data work that must be tackled before any technology solution can add value.

Perhaps because this work is not considered to be as flashy as the latest technology solution, it is too often neglected or seen as documentation that can be put off for later. Even when businesses agree to tackle data definitions up front, or top down via a data governance initiative, difficulties around doing this foundational work are common. Malcolm Chisholm points this out in his article "Real Definitions versus Nominal Definitions in Data Management":

"We frequently speak of definitions in data management, but they are often taken for granted. In particular, it seems that everyone knows what a definition is and that every one assumes producing definitions is easy. Nobody stops to ask what exactly a definition is and if there are any particu- lar considerations about formulating definitions."

Consider the scenario of a business with data stored in a number of legacy system silos. Since the data is isolated within the silos and used to support independent business units, it is shared sparingly, if at all. Additionally, business users are only familiar with their particular legacy systems. In this case, definitions of data elements are a crucial first step to integrate data across the legacy systems into a consolidated data warehouse. Without data element definitions, the data warehouse could be populated with data that makes sense to one group of users but not to all users who need the data. Good data warehouse design must start with a solid foundation of data element definitions.

Definitions will also be a crucial aid as data is mapped from the legacy sources to the target data warehouse. Without integrated data and commonly understood definitions, the wrong data from the wrong legacy databases could be brought into the data warehouse. Additionally, business users may assume the warehouse will show data exactly as their own legacy system always has ("These customer IDs make no sense - they're supposed to be in this format").

Testing of the data warehouse solution also hinges on quality data element definitions. Without definitions as a standard, it will not be possible to write unambiguous test scripts. Business customers may find unpleasant surprises during user acceptance tests if they did not agree on definitions of data at the beginning of the project.
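To make the link between definitions and test scripts concrete, here is a minimal sketch in Python. It assumes the agreed definition of an element records an expected format; the `ElementDefinition` class, the `failing_values` helper and the Customer ID format are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementDefinition:
    """An agreed definition of one data element (all fields are illustrative)."""
    name: str
    meaning: str
    pattern: str  # hypothetical agreed format, written as a regular expression


def failing_values(definition: ElementDefinition, values: list[str]) -> list[str]:
    """Return the values that do not match the format in the agreed definition."""
    agreed_format = re.compile(definition.pattern)
    return [v for v in values if agreed_format.fullmatch(v) is None]


# Hypothetical definition: two uppercase letters, a hyphen, then six digits.
customer_id = ElementDefinition(
    name="Customer ID",
    meaning="Identifier assigned to a customer when the account is opened",
    pattern=r"[A-Z]{2}-\d{6}",
)

# Sample values as they might arrive from different legacy systems.
loaded = ["NE-004211", "SW-118730", "4211", "ne-004211"]
print(failing_values(customer_id, loaded))  # ['4211', 'ne-004211']
```

Because the expected format comes from the agreed definition rather than from one group's assumptions, a failing value points to a concrete disagreement that can be resolved before user acceptance testing.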

To avoid these pitfalls, requirements for populating a data warehouse must include business definitions of each data element going into the data warehouse. These definitions provide a common understanding of what data elements are required, what the data elements mean and the correct source of each data element.
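One way to picture this requirement is as a simple completeness check run before any data is loaded. The sketch below is in Python and is only an illustration: the `BusinessDefinition` fields, the `undefined_elements` helper and the sample system and field names are assumptions, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessDefinition:
    element: str        # name of the data element in the warehouse
    meaning: str        # what the element means to the business
    source_system: str  # legacy system agreed as the correct source
    source_field: str   # field within that legacy system


def undefined_elements(required, definitions):
    """Return required elements that have no definition or an incomplete one."""
    by_name = {d.element: d for d in definitions}
    missing = []
    for element in required:
        d = by_name.get(element)
        complete = d is not None and all(
            text.strip() for text in (d.meaning, d.source_system, d.source_field)
        )
        if not complete:
            missing.append(element)
    return missing


# Hypothetical requirements and definitions, for illustration only.
required = ["customer_id", "order_date", "region"]
definitions = [
    BusinessDefinition("customer_id", "Identifier assigned at account opening",
                       "billing", "CUST_NO"),
    BusinessDefinition("order_date", "Date the order was placed", "", ""),
]

print(undefined_elements(required, definitions))  # ['order_date', 'region']
```

Here `order_date` is flagged because no source has been agreed, and `region` because it has no definition at all.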

Given the need for good data element definitions, what is the best way to go about getting them? You will need to engage the business and IT experts who are familiar with the legacy systems where the data is stored. After all, these people use the systems daily and are the most familiar with the data they need, how they use it and how it is represented in their system. This sounds like it will be a breeze, right? Simply get the business users to define each data element and explain how they use it in their daily operations. However, it may not be as easy as it seems.

Business users are often not accustomed to providing definitions at the level of detail required in order to ensure the right data gets into the data warehouse. Additionally, their familiarity with the systems and the data elements may actually make it more challenging for them to explain data elements to an outsider who has not used the systems as extensively as they have.

IT database administrators understand the physical structure of the legacy databases and the relationships between tables, but they may not be as familiar with how the business uses the data. Additionally, IT definitions are not always meaningful to business users from other departments or to those who are familiar with other systems.

Getting a definition from either group in isolation often ends up in confusion, as each group by default will offer limited explanations that make sense to them but not to anyone else in the organization. If business or IT users insist that their definition is good and everyone knows what they mean when in fact that is not the case, the following strategies may help.

1. Provide examples of unclear versus clear definitions. Users who are intimately familiar with their business process and supporting systems may not understand the point of specifying exactly what they need. To them "the ID of the customer" is a perfectly acceptable definition of "Customer ID." Or, the IT representative may give a definition that works for him or her but no one else, such as "the primary key of the customer table." It will help both to see examples of what is needed in order to have a workable definition to support data warehouse population and use of the data. Beyond being ambiguous, the "unclear" definition in the example (see box on the next page) hides the fact that the Customer ID contains embedded data that might otherwise have been overlooked by data modelers.
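A rough screening heuristic can help start this conversation. The Python sketch below flags definitions that merely restate the element name, as "the ID of the customer" does for "Customer ID." The `is_circular` function, the stopword list and the third, clearer definition are hypothetical illustrations and are not taken from the article's box.

```python
import re

# Small, illustrative list of filler words to ignore when comparing text.
STOPWORDS = {"a", "an", "the", "of", "for", "to", "in", "is"}


def content_words(text: str) -> set[str]:
    """Lowercase the text and keep only the words that carry meaning."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def is_circular(element_name: str, definition: str) -> bool:
    """True when the definition adds no word beyond the element's own name."""
    return content_words(definition) <= content_words(element_name)


print(is_circular("Customer ID", "the ID of the customer"))
# True: the definition only restates the name

print(is_circular("Customer ID", "the primary key of the customer table"))
# False: not circular, although still unclear to business users

# Hypothetical clearer definition that exposes embedded data.
clearer = ("Identifier assigned when an account is opened; "
           "the first two characters encode the sales region")
print(is_circular("Customer ID", clearer))
# False
```

The check catches only the most obvious case. The IT-style definition passes even though it means little to business users, so a review by both groups is still needed.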
