MAR 9, 2006 1:00am ET

Data Quality Begins with Data Entry

One of the worst things that can happen to an information system is for incorrect data to appear on management reports. Nothing can kill the reputation and potential adoption of an information system faster than "wrong" data. Fortunately, in most cases the incorrect data has nothing to do with the business intelligence tool that created the report. The incorrect data more than likely stems from a data quality issue.

Data quality issues surface for a number of reasons: glitches in the data entry process, poor transformations and merges when multiple data sources are brought together, and, often, missing data. The bottom line is garbage in, garbage out. If incorrect data is put in during the data entry process, the incorrect data will appear in a report when it is brought out of the database.

Suppose a sales rep at a company, while entering data into a sales tracking system, began to use the gender field in the data entry form to track whether the company he was selling to had fewer or more than 100 employees (indicated by a "U" or an "O" in the gender column of the data). The sales rep didn't care whether the prospect was male or female; the size of the company meant more to him. By repurposing a seemingly useless column (gender), the sales rep solved a particular issue for himself, but when the marketing people wanted to look at demographic information, they found they had 45 percent M, 46 percent F, four percent U and five percent O when counting the percentage of customers by gender. Information that shows four genders with varying percentages invalidates the entire meaning of the report.

This fictional example closely represents the problems that can occur in the data entry process. In most cases, the gender option in a data entry form will allow only two valid selections. Developers who build data entry forms go to great lengths to ensure that the process is fast and easy and that many entry options are automated.
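A check along these lines could catch the repurposed gender column before it reaches a report. This is only a minimal sketch in Python: the allowed codes "M" and "F", the function name, and the sample data (built from the percentages in the example above) are assumptions for illustration, not part of any particular sales tracking system.

```python
from collections import Counter

# Assumption: the form is meant to accept only these two codes.
ALLOWED_GENDER_CODES = {"M", "F"}


def unexpected_codes(values, allowed=ALLOWED_GENDER_CODES):
    """Count every stored value that falls outside the allowed set."""
    return Counter(v for v in values if v not in allowed)


# Hypothetical column contents matching the fictional example:
# 45 percent M, 46 percent F, four percent U and five percent O.
gender_column = ["M"] * 45 + ["F"] * 46 + ["U"] * 4 + ["O"] * 5

bad = unexpected_codes(gender_column)
share_invalid = 100.0 * sum(bad.values()) / len(gender_column)

print(dict(bad))       # which invalid codes appear, and how often
print(share_invalid)   # percentage of records with an invalid code
```

Running the same check when a record is saved, rather than after the fact, would reject the "U" and "O" entries at the point of data entry.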

The best way to prevent data quality issues from occurring in the first place is to communicate with, and enlist the help of, the people who are entering the data (assuming that you have a manual data entry process). It's no secret that data entry is boring and time-consuming. As a result, the people who do it often look for ways to cut corners to accelerate the process. Cutting corners is a surefire way to promote eventual data quality problems.

Educating and motivating data entry people goes a long way toward minimizing data entry problems. First make sure that the people who are entering data know what it will be used for - solving customer problems, sales analysis, marketing, etc. This will give them an understanding of the importance of the data entry process. Then show them examples of what can happen if they don't enter the data correctly.

In my company, we prominently display data quality reports that show how different people and groups are doing in relation to data quality. For example, when collecting sales prospect information in our marketing database, we look for information about the title of the prospect (CIO, VP, managing director, etc.), the industry their company is in, their email address, and so on. At each branch where data entry occurs, a data quality report shows how that branch compares to others in the percentage of records with titles, percentage with valid email addresses, and percentage with a company's industry. This report motivates branch personnel as they compete to have the highest percentages on the report. Additionally, it is thoroughly explained to everyone in the branches that the data will be used for targeted email campaigns to specific industries and that the branches with the best percentages typically get the best return from these types of campaigns.
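As a rough illustration, a per-branch report like the one described above could be computed as follows. This is a sketch under stated assumptions: the record layout (the keys "branch", "title", "email" and "industry"), the branch names, and the deliberately simple email pattern are all hypothetical and do not describe the actual marketing database.

```python
import re
from collections import defaultdict

# Assumption: a deliberately simple pattern; real email validation is stricter.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

FIELDS = ("title", "email", "industry")


def branch_quality_report(records):
    """Return, per branch, the percentage of records with each field filled in."""
    counts = defaultdict(lambda: {"records": 0, "title": 0, "email": 0, "industry": 0})
    for rec in records:
        row = counts[rec["branch"]]
        row["records"] += 1
        if (rec.get("title") or "").strip():
            row["title"] += 1
        if EMAIL_PATTERN.match((rec.get("email") or "").strip()):
            row["email"] += 1
        if (rec.get("industry") or "").strip():
            row["industry"] += 1
    return {
        branch: {f: round(100.0 * row[f] / row["records"], 1) for f in FIELDS}
        for branch, row in counts.items()
    }


# Hypothetical prospect records from two branches.
records = [
    {"branch": "East", "title": "CIO", "email": "a@example.com", "industry": "Retail"},
    {"branch": "East", "title": "", "email": "not-an-email", "industry": "Banking"},
    {"branch": "West", "title": "VP", "email": "b@example.com", "industry": ""},
]

report = branch_quality_report(records)
for branch, percentages in sorted(report.items()):
    print(branch, percentages)
```

Sorting the branches by any one of these percentages gives the kind of ranking that branches can compete on.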

You can ensure that the first management reports that come off the system contain correct data. Take the time to let the people who are entering the data know what the data is being used for and how significant it is. Use data quality reports to motivate them to be conscientious about the data entry process. Also, if they receive the first reports from the system, they can monitor the quality of the reports.

If you have the budget, create a position for a person who is responsible for data quality and who is focused on monitoring the collected data and ensuring that it is valid. That person can also be responsible for educating and motivating the people involved in manual data entry processes.

Kevin Quinn, vice president of Product Marketing at Information Builders, researches new technologies for acquisition or adoption and defines the strategy and road map for the WebFOCUS business intelligence platform. In his 22 years of experience in IT, Quinn has helped companies worldwide develop information deployment strategies that accelerate decision making and improve corporate performance. Quinn is also the founder of Statswizard.com, an interactive sports statistics Web site that leverages business intelligence functionality. You can reach him at KevinR_Quinn@ibi.com.
