One of the truisms of the data administrator is that accurate data is the backbone of corporate decision making. Entire careers and software product lines have been built on this axiom.

While accurate and complete data is a good thing wherever you can get it, accuracy may not be necessary for the decisions made on and about the data residing in a data warehouse. There are some interesting cases where accuracy is not needed.

In the case of large amounts of summary data, is accuracy necessary? Consider the annual corporate report, an exercise in which every public company participates. Summary numbers, including profit, expense and revenue, can be found in the annual report. Accounting firms work hard to calculate this set of numbers that is, in turn, verified by the auditing firm and presented to the public on a quarterly basis. Accountants spend many hours slaving to get the numbers to balance to the penny. Then, what happens to those numbers as they are taken by management and presented to the public? The numbers are rounded to the nearest million. The last six or so digits are not even presented to the public! So much for accuracy. Some corporations are so large that the rounding is done to the one-hundred-million-dollar level. Wouldn't it be nice to have all that rounding money credited to your account? In that regard, annual reports are woefully inaccurate. As much as one million dollars here or there is thrown to the wind, making it difficult to make a case for corporate accuracy. Pity the poor accountant. After getting the books to balance, the better part of the accountant's work is simply tossed away.
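To make the rounding concrete, here is a minimal sketch of what happens when a figure balanced to the penny is reported to the nearest million. The revenue amount is hypothetical, invented purely for illustration, and the helper `round_to_unit` is not taken from any accounting system.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_unit(amount: Decimal, unit: Decimal) -> Decimal:
    """Round amount to the nearest multiple of unit (halves round up)."""
    return (amount / unit).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * unit


# Hypothetical revenue figure, balanced to the penny (an assumption, not real data).
exact = Decimal("4318265907.43")
million = Decimal("1000000")

reported = round_to_unit(exact, million)   # what the annual report would show
dropped = exact - reported                 # what the rounding discards

print(f"exact:    {exact:,}")
print(f"reported: {reported:,}")
print(f"dropped:  {dropped:,}")
```

Under these assumptions the discarded amount can never exceed half of the rounding unit, so the larger the unit, the more of the accountant's precision is thrown away.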

Let's consider another case of accurate information and business decisions. Suppose you have a data warehouse and your boss asks you to calculate the revenues for a product line for 2001. You go back to the historical data in your warehouse and calculate that there were revenues of $27,887,900.19 for the product line in 2001. Your boss now uses that information to make a decision. Would that decision change if revenues were actually $27,867,108.10? Most likely not. Your boss is going to make the same decision regardless of the accuracy of the summary calculation. If the amount should have been $6,988,108.05, then your boss probably would have made a different decision. However, as long as the calculation is off by only a small margin on a large number, a case can be made against the need for perfectly accurate data in a data warehouse.
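The size of the discrepancy in this example can be expressed as a relative error. The following is a rough sketch using the three figures from the paragraph above; the 1% tolerance is an assumption chosen only for illustration, not a threshold the example itself specifies.

```python
def relative_error(reported: float, actual: float) -> float:
    """Return |reported - actual| as a fraction of |actual|."""
    return abs(reported - actual) / abs(actual)


# Figures taken from the example above.
reported = 27_887_900.19
slightly_off = 27_867_108.10
far_off = 6_988_108.05

# Hypothetical tolerance: treat anything under 1% as immaterial to the decision.
TOLERANCE = 0.01

for actual in (slightly_off, far_off):
    err = relative_error(reported, actual)
    verdict = "same decision likely" if err < TOLERANCE else "decision may change"
    print(f"actual={actual:,.2f}  relative error={err:.2%}  -> {verdict}")
```

In the first case the reported figure is off by less than a tenth of one percent. In the second it overstates the actual figure by roughly 300 percent, which is the kind of gap that would plausibly change the decision.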

Let's examine another case for accuracy: the spreadsheet. Three different departments independently create similar spreadsheets. One department calculates last month's total product revenues at $109,887. Another department calculates last month's total U.S. product revenues at $75,998. A third department calculates North American revenue at $106,339. These three numbers are admittedly somewhat different, which is to be expected: different people using different data and different formulae have calculated different sums. Even so, there is a certain similarity among the three calculations. In fact, it is surprising that the numbers are as synchronized as they are.

There is real value in focusing people on the business issues at hand. Spreadsheets are good at doing that. Spreadsheets provide a quick and dirty way of getting some numbers together. The spreadsheet numbers may be incomplete and inaccurate, but at least they indicate the correct direction, for better or for worse. Furthermore, the spreadsheet numbers can be created quickly and changed even more quickly. General George Patton said it best: "A good plan today is better than a perfect plan tomorrow." Of course, Patton was not referring to spreadsheets, but he well could have been.

Anyone who thinks that accuracy is the strength of spreadsheets has not worked with them very much.

I have presented three cases where accuracy of information in the corporation is not a significant factor in the corporate decision-making process. Should accuracy and completeness of information matter in the data warehouse?

It seems that the lower the level of detail, the more accuracy matters. Conversely, the higher the level of summarization, the less accuracy matters. Some cases are informal to the point that accuracy is a very relative thing. Stated differently, accuracy is very important in the capture and storage of detailed data as the data is placed in the data warehouse. However, when summaries are used for decision making, as long as the data is generally accurate and generally complete at the detailed level, accuracy becomes a non-issue.

I can hear the critics now.
