During the summer and autumn of 2001, Giga Information Group asked data warehousing people at financial service institutions, insurance companies, retailers, manufacturers, transportation companies, and organizations in the private sector and education to let Giga take a look at the weaknesses in data warehousing solutions that they had implemented. Giga also asked what they would do differently if they had the chance. The result is a revealing, sometimes painful and often humorous look at the uses and abuses of data warehousing in the real world.

It would be an exaggeration to say that in every instance Giga has an easy solution. However, in every instance, there is a lesson learned and a best practice that can be inferred. In every case, it is important for firms building or operating data warehouses to understand what can go wrong. This enables action to be taken to reduce the risk of surprises such as schedule slips, disappointed customers or budget overruns due to technology miscalculations. This, in turn, enables implementation of state-of- the-art data warehouses that support high-impact business applications in CRM, supply chain management and business intelligence.

One data warehouse implementer reports: The major weakness of our data warehouse is its lack of data. This organization has many different kinds of data, but the data warehouse focuses only on customer billing information. There is no order, delivery, warranty or detailed line-item data. The main reason for not having this data, according to the client, is lack of justification for expending the IT effort to access the data stores.

A data warehouse is no better than its access to upstream transactional systems. The situation described by this client vignette occurs when the data warehouse lacks access to the diverse operational systems that hold the transactions for order completion, delivery and detailed line items. The client suffers from the latter. There also is evidence of organizational obstacle – possibly territorial behavior on the part of some operational systems whereby access to the required data is denied. The direct approach is to analyze the business problem which, in this case, highlights the additional data required to get a more complete view of the customer. Next, design and implement dimensions that capture and forward the information from the operational systems to the data warehouse. Finally, use that information to address the business issue (not explicitly stated here) such as selling additional products to those whose warranty data indicates the product is at the end of its useful life. Lesson learned: A basic architectural principle can usefully be employed here – the data without the application is meaningless, and the application without the data is empty.

In a related but slightly different problem, a respondent complains: The data warehouse contains the "raw" data – mostly extracted from operational systems. We had very limited design and little effort was spent on rolling up or summarizing the data. It is very complicated to collect all the needed information – and to integrate and aggregate it. As a result, every new source takes a lot of effort, which has prevented maximum use and business benefit from the warehouse. Along the same lines: The data warehouse was built, not designed. We keep way too much detail and do not summarize enough data into prebuilt tables. Too many of our queries are ad hoc and take long times to run because of the detail level in the tables. The folktale of the three little pigs and the wolf comes to mind. All three of the pigs built their houses – only one designed his.

This is a slightly different issue, but related to the first. In the first example, the data warehouse lacked data. In the second, the data warehouse lacks information. Unlike the first example, there is plenty of data – but not enough information. The irony is that not enough effort was expended at the front end to design aggregates; therefore, extra effort must be incurred at the back end to collect, integrate and aggregate all the information needed. The good news is that if the detailed data is available, building aggregates should be relatively easy. Indeed, knowing that 20 percent of the data gets used 80 percent of the time is often not obvious until an installation gets some experience under its belt. Although this is not a discussion about tools, getting technology to monitor database usage can be helpful at this point. Aggregates are one of the chief performance enhancers for any data warehouse; therefore, justifying their creation and maintenance is relatively easy to do. Adding business views and summaries (e.g., by time period, business unit, product family, etc.) should be a priority. Also, the introduction and use of data mining technology/tools to discover unforeseen relationships and correlative factors would greatly enhance the perception of value, use of the warehouse and provide better insight into the business. Lesson learned: Design the data warehouse; don't just build it.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access