MAR 15, 2007 1:00am ET

A Fresh Look

John thanks Robert Vasta, Navigant Consultant, for major contributions to this column.

The name of this column is "Beyond the Data Warehouse," which implies that there is more to managing information than building a data warehouse. I have emphasized how to accomplish information management and data warehouse efforts as the field matures.

There is a new generation of data warehousing (DW) on the horizon that reflects maturing technology and attitudes. In addition, I think we are looking at a radical change in platforms that will make the old DW strategies look like a COBOL programmer at a semantic ontology convention.

Remember that in the 1990s, DW was enabled by the drop in the cost of storage - to the point where we could afford to cleanse and store lots of data. As we move well into the 21st century, we have much less expensive hardware in terms of storage and processing, and we are seeing commodity-type appliances specialized for doing what mainstream servers did less than 10 years ago. We also have a business environment that is driving down latency to the point where operational reporting must be as integrated as DW reporting was envisioned to be. We have responded to that with real-time DW and integration layers (reminiscent of 1980) to clean up the transactions and report against them again. Enterprise resource planning (ERP) applications seemed to be the Holy Grail of integrated data. Sadly, most ERP implementations result in more dis-integration because of challenges with data quality and interfaces.

Therefore, the DW still holds strong as an architectural approach to enabling business and data integration. The basic tenets of developing a DW are solid. There is no slowdown in DW development, as many organizations cheerfully embark on second- and third-generation DW frameworks. But often, many of the basic tenets and lessons learned are not practiced. This column is going to take a look at these basic tenets, applying the hindsight of the accumulated experience of thousands of projects. I will review the areas where DW projects have historically tended to fail, and still do. It is astonishing to me that these mistakes are still made on a consistent basis. I will supply some more modern rationale for these "legacy" critical success factors in the hope that somewhere out there someone will cease what is commonly known as "dumb stuff."

Tenet 1: "Build it and they will come" is not a good philosophy for a data warehouse.

Why this oft-repeated metaphor is still ignored is a great mystery. I will address defining requirements correctly later, but many DW projects do not define them at all. The typical scenario is a project that is begun to solve data access or reporting inefficiencies. Then the project sponsor gets too busy to get into detailed analysis of requirements and insists that all of the data be saved into the new DW so that the business can make sense of it later.

The reality is that all of this data is rarely used; in fact, often as much as 40 percent of loaded data elements are never accessed. Those that are loaded are often of questionable accuracy due to a lack of data quality. The ultimate deal killer occurs when the "vat o' data" that has been developed meets few of the original expectations and is not aligned with what the business wants.

The 21st century perspective is this: If the end user wants all of the data "stashed" - then save it in a cheap staging area that is made up of offline storage. If the data is needed, it is handy. Also audit this stash for basic data quality items - referential integrity, code accuracy, etc. - then audit the data for risk. The DW team can tell the end users what their risks are when they go plowing through the data stash. Another offshoot of tenet 1 is that these large data dumps often become the system of record for many organizations, even though they were not intended to be operational or to support compliance. Once your data warehouse becomes a source for tactical or regulatory reporting, you must mitigate the potential risk of violating laws or making bad operational decisions.
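As a rough illustration of the basic audit described above, here is a minimal Python sketch of a referential integrity check and a code accuracy check against a staged data set. The table names, field names, and the list of valid status codes are all hypothetical, invented for this example; a real audit would run against your own staging area and reference data.

```python
# Hypothetical staged data: a parent "customers" set and a child "orders" set.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "status": "OPEN"},
    {"order_id": 11, "customer_id": 3, "status": "SHIPPED"},  # no such customer
    {"order_id": 12, "customer_id": 2, "status": "??"},       # unknown code
]

# Hypothetical reference list of allowed status codes.
VALID_STATUS = {"OPEN", "SHIPPED", "CLOSED"}


def find_orphans(child_rows, fk_field, parent_keys):
    """Referential integrity: child rows whose foreign key has no parent."""
    keys = set(parent_keys)
    return [row for row in child_rows if row.get(fk_field) not in keys]


def find_invalid_codes(rows, code_field, valid_codes):
    """Code accuracy: rows whose code value is not in the reference list."""
    valid = set(valid_codes)
    return [row for row in rows if row.get(code_field) not in valid]


orphans = find_orphans(
    orders, "customer_id", (c["customer_id"] for c in customers)
)
bad_codes = find_invalid_codes(orders, "status", VALID_STATUS)

# A simple risk summary the DW team could hand to end users.
print(f"{len(orphans)} of {len(orders)} orders reference a missing customer")
print(f"{len(bad_codes)} of {len(orders)} orders carry an unknown status code")
```

The point of a summary like this is not to fix the stash, but to let the DW team state the known risks before anyone relies on the data.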

One more contemporary slant on the "throw it all in" philosophy: Sarbanes-Oxley and other regulations have made record retention a big issue. There is risk in data that is stored too long. It is mandatory to confront retention issues when the business requests all of the data.

Tenet 2: Understand the true business problem.

The DW is sustainable only if it satisfies valid business objectives or supports business drivers. But too often, I still get the answer of "faster access to data" when I ask why the DW is being developed or why it was developed in the first place.

Faster access is not a good objective. In fact, a DW project with no measurable business goals defined will fail twice as often as one where the business is driving the result. Therefore, our 21st century view is to focus on sustainability. You must ensure alignment with a business problem first, rather than striving for a one-off deliverable of reports or queries. Frankly, the most successful projects are business-driven and business-managed.

Tenet 3: Define requirements from an agnostic view.

If requirements are being defined for DW projects, they must be defined correctly. A good example of bad requirements definition is starting to determine what the source systems are the day after the project starts. Another example is having the BI and ETL tool selection start immediately after the project budget is approved (and most likely before the business sponsor has been confirmed).
