Continue in 2 seconds

We are at a crossroad.

By
  • Sid Adelman, Scott Howard, Joyce Bischoff
Published
  • January 23 2002, 1:00am EST

Q:

We are at a crossroad. We need a bigger server to get the ETL load times down, but as a company I am not 100 percent convinced that we are willing to spend the money. I think we will spend the money, but I would not bet on it yet. A decision will be made in the middle of December. If we don't purchase the server then my guess is that the warehouse will disappear over time. We won't have the horsepower from a hardware perspective to meet the growing business needs. If we can't meet the needs then the question is sure to come up - why do we need a warehouse.

A:

Sid Adelman’s Answer:It sounds like you haven’t been measuring the benefits of the data warehouse. If you had, and the benefits were apparent, there shouldn’t be any question about the value of the warehouse and your company would not be questioning the expenditure for the bigger server. My strong recommendation is to now measure the benefits, both tangible and intangible, and communicate those benefits to management before they have had a chance to let the warehouse die.

Joyce Bischoff’s Answer: If you don’t know why you need a warehouse, you should not build one. A data warehouse should be driven by the requirements of the business community and not by the desire of IT to do the users a favor. Typical business requirements involve query and analysis of historical trends, comparisons of various time periods, data mining, etc. If you do not have a highly placed sponsor from the business community who is willing to foot the bill for the new server, your warehouse and the new server will probably not be cost justifiable.

Scott Howard’s Answer: Harsh reality. If the true value of the BI system and its supporting data warehouse are not well known or justifiable, then yes, “Why do we need a warehouse?” is a legitimate question. The warehouse must by an asset, providing valuable information not available from your traditional OLTP or operational systems. It does this by creating and maintaining a true history of business activities and of internal and external events that could affect that history. It combines your internal data with classified (purchased or public record) data so these external events and their effects on your business can be analyzed. Someone has to perceive some value in what your organization has built and must champion your cause. You may need to seek these folks out from your user community. If they don’t exist and what I just described does not represent your implementation, perhaps the digital junk heap is an appropriate fate.

Chuck Kelley’s Answer: This is the typical “we want a petabyte data warehouse, but we don’t want to spend more than $50,000 building and running it over the next five years” attitude. There are many ways to shrink ETL load times. But those are dependent on 1) your dw design (star, snowflake, normalized, partitioned or not, etc.); 2) operating system (UNIX, VMS, windows, parallelization, etc.); 3) the ETL tool you have chosen (Informatica, DataStage, SQL Scripts, COBOL, etc.); 4) the database you have chosen (Oracle, DB2, SQL Server 2000, Access, partitioning available, parallelization, etc.); 5) how often you load; and 6) the window to be able to load. If you’re absolutely positive that you can not squeeze more out of the current box, then your company will have to make the decision to upgrade or (possibly) close down (or limp along). If the choice is the close down, then most likely the company is not ready or does not understand the true value of the data warehouse.

Clay Rehm’s Answer: Why do you need a bigger server to get the ETL load times down? Before blaming the hardware or spending money on more hardware, investigate to see if the code is written efficiently and if the sequence of the specific load jobs are in the most efficient order. What is the ETL philosophy? Are you reloading each row of data, when you could be doing updates instead or vice versa? Are there file conversions going on (RDBMS to flat file for example)? Bottom line – there could be many reasons your ETL processes are performing poorly that can be resolved through process or design improvements. Your statement about the warehouse disappearing overtime should make you and your team very, very nervous. If you or your team thinks that the data warehouse will ever go away, then it should have not been built in the first place. A data warehouse should only be built to resolve specific business problems. As soon as you read this, locate the original project description, scope document, requirements document and any other related documents to see why the data warehouse was built in the first place. Talk with your business users to determine if their needs are being met with the data warehouse. Have honest and sincere discussions with them to find out their needs and to let them know you can accommodate them. Determine if their needs are truly accomplishable and realistic given your budget, time frames, resources and sponsors.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access