A Time of Growth for Data Warehousing

  • November 01 2004, 1:00am EST

2005 will be a time of growth for data warehousing and the business intelligence (BI) applications that it enables. Forrester survey work shows spending expectations are improving dramatically, with 41% of enterprises surveyed expecting spending to grow more than 10%.

There are a few key forces behind the dynamics that will characterize data warehousing in 2005. These major drivers include:

Pent-up demand for data warehousing services. Over the past 2.5 years of economic downturn, the steady drumbeat has been to reduce costs. With the economy appearing to turn a corner, building out data warehousing infrastructure is again a priority wherever there are visible opportunities to increase top-line revenue in customer-facing operations and product optimization.

Growing data volumes. When left to itself, data does not spontaneously shrink or become better organized. Over the next 18 months, 50% of end-user enterprises surveyed expect to be operating a data warehouse larger than 2TB. Use unsatisfied requirements to justify building data warehousing infrastructure where the development will produce specific business advantages such as reduced costs in carrying inventory, improved operating efficiencies in forecasting and the generation of incremental revenue through customer cross-selling and up-selling.

Growing data complexity and intolerance for latency (delay). Heterogeneous data is abundant in the enterprise, and the number of different data types (relational, flat files, XML, message queues, desktop spreadsheets, etc.) is growing. Daily warehouse refresh (update) rates are expected to advance from 2% of enterprises performing multiple daily updates, and then only infrequently, to 24% performing multiple daily updates or near-real-time refreshes.

Dramatic improvements in hardware price/performance. Data warehousing benchmarks for systems of comparable power show a dramatic decline in price of nearly 68% over a 15-month period. The changing economics of data warehousing include nearly $1 million ($992,288) in savings on database licenses, driven by hardware savings, when comparing the August 9, 2002, and July 29, 2003, TPC-H 300GB benchmarks from IBM. Half the number of processors with twice the power requires only half the number of database licenses.
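The licensing point above is simple arithmetic, and a short sketch may make it concrete. It assumes per-processor database licensing; the processor counts and the per-processor price below are hypothetical illustrations, not the figures from the IBM benchmarks.

```python
def license_cost(processors: int, price_per_processor: float) -> float:
    """Total database license cost under per-processor pricing."""
    return processors * price_per_processor


def percent_decline(old: float, new: float) -> float:
    """Decline from old to new, expressed as a percentage of old."""
    return 100.0 * (old - new) / old


# Hypothetical inputs, chosen only to illustrate the arithmetic.
PRICE_PER_PROCESSOR = 40_000.0        # assumed list price per processor
OLD_PROCESSORS = 32                   # assumed older configuration
NEW_PROCESSORS = OLD_PROCESSORS // 2  # half as many, each twice as powerful

old_cost = license_cost(OLD_PROCESSORS, PRICE_PER_PROCESSOR)
new_cost = license_cost(NEW_PROCESSORS, PRICE_PER_PROCESSOR)

print(f"License savings: ${old_cost - new_cost:,.0f}")
print(f"License cost decline: {percent_decline(old_cost, new_cost):.0f}%")
```

Under these assumptions, halving the processor count halves the license bill whatever the per-processor price is, which is why hardware gains flow through to software costs.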

Innovations in database technology: open source platform, database, parallelism. Large vendors such as IBM, Oracle and Hewlett-Packard are shipping open source platforms for data warehousing using standard relational databases. Approximately 8% of survey respondents have deployed an open source database such as PostgreSQL or MySQL, with another 14% considering doing so. Innovations in database design from Netezza (data warehousing appliance), Sybase (IQ server) and DATAllegro (massively parallel processing on open source) are driving enterprise-class data marts.

Because of these key drivers, continuing and emerging trends to watch in 2005 include:

Integrated BI (active data warehousing). Firms already operating a first- or second-generation data warehouse and addressing decision-support issues (such as inventory reduction on the product side or lifetime customer value on the customer side) will invest in third-generation, active data warehousing to attain integrated BI. Third-generation data warehouses go beyond forecasting and decision support with their high volumes, large numbers of users and complex workloads to close the loop from the data warehouse to the transactional system. This bidirectional process results in integrated BI. The Forrester Data Warehousing E-mail Survey from April 2004 shows that active data warehousing is getting traction. Approximately 17% of respondents claim to have deployed an active data warehouse and another 13% have one in design. Still, the vast majority, or 70% of respondents, do not have an active data warehouse.
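The closed loop described above can be sketched in a few lines. This is a minimal illustration, not a description of any vendor's product: the class names, the reorder-point rule (twice the average sale size) and the sample data are all hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TransactionalSystem:
    """Hypothetical stand-in for an operational (OLTP) system."""
    sales_log: List[Tuple[str, int]] = field(default_factory=list)
    reorder_points: Dict[str, int] = field(default_factory=dict)

    def record_sale(self, sku: str, units: int) -> None:
        self.sales_log.append((sku, units))


@dataclass
class ActiveWarehouse:
    """Hypothetical warehouse that loads history and pushes decisions back."""
    history: Dict[str, List[int]] = field(default_factory=dict)

    def load(self, source: TransactionalSystem) -> None:
        # Transactional system -> warehouse: the conventional direction.
        for sku, units in source.sales_log:
            self.history.setdefault(sku, []).append(units)
        source.sales_log.clear()

    def push_decisions(self, target: TransactionalSystem) -> None:
        # Warehouse -> transactional system: the step that closes the loop.
        # Assumed rule: reorder point is twice the average sale size.
        for sku, units in self.history.items():
            average = sum(units) / len(units)
            target.reorder_points[sku] = math.ceil(2 * average)


oltp = TransactionalSystem()
oltp.record_sale("widget", 10)
oltp.record_sale("widget", 14)
oltp.record_sale("gadget", 3)

warehouse = ActiveWarehouse()
warehouse.load(oltp)
warehouse.push_decisions(oltp)
print(oltp.reorder_points)
```

The second method is what distinguishes this from a first- or second-generation warehouse in the sketch: the analysis result is written back where operational processes can act on it, rather than stopping at a report.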

More data warehousing for the dollar. The dramatic improvements in the price/performance of data warehousing infrastructure are the result of technology innovations. Technology innovation is both a cause and an effect. As a result, data warehouses will be able to handle more complex workloads and higher volume points for more users. This continues the well-established trend of shipping additional functionality (ETL, data mining, OLAP) integrated with the underlying database at either no extra charge or a nominal incremental cost.

More competition at the high end: very large data warehouses. The net result of growing data volumes and technology innovations is increased competition at the high end of the market in the very large data warehousing (VLDW) sector. It is said that imitation is the highest form of flattery; thus, Netezza is compared with Teradata, though the implementation details are different. Teradata is quick to point out that Netezza's claims have not been validated by a single audited benchmark such as the TPC-H. This is not an issue for Sybase, which has now submitted a 1TB TPC-H data warehousing benchmark with two out of three metrics showing close to an order-of-magnitude improvement. The trouble is that the system is a tad underpowered in its current implementation. Oracle has reported a 3TB benchmark on HP Integrity Superdome (Sept. 25) that boasts the best price/performance ($109 per QphH) and highest composite power-throughput (45,247 QphH) in its volume point class. IBM and Oracle are still the only vendors that have posted a 10TB benchmark.

Data mart consolidation advances. Data mart consolidation is reaching full throttle. During the past two years, it has been driven by the need to cut costs and attain intelligent information integration. Without a consistent and integrated data warehouse design, the benefits of data mart consolidation will be limited to additional operational efficiencies. However, benefits are achievable in either case.

Service-oriented architecture (SOA) makes room for data warehousing. Rolling data warehousing into your overall SOA is critical for enabling integrated BI. Data warehousing raises the bar on SOA, requiring service interfaces to expose data access, data transformation, aggregation, reporting and a host of related decision support services. Widespread deployment of SOA for data warehousing will be driven by the need for integrated BI, follow a different migration path than Web services and grow slowly but steadily as standards and vendor support develop over the next five years.
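To make the idea of service interfaces for data warehousing more concrete, here is a minimal sketch of a contract covering data access, transformation, aggregation and reporting. Every name in it is hypothetical, and the in-memory implementation merely stands in for a real warehouse behind a service endpoint.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

Row = Dict[str, object]


class WarehouseService(ABC):
    """Hypothetical service contract for decision-support operations."""

    @abstractmethod
    def query(self, table: str) -> List[Row]:
        """Data access: return the rows of a table."""

    @abstractmethod
    def transform(self, rows: List[Row], fn: Callable[[Row], Row]) -> List[Row]:
        """Data transformation: apply fn to every row."""

    @abstractmethod
    def aggregate(self, table: str, group_by: str, measure: str) -> Dict[object, float]:
        """Aggregation: sum a measure per group."""

    @abstractmethod
    def report(self, table: str, group_by: str, measure: str) -> str:
        """Reporting: render the aggregate as text."""


class InMemoryWarehouseService(WarehouseService):
    """Toy implementation backed by Python lists, for illustration only."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def query(self, table: str) -> List[Row]:
        return list(self._tables[table])

    def transform(self, rows: List[Row], fn: Callable[[Row], Row]) -> List[Row]:
        return [fn(row) for row in rows]

    def aggregate(self, table: str, group_by: str, measure: str) -> Dict[object, float]:
        totals: Dict[object, float] = {}
        for row in self._tables[table]:
            key = row[group_by]
            totals[key] = totals.get(key, 0.0) + float(row[measure])
        return totals

    def report(self, table: str, group_by: str, measure: str) -> str:
        totals = self.aggregate(table, group_by, measure)
        return "\n".join(f"{key}: {value:.2f}" for key, value in sorted(totals.items()))


service = InMemoryWarehouseService({
    "sales": [
        {"region": "east", "amount": 100},
        {"region": "west", "amount": 50},
        {"region": "east", "amount": 25},
    ]
})
print(service.report("sales", "region", "amount"))
```

Separating the abstract contract from the implementation is the design choice that matters here: callers depend only on the service interface, so the warehouse behind it can change without touching the consuming applications.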