In the coming year, data warehousing trends will be driven by factors obvious and less obvious. Among the obvious drivers are data volumes that continue to grow by leaps and bounds, increasing intolerance for latency in getting answers to business questions, and the expanding diversity and complexity of data types. Among the less obvious drivers are:
- Unmet end-user requirements for dashboards, balanced scorecards and interfaces that simplify understanding of complex business metrics. In order to present a simple metric, significant upstream data integration is required; the data warehouse (DW) is often on the critical path.
- Master data management (MDM). As soon as you rationalize master data around customers, products, vendors, employees and other key entities, you have the key dimensions and infrastructure to gather facts about them (yet another definition of data warehousing).
- The need to calculate customer lifetime value, which requires aggregating a lifetime of transactions (again, what the DW does).
- Information quality. As soon as an enterprise sets out to assess and improve its IQ, it must rationalize the use of data as a corporate asset; that invites rigorous attention to the DW function.
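The lifetime-value driver in the list above is, at bottom, a full-history rollup. A minimal sketch of that aggregation, assuming an illustrative record layout (customer ID and transaction amount) rather than any particular warehouse schema:

```python
# Sketch of the lifetime-value aggregation: summing a customer's full
# transaction history into one metric. The record layout and amounts
# are illustrative assumptions, not a real schema.

from collections import defaultdict

transactions = [
    # (customer_id, amount)
    ("C001", 120.00),
    ("C001", 75.50),
    ("C002", 310.00),
    ("C001", 42.25),
]

def lifetime_value(txns):
    """Aggregate every transaction per customer -- the kind of
    full-history rollup a data warehouse exists to serve."""
    totals = defaultdict(float)
    for customer, amount in txns:
        totals[customer] += amount
    return dict(totals)

print(lifetime_value(transactions))
```

In a warehouse this would be a GROUP BY over the transaction fact table; the point is that only a store holding the lifetime of transactions can answer it at all.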
Given these drivers, the following data warehousing trends will be in the foreground:
Simplicity, Simplicity, Simplicity
End-user enterprises neither want nor have the expertise to undertake the labor-intensive, iterative process of balancing computing power, disk I/O and network capacity. Therefore, preconfigured data warehousing appliances, predefined quasi-appliances and balanced configuration systems will gain even more market traction, reaching $2.5 billion in 18 months (approximately 20 percent of the overall DW market), with the majority of those dollars going to large, established, late-arriving major innovators rather than the original upstart, proprietary ones.
The appliance as a trend has reached takeoff speed and critical mass - but beware. The amount of confusion caused by proprietary messages and technology is on the rise. The system must be able to preserve existing investments and have a well-defined roadmap for growth - for example, based on an operating system with a future such as Linux, UNIX or Windows. No one wants to buy yet another version of the relational database; buyers will gravitate toward the tried-and-true standard that can perform both business intelligence (BI) and transactional workloads. Going forward, you only need one kind of database.
As systems become larger, more complex and faster, the need for self-managing, self-tuning and self-understanding looms large. Smart tools to recommend optimal indexing (or not), data access and balanced partitioning will become even smarter and more autonomic (self-ruled). However, because human oversight and responsibility are required for legal accountability, autonomic computing will function like a co-pilot or assistant rather than a replacement for the database administrator or final decision-maker.
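The index-recommendation idea above can be illustrated with a deliberately simple heuristic: count how often each column appears in query predicates and suggest an index only past a usage threshold. Real autonomic advisors weigh cost models, cardinality and update overhead; the threshold and workload below are assumptions for illustration only.

```python
# Toy sketch of "recommend optimal indexing (or not)": tally predicate
# columns across a workload and suggest indexes for frequent ones.
# The threshold of 3 is an illustrative assumption; production
# advisors use far richer cost models.

from collections import Counter

def recommend_indexes(predicate_columns, threshold=3):
    """Suggest indexing columns filtered on at least `threshold` times.
    As the article notes, a human DBA still makes the final call."""
    usage = Counter(predicate_columns)
    return sorted(col for col, n in usage.items() if n >= threshold)

# Hypothetical workload: columns seen in WHERE clauses.
workload = ["order_date", "customer_id", "order_date",
            "order_date", "customer_id", "region"]
print(recommend_indexes(workload))
```

The co-pilot framing is the key design choice: the tool surfaces a recommendation, and accountability stays with the administrator who accepts or rejects it.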
Performance, Performance, Performance
The advantage will go to those data warehouses with a high-performing total technology stack - hardware (chip), database, data model and usable business metrics. Again, end-user enterprises will be at risk due to confusing messages that must be disentangled. For example, standard industry benchmarks (www.spec.org) show that the POWER5 (p5) 1.9 GHz chip is generally equal to the Xeon 3.2 GHz chip; those who mistakenly use clock speed (GHz) alone as the measure of power (or receive bad advice to do so) will get an underpowered and underperforming system. The result will be an expensive surprise when an upgrade is needed sooner than expected. Memory management and the data channel (pipeline) are key performance factors. An even more dramatic differentiator is available if you look at the data warehousing benchmark from the Transaction Processing Performance Council (www.tpc.org). Of course, computing power is not everything.
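A back-of-envelope model makes the clock-speed fallacy concrete: effective throughput is roughly clock rate times instructions retired per cycle, so a lower-GHz chip with a wider core can outrun a higher-GHz one. The IPC figures below are hypothetical, chosen purely to illustrate the arithmetic, not measured values for any real chip:

```python
# Back-of-envelope sketch of why GHz alone misleads:
# throughput ~ clock (GHz) x instructions per cycle (IPC).
# The IPC values are hypothetical, for illustration only.

def throughput(ghz, ipc):
    """Rough model: billions of instructions retired per second."""
    return ghz * ipc

chip_a = throughput(1.9, 3.0)   # lower clock, wider core (assumed IPC)
chip_b = throughput(3.2, 1.5)   # higher clock, narrower core (assumed IPC)

print(chip_a > chip_b)  # the "slower" 1.9 GHz chip retires more work
```

This is why system-level benchmarks such as SPEC and TPC, which exercise the whole stack, beat the spec sheet.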
Large DWs are now common. Going forward, the differentiator will be the capability to perform sustained real-time updates. You cannot do this using a database utility, no matter how persistent the trickle. What is needed is memory-to-memory transfer from a message broker or other application server. This is now an operational necessity for sustained real-time update, and those lacking the capability will be busy acquiring and implementing it.
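The memory-to-memory pattern can be sketched minimally: a consumer drains messages from an in-process queue (standing in here for a message broker) and applies them directly to an in-memory table, with no staging files or load utility in the path. The queue, the dict-as-fact-table and the event shape are all illustrative assumptions:

```python
# Minimal sketch of a memory-to-memory real-time feed: drain a queue
# (standing in for a message broker) and upsert updates straight into
# an in-memory "warehouse" table -- no bulk-load utility involved.
# All names and structures are illustrative assumptions.

import queue

broker = queue.Queue()   # plays the role of the message broker
warehouse = {}           # plays the role of the target fact table

def consume(q, table):
    """Apply queued (key, value) updates until the queue is drained."""
    while True:
        try:
            key, value = q.get_nowait()
        except queue.Empty:
            return
        table[key] = table.get(key, 0) + value  # upsert in place

# Producer side: in a real system, events arrive continuously.
for event in [("sku-1", 5), ("sku-2", 3), ("sku-1", 2)]:
    broker.put(event)

consume(broker, warehouse)
print(warehouse)
```

The design point is that updates stay in memory end to end; a batch utility, however frequent its trickle, inserts a file-and-load cycle that caps how "real-time" the warehouse can be.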
Value, Value, Value
Value for the business takes priority even with the geek squad. Building a cross-functional team of IT professionals who understand the business and business professionals who appreciate IT will finally be rewarded with corporate recognition. Innovations in business processes - closing the loop back from the DW to optimize transactional processing - will be even more important than technology innovations. On the business side, sales and marketing will connect the dots between the business question, "Which customers are leaving and why?" and the business intelligence available from the data warehouse. Finance will connect the dots between the question, "Which clients, products and categories are the profit winners and which are the profit losers?" and the consistent, unified view of customer and product master data in the warehouse. Operations will connect the dots between the questions about supplier and procurement efficiency, stock outages, capital risks and reserves, dynamic pricing and the aggregations of transactional data in the warehouse. The result will be an even smarter enterprise.