Plummeting disk storage costs have made it possible for businesses to store hundreds of terabytes of data. Experts project that within the next three years the size of the average data warehouse will grow by a factor of 24, as businesses store more and more data for use in strategic programs such as customer relationship management (CRM), database marketing, fraud detection, customer attrition and more. Businesses cannot reasonably process such a high volume of data using the power of a single computer. As a result, an increasing number of companies are migrating to multiprocessor systems, now coming into everyday use in banking, retail, telecommunications and other industries that generate massive amounts of transaction-level data. These systems work best with applications that leverage their ability to throw the power of all processors against a single task; in other words, when they "run in parallel."
This shift to a parallel universe was, at first, slow. The parallel database was the first major building block for large-scale information systems. IBM's DB2 UDB, NCR's Teradata, Informix XPS and Oracle Parallel Server can use an arbitrary number of processors to run single SQL queries against large amounts of data. As users grew accustomed to running parallel queries against large databases, it became clear that there was a painful shortage of parallel applications to provide services the database cannot offer. Unfortunately, most legacy applications were not designed to leverage parallel capabilities, and programming new applications required both a lot of time and a mastery of parallel programming resources few IS shops possessed.
In his December column in DM Review, Ken Rudin described a new technology that is helping to fill this gap by enabling the development and execution of high-performance, parallel business intelligence applications. This technology consists of a framework that allows applications to run in parallel by partitioning data among the processors and then streaming the data through multiple, parallel instances of each operation in the application. This "divide and conquer" approach typically increases application performance by a factor equal to the number of processors.
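The partition-and-stream idea can be illustrated with a minimal Python sketch. This is not the framework's actual API, which the column does not show; every name here (`partition`, `keep_large`, `total_by_customer`, `run_parallel`) is hypothetical, and the records are assumed to be simple `(customer_id, amount)` pairs. Threads stand in for processors to keep the example self-contained; a real framework would place each pipeline instance on its own processor.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_partitions):
    """Split records by customer_id so each customer's rows share one partition."""
    parts = [[] for _ in range(n_partitions)]
    for customer_id, amount in records:
        parts[customer_id % n_partitions].append((customer_id, amount))
    return parts


def keep_large(rows, threshold=100.0):
    """First operation: stream through only the rows at or above the threshold."""
    for customer_id, amount in rows:
        if amount >= threshold:
            yield customer_id, amount


def total_by_customer(rows):
    """Second operation: sum the amounts per customer."""
    totals = defaultdict(float)
    for customer_id, amount in rows:
        totals[customer_id] += amount
    return dict(totals)


def run_pipeline(rows):
    """One instance of the whole pipeline; rows stream from filter to aggregate."""
    return total_by_customer(keep_large(rows))


def run_parallel(records, n_workers=4):
    """Run one pipeline instance per partition, then merge the partial results."""
    parts = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(run_pipeline, parts))
    merged = {}
    for partial in partials:
        merged.update(partial)  # safe: customers are disjoint across partitions
    return merged


if __name__ == "__main__":
    sample = [(1, 150.0), (2, 50.0), (1, 200.0), (5, 100.0), (6, 99.0)]
    print(run_parallel(sample))
```

Partitioning on the same key that the aggregation groups by is what lets the partial results be merged without any cross-partition reconciliation; a different partitioning key would require a final combining step.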
This article describes the ways in which this new parallel framework technology is boosting the momentum behind parallel computing, from the framework's effect on business intelligence practices to the ways in which both software and hardware vendors are incorporating this technology into their products and plans.
Broadening the Horizons of Business Intelligence
Businesses creating and executing analytical applications are among the biggest beneficiaries of the parallel shift. These applications allow IS and line-of-business users to discover new business opportunities by analyzing transaction-level data (often bolstering it with behavioral, statistical and demographic information) and using business intelligence programs to uncover strategic relationships between the data. The most successful analytical applications tie the results of these analyses directly to operational processes, with a direct impact on the bottom line. The documented benefits of these programs are astronomical:
- United Airlines expects that its yield management program will generate incremental revenue of at least $50 million annually.
- A major credit card issuer saved $45 million in fraud costs with the results of a pilot test of a neural network model.
- A telecommunications company's analysis of call detail data netted the firm $750 million in revenue over a nine-month period.
With their appetite for large amounts of atomic-level data, multivariable analysis and shorter feedback cycles, these analytical programs demand the power and speed of parallel processing systems. Today, about 75 percent of such systems are developed from scratch by internal IS departments, a process the Gartner Group labels "home-brewed." Gartner projects that as the complexity of these systems increases, users will abandon "home-brewed" systems for solutions that provide either the analytic or parallel function, or both. A version of the new parallel framework technology actually includes analytical components patented for their unique ability to train in parallel against large data sets. Users adopting this technology are experiencing more rapid return on their investments, often recouping their investment within one year of deployment.
Vendors of business intelligence (BI) software are no longer thinking of parallel performance as a side-order on the business intelligence menu. In the past year alone, business intelligence vendors, including Harte-Hanks, Knowledge Discovery One (KD1), Carleton Europe, The MEDSTAT Group and i.d.Centric, have used the parallel framework technology to develop parallel business intelligence tools and solutions.
Until recently, though, such parallel products were virtually non-existent. The reason: building in parallel capability required that each vendor essentially recreate the [parallel] wheel. Today, vendors can build in parallel capability by including in their software the same parallel framework that makes it possible for IT shops to build parallel applications.
Business intelligence vendors are now in a situation analogous to that which existed when vendors of early software packages labored to put a GUI face on their DOS applications. The arrival of Windows not only set a standard for GUI applications, but provided vendors with a toolset to use in developing the GUI. Those vendors that converted to Windows could speed to market with new products that users found easier to master. Vendors that did not comply with the emerging standard or take advantage of this new toolset found themselves needing more development time and talent than their Windows-compliant competitors. Today, business intelligence vendors can choose whether they will forge their own parallel path or, with history as a guide, take advantage of the parallel toolset contained within the parallel framework technology already available in the marketplace.
Not Just Hardware Anymore
Recognizing that the same technology that enables users to better harness the power of their multiprocessor systems also lowers the price/performance ratio for their systems, the leading vendors of UNIX- and Linux-based systems are themselves helping to move us into the parallel universe. As one of the first vendors to forge a partnership with developers of a parallel framework, IBM has long been in the forefront of these efforts. Recently, NCR joined in with an announcement that it will use this parallel framework to deliver high-performance business intelligence applications that enable customers in specific vertical industries to solve many critical business problems for the first time. The applications, targeted for the financial, retail, telecommunications and transportation sectors, rely on parallel technology to allow users to execute programs that were previously too costly or complex. This year, terabyte-level warehouses in these industries can look forward to NCR's delivery of its parallel solutions, an event that will further hasten the transition from the single-processor to the parallel universe.
At November's Oracle OpenWorld, Intel demonstrated this parallel framework with a data warehousing application running in parallel on an Intel Pentium II Xeon processor-based, four-way server running Linux and Oracle8.05. The framework enabled the application, which accessed 180GB of retail warehouse data from the database, to achieve a throughput of 11.7GB/hour by using all four processors. Users witnessing such performance results from extremely cost-effective technology can readily envision a world in which departmental data marts that once cost millions will become commonplace as their price falls to hundreds and even tens of thousands of dollars.
Your Parallel Universe
The introduction of a parallel framework is already shaping the business intelligence landscape of the twenty-first century. The shift to parallel is affecting business practices, the software we rely on and the hardware we use, bringing with it unprecedented levels of performance and business benefit. To remain competitive, you will have to enter the parallel universe; to stay on top, you will need to lead the way.