Business intelligence (BI) solutions, including data mining, data warehousing and data marts, are exploding in size as they evolve into mission-critical workloads. According to the Palo Alto Management Group, a leading BI consulting firm, the average data warehouse will grow 36 times in size over the next three years. The mainframe has long been famous for its capacity and ability to process large volumes of data with ease. As companies reengineer their businesses around BI solutions, they are increasingly evaluating the value the mainframe can bring to data warehousing. Early data warehouse implementations were often developed on less expensive, dedicated systems that needed to be accessed only eight or ten hours a day. These systems were limited in scope and size, with little impact on day-to-day business should there be an outage.
As BI applications surge into companies' business operations areas, these limitations create intolerable conditions when problems arise, resulting in costly outages. Windows for system and data maintenance are disappearing as companies demand constant access to data to meet global business requirements.
As a result, companies are beginning to deliver BI applications on the mainframe to harness its inherent availability, scale and processing strengths.
Around-the-Clock Application Availability
Mainframes have long delivered exceptional system availability. Customers frequently run their mainframes without outages for years; the average mainframe mean time to system failure is 25 to 30 years.
The mainframe boasts industry-leading application availability as well. Mainframes' hardware and software designs permit end users to access data while it is being backed up, reorganized and migrated to replacement hardware (both disk and processor). Even underlying software (both system and application) can be changed or upgraded while the applications continue running.
As parallel processing technology has matured, it has been integrated into mainframe environments as well. Mainframe systems can be easily clustered, permitting up to 32 systems, each with up to 10 processors, to share a query workload. This non-uniform memory access (NUMA) based massively parallel processing (MPP) environment is an attractive architecture for parallel solutions. Clustering also isolates applications from normal planned outages by permitting work to flow from system to system as individual components within the complex are serviced, upgraded or replaced. Incremental capacity can be added at any time, allowing businesses to avoid costly migrations to new systems when more capacity is needed. Additionally, mainframe servers have the flexibility to support parallel databases, which are critical for fast, efficient access to large amounts of data.
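The clustering arithmetic above can be illustrated with a small sketch. This is a hypothetical, generic example of dividing a table scan evenly across workers, not a mainframe API; only the node and processor counts come from the text.

```python
# Illustrative sketch only -- the 32-node, 10-processor-per-node figures come
# from the text, but the partitioning function is a generic example.
def partition(total_rows, nodes=32, cpus_per_node=10):
    """Split a scan of total_rows as evenly as possible across all workers."""
    workers = nodes * cpus_per_node          # 320 parallel workers at full scale
    base, extra = divmod(total_rows, workers)
    # The first `extra` workers each take one additional row.
    return [base + (1 if i < extra else 0) for i in range(workers)]

shares = partition(1_000_000)
```

Because no worker's share differs from any other's by more than one row, the elapsed time of the scan is governed by the per-worker share rather than the total table size.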
Business-Related Workload Controls
Business intelligence workloads are characterized by their unpredictability. Less mature systems frequently have difficulty sorting and executing work that ranges from very small queries to large, complex ones. Due to their multi-workload heritage, mainframe operating and management systems have evolved to allocate system resources based on business priority rather than the order in which queries arrive. This capability prevents large queries from monopolizing the system and ensures that high-priority queries complete with consistent response times.
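A minimal sketch of the idea, assuming nothing about the mainframe's actual workload manager: queries are dispatched by business priority rather than by arrival order.

```python
import heapq

# Hypothetical illustration only -- dispatch by business priority
# (lower number = more important), not first-come-first-served.
class WorkloadManager:
    def __init__(self):
        self._queue = []
        self._arrival = 0   # tie-breaker keeps arrival order within a priority

    def submit(self, priority, query):
        heapq.heappush(self._queue, (priority, self._arrival, query))
        self._arrival += 1

    def next_query(self):
        return heapq.heappop(self._queue)[2]

wm = WorkloadManager()
wm.submit(3, "month-long trend scan")      # large, low priority, arrives first
wm.submit(1, "executive dashboard query")  # small, high priority, arrives later
```

Even though the large scan arrived first, `next_query()` returns the executive dashboard query first, mirroring the business-priority dispatch described above.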
This sophisticated workload management capability also permits the integration of a variety of workloads on the same mainframe server. This optimizes the utilization of installed capacity; as one workload's processing demands diminish, the capacity is automatically available to other applications on the system. This avoids islands of unused processing cycles that are common when applications are isolated on separate servers.
A key metric in data warehousing solutions is the ability of the system to "scale linearly." Linear scaleup in data volumes means that when the data doubles, the elapsed query time to scan all the data should also double (with processing capacity held constant). Many servers have shown they are able to scale linearly when either the data or the processor capacity is increased. However, only mainframes have demonstrated better-than-linear capability for query workloads. More importantly, mainframes deliver consistent system throughput when operating at 100 percent utilization; other platforms typically run workloads at 50-60 percent of server capacity in order to accommodate peaks. This unique ability to operate consistently at high utilization is a result of the mainframe's patented memory management capabilities, such as virtual storage, expanded memory, dataspaces and block paging. As a result, mainframes have not been plagued by the same data access issues that are forcing other platforms to adopt a 64-bit architecture.
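The linear-scaleup metric can be expressed as a simple check. This is an illustrative formula, not an industry benchmark: compare the actual elapsed-time growth against the ideal proportional growth.

```python
def scaleup_ratio(time_small, data_small, time_large, data_large):
    """Actual / ideal elapsed-time growth, capacity held constant.
    1.0 means linear scaleup; below 1.0 is better than linear."""
    ideal_time = time_small * (data_large / data_small)
    return time_large / ideal_time

# Doubling the data: a perfectly linear system takes exactly twice as long.
linear = scaleup_ratio(100, 1, 200, 2)   # exactly linear
better = scaleup_ratio(100, 1, 190, 2)   # better than linear
```

A system that finishes the doubled scan in 190 seconds instead of the ideal 200 scores below 1.0, the "better than linear" behavior the article attributes to mainframe query workloads.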
Less Costly Implementations
Mainframes have historically been characterized as very costly platforms for BI solutions. However, with improvements in chip technology resulting in faster and cheaper hardware, that has changed. Due to the development of CMOS technology-based servers, mainframe hardware costs have dropped dramatically, by over 70 percent since the early 1990s, resulting in a more cost-effective solution to deliver business value. In 1997, an independent consultant, International Technology Group (ITG), published the results of a year-long survey of 273 mainframe and UNIX sites (ranging in size from small to large organizations) for transaction systems as well as intensive query systems. The results clearly demonstrated that the operational costs of centralized and distributed UNIX servers were substantially higher than those of the mainframe. (Source: "Cost of Scalability: a Comparative Study of Mainframe and UNIX Server Installations" by International Technology Group, Mountain View, CA.) See Figure 1.
Gartner Group concurs with this data, adding that the "mainframe has the advantage of the economy of scale, and can run large numbers of users efficiently." (Source: Research Note/Key Issue Analysis/16 July 1997.)
Economical Disk Sizes
A key consideration in any BI solution is the amount of disk required. By implementing compression algorithms within mainframe hardware, data can be compressed to as little as 50 percent of its original raw size, significantly reducing the amount of physical disk required and avoiding expensive software alternatives. This also improves overall response time by substantially cutting the number of I/Os while increasing buffer hit ratios. Through compression and the high reliability of mainframe disk subsystems, disk-to-raw-data ratios for mainframe-based implementations tend to range from 2GB of disk per 1GB of raw data (2:1) to as much as 3.5:1.
In contrast, UNIX servers for data warehousing require significantly higher disk-to-data ratios to deliver adequate performance; many systems range from 5GB of physical disk for every 1GB of raw data (5:1) to as much as 9:1. For mainframes, the difference translates into substantial savings in floor space, environmentals and cost. Finally, backing up fewer disks results in lower infrastructure costs.
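The disk-to-raw-data arithmetic works out as follows. The ratios are the ones cited above; the function itself is merely illustrative.

```python
def physical_disk_gb(raw_gb, ratio):
    """Physical disk needed for raw_gb of data at a given disk-to-data ratio."""
    return raw_gb * ratio

RAW = 1_000  # 1TB of raw warehouse data, expressed in GB

# Ratios cited in the article: mainframe 2:1 to 3.5:1, UNIX 5:1 to 9:1.
mainframe = (physical_disk_gb(RAW, 2.0), physical_disk_gb(RAW, 3.5))
unix      = (physical_disk_gb(RAW, 5.0), physical_disk_gb(RAW, 9.0))
```

Even comparing the worst mainframe ratio (3.5:1) against the best UNIX ratio (5:1), the mainframe configuration needs 1.5TB less physical disk per terabyte of raw data; at the extremes the gap widens to 7TB.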
Another important consideration when evaluating data warehouse platforms is the movement or porting of data between multiple systems and databases. With a large portion of the world's corporate data already stored on mainframe systems, hosting warehouse applications on the mainframe avoids wasting processing cycles to translate and move the data to other platforms. Additionally, other systems generally lack the mainframe's infrastructure to control the large volumes of data found in data warehouses. The vast selection of automation tools and established procedures found in the mainframe environment simplify the mundane tasks associated with managing such large volumes of data.
As businesses rely more heavily on data warehouses to drive operations, controlling access to the information grows more critical. Web-enabling data warehouses is the latest trend as mobile employee environments become more common. A secure server that can protect sensitive corporate information from unauthorized access is paramount to ensure business survival.
Mainframes deliver the system integrity and security to create a secure platform for deploying BI applications, including protecting business data from unauthorized access from both external threats (competitors) and internal threats (employees who don't have a "need to know"). Mainframe security is integrated as a set of services which combine user authentication, access control and auditing services with the functions needed to support the emerging "Web-enabled" mobile access environment. The inclusion of firewall technologies, secure communication protocols and digital certificate user authentication help to ensure the security of BI applications conducted in the Web-enabled environment. Augmenting mainframe security, BI applications that require data encryption can also benefit from the mainframe's integrated cryptographic co-processors which house encryption keys within a secure tamper-resistant hardware boundary.
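The combination of access control and auditing can be sketched in a few lines. This is a toy model of the "need to know" principle, not the mainframe's actual security services: a user may read a resource only if one of their roles is granted on it, and every decision is recorded.

```python
# Toy "need to know" model -- the roles, resources and audit log are
# hypothetical; real mainframe security services are far richer.
ACL = {"sales_mart": {"analyst", "sales_manager"}}
AUDIT = []

def can_read(user, roles, resource):
    """Grant access only when the user holds a role listed on the resource."""
    allowed = bool(ACL.get(resource, set()) & roles)
    AUDIT.append((user, resource, "granted" if allowed else "denied"))
    return allowed
```

An employee with the analyst role can read the sales mart; an intern cannot, and the denial is left in the audit trail for later review.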
Data warehouses and business intelligence solutions will continue to grow in size, scope and importance, compelling companies to look to the mainframe. The mainframe offers a number of advantages for implementing BI applications, ranging from high levels of application availability, system workload control and infrastructure benefits to cost efficiencies. Together, these strengths make mainframes an attractive choice for meeting business demands for effective BI solutions.