Conceptual Collaboration and Scalability

  • August 01 2004, 1:00am EDT

Having cut my teeth early on as a developer of optimizing, parallelizing compilers for high-performance supercomputers, I have a soft spot in my heart for any innovative business use of parallelism. Although the bottom dropped out of the supercomputer market around the end of the Cold War, lessons learned from the heyday of "big iron" continue to reverberate and trickle down to the desktop.

Parallelism itself is not particularly new to the business intelligence (BI) world - high-end platforms have always had a niche in the BI universe, especially when considering the processing and storage requirements for data warehouses and OLAP applications. Parallelism has not been ignored in the data warehousing space; quite the contrary is true. There are some well-known products in place at many Fortune 500 companies, especially in the extract, transform and load (ETL) space, that were developed by companies started by refugees from the supercomputing world; and many data mining tools were developed for high-performance platforms.

I am pleased to see an interesting trend emerging: the coordination of resource management and use is melding with established technology to reduce the costs of high-end computing and bring scalable high performance to the desktop.

Over the past few years, a consortium of industry organizations has been consolidating the protocols, roles and responsibilities related to collaborative high-end computing, all aggregated under the heading of "grid computing." Grid computing provides for collaborative sharing of distributed resources through a series of protocols for registering services, service/resource management, multiprocess coordination, and authentication and security for participants belonging to virtual organizations, which may span multiple administrative domains. The intention was to collect and standardize these protocols to better enable more organizations to exploit shared resources. The result is the ability to better effect both intra- and inter-organizational sharing and collaboration which, in turn, could better use available resources in a predictable way. Another benefit is the ability to transform a collection of desktop machines into a virtual parallel supercomputer.

To me, this is a great development that in some ways lowers the barrier to entry, especially in terms of cost, for companies that want to use the kinds of parallel platforms that I helped develop 10 years ago. Architectural platforms similar to those that were very expensive in the late 1980s and early 1990s can now be had for the cost of a number of workstations, some networking infrastructure, some open-source software and a couple of system and software engineers to tie it all together. Moreover, what I like more and more about the grid computing world is not so much the technology as the rapid pace at which grid computing services vendors are adapting the technology to potential business value.

Here's a case in point. One year ago, I assembled a research report on grid computing, and I reviewed a number of grid software providers to evaluate the current state of the market. At the time, those companies focused on the infrastructure componentry - hooking up hardware and network-oriented process management. All of this was mostly geek-talk; it was less business-oriented. Within the past month or so, however, I revisited some of these companies' Web sites. I was surprised to see how the marketing spin now focuses on the business applications enabled by the underlying technology. In fact, one company's site mostly talks about data integration, with barely a mention of the grid technology that underlies the company's offerings.

The ramifications of this case are critically relevant to the future of business intelligence. More and more (mostly transactional) information is captured and archived; and with new initiatives associated with supply chain management (e.g., RFID) as well as a growing focus on customer data integration, there will be a corresponding need for computing power to massage, manipulate and present that information. Couple that with the desire of smaller organizations to develop BI programs (especially on a shoestring), and grid computing seems to be showing up at just the right time.

High-profile announcements by industry giants such as IBM and Oracle herald the dawn of a new day filled with coordinated sharing of networked resources as the underlying fabric of a services-based architecture. There is a grand ballyhoo about the investment these companies are making in tying their future to grid computing. While the message seems to overwhelm the promise right now, I am confident that the day when grid computing is quietly ubiquitous in business intelligence programs is not too far off.
