You recall some of the great debates of our times: Kennedy vs. Nixon, Internet Explorer vs. Netscape, Java vs. ActiveX, and MOLAP vs. ROLAP. Only one of these debates had a clear winner (Kennedy); the others rage on, with no victor declared. The MOLAP vs. ROLAP debate, the one most familiar to those of us enabling our organizations to gain insight through business intelligence tools, has been hot for a while. In 1996 the Gartner Group declared that ROLAP would be the winner; meanwhile, Arbor (with Essbase) and other MOLAP solution providers have been quietly and very successfully proving the opposite.

Now a new debate is commencing, one that promises to contend for the time and attention of data warehouse and business intelligence managers. This debate centers on whether to do data mining on the whole database (the whole enchilada) or on a sample of its records. The issue arises as data warehouses approach multi-terabyte size and retain detail for every source record in the enterprise. Technology has progressed to the point where the computing power exists to mine these larger datasets: where previously we could not mine it all, SMP and MPP solutions now make it possible. That leads some (including vendors of high-performance decision support database engines) to argue that sampling is becoming less critical, if not irrelevant.
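To make the sampling side of the debate concrete, here is a minimal sketch (not from the article itself) of reservoir sampling, a standard technique for drawing a uniform random sample from a table far too large to hold in memory. The function and record names are illustrative assumptions, not any vendor's API.

```python
import random

def reservoir_sample(records, k):
    """Keep a uniform random sample of k records from a stream of
    unknown length, using O(k) memory (classic reservoir sampling)."""
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace an existing element with probability k / (i + 1),
            # which keeps every record equally likely to be sampled.
            j = random.randint(0, i)
            if j < k:
                sample[j] = record
    return sample

if __name__ == "__main__":
    # Simulated warehouse detail records; in practice this would be a
    # cursor over the fact table rather than an in-memory generator.
    stream = (f"record-{n}" for n in range(1_000_000))
    print(reservoir_sample(stream, 5))
```

The appeal of this approach is that one sequential pass over the detail data yields a statistically representative sample, which is exactly the property the pro-sampling camp relies on.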
