Could you please supply a definition for "exploration data mart"?

Information Management Online, March 1, 2007

Chuck Kelley, Tom Haughey

Question: Could you please supply a definition for "exploration data mart"?

Chuck Kelley's Answer: To me, an exploration data mart is a subset of a data mart. Data marts are generally defined in the structure of the tool that best suits the needs of the business: snowflake, star or multidimensional. Exploration data marts serve the tools that lean more toward data mining. They are generally, but not always, built with very atomic data and in a very denormalized form (very few tables with lots of duplicate data).
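To make "atomic and denormalized" concrete, here is a minimal sketch that flattens a tiny star schema into one wide table. The table and column names (sales_fact, product_dim, exploration_flat) are hypothetical and chosen only for illustration, and SQLite stands in for whatever platform would actually hold the data.

```python
import sqlite3

# Hypothetical star schema: one fact table and one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE sales_fact (sale_id INTEGER PRIMARY KEY, product_id INTEGER, amount REAL);
    INSERT INTO product_dim VALUES (1, 'Cola', 'Beverage'), (2, 'Chips', 'Snack');
    INSERT INTO sales_fact VALUES (10, 1, 1.50), (11, 1, 1.50), (12, 2, 2.25);
""")

# Flatten to a single table at the atomic (one row per sale) grain.
# The product attributes are repeated on every row, which is the
# duplication a denormalized design accepts.
conn.execute("""
    CREATE TABLE exploration_flat AS
    SELECT f.sale_id,
           f.amount,
           d.name     AS product_name,
           d.category AS product_category
    FROM sales_fact AS f
    JOIN product_dim AS d ON d.product_id = f.product_id
""")

flat_rows = conn.execute("""
    SELECT sale_id, amount, product_name, product_category
    FROM exploration_flat
    ORDER BY sale_id
""").fetchall()

for row in flat_rows:
    print(row)
```

A mining tool can then read exploration_flat directly, with no joins at analysis time, at the cost of storing the product attributes once per sale.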

Tom Haughey's Answer: An exploration data mart is a data mart used for analytics in general, and especially for unstructured or semistructured analytics. Examples are mathematical functions such as statistical analysis or data mining. Exploration work does not usually fit the pattern of facts and dimensions, and usually does not benefit from being presented with facts and dimensions. Generally a variable list is used to slice or extract the data, with each variable defined in the list. In the case of data mining, the algorithms are looking for patterns of usage across instances of data. This form of analysis is very data-intensive and, for performance purposes, is often offloaded from the DW platform. Other applications of this could be market segmentation or sampling. I have seen subsets of the DW data offloaded to a data mart to facilitate marketing analyses. An example of a subset could be one with most of the columns but only 20 percent of the rows. Such a smaller subset is just easier to analyze.
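The variable-list extraction and the 20 percent subset can be sketched as follows. This is only an illustration under stated assumptions: the function name extract_sample, the sample rows and the column names are all hypothetical, and simple random sampling is just one of several ways such a subset might be drawn.

```python
import random

def extract_sample(rows, variable_list, fraction=0.2, seed=0):
    """Keep only the listed variables for a random fraction of the rows."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    sample_size = round(len(rows) * fraction)
    sampled = rng.sample(rows, sample_size)
    return [{name: row[name] for name in variable_list} for row in sampled]

# Hypothetical warehouse rows; "notes" is a column the analysis does not need.
warehouse_rows = [
    {
        "customer_id": i,
        "region": "East" if i % 2 else "West",
        "spend": 10.0 * i,
        "notes": "n/a",
    }
    for i in range(100)
]

# The variable list names each column to carry into the data mart.
variable_list = ["customer_id", "region", "spend"]

mart_rows = extract_sample(warehouse_rows, variable_list)
print(len(mart_rows), "rows,", len(mart_rows[0]), "columns")
```

Here the subset keeps three of the four columns and 20 of the 100 rows, so the analysis runs against a much smaller extract away from the warehouse itself.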

Chuck Kelley is an internationally known expert in database and data warehousing technology. He has 30 years of experience in designing and implementing operational/production systems and data warehouses. Kelley has worked in some facet of the design and implementation phase of more than 50 data warehouses and data marts. He also teaches seminars, has co-authored four books on data warehousing and has been published in many trade magazines on database technology, data warehousing and enterprise data strategies. He can be contacted at chuckkelley@usa.net.

Tom Haughey is the president of InfoModel LLC, a training and consulting company specializing in data warehousing and data management. He has worked on dozens of database and data warehouse projects for more than two decades. Haughey was formerly CTO for Pepsi Bottling Group and director of enterprise data warehousing for PepsiCo. He may be reached at (201) 337-9094 or via e-mail at tom.haughey@InfoModelUSA.com.
