MAR 1, 2007 1:00am ET

Could you please supply a definition for "exploration data mart"?

Question: Could you please supply a definition for "exploration data mart"?

Chuck Kelley's Answer: To me, an exploration data mart is a subset of a data mart. Data marts are generally structured to suit the tool that best meets the needs of the business: snowflake, star or multidimensional. Exploration data marts serve tools that lean more toward data mining. They are generally, but not always, built with very atomic data and in a very denormalized form (very few tables with lots of duplicate data).
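To make "very atomic data in a very denormalized form" concrete, here is a minimal Python sketch that flattens a small star schema into one wide table. The table names, column names and sample values are all hypothetical and chosen only for illustration; a real exploration mart would normally be built with SQL or an ETL tool rather than in-memory dictionaries.

```python
# Hypothetical star schema: one fact table keyed to two dimension tables.
customers = {
    1: {"customer_name": "Acme", "region": "East"},
    2: {"customer_name": "Globex", "region": "West"},
}
products = {
    10: {"product_name": "Widget", "category": "Hardware"},
    11: {"product_name": "Gizmo", "category": "Hardware"},
}
sales_facts = [
    {"customer_id": 1, "product_id": 10, "quantity": 3, "amount": 30.0},
    {"customer_id": 1, "product_id": 11, "quantity": 1, "amount": 25.0},
    {"customer_id": 2, "product_id": 10, "quantity": 2, "amount": 20.0},
]


def denormalize(facts, customer_dim, product_dim):
    """Copy dimension attributes onto every atomic fact row.

    The result is a single wide table: no joins are needed to read it,
    at the cost of repeating dimension values on many rows.
    """
    flat_rows = []
    for fact in facts:
        row = dict(fact)  # keep the atomic grain: one output row per fact
        row.update(customer_dim[fact["customer_id"]])
        row.update(product_dim[fact["product_id"]])
        flat_rows.append(row)
    return flat_rows


exploration_rows = denormalize(sales_facts, customers, products)
for row in exploration_rows:
    print(row)
```

Note that "Acme" and "East" are repeated on two rows. That duplication is the trade-off Kelley describes: fewer tables and simpler scans for a mining tool, in exchange for redundant data.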

Tom Haughey's Answer: An exploration data mart is a data mart used for analytics in general, and especially for unstructured or semistructured analytics. Examples are mathematical functions such as statistical analysis or data mining. Exploration work does not usually fit the pattern of facts and dimensions, and usually does not benefit from being presented with facts and dimensions. Generally a variable list is used to slice or extract the data, with each variable defined in the list. In the case of data mining, the algorithms look for patterns of usage across instances of data. This form of analysis is very data-intensive and, for performance purposes, is often offloaded from the DW platform. Other applications could be market segmentation or sampling. I have seen subsets of the DW data offloaded to a data mart to facilitate marketing analyses. An example of a subset could be one with most of the columns but only 20 percent of the rows. A smaller subset like this is simply easier to analyze.
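The extraction Haughey describes, a variable list applied to a fraction of the rows, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `extract_subset`, the column names and the generated sample rows are hypothetical, and simple random sampling is just one of several ways to pick the 20 percent.

```python
import random


def extract_subset(rows, variable_list, row_fraction, seed=0):
    """Return a random sample of rows, keeping only the listed variables.

    rows          -- list of dicts, one per warehouse row
    variable_list -- names of the columns to carry into the mart
    row_fraction  -- share of rows to keep, e.g. 0.20 for 20 percent
    seed          -- fixed seed so the same sample can be drawn again
    """
    rng = random.Random(seed)
    sample_size = round(len(rows) * row_fraction)
    sampled = rng.sample(rows, sample_size)
    return [{name: row[name] for name in variable_list} for row in sampled]


# Hypothetical warehouse extract: 100 rows, 5 columns.
warehouse_rows = [
    {
        "customer_id": i,
        "region": "East" if i % 2 else "West",
        "orders": i % 7,
        "revenue": float(i * 10),
        "load_batch": "b1",  # housekeeping column the analysts do not need
    }
    for i in range(1, 101)
]

# Most of the columns, but only 20 percent of the rows.
variable_list = ["customer_id", "region", "orders", "revenue"]
mart_rows = extract_subset(warehouse_rows, variable_list, row_fraction=0.20)

print(len(mart_rows), "rows,", len(mart_rows[0]), "columns")
```

Fixing the seed keeps the sample reproducible, which matters when an analysis run against the subset has to be repeated or audited later.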

Chuck Kelley is an internationally known expert in database and data warehousing technology. He has 30 years of experience in designing and implementing operational/production systems and data warehouses. Kelley has worked in some facet of the design and implementation phase of more than 50 data warehouses and data marts. He also teaches seminars, has co-authored four books on data warehousing and has been published in many trade magazines on database technology, data warehousing and enterprise data strategies. He can be contacted at chuckkelley@usa.net.

Tom Haughey is the president of InfoModel LLC, a training and consulting company specializing in data warehousing and data management. He has worked on dozens of database and data warehouse projects for more than two decades. Haughey was formerly CTO for Pepsi Bottling Group and director of enterprise data warehousing for PepsiCo. He may be reached at (201) 337-9094 or via e-mail at tom.haughey@InfoModelUSA.com.
