Continue in 2 seconds

I am currently working on a data mart project where the client is requesting an ad hoc reporting capability.

By
  • Sid Adelman, Larissa Moss, Scott Howard
Published
  • July 26 2002, 1:00am EDT

Q:

Question: I am currently working on a data mart project where the client is requesting an ad hoc reporting capability. After further review, it turns out they really want the ability to do ad hoc querying. I personally feel that allowing users to write and execute their own queries is not a wise idea. My guess is that ad hoc querying can be avoided by designing the multidimensional data model broadly enough to anticipate the measures (and their associated dimensions) that may be required by the business. This broadly designed model will allow users to slice and dice the data in various ways giving users the notion of ad hoc reporting. My question to you is this: 1) Do you agree with this hypothesis? If not, what am I missing? What have I failed to consider? 2) If you do agree, do you know where I can find material that will help me make this case. I’m looking for articles, white papers, etc. that support this idea.

A:

Sid Adelman’s Answer: I disagree with your perception of users’ abilities to write and execute their own queries. First of all, you can’t lump all users into the same category. The power users are quite capable of writing their own queries; however, they do need some discipline in testing and some knowledge of the performance issues of the data warehouse. The smart move, as you suggest, is to create multidimensional models the users can work with and additionally you will want to build a query and report library that would be the basis for the majority of users’ queries and reports.

Scott Howard’s Answer: Yes, users speak their own unique dialects in this case interchanging reporting with querying. We just need to interrogate what they really mean and then diplomatically agree with their terminology (you've successfully done that).

Your hypothesis appears sound. You have two options to solve this:

  1. Construct the broad multidimensional DM as you suggest, but that may (and I stress may because I don't know all the details) introduce complexities for the users because the data mart attempts to be all-encompassing. The resulting model would too broad, thus confusing, and could introduce performance problems due to the size.
  2. Create separate specific data marts limited to the ad hoc need capabilities of a small subset of users more accustomed to this ad hoc requirement. The models will be more focused and thus more intuitive and applicable to the users. The data mart will also be smaller and perform better.

I'd say that your best source is again www.ralphkimball.com. I believe that he has written several articles on the subject over the past two years.

Larissa Moss’ Answer: I do not agree with that idea. The big myth in data warehousing is that all query and all report requirements are patently multidimensional. That is not true. There are plenty of business analysts, statisticians and financial analysts that need granular atomic data (not precalculated by any dimension), and they want it stored in such a way that they can ask any question against that data now and in the future. The only prerequisite they have is that the data is related in the same way as it is being used in business transactions, i.e., operationally. Multidimensional designs don't support this very well. Multidimensional designs support predefined reporting patterns. Sure, the designs can be broad enough to support some ad hoc queries – but still only within the precalculated patterns of facts and dimensions. Sounds like these clients (and they are not alone) don't have predefined reporting requirements for facts and dimensions. There is nothing wrong with building a data mart based on an entity-relationship based relational design, which does support business transactions – if that is what the client requirements call for. As for letting clients write their own ad hoc queries, I support that. One of the many benefits of a DW is supposed to be the empowerment of business people to take care of their own information needs – within the limits of their capabilities, of course. As long as the clients have the required skill set and receive the appropriate amount of training, and as long as the DBMS and the query tool have good governors to monitor and prevent runaway queries, I would empower the clients. Bottom line: While 90 percent (or better) of DW utilization is multidimensional, this client is apparently one of the other 10 percent. I suggest the book Corporate Information Factory by Inmon, Imhoff and Sousa.

Doug Hackney’s Answer: If you are dealing with a broad spectrum business user audience, i.e., one that is not skewed by a high concentration of high-end analysts, then you can typically meet 80 percent of their needs with template-based, thin-client, interactive (selectable where clauses) reporting. This should almost always be your first deliverable. High-end analysts will usually require the ability to do true ad hoc queries, but these people are normally a small percentage of your total user base. Strive to have a wide user base of information consumers using the template report system, it is much more sustainable politically. It is a big political mistake to build a five million dollar system for five analysts. It is not politically sustainable when the personalities change at the top. For the high-end analysts, give them client tool versions of any of the popular query and reporting tools, and be prepared for a high level of support requirements from them (not for the tool, but for the data).

Chuck Kelley’s Answer: One of the major benefits of a data mart is to allow the user community the ability to do ad hoc queries.  I agree it should be a dimensional model, unless the tool used by the user community works best in a different model (relational for example).

Michael Jennings’ Answer: In my experiences, there is always a small group of power users who need to perform analysis on the data warehouse that can not be predicted.  These individuals exist on the business side of the organization but are very technically capable.  The analysis needs of this group of users can be accommodated in a variety of ways.  Many of the leading business analysis product vendors offer applications specifically to accommodate ad hoc user needs.  These products offer the data warehouse administrator the ability to limit the scope of inquiry for the power user’s but provide the user with capabilities such as unions, complex joins, and aggregation.  If the power users and regular reporting users are sharing the same database, governing can be applied to power user request to limit either the amount of data or CPU time or both of their queries.

Additionally, several of the leading reporting tools have quasi  “ad hoc” capabilities that allow users to add/remove columns or calculations to reports plus control data content through prompting.

Joe Oates’ Answer: One of the things that is most appealing about data warehouses and data marts is that the end users can perform their own queries.  However, this is usually done via a front-end presentation tool such as Business Objects, MicroStrategy or Cognos, not by end users writing their own SQL queries.

The ability to slice and dice is determined by the number of dimension tables associated with a fact table.  If there are only one or two dimensions, then slicing and dicing will be very limited.  You do not indicate what kind of data mart you are preparing, but some of the more commonly used dimensions include: time, geography, demographics, behavior scores, product, profession, industry classification and psychographics.

There are many sites that have white papers and articles on data marts.  Here are a few.   http://dmreview.com/, http://www.tdan.com/, http://www.intelligententerprise.com/, http://www.tdwi.org/ and many more.

Clay Rehm’s Answer: I am sorry to disagree with you, however the whole reason for a data warehouse is that “users” can write their own ad hoc queries!  User is such a misused term – there are users who need paper reports handed to them, and there are power users who develop queries and reports for other users, and other user types in between.

The data (and meta data) should be easy and intuitive enough (after proper data training) to use so the IT department can get out of the query and report writing business.  There should be multiple ways to retrieve the data, such as a web site that provides canned reporting, multidimensional tools that provide slicing and dicing, and even programming languages that resemble SQL for the power users.

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access