Report generation (BI) that searches and retrieves data from very large tables has resulted in slow database response times and is affecting overall database performance. What can be done to improve this besides optimizing the SQL query? What are cost-effective ways to do this for a small firm? (Data warehousing might seem too "big" for a small firm.) I am currently using Oracle.


Larissa Moss’ Answer: I am not sure I completely understand the situation. "Report generation" and "business intelligence" are not synonymous. If this firm has a BI initiative, it would – by definition – include some type of data warehousing in the strategic BI solution. I am therefore assuming that the performance problems are caused by simply generating reports directly from the operational databases. This is a common performance complaint, and Oracle and SQL are not the problem. Databases are designed and tuned differently for data-entry and update purposes than they are for reporting purposes. I would suggest finding the common reporting patterns and building a multidimensional data mart (or oper mart, if used for operational reporting) that stores the pre-calculated facts for the common reporting dimensions. While the fact table may still be a VLDB (although not as big as before, because the facts have been summarized by the chosen dimensions), the dimension tables would not be that big. Oracle (like other DBMSs) has special optimization techniques for multidimensional database designs. Although this is a "data warehousing" solution, that shouldn’t matter as long as it solves the performance problems and removes the reporting function from the operational databases.
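The multidimensional design Moss describes can be sketched concretely. Below is a minimal illustration in Python, with SQLite standing in for Oracle; all table and column names (`dim_period`, `dim_region`, `fact_sales`, `total_amount`) are hypothetical, and the facts are assumed to have been pre-summarized by the load process.

```python
import sqlite3

# A minimal star-schema sketch: one fact table of pre-summarized
# measures keyed to small dimension tables. SQLite stands in for
# Oracle; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_period (period_id INTEGER PRIMARY KEY, month TEXT);
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sales (
        period_id INTEGER REFERENCES dim_period(period_id),
        region_id INTEGER REFERENCES dim_region(region_id),
        total_amount REAL       -- pre-calculated during the load
    );
    INSERT INTO dim_period VALUES (1, '2004-01'), (2, '2004-02');
    INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales VALUES (1, 1, 1200.0), (1, 2, 800.0), (2, 1, 950.0);
""")

# A reporting query now joins the summarized fact table to the small
# dimension tables instead of scanning the operational detail tables.
rows = conn.execute("""
    SELECT p.month, SUM(f.total_amount)
    FROM fact_sales f JOIN dim_period p ON f.period_id = p.period_id
    GROUP BY p.month ORDER BY p.month
""").fetchall()
print(rows)  # [('2004-01', 2000.0), ('2004-02', 950.0)]
```

Because the heavy summarization happens once at load time, each report pays only for a join over small dimension tables and an already-reduced fact table.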

Mike Jennings’ Answer: Look at the fragmentation of the tables involved, including fragmentation of free space in the data dictionary; isolate other fragmentation; and minimize resource contention such as I/O bottlenecks on disk. Investigate whether your reporting tool can take advantage of the fragmentation strategy used by the DBMS (distributing data from a single table across multiple disks to avoid I/O bottlenecks).

Chuck Kelley’s Answer: Oracle can indeed deal with large tables very efficiently. However, if you want to return one million rows to your BI tool, it will take time. There are a lot of tuning techniques that can be applied to your database.

Joe Oates’ Answer: The most cost-effective way to improve performance in the scenario that you mention is to create appropriate aggregate (summary) tables. About two-thirds of data warehouse users can satisfy most of their requirements using aggregate tables. You can accomplish this in Oracle by using materialized views. However, unlike regular views, materialized views actually take storage space. Materialized views can also have indexes to improve the speed of any query against the materialized view.
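The aggregate-table idea can be sketched as follows. Since SQLite has no materialized views, this sketch rebuilds a plain summary table from the detail table and indexes it, which approximates what an Oracle materialized view with a complete refresh would do; the names (`sales_detail`, `sales_by_month_region`) are hypothetical.

```python
import sqlite3

# Sketch: roll a detail fact table up into a summary table (playing
# the role of an Oracle materialized view), then index the summary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_detail (sale_month TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?)",
    [("2004-01", "East", 100.0), ("2004-01", "East", 50.0),
     ("2004-01", "West", 75.0), ("2004-02", "East", 60.0)],
)

# "Refresh" the aggregate: rebuild it from detail, then index it so
# reports against the summary stay fast as it grows.
conn.executescript("""
    DROP TABLE IF EXISTS sales_by_month_region;
    CREATE TABLE sales_by_month_region AS
        SELECT sale_month, region,
               SUM(amount) AS total_amount, COUNT(*) AS n_sales
        FROM sales_detail
        GROUP BY sale_month, region;
    CREATE INDEX ix_sbmr ON sales_by_month_region (sale_month, region);
""")

total = conn.execute(
    "SELECT total_amount FROM sales_by_month_region "
    "WHERE sale_month = '2004-01' AND region = 'East'"
).fetchone()[0]
print(total)  # 150.0
```

In Oracle itself, the `CREATE MATERIALIZED VIEW` statement with a refresh schedule replaces the manual drop-and-rebuild step, and the optimizer can rewrite detail-level queries against the aggregate automatically.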

Clay Rehm’s Answer: This is a situation that calls for reporting tables specifically developed for the reports they are supporting. Each table will need a database structure, metadata and ETL to keep the data current. Try to design the tables so they can be a source for multiple reports.

Les Barbusinski’s Answer: First of all, if your database was not designed with business intelligence in mind, you’re in for a rough time. Specifically, if your database was designed primarily as a historical or operational data store (i.e., utilizing a normalized structure rather than a dimensional or star schema structure), your BI tool will never perform properly (too many joins). Worse yet, if your database is a "data dump" (i.e., the tables in the data mart look exactly like their counterparts in the source system), your BI tool will "up-chuck" (too much work for the BI tool’s semantic layer to handle).

Second, if your database was not designed for the specific BI tool that was chosen by your firm, your performance will always be substandard. For instance, if you’ve designed your mart as a snowflake schema, but you’re using a BI tool that prefers consolidated dimensions (such as Brio or Business Objects), then you’re going to have problems.

Third, consider creating aggregate tables for some of your reports (I’m assuming your BI tool is "agg-aware"). Typically, if you can achieve at least a 3-to-1 reduction in size by aggregating a given fact table, it’s a good idea to do so. For example, if 50 percent of all sales reports are summarized by Month and Region, and aggregating your 5 million row Sales Detail Fact table by Month and Region produces a table with only 500,000 rows, you’re going to realize some real performance improvements.

Fourth, consider using custom-designed fact and/or dimension tables for some of your reports. For example, you could create a custom fact table that combines metrics from several fact tables, or a fact table that is already pre-filtered for a particular report. You may even want to design a fact table where all of the metrics for a particular BI report have already been pre-calculated (i.e., let the ETL script do the work rather than the BI tool).

Fifth, do some Oracle tuning. Look into options such as a) increasing the size of your rollback segments, b) partitioning your larger tables, c) adding some new indexes (based on an EXPLAIN of the SQL generated by the BI tool), d) caching some of your smaller dimension tables, e) forcing "hints" into some of the SQL generated by the BI tool or f) redesigning some of the BI reports to generate better SQL.
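Item (c) above – adding indexes based on an EXPLAIN of the generated SQL – can be verified mechanically. A minimal sketch, with SQLite's `EXPLAIN QUERY PLAN` standing in for Oracle's `EXPLAIN PLAN`; the table and index names are hypothetical, and the exact plan wording varies by SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (region TEXT, amount REAL)")
conn.execute("INSERT INTO fact_sales VALUES ('East', 100.0), ('West', 75.0)")

query = "SELECT SUM(amount) FROM fact_sales WHERE region = ?"

# Before indexing: the plan shows a full table scan.
# EXPLAIN QUERY PLAN rows are (id, parent, notused, detail).
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("East",)).fetchall()
before = plan[0][3]

# Add the index the scan suggests, then re-check the plan.
conn.execute("CREATE INDEX ix_fact_region ON fact_sales (region)")
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("East",)).fetchall()
after = plan[0][3]

print(before)  # e.g. 'SCAN fact_sales'
print(after)   # e.g. 'SEARCH fact_sales USING INDEX ix_fact_region (region=?)'
```

The same loop – capture the BI tool's generated SQL, explain it, index the scanned columns, re-explain – applies directly in Oracle with `EXPLAIN PLAN FOR` and the plan table.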

Finally, consider upgrading the hardware. A more powerful database server – or simply an upgrade to the number of CPUs and/or RAM on the existing server – may make an appreciable difference in the performance of your BI queries. Hope this helps.
