Optimizing Oracle Optimizer Statistics

Donald K. Burleson's picture

Many Oracle professional do not realize the important of having good statistics for the Oracle cost-based SQL optimizer (CBO). The CBO has received a "bum-rap" from some Oracle professionals who wonder why the CBO is generating sub-optimal execution plans for their SQL statements.

If you provide Oracle with good statistics about the schema the CBO will almost always generate an optimal execution plan. The areas of schema analysis include:

  • Object statistics - Statistics for all tables, partitions, IOTs, etc should be sampled with a "deep" and statistically valid sample size.

  • Critical columns - Those columns that are regularly-referenced in SQL statements that are:

    • Heavily skewed columns - This helps the CBO properly choose between an index range scan and a full table scan

    • Foreign key columns - For n-way table joins, the CBO needs to determine the optimal table join order and knowing the cardinality of the intermediate results sets is critical.

  • External statistics - Oracle will sample the CPU cost and I/O cost during statistics collection and use this information to determine the optimal execution plan, based on your optimizer_mode. External statistics are most useful for SQL running in the all_rows optimizer mode.

Frequency of re-analyze

It astonishes me how many shops prohibit any un-approved production changes and yet re-analyze schema stats weekly. Evidently, they do not understand that the purpose of schema re-analysis is to change their production SQL execution plans, and they act surprised when performance changes!

Most Oracle experts only recommend scheduled re-analysis for highly dynamic databases, and most shops save one very-deep sample (with histograms), storing the statistic with the dbms_stats.export_schema_stats procedure. The only exceptions are highly-volatile systems (i.e. lab research systems) where a table is huge one-day and small the next.

For periodic re-analysis, many shops use the table "monitoring" option and also method_opt "auto" after they are confident that all histograms are in-place.

Saving and re-using stats

For data warehouses and database using the all_rows optimizer_mode, Oracle9i release 2 has the ability to collect the external cpu_cost and io_cost metrics. The ability to save and re-use schema statistics is important for several types of Oracle shops:

  • Bi-Modal shops - Many shops get huge benefits from using two sets of stats, one for OLTP (daytime), and another for batch (evening jobs).

  • Test databases - Many Oracle professionals will export their production statistics into the development instances so that the test execution plans more closely resemble the production database.

Getting top-quality stats

Because Oracle9i schema statistics work best withexternal system load, I like to schedule a valid sample (using dbms_stats.auto_sample_size) during regular working hours. For example, here we refresh statistics using the "auto" option which works with the table monitoring facility to only re-analyze those Oracle tables that have experienced more than a 10% change in row content:

 ownname => 'SCOTT', 
 estimate_percent => dbms_stats.auto_sample_size, 
 method_opt => 'for all columns size auto', 
 degree => 7

Tips for optimizing the CBO with statistics

There are several tips for optimizing your CBO with good statistics:

  • Find skewed columns that are referenced in SQL - Many shops do not use method_opt=skewonly and suffer from poor execution plans on skewed column access.

  • Find histograms for foreign key columns - Many DBAs forget that the CBO must have foreign-key histograms in order to determine the optimal table join order (i.e. the ORDERED hint).

  • Fix the cause, not the symptom - For example, whenever I see a sub-optimal order for table joins, I resist the temptation to add the ORDERED hint, and instead create histograms on the foreign keys of the join to force the CBO to make the best decision.

For new features, explore the Oracle10g automatic histogram collection mechanism that interrogates v$sql_plan to see where the foreign keys are used. It claims to generate histograms when appropriate, all automatically.