
    Parallel query + direct path reads = better cardinality estimates?

    959666

      Same query run during the same hour -- the buffer cache is "warm", but that doesn't help much at all in the serial plan. Serial execution (it actually doesn't finish even after 10 hours), SQL Monitoring report:

       

      https://www.dropbox.com/s/dgvifjt1n28nx6q/noparallel_conv_path_SQL_Monitoring.html?dl=0

       

      Parallel plan -- lots of direct path reads, runs in less than two minutes:

       

      https://www.dropbox.com/s/6u84jnnra0fjhhs/parallel16_direct_path_SQL_Monitoring.html?dl=0

       

      PGA usage is not much higher in the parallel plan, and considering the extremely short execution time, the customer can tolerate having 32 extra CPU threads allocated. Sounds too good to be true -- what am I missing here? Is the optimizer picking a different plan in the serial version based on bad cardinalities (line 37 of the serial plan)? Thank you!!!
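
      For anyone wanting to pull the same kind of report, something along these lines works (the sql_id is a placeholder for the statement in question; SQL Monitoring requires the Tuning Pack licence):

        -- generate an HTML SQL Monitoring report for one statement
        SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(
                 sql_id => '&sql_id',   -- placeholder: sql_id of the monitored statement
                 type   => 'HTML')
        FROM   dual;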

        • 1. Re: Parallel query + direct path reads = better cardinality estimates?
          Jonathan Lewis

          The optimizer error probably starts at line 42 (possibly in combination with line 41) of the serial plan.

           

          Line 41 predicts one row - which may or may not be accurate, but I'd guess that it returns 36 rows (where calendar_date = month_end date and both dates are in the last 3 years ... suggestive).  The optimizer will have taken 5% as the selectivity of each of the "last three years" predicates and divided whatever estimate it got from the first predicate by 400.

           

          Line 42 has two predicates which also reflect "the last three years", and they also will result in a factor-of-400 reduction in the estimated number of rows.  Rough consequence: Oracle thinks the join will return about 109K rows (which drops a little later to 12K rows), which means the subsequent nested loops won't happen "very often", when actually the estimate is out by a factor of (ballpark) 400.
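
          Spelling out the arithmetic behind that factor of 400 (on the assumption that the optimizer falls back to its standard 5% guess for each range predicate it can't evaluate): 0.05 * 0.05 = 0.0025 = 1/400, so an estimate of roughly 109K rows at that point would correspond to an actual row count somewhere in the ballpark of 109,000 * 400, i.e. around 44 million rows.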

           

           

          Ultimately the differences in the final cardinality estimates look like an issue related to transitive closure - the "last three years" predicates aren't on the date_dimension in the parallel plan, and only one of them is on the HSP_BUCKET_AGING table; so they appear the way they do in the serial plan because something about the query transformation has allowed predicates to be copied (hence duplicated and double counted) from one table to another.  (One of the predicates on the date_dimension is an "echo" of the actual predicate on hsp_bucket_aging, made possible because you join on that column - I haven't tried to find out where the other one came from.)
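
          A minimal sketch of the kind of predicate copying described above (the join column and the "last three years" filter are invented for illustration, since the actual query text isn't visible in the thread):

            -- the "last three years" filter is written once, against hsp_bucket_aging only
            SELECT COUNT(*)
            FROM   hsp_bucket_aging ba
                   JOIN date_dimension dd
                     ON dd.calendar_date = ba.aging_date   -- hypothetical join column
            WHERE  ba.aging_date >= ADD_MONTHS(TRUNC(SYSDATE), -36);

            -- through the equality join the optimizer can generate the extra predicate
            --   dd.calendar_date >= ADD_MONTHS(TRUNC(SYSDATE), -36)
            -- so the same filter can end up applied to, and costed against, both tables:
            -- the duplication and double counting described above.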

           

           

          Regards

          Jonathan Lewis

          • 2. Re: Parallel query + direct path reads = better cardinality estimates?
            Andrew Sayer

            I'm not able to open the monitor files, but parallel query will do further dynamic sampling, which will lead to better cardinality estimates. You can confirm this by looking at the notes section of the plan in dbms_xplan.display_cursor; I typically see dynamic sampling level 6 when using parallel query even though the optimiser parameter is actually set to 2. I believe this is documented behaviour but I'm struggling to find the reference.
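
            For example (a sketch -- run immediately after executing the statement in the same session, so that the null sql_id and child number pick up the last cursor):

              -- plan plus Note section for the most recently executed statement
              SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'TYPICAL'));

              -- with the automatic adjustment the Note section reports something like
              --   dynamic sampling used for this statement (level=6)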

            • 3. Re: Parallel query + direct path reads = better cardinality estimates?
              Randolf Geist

              959666 wrote:

               

              PGA usage is not much higher in the parallel plan, and considering the extremely short execution time, the customer can tolerate having 32 extra CPU threads allocated. Sounds too good to be true -- what am I missing here? Is the optimizer picking a different plan in the serial version based on bad cardinalities (line 37 of the serial plan)? Thank you!!!

              Andrew is right: the parallel plan got better estimates based on the automatic increase in Dynamic Sampling - in particular on the critical operation that Jonathan pointed out (operation ID = 41 in the serial plan, operation ID = 102 in the parallel plan).

               

              So what you can try is to see whether you get a better-performing serial plan by making use of Dynamic Sampling -- e.g. via a cursor-level DYNAMIC_SAMPLING(5) hint (that's the level Oracle has selected automatically for the parallel plan), or via ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 5. I don't think you can get better estimates for this expression based on Extended Statistics (another alternative for arriving at improved estimates).
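
              A sketch of the two variants (the "..." stands in for the original statement, which isn't reproduced in the thread):

                -- statement-level hint: dynamic sampling at level 5 for this query
                SELECT /*+ dynamic_sampling(5) */ ...
                FROM   ...;

                -- or for everything parsed in this session
                ALTER SESSION SET optimizer_dynamic_sampling = 5;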

               

              In principle, with all 12c adaptive features enabled, you should get that Dynamic Sampling automatically at some point based on a corresponding "SQL Plan Directive", but it doesn't look as if there is one yet that covers this bad estimate.

               

              @Andrew: The automatic adjustment of Dynamic Sampling for Parallel Execution is indeed a documented feature, but I don't have a suitable document link handy either. I think it was introduced in 11.2, is controllable via fix control 7452863 ("adjust DS level for large parallel tables based on the size"), and is enabled by default when the OPTIMIZER_DYNAMIC_SAMPLING parameter is left at its default of 2.
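
              A sketch of how that fix control could be inspected and, purely as a test, disabled at session level (underscore parameters and fix controls should normally only be changed with Oracle Support's agreement):

                -- current value and description of the fix control
                SELECT bugno, value, description
                FROM   v$system_fix_control
                WHERE  bugno = 7452863;

                -- switch it off for this session only, as an experiment
                ALTER SESSION SET "_fix_control" = '7452863:OFF';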

               

              @Jonathan: Comparing operation IDs 41 and 102 of the two plans, I think the single-table predicates are pretty much the same for serial and parallel execution - I also got confused, as there is another instance of DATE_DIMENSION in the first branch of the UNION ALL. In the parallel plan the estimate is 742 rows and the actual is 1044, so definitely much closer than the 1-row estimate in the serial plan.

               

              Randolf

              • 4. Re: Parallel query + direct path reads = better cardinality estimates?
                Jonathan Lewis

                Randolf,

                 

                You're right that I picked the wrong place to check filter predicates, but I think I may actually have tracked across from the correct date_dimension line (102) and drifted up to the predicate on the hash join line (98) for some reason.

                 

                Regards

                Jonathan Lewis

                • 5. Re: Parallel query + direct path reads = better cardinality estimates?
                  Jonathan Lewis

                  Given the enormous significance of the date_dimension table, if a serial path with corrected statistics works I think I'd edit the view to add a specific dynamic sampling hint at level 4 for just that table, to get maximum benefit for minimum overhead. The "date = month_end" predicate is nasty enough (though it would be possible to code around it with an index on a real or virtual flag for month-end rows), but the unknown "last three years" appearing twice is a real threat for ALL queries. (Update: it looks like 11.2.0.4, if not an earlier version, will work out the actual date during optimisation, so it's not such a big problem unless you have a very long history.)
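
                  A sketch of both ideas, assuming the date column is called CALENDAR_DATE and the month-end column is called MONTH_END_DATE (the real column names aren't visible in the thread, and the "..." stands in for the rest of the view text):

                    -- (a) table-level dynamic sampling hint inside the view definition
                    SELECT /*+ dynamic_sampling(d 4) */ ...
                    FROM   date_dimension d, ...

                    -- (b) "code around it": a virtual month-end flag plus an index on it
                    ALTER TABLE date_dimension ADD (
                      month_end_flag VARCHAR2(1)
                        GENERATED ALWAYS AS (CASE WHEN calendar_date = month_end_date THEN 'Y' END) VIRTUAL
                    );

                    -- rows where the flag is NULL are not stored in the index, so it stays tiny
                    CREATE INDEX date_dim_month_end_ix ON date_dimension (month_end_flag);

                    -- gather stats on the new column so the optimizer knows how rare 'Y' is
                    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'DATE_DIMENSION', method_opt => 'FOR COLUMNS month_end_flag SIZE 254')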

                   

                  Regards

                  Jonathan Lewis

                  • 6. Re: Parallel query + direct path reads = better cardinality estimates?
                    959666

                    Jonathan, Andrew, Randolf, thank y'all for your valuable contributions. I'm a lot smarter than I was yesterday.

                    • 7. Re: Parallel query + direct path reads = better cardinality estimates?
                      Mohamed Houri

                      Hi Randolf,

                       

                      @Andrew: The automatic adjustment of Dynamic Sampling for Parallel Execution is indeed a documented feature, but I don't have a suitable document link handy either. I think it's controllable via fix control 7452863 ("adjust DS level for large parallel tables based on the size") and enabled by default when the OPTIMIZER_DYNAMIC_SAMPLING parameter is left at the default of 2, introduced in 11.2.

                       

                      The following My Oracle Support note gives some details about Automatic Dynamic Sampling (ADS):

                       

                      Different Level for Dynamic Statistics (Dynamic Sampling) used than the Level Specified (Doc ID 1102413.1)

                      From this doc:

                       

                      “In previous versions the default value lead to certain situations where the result from using dynamic statistics resulted in bad plans, for example, when the tables referenced in the query are very large. With this enhancement the optimizer itself changes the dynamic value under certain conditions such as with large tables and parallelism. The level is automatically adjusted based on the size of the tables in the query. This is done only for queries that will run in parallel which is typical for Data Warehouses and large databases”

                       

                      At one customer site we had to raise the dynamic sampling level from its default value of 2 to 4 in order to avoid this uncontrollable dynamic sampling level and the varying execution plans that came with it, particularly for parallel queries.
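
                      A sketch of that kind of change, applied instance-wide (assuming the instance uses an spfile; as the 11.2 documentation quoted further down notes, a nondefault setting is honoured instead of the automatic adjustment):

                        -- nondefault level, so the automatic parallel-query adjustment no longer applies
                        ALTER SYSTEM SET optimizer_dynamic_sampling = 4 SCOPE = BOTH;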

                       

                      Best Regards

                      Mohamed Houri

                       

                       

                      • 8. Re: Parallel query + direct path reads = better cardinality estimates?
                        Andrew Sayer

                        I don't think we are allowed to directly quote from MOS docs.

                         

                        I've found something in the official docs, Managing Optimizer Statistics - 11g Release 2 (11.2):

                        For parallel statements, the optimizer automatically decides whether to use dynamic sampling and which level to use. The decision depends on the size of the tables and the complexity of the predicates. The optimizer expects parallel statements to be resource-intensive, so the additional overhead at compile time is worth it to ensure the best plan. The database ignores the OPTIMIZER_DYNAMIC_SAMPLING setting unless set to a nondefault value, in which case the value is honored.

                         

                        And then in the 12.1 docs https://docs.oracle.com/database/121/TGSQL/tgsql_statscon.htm#TGSQL348

                        In general, the optimizer uses default statistics rather than dynamic statistics to compute statistics needed during optimizations on tables, indexes, and columns. The optimizer decides whether to use dynamic statistics based on several factors. For example, the database uses automatic dynamic statistics in the following situations:

                        • The SQL statement uses parallel execution.
                        • A SQL plan directive exists.
                        • The SQL statement is known to the database, which means that it was captured in SQL Plan Management or Automatic Workload Repository, or is currently in the shared SQL area.

                         

                        But that doesn't seem as specific - or maybe the nondefault optimizer_dynamic_sampling is no longer a way around this?