This discussion is archived
1 2 3 Previous Next 42 Replies Latest reply: Nov 6, 2009 12:13 AM by 635471 RSS

An optimizer Question

729555 Newbie
Currently Being Moderated
Hi,

I'm running two queries:
One:
select a.fact_name,
decode(r.fact_type,
'DATE',
to_char(FACT_VALUE_DATE, 'dd-mon-yyyy'),
FACT_VALUE_INTEGER || FACT_VALUE_DATE || FACT_VALUE_STRING ||
FACT_VALUE_NUMBER) as fact_val
from analytic_CNM.patient_facts partition(part_200907) a,
REF_HD.fact_ref r
where hd_person_id = 70154929
and a.fact_name = r.fact_name
and r.rec_expired_dt = to_date('99991231', 'YYYYMMDD')
and r.prod_status = upper('P')

Two:
select a.fact_name,
decode(r.fact_type,
'DATE',
to_char(FACT_VALUE_DATE, 'dd-mon-yyyy'),
FACT_VALUE_INTEGER || FACT_VALUE_DATE || FACT_VALUE_STRING ||
FACT_VALUE_NUMBER) as fact_val
from analytic_CNM.patient_facts partition(part_200907) a,
REF_HD.fact_ref r
where hd_person_id = 83738142
and a.fact_name = r.fact_name
and r.rec_expired_dt = to_date('99991231', 'YYYYMMDD')
and r.prod_status = upper('P')

The first query is using a nested loop and indexes.
While the second one is doing a hash join and full table scan.
The only difference is value of HD_PERSON_ID.
Now when I force index hint on second query and check the plan the cost is much more.
The first question, how can same query have different cost for two different person ids.
Second, how can I enforce the plan of first query for the execution of other.

Thanks
Tarun
  • 1. Re: An optimizer Question
    Tubby Guru
    Currently Being Moderated
    Possibly because you have a histogram on hd_person_id ?
  • 2. Re: An optimizer Question
    riedelme Expert
    Currently Being Moderated
    tsharma@healthdialog.com wrote:
    The first question, how can same query have different cost for two different person ids.
    Second, how can I enforce the plan of first query for the execution of other.
    Cost is an estimate of how expensive a query will be to execute, based on a lot of different factors. The method to generate the hash join you mentioned will be different from that for the nested loops, and one will probably be more efficient than another. Nested loops work well when a small number of rows are looked up with indexes from one of the tables in a join. Hash joins work well when a large number of rows match in both tables using full table scans.

    Tubby pointed out that if you have histograms the optimizer can intelligently decide when to use indexes for the lookup, and in the second case it is possible that the hash join is a better option.

    You can use hints to try to employ the nested loops access method and see what happens.
  • 3. Re: An optimizer Question
    729555 Newbie
    Currently Being Moderated
    Well, number of records for first query (using nested loop and index) are 2600
    and for second (using full table scan) are 2000.
  • 4. Re: An optimizer Question
    614235 Explorer
    Currently Being Moderated
    I agree with the other posters - the difference in cost will be due to a histogram existing for that column. Check the data dictionary view USER_TAB_COLUMNS for that column in that table and see if HISTOGRAM is set. There are two types of histogram - height balanced or frequency. You can read up about the differences between the two.

    If you have a frequency histogram then it stores the matching row count in the histogram for each value that exists when statistics are gathered on that table. Thus the Optimizer will have what it believes is a good estimate of the number of rows that would match your constraint on that column, and that will determine the cost of access and which is the cheapest access method.

    A height balanced histogram is less 'accurate' as it counts the number of rows in certain value ranges. It can still estimate the number of rows that match your constraint, but to less absolute accuracy.

    Either way it is the presence of the histogram that leads to the different cost estimates, and the use of a different and 'lower cost' execution plan for the other query.

    If you want the same execution plan for both queries, and all such queries, then drop the histogram. Then it will assume that all values occur equally often in that table, and use the same execution plan for each query. Drop the histogram by gathering statistics on that table with the histogram size set to 1, which really means none.
    method_opt => ‘FOR ALL COLUMNS SIZE 1‘
    John
  • 5. Re: An optimizer Question
    614235 Explorer
    Currently Being Moderated
    It's not the total number of rows from the final query that makes the difference, but the number of rows for each step within the execution plan. If you look at the execution plan (or explain plan) you should see that the estimates for the number of rows for the step that is different in the two queries does have a significantly different row count.

    By making the Optimizer try and be more 'intelligent' Oracle have also made it more 'sensitive'. Thus small changes between queries and constraints can cause it to choose radically different execution plans as a result of the statistics it has available to it on the tables and columns in the database.

    John
  • 6. Re: An optimizer Question
    riedelme Expert
    Currently Being Moderated
    tsharma@healthdialog.com wrote:
    Well, number of records for first query (using nested loop and index) are 2600
    and for second (using full table scan) are 2000.
    2600 rows isn't that many. Assuming you are using indexes I would imagine the nested loops lookup to be the better method, but that is hard to say with certainty with the information I have on your system.

    If necessary use the use_nl hint to try to persuade the optimzer to use that access method. Be warned that hints are called "hints" and not "directives" for a reason - the optimizer may choose to disregard them if it wants to. l have seen posts here on OTN that declare hints WILL be used if specified but that is not my experience - the optimizer will ultimately do what it wants to do regardless of hints.
  • 7. Re: An optimizer Question
    CharlesHooper Expert
    Currently Being Moderated
    >
    (Snip)
    If necessary use the use_nl hint to try to persuade the optimzer to use that access method. Be warned that hints are called "hints" and not "directives" for a reason - the optimizer may choose to disregard them if it wants to. l have seen posts here on OTN that declare hints WILL be used if specified but that is not my experience - the optimizer will ultimately do what it wants to do regardless of hints.
    From what I understand, Oracle's hints are actually directives - the optimizer must follow the hint, unless the hint is invalid.

    Hints may be invalid if the wrong alias is used in the hint, the hint is malformed, the hint is incompatible with another hint, or the query was transformed by the optimizer into a form that is incompatible with the hint before the optimizer applied the hint. I probably left out a couple reasons why Oracle would apparently ignore a hint.

    Jonathan Lewis has an interesting blog post that shows the optimizer (apparently) ignoring a hint:
    http://jonathanlewis.wordpress.com/2009/10/02/quiz-night-3/

    Charles Hooper
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
  • 8. Re: An optimizer Question
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Charles Hooper wrote:

    Hints may be invalid if the wrong alias is used in the hint, the hint is malformed, the hint is incompatible with another hint, or the query was transformed by the optimizer into a form that is incompatible with the hint before the optimizer applied the hint. I probably left out a couple reasons why Oracle would apparently ignore a hint.
    Or if using the hint would return the wrong answer, or if you happen to be unlucky enough to hit a bug.

    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    To post code, statspack/AWR report, execution plans or trace files, start and end the section with the tag {noformat}
    {noformat} (lowercase, curly brackets, no spaces) so that the text appears in 
    fixed format
    .
    
    
    "Science is more than a body of knowledge; it is a way of thinking" 
    Carl Sagan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
  • 9. Re: An optimizer Question
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    tsharma@healthdialog.com wrote:
    Well, number of records for first query (using nested loop and index) are 2600
    and for second (using full table scan) are 2000.
    It's not the number of rows that actually appear that matters, it's the number that the optimizer thinks will appear.

    It's possible to do a detailed analysis of the stats involved - but a simple starting point would be to give us the full execution plans for the two queries. (See the note below about posting code in fixed font)

    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    To post code, statspack/AWR report, execution plans or trace files, start and end the section with the tag {noformat}
    {noformat} (lowercase, curly brackets, no spaces) so that the text appears in 
    fixed format
    .
    
    
    "Science is more than a body of knowledge; it is a way of thinking" 
    Carl Sagan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
  • 10. Re: An optimizer Question
    riedelme Expert
    Currently Being Moderated
    From what I understand, Oracle's hints are actually directives - the optimizer must follow the hint, unless the hint is invalid.
    Nonetheless, on multiple occasions in multiple circumstances I have seen the cost-based optimizer refuse to take a hint that was coded properly. Usually this takes the form of refusing to use a specified index in preference to another index which in operation offers inferior performance.

    This is rare but I have seen it.
  • 11. Re: An optimizer Question
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    riedelme wrote:
    From what I understand, Oracle's hints are actually directives - the optimizer must follow the hint, unless the hint is invalid.
    Nonetheless, on multiple occasions in multiple circumstances I have seen the cost-based optimizer refuse to take a hint that was coded properly. Usually this takes the form of refusing to use a specified index in preference to another index which in operation offers inferior performance.

    This is rare but I have seen it.
    I saw this happen on a client site on one occasion. Examination of the 10053 trace file showed that Oracle had "lost" the index.

    I wasn't allowed to keep the trace file, and I don't remember who the client was. I think it was some version of 10.2, and I think there was a query transformation involved and Oracle lost the first index of the alphabetical list (as if it had created an array of indexes from subscript zero, but only considered them from subscript 1) - it was clearly a bug.

    It's possible that your experiences were related.


    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

    To post code, statspack/AWR report, execution plans or trace files, start and end the section with the tag {noformat}
    {noformat} (lowercase, curly brackets, no spaces) so that the text appears in 
    fixed format
    .
    
    "Science is more than a body of knowledge; it is a way of thinking" 
    Carl Sagan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
  • 12. Re: An optimizer Question
    CharlesHooper Expert
    Currently Being Moderated
    Jonathan:
    Thanks for joining the discussion and providing 2 more reasons why hints are potentially ignored. Riedelme's reply reminded me of another that I have seen addressed somewhere online.
    riedelme wrote:
    From what I understand, Oracle's hints are actually directives - the optimizer must follow the hint, unless the hint is invalid.
    Nonetheless, on multiple occasions in multiple circumstances I have seen the cost-based optimizer refuse to take a hint that was coded properly. Usually this takes the form of refusing to use a specified index in preference to another index which in operation offers inferior performance.

    This is rare but I have seen it.
    Riedelme:
    I believe what you are saying - that it can happen. I think that Jonathan's provides an idea at what might be the cause, although it might be nice to see a test case where it happens. I have a vague memory of Jonathan mentioning such a problem online in the past where the 10053 trace indicated that Oracle forgot that an index existed, where he actually showed that fact by presenting part of a 10053 trace.

    Your post reminded me of another case where Oracle might ignore a specific index hint, and select to use another index instead. Here is a simple test case which creates a table with 100,000,000 rows with 2 indexes:
    CREATE TABLE T15(
      C1 NUMBER NOT NULL,
      C2 VARCHAR2(50) NOT NULL,
      C3 NUMBER,
      C4 VARCHAR2(300));
    
    INSERT INTO T15
    SELECT
      ROWNUM,
      TO_CHAR(ROWNUM,'0000000')||'A',
      DECODE(MOD(ROWNUM,1000),0,NULL,ROWNUM),
      LPAD('A',300,'A')
    FROM
      (SELECT
        ROWNUM RN
      FROM
        DUAL
      CONNECT BY
        LEVEL <= 10000) V1,
      (SELECT
        ROWNUM RN
      FROM
        DUAL
      CONNECT BY
        LEVEL <= 1000) V2;
    
    CREATE INDEX IND_T15_C1_C2 ON T15(C1,C2);
    CREATE INDEX IND_T15_C3 ON T15(C3);
    
    EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T15',CASCADE=>TRUE)
    The above created the table T15, the composite index IND_T15_C1_C2 on the columns C1 and C2, and the index IND_T15_C3 on just the C3 column. The table is large enough due to column C4 that Oracle will probably will select to use an index, when possible, rather than performing a full table scan. Now, let's see what happens when we try to determine the number of rows in table T15:
    SET AUTOTRACE TRACEONLY EXPLAIN
    
    SELECT
      COUNT(*)
    FROM
      T15;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4197727756
    
    -------------------------------------------------------------------------------
    | Id  | Operation             | Name          | Rows  | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |               |     1 | 10174   (1)| 00:02:03 |
    |   1 |  SORT AGGREGATE       |               |     1 |            |          |
    |   2 |   INDEX FAST FULL SCAN| IND_T15_C1_C2 |    10M| 10174   (1)| 00:02:03 |
    -------------------------------------------------------------------------------
    The optimizer selected to use the composite index IND_T15_C1_C2 rather than a full table scan or the much smaller index IND_T15_C3 on just the column C3. Let's try a hint to use the index IND_T15_C3:
    SELECT /*+ INDEX(T1 IND_T15_C3) */
      COUNT(*)
    FROM
      T15;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4197727756
    
    -------------------------------------------------------------------------------
    | Id  | Operation             | Name          | Rows  | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |               |     1 | 10174   (1)| 00:02:03 |
    |   1 |  SORT AGGREGATE       |               |     1 |            |          |
    |   2 |   INDEX FAST FULL SCAN| IND_T15_C1_C2 |    10M| 10174   (1)| 00:02:03 |
    -------------------------------------------------------------------------------
    Oracle still used the IND_T15_C1_C2 index, even though I hinted to use the IND_T15_C3 index. But wait, there is a problem. The table is actually T15, not T1. I have included this example, as it is an easy mistake to make when typing SQL statements. Let's try again with a correctly formed hint in the SQL statement:
    SELECT /*+ INDEX(T15 IND_T15_C3) */
      COUNT(*)
    FROM
      T15;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4197727756
    
    -------------------------------------------------------------------------------
    | Id  | Operation             | Name          | Rows  | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |               |     1 | 10174   (1)| 00:02:03 |
    |   1 |  SORT AGGREGATE       |               |     1 |            |          |
    |   2 |   INDEX FAST FULL SCAN| IND_T15_C1_C2 |    10M| 10174   (1)| 00:02:03 |
    -------------------------------------------------------------------------------
    
    SET AUTOTRACE OFF
    Oracle still ignored the hint to use the IND_T15_C3 index, and selected to use the IND_T15_C1_C2 index instead (again). Oh, Oracle does not index NULL values in a non-composite B*Tree index (or when alll values are NULL in a composite index), so using that index may yield the wrong result. We can fix that problem:
    UPDATE
      T15
    SET
      C3=0
    WHERE
      C3 IS NULL;
    
    COMMIT;
    
    EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T15',CASCADE=>TRUE,NO_INVALIDATE=>FALSE)
    Let's try again:
    SET AUTOTRACE TRACEONLY EXPLAIN
    
    SELECT
      COUNT(*)
    FROM
      T15;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4197727756
    
    -------------------------------------------------------------------------------
    | Id  | Operation             | Name          | Rows  | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |               |     1 | 10559   (1)| 00:02:07 |
    |   1 |  SORT AGGREGATE       |               |     1 |            |          |
    |   2 |   INDEX FAST FULL SCAN| IND_T15_C1_C2 |    10M| 10559   (1)| 00:02:07 |
    -------------------------------------------------------------------------------
    Notice that the optimizer did not select to use the IND_T15_C3 index, let's help it with a hint:
    SELECT /*+ INDEX(T15 IND_T15_C3) */
      COUNT(*)
    FROM
      T15;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4197727756
    
    -------------------------------------------------------------------------------
    | Id  | Operation             | Name          | Rows  | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |               |     1 | 10559   (1)| 00:02:07 |
    |   1 |  SORT AGGREGATE       |               |     1 |            |          |
    |   2 |   INDEX FAST FULL SCAN| IND_T15_C1_C2 |    10M| 10559   (1)| 00:02:07 |
    -------------------------------------------------------------------------------
    Notice that the optimizer still ignored the index hint and used the larger index. Let's see if we can help the optimizer by telling it that column C3 cannot hold a NULL value:
    ALTER TABLE T15 MODIFY (C3 NUMBER NOT NULL);
    
    SELECT
      COUNT(*)
    FROM
      T15;
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 877827156
    
    ----------------------------------------------------------------------------
    | Id  | Operation             | Name       | Rows  | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |            |     1 |  6700   (1)| 00:01:21 |
    |   1 |  SORT AGGREGATE       |            |     1 |            |          |
    |   2 |   INDEX FAST FULL SCAN| IND_T15_C3 |    10M|  6700   (1)| 00:01:21 |
    ----------------------------------------------------------------------------
    Oracle finally used the correct smaller index, without needing a hint.

    I am not saying that any of the above happened in your case, just suggesting it as another possible reason why an index hint may appear to be ignored by the optimizer.

    Charles Hooper
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
  • 13. Re: An optimizer Question
    Hemant K Chitale Oracle ACE
    Currently Being Moderated
    Charles,

    The case you present is covered by Jonathan's statement "Or if using the hint would return the wrong answer" earlier.

    If the column is not defined as a NOT NULL Oracle cannot be sure that it does not contain NULLs -- even if you are sure that they are no NULLs. It is safer and correct for Oracle to not use the index , else using the index would lead to the wrong results.

    Hemant K Chitale
  • 14. Re: An optimizer Question
    CharlesHooper Expert
    Currently Being Moderated
    Hemant,

    The concept that I had in mind was the influence of potential NULL values in columns causing Oracle to (correctly) treat a hint as being invalid - not necessarily an index hint, but it was a little easier to demonstrate a "fix" for the invalid hint with an index on a nullable column. I agree with you that this should probably fall under Jonathan's statement "if using the hint would return the wrong answer". Then there is the case of a typo as I demonstrated, but that seems to be a repeat of my statement "if the wrong alias is used in the hint".

    By the way, nice summary of the reason why the optimizer would find the index hint as invalid.

    Charles Hooper
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.
1 2 3 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points