      • 15. Re: OR Expansion Behavior
        OraDBA02
        deleting this thread..

        Edited by: OraDBA02 on Oct 3, 2012 2:38 PM
        • 16. Re: OR Expansion Behavior
          Nikolay Savvinov
          Jonathan Lewis wrote:
          Nikolay Savvinov wrote:

          1) in 10g I've seen cases when Oracle does a terrible job recognizing skewed distributions. It probably has improved in 11g, but I'm not sure by how much.
          The problem is not really about the number of buckets, it's about the sample size and the side effects of Oracle missing the "rare" values when sampling.
          Hi Jonathan,

          yes, I am familiar with this problem (I had to deal with a major crisis caused by it a couple of years back) - that's why I said that Oracle 10g isn't doing a great job of identifying skewed distributions. I wasn't aware of the differences between 10.2.0.3 and 10.2.0.4, though.

          How about higher versions of Oracle, has the sampling of rare values improved there?

          Best regards,
          Nikolay
          • 17. Re: OR Expansion Behavior
            Jonathan Lewis
            Nikolay Savvinov wrote:

            How about higher versions of Oracle, has the sampling of rare values improved there?
            Nikolay,
            I haven't seen any indication that the sampling for frequency histograms has improved in 11g - but I haven't taken the time to check this explicitly.

             If you've enabled 'approximate NDV' then Oracle can get a very good approximation of the number of distinct values very cheaply, and when the number of distinct values is less than 16384 it will get an exact count. It's possible that the code that generates a frequency histogram could check that the number of values in the histogram matches the known number of distinct values and increase the sample size until the two sets of results are consistent.

            It shouldn't take more than 30 minutes to check this hypothesis - but the same could be said of all (five or six thousand) of the questions on my todo list.
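
             For reference, a minimal sketch of switching approximate NDV on in 11g - the preference call and the auto sample size are the standard dbms_stats API, and the table name is simply taken from this thread:
               begin
                 -- approximate NDV is only used when estimate_percent is auto_sample_size
                 dbms_stats.set_global_prefs('APPROXIMATE_NDV', 'TRUE');
                 dbms_stats.gather_table_stats(
                   ownname          => user,
                   tabname          => 'GC_BATCH_EVENTS',
                   estimate_percent => dbms_stats.auto_sample_size,
                   method_opt       => 'for columns event_status size 254'
                 );
               end;
               /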

            Regards
            Jonathan Lewis
            • 18. Re: OR Expansion Behavior
              Jonathan Lewis
              OraDBA02 wrote:
               Thanks Jonathan. (It's time for me to go through Oracle Core again!)
              What are your index definitions ?
              I_GC_BATCH_EVENTS_STATUS is traditional b-tree on (MARKETPLACE_ID, EVENT_STATUS).
               I_GC_BATCH_EVENTS_STATUS_TYPE is on (MARKETPLACE_ID, EVENT_STATUS, EVENT_TYPE).

               a) it looks like one may be a prefix of the other, and therefore redundant
               This is exactly what I thought. I believe I_GC_BATCH_EVENTS_STATUS is redundant and should be dropped.
               Strategically correct. You can test the safety by first making it invisible. There is a trap, though - the optimizer can't use an invisible index, but it may still note the number of distinct keys in the index and use that number in some of its arithmetic. So you may need to create extended stats on the column group (marketplace_id, event_status) before dropping the index.
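
               A minimal sketch of that sequence (11g syntax, using the names from this thread):
                 alter index i_gc_batch_events_status invisible;

                 -- preserve the column-group information the index currently supplies
                 select dbms_stats.create_extended_stats(
                          user, 'GC_BATCH_EVENTS', '(MARKETPLACE_ID,EVENT_STATUS)'
                        )
                 from   dual;

                 -- gather stats on the resulting (hidden) extension column
                 exec dbms_stats.gather_table_stats(user, 'GC_BATCH_EVENTS', method_opt => 'for all hidden columns size 1')

                 -- only after a safe period of testing:
                 drop index i_gc_batch_events_status;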

              b) it looks like the leading columns of the indexes are highly repetitive, so the indexes should have been created with some level of compression
               That's true. Should I enable COMPRESSION on this high-volume OLTP table ?
              Compression of indexes is fine, compression of the table is something completely different. From your previous posts I infer that the index has about 300 distinct keys for 117M rows, which should give a high level of compressibility if you compress on all three columns. (Note - it's not compression in the normal use of the term, it's de-duplication at a block level - with your data I think you could get something like 600 index entries per leaf block after compression)
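
               If you want to see what Oracle itself thinks of the compressibility before rebuilding, the usual approach is a sketch like the following - note that validate structure locks the index, so pick a quiet moment:
                 analyze index i_gc_batch_events_status_type validate structure;

                 -- opt_cmpr_count   = suggested number of leading columns to compress
                 -- opt_cmpr_pctsave = predicted space saving at that setting
                 select opt_cmpr_count, opt_cmpr_pctsave from index_stats;

                 alter index i_gc_batch_events_status_type rebuild compress 3;
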
              c) it looks (from the plan) like the event_status index doesn't include the status column when the name suggests that it's supposed to.
              It does include event_status col.
               That highlights an interesting limitation in the optimizer that I hadn't noticed before. The event_status column isn't used during the index range scan in the first plan, it applies ONLY after you've acquired rows from the table. In principle Oracle could apply the predicate (event_status in (2,3)) as an access or filter predicate on the index, thus limiting its visits to the table, where it would have to do the more detailed check due to the combined predicate (event_status = 2) or (event_status = 3 and ....). This is where the concatenation helps - it splits the query into two blocks, one for each event_status, and that allows Oracle to apply the event_status predicate in the index.
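
               If you want to test that effect without waiting for a code change, a sketch (trimmed to the relevant predicates, with the select list cut down) would be to force the expansion with the use_concat hint, or do the split by hand:
                 select /*+ use_concat */ gc_batch_event_id
                 from   gc_batch_events
                 where  event_status = 2
                 or    (event_status = 3 and last_updated < sysdate - 60/1440);

                 -- the manual equivalent - one query block per event_status:
                 select gc_batch_event_id from gc_batch_events
                 where  event_status = 2
                 union all
                 select gc_batch_event_id from gc_batch_events
                 where  event_status = 3
                 and    last_updated < sysdate - 60/1440;
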
              This is a case where function-based indexes (to hide the very common values) combined with matching modification to the code become very important to minimise the sizes and costs of index maintenance and to make the queries efficient.
               This seems to be a great piece of advice. Are you pointing towards an FBI containing OR and AND conditions ?
               No. The best choice of FBI depends on what the data looks like and what your common queries are, but if the result set is tiny because of the event_status values you're choosing then you might create an index like this (the index name is just illustrative):
               create index i_gc_batch_events_rare on gc_batch_events (
                 case when event_status in (2,3) then marketplace_id end,
                 case when event_status in (2,3) then event_status end,
                 case when event_status in (2,3) then event_type end
               );
              Your query would then need to be something like:
               where
                     case when event_status in (2,3) then marketplace_id end in (1,2,3,7,8,9)
               and   case when event_status in (2,3) then event_status end = 2
               and   case when event_status in (2,3) then event_type end = 'Whatever'
              I can't see the original query as I type this, but I think it should give you the right sort of idea.
              You'll have to play about with the details - the basic trick is the case/decode approach that guarantees you have an index on only the small number of (easily defined) rows that give you the answer you want with, perhaps, a small amount of excess work.

              Regards
              Jonathan Lewis
              • 19. Re: OR Expansion Behavior
                OraDBA02
                deleting this thread..

                Edited by: OraDBA02 on Oct 3, 2012 2:38 PM
                • 20. Re: OR Expansion Behavior
                  Jonathan Lewis
                  OraDBA02 wrote:
                   Thanks Jonathan for the detailed explanation. One of the links you posted describes bug 5483301, which seems to be what we're hitting here.

                   Sometimes the CBO picks up a cardinality of 50% of the table rows, and sometimes it uses a cardinality of 1 for values which are missing from the histogram.
                   In both cases the histogram bucket for the "rare" value is missing.

                  Database-Prod
                  COLUMN_NAME             ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
                  ----------------------- --------------- -------------- ----------
                  EVENT_STATUS                    2580109              4
                  EVENT_STATUS                    2580133              5
                  
                  --------------------------------------------------------------------------------------------------
                  | Id  | Operation        | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
                  --------------------------------------------------------------------------------------------------
                  |   0 | SELECT STATEMENT |                               |     1 |     3 |  8236   (1)| 00:01:39 |
                  |   1 |  SORT AGGREGATE  |                               |     1 |     3 |            |          |
                  |*  2 |   INDEX SKIP SCAN| I_GC_BATCH_EVENTS_STATUS_TYPE |     1 |     3 |  8236   (1)| 00:01:39 |
                  --------------------------------------------------------------------------------------------------
                  Database UAT
                  COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
                  -------------------- --------------- -------------- ----------
                  EVENT_STATUS                    5416              4
                  (Only one bucket for value=4)
                  
                  --------------------------------------------------------------------------------------------------
                  | Id  | Operation        | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
                  --------------------------------------------------------------------------------------------------
                  |   0 | SELECT STATEMENT |                               |     1 |     3 | 10994   (1)| 00:02:12 |
                  |   1 |  SORT AGGREGATE  |                               |     1 |     3 |            |          |
                  |*  2 |   INDEX SKIP SCAN| I_GC_BATCH_EVENTS_STATUS_TYPE |    12M|    35M| 10994   (1)| 00:02:12 |
                  --------------------------------------------------------------------------------------------------
                  Three things to note -
                   a) between 10.2.0.4 and 11.2.0.2 Oracle has had time to refine the algorithm, so they may have added some extra conditions to it.
                   b) the prod sample is 2.5M rows, of which 24 are in status 5 and the rest in status 4, so according to my description I'd expect Oracle to predict about 120 rows for event_status = 2. (24/2 = 12, then scale up by a factor of 10 to get from the 2.5M sample to the 24M rows in the table.)
                   The fact that Oracle reports 1 row certainly looks like it's using the old mechanism - if I wanted to find out what was going on I'd take a look at the 10053 trace. (In the context of 24M rows, 1 is close enough to 120 that it might be a clue that Oracle has added some refinements to the algorithm.)
                  c) The uat sample is only 5,000 rows, and it's hit the special case of finding only one value. I think Randolf Geist has published a note about an anomaly with histograms in some versions when you see only one value in the sample (The anomaly may not be present in this version, of course, I don't remember - but it's a warning that whatever you usually expect to see, 1 is a special case.)

                  I have to say I think I've seen a note somewhere that makes a comment about some circumstances where Oracle uses the old density value when you've recently recalculated stats and some other memory condition holds in the dictionary cache - and that might account for the cardinality 1 on prod.

                   Can you also help me understand how I can use "Approx NDV" to work around this histogram oddity?
                   It probably won't, but as I pointed out to Nikolay, it will get a much more accurate estimate of the number of distinct values, and MAYBE the histogram code in this version will take a larger sample if it sees that the number of values in the histogram is much smaller than the NDV.
                  The solution to your histogram problem is to write your own code to construct the histogram(s) you want Oracle to see - it saves time and improves stability.
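
                   A minimal sketch of that approach, following the pattern in the link above - every frequency here is invented, you would plug in the numbers you want the optimizer to believe:
                     declare
                       m_srec    dbms_stats.statrec;
                       m_numvals dbms_stats.numarray := dbms_stats.numarray(2, 3, 4, 5);
                     begin
                       m_srec.epc    := 4;                                            -- number of histogram entries
                       m_srec.bkvals := dbms_stats.numarray(100, 100, 24000000, 24);  -- rows per value (invented)
                       dbms_stats.prepare_column_values(m_srec, m_numvals);
                       dbms_stats.set_column_stats(
                         ownname => 'BOOKER',
                         tabname => 'GC_BATCH_EVENTS',
                         colname => 'EVENT_STATUS',
                         distcnt => 4,
                         density => 1 / (2 * 24000224),   -- typically 1 / (2 * num_rows); num_rows = sum of bkvals
                         nullcnt => 0,
                         srec    => m_srec
                       );
                     end;
                     /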

                   Looking at your prod stats, I begin to suspect that most of your data is in status 4, so a function-based index where each column uses a case clause to return null when it sees 4 would be your best bet as a general mechanism.

                  Regards
                  Jonathan Lewis
                  • 21. Re: OR Expansion Behavior
                    Randolf Geist
                     Here is the SQL which shows a different execution plan in the two databases

                    SQL
                    select  * from
                    ( select  this_.GC_BATCH_EVENT_ID as GC1_31_0_,this_.RECORD_VERSION_NUMBER as RECORD3_31_0_,this_.ACTION_ID as ACTION4_31_0_,this_.MARKETPLACE_ID as MARKETPL5_31_0_,
                                   this_.TRIGGER_DATE as TRIGGER6_31_0_,this_.EVENT_STATUS as EVENT7_31_0_,this_.PROCESS_ATTEMPTS as PROCESS8_31_0_,this_.EVENT_TYPE as EVENT2_31_0_,
                                   this_.CREATION_DATE as CREATION9_31_0_,this_.LAST_UPDATED as LAST10_31_0_,this_.LAST_UPDATED_BY as LAST11_31_0_
                    from
                    GC_BATCH_EVENTS this_
                    where
                    this_.trigger_date <= sysdate and (this_.event_status = 2 or (this_.event_status = 3  and this_.last_updated < sysdate - 60/1440))
                    and this_.event_type in ('Fate')
                    and this_.marketplace_id in (1, 7, 3, 4, 5, 6, 35691, 44551)
                    and this_.PROCESS_ATTEMPTS<=10  and this_.CREATION_DATE>SYSDATE )
                    where rownum <= 1000
                    /
                    Just two comments here:

                     1. The result of the above query is non-deterministic, because when applying ROWNUM without an ORDER BY Oracle can give you any 1,000 rows back. Given the nature of the query and the names of the columns and tables I assume that this is OK in your particular case ("process any unprocessed rows"), but I mention it anyway because I still see it used inappropriately quite often - see the sketch after these comments for a deterministic version.

                     2. Given the ROWNUM predicate the query is very likely optimized using the FIRST_ROWS_N mode of the optimizer, which can introduce some surprising decisions and also some bugs that you won't necessarily see when using the ALL_ROWS mode.

                    I don't think that your issue described here is caused by the FIRST_ROWS_N mode, but it might be useful to keep in mind.
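
                     Back to comment 1 - the deterministic version pushes an ORDER BY into an inline view before ROWNUM is applied. A sketch (the ordering column is just a plausible guess):
                       select *
                       from  (select *
                              from   gc_batch_events
                              where  event_status = 2           -- the original predicates go here
                              order by trigger_date)            -- ordering column assumed
                       where  rownum <= 1000;
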
                    Bad Plan :
                    no rows selected
                    
                    Elapsed: 00:05:43.28  --->
                    
                    Execution Plan
                    ----------------------------------------------------------
                    Plan hash value: 3816329578
                    
                    ----------------------------------------------------------------------------------------------------------
                    | Id  | Operation                     | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
                    ----------------------------------------------------------------------------------------------------------
                    |   0 | SELECT STATEMENT              |                          |     2 |   142 | 47859   (1)| 00:09:35 |
                    |*  1 |  COUNT STOPKEY                |                          |       |       |            |          |
                    |   2 |   INLIST ITERATOR             |                          |       |       |            |          |
                    |*  3 |    TABLE ACCESS BY INDEX ROWID| GC_BATCH_EVENTS          |     2 |   142 | 47859   (1)| 00:09:35 |
                    |*  4 |     INDEX RANGE SCAN          | I_GC_BATCH_EVENTS_STATUS |    24M|       |  3580   (3)| 00:00:43 |
                    ----------------------------------------------------------------------------------------------------------
                    
                    Predicate Information (identified by operation id):
                    ---------------------------------------------------
                    
                    1 - filter(ROWNUM<=1000)
                    3 - filter("THIS_"."CREATION_DATE">SYSDATE@! AND ("THIS_"."EVENT_STATUS"=2 OR
                    "THIS_"."EVENT_STATUS"=3 AND "THIS_"."LAST_UPDATED"<SYSDATE@!-.04166666666666666666666666666666666
                    66667) AND "THIS_"."EVENT_TYPE"='Fate' AND "THIS_"."TRIGGER_DATE"<=SYSDATE@! AND
                    "THIS_"."PROCESS_ATTEMPTS"<=10)
                    4 - access("THIS_"."MARKETPLACE_ID"=1 OR "THIS_"."MARKETPLACE_ID"=3 OR
                    "THIS_"."MARKETPLACE_ID"=4 OR "THIS_"."MARKETPLACE_ID"=5 OR "THIS_"."MARKETPLACE_ID"=6 OR
                    "THIS_"."MARKETPLACE_ID"=7 OR "THIS_"."MARKETPLACE_ID"=35691 OR "THIS_"."MARKETPLACE_ID"=44551)
                     Since EVENT_STATUS is part of both indexes, the above plan seems to exhibit a bug I've come across in the past where some of the predicates weren't applied as ACCESS predicates but as FILTER predicates. It is related to the usage of INLIST ITERATORS and some other circumstances I can't remember at the moment, but your other plan clearly shows that EVENT_STATUS should be part of the ACCESS predicate.

                    I'll write a separate reply about your histogram / cardinality issue.

                    Hope this helps,
                    Randolf
                    • 22. Re: OR Expansion Behavior
                      Randolf Geist
                      OraDBA02 wrote:
                       Thanks Jonathan for the detailed explanation. One of the links you posted describes bug 5483301, which seems to be what we're hitting here.
                       However, I have another oddity.
                       Sometimes the CBO picks up a cardinality of 50% of the table rows, and sometimes it uses a cardinality of 1 for values which are missing from the histogram.
                       In both cases the histogram bucket for the "rare" value is missing.

                      Database-Prod
                      SELECT COLUMN_NAME,ENDPOINT_NUMBER,ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE FROM DBA_TAB_HISTOGRAMS WHERE TABLE_NAME='GC_BATCH_EVENTS' AND OWNER='BOOKER'
                      AND COLUMN_NAME ='EVENT_STATUS';  
                      COLUMN_NAME             ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
                      ----------------------- --------------- -------------- ----------
                      EVENT_STATUS                    2580109              4
                      EVENT_STATUS                    2580133              5
                      
                       explain plan for select count(*) from GC_BATCH_EVENTS where event_status=2;
                      Plan hash value: 4000767050
                      --------------------------------------------------------------------------------------------------
                      | Id  | Operation        | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
                      --------------------------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT |                               |     1 |     3 |  8236   (1)| 00:01:39 |
                      |   1 |  SORT AGGREGATE  |                               |     1 |     3 |            |          |
                      |*  2 |   INDEX SKIP SCAN| I_GC_BATCH_EVENTS_STATUS_TYPE |     1 |     3 |  8236   (1)| 00:01:39 |
                      --------------------------------------------------------------------------------------------------
                      
                      Predicate Information (identified by operation id):
                      ---------------------------------------------------
                      
                      2 - access("EVENT_STATUS"=2)
                      filter("EVENT_STATUS"=2)
                      Database UAT
                      SELECT COLUMN_NAME,ENDPOINT_NUMBER,ENDPOINT_VALUE,ENDPOINT_ACTUAL_VALUE FROM DBA_TAB_HISTOGRAMS WHERE TABLE_NAME='GC_BATCH_EVENTS' AND OWNER='BOOKER'
                      AND COLUMN_NAME ='EVENT_STATUS';  
                      COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
                      -------------------- --------------- -------------- ----------
                      EVENT_STATUS                    5416              4
                      (Only one bucket for value=4)
                      
                       explain plan for select count(*) from GC_BATCH_EVENTS where event_status=2;
                      Plan hash value: 4000767050
                      --------------------------------------------------------------------------------------------------
                      | Id  | Operation        | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
                      --------------------------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT |                               |     1 |     3 | 10994   (1)| 00:02:12 |
                      |   1 |  SORT AGGREGATE  |                               |     1 |     3 |            |          |
                      |*  2 |   INDEX SKIP SCAN| I_GC_BATCH_EVENTS_STATUS_TYPE |    12M|    35M| 10994   (1)| 00:02:12 |
                      --------------------------------------------------------------------------------------------------
                      
                      Predicate Information (identified by operation id):
                      ---------------------------------------------------
                      
                      2 - access("EVENT_STATUS"=2)
                      filter("EVENT_STATUS"=2)
                      I think there is a pretty straightforward explanation for this behaviour and you can see a similar issue described on my blog:

                      http://oracle-randolf.blogspot.de/2010/04/frequency-histograms-edge-cases.html

                      In your first example most values found map to value 4 in the histogram, but Oracle managed to pick up the rare value 5 (24 occurrences), due to the increased sample size (10 percent).

                       This means that values that are not found in the frequency histogram will by default get a cardinality of 0.5 * 24 - half the frequency of the rarest value captured (according to the new algorithm introduced in 11g and backported to 10.2.0.4) - scaled up by 10 due to the 10 percent sample size (as pointed out by Jonathan).

                       But 2 is very likely outside the existing range of values (4-5), hence Oracle will come up with a downscaled cardinality - in this case scaled down to 1.

                      As demonstrated in the blog post mentioned, in the special case of a single value (low sample size, only the very popular value 4 found), this "out-of-range" detection doesn't work if your NUM_DISTINCT is different from the number of values found in the histogram.

                       Hence the downscale doesn't apply, and 0.5 * 100% of the rows is applied to the non-existent value 2, similar to what I show in the blog post.
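
                       To put numbers on it (assuming UAT holds roughly the same 24M rows as prod): 0.5 * 24,000,000 = 12,000,000, which is exactly the 12M estimate in the UAT plan above.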

                       Clearly the new 11g strategy of generating histograms from a very low sample size when using the new AUTO_SAMPLE_SIZE (Approx. NDV) algorithm - leading to inconsistent results between the basic column statistics populated by the new algorithm (NUM_DISTINCT, LOW_VALUE, HIGH_VALUE etc.) and the low-sample-size histogram - is a threat when dealing with highly skewed column values and searching for rare values.

                      Since you're already on 11g one possible approach could be to generate a virtual column that filters out the popular value and search for the unpopular values using that virtual column, also demonstrated in the blog post.
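
                       A minimal sketch of the virtual column idea (11g syntax; the column name is made up, and I'm assuming 4 is the popular value to hide):
                         alter table gc_batch_events add (
                           rare_event_status as (case when event_status != 4 then event_status end)
                         );

                         -- a histogram on the virtual column only sees the rare values
                         exec dbms_stats.gather_table_stats('BOOKER', 'GC_BATCH_EVENTS', method_opt => 'for columns rare_event_status size 254')

                         -- searches for rare values then use the virtual column:
                         select count(*) from gc_batch_events where rare_event_status = 2;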

                      Hope this helps,
                      Randolf
                      • 23. Re: OR Expansion Behavior
                        OraDBA02
                         I want to express my gratitude to Nikolay, Jonathan Lewis and Randolf Geist, who gave their valuable time in supporting this thread. Their contributions have solved a critical performance issue. Thank you all.
                        • 24. Re: OR Expansion Behavior
                          Rainer Stenzel
                          Jonathan Lewis wrote:
                           This is why I advise people to construct histograms through calls to dbms_stats.set_column_stats in extreme cases, and to refresh those histograms after any stats collection on the table. http://jonathanlewis.wordpress.com/2009/05/28/frequency-histograms/
                           Can you recommend the construction of frequency histograms with more than 255 different values ?
                           Will Oracle support this ?
                           Do you know/recommend OHU (Optimum Histogram Utility by Doug Laughlin / Optiwarehouze Solutions) or other products to do this job ?

                           Hope these questions match this topic without allegations of thread hijacking.

                          Best Regards,

                          Rainer Stenzel
                          • 25. Re: OR Expansion Behavior
                            Jonathan Lewis
                            Rainer Stenzel wrote:
                            Jonathan Lewis wrote:
                            This is why I advise people to construct histograms through calls to dbms_stats.set_column_stats in extreme cases, and to refresh those histograms after any stats collection on the table. http://jonathanlewis.wordpress.com/2009/05/28/frequency-histograms/
                             Can you recommend the construction of frequency histograms with more than 255 different values ?
                            Will Oracle support this ?
                            Do you know/recommend OHU (Optimum Histogram Utility by Doug Laughlin/ Optiwarehouze Solutions) or other products to do this job ?
                            The last time I tried to bypass the normal procedures and load more than 255 values into the relevant table Oracle ignored the histogram. Even if it worked I wouldn't advise that type of strategy since you never know when Oracle might change the logic of processing the data in a way that makes your method fail and (possibly) leaves you with a very limited time to find an alternative solution. If you can find a method that leaves the data looking like data that Oracle might have generated then I'd think about using it.

                            No - I don't think Oracle (Corp.) would support this.

                            Doug Laughlin sent me a link to a video demonstration of OHU, which I watched, but I haven't had time yet to look at it in detail. My impression at the time was that the video simply showed the impact on the optimizer calculations of the default compared to his method, but didn't give any indication of whether his method was taking advantage of legal mechanisms (e.g. the ODCI routines) to give Oracle a different statistical method, or whether he had subverted Oracle in a way that wouldn't be supported.

                             In general I would take the approach that you are unlikely to need more than about 250 "popular" values, including an indicator for the low and high values and an average frequency for everything else, to get an adequate histogram in almost all cases.

                            Regards
                            Jonathan Lewis