9 Replies Latest reply on Oct 28, 2015 9:56 AM by AleKa

    Greater than more efficient than equals?

    AleKa

      Good morning,

      in our company we manage multiple client databases oracle .

      Our framework "versions" records using  2  NUMBER(18) fields (lets call NSTART and NEND). So that we have, for example:

       

      ExtId   |    NSTART  |             NEND          |....and so on

      +++++++++++++++++++++++++++++++++

        1       |     1            |    10                          |.....

        1       |     10          |    12                          |....and so on

        1       |      12         | 999999999999999999 |....and so on

        2       |     67          |    789                        |....and so on

        2       |     789        | 999999999999999999 |....and so on


      The record with 999999999999999999 is the "last version record".

       

      It happens that we have to identify the last version, so we look for the NEND = 999999999999999999.

      In databases  with million records, the "research" is really long and often seems to freeze...but we discovered that if we just change:

      NEND = 999999999999999999

      with

      NEND > 999999999999999998


      it is much more faster (doesnt freeze and takes just seconds, not minutes....)


      Why?


      Thanks

      Alex

        • 1. Re: Greater than more efficient than equals?
          Chris Hunt

          Compare the query plans of the two queries, that may explain what's going on.

          • 2. Re: Greater than more efficient than equals?
            Scott Swank

            I suspect that your stats tell the optimizer that there are no such records, so it doesn't bother to look for them:

             

            SELECT high_value

              FROM all_tab_columns

            WHERE owner = 'YOUR_SCHEMA'

               AND table_name = 'YOUR_TABLE'

               AND column_name = 'NEND';

            • 3. Re: Greater than more efficient than equals?
              BrunoVroman

              Hello,

               

              one remark about the use of a "dummy value" 999999999999... It can cause issues to the optimizer because your column has for example values in the range [ 1 99999999999 ] (so maybe: a value of 1000 is supposed to be on the first percent of values, although in reality it is > the largest value if we except 99999999999...

              And you create artificially "skewed" data.

               

              Maybe you should rather use a NULL value instead of 9999999999... Dummy values can be really bad... Search the Web, I've seen an interesting and amusing article of Tom Kyte about this usage.

               

              When you have to find the "last version", is it for a given extid? In other words: what are the columns with conditions in your WHERE clause? If you really search with only

                 " SELECT ... FROM ... WHERE nend = 9999999999 "

              then turning to NULL as I suggest would force a full scan because your query is then

                " SELECT ... FROM ... WHERE nend IS NULL "

              In this case, depending of the cardinalities involved (what percentage of the table are these "last values" rows?), it might be a good idea to create a functional index on, for example: NVL( nend, 'L' ) (L for Last_value).  If other queries look for some given "nend" and if you build an index on NEND anyway, you might add a dummy value in the index: CREATE INDEX ... ( nend, 'x' ) so even the NULL entries will be indexed.

               

              If the queries involve other columns, you might simply put the nend column is a composite index

              example: " SELECT ... FROM ... WHERE ext_id = 123 AND nend IS NULL " might use an index on ( ext_id, nend )

               

              Maybe you can give some more info on your data and on the needs (queries to run) to receive more help...

               

              Best regards,

               

              Bruno Vroman.

              • 4. Re: Greater than more efficient than equals?
                William Robertson

                it is much more faster (doesnt freeze and takes just seconds, not minutes....)


                Why?

                 

                The optimiser came up with a different cardinality estimate, which led to a different plan, which happened to be better.

                • 5. Re: Greater than more efficient than equals?

                  it is much more faster (doesnt freeze and takes just seconds, not minutes....)


                  Why?

                  Because a full table scan is used for one case and an index is used for the other.

                  Our framework "versions" records using  2  NUMBER(18) fields (lets call NSTART and NEND). So that we

                  Why? Why don't you just move that 'old' data to a history table?

                  It happens that we have to identify the last version, so we look for the NEND = 999999999999999999.

                  Ok - but if you moved the 'old' data to a history table you wouldn't need to do ANY of that.

                   

                  You could just:

                   

                  1. query the 'current' table if you want only current data

                  2. query the 'history' table if you want only history data

                  3. query a new view that combines data from both tables if you want data from both tables

                   

                  If you want help with a performance issue read the FAQ for how to post a tuning request and the info you need to  provide:

                   

                  1. the query

                  2. the DDL for the tables and indexes

                  3. the execution plan

                  4. info on stats including how you collect them

                  5. row count info for tables and query predicates

                   

                  We have NO IDEA how many '999..' rows you have compared to other rows.

                   

                  For some use cases you could just create a functional index that ONLY indexes the '999...' values and converts other values to NULL so they don't get indexed.

                   

                  Then the index will only have values for the '999...'.

                   

                  Whether Oracle will actually use that index, and when, depends on the info you provide for those 5 pieces I mentioned.

                  • 6. Re: Greater than more efficient than equals?
                    William Robertson

                    I agree, inline versioning is horrible and destroys query efficiency as well as referential integrity. It seems to be used on every project I work on these days. I really think Oracle is missing a trick here in not supporting it transparently (and I don't just mean the AS OF PERIOD syntactic sugar in 12c).

                     

                    Possibly subpartitioning on the range-end column and enabling row movement might be worth the overhead, as queries that just want the latest row would have much less work to do.

                    • 7. Re: Greater than more efficient than equals?
                      Jonathan Lewis

                      It looks like you've found a defect in the way Oracle generates a height-balanced histogram for a numeric column.

                       

                      I think your histogram has got 1e19 as its high value rather than 999,999,999,999,999,999 so the query on equality calculates that 999,999,999,999,999,999 is a very ordinary value with a tiny number of rows, while the greater than query sees that a large percentage of your data is a very large.  This results in a change in execution plan.

                       

                      If you used 999,999,999,999,999 as the highest value the difference would disappear - and both plans would take the same path (the current "> 999999999999999998" one).

                      If I get time I'll write up some demo code on my blog.

                       

                      Regards

                      Jonathan Lewis

                      • 8. Re: Greater than more efficient than equals?
                        Jonathan Lewis

                        I wrote up a blog note on Friday - and it turns out that I had already discovered the problem about 18 months ago. It's fixed in 12.1.0.2: https://jonathanlewis.wordpress.com/2015/10/23/histogram-limit/

                         

                        There is a bug number, with the correct title but the base bug trail leads to a different problem:Bug 18514507 : WRONG CARDINALITY ESTIMATES WHEN NUMERIC VALUE IS LONGER THAN 15 BYTES

                         

                        Regards

                        Jonathan Lewis

                        • 9. Re: Greater than more efficient than equals?
                          AleKa

                          Many thanks,

                          I've read your article (very clear) and sent the link to our dba, I'll send updates.

                           

                          Cya

                          Alex