Select UNIQUE with compressed index slow

954696
954696 Member Posts: 17
edited June 2018 in SQL & PL/SQL

Hi!

I am doing a SELECT UNIQUE on a column that has a few values in a table with many rows.

I would expect a compressed index would make the operation very fast, but it does not.

Example follows:

create table t (val varchar2(50));
alter table t modify val not null;
insert into t select 'foobar' from dual connect by level<1000000;
create index t1 on t (val) compress 1;

Now a "select unique val from t" before the index creation uses 1955 consistent gets, and after the index creation 1551 consistent gets.

It is a bit less but not nearly as much as I expected.

As there is only one value in the index and it should be stored once due to compression, I would expect it would be very fast to retrieve all unique values.

Why isn't it so?

Is there a way to make it faster?

DB version: 11.2.0.2

Plan:

select unique val from test.t;

VAL
--------------------------------------------------
foobar

Execution Plan
----------------------------------------------------------
Plan hash value: 225264052

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |     9 |   451   (8)| 00:00:06 |
|   1 |  HASH UNIQUE          |      |     1 |     9 |   451   (8)| 00:00:06 |
|   2 |   INDEX FAST FULL SCAN| T1   |   999K|  8789K|   422   (1)| 00:00:06 |
------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1551  consistent gets
          0  physical reads
          0  redo size
        490  bytes sent via SQL*Net to client
        437  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Regards,

David


Answers

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited May 2018
    954696 wrote:

    Hi!

    I am doing a SELECT UNIQUE on a column that has a few values in a table with many rows.

    I would expect a compressed index would make the operation very fast, but it does not.

    Example follows:

    create table t (val varchar2(50));
    alter table t modify val not null;
    insert into t select 'foobar' from dual connect by level<1000000;
    create index t1 on t (val) compress 1;

    Now a "select unique val from t" before the index creation uses 1955 consistent gets, and after the index creation 1551 consistent gets.

    It is a bit less but not nearly as much as I expected.

    As there is only one value in the index and it should be stored once due to compression, I would expect it would be very fast to retrieve all unique values.

    Why isn't it so?

    Is there a way to make it faster?

    DB version: 11.2.0.2

    Plan:

    select unique val from test.t;

    VAL
    --------------------------------------------------
    foobar

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 225264052

    ------------------------------------------------------------------------------
    | Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT      |      |     1 |     9 |   451   (8)| 00:00:06 |
    |   1 |  HASH UNIQUE          |      |     1 |     9 |   451   (8)| 00:00:06 |
    |   2 |   INDEX FAST FULL SCAN| T1   |   999K|  8789K|   422   (1)| 00:00:06 |
    ------------------------------------------------------------------------------

    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
           1551  consistent gets
              0  physical reads
              0  redo size
            490  bytes sent via SQL*Net to client
            437  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed

    Regards,
    David

    The index still has to store the rowids, even when compressed. In order to get all the distinct values you have to read all of the leaf blocks. Here, the optimizer decided the best way to read all those leaf blocks was to scan the entire index using multi-block reads and then eliminate the duplicates (the HASH UNIQUE step).

    Alternatively (and the compression will make little difference here too), you can do what is informally referred to as an index bouncy scan, which takes advantage of min/max reads to hop to the next distinct value by traversing the index branch blocks to a single leaf block. https://jonathanlewis.wordpress.com/2017/02/09/index-bouncy-scan/ is a nice write-up of the technique.

    Here is what I would do for your requirement

    with bouncy (val)
      as (select min(val) val
          from   t
          union all
          select (select min(t.val) val from t where t.val > bouncy.val) val
          from   bouncy
          where  bouncy.val is not null
         )
    select *
    from   bouncy
    where  val is not null;

    If I insert an extra row, this is my comparison:

    SQL> select unique val from t;

    VAL
    --------------------------------------------------
    foobar
    foobar2


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1793979440

    -----------------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time    |
    -----------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT  |      |  1959K|    50M|      | 15852  (1)| 00:00:01 |
    |  1 |  HASH UNIQUE      |      |  1959K|    50M|    67M| 15852  (1)| 00:00:01 |
    |  2 |  TABLE ACCESS FULL| T    |  1959K|    50M|      |  968  (2)| 00:00:01 |
    -----------------------------------------------------------------------------------

    Note
    -----
      - dynamic statistics used: dynamic sampling (level=2)


    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
          3470  consistent gets
              0  physical reads
              0  redo size
            600  bytes sent via SQL*Net to client
            572  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              2  rows processed

    SQL>
    SQL> with bouncy (val)
      2    as (select min(val) val
      3        from  t
      4  union all
      5    select (select min(t.val) val from t where t.val > bouncy.val) val
      6    from  bouncy
      7    where  bouncy.val is not null
      8      )
      9  select *
    10  from  bouncy
    11  where  val is not null;

    VAL
    --------------------------------------------------
    foobar
    foobar2


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1014999004

    --------------------------------------------------------------------------------------------------
    | Id  | Operation                                | Name | Rows  | Bytes | Cost (%CPU)| Time    |
    --------------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT                          |      |    2 |    54 |    27  (0)| 00:00:01 |
    |*  1 |  VIEW                                    |      |    2 |    54 |    27  (0)| 00:00:01 |
    |  2 |  UNION ALL (RECURSIVE WITH) BREADTH FIRST|      |      |      |            |          |
    |  3 |    SORT AGGREGATE                        |      |    1 |    27 |            |          |
    |  4 |    INDEX FULL SCAN (MIN/MAX)            | T1  |    1 |    27 |    3  (0)| 00:00:01 |
    |  5 |    SORT AGGREGATE                        |      |    1 |    27 |            |          |
    |  6 |    FIRST ROW                            |      |    1 |    27 |    3  (0)| 00:00:01 |
    |*  7 |      INDEX RANGE SCAN (MIN/MAX)          | T1  |    1 |    27 |    3  (0)| 00:00:01 |
    |*  8 |    RECURSIVE WITH PUMP                    |      |      |      |            |          |
    --------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      1 - filter("VAL" IS NOT NULL)
      7 - access("T"."VAL">:B1)
      8 - filter("BOUNCY"."VAL" IS NOT NULL)

    Note
    -----
      - dynamic statistics used: dynamic sampling (level=2)


    Statistics
    ----------------------------------------------------------
              9  recursive calls
              0  db block gets
            17  consistent gets
              0  physical reads
              0  redo size
            600  bytes sent via SQL*Net to client
            572  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              4  sorts (memory)
              0  sorts (disk)
              2  rows processed

  • 954696
    954696 Member Posts: 17
    edited May 2018

    Thanks, that looks promising!

    PS: Why quote the entire message?

  • Unknown
    edited May 2018
    PS: Why quote the entire message?

    So that there is a record of EXACTLY what someone posted that was being responded to.

    Some people have been known to edit their post AFTER people have responded to it and then the reply may not make any sense in the context of the new/edited post.

    And sometimes when someone makes a bonehead mistake and is told about it they edit their post to REMOVE EVERYTHING except a single character so others can't see the bonehead mistake they made.

    Andrew's answer wouldn't make any sense if you deleted, or seriously altered, what your initial question was.

  • 954696
    954696 Member Posts: 17
    edited May 2018

    Makes sense.

    Although the quotes look really weird, as if doubly quoted, except for the code part, which is displayed as not quoted at all.

    I guess it is the forum code...

  • 954696
    954696 Member Posts: 17
    edited May 2018

    Can the bouncy method be expanded for two columns?

    create table t (val1 varchar2(10), val2 varchar2(500));

    select unique val1, val2 from t;
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018
    954696 wrote:

    Can the bouncy method be expanded for two columns?

    create table t (
      val1 varchar2(10),
      val2 varchar2(500));

    select unique val1, val2 from t;

    Quick answer to your question - assuming val1 and val2 are declared not null (though there may be a nullable solution if I think about it a bit longer), the answer is yes.

    My first thought is a two-step mechanism:  use the recursive CTE method to produce a list of unique val1 values with the corresponding minimum for val2, then use that result in a recursive CTE that uses the method Andrew showed to find all the val2 values for each val1 value in turn.

    Since you also posed the question in a comment on my blog I'll write up a blog note about it after I've worked out the details and link back to it. I may have time to do it some time this evening.

    Regards

    Jonathan Lewis

  • William Robertson
    William Robertson Member Posts: 9,560 Bronze Crown
    edited May 2018

    Agreed, given that the forum Jive code doesn't handle quoting or code particularly well, I would personally take a chance on the original post not changing significantly, in order to spare everyone from having to scroll down through several pages of mangled quotes to get to the actual answer.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018

    Still preparing the note - explaining code takes so much longer than creating the code!

    Here's a solution - others may do better.  I had to hint one line to stop an index fast full scan, but that was a side effect of the statistics (particularly the size of the data set and the number of distinct values).  val1 and val2 are declared not null in my test, and there's an index starting with (val1, val2).

    with bounce1(val1, val2) as (
            select val1, val2
            from    (
                    select
                            /*+ index(t1) no_index_ffs(t1) */
                            val1, val2,
                            row_number() over(order by val1, val2) rn
                    from    t1
            )
            where
                    rn = 1
            union all
            select
                    (select min(t1.val1) val1 from t1 where t1.val1 > bounce1.val1) val1,
                    (select min(t1.val2) val2 from t1 where t1.val1 > bounce1.val1 and rownum = 1) val2
            from    bounce1
            where   bounce1.val1 is not null
    ),
    bounce2 (val1, val2)
    as (
            select  val1, val2
            from    bounce1
            where   bounce1.val1 is not null
            union all
            select  bounce2.val1, (select min(t1.val2) val2 from t1 where t1.val1 = bounce2.val1 and t1.val2 > bounce2.val2) val2
            from    bounce2
            where   bounce2.val2 is not null
            and     bounce2.val1 is not null
    )
    select * from bounce2
    where
            bounce2.val2 is not null
    and     bounce2.val1 is not null        -- > redundant predicate
    order by
            val1, val2
    ;

    Regards

    Jonathan Lewis

    UPDATE:  Background article now available at: https://jonathanlewis.wordpress.com/2018/05/29/index-bouncy-scan-2/

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited May 2018

    With this as my setup:

    create table t1 as select mod(rownum,10) val1, mod(rownum,3) val2 from dual connect by rownum <=10000;
    create index t1_idx on t1 (val1, val2);
    alter table t1 modify val1 not null;
    alter table t1 modify val2 not null;

    I get

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    | Id  | Operation                                    | Name   | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT                             |        |      1 |        |       |    56 (100)|          |     30 |00:00:00.01 |      66 |       |       |          |

    |   1 |  SORT ORDER BY                               |        |      1 |      4 |   104 |    56   (2)| 00:00:01 |     30 |00:00:00.01 |      66 |  2048 |  2048 | 2048  (0)|

    |*  2 |   VIEW                                       |        |      1 |      4 |   104 |    55   (0)| 00:00:01 |     30 |00:00:00.01 |      66 |       |       |          |

    |   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST  |        |      1 |        |       |            |          |     40 |00:00:00.01 |      66 |  1024 |  1024 |          |

    |*  4 |     VIEW                                     |        |      1 |      2 |    52 |    27   (0)| 00:00:01 |     10 |00:00:00.01 |      34 |       |       |          |

    |   5 |      UNION ALL (RECURSIVE WITH) BREADTH FIRST|        |      1 |        |       |            |          |     11 |00:00:00.01 |      34 |  1024 |  1024 |          |

    |*  6 |       VIEW                                   |        |      1 |      1 |    39 |    13   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |       |       |          |

    |*  7 |        WINDOW NOSORT STOPKEY                 |        |      1 |  10000 | 60000 |    13   (0)| 00:00:01 |      1 |00:00:00.01 |       2 | 73728 | 73728 |          |

    |   8 |         INDEX FULL SCAN                      | T1_IDX |      1 |  10000 | 60000 |    13   (0)| 00:00:01 |      2 |00:00:00.01 |       2 |       |       |          |

    |   9 |       SORT AGGREGATE                         |        |     10 |      1 |     3 |            |          |     10 |00:00:00.01 |      12 |       |       |          |

    |  10 |        FIRST ROW                             |        |     10 |      1 |     3 |     1   (0)| 00:00:01 |      9 |00:00:00.01 |      12 |       |       |          |

    |* 11 |         INDEX RANGE SCAN (MIN/MAX)           | T1_IDX |     10 |      1 |     3 |     1   (0)| 00:00:01 |      9 |00:00:00.01 |      12 |       |       |          |

    |  12 |       SORT AGGREGATE                         |        |     10 |      1 |     6 |            |          |     10 |00:00:00.01 |      20 |       |       |          |

    |* 13 |        COUNT STOPKEY                         |        |     10 |        |       |            |          |      9 |00:00:00.01 |      20 |       |       |          |

    |* 14 |         INDEX RANGE SCAN                     | T1_IDX |     10 |    500 |  3000 |     1   (0)| 00:00:01 |      9 |00:00:00.01 |      20 |       |       |          |

    |  15 |       RECURSIVE WITH PUMP                    |        |     11 |        |       |            |          |     10 |00:00:00.01 |       0 |       |       |          |

    |  16 |     SORT AGGREGATE                           |        |     30 |      1 |     6 |            |          |     30 |00:00:00.01 |      32 |       |       |          |

    |  17 |      FIRST ROW                               |        |     30 |      1 |     6 |     1   (0)| 00:00:01 |     20 |00:00:00.01 |      32 |       |       |          |

    |* 18 |       INDEX RANGE SCAN (MIN/MAX)             | T1_IDX |     30 |      1 |     6 |     1   (0)| 00:00:01 |     20 |00:00:00.01 |      32 |       |       |          |

    |  19 |     RECURSIVE WITH PUMP                      |        |      4 |        |       |            |          |     30 |00:00:00.01 |       0 |       |       |          |

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

       2 - filter(("BOUNCE2"."VAL2" IS NOT NULL AND "BOUNCE2"."VAL1" IS NOT NULL))

       4 - filter("BOUNCE1"."VAL1" IS NOT NULL)

       6 - filter("RN"=1)

       7 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)

      11 - access("T1"."VAL1">:B1)

      13 - filter(ROWNUM=1)

      14 - access("T1"."VAL1">:B1)

      18 - access("T1"."VAL1"=:B1 AND "T1"."VAL2">:B2)

    We can eliminate one of those Index range scans by using a cross apply rather than two scalar subqueries:

    with bounce1(val1, val2) as (
            select val1, val2
            from    (
                    select
                            /*+ index(t1) no_index_ffs(t1) */
                            val1, val2,
                            row_number() over(order by val1, val2) rn
                    from    t1
            )
            where
                    rn = 1
            union all
            select
                    ca.val1
                   ,ca.val2
            from    bounce1
            cross  apply (select val1, val2
                          from  (select /*+ index(t1) no_index_ffs(t1) */
                                         val1, val2
                                        ,row_number() over(order by val1, val2) rn
                                 from    t1
                                 where   t1.val1 > bounce1.val1
                                )
                          where  rn = 1
                         ) ca
            where   bounce1.val1 is not null
    ) CYCLE val1,val2 SET cycle TO 1 DEFAULT 0,
    bounce2 (val1, val2)
    as (
            select  val1, val2
            from    bounce1
            where   bounce1.val1 is not null
            union all
            select  bounce2.val1, (select min(t1.val2) val2 from t1 where t1.val1 = bounce2.val1 and t1.val2 > bounce2.val2) val2
            from    bounce2
            where   bounce2.val2 is not null
            and     bounce2.val1 is not null
    )
    select * from bounce2
    where
            bounce2.val2 is not null
    and     bounce2.val1 is not null        -- > redundant predicate
    order by
            val1, val2
    ;

    (The cycle clause is to prevent ORA-32044: cycle detected while executing recursive WITH query; we could probably avoid the looping somehow.)

    I get this plan:

    Plan hash value: 3542522518

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    | Id  | Operation                                    | Name            | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT                             |                 |      1 |        |       |    55 (100)|          |     30 |00:00:00.04 |     162 |       |       |          |

    |   1 |  SORT ORDER BY                               |                 |      1 |      4 |   104 |    55   (2)| 00:00:01 |     30 |00:00:00.04 |     162 |  2048 |  2048 | 2048  (0)|

    |*  2 |   VIEW                                       |                 |      1 |      4 |   104 |    54   (0)| 00:00:01 |     30 |00:00:00.04 |     162 |       |       |          |

    |   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST  |                 |      1 |        |       |            |          |     40 |00:00:00.04 |     162 |  1024 |  1024 |          |

    |*  4 |     VIEW                                     |                 |      1 |      2 |    52 |    27   (0)| 00:00:01 |     10 |00:00:00.04 |     130 |       |       |          |

    |   5 |      UNION ALL (RECURSIVE WITH) BREADTH FIRST|                 |      1 |        |       |            |          |     10 |00:00:00.04 |     130 |  1024 |  1024 |          |

    |*  6 |       VIEW                                   |                 |      1 |      1 |    39 |    13   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |       |       |          |

    |*  7 |        WINDOW NOSORT STOPKEY                 |                 |      1 |  10000 | 60000 |    13   (0)| 00:00:01 |      1 |00:00:00.01 |       2 | 73728 | 73728 |          |

    |   8 |         INDEX FULL SCAN                      | T1_IDX          |      1 |  10000 | 60000 |    13   (0)| 00:00:01 |      2 |00:00:00.01 |       2 |       |       |          |

    |   9 |       NESTED LOOPS                           |                 |     10 |      1 |    39 |    14   (0)| 00:00:01 |      9 |00:00:00.04 |     128 |       |       |          |

    |  10 |        RECURSIVE WITH PUMP                   |                 |     10 |        |       |            |          |     10 |00:00:00.01 |       0 |       |       |          |

    |  11 |        VIEW                                  | VW_LAT_A83890C2 |     10 |      1 |    26 |     2   (0)| 00:00:01 |      9 |00:00:00.04 |     128 |       |       |          |

    |* 12 |         VIEW                                 |                 |     10 |      1 |    39 |     2   (0)| 00:00:01 |      9 |00:00:00.04 |     128 |       |       |          |

    |* 13 |          WINDOW BUFFER PUSHED RANK           |                 |     10 |    500 |  3000 |     2   (0)| 00:00:01 |      9 |00:00:00.04 |     128 |  2048 |  2048 | 2048  (0)|

    |* 14 |           INDEX RANGE SCAN                   | T1_IDX          |     10 |    500 |  3000 |     2   (0)| 00:00:01 |  45000 |00:00:00.02 |     128 |       |       |          |

    |  15 |     SORT AGGREGATE                           |                 |     30 |      1 |     6 |            |          |     30 |00:00:00.01 |      32 |       |       |          |

    |  16 |      FIRST ROW                               |                 |     30 |      1 |     6 |     1   (0)| 00:00:01 |     20 |00:00:00.01 |      32 |       |       |          |

    |* 17 |       INDEX RANGE SCAN (MIN/MAX)             | T1_IDX          |     30 |      1 |     6 |     1   (0)| 00:00:01 |     20 |00:00:00.01 |      32 |       |       |          |

    |  18 |     RECURSIVE WITH PUMP                      |                 |      4 |        |       |            |          |     30 |00:00:00.01 |       0 |       |       |          |

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

       2 - filter(("BOUNCE2"."VAL2" IS NOT NULL AND "BOUNCE2"."VAL1" IS NOT NULL))

       4 - filter("BOUNCE1"."VAL1" IS NOT NULL)

       6 - filter("RN"=1)

       7 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)

      12 - filter("RN"=1)

      13 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)

      14 - access("T1"."VAL1">"BOUNCE1"."VAL1")

      17 - access("T1"."VAL1"=:B1 AND "T1"."VAL2">:B2)

    I'm getting a lot more buffers, but they're centered around line 14, which looks like it is not applying our first-rows optimization early enough. So let's try changing the analytic row_number filter to just an order by and rownum:
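    As a sketch of that change (this is an assumption about the exact text, since the modified query isn't shown; it relies on the index supplying the (val1, val2) order so that "rownum = 1" returns the lowest pair), the cross apply block in the query above might become:

    cross  apply (select val1, val2
                  from   (select /*+ index(t1) no_index_ffs(t1) */
                                 val1, val2
                          from   t1
                          where  t1.val1 > bounce1.val1
                          order by val1, val2
                         )
                  where  rownum = 1
                 ) ca

    With that change the plan becomes: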

    Plan hash value: 1786565098

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    | Id  | Operation                                    | Name            | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    |   0 | SELECT STATEMENT                             |                 |      1 |        |       |    54 (100)|          |     30 |00:00:00.01 |      54 |       |       |          |

    |   1 |  SORT ORDER BY                               |                 |      1 |      4 |   104 |    54   (2)| 00:00:01 |     30 |00:00:00.01 |      54 |  2048 |  2048 | 2048  (0)|

    |*  2 |   VIEW                                       |                 |      1 |      4 |   104 |    53   (0)| 00:00:01 |     30 |00:00:00.01 |      54 |       |       |          |

    |   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST  |                 |      1 |        |       |            |          |     40 |00:00:00.01 |      54 |  1024 |  1024 |          |

    |*  4 |     VIEW                                     |                 |      1 |      2 |    52 |    26   (0)| 00:00:01 |     10 |00:00:00.01 |      22 |       |       |          |

    |   5 |      UNION ALL (RECURSIVE WITH) BREADTH FIRST|                 |      1 |        |       |            |          |     10 |00:00:00.01 |      22 |  1024 |  1024 |          |

    |*  6 |       VIEW                                   |                 |      1 |      1 |    39 |    13   (0)| 00:00:01 |      1 |00:00:00.01 |       2 |       |       |          |

    |*  7 |        WINDOW NOSORT STOPKEY                 |                 |      1 |  10000 | 60000 |    13   (0)| 00:00:01 |      1 |00:00:00.01 |       2 | 73728 | 73728 |          |

    |   8 |         INDEX FULL SCAN                      | T1_IDX          |      1 |  10000 | 60000 |    13   (0)| 00:00:01 |      2 |00:00:00.01 |       2 |       |       |          |

    |   9 |       NESTED LOOPS                           |                 |     10 |      1 |    39 |    14   (0)| 00:00:01 |      9 |00:00:00.01 |      20 |       |       |          |

    |  10 |        RECURSIVE WITH PUMP                   |                 |     10 |        |       |            |          |     10 |00:00:00.01 |       0 |       |       |          |

    |  11 |        VIEW                                  | VW_LAT_A83890C2 |     10 |      1 |    26 |     1   (0)| 00:00:01 |      9 |00:00:00.01 |      20 |       |       |          |

    |* 12 |         COUNT STOPKEY                        |                 |     10 |        |       |            |          |      9 |00:00:00.01 |      20 |       |       |          |

    |  13 |          VIEW                                |                 |     10 |      1 |    26 |     1   (0)| 00:00:01 |      9 |00:00:00.01 |      20 |       |       |          |

    |* 14 |           INDEX RANGE SCAN                   | T1_IDX          |     10 |      1 |     6 |     1   (0)| 00:00:01 |      9 |00:00:00.01 |      20 |       |       |          |

    |  15 |     SORT AGGREGATE                           |                 |     30 |      1 |     6 |            |          |     30 |00:00:00.01 |      32 |       |       |          |

    |  16 |      FIRST ROW                               |                 |     30 |      1 |     6 |     1   (0)| 00:00:01 |     20 |00:00:00.01 |      32 |       |       |          |

    |* 17 |       INDEX RANGE SCAN (MIN/MAX)             | T1_IDX          |     30 |      1 |     6 |     1   (0)| 00:00:01 |     20 |00:00:00.01 |      32 |       |       |          |

    |  18 |     RECURSIVE WITH PUMP                      |                 |      4 |        |       |            |          |     30 |00:00:00.01 |       0 |       |       |          |

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

       2 - filter(("BOUNCE2"."VAL2" IS NOT NULL AND "BOUNCE2"."VAL1" IS NOT NULL))

       4 - filter("BOUNCE1"."VAL1" IS NOT NULL)

       6 - filter("RN"=1)

       7 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)

      12 - filter(ROWNUM=1)

      14 - access("T1"."VAL1">"BOUNCE1"."VAL1")

      17 - access("T1"."VAL1"=:B1 AND "T1"."VAL2">:B2)

    This is only slightly fewer buffers than your solution, but it's mainly a neat example of using cross apply to reduce the overhead of scalar subqueries that are visiting the same row.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018

    Andrew,

    Excellent !

    I tried to do something similar with lateral() but I couldn't get it to work (wrong results!) and "cross apply" never crossed my mind.

    The benefit of your approach is that I have to probe N times more than you do, where N is the number of distinct values of the first column; you're only probing (in effect) for the second column.

    Interestingly the optimizer has transformed your "cross apply" into a lateral() - see the VW_LAT - so I'll have to revisit my lateral test to find out what I did wrong.

    Regards

    Jonathan Lewis

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018

    Cracked it:

    I had got the wrong results with:

        "lateral (select ....)".

    Looked at the 10053 UNPARSED from your cross apply code and changed mine to

        "lateral(select * from (select ...))"

    and got the right results.

    Regards

    Jonathan Lewis

    Update: and now I have to figure out why that makes a difference - assuming it's supposed to.

    Update 2: I've just realised that your "cross apply" code has the (apparently) redundant extra select coded explicitly. Do you already know why it's necessary? It looks like the "rownum=1" gets pushed to the wrong point if it isn't.

    Update 3: It looks like it's a bug. I put the hint /*+ no_query_transformation */ into the main query and got the right results without the extra select layer.
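    (A sketch only, not Jonathan's actual text: assuming a bounce1 CTE that returns the first (val1, val2) pair as in the earlier queries, the simpler hinted lateral form he describes might look something like this.)

    with bounce1(val1, val2) as (
            select val1, val2
            from   (select /*+ index(t1) no_index_ffs(t1) */
                           val1, val2,
                           row_number() over (order by val1, val2) rn
                    from   t1)
            where  rn = 1
    )
    select  /*+ no_query_transformation */
            ca.val1, ca.val2
    from    bounce1,
            lateral (select t1.val1, t1.val2
                     from   t1
                     where  t1.val1 > bounce1.val1
                     and    rownum = 1
                    ) ca
    where   bounce1.val1 is not null;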

  • 954696
    954696 Member Posts: 17
    edited May 2018

    Hi!

    Thanks for the two-column version. It works fine, except in one case.

    My table is partitioned and if I use a GLOBAL index on (val1,val2) everything works fine (fast), but with "create index ti on t1(val1,val2,1) local;" it is much slower.

    There are many buffer reads (15000 versus 30 - I have 10 partitions with 1M rows each).

    What sticks out is the 10M A-Rows in line 9 of the plan:

    |   9 |          INDEX FULL SCAN                     | T1   |     10 |     10M| 15458   (1)|     10M|00:00:04.70 |   15410 |  15389 |       |       |          |

    This is line 8 from your plan on https://jonathanlewis.wordpress.com/2018/05/29/index-bouncy-scan-2/

    (my plan is the same, except for 4 added PARTITION RANGE ALL items)

    With a global index it has 2 A-Rows.

    My table has 4 unique val1,val2 values.

    PS: As I see no collapse function in the forum, I'll post the whole plan and data creation only upon request. It makes the thread very long and unclear.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018

    Partitioned indexes are always a special case and often cause clever ideas to fail.

    Basic problem in this type of case: even if the smallest value for val1 appears in partition N, that doesn't necessarily mean the smallest value for val2 appears in the same partition.

    Except for very special cases the approach has to be modified so that the second set of probes checks each index partition for every val1 value selected. (Obviously a bit of procedural code could do this and then combine the results - a single SQL statement is going to take a little more thought and might be too complicated to be desirable.)

    Regards

    Jonathan Lewis

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited May 2018
    Jonathan Lewis wrote:

    Cracked it:

    I had got the wrong results with "lateral (select ....)". Looked at the 10053 UNPARSED from your cross apply code and changed mine to "lateral(select * from (select ...))" and got the right results.

    Regards,
    Jonathan Lewis

    Update: and now I have to figure out why that makes a difference - assuming it's supposed to.

    Update 2: I've just realised that your "cross apply" code has the (apparently) redundant extra select coded explicitly. Do you already know why it's necessary? It looks like the "rownum=1" gets pushed to the wrong point if it isn't.

    Update 3: It looks like it's a bug. I put the hint /*+ no_query_transformation */ into the main query and got the right results without the extra select layer.

    I had used the subquery to make sure the order by was processed before the rownum filter, but I see what you mean now.

    Here's the (explain) plan for the first iteration of that recursive with:

    with bounce1(val1, val2) as (
            select val1, val2
            from    (
                    select
                            /*+ index(t1) no_index_ffs(t1) */
                            val1, val2,
                            row_number() over(order by val1, val2) rn
                    from    t1
            )
            where
                    rn = 1
    )
    select
            ca.val1
           ,ca.val2
    from    bounce1
    cross  apply (select val1, val2
                  from   t1
                  where  t1.val1 > bounce1.val1
                  and    rownum = 1
                 ) ca
    where   bounce1.val1 is not null
    /

    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------
    Plan hash value: 817484536

    -------------------------------------------------------------------------------------------
    | Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time    |
    -------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT        |                |    1 |    52 |    27  (0)| 00:00:01 |
    |  1 |  NESTED LOOPS          |                |    1 |    52 |    27  (0)| 00:00:01 |
    |*  2 |  VIEW                  |                |    1 |    26 |    25  (0)| 00:00:01 |
    |*  3 |    WINDOW NOSORT STOPKEY|                | 10000 | 60000 |    25  (0)| 00:00:01 |
    |  4 |    INDEX FULL SCAN    | T1_IDX          | 10000 | 60000 |    25  (0)| 00:00:01 |
    |*  5 |  VIEW                  | VW_DCL_91B52CC4 |    1 |    26 |    2  (0)| 00:00:01 |
    |*  6 |    COUNT STOPKEY        |                |      |      |            |          |
    |  7 |    INDEX FAST FULL SCAN| T1_IDX          |    1 |    6 |    2  (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - filter("RN"=1)
      3 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)
      5 - filter("VAL1">"VAL1")
      6 - filter(ROWNUM=1)

    The join predicate is not being pushed down all the way into the view but is getting applied against the vw_dcl_% view (the decorrelated view) instead. This looks silly to me, but I have seen similar things occur when using these new join operators - I think it's a legal transformation but I can't see anything in the docs supporting it or otherwise.

    Looking at the plan outline, I spotted

    DECORRELATE(@SEL$2)

    So I modified that to no_DECORRELATE(@SEL$2) and bingo:

    Plan hash value: 3003058002

    -------------------------------------------------------------------------------------------
    | Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time    |
    -------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT        |                |    1 |    52 |    27  (0)| 00:00:01 |
    |  1 |  NESTED LOOPS          |                |    1 |    52 |    27  (0)| 00:00:01 |
    |*  2 |  VIEW                  |                |    1 |    26 |    25  (0)| 00:00:01 |
    |*  3 |    WINDOW NOSORT STOPKEY|                | 10000 | 60000 |    25  (0)| 00:00:01 |
    |  4 |    INDEX FULL SCAN    | T1_IDX          | 10000 | 60000 |    25  (0)| 00:00:01 |
    |  5 |  VIEW                  | VW_LAT_91B52CC4 |    1 |    26 |    2  (0)| 00:00:01 |
    |*  6 |    COUNT STOPKEY        |                |      |      |            |          |
    |*  7 |    INDEX RANGE SCAN    | T1_IDX          |    1 |    6 |    2  (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - filter("RN"=1)
      3 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)
      6 - filter(ROWNUM=1)
      7 - access("T1"."VAL1">"VAL1")

    Another way to prevent that decorrelation is to make the cross apply more complex - the first thing I tried was adding a union all:
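    (A sketch of that shape - the exact text isn't shown, but the FAST DUAL and "NULL IS NOT NULL" lines in the plan suggest a contradictory dual branch whose only job is to stop the decorrelation:)

    cross  apply (select val1, val2
                  from   t1
                  where  t1.val1 > bounce1.val1
                  and    rownum = 1
                  union all
                  select null, null
                  from   dual
                  where  1 = 0     -- never true; just makes the apply block too complex to decorrelate
                 ) ca

    The plan with the union all in place: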

    Plan hash value: 3821127209

    -------------------------------------------------------------------------------------------
    | Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time    |
    -------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT        |                |    2 |  104 |    27  (0)| 00:00:01 |
    |  1 |  NESTED LOOPS          |                |    2 |  104 |    27  (0)| 00:00:01 |
    |*  2 |  VIEW                  |                |    1 |    26 |    25  (0)| 00:00:01 |
    |*  3 |    WINDOW NOSORT STOPKEY|                | 10000 | 60000 |    25  (0)| 00:00:01 |
    |  4 |    INDEX FULL SCAN    | T1_IDX          | 10000 | 60000 |    25  (0)| 00:00:01 |
    |  5 |  VIEW                  | VW_LAT_3A0DFF57 |    2 |    52 |    2  (0)| 00:00:01 |
    |  6 |    UNION-ALL            |                |      |      |            |          |
    |*  7 |    COUNT STOPKEY      |                |      |      |            |          |
    |*  8 |      INDEX RANGE SCAN  | T1_IDX          |    1 |    6 |    2  (0)| 00:00:01 |
    |*  9 |    FILTER              |                |      |      |            |          |
    |  10 |      FAST DUAL          |                |    1 |      |    2  (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - filter("RN"=1)
      3 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)
      7 - filter(ROWNUM=1)
      8 - access("T1"."VAL1">"VAL1")
      9 - filter(NULL IS NOT NULL)

    But the more obvious thing you might try is aggregation, which works too.

    with bounce1(val1, val2) as (
            select val1, val2
            from    (
                    select
                            /*+ index(t1) no_index_ffs(t1) */
                            val1, val2,
                            row_number() over(order by val1, val2) rn
                    from    t1
            )
            where
                    rn = 1
    )
    select  ca.val1
           ,ca.val2
    from    bounce1
    cross  apply (select min(val1) val1, min(val2) val2
                  from   t1
                  where  t1.val1 > bounce1.val1
                  and    rownum = 1
                 ) ca
    where   bounce1.val1 is not null

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------
    Plan hash value: 3146981694

    -------------------------------------------------------------------------------------------
    | Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time    |
    -------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT        |                |    1 |    52 |    27  (0)| 00:00:01 |
    |  1 |  NESTED LOOPS          |                |    1 |    52 |    27  (0)| 00:00:01 |
    |*  2 |  VIEW                  |                |    1 |    26 |    25  (0)| 00:00:01 |
    |*  3 |    WINDOW NOSORT STOPKEY|                | 10000 | 60000 |    25  (0)| 00:00:01 |
    |  4 |    INDEX FULL SCAN    | T1_IDX          | 10000 | 60000 |    25  (0)| 00:00:01 |
    |  5 |  VIEW                  | VW_LAT_91B52CC4 |    1 |    26 |    2  (0)| 00:00:01 |
    |  6 |    SORT AGGREGATE      |                |    1 |    6 |            |          |
    |*  7 |    COUNT STOPKEY      |                |      |      |            |          |
    |*  8 |      INDEX RANGE SCAN  | T1_IDX          |    1 |    6 |    2  (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - filter("RN"=1)
      3 - filter(ROW_NUMBER() OVER ( ORDER BY "VAL1","VAL2")<=1)
      7 - filter(ROWNUM=1)
      8 - access("T1"."VAL1">"VAL1")

    One for the notebook (or the blog drafts)!

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited May 2018
    954696 wrote:

    Hi!

    Thanks for the two-column version. It works fine, except in one case.

    My table is partitioned and if I use a GLOBAL index on (val1,val2) everything works fine (fast), but with "create index ti on t1(val1,val2,1) local;" it is much slower - many buffer reads (15000 versus 30 - I have 10 partitions with 1M rows each).

    What sticks out is the 10M A-Rows in line 9 of the plan:

    | 9 | INDEX FULL SCAN | T1 | 10 | 10M| 15458 (1)| 10M|00:00:04.70 | 15410 | 15389 | | | |

    This is line 8 from your plan on https://jonathanlewis.wordpress.com/2018/05/29/index-bouncy-scan-2/ (my plan is the same, except for 4 added PARTITION RANGE ALL items).

    With a global index it has 2 A-Rows. My table has 4 unique val1,val2 values.

    PS: As I see no collapse function in the forum, I'll post the whole plan and data creation only upon request. It makes the thread very long and unclear.

    With a local index you now have a separate copy of the index for each partition. You could probably force the index min/max range scan with a hint, but each iteration will now hit all partitions - the cost per iteration will increase quite a lot.

    It's important you don't lose sight of why the transformation gives you the result in less time: the amount of work required to traverse the branch structure to find the next unique value is less than the amount of work required to read the leaf blocks in order until you find the next unique value. When you partition the index, the effort to do that branch traversal is going to be multiplied by the number of partitions.

    If your data is still such that the bouncy method works well, then just go through the plan and see what hints need to be added to mimic the non-partitioned plan (to see what the CBO did last time, have a peek at the plan outline using 'advanced' as the format argument to dbms_xplan.display_cursor). It might be a bit of trial and error; when the optimizer doesn't agree with your hint it will tend to obey it in the most inconvenient way!
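    For example, a minimal way to pull the outline for the last statement you ran in your session (this is just the standard dbms_xplan interface, nothing specific to this thread):

    select *
    from   table(dbms_xplan.display_cursor(
                     sql_id => null,          -- null = the last statement executed in this session
                     format => 'ADVANCED'));  -- includes the Outline Data section (the full set of hints)

    The Outline Data section is the set of hints you would copy and adjust to reproduce the non-partitioned plan.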

  • Mohamed Houri
    Mohamed Houri Member Posts: 1,190 Bronze Trophy
    edited May 2018
    Andrew Sayer wrote:

    The join predicate is not being pushed down all the way into the view but is getting applied against the vw_dcl_% view instead (decorrelated view), this looks silly to me but I have seen similar things occur when using these new join operators - I think it's a legal transformation but I can't see anything in the docs supporting or otherwise.

    Andrew

    Nice thread.

    I had a major performance issue due to the decorrelated lateral view in 12cR1, which was systematically generating a FULL TABLE SCAN. I have explained this case in the following blog post:

        www.hourim.wordpress.com/2017/03/25/de-correlated-lateral-view-vw_dcl_mmm

    This transformation (at least in 12cR1) accomplishes two steps

    1. Create a Lateral view (VW_LAT_xxx)
    2. De-Correlate this lateral view (VW_DCL_xxx) by excluding the join predicate with outer query block from the Lateral view

    The second point above was the root cause of my client's systematic and dramatic FULL table scan (index fast full scan in your case).

    This new transformation becomes interesting in 12cR2 when it is combined with OR-expansion:

    www.hourim.wordpress.com/2017/06/17/12cr2-or-expansion

    For the sake of completeness, there is a US patent about this VW_DCL transformation which is worth a read:

    http://www.freepatentsonline.com/8548983.html

    Best regards

    Mohamed

  • 954696
    954696 Member Posts: 17
    edited May 2018

    I came up with this code; it seems to return the correct results, but it is not optimal/fast.

    with bouncy2 (p_val, p_val2)
      as (select min(val2) val2, val1
          from   test.t group by val1  -- I also tried to do a JOIN here with bouncy1 ON val1, force nested loops, but no speedup
          union all
          select (select min(t.val2) val2 from test.t where t.val2 > bouncy2.p_val and val1 = bouncy2.p_val2) val2, bouncy2.p_val2
          from   bouncy2
          where  bouncy2.p_val is not null
         )
    select *
    from   bouncy2
    where  p_val is not null;

    Maybe someone can improve on it. Or maybe it is useless, it is late here...

    Regards,

    David

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited May 2018
    954696 wrote:

    I came up with this code, it seems to return the correct results, but is not optimal/fast.

    with bouncy2 (p_val, p_val2)
      as (select min(val2) val2, val1
          from   test.t group by val1  -- I also tried to do a JOIN here with bouncy1 ON val1, force nested loops, but no speedup
          union all
          select (select min(t.val2) val2 from test.t where t.val2 > bouncy2.p_val and val1 = bouncy2.p_val2) val2, bouncy2.p_val2
          from   bouncy2
          where  bouncy2.p_val is not null
         )
    select *
    from   bouncy2
    where  p_val is not null;

    Maybe someone can improve on it. Or maybe it is useless, it is late here...

    Regards,
    David

    I'm just guessing where your trouble lies; without the execution plan it's easy for us to miss other subtleties. Could you please include the execution plan you saw? As long as you use a fixed-width font (use Courier New in the advanced editor) it should display fine.

    select min(val2) val2, val1
    from   test.t
    group by val1

    will require reading the entire set of data and then sorting it; if you're going to do that you might as well just use the standard select distinct statement. In order to get this subquery to execute in the bouncy way, rather than grouping by val1 you would pass val1 in from a previous query (you could use the bouncy scan to get the single values and then join here).

    An important restriction to note is that when you do:

    select val1, min(val2)
    from   test1
    where  val1 = :value
    group by val1

    you will not be using the min/max scan. You have to project only the aggregate in the subquery, e.g.

    select :value, (select min(val2) from test where val1 = :value) val2
    from   dual

    In reality, you would use the result of your previous WITH clause rather than dual and a bind variable.
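    As a rough sketch of that idea (assuming val1 is declared not null; this just reuses the single-column bouncy pattern from earlier in the thread to drive one min(val2) probe per val1):

    with val1_list (val1) as (
            select min(val1) from test.t
            union all
            select (select min(t.val1) from test.t t where t.val1 > val1_list.val1)
            from   val1_list
            where  val1_list.val1 is not null
    )
    select  v.val1,
            (select min(t.val2) from test.t t where t.val1 = v.val1) min_val2
    from    val1_list v
    where   v.val1 is not null;

    Each scalar subquery then runs once per distinct val1, which is where the min/max range scan can do its work.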

    Is T now partitioned? It would be useful if we had the DDL and a query to generate sample data here so that we're all on the same page. I haven't done any experimenting yet to see where the method could trip up other than the obvious higher costing

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited May 2018

    Thanks for the link. The main difference I can see between your case and this one is the placement of the join predicate. Yours are still being applied during the access to the table; in our case the join predicates have been moved up to the VIEW plan line, so all other predicates are carried out first. Maybe it's the presence of the rownum filter causing the optimizer to know that predicate evaluation order is important - this tempts me to say that this is Oracle's desired behaviour rather than a bug, but unfortunately I don't have the patience to raise an SR about this to confirm it.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018
    954696 wrote:

    My table is partitioned and if I use a GLOBAL index on (val1,val2) everything works fine (fast), but with "create index ti on t1(val1,val2,1) local;" it is much slower - many buffer reads (15000 versus 30 - I have 10 partitions with 1M rows each).

    PS: As I see no collapse function in the forum, I'll post the whole plan and data creation only upon request. It makes the thread very long and unclear.

    How important is it that you use a pure SQL solution to the problem? If the data pattern means that probing the index with a bouncy scan is much more efficient than a brute-force scan and aggregate, then it might be worth the effort of writing a little procedure that knows how to step from one partition to the next to do the job. (Maybe doing the bouncy scan on each partition in turn, then aggregating to get the final result in the procedure.)
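    (A very rough sketch of that idea - a pipelined function that visits each partition in turn; the type names are made up, the column datatypes are assumed numeric to match the earlier t1 example, and the "select distinct" inside is only a placeholder for the real bouncy-scan SQL:)

    create or replace type t_val_pair as object (val1 number, val2 number);
    /
    create or replace type t_val_pair_tab as table of t_val_pair;
    /
    create or replace function distinct_pairs return t_val_pair_tab pipelined
    is
        c   sys_refcursor;
        v1  number;
        v2  number;
    begin
        for p in (select partition_name
                  from   user_tab_partitions
                  where  table_name = 'T1'
                  order by partition_position)
        loop
            -- partition names cannot be bind variables, so build the statement per partition;
            -- substitute the recursive bouncy-scan query for the simple distinct shown here
            open c for 'select distinct val1, val2 from t1 partition (' || p.partition_name || ')';
            loop
                fetch c into v1, v2;
                exit when c%notfound;
                pipe row (t_val_pair(v1, v2));
            end loop;
            close c;
        end loop;
        return;
    end;
    /

    -- final aggregation across partitions:
    select distinct val1, val2 from table(distinct_pairs());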

    I assume the partition key is not part of the index in question, given your comment about the number of partitions and the number of distinct combinations. I am puzzled, however, by the 1 that appears at the end of the partitioned index definition - is this an indication that both val1 and val2 are allowed to be null?

    Regards

    Jonathan Lewis

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018

    I've just run up a little model with a hash partitioned table, and the biggest workload using the basic query is (as you listed) the line which tries to find the starting list of (val1, min(val2)) pairs - everything after that runs as sensibly as it can on my model, i.e. lots of min/max range scans or stopkeys. I've got an idea of how to write the start-point so that it uses nothing but minimal probes. I may get back to you soon.

    Regards

    Jonathan Lewis

  • 954696
    954696 Member Posts: 17
    edited May 2018
    Jonathan Lewis wrote:

    How important is it that you use a pure SQL solution to the problem?

    I assume the partition key is not part of the index in question given your comment about number of partitions and number of distinct combinations. I am puzzled, however, by the 1 that appears at the end of the partitioned index definition - is this an indication that both val1 and val2 are allowed to be null?

    A PL/SQL solution would be OK too.

    Yes, the partition key is not part of the index.

    Yes, in my real case, both columns can be null, so I used the 1 "trick" to include nulls in the index.

    In production there are only a few partitions (up to five or so).

    I am currently busy with other issues, so excuse me for not putting enough work in this thread.

    Regards,

    David

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,569 Gold Crown
    edited May 2018

    David,

    I have a solution that works in 12.2 but not (apparently) in 12.1. I'm checking to see if it's an optimizer enhancement or whether it's something that could be hinted to take a different path.

    Your comment about NULLs means these generic solutions won't work - notice how many times the code has "is not null" predicates to stop the recursion raising an infinite loop error. Given that the "index full scan (min/max)" and "index range scan (min/max)" work efficiently across partitions, your best bet when nulls are relevant is to use a looping PL/SQL method rather like the one I showed in https://jonathanlewis.wordpress.com/2017/02/09/index-bouncy-scan/, extended to a loop within a loop, and with a special path for the points where you need to test for "is null".
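    (A bare-bones sketch of that loop-within-a-loop, ignoring the extra handling needed for nulls and assuming an index on t1(val1, val2); each select is written so it can run as a min/max probe:)

    declare
        m_val1  t1.val1%type;
        m_val2  t1.val2%type;
    begin
        select min(val1) into m_val1 from t1;                   -- index full scan (min/max)
        while m_val1 is not null loop
            select min(val2) into m_val2
            from   t1
            where  val1 = m_val1;                               -- index range scan (min/max)
            while m_val2 is not null loop
                dbms_output.put_line(m_val1 || ' : ' || m_val2);
                select min(val2) into m_val2
                from   t1
                where  val1 = m_val1
                and    val2 > m_val2;                           -- next val2 for this val1
            end loop;
            select min(val1) into m_val1
            from   t1
            where  val1 > m_val1;                               -- next val1
        end loop;
    end;
    /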

    Regards

    Jonathan Lewis

  • 954696
    954696 Member Posts: 17
    edited June 2018

    Maybe a hint on how to combine the result sets of the queries in the PL/SQL version?

    BULK COLLECT into an array variable and then SELECT from it into a cursor (the return value of the stored procedure should be of type sys_refcursor)?
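    (A sketch of that combining step, reusing the t_val_pair / t_val_pair_tab object types from the earlier sketch; P1 is a placeholder partition name and the per-partition query is again a stand-in for the real probes:)

    create or replace function distinct_vals return sys_refcursor
    is
        l_pairs  t_val_pair_tab := t_val_pair_tab();
        l_more   t_val_pair_tab;
        l_rc     sys_refcursor;
    begin
        -- one bulk collect per partition (or per probe), appended into one collection
        select t_val_pair(val1, val2) bulk collect into l_more
        from   t1 partition (p1);
        l_pairs := l_pairs multiset union all l_more;

        -- ... repeat for the remaining partitions ...

        -- expose the combined, de-duplicated result as a ref cursor
        open l_rc for
            select distinct val1, val2
            from   table(l_pairs);
        return l_rc;
    end;
    /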

This discussion has been closed.