1 2 Previous Next 27 Replies Latest reply: Feb 29, 2012 6:23 AM by dbms_photoshop RSS

    Can someone explain this

    Peter Gjelstrup
      Hi,

      Sorry for not providing a proper subject, can someone explain this?


      I hope the below explains itself. The problem is part of extracting Parent relation from existing (child) data and applying GUID as PK column.
      In real life I'll be using sys_guid(), not dbms_random.value, but result is the same.

      Here's my test table:
      SCOTT>drop table t purge;
      
      Table dropped.
      
      SCOTT>
      SCOTT>create table t as (select 1 n, 'a' c from dual union all
        2                     select 2 n, 'a' c from dual union all
        3                     select 3 n, 'a' c from dual union all
        4                     select 4 n, 'B' c from dual);
      
      Table created.
      
      SCOTT>
      SCOTT>select c, dbms_random.value
        2    from t
        3   group by c;
      
      C      VALUE
      - ----------
      a ,766212423
      B ,764535234
      
      2 rows selected.
      So far so good, I'm able to get a random value (Unique) for each different C. Now I want to join this back to the original data in T:
      SCOTT>
      SCOTT>set autotrace on
      SCOTT>select t.n, t.c, t2.v
        2    from t, (  select c, dbms_random.value v
        3                 from t
        4             group by c) t2
        5   where t.c = t2.c;
      
               N C          V
      ---------- - ----------
               3 a ,002285305
               2 a ,092421336
               1 a ,092421336
               4 B ,925445256
      
      4 rows selected.
      
      
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 1048184715
      
      -----------------------------------------------------------------------------
      | Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
      -----------------------------------------------------------------------------
      |   0 | SELECT STATEMENT     |      |     8 |   256 |    14  (15)| 00:00:01 |
      |*  1 |  HASH JOIN           |      |     8 |   256 |    14  (15)| 00:00:01 |
      |   2 |   TABLE ACCESS FULL  | T    |     4 |    64 |     6   (0)| 00:00:01 |
      |   3 |   VIEW               |      |     4 |    64 |     7  (15)| 00:00:01 |
      |   4 |    HASH GROUP BY     |      |     4 |    12 |     7  (15)| 00:00:01 |
      |   5 |     TABLE ACCESS FULL| T    |     4 |    12 |     6   (0)| 00:00:01 |
      -----------------------------------------------------------------------------
      
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      
         1 - access("T"."C"="T2"."C")
      
      Note
      -----
         - dynamic sampling used for this statement
      
      
      Statistics
      ----------------------------------------------------------
               63  recursive calls
                0  db block gets
               36  consistent gets
                0  physical reads
                0  redo size
              364  bytes sent via SQL*Net to client
              235  bytes received via SQL*Net from client
                2  SQL*Net roundtrips to/from client
                4  sorts (memory)
                0  sorts (disk)
                4  rows processed
      
      SCOTT>
      Oops, not so good, random value is drawn three times now. But ordering set or applying window function brings back two different values:
      SCOTT>select t.n, t.c, t2.v, count(distinct v) over ()
        2    from t, (  select c, dbms_random.value v
        3                 from t
        4             group by c) t2
        5   where t.c = t2.c;
      
               N C          V COUNT(DISTINCTV)OVER()
      ---------- - ---------- ----------------------
               4 B ,524899328                      2
               1 a ,941905644                      2
               3 a ,941905644                      2
               2 a ,941905644                      2
      
      4 rows selected.
      
      
      Execution Plan
      ----------------------------------------------------------
      Plan hash value: 1251866241
      
      ------------------------------------------------------------------------------
      | Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
      ------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT      |      |     8 |   256 |    15  (20)| 00:00:01 |
      |   1 |  WINDOW SORT          |      |     8 |   256 |    15  (20)| 00:00:01 |
      |*  2 |   HASH JOIN           |      |     8 |   256 |    14  (15)| 00:00:01 |
      |   3 |    TABLE ACCESS FULL  | T    |     4 |    64 |     6   (0)| 00:00:01 |
      |   4 |    VIEW               |      |     4 |    64 |     7  (15)| 00:00:01 |
      |   5 |     HASH GROUP BY     |      |     4 |    12 |     7  (15)| 00:00:01 |
      |   6 |      TABLE ACCESS FULL| T    |     4 |    12 |     6   (0)| 00:00:01 |
      ------------------------------------------------------------------------------
      
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      
         2 - access("T"."C"="T2"."C")
      
      Note
      -----
         - dynamic sampling used for this statement
      
      
      Statistics
      ----------------------------------------------------------
                7  recursive calls
                0  db block gets
               14  consistent gets
                0  physical reads
                0  redo size
              392  bytes sent via SQL*Net to client
              235  bytes received via SQL*Net from client
                2  SQL*Net roundtrips to/from client
                2  sorts (memory)
                0  sorts (disk)
                4  rows processed
      
      SCOTT>
      SCOTT>select * from v$version where rownum = 1;
      
      BANNER
      ----------------------------------------------------------------
      Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
      
      1 row selected.
      Regards
      Peter
        • 1. Re: Can someone explain this
          BluShadow
          Nope, can't say I can explain it to be honest.

          DISTINCTing the "c" values causes all the random values to be different, rather than giving two distinct random values.

          Peculiarer and Peculiarer. ?:|
          SQL> ed
          Wrote file afiedt.buf
          
            1  select t.n, t.c, t2.v
            2  from t, (select c, dbms_random.value v
            3           from (select distinct c from t)) t2
            4* where t.c = t2.c
          SQL> /
          
                   N C          V
          ---------- - ----------
                   3 a .019747614
                   2 a .621639572
                   1 a .307619345
                   4 B .398908586
          • 2. Re: Can someone explain this
            BluShadow
            It's gotta be with the way the optimiser is working.

            If I subquery factor out the t2...
            SQL> ed
            Wrote file afiedt.buf
            
              1  with t2 as (select c, dbms_random.value v
              2           from (select distinct c from t))
              3  select t.n, t.c, t2.v
              4  from t, t2
              5* where t.c = t2.c
            SQL> /
            
                     N C          V
            ---------- - ----------
                     3 a .690154606
                     2 a .437283163
                     1 a .522782499
                     4 B .314486231
            I still get the issue of 4 random numbers where we expect 2.

            But if I just output the contents of t2 subquery factor by itself (test1) and union that with the result of the previous query (test2)...
            SQL> ed
            Wrote file afiedt.buf
            
              1  with t2 as (select c, dbms_random.value v
              2           from (select distinct c from t))
              3  select 'test1' as lbl, null, c, v from t2
              4  union all
              5  select 'test2', t.n, t.c, t2.v
              6  from t, t2
              7* where t.c = t2.c
            SQL> /
            
            LBL         NULL C          V
            ----- ---------- - ----------
            test1            B .374153756
            test1            a  .81441921
            test2          4 B .374153756
            test2          3 a  .81441921
            test2          2 a  .81441921
            test2          1 a  .81441921
            
            6 rows selected.
            
            SQL>
            Just two distinct random values are given.

            Also the same with:
            SQL> ed
            Wrote file afiedt.buf
            
              1  with t2 as (select c, dbms_random.value v
              2           from (select distinct c from t))
              3  select *
              4  from (
              5  select null as n, null as c, null as v from t2
              6  union all
              7  select t.n, t.c, t2.v
              8  from t, t2
              9  where t.c = t2.c
             10  )
             11* where n is not null
            SQL> /
            
                     N C          V
            ---------- - ----------
                     4 B .967765442
                     3 a .013324481
                     2 a .013324481
                     1 a .013324481
            
            SQL>
            • 3. Re: Can someone explain this
              BluShadow
              p.s. for me:
              SQL> select * from v$version;
              
              BANNER
              ----------------------------------------------------------------
              Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
              • 4. Re: Can someone explain this
                Sven W.
                It looks a little bit like the typical ROWNUM problem.

                Functions that are independent of the base inner query can be moved to the outer query by the optimizer. The problem with the random function is that even when it is technically independend from the inner query it is logically dependent (number of rows, not changeing values, etc).
                • 5. Re: Can someone explain this
                  Centinul
                  Peter --

                  Repeatable on my system as well. The only way I could get it to behave correctly was to do the following:
                  SQL > set autotrace on
                  SQL > WITH t AS
                    2  (
                    3     SELECT 1 n, 'a' c FROM DUAL UNION ALL
                    4     SELECT 2 n, 'a' c FROM DUAL UNION ALL
                    5     SELECT 3 n, 'a' c FROM DUAL UNION ALL
                    6     SELECT 4 n, 'B' c FROM DUAL
                    7  ), t2_rand AS
                    8  (
                    9     SELECT  c
                   10     ,       DBMS_RANDOM.VALUE AS v
                   11     ,       ROW_NUMBER() OVER (PARTITION BY C ORDER BY c) AS RN
                   12     FROM    t
                   13  )
                   14  SELECT     t.n
                   15  ,  t.c
                   16  ,  t2_rand.v
                   17  FROM       t
                   18  ,  t2_rand
                   19  WHERE      t.c = t2_rand.c
                   20  AND        RN = 1
                   21  /
                  
                           N C          V
                  ---------- - ----------
                           4 B  .30204937
                           3 a .222680079
                           2 a .222680079
                           1 a .222680079
                  
                  
                  Execution Plan
                  ----------------------------------------------------------
                  Plan hash value: 1267815693
                  
                  ---------------------------------------------------------------------------------------------------------
                  | Id  | Operation                  | Name                       | Rows  | Bytes | Cost (%CPU)| Time  |
                  ---------------------------------------------------------------------------------------------------------
                  |   0 | SELECT STATEMENT           |                            |     1 |    35 |    14  (58)| 00:00:01 |
                  |   1 |  TEMP TABLE TRANSFORMATION |                            |       |       |            |       |
                  |   2 |   LOAD AS SELECT           |                            |       |       |            |       |
                  |   3 |    UNION-ALL               |                            |       |       |            |       |
                  |   4 |     FAST DUAL              |                            |     1 |       |     2   (0)| 00:00:01 |
                  |   5 |     FAST DUAL              |                            |     1 |       |     2   (0)| 00:00:01 |
                  |   6 |     FAST DUAL              |                            |     1 |       |     2   (0)| 00:00:01 |
                  |   7 |     FAST DUAL              |                            |     1 |       |     2   (0)| 00:00:01 |
                  |*  8 |   HASH JOIN                |                            |     1 |    35 |     6  (34)| 00:00:01 |
                  |   9 |    VIEW                    |                            |     4 |    24 |     2   (0)| 00:00:01 |
                  |  10 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6608_D11A3CC |     4 |    64 |     2   (0)| 00:00:01 |
                  |* 11 |    VIEW                    |                            |     4 |   116 |     3  (34)| 00:00:01 |
                  |* 12 |     WINDOW SORT PUSHED RANK|                            |     4 |    12 |     3  (34)| 00:00:01 |
                  |  13 |      VIEW                  |                            |     4 |    12 |     2   (0)| 00:00:01 |
                  |  14 |       TABLE ACCESS FULL    | SYS_TEMP_0FD9D6608_D11A3CC |     4 |    64 |     2   (0)| 00:00:01 |
                  ---------------------------------------------------------------------------------------------------------
                  
                  Predicate Information (identified by operation id):
                  ---------------------------------------------------
                  
                     8 - access("T"."C"="T2_RAND"."C")
                    11 - filter("RN"=1)
                    12 - filter(ROW_NUMBER() OVER ( PARTITION BY "C" ORDER BY "C")<=1)
                  
                  
                  Statistics
                  ----------------------------------------------------------
                            2  recursive calls
                            9  db block gets
                            8  consistent gets
                            1  physical reads
                          520  redo size
                          607  bytes sent via SQL*Net to client
                          396  bytes received via SQL*Net from client
                            2  SQL*Net roundtrips to/from client
                            1  sorts (memory)
                            0  sorts (disk)
                            4  rows processed
                  • 6. Re: Can someone explain this
                    Centinul
                    Interestingly enough, when I use subquery factoring to generate a test data set I get the expected results. However, when I use an actual table I get the incorrect results:

                    ::EDIT:: Nevermind Blushadow beat me to it.

                    Edited by: Centinul on Jul 29, 2009 9:13 AM
                    • 7. Re: Can someone explain this
                      BluShadow
                      Sven W. wrote:
                      It looks a little bit like the typical ROWNUM problem.

                      Functions that are independent of the base inner query can be moved to the outer query by the optimizer. The problem with the random function is that even when it is technically independend from the inner query it is logically dependent (number of rows, not changeing values, etc).
                      I was thinking that myself, but was just thinking how to show it.

                      I guess this also indicates that...
                      SQL> ed
                      Wrote file afiedt.buf
                      
                        1  with t2 as (select c, (select dbms_random.value from dual where rownum = 1) as v
                        2           from t
                        3           group by c)
                        4  select t.n, t.c, t2.v
                        5  from t, t2
                        6* where t.c = t2.c
                      SQL> /
                      
                               N C          V
                      ---------- - ----------
                               3 a .326130723
                               2 a .326130723
                               1 a .326130723
                               4 B .326130723
                      
                      SQL>
                      Only problem now is that the random value only gets generated once. LOL!
                      • 8. Re: Can someone explain this
                        Peter Gjelstrup
                        Thanks all for contributing ;-)

                        My best try, so far is to subquery refactor and materialize:
                        SCOTT>with t2 as (  select /*+ materialize */
                          2                      c, dbms_random.value v
                          3                  from t
                          4              group by c)
                          5  select t.n, t.c, t2.v
                          6    from t, t2
                          7   where t.c = t2.c;
                        
                                 N C          V
                        ---------- - ----------
                                 3 a ,189676218
                                 2 a ,189676218
                                 1 a ,189676218
                                 4 B   ,8574983
                        
                        4 rows selected.
                        Somehow I don't feel comfortable about putting such stuff into production. I remember at some time having read about (I think it was) prevention of subquery merging. Can't remember what the trick was. Anyone?

                        Regards
                        Peter
                        • 9. Re: Can someone explain this
                          BluShadow
                          Sven W. wrote:
                          Functions that are independent of the base inner query can be moved to the outer query by the optimizer. The problem with the random function is that even when it is technically independend from the inner query it is logically dependent (number of rows, not changeing values, etc).
                          Would you consider the following "technically dependent"?
                          SQL> ed
                          Wrote file afiedt.buf
                          
                            1  with t2 as (select c, dbms_random.string('A',length(c)) as v
                            2              from t
                            3              group by c
                            4              )
                            5  select t.n, t.c, t2.v
                            6  from t, t2
                            7* where t.c = t2.c
                          SQL> /
                          
                                   N C V
                          ---------- - ----------
                                   3 a J
                                   2 a A
                                   1 a A
                                   4 B e
                          
                          SQL> /
                          
                                   N C V
                          ---------- - ----------
                                   3 a L
                                   2 a e
                                   1 a e
                                   4 B n
                          
                          SQL> /
                          
                                   N C V
                          ---------- - ----------
                                   3 a B
                                   2 a L
                                   1 a L
                                   4 B y
                          
                          SQL>
                          Results would say otherwise, but I'd say the random package call was dependent on the column C from the query.

                          Hmmmm..... ?:|
                          • 10. Re: Can someone explain this
                            Centinul
                            Peter Gjelstrup wrote:
                            Somehow I don't feel comfortable about putting such stuff into production. I remember at some time having read about (I think it was) prevention of subquery merging. Can't remember what the trick was. Anyone?
                            There is the NO_MERGE hint, however is that what is really causing the issue? The explain plans that have been posted have a VIEW line which to me means the view is materialized, not merged. I generated a CBO trace for this and it doesn't appear that a plan with a merge was chosen either.
                            • 11. Re: Can someone explain this
                              Beijing
                              Same problem here with 10g, not problem with 9i.

                              This sql will generate expected result
                              SELECT   t.n, t.c, t2.v
                              FROM     t, (SELECT  c, DBMS_RANDOM.VALUE v
                                           FROM       t
                                           GROUP BY c
                                           ORDER BY c) t2
                              WHERE    t.c = t2.c
                              This sql will be worse
                              SELECT   t.n, t.c, t2.v
                              FROM     t, (SELECT  c, DBMS_RANDOM.VALUE v
                                           FROM       t
                                           GROUP BY c) t2
                              WHERE    t.c = t2.c
                              ORDER BY t.n
                              • 12. Re: Can someone explain this
                                Centinul
                                What about the possibility of this being a bug? I wouldn't think we should get DIFFERENT results when hinting a plan. For example the first query is used without hints, and the second one forces a MERGE, yet they yield different results:
                                SQL > SELECT       t.n
                                  2  ,  t.c
                                  3  ,  t2_rand.v
                                  4  FROM       t
                                  5  ,  (
                                  6     SELECT  c
                                  7     ,       DBMS_RANDOM.VALUE AS v
                                  8     FROM    t
                                  9     GROUP BY c
                                 10  )  t2_rand
                                 11  WHERE      t.c = t2_rand.c
                                 12  /
                                
                                         N C          V
                                ---------- - ----------
                                         3 a .429052293
                                         2 a .242166128
                                         1 a .242166128
                                         4 B .408919295
                                
                                
                                Execution Plan
                                ----------------------------------------------------------
                                Plan hash value: 1746068169
                                
                                -----------------------------------------------------------------------------
                                | Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                                -----------------------------------------------------------------------------
                                |   0 | SELECT STATEMENT     |      |     8 |   256 |     6  (34)| 00:00:01 |
                                |*  1 |  HASH JOIN           |      |     8 |   256 |     6  (34)| 00:00:01 |
                                |   2 |   TABLE ACCESS FULL  | T    |     4 |    64 |     2   (0)| 00:00:01 |
                                |   3 |   VIEW               |      |     4 |    64 |     3  (34)| 00:00:01 |
                                |   4 |    HASH GROUP BY     |      |     4 |    12 |     3  (34)| 00:00:01 |
                                |   5 |     TABLE ACCESS FULL| T    |     4 |    12 |     2   (0)| 00:00:01 |
                                -----------------------------------------------------------------------------
                                
                                Predicate Information (identified by operation id):
                                ---------------------------------------------------
                                
                                   1 - access("T"."C"="T2_RAND"."C")
                                SQL > SELECT       /*+ USE_MERGE(t t2_rand) */ t.n
                                  2  ,  t.c
                                  3  ,  t2_rand.v
                                  4  FROM       t
                                  5  ,  (
                                  6     SELECT  c
                                  7     ,       DBMS_RANDOM.VALUE AS v
                                  8     FROM    t
                                  9     GROUP BY c
                                 10  )  t2_rand
                                 11  WHERE      t.c = t2_rand.c
                                 12  /
                                
                                         N C          V
                                ---------- - ----------
                                         4 B .430917487
                                         3 a .512943312
                                         1 a .512943312
                                         2 a .512943312
                                
                                
                                Execution Plan
                                ----------------------------------------------------------
                                Plan hash value: 988097664
                                
                                ------------------------------------------------------------------------------
                                | Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
                                ------------------------------------------------------------------------------
                                |   0 | SELECT STATEMENT      |      |     8 |   256 |     6  (34)| 00:00:01 |
                                |   1 |  MERGE JOIN           |      |     8 |   256 |     6  (34)| 00:00:01 |
                                |   2 |   SORT JOIN           |      |     4 |    64 |     3  (34)| 00:00:01 |
                                |   3 |    VIEW               |      |     4 |    64 |     3  (34)| 00:00:01 |
                                |   4 |     HASH GROUP BY     |      |     4 |    12 |     3  (34)| 00:00:01 |
                                |   5 |      TABLE ACCESS FULL| T    |     4 |    12 |     2   (0)| 00:00:01 |
                                |*  6 |   SORT JOIN           |      |     4 |    64 |     3  (34)| 00:00:01 |
                                |   7 |    TABLE ACCESS FULL  | T    |     4 |    64 |     2   (0)| 00:00:01 |
                                ------------------------------------------------------------------------------
                                
                                Predicate Information (identified by operation id):
                                ---------------------------------------------------
                                
                                   6 - access("T"."C"="T2_RAND"."C")
                                       filter("T"."C"="T2_RAND"."C")
                                • 13. Re: Can someone explain this
                                  Peter Gjelstrup
                                  Centinul wrote:
                                  There is the NO_MERGE hint, however is that what is really causing the issue? The explain plans that have been posted have a VIEW line which to me means the view is materialized, not merged. I generated a CBO trace for this and it doesn't appear that a plan with a merge was chosen either.
                                  You are right,

                                  I was fishing there. This question tops my understanding of what Oracle can or will do.

                                  /
                                  Peter
                                  • 14. Re: Can someone explain this
                                    Centinul
                                    Check out this Metalink Note: 3033572.8

                                    They say the issue was fixed in 10.2.0.1, but maybe that is not the case.
                                    1 2 Previous Next