1 2 Previous Next 27 Replies Latest reply: Aug 11, 2009 4:35 PM by 666352 Go to original post RSS
      • 15. Re: analytic sql question
        Rob van Wijk
        metzguar wrote:
        And it works.
        Not ...

        Please look what happens when I add an extra row:
        SQL> create table mytable (time,myuser,value)
          2  as
          3  select 1 col1, 'A' col2, 'X' col3 from dual union all
          4  select 2 col1, 'A' col2, 'X' col3 from dual union all
          5  select 3 col1, 'B' col2, 'Y' col3 from dual union all
          6  select 4 col1, 'B' col2, 'Y' col3 from dual union all
          7  select 5 col1, 'A' col2, 'X' col3 from dual union all
          8  select 5 col1, 'B' col2, 'X' col3 from dual union all
          9  select 6 col1, 'A' col2, 'Y' col3 from dual union all
         10  select 7 col1, 'B' col2, 'Y' col3 from dual union all
         11  select 7 col1, 'A' col2, 'Y' col3 from dual union all
         12  select 8 col1, 'A' col2, 'Y' col3 from dual
         13  /
        
        Tabel is aangemaakt.
        
        SQL> select min(time) start_time
          2       , max(time) end_time
          3       , myuser
          4       , value
          5    from ( select time
          6                , myuser
          7                , value
          8                , sum(new_group_ind) over (partition by myuser,value order by time) grp
          9             from ( select time
         10                         , myuser
         11                         , value
         12                         , case lag(time) over (partition by myuser,value order by time) when time-1 then 0 else 1 end new_group_ind
        
         13                      from mytable
         14                  )
         15         )
         16   group by grp
         17       , myuser
         18       , value
         19   order by myuser
         20       , start_time
         21  /
        
        START_TIME   END_TIME M V
        ---------- ---------- - -
                 1          2 A X
                 5          5 A X
                 6          8 A Y
                 3          4 B Y
                 5          5 B X
                 7          7 B Y
        
        6 rijen zijn geselecteerd.
        
        SQL> SELECT time AS "START_TIME"
          2       , NVL(next_time, time) AS "END_TIME"
          3       , myuser AS "USER"
          4       , value AS "VALUE"
          5    FROM ( SELECT time
          6                , myuser
          7                , value
          8                , ROW_NUMBER() OVER (PARTITION BY myuser, value ORDER BY time) AS rn
          9                , LEAD(time) OVER(PARTITION BY myuser, value ORDER BY time) AS next_time
         10             FROM mytable
         11         )
         12   WHERE MOD(rn, 2) = 1
         13   ORDER BY 3
         14       , 1
         15  /
        
        START_TIME   END_TIME U V
        ---------- ---------- - -
                 1          2 A X
                 5          5 A X
                 6          7 A Y
                 8          8 A Y
                 3          4 B Y
                 5          5 B X
                 7          7 B Y
        
        7 rijen zijn geselecteerd.
        Regards,
        Rob.
        • 16. Re: analytic sql question
          Rob van Wijk
          metzguar wrote:
          1 2 A X
          3 4 A X
          5 5 A X
          And this should be one line (1,5,'A','X').
          • 17. Re: analytic sql question
            Aketi Jyuuzou
            create table mytable (time,myuser,value) as
            select 1 col1, 'A' col2, 'X' col3 from dual union all
            select 2 col1, 'A' col2, 'X' col3 from dual union all
            select 3 col1, 'B' col2, 'Y' col3 from dual union all
            select 4 col1, 'B' col2, 'Y' col3 from dual union all
            select 5 col1, 'A' col2, 'X' col3 from dual union all
            select 5 col1, 'B' col2, 'X' col3 from dual union all
            select 6 col1, 'A' col2, 'Y' col3 from dual union all
            select 7 col1, 'B' col2, 'Y' col3 from dual union all
            select 7 col1, 'A' col2, 'Y' col3 from dual union all
            select 8 col1, 'A' col2, 'Y' col3 from dual;
            
            select min(time),max(time),myuser,value
            from (select time,myuser,value,
                   dense_rank() over(order by time)
                  -Row_Number() over(partition by myuser,value order by time)
                  as distance
                   from mytable)
            group by myuser,value,distance
            order by myuser,min(time);
            
            MIN(TIME)  MAX(TIME)  M  V
            ---------  ---------  -  -
                    1          2  A  X
                    5          5  A  X
                    6          8  A  Y
                    3          4  B  Y
                    5          5  B  X
                    7          7  B  Y
            I used sense of Tabibitosan B-)
            window function
            • 18. Re: analytic sql question
              Rob van Wijk
              Hi Aketi,

              Very inventive.
              I compared your query with mine, performance wise:
              SQL> select /*+ gather_plan_statistics */
                2         min(time) start_time
                3       , max(time) end_time
                4       , myuser
                5       , value
                6    from ( select time
                7                , myuser
                8                , value
                9                , sum(new_group_ind) over (partition by myuser,value order by time) grp
               10             from ( select time
               11                         , myuser
               12                         , value
               13                         , case lag(time) over (partition by myuser,value order by time) when time-1 then 0 else 1 end new_group_ind
              
               14                      from mytable
               15                  )
               16         )
               17   group by grp
               18       , myuser
               19       , value
               20   order by myuser
               21       , start_time
               22  /
              
              START_TIME   END_TIME M V
              ---------- ---------- - -
                       1          2 A X
                       5          5 A X
                       6          8 A Y
                       3          4 B Y
                       5          5 B X
                       7          7 B Y
              
              6 rijen zijn geselecteerd.
              
              SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
                2  /
              
              PLAN_TABLE_OUTPUT
              --------------------------------------------------------------------------------------------------------------------------------------
              SQL_ID  3fdm2b17kfnnb, child number 0
              -------------------------------------
              select /*+ gather_plan_statistics */        min(time) start_time      , max(time) end_time      , myuser      ,
              value   from ( select time               , myuser               , value               , sum(new_group_ind) over
              (partition by myuser,value order by time) grp            from ( select time                        , myuser
                                 , value                        , case lag(time) over (partition by myuser,value order by
              time) when time-1 then 0 else 1 end new_group_ind                     from mytable                 )        )
              group by grp      , myuser      , value  order by myuser      , start_time
              
              Plan hash value: 2039044503
              
              ------------------------------------------------------------------------------------------------------------------------
              | Id  | Operation               | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
              ------------------------------------------------------------------------------------------------------------------------
              |   1 |  SORT ORDER BY          |         |      1 |     10 |      6 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   2 |   SORT GROUP BY         |         |      1 |     10 |      6 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   3 |    VIEW                 |         |      1 |     10 |     10 |00:00:00.01 |       3 |       |       |          |
              |   4 |     WINDOW BUFFER       |         |      1 |     10 |     10 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   5 |      VIEW               |         |      1 |     10 |     10 |00:00:00.01 |       3 |       |       |          |
              |   6 |       WINDOW SORT       |         |      1 |     10 |     10 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   7 |        TABLE ACCESS FULL| MYTABLE |      1 |     10 |     10 |00:00:00.01 |       3 |       |       |          |
              ------------------------------------------------------------------------------------------------------------------------
              
              Note
              -----
                 - dynamic sampling used for this statement
              
              
              27 rijen zijn geselecteerd.
              
              SQL> select /*+ gather_plan_statistics */
                2         min(time)
                3       , max(time)
                4       , myuser
                5       , value
                6    from ( select time
                7                , myuser
                8                , value
                9                , dense_rank() over(order by time)
               10                  - Row_Number() over (partition by myuser,value order by time) as distance
               11             from mytable
               12         )
               13   group by myuser
               14       , value
               15       , distance
               16   order by myuser
               17       , min(time)
               18  /
              
               MIN(TIME)  MAX(TIME) M V
              ---------- ---------- - -
                       1          2 A X
                       5          5 A X
                       6          8 A Y
                       3          4 B Y
                       5          5 B X
                       7          7 B Y
              
              6 rijen zijn geselecteerd.
              
              SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
                2  /
              
              PLAN_TABLE_OUTPUT
              --------------------------------------------------------------------------------------------------------------------------------------
              SQL_ID  868g0jcx58cap, child number 0
              -------------------------------------
              select /*+ gather_plan_statistics */        min(time)      , max(time)      , myuser      , value   from (
              select time               , myuser               , value               , dense_rank() over(order by time)
                         - Row_Number() over (partition by myuser,value order by time) as distance            from mytable
                   )  group by myuser      , value      , distance  order by myuser      , min(time)
              
              Plan hash value: 1371012082
              
              -----------------------------------------------------------------------------------------------------------------------
              | Id  | Operation              | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
              -----------------------------------------------------------------------------------------------------------------------
              |   1 |  SORT ORDER BY         |         |      1 |     10 |      6 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   2 |   SORT GROUP BY        |         |      1 |     10 |      6 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   3 |    VIEW                |         |      1 |     10 |     10 |00:00:00.01 |       3 |       |       |          |
              |   4 |     WINDOW SORT        |         |      1 |     10 |     10 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   5 |      WINDOW SORT       |         |      1 |     10 |     10 |00:00:00.01 |       3 |  2048 |  2048 | 2048  (0)|
              |   6 |       TABLE ACCESS FULL| MYTABLE |      1 |     10 |     10 |00:00:00.01 |       3 |       |       |          |
              -----------------------------------------------------------------------------------------------------------------------
              
              Note
              -----
                 - dynamic sampling used for this statement
              
              
              24 rijen zijn geselecteerd.
              It would be nice if the original poster can test what works faster: a WINDOW BUFFER/VIEW or an extra WINDOW SORT.
              I'm not too familiar with the first ones, but I know a sort can be costly...

              Regards,
              Rob.
              • 19. Re: analytic sql question
                189821
                I see what you mean.
                Your query returns start and end points of data groups.
                My query returns pairs of consecutive values.
                As o.p. has never specified what he wanted, i guessed
                TIME was kind of a toggle event, so with mod(rownum, 2)
                I could decide if it was a "switch on" or "switch off" event.

                Sorry for that
                Urs
                • 20. Re: analytic sql question
                  715939
                  Rob, Aketi and Metzguar thanks for your help.
                  I think Rob' s solution is slower than Aketi' s since it has more analytical funcions which has over cost.
                  Aketi' s solution was very intuitive..

                  Regards..
                  emrah
                  • 21. Re: analytic sql question
                    Rob van Wijk
                    emrahsaglik@gmail.com wrote:
                    I think Rob' s solution is slower than Aketi' s since it has more analytical funcions which has over cost.
                    As already mentioned: I think not.

                    You have the data, so you can test it (hint). I'm curious.

                    Regards,
                    Rob.
                    • 22. Re: analytic sql question
                      Aketi Jyuuzou
                      Rob
                      Thanks for Good status report.

                      There is a good news ;-)
                      We can exchnage dense_rank for ColumnValue like below SQL,
                      Because In this case,there is no data which is difference >=2.
                      set autot trace
                      
                      select min(time),max(time),myuser,value
                      from (select time,myuser,value,
                            /*dense_rank() over(order by time)*/
                            time -Row_Number() over(partition by myuser,value
                                                    order by time) as distance
                            from mytable)
                      group by myuser,value,distance
                      order by myuser,min(time);
                      
                      ---------------------------------------------------------------------------------
                      | Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
                      ---------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT      |         |     3 |    90 |     6  (50)| 00:00:01 |
                      |   1 |  SORT ORDER BY        |         |     3 |    90 |     6  (50)| 00:00:01 |
                      |   2 |   HASH GROUP BY       |         |     3 |    90 |     6  (50)| 00:00:01 |
                      |   3 |    VIEW               |         |    10 |   300 |     4  (25)| 00:00:01 |
                      |   4 |     WINDOW SORT       |         |    10 |    70 |     4  (25)| 00:00:01 |
                      |   5 |      TABLE ACCESS FULL| MYTABLE |    10 |    70 |     3   (0)| 00:00:01 |
                      ---------------------------------------------------------------------------------
                      • 23. Re: analytic sql question
                        Rob van Wijk
                        Good catch, Aketi!
                        Only applicable in this special case where the time also is an integer.

                        Regards,
                        Rob.
                        • 24. Re: analytic sql question
                          666352
                          Hi,

                          The below article explain Aketi's method(TSQL)

                          http://beyondrelational.com/blogs/tc/archive/2009/07/30/tsql-challenge-9-solution-by-syed-mehroz-alam.aspx


                          Regards salim.
                          • 25. Re: analytic sql question
                            666352
                            Hi Rob,

                            What do you think about this query ?

                            Regards Salim.
                            select min(time) starttime, max(time) endtime , myuser,value
                            from(select time,myuser,value ,grp
                                    from mytable
                                    model
                                    partition by (myuser,value)
                                    dimension by (row_number()over(partition by myuser,value order by time)rn)
                                    measures(time,0 grp)ignore nav
                                    rules (grp[any] = case when time[cv()-1]+1!=time[cv()] then 
                                                                 grp[cv()-1]+1
                                                           else  grp[cv()-1] end))
                            group by myuser,value,grp
                            order by starttime
                                              
                            
                             STARTTIME    ENDTIME M V
                            ---------- ---------- - -
                                     1          2 A X
                                     3          4 B Y
                                     5          5 B X
                                     5          5 A X
                                     6          8 A Y
                                     7          7 B Y
                            
                            
                            6 rows selected.
                            • 26. Re: analytic sql question
                              Rob van Wijk
                              Salim Chelabi  wrote:
                              What do you think about this query ?
                              Hi Salim,

                              I think it is a nice alternative. A model clause query tends to be a tiny bit heavier than an alternative using only analytics (provided they do the same amount of sorting). Yours is doing only one sort, so that's good.

                              I'd add ordered rule evaluation for robustness, because as it is, your solution works "by accident". This is because you happen to sort in your analytic function (order by time).

                              Maybe you can add a performance comparison with Aketi's solution on a set with, say, 10,000 rows?

                              Regards,
                              Rob.
                              • 27. Re: analytic sql question
                                666352
                                Hi Rob,

                                Thank you for your explaintion.

                                Regards Salim.
                                1 2 Previous Next