Forum Stats

  • 3,827,849 Users
  • 2,260,830 Discussions
  • 7,897,398 Comments

Discussions

analytical function help

OU_230
OU_230 Member Posts: 84 Blue Ribbon
edited Sep 19, 2017 1:04PM in General Database Discussions

I have following table:

with  tbl as (select 1 col1, 'a'  col2 from dual            union select 2 , 'a' from dual            union  select 3 , 'b' from dual            union  select 4 , 'a' from dual            union  select 5 , 'a' from dual            union  select 6 , 'b' from dual            union  select 7 , 'b' from dual) select   * from tbl order by col1;

I want following result  i.e first value from  col2   whenever col2 value changes.

1 ,a

3 ,b

4 ,a

6 ,b

How can I do this in SQL  using analytical function

Best Answer

  • ddf_dba
    ddf_dba Member Posts: 1,398 Bronze Trophy
    edited Sep 18, 2017 4:48PM Answer ✓

    Here is one way (others my have better offerings):

    with  tbl as (select 1 col1, 'a'  col2 from dual

                union select 2 , 'a' from dual

                union  select 3 , 'b' from dual

                union  select 4 , 'a' from dual

                union  select 5 , 'a' from dual

                union  select 6 , 'b' from dual

                union  select 7 , 'b' from dual

    ),

    lag_data as(

    select col1, col2, lag(col2) over (order by col1) col2a from tbl

    )

    select col1, col2

    from lag_data

    where col2a is null or col2a <> col2

    order by col1;

    The results are:

    BING @ quanghoo >  with  tbl as (select 1 col1, 'a'  col2 from dual

      2              union select 2 , 'a' from dual

      3              union  select 3 , 'b' from dual

      4              union  select 4 , 'a' from dual

      5              union  select 5 , 'a' from dual

      6              union  select 6 , 'b' from dual

      7              union  select 7 , 'b' from dual

      8  ),

      9  lag_data as(

    10   select col1, col2, lag(col2) over (order by col1) col2a from tbl

    11  )

    12  select col1, col2

    13  from lag_data

    14  where col2a is null or col2a <> col2

    15  order by col1;

          COL1 C

    ---------- -

             1 a

             3 b

             4 a

             6 b

    BING @ quanghoo >

    David Fitzjarrell

    OU_230

Answers

  • ddf_dba
    ddf_dba Member Posts: 1,398 Bronze Trophy
    edited Sep 18, 2017 4:48PM Answer ✓

    Here is one way (others my have better offerings):

    with  tbl as (select 1 col1, 'a'  col2 from dual

                union select 2 , 'a' from dual

                union  select 3 , 'b' from dual

                union  select 4 , 'a' from dual

                union  select 5 , 'a' from dual

                union  select 6 , 'b' from dual

                union  select 7 , 'b' from dual

    ),

    lag_data as(

    select col1, col2, lag(col2) over (order by col1) col2a from tbl

    )

    select col1, col2

    from lag_data

    where col2a is null or col2a <> col2

    order by col1;

    The results are:

    BING @ quanghoo >  with  tbl as (select 1 col1, 'a'  col2 from dual

      2              union select 2 , 'a' from dual

      3              union  select 3 , 'b' from dual

      4              union  select 4 , 'a' from dual

      5              union  select 5 , 'a' from dual

      6              union  select 6 , 'b' from dual

      7              union  select 7 , 'b' from dual

      8  ),

      9  lag_data as(

    10   select col1, col2, lag(col2) over (order by col1) col2a from tbl

    11  )

    12  select col1, col2

    13  from lag_data

    14  where col2a is null or col2a <> col2

    15  order by col1;

          COL1 C

    ---------- -

             1 a

             3 b

             4 a

             6 b

    BING @ quanghoo >

    David Fitzjarrell

    OU_230
  • AndrewSayer
    AndrewSayer Member Posts: 13,007 Gold Crown
    edited Sep 18, 2017 7:39PM
    OU_230 wrote:I have following table:
    1. withtblas(select1col1,'a'col2fromdual
    2. unionselect2,'a'fromdual
    3. unionselect3,'b'fromdual
    4. unionselect4,'a'fromdual
    5. unionselect5,'a'fromdual
    6. unionselect6,'b'fromdual
    7. unionselect7,'b'fromdual
    8. )select*fromtblorderbycol1;
    with tbl as (select 1 col1, 'a' col2 from dual  union select 2 , 'a' from dual  union select 3 , 'b' from dual  union select 4 , 'a' from dual  union select 5 , 'a' from dual  union select 6 , 'b' from dual  union select 7 , 'b' from dual ) select * from tbl order by col1;
    I want following result i.e first value from col2 whenever col2 value changes.1 ,a3 ,b4 ,a6 ,bHow can I do this in SQL using analytical function

    Is this homework? It's hard to see the usefulness of the result. Would you always want the entire result set? Or would you just want the latest 'b' or similar?

    I won't bother giving any SQL, David has already shared a solution that will do one read and one sort of the data set (and very little memory will be needed for the lag), admittedly David's also sorted the result set with an order by afterwards, but that's just for niceness. Given your stated requirement, that's probably a most efficient way of getting there (there will always be other methods that do the same amount of work just slightly differently, but there's no point in finding them if you've already got a working good solution).

  • OU_230
    OU_230 Member Posts: 84 Blue Ribbon
    edited Sep 19, 2017 11:29AM

    Excellent! Thanks for the help.  It worked well.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,979 Blue Diamond
    edited Sep 19, 2017 12:23PM

    David,

    Here's a little surprise with your solution (I cut-n-pasted it to see if I could rewrite to a match_recognize) - inlining a WITH subquery taking a different plan from writing the subquery inline in the first place:

    The plan for your query:

    -------------------------------------------------------------------------

    | Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time    |

    -------------------------------------------------------------------------

    |  0 | SELECT STATEMENT |      |    7 |    56 |    16  (13)| 00:00:01 |

    |  1 |  SORT ORDER BY  |      |    7 |    56 |    16  (13)| 00:00:01 |

    |*  2 |  VIEW          |      |    7 |    56 |    15  (7)| 00:00:01 |

    |  3 |    WINDOW SORT  |      |    7 |    42 |    15  (7)| 00:00:01 |

    |  4 |    VIEW        |      |    7 |    42 |    14  (0)| 00:00:01 |

    |  5 |      UNION-ALL  |      |      |      |            |          |

    |  6 |      FAST DUAL  |      |    1 |      |    2  (0)| 00:00:01 |

    |  7 |      FAST DUAL  |      |    1 |      |    2  (0)| 00:00:01 |

    |  8 |      FAST DUAL  |      |    1 |      |    2  (0)| 00:00:01 |

    |  9 |      FAST DUAL  |      |    1 |      |    2  (0)| 00:00:01 |

    |  10 |      FAST DUAL  |      |    1 |      |    2  (0)| 00:00:01 |

    |  11 |      FAST DUAL  |      |    1 |      |    2  (0)| 00:00:01 |

    |  12 |      FAST DUAL  |      |    1 |      |    2  (0)| 00:00:01 |

    -------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

      2 - filter("COL2A" IS NULL OR "COL2A"<>"COL2")

    I had not expected to see the optimizer do the "sort order by" since it ought to have inferred that the data were already sorted correctly as a consequence of the window sort. 

    Here's the plan with the inline version of the code:

    with  tbl as (select 1 col1, 'a'  col2 from dual

                union all select 2 , 'a' from dual

                union all  select 3 , 'b' from dual

                union all  select 4 , 'a' from dual

                union all  select 5 , 'a' from dual

                union all  select 6 , 'b' from dual

                union all  select 7 , 'b' from dual

    )

    select col1, col2

    from    (

            select col1, col2, lag(col2) over (order by col1) col2a from tbl

            )

    where col2a is null or col2a <> col2

    order by col1

    ;

    -------------------------------------------------------------------------

    | Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |

    -------------------------------------------------------------------------

    |   0 | SELECT STATEMENT |      |       |       |    15 (100)|          |

    |*  1 |  VIEW            |      |     7 |    56 |    15   (7)| 00:00:01 |

    |   2 |   WINDOW SORT    |      |     7 |    42 |    15   (7)| 00:00:01 |

    |   3 |    VIEW          |      |     7 |    42 |    14   (0)| 00:00:01 |

    |   4 |     UNION-ALL    |      |       |       |            |          |

    |   5 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |

    |   6 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |

    |   7 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |

    |   8 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |

    |   9 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |

    |  10 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |

    |  11 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |

    -------------------------------------------------------------------------

    Predicate Information (identified by operation id):

    ---------------------------------------------------

       1 - filter(("COL2A" IS NULL OR "COL2A"<>"COL2"))

    The effect is the same with 12.1.0.2 and 12.2.0.1

    Regards

    Jonathan Lewis

    P.S. This might be a correct match_recognize() solution - but I still need lots of practice with the feature

    with  tbl as (

              select 1 col1, 'a'  col2 from dual

    union all select 2 , 'a' from dual

    union all select 3 , 'b' from dual

    union all select 4 , 'a' from dual

    union all select 5 , 'a' from dual

    union all select 6 , 'b' from dual

    union all select 7 , 'b' from dual

    )

    select * from tbl

    match_recognize(

            order by col1

            measures  first(col1) as startvalue, first(col2) as string_repeat

            pattern(a b*)

            define b as col2 = prev(col2)

    )

    ;

    OU_230
  • ddf_dba
    ddf_dba Member Posts: 1,398 Bronze Trophy
    edited Sep 19, 2017 1:04PM

    I literally threw it together and saw that it worked; I didn't take the time to check plans like you did.  There is an ORDER BY in the query so the extra sort doesn't really surprise me but it is interesting.

    Thanks for taking the time, I appreciate the insight.

    David Fitzjarrell

This discussion has been closed.