Forum Stats

  • 3,734,273 Users
  • 2,246,935 Discussions
  • 7,857,216 Comments

Discussions

Need help with query

Hi All,

I need some help with the following. I have a dataset that I need to be able to count how many times a column duplicate and add 1.


So on the following table, everytime the column C_READSTATUS contains "ES" and the previous was also ES then it should display 1, from there always add 1 is the C_READSTATUS contains "ES". Eventually I need to be able to display something like this:

When C_READSTATUS doesn't contain ES then it reset to 0.


Not sure the approach to use here, I have try using the LAG but I'm stuck from there.

Thanks,

Marc

Tagged:

Best Answers

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,398 Black Diamond
    Accepted Answer

    It isnt clear if there can be multiple rows for the same year month and waht results you'd need in such case. I'll assume one row per year month:

    WITH SAMPLE AS (
                    SELECT 12345 C_ACCOUNT,2583695 C_METER,'RE' C_READSTATUS,DATE '2020-02-11' T_READDATE FROM DUAL UNION ALL
                    SELECT 12345,2583695,'RE',DATE '2020-03-12' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'RE',DATE '2020-04-10' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-05-13' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-06-12' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-07-14' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-08-14' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-09-15' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-10-15' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-11-13' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'RE',DATE '2020-12-13' FROM DUAL
                   )
    SELECT  M.C_ACCOUNT,
            M.C_METER,
            M.T_READDATE,
            M.CNT
      FROM  SAMPLE
      MATCH_RECOGNIZE(
                      PARTITION BY C_ACCOUNT,
                                   C_METER
                      ORDER BY T_READDATE
                      MEASURES
                        GREATEST(COUNT(ES.C_READSTATUS) - 1,0) CNT
                      ALL ROWS PER MATCH
                      PATTERN(RE+|ES+)
                      DEFINE ES AS C_READSTATUS = 'ES' AND C_READSTATUS = FIRST(C_READSTATUS),
                             RE AS C_READSTATUS = 'RE' AND C_READSTATUS = FIRST(C_READSTATUS)
                     ) M
    /
    
     C_ACCOUNT    C_METER T_READDATE        CNT
    ---------- ---------- ---------- ----------
         12345    2583695 11-FEB-20           0
         12345    2583695 12-MAR-20           0
         12345    2583695 10-APR-20           0
         12345    2583695 13-MAY-20           0
         12345    2583695 12-JUN-20           1
         12345    2583695 14-JUL-20           2
         12345    2583695 14-AUG-20           3
         12345    2583695 15-SEP-20           4
         12345    2583695 15-OCT-20           5
         12345    2583695 13-NOV-20           6
         12345    2583695 13-DEC-20           0
    
    
    11 rows selected.
    
    
    SQL>
    

    SY.

    Marc Racine

Answers

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,512 Red Diamond

    Hi,

    Whenever you have a question, please post a little sample data in a usable form (such as CREATE TABLE and INSERT statements), so the people who want to help you car re-create the problem and test their ideas. Also post the exact results you want from the given data, and an explanation (in general terms) of you get those results from that data. Always say which version of Oracle you're using (e.g. 12.2.0.1.0).

    See:

    How to Ask Questions in Developer Community Spaces - oracle-tech

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,398 Black Diamond

    I'll assume you need to check it within C_ACCOUNT, C_METER:

    CASE
     WHEN C_READSTATUS = 'ES' AND C_READSTATUS = LAG(C_READSTATUS) OVER(PARTITION BY C_ACCOUNT,C_METER ORDER BY C_READDATE)
       THEN 1
       ELSE 0
    END
    

    SY.

  • Marc Racine
    Marc Racine Member Posts: 11

    I'm using Oracle 12c, that's exactly what I currently have but it's the portion after that I can't wrap my head around.


  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,512 Red Diamond
    edited Dec 18, 2020 4:07PM

    Hi,

    In general, you need to post CREATE TABLE and INSERT statements for a little sample data. If you can show what you want to do using commonly available tables, such as those in the scott schema, then you don't need to; just post the desired results and an explanation.

    For example: "I want to show how many consecutive employees (in alphabetic order) from the scott.emp table are in deptartment 30, like this:

    Here's one way to do that: See next reply for a better solution.

    ELECT   ename
    ,	 deptno
    ,	 consecutive
    FROM	 scott.emp
    MATCH_RECOGNIZE
          (
    --      PARTITION BY ...  -- if needed
           ORDER BY   ename
           MEASURES   COUNT (*) - 1 AS consecutive
           ALL ROWS PER MATCH
           PATTERN    (othr? d30*)
           DEFINE    d30 AS  deptno = 30
         )
    ORDER BY ename
    ;
    


  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,512 Red Diamond

    Here's a better MATCH_RECOGNIZE solution for the scott.emp.deptno=30 problem:

    SELECT  ename
    ,	 deptno
    ,	 consecutive
    FROM	 scott.emp
    MATCH_RECOGNIZE
         (
    --      PARTITION BY ...  -- if needed
           ORDER BY   ename
           MEASURES   COUNT (*)  AS consecutive
           ALL ROWS PER MATCH
           PATTERN    (d30*)
           DEFINE    d30  AS  deptno = 30
    	 )
    ORDER BY ename
    ;
    


  • Marc Racine
    Marc Racine Member Posts: 11
    edited Dec 18, 2020 4:29PM

    Here's some sample data.

    CREATE TABLE BIF016
    ( C_ACCOUNT VARCHAR2(15) NOT NULL,
      C_READSTATUS VARCHAR2(2) NOT NULL,
      T_READDATE TIMESTAMP(9)
    )
    ;
    
    INSERT ALL
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 02/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 03/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 04/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 05/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 06/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 07/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 08/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 09/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 10/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 11/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 12/11/2020 00:00:00.000000000)
    ;
    


    Output I'm looking for is by month based on the T_READDATE:


  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,512 Red Diamond
    edited Dec 18, 2020 5:02PM

    Hi,

    Would you like to get solutions that work? Then make sure the CREATE TABLE and INSERT statements you post work, too.

    INSERT ALL
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 02/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 03/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 04/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 05/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 06/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 07/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 08/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 09/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 10/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 11/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 12/11/2020 00:00:00.000000000)
    ;
    

    There's no FROM clause in that statement. Also, you should be calling TO_TIMESTAMP (or something similar) to populate the t_readdate column.

    Make sure the sample data and reflects your real problem. If you have multiple values for c_account in your real table, then you should have multiple values for c_account in your sample data, too.

    It should be easy to adapt the MATCH_RECOGNIZE solution above. Just substitute your own table- and column names, and change the DEFINE clause to something like:

    DEFINE    es AS c_readstatus = 'ES'
    
  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,398 Black Diamond
    Accepted Answer

    It isnt clear if there can be multiple rows for the same year month and waht results you'd need in such case. I'll assume one row per year month:

    WITH SAMPLE AS (
                    SELECT 12345 C_ACCOUNT,2583695 C_METER,'RE' C_READSTATUS,DATE '2020-02-11' T_READDATE FROM DUAL UNION ALL
                    SELECT 12345,2583695,'RE',DATE '2020-03-12' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'RE',DATE '2020-04-10' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-05-13' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-06-12' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-07-14' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-08-14' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-09-15' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-10-15' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'ES',DATE '2020-11-13' FROM DUAL UNION ALL
                    SELECT 12345,2583695,'RE',DATE '2020-12-13' FROM DUAL
                   )
    SELECT  M.C_ACCOUNT,
            M.C_METER,
            M.T_READDATE,
            M.CNT
      FROM  SAMPLE
      MATCH_RECOGNIZE(
                      PARTITION BY C_ACCOUNT,
                                   C_METER
                      ORDER BY T_READDATE
                      MEASURES
                        GREATEST(COUNT(ES.C_READSTATUS) - 1,0) CNT
                      ALL ROWS PER MATCH
                      PATTERN(RE+|ES+)
                      DEFINE ES AS C_READSTATUS = 'ES' AND C_READSTATUS = FIRST(C_READSTATUS),
                             RE AS C_READSTATUS = 'RE' AND C_READSTATUS = FIRST(C_READSTATUS)
                     ) M
    /
    
     C_ACCOUNT    C_METER T_READDATE        CNT
    ---------- ---------- ---------- ----------
         12345    2583695 11-FEB-20           0
         12345    2583695 12-MAR-20           0
         12345    2583695 10-APR-20           0
         12345    2583695 13-MAY-20           0
         12345    2583695 12-JUN-20           1
         12345    2583695 14-JUL-20           2
         12345    2583695 14-AUG-20           3
         12345    2583695 15-SEP-20           4
         12345    2583695 15-OCT-20           5
         12345    2583695 13-NOV-20           6
         12345    2583695 13-DEC-20           0
    
    
    11 rows selected.
    
    
    SQL>
    

    SY.

    Marc Racine
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,512 Red Diamond
    edited Dec 18, 2020 6:17PM

    Hi, @User_TCZ1L

    Sorry, I missed the requirement that the first 'ES' has consecutive=0. Here's how I would do it, using the sample table that @Solomon Yakobson posted:

    SELECT  c_account, c_meter, t_readdate
    ,	 c_readstatus
    ,	 consecutive
    FROM	 sample
    MATCH_RECOGNIZE
         (
    --      PARTITION BY c_account, c_meter -- if needed
           ORDER BY   t_readdate 
           MEASURES   GREATEST ( COUNT (*) - 1
    	   		  	   , 0
    				   ) AS consecutive
           ALL ROWS PER MATCH
           PATTERN    (es*) 
           DEFINE    es AS c_readstatus = 'ES'
    	 )
    ORDER BY t_readdate 
    ;
    

    What results would you want it there were separate streaks of consecutive 'ES's? That is, if we add this data to sample:

    UNION ALL    SELECT 12345,2583695,'ES',DATE '2021-01-01' FROM DUAL
    UNION ALL    SELECT 12345,2583695,'ES',DATE '2021-02-01' FROM DUAL
    UNION ALL    SELECT 12345,2583695,'ES',DATE '2021-04-01' FROM DUAL
    

    would you want consecutive to resume with the next highest value, or should it re-start to 0?

    Do you still count rows as "consecutive" if they are not consecutive months (e.g., in the data above, there is no data for March 2021. Do February and April count as "consecutive"?)

  • Marc Racine
    Marc Racine Member Posts: 11

    would you want consecutive to resume with the next highest value, or should it re-start to 0? It should restart at 0.

    Do you still count rows as "consecutive" if they are not consecutive months? No, the month need to be consecutive to count.

    What @Solomon Yakobson posted seems to work. I'm trying to understand the following part:

    MEASURES
                        GREATEST(COUNT(ES.C_READSTATUS) - 1,0) CNT
                      ALL ROWS PER MATCH
                      PATTERN(RE+|ES+)
                      DEFINE ES AS C_READSTATUS = 'ES' AND C_READSTATUS = FIRST(C_READSTATUS),
                             RE AS C_READSTATUS = 'RE' AND C_READSTATUS = FIRST(C_READSTATUS)
    


  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,512 Red Diamond

    If you wanted to do this with an analytic function, I think ROW_NUMBER would be better than LAG:

    WITH  got_grp  AS
    (
      SELECT c_readstatus, t_readdate
      ,	  c_account, c_meter  -- if needed
      ,    ROW_NUMBER () OVER ( -- PARTITION BY c_account, c_meter -- if neededename)
      	  	    	     ORDER BY   t_readdate
    			    )
         - ROW_NUMBER () OVER ( PARTITION BY c_readstatus
    	           -- ,       c_account, c_meter -- if needed
    	             ORDER BY   t_readdate
    			    ) AS grp
      FROM  sample
    )
    SELECT  t_readdate
    -- ,   c_account, c_meter -- if needed
    ,     CASE
    	   WHEN c_readstatus = 'ES'
    	   THEN ROW_NUMBER () OVER ( PARTITION BY c_readstatus
    	   	  	    	    	 ,	    grp
    				   -- ,	    c_account, c_meter -- if needed
    	   	  	    	    	 ORDER BY   t_readdate
    	   			    ) - 1
    	   ELSE 0
    	 END  AS consecutive
    FROM   got_grp
    ORDER BY -- c_account, c_meter, -- if needed
         t_readdate
    ;
    


  • Marc Racine
    Marc Racine Member Posts: 11

    Sorry about the sample date, here's sample date. I'm also looking to output the data by month and display what's the count for each mount.

    CREATE TABLE BIF016
    ( C_ACCOUNT VARCHAR2(15) NOT NULL,
      C_READSTATUS VARCHAR2(2) NOT NULL,
      T_READDATE TIMESTAMP(9)
    )
    ;
    
    
    INSERT ALL
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 02/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 03/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'RE', 04/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 05/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 06/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 07/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 08/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 09/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 10/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 11/11/2020 00:00:00.000000000)
    INTO BIF016 (C_ACCOUNT, C_READSTATUS,T_READDATE) VALUES ('12345', 'ES', 12/11/2020 00:00:00.000000000)
    ;
    



  • Marc Racine
    Marc Racine Member Posts: 11

    Wow that's exactly what I need. Can you explain me the following part?

    MEASURES
                        GREATEST(COUNT(ES.C_READSTATUS) - 1,0) CNT
                      ALL ROWS PER MATCH
                      PATTERN(RE+|ES+)
                      DEFINE ES AS C_READSTATUS = 'ES' AND C_READSTATUS = FIRST(C_READSTATUS),
                             RE AS C_READSTATUS = 'RE' AND C_READSTATUS = FIRST(C_READSTATUS)
    


  • Marc Racine
    Marc Racine Member Posts: 11

    This solution seems to work, but I also need to exclude some results, if I do using this the number keep going up.


  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,398 Black Diamond
    edited Dec 18, 2020 7:46PM

    Exclude "some results" is very descriptive - everyone understands what's going on, right?

    SY.

    Frank Kulash
  • Marc Racine
    Marc Racine Member Posts: 11

    Sorry the reply was sent before I could complete the sentence.

    This solution seems to work, but I also need to exclude some rows, if I do using the where clause at the end the number keep going up.


  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,398 Black Diamond

    Again what is "some rows" what is the criteria to exclude them?

    SY.

  • Marc Racine
    Marc Racine Member Posts: 11

    I added the flag l_cancel = 0.

    SELECT  M.C_ACCOUNT,
            M.C_METER,
            M.T_READDATE,
            M.C_READSTATUS,
            l_cancel,
            M.CNT
      FROM  BIF016
      MATCH_RECOGNIZE(
                      PARTITION BY C_ACCOUNT
                      ORDER BY T_READDATE
                      MEASURES
                        GREATEST(COUNT(ES.C_READSTATUS) - 0,0) CNT
                      ALL ROWS PER MATCH 
                      PATTERN(RE+|ES+)
                      DEFINE ES AS C_READSTATUS = 'ES' AND C_READSTATUS = FIRST(C_READSTATUS),
                             RE AS C_READSTATUS = 'RE' AND C_READSTATUS = FIRST(C_READSTATUS)
                     ) M
    WHERE AND l_cancel = 0
    
    ORDER BY c_account
    ;
    
  • Here's what I'm running. Note that the flag l_cancel is the one I added.

    SELECT  M.C_ACCOUNT,
            M.C_METER,
            M.T_READDATE,
            M.C_READSTATUS,
            l_cancel,
            M.CNT
      FROM  BIF016
      MATCH_RECOGNIZE(
                      PARTITION BY C_ACCOUNT
                      ORDER BY T_READDATE
                      MEASURES
                        GREATEST(COUNT(ES.C_READSTATUS) - 0,0) CNT
                      ALL ROWS PER MATCH 
                      PATTERN(RE+|ES+)
                      DEFINE ES AS C_READSTATUS = 'ES' AND C_READSTATUS = FIRST(C_READSTATUS),
                             RE AS C_READSTATUS = 'RE' AND C_READSTATUS = FIRST(C_READSTATUS)
                     ) M
    WHERE AND l_cancel = 0
    
    ORDER BY c_account
    ;
    
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,512 Red Diamond
    edited Dec 20, 2020 3:12PM

    Hi,

    Here's what I'm running. Note that the flag l_cancel is the one I added.

    That's doing the MATCH_RECOGNIZE and WHERE in the same query, so the WHERE is applied after consecutive is calculated. Try doing the WHERE first, in a separate sub-query, as shown yesterday (3rd reply above).

Sign In or Register to comment.