Forum Stats

  • 3,781,578 Users
  • 2,254,530 Discussions


timeseries/eventstream processing



  • User_2SRRN
    User_2SRRN Member Posts: 7 Green Ribbon

    absolutely true. I just thought Oracle wants it this way, since that button is here.

    thanks for the great help!

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 41,385 Red Diamond

    Hi, @User_2SRRN

    Thanks for posting the sample data. You still haven't posted the exact results you want from that data, or given a very clear idea of what you're trying to do. For example:

    1. I want to group consecutive rows (in order by event_ts) for the same customer. The combination (customer, event_ts) is unique.
    2. Normally, a group will start with a row where event_type = 'start' and end with a row where event_type = 'stop'. (E.g., customer 2000 has two groups.) Rows with other event_types are ignored.
    3. If there are consecutive rows with the same customer and event_ts, they will all belong to the same group, so a group will normally begin with one or more 'start' rows, and end with one or more 'stop' rows. (E.g., customer 3200 has two groups. The first group has three 'start' rows, and the second group has two 'stop' rows.)
    4. For each group, I want to display the customer, the earliest 'start' event_ts and the latest 'stop' event_ts.
    5. The last group might not have a 'stop' event (e.g. customer -2000). In that case, SYSDATE should be displayed as the latest stop event_ts.
    6. Any 'stop's that do not follow some 'start' (e.g. customer -1000) are ignored.

    If those happen to be your requirements, then MATCH_RECOGNIZE (as shown above) is the best solution. If you can't use MATCH_RECOGNIZE for some reason, then you can use LAG and COUNT, like this:

    WITH  got_new_grp  AS
    	SELECT  customer, event_ts, event_type
    	,	CASE
    		  WHEN event_type = 'start'
    		  AND  LAG (event_type, 1, 'stop')
    		  	   OVER ( PARTITION BY customer
    			   	   ORDER BY 	  event_ts
    				  ) = 'stop'
    		  THEN 'New group starts here'
    		END AS new_grp
    	FROM	eventStream
    	WHERE	event_type IN ('start', 'stop') -- if needed
    ,  got_grp_num  AS
    	SELECT  customer, event_ts, event_type
    	,	COUNT (new_grp) OVER ( PARTITION BY customer
    		   		    ORDER BY   event_ts
    				   ) AS grp_num
    	FROM	got_new_grp
    SELECT   customer
    ,	 MIN (event_ts)	AS first_start
    ,	 NVL ( MAX ( CASE
    		         WHEN event_type = 'stop'
    			 THEN event_ts
    	     , SYSDATE
    	     )			AS last_stop
    FROM	 got_grp_num
    WHERE	 grp_num > 0
    GROUP BY customer, grp_num
    ORDER BY customer, first_start

    The results from this query are:

    ---------- ------------------- -------------------
      -2000  2020-01-01 14:59:00    2021-10-15 11:32:30
       1000  2021-01-01 10:00:00    2021-01-01 12:00:00
       2000  2021-02-01 14:00:00    2021-02-01 16:00:00
       2000  2021-02-02 08:30:00    2021-02-02 11:30:00
       3000  2021-03-01 10:00:00    2021-03-01 14:00:00
       3100  2021-03-02 09:00:00    2021-03-02 21:00:00
       3200  2021-03-01 10:00:00    2021-03-01 14:00:00
       3200  2021-03-02 09:00:00    2021-03-02 21:00:00
       4000  2021-04-01 15:15:00    2021-10-15 11:32:30
       4100  2021-04-02 15:15:00    2021-10-15 11:32:30
  • User_2SRRN
    User_2SRRN Member Posts: 7 Green Ribbon

    it's all the funny stuff you see in Oracle CRM (ex Siebel)

    I guess, picking the first start avoids cheating SLAs. If they see, they're late, they just press the button again to "reset" the counter. Not sure about the last stop though ;-)

  • User_2SRRN
    User_2SRRN Member Posts: 7 Green Ribbon

    this is perfect requirements engineering and exactly what I was looking for :-)

  • User_H3J7U
    User_H3J7U Member Posts: 814 Gold Trophy
    edited Oct 18, 2021 1:20PM

    The logic is: take the first and last value from series defined by the patern. Pattern is a sequential start elements and then (sequential stop elements or end of input string).