- 3,714,818 Users
- 2,242,634 Discussions
- 7,845,078 Comments
Forum Stats
Discussions
Categories
- 12 Data
- 362.2K Big Data Appliance
- 6 Data Science
- 1.5K Databases
- 447 General Database Discussions
- 3.7K Java and JavaScript in the Database
- 22 Multilingual Engine
- 487 MySQL Community Space
- 3 NoSQL Database
- 7.6K Oracle Database Express Edition (XE)
- 2.7K ORDS, SODA & JSON in the Database
- 415 SQLcl
- 42 SQL Developer Data Modeler
- 184.8K SQL & PL/SQL
- 21K SQL Developer
- 1.8K Development
- 3 Developer Projects
- 32 Programming Languages
- 135K Development Tools
- 7 DevOps
- 3K QA/Testing
- 236 Java
- 4 Java Learning Subscription
- 10 Database Connectivity
- 65 Java Community Process
- Java 25
- 8 Java APIs
- 141.1K Java Development Tools
- 5 Java EE (Java Enterprise Edition)
- 153K Java Essentials
- 134 Java 8 Questions
- 86.2K Java Programming
- 270 Java Lambda MOOC
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 10 Java SE
- 13.8K Java Security
- 3 Java User Groups
- 22 JavaScript - Nashorn
- 18 Programs
- 118 LiveLabs
- 28 Workshops
- 9 Software
- 3 Berkeley DB Family
- 3.5K JHeadstart
- 5.7K Other Languages
- 2.3K Chinese
- 3 Deutsche Oracle Community
- 10 Español
- 1.9K Japanese
- 2 Portuguese
SQL to check for Re-Occurring Event.

Hello All ,
Please will need help in solving below business requirement, excerpts below...
Thanks very much
Oracle version: 12.2.0.1.0
Problem:
EVENTS_FACT_TAB.NEW_EVENT_FLAG - Should tell us if the viewing event was the first view event for the given device
EVENTS_FACT_TAB.REJUV_EVENT_FLAG - Should tell us if the viewing event was the first view event after a specified absence for the given device
for the past 364 Days
Notes for implementation :
First check the Lookup_dim table to see if the device_id is present or not
We should be able to handle multiple view events on a given day for a given device. Only the FIRST view event of the day should be tagged as "NEW",
not the rest of the view events from the same day
This applies to REJUV_EVENT_FLAG as well.
DDL and DML:
Expected Result:
Current Attempt: To resolve_New_event_flag
--load below value into temp table
SELECT FACT_ID,
DEVICE_ID,
VIEW_START,
VIEW_END,
NEW_EVENT_FLAG,
REJUV_EVENT_FLAG ,
ROW_NUMBER () OVER (PARTITION BY device_id ORDER BY VIEW_END DESC) RANK_DEVICE
FROM EVENTS_FACT_TAB;
--update fact table
UPDATE EVENTS_FACT_TAB a
set a.NEW_EVENT_FLAG = 1
where a.device_id in (select b.device_id from Temp_table where rank_device = 1)
Best Answer
-
Hi,
You may want something like this:
MERGE INTO events_fact_tab dst USING ( WITH got_prev_view_start AS ( SELECT ROWID AS r_id , view_start , LAG (view_end) OVER ( PARTITION BY device_id ORDER BY view_start ) AS prev_view_end FROM events_fact_tab ) SELECT r_id , CASE WHEN prev_view_end IS NULL THEN 1 ELSE 0 END AS new_event_flag , CASE WHEN prev_view_end < view_start - 364 THEN 1 ELSE 0 END AS rejuv_event_flag FROM got_prev_view_start ) src ON (dst.ROWID = src.r_id) WHEN MATCHED THEN UPDATE SET dst.new_event_flag = src.new_event_flag , dst.rejuv_event_flag = src.rejuv_event_flag WHERE LNNVL (dst.new_event_flag = src.new_event_flag) -- For efficiency OR LNNVL (dst.rejuv_event_flag = src.rejuv_event_flag) -- For efficiency ;
This assumes
- the time between events is measured from vew_end of one row to view_start of another
- neither view_start and view_end can be NULL
- the combination (device_id, view_start) is unique
- if row A has an earlier view_start than row B, then view_end on row A must not be later than view_end on row B. (That is, events may overlap, but on cannot be entirely contained within another.)
If any of these assumptions are wrong, then the same basic approach will still work: some of the details will just be a little more complicated.
Here are the results I get from the data you originally posted.
Answers
-
Hi,
Not everyone who wants to hep you is as familiar with your application as you are, so be sure to explain clearly what you mean and what you want. For example, when you say:
EVENTS_FACT_TAB.NEW_EVENT_FLAG - Should tell us if the viewing event was the first view event for the given device
what, exactly, does that mean? What is a "viewing event"? Does every row in the events_fact_tab table represent a viewing event?
EVENTS_FACT_TAB.REJUV_EVENT_FLAG - Should tell us if the viewing event was the first view event after a specified absence for the given device
How do you compute the amount of time between events? E.g., do you compare the veiw_end value of the earlier event to the view_start value of the later event?
Only the FIRST view event of the day should be tagged as "NEW",
How do you determine which row is first? Is it view_start? View_end? Something else?
First check the Lookup_dim table to see if the device_id is present or not
What, exactly, does "present" mean? Do you just need to check if a given value occurs in the watch_device_id column? Do the TIMESTAMP columns have anything to do with deciding if the device_id is present? What happens if it is present? What happens differently if it is not present?
Describe, at least for one device, how you would get the desired flags. (Just to make sure the problem is clear; this may not be the best way to do it in SQL). For example, device '896635'. Why is one of the rows from 2020 considered the new event, and not the row from 2018?
I'm not sure exactly what you want to do, but, whatever it is, I'll be you can do it without a temporary table. In Oracle, you can perform a sub-query (a WITH clause, for example, of the USING clause of a MERGE statement) and then use its result set as if that were a table.
-
EVENTS_FACT_TAB.NEW_EVENT_FLAG - Should tell us if the viewing event was the first view event for the given device
what, exactly, does that mean? What is a "viewing event"?
Does every row in the events_fact_tab table represent a viewing event?
--yes every row in the EVENTS_FACT_TAB is a view event , usually will be loaded on a daily basis .
What, exactly, does "present" mean?
--"Present" means if the device_id exists in the LOOKUP_DIM table .
--the table is supposed to have all distinct device_ids to know when first and last they appeared.
Do you just need to check if a given value occurs in the watch_device_id column?
-- Yes also and the LAST_ACTIVITY_TS to know the Timestamp the device was last seen. when last the act , but the face , for this requirement the watch_device_id in the LOOKUP_DIM
Do the TIMESTAMP columns What happens if it is present?
What happens differently if it is not present?
---when no device is present or available in the LOOKUP_DIM ,
then its shows it hasn't been inserted before. Then we use the values in the EVENTS_FACT_TAB and then rank accordingly to see
if it is a new device ,then if the device is being rejuvunated(coming in after 364 days of absence)
-
Hi,
Okay; now I know what "viewing event" and "present" mean. However, none of my other questions have been answered. When you post a specific example of how you determine the desired flags, then maybe I'll understand.
-
Thanks for your response so far.
I think we can skip the LOOKUP_DIM table for now as it is having same information.
So for every entry into the EVENTS_FACT_TAB , I am supposed to create a stored proc to update these two coulmns (
NEW_EVENT_FLAG , REJUV_EVENT_FLAG)
is the device_id new ? first time entry in the table then set NEW_EVENT_FLAG = 1
if the device has already been inserted before in the Fact , but hasn't been present in the past 364 days , and then show_up later ,
then the REJUV_EVENT_FLAG can be set to = 1
-
Hi,
I think we can skip the LOOKUP_DIM table for now as it is having same information.
Okay.
You still haven't explained how you compute the number of days absent. Say the only two rows for a device had these dates:
VIEW_START VIEW_END
11/01/2000 8:00 AM 11/02/2000 8:00 AM
11/10/2000 8:00 AM 11/20/2000 8:00 AM
How long would you say the absence was
- 8 days
- 9 days
- 18 days
- 19 days
- something else
?
I also don't understand the desired results for device '896635'.
Expected Result:
Why is the row with view_start = 10/6/2000 given new_event_flag = 1? Why not the row with view_start = 12/5/2018?
-
when we partition by device_id the minimum (view_start) entry will tell us that ts the first time the device is appearing so we can set the new_event_flag as 1 , if a new event re-occurs for that device after 364 days (when the next event is inserted) , we will check if the prior maximum view_end and if the device_id has been absent for 364 days , then comes back then REJUV_EVENT_FLAG is set = 1
Why is the row with view_start = 10/6/2000 given new_event_flag = 1? Why not the row with view_start = 12/5/2018?
--apologies for that ...that was due to bad data. which has then been fixed
-
You still haven't explained how you compute the number of days absent. Say the only two rows for a device had these dates:
absence is if any event is beinf re-inserted after 364 days past the prior max View_end of that same device_id
-
HI,
we will check if the prior view_end and if the device_id has been absent for 364 days , then comes back then REJUV_EVENT_FLAG is set = 1.
Okay, so you check view_end; then what do you do with it? Do you compare it to view_start on another row?
I see you changed your original message. Please don't change your messages after you post them. It makes the thread even harder to follow.
Are these really the results you want now?
Expected Result:
For device '896635', why is new_event_flag =1 on the row with view_start = 12/5/2018, and not the row with view_start = 10/6/2018?
-
Hi,
You may want something like this:
MERGE INTO events_fact_tab dst USING ( WITH got_prev_view_start AS ( SELECT ROWID AS r_id , view_start , LAG (view_end) OVER ( PARTITION BY device_id ORDER BY view_start ) AS prev_view_end FROM events_fact_tab ) SELECT r_id , CASE WHEN prev_view_end IS NULL THEN 1 ELSE 0 END AS new_event_flag , CASE WHEN prev_view_end < view_start - 364 THEN 1 ELSE 0 END AS rejuv_event_flag FROM got_prev_view_start ) src ON (dst.ROWID = src.r_id) WHEN MATCHED THEN UPDATE SET dst.new_event_flag = src.new_event_flag , dst.rejuv_event_flag = src.rejuv_event_flag WHERE LNNVL (dst.new_event_flag = src.new_event_flag) -- For efficiency OR LNNVL (dst.rejuv_event_flag = src.rejuv_event_flag) -- For efficiency ;
This assumes
- the time between events is measured from vew_end of one row to view_start of another
- neither view_start and view_end can be NULL
- the combination (device_id, view_start) is unique
- if row A has an earlier view_start than row B, then view_end on row A must not be later than view_end on row B. (That is, events may overlap, but on cannot be entirely contained within another.)
If any of these assumptions are wrong, then the same basic approach will still work: some of the details will just be a little more complicated.
Here are the results I get from the data you originally posted.
-
Okay, so you check view_end; then what do you do with it? Do you compare it to view_start on another row?
--when a new_event for that device is re-entered is only when you check the max( View_end ) of the prior event if it has been greater than 364 . no need to compare to view_start in another row.
Expected result
-
--when a new_event for that device is re-entered is only when you check the max( View_end ) of the prior event if it has been greater than 364 . no need to compare to view_start in another row.
Expected result
Are you serious? I thought this was just a typo, but you keep repeating it, even after I pointed out the problem. Explain, step-be-step, for device '896635' how you figure new_event_flag = 1 for the row with view_start=10/6/2018.
What does "max( View_end ) of the prior event" mean? How can a single event have multiple values for view_end?
-
My Apologies , seems I was confused at some point too. Thanks for the solution , looks like expected result .
Thanks very much for the time and effort