SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

using analytical function to calculate concurrency between date range

522382May 5 2008 — edited May 16 2008

Folks,
I'm trying to use analytical functions to come up with a query that gives me the
concurrency of jobs executing between a date range.

For example:
JOB100 - started at 9AM - stopped at 11AM
JOB200 - started at 10AM - stopped at 3PM
JOB300 - started at 12PM - stopped at 2PM

The query would tell me that JOB1 ran with a concurrency of 2 because JOB1 and JOB2
were running started and finished within the same time. JOB2 ran with the concurrency
of 3 because all jobs ran within its start and stop time. The output would look like this.

JOB START STOP CONCURRENCY
=== ==== ==== =========
100 9AM 11AM 2
200 10AM 3PM 3
300 12PM 2PM 2

I've been looking at this post, and this one if very similar...
640412

Here is the sample data..

CREATE TABLE TEST_JOB
( jobid NUMBER,
created_time DATE,
start_time DATE,
stop_time DATE
)
/
insert into TEST_JOB values (100, sysdate -1, to_date('05/04/08 09:00:00','MM/DD/YY hh24:mi:ss'), to_date('05/04/08 11:00:00','MM/DD/YY hh24:mi:ss'));
insert into TEST_JOB values (200, sysdate -1, to_date('05/04/08 10:00:00','MM/DD/YY hh24:mi:ss'), to_date('05/04/08 13:00:00','MM/DD/YY hh24:mi:ss'));
insert into TEST_JOB values (300, sysdate -1, to_date('05/04/08 12:00:00','MM/DD/YY hh24:mi:ss'), to_date('05/04/08 14:00:00','MM/DD/YY hh24:mi:ss'));

select * from test_job;

JOBID|CREATED_TIME |START_TIME |STOP_TIME
----------|--------------|--------------|--------------
100|05/04/08 09:28|05/04/08 09:00|05/04/08 11:00
200|05/04/08 09:28|05/04/08 10:00|05/04/08 13:00
300|05/04/08 09:28|05/04/08 12:00|05/04/08 14:00

Any help with this query would be greatly appreciated.

thanks.
-peter

Nicolas Gasparotto

LAG() or LEAD() should help you. Take a look in the documentation.

Nicolas.

Frank Kulash

Hi,

If you have two date ranges b1-e1 and b2-e2 (b1 <= e1 and b2 <= e2), you can tell if they overlap by saying

(b1 <= e2) AND (e1 >= b2)

Use this condition in a self-join to see how many rows overlap with a given row.

522382

I'm not so sure about that.
From all the examples I've seen LAG and LEAD can give you the previous or next value
in relation to a particular row. In my case I need to count how many records have a start_time and stop_time that fall within my current record's start_time and stop_time.
I'll check it out further though to see if I can do some sort of aggregation the result of LAG and LEAD.

thanks for your input.
-peter

522382

I should have mentioned this, but I was trying to use analytical functions to avoid the self join.
-peter

Nicolas Gasparotto

You don't need auto-join with the functions I told earlier about.

Nicolas.

Anurag Tibrewal

Hi,

Too late

Regards

Message was edited by:
Anurag Tibrewal

Nicolas Gasparotto

It was not clear enough from your first post. So, kindly post what should be the expected output with your sample input given above.

Nicolas.

Laurent Schneider

did you consider model?

select *
from test_job 
model
dimension by (start_time, stop_time)
measures (created_time,jobid,0 n)
(n[any,any]=
  count(*)[start_time<=cv(start_time),stop_time>=cv(start_time)]+
  count(*)[start_time between cv(start_time) and cv(stop_time),stop_time>cv(stop_time)]
)

Anurag Tibrewal

If the analytical query is not a constraint, does this give you the expected output

select JOBID,
       (select count(*)
          from TEST_JOB
         where START_TIME <= A.STOP_TIME OR STOP_TIME => A.START_TIME)
  from TEST_JOB

522382

Nicolas,

The sample output might have not been obvious in my first post, but here it is again.

JOOB START STOP CONCURRENCY
=== ==== ==== =========
100 9AM 11AM 2
200 10AM 3PM 3
300 12PM 2PM 2

Example, JOB 100 started at 9AM and completed at 11AM. That means that this job had a concurrency of 2 because job 200 started at 10AM which falls within the time that it took job 100 to complete (9AM -> 11AM).

JOB 200 started at 10AM and completd at 3PM, so it has a concurrency of 3 because the other 2 jobs were also running within this time.

hope this clears it up a bit.
-peter

522382

Laurent,
I have to say that I've never used model before, but from running the query the output is exactly what I was looking for. Very nice!

Anurag,
your query is essentially a self-join, but it will probably work too (haven't tested). I'm going to compare the model query against a self join to check for overall efficiency.

thanks.
-peter

Laurent Schneider

fine, the model should be more efficient as the table is selected only once.

Be sure you test limit case, for example if one period ends exactly when another start, etc... but it should be easy for you to correct it

Glad I made you interested in sql modelling :)

522382

after some checking the model rule wasn't working exactly as expected.

I believe it's working right now. I'm posting a self-contained example for completeness sake.I use 2 functions to convert back and forth between epoch unix timestamps, so
I'll post them here as well.

Like I said I think this works okay, but any feedback is always appreciated.
-peter

CREATE OR REPLACE FUNCTION date_to_epoch(p_dateval IN DATE)
RETURN NUMBER
AS
BEGIN
return (p_dateval - to_date('01/01/1970','MM/DD/YYYY')) * (24 * 3600);
END;
/

CREATE OR REPLACE FUNCTION epoch_to_date (p_epochval IN NUMBER DEFAULT 0)
RETURN DATE
AS
BEGIN
return to_date('01/01/1970','MM/DD/YYYY') + (( p_epochval) / (24 * 3600));
END;
/

DROP TABLE TEST_MODEL3 purge;

CREATE TABLE TEST_MODEL3
( jobid NUMBER,
start_time NUMBER,
end_time NUMBER);

insert into TEST_MODEL3
VALUES (300,date_to_epoch(to_date('05/07/2008 10:00','MM/DD/YYYY hh24:mi')),
date_to_epoch(to_date('05/07/2008 19:00','MM/DD/YYYY hh24:mi')));
insert into TEST_MODEL3
VALUES (200,date_to_epoch(to_date('05/07/2008 09:00','MM/DD/YYYY hh24:mi')),
date_to_epoch(to_date('05/07/2008 12:00','MM/DD/YYYY hh24:mi')));
insert into TEST_MODEL3
VALUES (400,date_to_epoch(to_date('05/07/2008 10:00','MM/DD/YYYY hh24:mi')),
date_to_epoch(to_date('05/07/2008 14:00','MM/DD/YYYY hh24:mi')));
insert into TEST_MODEL3
VALUES (500,date_to_epoch(to_date('05/07/2008 11:00','MM/DD/YYYY hh24:mi')),
date_to_epoch(to_date('05/07/2008 16:00','MM/DD/YYYY hh24:mi')));
insert into TEST_MODEL3
VALUES (600,date_to_epoch(to_date('05/07/2008 15:00','MM/DD/YYYY hh24:mi')),
date_to_epoch(to_date('05/07/2008 22:00','MM/DD/YYYY hh24:mi')));
insert into TEST_MODEL3
VALUES (100,date_to_epoch(to_date('05/07/2008 09:00','MM/DD/YYYY hh24:mi')),
date_to_epoch(to_date('05/07/2008 23:00','MM/DD/YYYY hh24:mi')));
commit;

SELECT jobid,
epoch_to_date(start_time)start_time,
epoch_to_date(end_time)end_time,
n concurrency
FROM TEST_MODEL3
MODEL
DIMENSION BY (start_time,end_time)
MEASURES (jobid,0 n)
(n[any,any]=
count(*)[start_time<= cv(start_time),end_time>=cv(start_time)]+
count(*)[start_time > cv(start_time) and start_time <= cv(end_time), end_time >= cv(start_time)]
)
ORDER BY start_time;

The results look like this:

JOBID|START_TIME|END_TIME |CONCURRENCY
----------|---------------|--------------|-------------------
100|05/07/08 09:00|05/07/08 23:00| 6
200|05/07/08 09:00|05/07/08 12:00| 5
300|05/07/08 10:00|05/07/08 19:00| 6
400|05/07/08 10:00|05/07/08 14:00| 5
500|05/07/08 11:00|05/07/08 16:00| 6
600|05/07/08 15:00|05/07/08 22:00| 4

522382

For completeness sake, I will add something else that I ran into, which was
the ORA-32638. In my example, it's very possible that you can have different jobid's that have a start/end time that is the same... so the DIMENSION BY will not be able to uniquely identify a cell .. which from the docs it seems that it is required.

So for example, if I added another row into my test table...

insert into TEST_MODEL3
VALUES (700,date_to_epoch(to_date('05/07/2008 09:00','MM/DD/YYYY hh24:mi')),
date_to_epoch(to_date('05/07/2008 23:00','MM/DD/YYYY hh24:mi')));

this insert statement has the same start/end times as jobid 100. If I were to execute the
above query, it fails with 'ORA-32638: Non unique addressing in MODEL dimension'

In my example, if my DIMENSION BY includes jobid, then this would be able to
uniquely identify a cell. So rewriting the query like this does work and I think it's correct.

SELECT
jobid,
epoch_to_date(start_time)start_time,
epoch_to_date(end_time)end_time,
n concurrency
FROM
(
SELECT
jobid,
start_time,
end_time
from TEST_MODEL3
)
MODEL
DIMENSION BY (jobid,start_time,end_time)
MEASURES (0 n)
(n[any,any,any]=
count(*)[any, start_time<= cv(start_time),end_time>=cv(start_time)]+
count(*)[any, start_time > cv(start_time) and start_time <= cv(end_time), end_time >= cv(start_time)]
)
ORDER BY start_time;

.. and the results are:

JOBID|START_TIME |END_TIME |CONCURRENCY
----------|-------------------|-------------------|-------------------
200|05/07/08 09:00|05/07/08 12:00| 6
700|05/07/08 09:00|05/07/08 23:00| 7
100|05/07/08 09:00|05/07/08 23:00| 7
300|05/07/08 10:00|05/07/08 19:00| 7
400|05/07/08 10:00|05/07/08 14:00| 6
500|05/07/08 11:00|05/07/08 16:00| 7
600|05/07/08 15:00|05/07/08 22:00| 5

If my thought process is wrong, please let me know.
thanks.

NicloeiW

hi laurent,
i tried myself using simple self join , but i cant seem to get the output, ;-(

SQL> 
SQL> select jobid,
  2         (select count(*)
  3          from test_job I_n
  4          where start_time <= O_ut.stop_time OR stop_time >= O_ut.start_time)
  5  from test_job O_ut
  6  /

     JOBID (SELECTCOUNT(*)FROMTEST_JOBI_N
---------- ------------------------------
       100                              3
       200                              3
       300                              3

SQL>

looks like i am not able to catch the logic here ;-(

regards
nic

Frank Kulash

Hi, Nic,

If you have two date ranges b1-e1 and b2-e2 (b1 <= e1 and b2 <= e2), you can tell if they overlap by saying

(b1 <= e2) AND (e1 >= b2)

Change your "OR" to "AND".

NicloeiW

hey frank,

yes, this is working now, date over lap issue always bother me, particularly if we have gaps in the date,

thanks

SQL> 
SQL>  select jobid,
  2             (select count(*)
  3              from test_job I_n
  4              where start_time <= O_ut.stop_time And stop_time >= O_ut.start_time)
  5      from test_job O_ut
  6  /

     JOBID (SELECTCOUNT(*)FROMTEST_JOBI_N
---------- ------------------------------
       100                              2
       200                              3
       300                              2

SQL>

Laurent Schneider

overlap exists, but is not documented (and therefore should not be used in prod)

with t as (
Select date '2000-01-01' t1, date '2000-01-02' t2, date '2000-01-03' t3, date '2000-01-04' t4 from dual
union all
Select date '2000-01-01' t1, date '2001-01-02' t2, date '2000-01-03' t3, date '2000-01-04' t4 from dual
union all
Select date '2000-01-01' t1, date '2001-01-02' t2, date '2000-01-03' t3, date '2002-01-04' t4 from dual
union all
Select date '2001-01-01' t1, date '2003-01-02' t2, date '2000-01-03' t3, date '2002-01-04' t4 from dual
)
select * from t where (t1,t2) overlaps (t3,t4);

T1                        T2                        T3                        T4                        
------------------------- ------------------------- ------------------------- ------------------------- 
01.01.00                  02.01.01                  03.01.00                  04.01.00                  
01.01.00                  02.01.01                  03.01.00                  04.01.02                  
01.01.01                  02.01.03                  03.01.00                  04.01.02

however WM_OVERLAPS is documented
http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28396/long_vt.htm#sthref274

with t as (
Select date '2000-01-01' t1, date '2000-01-02' t2, date '2000-01-03' t3, date '2000-01-04' t4 from dual
union all
Select date '2000-01-01' t1, date '2001-01-02' t2, date '2000-01-03' t3, date '2000-01-04' t4 from dual
union all
Select date '2000-01-01' t1, date '2001-01-02' t2, date '2000-01-03' t3, date '2002-01-04' t4 from dual
union all
Select date '2001-01-01' t1, date '2003-01-02' t2, date '2000-01-03' t3, date '2002-01-04' t4 from dual
)
select * from t where wm_overlaps(wm_period(t1,t2),wm_period(t3,t4))=1;

Message was edited by:
Laurent Schneider

NicloeiW

this is excellent, can save bit of brain scratching, i never knew about overlaps and wm_overlaps,

the link is also nice,

but have to do brain scratching for filling in gaps with correct date in case of from - to date...

1 - 19

Locked Post

New comments cannot be posted to this locked post.

Locked on Jun 13 2008

Added on May 5 2008

19 comments

5,060 views

SQL & PL/SQL

using analytical function to calculate concurrency between date range

Comments

Post Details