Discussions
Categories
- 385.5K All Categories
- 4.9K Data
- 2.5K Big Data Appliance
- 2.4K Data Science
- 453.4K Databases
- 223.2K General Database Discussions
- 3.8K Java and JavaScript in the Database
- 47 Multilingual Engine
- 606 MySQL Community Space
- 486 NoSQL Database
- 7.9K Oracle Database Express Edition (XE)
- 3.2K ORDS, SODA & JSON in the Database
- 585 SQLcl
- 4K SQL Developer Data Modeler
- 188K SQL & PL/SQL
- 21.5K SQL Developer
- 46 Data Integration
- 46 GoldenGate
- 298.4K Development
- 4 Application Development
- 20 Developer Projects
- 166 Programming Languages
- 295K Development Tools
- 150 DevOps
- 3.1K QA/Testing
- 646.7K Java
- 37 Java Learning Subscription
- 37.1K Database Connectivity
- 201 Java Community Process
- 108 Java 25
- 22.2K Java APIs
- 138.3K Java Development Tools
- 165.4K Java EE (Java Enterprise Edition)
- 22 Java Essentials
- 176 Java 8 Questions
- 86K Java Programming
- 82 Java Puzzle Ball
- 65.1K New To Java
- 1.7K Training / Learning / Certification
- 13.8K Java HotSpot Virtual Machine
- 94.3K Java SE
- 13.8K Java Security
- 208 Java User Groups
- 25 JavaScript - Nashorn
- Programs
- 667 LiveLabs
- 41 Workshops
- 10.3K Software
- 6.7K Berkeley DB Family
- 3.6K JHeadstart
- 6K Other Languages
- 2.3K Chinese
- 207 Deutsche Oracle Community
- 1.1K Español
- 1.9K Japanese
- 474 Portuguese
Find the date gaps and count

Hi ,
Below is the table structure and sample data . I just want to find the the number of project unassigned periods for the employee. Could you please help on writing the query to fetch the expected result ?
with t(id,emp_code,projectcode,dates)as (select 1,100,'p1',to_date('10-jan-2019') from dual union all select 2,100,'p1',to_date('11-jan-2019') from dual union all select 3,100,'p1',to_date('12-jan-2019') from dual union all select 4,100,null,to_date('13-jan-2019') from dual union all select 5,100,null,to_date('14-jan-2019') from dual union all select 6,100,'p2',to_date('15-jan-2019') from dual union all select 7,100,'p2',to_date('16-jan-2019') from dual union all select 8,100,null,to_date('17-jan-2019') from dual union all select 9,100,null,to_date('18-jan-2019') from dual union all select 10,100,'p2',to_date('19-jan-2019') from dual union all select 10,100,null,to_date('20-jan-2019') from dual)select * from t;/*---------------------------Below is my Expected resultEmp_code start_dt end_dt count----------------------------------------------100 13-JAN-19 14-JAN-19 2100 16-JAN-19 17-JAN-19 2100 20-JAN-19 20-JAN-19 1*/
Best Answer
-
See if this is what you are looking for ...
select emp_code, min(dates) mn, max(dates) mx, max(dates) - min(dates) + 1 cnt from
(
select t.*, dates - row_number() over (partition by emp_code order by dates) rn from t
where projectcode is null and dates is not null
)
group by emp_code, rn
order by 1, 2
;
Answers
-
See if this is what you are looking for ...
select emp_code, min(dates) mn, max(dates) mx, max(dates) - min(dates) + 1 cnt from
(
select t.*, dates - row_number() over (partition by emp_code order by dates) rn from t
where projectcode is null and dates is not null
)
group by emp_code, rn
order by 1, 2
;
-
with t(id,emp_code,projectcode,dates)
as (select 1,100,'p1',to_date('10-jan-2019') from dual union all
select 2,100,'p1',to_date('11-jan-2019') from dual union all
select 3,100,'p1',to_date('12-jan-2019') from dual union all
select 4,100,null,to_date('13-jan-2019') from dual union all
select 5,100,null,to_date('14-jan-2019') from dual union all
select 6,100,'p2',to_date('15-jan-2019') from dual union all
select 7,100,'p2',to_date('16-jan-2019') from dual union all
select 8,100,null,to_date('17-jan-2019') from dual union all
select 9,100,null,to_date('18-jan-2019') from dual union all
select 10,100,'p2',to_date('19-jan-2019') from dual union all
select 10,100,null,to_date('20-jan-2019') from dual)
select emp_code,
start_dt,
end_dt,
cnt
from t
match_recognize(
partition by emp_code
order by id
measures first(dates) as start_dt,
last(dates) as end_dt,
count(*) as cnt
pattern (x+)
define x as projectcode is null and nvl(prev(dates),dates - 1) = dates - 1
)
/
EMP_CODE START_DT END_DT CNT
---------- --------- --------- ----------
100 13-JAN-19 14-JAN-19 2
100 17-JAN-19 18-JAN-19 2
100 20-JAN-19 20-JAN-19 1
SQL>
SY.
-
Hello New Roots,
simply a few remarks:
-1-
note that TO_DATE( '26-jan-2019' ) assumes that the format of the date is DD-mon-YYYY... the day the format uses another session you are in trouble.
Avoid using implicit formats,
prefer things like TO_DATE( '26-JAN-2019', 'DD-MON-YYYY' ),
or, better, (as the language might cause issues with MON) use TO_DATE( '26-01-2019', 'DD-MM-YYYY' )
or, even better/simpler: ISO format: DATE '2019-01-26' (keyword DATE followed by date in format YYYY-MM-DD between single quotes)-2-
I am surprised by your datamodel... It seems that "t" contains all the dates without gaps, making counts easy but datamodel not likely to resist to actual world dataExamples:
what if we want to add something like "on 13-JAN, project p3 for emp_code 100": we have to UPDATE the existing row with project= null
what if we want to add something like "on 23-JAN, project p1 fo remp_code 100": we have to add the rows for 21, 22 and 23-JAN?
what if we want to add something like "on 04-FEV, project p4 for emp_code 123": first time we have thsi emp_code; do we have to add other dates with project = null? which ones?
what if we want to remove a row ((I see a problem present in your data so I can't phrase the question adequately))
... Too many questions for the model to be OK I think.More usual is to store only the "relevant" data, not the lines with "empty project".
Then with the usual way of working, the "trick" is to generate a list of all relevant dates (maybe between MIN( dates ) and MAX( dates ) observed for a given emp_code, maybe for MIN and MAX obeserved through all the emp_codes, maybe arbitrarily (like 01-JAN-2019 to 31-JAN-2019), ...
And to query this generated list, using a LEFT OUTER JOIN to the actual data, this is a dynamic way to "fill in the list" and to go back to your original list with "project = null"
And then we might start from this and query only the NULLs, or count them, or ...For this last part you can see other replies (people replying to your question stricto sensu)
Best regards,
Bruno Vroman.
P.S. to test the queries you should generate more cases in your input data; add more emp_codes, add dates where more than 1 emp_code have a project, add dates where 1 emp_code has more than 1 project, add 1 emp_code having only 1 date (with a project_id), add 1 emp_code having only 1 date (with NULL project_id).
-
Hi Solomon,
Thank you for the different approach . I am going to spend some time to understand the match_recognize .
-
Hi Bruno,
Thank you very much for such wonderful review. This model is exists in our system for while, and I could not correct this stage. But I really keep in mind your inputs while design new data model. Really appreciated . Thanks.
-
This might be easier to understand:
with t(id,emp_code,projectcode,dates)
as (select 1,100,'p1',to_date('10-jan-2019') from dual union all
select 2,100,'p1',to_date('11-jan-2019') from dual union all
select 3,100,'p1',to_date('12-jan-2019') from dual union all
select 4,100,null,to_date('13-jan-2019') from dual union all
select 5,100,null,to_date('14-jan-2019') from dual union all
select 6,100,'p2',to_date('15-jan-2019') from dual union all
select 7,100,'p2',to_date('16-jan-2019') from dual union all
select 8,100,null,to_date('17-jan-2019') from dual union all
select 9,100,null,to_date('18-jan-2019') from dual union all
select 10,100,'p2',to_date('19-jan-2019') from dual union all
select 10,100,null,to_date('20-jan-2019') from dual)
select emp_code,
start_dt,
end_dt,
cnt
from t
match_recognize(
partition by emp_code
order by id
measures first(dates) as start_dt,
last(dates) as end_dt,
count(*) as cnt
pattern (s x*)
define s as projectcode is null,
x as projectcode is null and prev(dates) = dates - 1
)
/
SY.