This discussion is archived
7 Replies Latest reply: Apr 21, 2011 8:17 PM by Frank Kulash RSS

Stuck - Need help writing SQL logic

856245 Newbie
Currently Being Moderated
Heres the scenario ... I've got a table storing date ranges which overlap each other. I need to breakup up the dates and to show a timeline. The catch is that I'm trying to do this in SQL, specifically I'm using Oracle 10g – so can also use the Oracle PL/SQL to write the logic if need be.

Lets assume I have a table storing job information for employees in my organisation.
JOB_ID | EMPLOYEE_NUMBER | JOB_TITLE | START_DATE | END_DATE
1 | XYZ | Forklift Driver |10-JAN-2011 10-DEC-2020
2 | XYZ | Supervisor |20-JAN-2013 15-DEC-2013

In the above example, employee XYZ originally joined the company to be a forklift driver but then temporarily fills a supervisor role that becomes vacant i.e. XYZ is on secondment.

I want to split the dates to show a timeline of employment like this.
JOB_ID | START_DATE | END_DATE
1 | 10-JAN-2011 | 19-JAN-2013 << Starts original role
2 | 20-JAN-2013 | 15-DEC-2013 << Gets seconded – supervisor
1 | 16-DEC-2013 | 10-DEC-2020 << Back to normal role

This is not the exact scenario for my problem but its a lot simpler to explain. I've also tried using the Oracle LEAD and LAG functions which get my quite close but no cigar. Because an employee can switch between roles many times throughout their employment, the solution needs to be dynamic rather then using hardcoded logic.

Edited by: user12022738 on Apr 21, 2011 11:53 AM

Edited by: user12022738 on Apr 21, 2011 11:53 AM

Edited by: user12022738 on Apr 21, 2011 11:54 AM
  • 1. Re: Stuck - Need help writing SQL logic
    Dave Rabone Journeyer
    Currently Being Moderated
    This type of problem is tricky - and the solution depends on what else we can assume:

    For example: is

    3 | XYZ | Manager |20-MAR-2013 15-SEP-2013 possible
    or even
    3 | XYZ | Manager |20-MAR-2013 15-SEP-2014

    in other words, can we have more than one assignment active at a time, and is each assignment contained within a previous one.
  • 2. Re: Stuck - Need help writing SQL logic
    856245 Newbie
    Currently Being Moderated
    Dave, lets just assume that there will only be one active assignment here. The "active" assignment will be determined by using the latest starting "start date" as at the current day - if that makes sense. So as of today 21st Apr, the "active" assignment will be the forklift driver i.e. the employees original role.

    Each assignment doesn't necesarily need to be contained within another one. The example employee might originally join as a forklift driver on a fixed contract but then be given a contract extension which could be in a different role i.e. we create a new record.

    Regards,
  • 3. Re: Stuck - Need help writing SQL logic
    Frank Kulash Guru
    Currently Being Moderated
    Hi,

    Welcome to the forum!

    Whenever you post a question, please post CREATE TABLE and INSERT statements for your sample data, so that the people who want to help you can re-create the problem and test their ideas.
    For example:
    CREATE TABLE     job
    (     job_id          NUMBER          PRIMARY KEY
    ,     employee_number     VARCHAR2 (10)
    ,     job_title     VARCHAR2 (20)
    ,     start_date     DATE
    ,     end_date     DATE
    );
    
    INSERT INTO job (job_id, employee_number, job_title,           start_date,        end_date)
         VALUES  (1,      'XYZ',           'Forklift Driver', DATE '2011-01-10', DATE '2020-12-10');
    INSERT INTO job (job_id, employee_number, job_title,           start_date,        end_date)
         VALUES  (2,      'XYZ',           'Supervisor',      DATE '2013-01-20', DATE '2013-12-15');
    You can get the results you want using analytic functions. (I used ROW_NUMBER and LAG below.)
    WITH     unpivoted_data     AS
    (
         SELECT     job_id, employee_number, job_title
         ,      start_date     AS event_date
         ,     ROW_NUMBER () OVER ( PARTITION BY  employee_number
                             ORDER BY        start_date
                           )     AS start_number
         FROM     job
              --
        UNION ALL
              --
         SELECT     job_id, employee_number, job_title
         ,      END_date     AS event_date
         ,     0          AS start_number
         FROM     job
    )
    ,     got_neighbor_data     AS
    (
         SELECT     job_id
         ,     employee_number     -- If wanted
         ,     LAG (event_date)      OVER ( PARTITION BY  employee_number
                                  ORDER BY      event_date
                                   )     AS start_date
         ,       event_date               AS end_date
         ,       start_number
         ,       LAG  (start_number) OVER ( PARTITION BY  employee_number
                                  ORDER BY      event_date
                                   )     AS prev_start_number
         FROM       unpivoted_data
    )
    SELECT    job_id
    ,       employee_number     -- If wanted
    ,       start_date + CASE
                      WHEN  prev_start_number = 0
                      THEN  1
                      ELSE  0
                     END          AS start_date
    ,       end_date   - CASE
                      WHEN  start_number > 0
                      THEN  1
                      ELSE  0
                     END          AS end_date
    FROM       got_neighbor_data
    WHERE       start_number     != 1
    ORDER BY  employee_number
    ,       start_date
    ;
    Each row in your table will result in at least one row in the output, but it may result in two rows, if a given assignment temporarily interrupts another assignment. That's why this query starts off with a UNION to make two rows for every assingment: one showing the start_date and the other the end_date. start_number identifies whether the row indicates the start of an assignment (when start_numebr > 0) or the end of one (start_number = 0).
    In the sub-query got_neighbor_data, LAG gets the event_date from the previous row, so we have the beginning and ending dates of each tenure. We may need to fudge these by a day; for example, on December 15, XYZ ceased being a Supervisor, and went back to being a Forklift Driver. We don't want to show the same date twice, so we add 1 day to the start_date on the later line.
    user12022738 wrote:
    Each assignment doesn't necesarily need to be contained within another one. The example employee might originally join as a forklift driver on a fixed contract but then be given a contract extension which could be in a different role i.e. we create a new record.
    Okay, but, given that two assignments do overlap, will it always be the case that one is entirely contained inside the other. or can you have situations like this:
    INSERT INTO job (job_id, employee_number, job_title,           start_date,        end_date)
         VALUES  (91,     'FOO',           'Forklift Driver', DATE '2011-01-01', DATE '2012-12-31');
    INSERT INTO job (job_id, employee_number, job_title,           start_date,        end_date)
         VALUES  (92,     'FOO,           'Supervisor',      DATE '2012-01-01', DATE '2014-12-31');
    If that is possible, what output would you want?
  • 4. Re: Stuck - Need help writing SQL logic
    856245 Newbie
    Currently Being Moderated
    Thanks again Frank.

    Yes the dates don't need to be entirely contained, therefore that example that you posted is definitely valid. In this example I would expect the output to be like so
    JOB_ID | job_title | START_DATE | END_DATE
    91 | Forklift driver | 01-JAN-2011 | 31-DEC-2011
    92 | Supervisor | 01-JAN-2012 | 31-DEC-2014

    The biggest challenge is that in theory each row of output may result in an unlimited amount of rows, rather then just 2 rows as you mentioned.

    Consider the following timeline, this employee might originally start job A as a forklift driver but then later temporarily move into job role B, then back to the forklift driver. Then again, later down the line he might move into C - manager role, once completed he's resume the forklift driver role.

    |----------------------A-------------------------------|
    |----B----| |------C----|
  • 5. Re: Stuck - Need help writing SQL logic
    856245 Newbie
    Currently Being Moderated
    | |

    Edited by: user12022738 on Apr 22, 2011 9:45 AM
  • 6. Re: Stuck - Need help writing SQL logic
    856245 Newbie
    Currently Being Moderated
    | |

    Edited by: user12022738 on Apr 22, 2011 9:45 AM
  • 7. Re: Stuck - Need help writing SQL logic
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    user12022738 wrote:
    ... The biggest challenge is that in theory each row of output may result in an unlimited amount of rows, rather then just 2 rows as you mentioned...
    Sorry, I must not understand the problem correctly. Can you try again to describe what you want? Post some sample data and results, including a case where an employee with N rows in the table has more than 2*N rows of output.
    Will you ever have a situation where an employee has two different rows in the table with exactly the same start_date? If so, describe what happens, and include an example in your sample data and results.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points