Forum Stats

  • 3,838,564 Users
  • 2,262,383 Discussions
  • 7,900,687 Comments

Discussions

Finding Department Projects Based on Order Completed

User_OO1Z3
User_OO1Z3 Member Posts: 15 Green Ribbon

Hello,

I have a table containing departments, employees, and completed dates. I have another table containing employees and projects. I am looking to find projects completed broken down by department such that the most recent project completed was X, second most recent project completed was W, and third most recent project completed was Z. Then from these results, summarize it by returning the departments who finished the projects in this defined order.

The table names are called DEPT_TIME and EMP_PROJ.

Below are my create table and insert scripts for both tables.


CREATE TABLE DEPT_TIME (DEPARTMENT VARCHAR(3), EMPLOYEE NUMBER, DATE_COMPLETED DATE)

;

INSERT ALL

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','1','21-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','2','22-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','3','23-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','4','24-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','5','25-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','6','26-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','7','27-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','8','28-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('ABC','9','29-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('DEF','10','30-SEP-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('DEF','11','01-OCT-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('DEF','12','02-OCT-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('DEF','13','03-OCT-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('DEF','14','04-OCT-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('GHI','15','05-OCT-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('GHI','16','06-OCT-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('GHI','17','07-OCT-21')

INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

VALUES('GHI','18','08-OCT-21')

SELECT 1 FROM DUAL

;


CREATE TABLE EMP_PROJ (EMPLOYEE NUMBER, PROJECT VARCHAR(1))

;

INSERT ALL

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('1','Y')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('2','V')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('3','Z')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('4','W')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('5','Y')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('6','V')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('7','Z')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('8','W')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('9','X')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('10','Y')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('11','V')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('12','Z')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('13','W')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('14','X')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('15','Y')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('16','V')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('17','W')

INTO EMP_PROJ (EMPLOYEE, PROJECT)

VALUES('18','X')

SELECT 1 FROM DUAL

;


Below is the desired output.

DEPARTMENT

ABC

DEF


How would I go about achieving this result using SQL? I am using SQL Developer version 19.2.1.247.2212.

Tagged:

Answers

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,243 Red Diamond
    edited Jul 8, 2022 6:02PM

    Hi, @User_OO1Z3

    Thanks for posting the sample data, but

    CREATE TABLE DEPT_TIME (DEPARTMENT VARCHAR(3), EMPLOYEE NUMBER, DATE_COMPLETED DATE)

    ;

    INSERT ALL

    INTO DEPT_TIME(DEPARTMENT, EMPLOYEE, DATE_COMPLETED)

    VALUES('ABC','1','21-SEP-21')

    Don't try to insert string values (like '1' or '21-SEP-21') in NUMBER or DATE columns (like employee or completed_date). Don't use quotes with NUMBERs, and use DATE literals, e.g.

    VALUES ('ABC', 1 , DATE '2021-09-21')
    

    or TO_DATE, e.g.

    VALUES ('ABC', 1 , TO_DATE ('21-SEP-2021', 'DD-MON-YYYY'))
    

    and always use 4 digits for the year. Using fewer digits is simply asking for trouble.


    Sorry, the requirements are not clear.

    such that the most recent project completed was X, second most recent project completed was W, and third most recent project completed was Z. 

    Do you mean the most recently completed project in this particular sample set happens to be 'X'? How do you know that? When was 'X' completed?

    Below is the desired output.

    DEPARTMENT

    ABC

    DEF

    It looks like you have project and 3 departments in the sample data. Why are there only 2 rows in the desired output? What does each row in the output represent? Will the departments in the output always be unique?

  • User_OO1Z3
    User_OO1Z3 Member Posts: 15 Green Ribbon

    Hi Frank,

    Thank you for your comments. Table DEPT_TIME joins onto EMP_PROJ on EMPLOYEE. So, an inner join on these two tables would yield this result below.

    DEPARTMENT   EMPLOYEE         DATE_COMPLETED         EMPLOYEE         PROJECT

    ABC                      1                            21-SEP-21                           1                            Y

    ABC                      2                            22-SEP-21                          2                            V

    ABC                      3                            23-SEP-21                          3                            Z

    ABC                      4                            24-SEP-21                          4                            W

    ABC                      5                            25-SEP-21                          5                            Y

    ABC                      6                            26-SEP-21                          6                            V

    ABC                      7                            27-SEP-21                          7                            Z

    ABC                      8                            28-SEP-21                          8                            W

    ABC                      9                            29-SEP-21                          9                            X

    DEF                      10                         30-SEP-21                          10                         Y

    DEF                      11                         01-OCT-21                          11                         V

    DEF                      12                         02-OCT-21                          12                         Z

    DEF                      13                         03-OCT-21                          13                         W

    DEF                      14                         04-OCT-21                         14                         X

    GHI                      15                         05-OCT-21                          15                         Y

    GHI                      16                         06-OCT-21                          16                         V

    GHI                      17                         07-OCT-21                          17                         W

    GHI                      18                         08-OCT-21                          18                         X


    If you take the MAX(DEPT_TIME.DATE_COMPLETED) for Department ABC, it corresponds to Project “X.” If you take the “Second” MAX(DEPT_TIME.DATE_COMPLETED) for Department ABC, it corresponds to Project “W.” Likewise, if you take the “Third” MAX(DATE_COMPLETED) for Department ABC. It corresponds to Project “Z.”

    Department DEF follows similar suit, and therefore would be returned in the output.

    Department GHI would not be reported in the output since it’s “Third” MAX(DATE_COMPLETED) corresponds to Project “V.”

    Sorry for any confusion. Any help is much appreciated.

    Thanks!

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,243 Red Diamond

    Hi, @User_OO1Z3

    I'm starting to get a better idea of the problem. When someone asks you a question, it would really help if you tried to answer it. For example,

    Q: What does each row in the result set represent?

    A: Each row in the result set represents a department that completed all the right projects in the right order. (Where "right" means ...)

    Q:  Will the departments in the output always be unique?

    A: Yes

    Are the answers above correct?

    I have some more questions

    • Is the sequence X-W-Z derived from the data, or is it some arbitrary sequence given at run-time?
    • Once the target sequence is established, can we ignore all projects not in that group? That is, if a department qualifies to be in the result set for X-W-Z, would it change if we added or removed rows for projects V and/or Y?
    • Similarly, are we only interested in the latest date for each department/project combination? For example, if I understand the problem, department 'ABC' completed project 'W' on September 28, 2021 in this sample data. Can we therefore ignore all earlier occurences of that combination, such as the one on September 24, four days earlier?
    • What happens when there is a tie for the N-th latest project (that is, more than one project completes at the same time)? If ties are possible, include examples in your sample data and results.
  • User_OO1Z3
    User_OO1Z3 Member Posts: 15 Green Ribbon

    Q: What does each row in the result set represent?

    A: Each row in the result set represents a department that completed all the right projects in the right order.

    Q: Will the departments in the output always be unique?

    A: Yes

    Q: Is the sequence X-W-Z derived from the data, or is it some arbitrary sequence given at run-time?

    A: It's an arbitrary sequence and will depend on the run time. Once the X-W-Z sequence is established, you hit a "BINGO."

    Q: Once the target sequence is established, can we ignore all projects not in that group? That is, if a department qualifies to be in the result set for X-W-Z, would it change if we added or removed rows for projects V and/or Y?

    A: Yes, once the target sequence is established, all other rows for that department can be ignored. It would not change if more projects were add or removed for any department. As further stated below, only projects interested in is the latest 3 projects for each department, where X-W-Z is the qualifying result set.

    Q: Similarly, are we only interested in the latest date for each department/project combination? For example, if I understand the problem, department 'ABC' completed project 'W' on September 28, 2021 in this sample data. Can we therefore ignore all earlier occurrences of that combination, such as the one on September 24, four days earlier?

    A: Yes, correct. We are only interested in the latest 3 projects completed for each department. Therefore, all other projects can be ignored.

    Q: What happens when there is a tie for the N-th latest project (that is, more than one project completes at the same time)? If ties are possible, include examples in your sample data and results.

    A: Ties in latest project are not possible.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,243 Red Diamond

    Hi, @User_OO1Z3

    Sorry, it's still not clear.

    It's an arbitrary sequence and will depend on the run time. 

    Are you saying it can be derived from SYSDATE? If so, explain how.

    only projects interested in is the latest 3 projects for each department, where X-W-Z is the qualifying result set.

    Are you saying that if the latest 4 projects for a department are Y-X-W-Z in that order, then you not want would to include that department, since the latest 3 projects are Y-X-W, and you only want X-W-Z?

    Ties in latest project are not possible.

    What happens if there is a tie for the second latest project, or the third? Include examples in the sample data, results and explanation.

  • mathguy
    mathguy Member Posts: 10,586 Blue Diamond

    Something like this should work:

    select department
    from   dept_time join emp_proj using (employee)
    match_recognize(
      partition by department
      order     by date_completed desc
      pattern   ( ^ last_compl second_to_last third_to_last )
      define    last_compl     as project = 'X',
                second_to_last as project = 'W',
                third_to_last  as project = 'Z'
    );
    
    DEPARTMENT
    ----------
    ABC
    DEF
    

    match_recognize partitions input rows (from the join) by department, and orders by date completed in reverse order. It searches for the pattern of the last three completed projects for each department (they look like the "first" three, but remember the order is descending); moreover, it requires that the last_completed project be 'X', etc. - these conditions are in the define clause of match_recognize. If you need a different pattern of last-completed projects, you can adjust the define clause.

  • User_OO1Z3
    User_OO1Z3 Member Posts: 15 Green Ribbon

    Hi @mathguy,

    The match recognize function ended up working for what I was looking for. Just needed some minor tweaks to the syntax.

    Thanks!

  • User_OO1Z3
    User_OO1Z3 Member Posts: 15 Green Ribbon

    Hi @Frank Kulash,

    Please forgive the late response.

    It's an arbitrary sequence and will depend on the run time. 

    Q: Are you saying it can be derived from SYSDATE? If so, explain how.

    A: Apologies, yes the sequence can de derived from the data. Each project from the EMP_PROJ table will have a corresponding date completed. And partitioning the input rows (from the join) by department, and ordering by date completed to find the latest 3 projects completed by department will be how the sequence is found. In turn, no it would not be derived from the SYSDATE.

    only projects interested in is the latest 3 projects for each department, where X-W-Z is the qualifying result set.

    Q: Are you saying that if the latest 4 projects for a department are Y-X-W-Z in that order, then you not want would to include that department, since the latest 3 projects are Y-X-W, and you only want X-W-Z?

    A: Correct. I'm only looking for the most recent project to be X, second most recent project completed to be W, and third most recent completed project to be Z. Since your ordering begins with Y being the recent recent completed project, that department would not be included.

    Ties in latest project are not possible.

    Q: What happens if there is a tie for the second latest project, or the third? Include examples in the sample data, results and explanation.

    A: It is not possible to have two or more X projects competing for second place, third place, etc. since the completed date can be casted as a timestamp which is unique for each project. In turn, the correct second most recent project would be found after ordering.