This discussion is archived
8 Replies Latest reply: Feb 11, 2013 12:46 AM by jeneesh RSS

Grouping of data

587126 Newbie
Currently Being Moderated
Hi all,

How can I get the grp_id for unique combination of manager and department, grp_id should be created on asc order of manager_id.
In this example manager_id 100 is minimum, so it should be grp 1 and all the employees with that manager_id should be in grp_id 1,
for manager_id 114 grp_id should be 2.

If, there is manager_id 117, it should create grp_id 3.

To get grp_num ,I can use row_number() over (partition by department_id,manager_id order by employee_id) grp_num

I am looking for an update statement for this issue.

Oracle version : Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
CREATE TABLE HR.EMPLOYEES_2
(
  EMPLOYEE_ID         NUMBER(6),
  FIRST_NAME          VARCHAR2(20 BYTE),
  LAST_NAME           VARCHAR2(25 BYTE),
  EMAIL               VARCHAR2(25 BYTE),
  PHONE_NUMBER        VARCHAR2(20 BYTE),
  HIRE_DATE           DATE,
  SALARY              NUMBER(8,2),
  MANAGER_ID          NUMBER(6),
  DEPARTMENT_ID       NUMBER(4),
  EMPLOYEE_CHILD_ID   NUMBER,
  JOB_NAME            VARCHAR2(100 BYTE),
  COMMISSION          NUMBER,
  EMPLOYEE_MASTER_ID  NUMBER,
  GRP_NUM             NUMBER,
  GRP_ID              NUMBER
);


Insert into EMPLOYEES_2
   (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
 Values
   (114, 'Den', 'Raphaely', 'DRAPHEAL', '515.127.4561', TO_DATE('12/07/1994 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 11000, 100, 30, 1087, 'PU_MAN', 0, 1207);
Insert into EMPLOYEES_2
   (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
 Values
   (115, 'Alexander', 'Khoo', 'AKHOO', '515.127.4562', TO_DATE('05/18/1995 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 3100, 100, 30, 1088, 'PU_CLERK', 0, 1208);
Insert into EMPLOYEES_2
   (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
 Values
   (116, 'Shelli', 'Baida', 'SBAIDA', '515.127.4563', TO_DATE('12/24/1997 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2900, 114, 30, 1089, 'PU_CLERK', 0, 1209);
Insert into EMPLOYEES_2
   (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
 Values
   (117, 'Sigal', 'Tobias', 'STOBIAS', '515.127.4564', TO_DATE('07/24/1997 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2800, 114, 30, 1090, 'PU_CLERK', 0, 1210);
Insert into EMPLOYEES_2
   (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
 Values
   (118, 'Guy', 'Himuro', 'GHIMURO', '515.127.4565', TO_DATE('11/15/1998 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2600, 114, 30, 1091, 'PU_CLERK', 0, 1211);
Insert into EMPLOYEES_2
   (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
 Values
   (119, 'Karen', 'Colmenares', 'KCOLMENA', '515.127.4566', TO_DATE('08/10/1999 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2500, 114, 30, 1092, 'PU_CLERK', 0, 1212);

commit;


Expected result
----------------
EMPLOYEE_ID    SALARY    MANAGER_ID    DEPARTMENT_ID    GRP_NUM    GRP_ID

114              11000        100          30              1              1
115              3100         100          30              2              1
116              2900         114          30              1              2
117              2800         114          30              2              2
118              2600         114          30              3              2
119              2500         114          30              4              2
Edited by: Raghu on 10 Feb, 2013 12:59 PM
  • 1. Re: Grouping of data
    jeneesh Guru
    Currently Being Moderated
    When you want to use summary functions and you want to display the "grouped" column - you can use Analytic functions: http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions004.htm
    select employee_id,salary,manager_id,department_id,
           row_number() over(partition by manager_id order by employee_id) grp_num,
           dense_rank() over(order by manager_id) grp_id
    from employees_2;
    
    EMPLOYEE_ID   SALARY MANAGER_ID DEPARTMENT_ID GRP_NUM GRP_ID
    ----------- -------- ---------- ------------- ------- ------
            114    11000        100            30       1      1 
            115     3100        100            30       2      1 
            116     2900        114            30       1      2 
            117     2800        114            30       2      2 
            118     2600        114            30       3      2 
            119     2500        114            30       4      2 
    
     6 rows selected 
  • 2. Re: Grouping of data
    jeneesh Guru
    Currently Being Moderated
    Raghu wrote:
    I am looking for an update statement for this issue.
    For that you could use MERGE:
    merge into employees_2 e1
    using
      (
        select employee_id,
               row_number() over(partition by manager_id order by employee_id) grp_num,
               dense_rank() over(order by manager_id) grp_id
        from employees_2
      ) e2
    on (e1.employee_id = e2.employee_id)  
    when matched then update set 
         e1.grp_num = e2.grp_num,
         e1.grp_id = e2.grp_id;
  • 3. Re: Grouping of data
    587126 Newbie
    Currently Being Moderated
    Hi Jeneesh,

    Thanks for your response.

    I should have given my requirement completely earlier.

    From employees_2, data has to be copied to employee_2_target, employees_2_target have two extra columns GRP_ID and GRP_NUM.
    Logic to populate GRP_ID and GRP_NUM are same as in my earlier post.

    Every time employees_2 data gets truncated, next time when employees_2 gets new data,
    it should be copied to target table and GRP_ID and GRP_NUM has to be populated again with the same logic.

    For example, in the 2nd load if there is a combination of manager_id and department_id, grp_num should be incremented and grp_id should remain same.
    CREATE TABLE HR.EMPLOYEES_2_TARGET
    (
      EMPLOYEE_ID         NUMBER(6),
      FIRST_NAME          VARCHAR2(20 BYTE),
      LAST_NAME           VARCHAR2(25 BYTE),
      EMAIL               VARCHAR2(25 BYTE),
      PHONE_NUMBER        VARCHAR2(20 BYTE),
      HIRE_DATE           DATE,
      SALARY              NUMBER(8,2),
      MANAGER_ID          NUMBER(6),
      DEPARTMENT_ID       NUMBER(4),
      EMPLOYEE_CHILD_ID   NUMBER,
      JOB_NAME            VARCHAR2(100 BYTE),
      COMMISSION          NUMBER,
      EMPLOYEE_MASTER_ID  NUMBER,
      GRP_ID              NUMBER,
      GRP_NUM             NUMBER
    );
    
    
    1st set of data
    ---------------
    Insert into EMPLOYEES_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
     Values
       (114, 'Den', 'Raphaely', 'DRAPHEAL', '515.127.4561', TO_DATE('12/07/1994 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 11000, 100, 30, 1087, 'PU_MAN', 0, 1207);
    Insert into EMPLOYEES_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
     Values
       (115, 'Alexander', 'Khoo', 'AKHOO', '515.127.4562', TO_DATE('05/18/1995 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 3100, 100, 30, 1088, 'PU_CLERK', 0, 1208);
    Insert into EMPLOYEES_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
     Values
       (116, 'Shelli', 'Baida', 'SBAIDA', '515.127.4563', TO_DATE('12/24/1997 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2900, 114, 30, 1089, 'PU_CLERK', 0, 1209);
    Insert into EMPLOYEES_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
     Values
       (117, 'Sigal', 'Tobias', 'STOBIAS', '515.127.4564', TO_DATE('07/24/1997 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2800, 114, 30, 1090, 'PU_CLERK', 0, 1210);
    Insert into EMPLOYEES_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
     Values
       (118, 'Guy', 'Himuro', 'GHIMURO', '515.127.4565', TO_DATE('11/15/1998 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2600, 114, 30, 1091, 'PU_CLERK', 0, 1211);
    Insert into EMPLOYEES_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, SALARY, MANAGER_ID, DEPARTMENT_ID, EMPLOYEE_CHILD_ID, JOB_NAME, COMMISSION, EMPLOYEE_MASTER_ID)
     Values
       (119, 'Karen', 'Colmenares', 'KCOLMENA', '515.127.4566', TO_DATE('08/10/1999 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 2500, 114, 30, 1092, 'PU_CLERK', 0, 1212);
     
    
    2nd set of data  (table will be truncated for every set of data)
    ----------------
    Insert into employees_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, MANAGER_ID, DEPARTMENT_ID)
     Values
       (400, 'Dan', 'Raphaely', 'DRAPHEAL', '515.127.4561', TO_DATE('12/05/1998 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'PU_MAN', 18000, 100, 30);
    Insert into employees_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, MANAGER_ID, DEPARTMENT_ID)
     Values
       (401, 'Khoo', 'Alexander', 'AKHOO', '515.127.4562', TO_DATE('05/18/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'PU_CLERK', 8100, 100, 30);
    Insert into employees_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, MANAGER_ID, DEPARTMENT_ID)
     Values
       (512, 'Sigal', 'Tobias', 'STOBIAS', '515.127.4564', TO_DATE('07/24/1999 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'PU_CLERK', 8200, 114, 30);
    Insert into employees_2
       (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, MANAGER_ID, DEPARTMENT_ID)
     Values
       (513, 'Karen', 'Colmenares', 'KCOLMENA', '515.127.4566', TO_DATE('08/10/1999 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'PU_CLERK', 11500, 114, 30);
    
    
    Expected data in employees_2_target after 2nd loads (I have given only few columns)
    ----------------------------------------------------------------------
    EMPLOYEE_ID    SALARY    MANAGER_ID    DEPARTMENT_ID    GRP_NUM    GRP_ID
     
    114              11000        100          30              1              1
    115              3100         100          30              2              1
    116              2900         114          30              1              2
    117              2800         114          30              2              2
    118              2600         114          30              3              2
    119              2500         114          30              4              2
    400          18000          100      30          3          1
    401          8100          100      30          4          1
    512          8200          114      30             5          2
    513          11500          114      30          6          2
    Thanks
    Raghu
  • 4. Re: Grouping of data
    jeneesh Guru
    Currently Being Moderated
    Every time, Will you get new employee ids all the time?
  • 5. Re: Grouping of data
    587126 Newbie
    Currently Being Moderated
    most of times new employee id's.
    Even if same employee id comes, it should insert in target table as per the logic.

    If same emp id has same mgr and department, grp_id should be same, but grp_num should be incremented.

    Edited by: Raghu on 10 Feb, 2013 7:49 PM

    Edited by: Raghu on 10 Feb, 2013 8:04 PM
  • 6. Re: Grouping of data
    jeneesh Guru
    Currently Being Moderated
    With your sample data..
    select employee_id,manager_id 
    from employees_2;
    
    EMPLOYEE_ID MANAGER_ID
    ----------- ----------
            114        100 
            115        100 
            116        114 
            117        114 
            118        114 
            119        114 
    
     6 rows selected 
    
    insert into employees_2_target
    (employee_id,manager_id,grp_id,grp_num)
    with max_values as
    (
    select manager_id,grp_id,max(grp_num) grp_num,
           max(grp_id) over() mx_grp_id
    from employees_2_target
    group by manager_id,grp_id
    )
    select e.employee_id,e.manager_id,
           decode(m.grp_id,null,
                  nvl(m.mx_grp_id,1)-1+
                      dense_rank() 
                        over(order by 
                              decode(m.manager_id,null,e.manager_id,-1)),m.grp_id) grp_id ,
           nvl(m.grp_num,0) + 
           row_number() over(partition by e.manager_id order by e.employee_id) grp_num
    from employees_2 e,max_values m
    where e.manager_id = m.manager_id(+)
    ;
    
    
    6 rows inserted.
    
    commit;
    
    --"Now truncate employees_2 and insert new data
    
    truncate table employees_2;
    
    --"new data inserted using your script.
    
    select employee_id,manager_id 
    from employees_2;
    
    EMPLOYEE_ID MANAGER_ID
    ----------- ----------
            400        100 
            401        100 
            512        114 
            513        114 
    
    --"run the target insert script
    
    insert into employees_2_target
    (employee_id,manager_id,grp_id,grp_num)
    with max_values as
    (
    select manager_id,grp_id,max(grp_num) grp_num,
           max(grp_id) over() mx_grp_id
    from employees_2_target
    group by manager_id,grp_id
    )
    select e.employee_id,e.manager_id,
           decode(m.grp_id,null,
                  nvl(m.mx_grp_id,1)-1+
                      dense_rank() 
                        over(order by 
                              decode(m.manager_id,null,e.manager_id,-1)),m.grp_id) grp_id ,
           nvl(m.grp_num,0) + 
           row_number() over(partition by e.manager_id order by e.employee_id) grp_num
    from employees_2 e,max_values m
    where e.manager_id = m.manager_id(+)
    ;
    
    4 rows inserted.
    
    
    --"verify the output..
    
    select employee_id,manager_id,grp_id,grp_num
    from employees_2_target
    order by manager_id,employee_id;
    
    EMPLOYEE_ID MANAGER_ID GRP_ID GRP_NUM
    ----------- ---------- ------ -------
            114        100      1       1 
            115        100      1       2 
            400        100      1       3 
            401        100      1       4 
            116        114      2       1 
            117        114      2       2 
            118        114      2       3 
            119        114      2       4 
            512        114      2       5 
            513        114      2       6 
    
     10 rows selected 
    If you expect, same emp id multiple times, provide sample data and expected output, please..

    Edited by: jeneesh on Feb 10, 2013 8:09 PM
    If you expect changes in MANAGER_ID,DEPARTMENT_ID combination, re-write the join accordingly..
  • 7. Re: Grouping of data
    587126 Newbie
    Currently Being Moderated
    Hi Jeneesh,

    As stated earlier, grp_id is a unique combination of manager_id and department_id.
    I tried to understand the query you posted, but I am unable to do below condition, can you explain in brief.

    decode(m.grp_id,null,
    nvl(m.mx_grp_id,1)-1+
    dense_rank()
    over(order by
    decode(m.manager_id,null,e.manager_id,-1)),m.grp_id) grp_id ,

    To provide limited data, I have provided only department 30 data, my actual table has many departments.

    Thanks
    Raghu

    Edited by: Raghu on 11 Feb, 2013 2:10 PM

    Edited by: Raghu on 11 Feb, 2013 2:11 PM
  • 8. Re: Grouping of data
    jeneesh Guru
    Currently Being Moderated
    You need to provide details:

    What if a manger is moved to a different department?What will happen to the already loaded data? Should it change? Or should it go as new entries?

    What if an employee_id re appear in the new set of data? Want to insert again - duplicate data?

    Provide sample data that will cover all your scenarios - and the expected output also..

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points