SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Random Assignment of Employees to Locations

GmoneyJul 5 2017 — edited Aug 25 2017

Good day all,

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production

PL/SQL Release 11.2.0.1.0 - Production

"CORE 11.2.0.1.0 Production"

TNS for 32-bit Windows: Version 11.2.0.1.0 - Production

NLSRTL Version 11.2.0.1.0 - Production

I will base this question of the HR Schema for simplicity.

I have two tables Employees and Locations. For this example, I have 37 locations and 173 employees.

My locations have a specific number assigned indicating the number of employees that need to be assigned to the location.

What I am attempting to do is randomly assign the number of required employees for each location from the employees table.

Thanks for looking at my issue.

As such an example of desired results would be:

EMPLOYEE_ID	FIRST_NAME	LAST_NAME	STREET_ADDRESS	CITY	STATE
1	A	Smith	737 Any Road	Anywhere	USA
2	B	Jones	50 Some Road	Nowhere	USA
3	C	Butler	737 Any Road	Anywhere	USA
4	D	Thomas	50 Some Road	Nowhere	USA
5	E	Marsh	737 Any Road	Anywhere	USA
6	F	Ruppert	50 Some Road	Nowhere	USA
7	G	Glenn	50 Some Road	Nowhere	USA

EMPLOYEE_ID	FIRST_NAME	LAST_NAME
000001	A	Smith
000002	B	Jones
000003	C	utler
000004	D	Thomas
000005	E	Marsh
000006	F	Ruppert
000007	G	Glenn

STRET_ADDRESS	CITY	HEAD_CNT	STATE
737 Any Road	Anywhere	3	USA
50 Some Road	Nowhere	4	USA

This post has been answered by sdstuber on Jul 5 2017

Jump to Answer

sdstuber

Answer

WITH employees

AS (SELECT '000001' employee_id, 'A' first_name, 'Smith' last_name FROM DUAL

UNION ALL

SELECT '000002', 'B', 'Jones' FROM DUAL

UNION ALL

SELECT '000003', 'C', 'Butler' FROM DUAL

UNION ALL

SELECT '000004', 'D', 'Thomas' FROM DUAL

UNION ALL

SELECT '000005', 'E', 'Marsh' FROM DUAL

UNION ALL

SELECT '000006', 'F', 'Ruppert' FROM DUAL

UNION ALL

SELECT '000007', 'G', 'Glenn' FROM DUAL

locations

AS (SELECT '737 Any Road' street_address, 'Anywhere' city, 3 head_cnt, 'USA' state FROM DUAL

UNION ALL

SELECT '50 Some Road', 'Nowhere', 4, 'USA' FROM DUAL),

sorted_employees

AS (SELECT ROWNUM rn, e.*

FROM ( SELECT *

FROM employees

ORDER BY DBMS_RANDOM.VALUE) e),

sorted_locations

AS (SELECT l.*,

NVL(

SUM(head_cnt)

OVER(ORDER BY rnk ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),

)

total

FROM (SELECT l.*, DENSE_RANK() OVER (ORDER BY street_address, city, state) rnk

FROM locations l) l)

SELECT *

FROM sorted_employees e, sorted_locations l

WHERE e.rn > l.total AND e.rn <= l.total + l.head_cnt;

I do make the assumption there are enough employees to distribute across all the locations.
If there are not, then the locations toward the end of the list (as sorted by street,cty,state) will be left empty while the ones toward the beginning will be filled.
Also, if there are too many employees, the extra employees won't be assigned anywhere.

If you need to handle those situations, then please elaborate as to what the expected results would be for each.

Marked as Answer by Gmoney · Sep 27 2020

1 person found this helpful

Gmoney

Thanks for your reply; however, unless I am missing something, this becomes a bit cumbersome when I have 37 locations and 173 employees.

John Thorton

why must the locations be "random"?

1) Assign the first location to the first "X" employees.

2) Assign the next location to the next "y employees

3) go to #2 & repeat until done

sdstuber

You don't use the employees and locations portions of the WITH clause, I don't have your tables, so those views are to simulate your tables.

The query on your end would look like this...

WITH sorted_employees

AS (SELECT ROWNUM rn, e.*

FROM ( SELECT *

FROM employees --- change this your employee table name

ORDER BY DBMS_RANDOM.VALUE) e),

sorted_locations

AS (SELECT l.*,

NVL(

SUM(head_cnt)

OVER(ORDER BY rnk ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),

)

total

FROM (SELECT l.*, DENSE_RANK() OVER (ORDER BY street_address, city, state) rnk

FROM locations l --- change this to your location table name
) l)

SELECT *

FROM sorted_employees e, sorted_locations l

WHERE e.rn > l.total AND e.rn <= l.total + l.head_cnt;

And if you decide you don't need to randomize the employees as mentioned above,

simply change
ORDER BY DBMS_RANDOM.VALUE

ORDER BY employee_Id --- or any other sorting criteria to determin who is "first"

Gmoney

I jumped too soon, and did not pay attention to what you were showing me. My fault for not providing sample data. This works perfectly for my need.

Thank you for your prompt response.

Regards,

Greg

Gmoney

This is being used to assign employees to temporary locations, and thus out of fairness simply selecting the assignments at random.

John Thorton

Gmoney wrote:

This is being used to assign employees to temporary locations, and thus out of fairness simply selecting the assignments at random.

IMO, you can't do "random" & ensure that all locations be used a specific number of times.

But this is your parade & if you think the problem has been correctly solved, then my opinion should be ignored.

sdstuber

>> IMO, you can't do "random" & ensure that all locations be used a specific number of times.

If you see a bug in my answer please elaborate. I'm happy to try to fix it.
I think it does what was asked with the assumption I mentioned above; but I'm willing to believe I'm author-blind.

The code I posted does just what you suggested, the only thing special about it is the rule for determing who is "first" is randomized.

So, instead of, sorting by name or id I sort randomly. Then the first X employees, whomever they are in the random list are assigned to the first location, then the next Y employees from the same random list and so on.

I think the logic is sound; but again, I'm willing to believe I've overlooked something, if you see a flaw, do tell.

mathguy

Here is one way to do it. We need to generate a random ordering of employees (doing so with ROW_NUMBER() ordered by random numbers, generated with the DBMS_RANDOM.VALUE() function in the ORDER BY clause of ROW_NUMBER(). We also generate the right number of rows for each location (as many as its HEAD_COUNT) - we do this with a common technique using a hierarchical query. I assume (STREET_ADDRESS, CITY, STATE) is a unique key for your table (it would be better if there was a LOCATION_ID column in the LOCATIONS table, but I am working with what you posted). I generate ordinal numbers for locations with ROWNUM - no need to use functions like ROW_NUMBER() for this; ROWNUM is assigned anyway, so I am spending no extra time to generate these row numbers, and randomness is assured by the assignment of random ordinals to the employees. Then match employees to locations with a join (a lookup operation).

with

-- Begin sample data (for testing purposes)

     employees (employee_id, first_name, last_name ) as (
       select '000001', 'A', 'Smith'   from dual union all
       select '000002', 'B', 'Jones'   from dual union all
       select '000003', 'C', 'Butler' from dual union all
       select '000004', 'D', 'Thomas' from dual union all
       select '000005', 'E', 'Marsh'   from dual union all
       select '000006', 'F', 'Ruppert' from dual union all
       select '000007', 'G', 'Glenn'   from dual
     ),
     locations (street_address, city, head_cnt, state ) as (
       select '737 Any Road', 'Anywhere', 3, 'USA' from dual union all
       select '50 Some Road', 'Nowhere' , 4, 'USA' from dual

     -- End of sample data (not part of the solution). Query continues below this line.
     ),
     loc_needs ( street_address, city, state, rn ) as (
       select     street_address, city, state, rownum
       from       locations
       connect by level <= head_cnt
              and prior street_address = street_address
              and prior city          = city
              and prior state          = state
              and prior sys_guid() is not null
     ),
     employees_ord ( employee_id, first_name, last_name, rn ) as (
       select employee_id, first_name, last_name,
              row_number() over (order by dbms_random.value())
       from   employees
     )
select e.employee_id, e.first_name, e.last_name,
       l.street_address, l.city, l.state
from   employees_ord e join loc_needs l using (rn)
;

EMPLOYEE_ID FIRST_NAME LAST_NAME STREET_ADDRESS CITY      STATE
----------- ---------- --------- -------------- -------- -----
000005       E           Marsh      50 Some Road    Nowhere   USA
000007       G           Glenn      50 Some Road    Nowhere   USA
000006       F           Ruppert    50 Some Road    Nowhere   USA
000002       B           Jones      50 Some Road    Nowhere   USA
000004       D           Thomas    737 Any Road     Anywhere USA
000001       A           Smith      737 Any Road    Anywhere USA
000003       C           Butler    737 Any Road     Anywhere USA

Frank Kulash

Hi,

Here's a way that is basically what Sdstuber suggested in reply #1. You may find parts of this simpler and clearer, such as replacing ROWNUM with ROW_NUMBER, which doesn't require a sub-query.

WITH    sorted_employees
     AS ( SELECT em.*
          ,      ROW_NUMBER () OVER (ORDER BY dbms_random.value)     AS rn
          FROM   employees  em
        )
,       sorted_locations
     AS ( SELECT lo.*
          ,      SUM (head_cnt) OVER (ORDER BY street_address) - head_cnt  AS total
          FROM   locations  lo
        )
SELECT  *
  FROM  sorted_employees  e
  JOIN  sorted_locations  l  ON  e.rn   BETWEEN l.total + 1
                                        AND     l.total + l.head_cnt;

If locations.street_address is not unique, then use whatever is unique where I used street_address above.

What if the number of rows in employees is not exactly the same as the total of locations.head_cnt? (I think John was asking the same thing in reply #7, and not saying that there was any flaw in Sdstuber's solution. If head_cnt is 100 on some row in locations, but there are only 7 rows in employees, you can ensure that there will be 100 rows for that location in the result set. In fact, you can ensure that there won't be.) Depending on your requirements, you may want to use a full outer join, and/or use MOD in the join condition.

sdstuber

Here's another approach combining from all of the above.
Except instead of randomizing the employees, the locations are randomized.
Doing this lets you distribute employees across locations even if you don't have enough.

So, in this example I've added a 3rd location with a headcount of 2.

In the previous queries, the new address in Somewhere would be excluded while the other two would be competely filled.
In this one, they each get a person until either the people run out, or their head counts are reached.

I can't say this is better, it's functionally different; but may suit some requirements better whereas the previous methods might work better in other scenarios.

WITH employees(employee_id, first_name, last_name)

AS (SELECT '000001', 'A', 'Smith' FROM DUAL

UNION ALL

SELECT '000002', 'B', 'Jones' FROM DUAL

UNION ALL

SELECT '000003', 'C', 'Butler' FROM DUAL

UNION ALL

SELECT '000004', 'D', 'Thomas' FROM DUAL

UNION ALL

SELECT '000005', 'E', 'Marsh' FROM DUAL

UNION ALL

SELECT '000006', 'F', 'Ruppert' FROM DUAL

UNION ALL

SELECT '000007', 'G', 'Glenn' FROM DUAL),

locations(street_address,

city,

head_cnt,

state)

AS (SELECT '737 Any Road', 'Anywhere', 3, 'USA' FROM DUAL

UNION ALL

SELECT '50 Some Road', 'Nowhere', 4, 'USA' FROM DUAL

UNION ALL

SELECT '99 blah', 'Somewhere', 2, 'USA' FROM DUAL),

sorted_employees

AS (SELECT e.*, ROW_NUMBER() OVER (ORDER BY employee_id) rn

FROM employees e),

sorted_locations

AS (SELECT street_address,

city,

state,

ROW_NUMBER() OVER (ORDER BY rnx, DBMS_RANDOM.VALUE) rn

FROM ( SELECT street_address,

city,

state,

ROW_NUMBER() OVER(PARTITION BY street_address, city, state ORDER BY 1) rnx

FROM locations

CONNECT BY LEVEL <= head_cnt

AND PRIOR street_address = street_address

AND PRIOR city = city

AND PRIOR state = state

AND PRIOR SYS_GUID() IS NOT NULL))

SELECT *

FROM sorted_employees e, sorted_locations l

WHERE e.rn = l.rn

mathguy

The solution in Reply 11 seems unnecessarily complicated.

The solution I offered (and perhaps others too, I didn't read them closely) will do the best possible job when there is a mismatch in numbers. If there are too many employees, a random set of them (in the exact number needed) will be assigned randomly to the job locations, and the remaining ones will not be assigned. If there are too few employees, then they will all be assigned to jobs, with some job locations remaining unfilled. If the job locations have a priority order (which should be coded in an additional column), then instead of ROWNUM one can use ROW_NUMBER() over (ORDER BY <priority_column>) - and in that case one can simply select ROWNUM for the employee ordering. In any case, there is no need for more than one call to any of the ROW_NUMBER(), RANK() and DENSE_RANK() functions.

If the requirement is to also show all the unassigned employees, with NULL as their location, or - if there are too few employees - to also show the unfilled positions, showing NULL for employee, that can be accommodated easily as well - just change the inner join to a FULL OUTER JOIN.

sdstuber

I guess I didn't explain what I was trying to do differently very well.

It is more complicated than what you posted previously, but it's also trying to address an extra layer of complexity in assigning employees.

As noted "If there are too few employees, then they will all be assigned to jobs, with some job locations remaining unfilled" - that's exactly what I was trying to address.

The answers prior to 11 would assign 3 to Anywhere, 4 to Nowhere and 0 to Somewhere. So, yes, you are correct, the remaining location will not be filled.
As an alternate I sort the locations randomly and walk through the employees instead of walking through locations and assigning employees.

So answer 11 will assign 2 employees to each of Anywhere, Nowhere and Somewhere. Which employees go where is still random but every location will get one, unless there are fewer employees than locations.

After distributing 6 employees, Somewhere will be filled (headcount=2), so the last employee will be randomly assigned to either Anywhere or Nowhere.

The original question didn't specify such a distribution as necessary, hence why I said it couldn't be called "better", but it might be for some scenarios.

You're right it is more complicated, but it does more too. If you'd like to tune it further, go for it. I provided the alternative simply for illustration, not necessarily as an optimal solution.

Another option would be to sort both locations and employees randomly. That won't guarantee even distribution; but for large samples will approach proportional assignments.
Again, different, but possibly a better fit for some scenarios.

mathguy

OK, I understand what you were trying to do.

As a matter of "business management" (having nothing to do with coding!) - if we needed to do seven things but we only had enough resources (people, time, money, equipment) to do four, we would prioritize them and get those four jobs done. It is very rare that one would assign "some" resources to each job, none of which would get the resources needed to complete it satisfactorily. This is why in my answer I suggested how that could be accommodated (easily) in code.

Still: to assign one employee to each site, then one more to each site, then one more to each site, etc. - in the solution I proposed, instead of ROWNUM you could simply use ROW_NUMBER() over (ORDER BY LEVEL). No other change would be needed.

Cheers, - mathguy

sdstuber

Agreed, the alternate isn't a fit for everything; but conversely I could see businesses that might want to use it (probably not randomly, that part is a bit unrealistic.)

Let's say I have a packing company. I can screw up all customer orders by only assigning people to the apples and oranges, but not packing bananas for anybody.
Or I can assign people to every area but maybe leave some understaffed. So I will pack orders correctly but won't be able to get all of them done. Which is better? I don't know. That's a sales and customer relations issue, not code.

Thanks for looking though!

EdStevens

Cross posted to Stack Exchange - https://dba.stackexchange.com/questions/178083/random-assignment-of-employees-to-locations-oracle-sql

Gmoney

John- My parade recieved no rain - thanks.

Gmoney

mathguy - thanks for your input. I will try your solution as a learning tool.

Gmoney

Frank - Thanks for responding. I always enjoy seeing what you contribute, and have learned a lot from your responses. I will give your solution a run as well. I will take a look at the MOD function as well.The saving grace is that in my situation the number of employees always out number the number of locations. That's why sdstuber's solution worked for me.

1 - 19

Locked Post

New comments cannot be posted to this locked post.

Locked on Sep 22 2017

Added on Jul 5 2017

#select, #sql

19 comments

1,112 views

SQL & PL/SQL

Random Assignment of Employees to Locations

Comments

Post Details