14 Replies Latest reply on Sep 4, 2007 3:45 PM by Ronald Rood

    can this be done in sql (without the PL) ? distribute files

    Ronald Rood
      create table files
      (id number
      ,tsname varchar2(30)
      ,bytes number
      ,raidlevel number
      );

      create table fs
      (fsname varchar2(256)
      ,bytes number
      ,raidlevel number
      );

      insert into files values (1,'SYSTEM', 512,5); insert into files values (2,'SYSTEM', 512,5); insert into files values (3,'TOOLS', 256,5); insert into files values (4,'TOOLS', 512,5); insert into files values (5,'TOOLS', 768,5); insert into files values (6,'USERS', 512,10); insert into files values (7,'UNDO', 512,5); insert into files values (8,'UNDO', 512,5); insert into files values (9,'UNDO', 512,5); insert into files values (10,'TOOLS', 256,5); insert into files values (11,'TOOLS', 256,5);

      insert into fs values ('/fs1',1536,5);
      insert into fs values ('/fs2',2536,5);
      insert into fs values ('/fs3',536,5);
      insert into fs values ('/fs4',1536,5);
      insert into fs values ('/fs4',536,10);
      commit;

      -- try to distribute the files of the database over as many fs as possible
      -- 1) leaving about the same amount of free space in each device (%)
      -- 2) spreading the files of a tsname over as many fs as possible
      -- 3) generate a filename consisting of fsname/tsname_N where N is the sequence
      -- number of the file in the tsname (/fs1/SYSTEM_01, /fs2/SYSTEM_02, /fs3/TOOLS_01 etc.
      -- 4) when possible take into account the preference for the requested raidlevel.
      -- (just an added bonus (put only files with raidlevel 10 in the fs with raidlevel 10 etc.))
      -- 5) there are more ways to do the distribution, many different results can be ok.

      with kind regards,
      Ronald.
        • 1. Re: can this be done in sql (without the PL) ? distribute files
          Jomar
          Hi,

          Why don't you make exercises?
          And after, we help you. If you need.

          Regards
          Jomar
          • 2. Re: can this be done in sql (without the PL) ? distribute files
            Ronald Rood
            Thanks for your reaction Jomar.
            It implies that my question is to simple to be asked here. True, part of it is quite simple but unless I must be missing something, the part that should take care of the capacity is not. Could you give a clue in what direction I should search for your solution ?
            Analytics rocks, I know but for this problem .... I am not so sure about a sql only solution, without using procedural languages or user written functions.

            with kind regards,
            Ronald (who is not holding his breath)
            ---
            http://ronr.nl/unix-dba
            • 3. Re: can this be done in sql (without the PL) ? distribute files
              572471
              It implies that my question is to simple to be asked
              here.
              No, I think the main problem why you get so little responses - because it's hard to understand - what you want to get.
              May be only for me.
              You provided a good create+insert script for INPUT data.
              So you better provide expected OUTPUT. And I'm sure forum members will help you.

              best wishes.
              • 4. Re: can this be done in sql (without the PL) ? distribute files
                Ronald Rood
                Thanks for your reaction Volder,
                problem with giving expected output is that there are many possible solutions. Sure, I can give a few solutions, no problem but - if - an sql only solution is possible at all, I would not be surprized when that solution gives a different one.

                BTW: there is an error in the 5th insert into fs, this should insert '/fs5' instead of '/fs4'

                One strategy could be to sort the files on tsname and size in order to combine the list of files over the list of fs. Doing so makes sure that I distribute the files for a tsname over all possible fs. Problem is that some fs have a larger capacity and can not hold each file that is allocated to it.

                This sql follows that strategy
                SELECT
                tsname, id, bytes, mod(rownum, (SELECT count(*) FROM fs )) fsid
                FROM
                (
                SELECT
                tsname, id, bytes
                FROM
                files
                ORDER BY
                tsname,
                id
                )

                and the next gives the totals for the allocations/fsname:

                select fsname, fs.bytes maxbytes, dist.bytes, dist.files
                from
                (
                select fs.fsname, rownum -1 fsid, fs.bytes
                from fs
                ) fs
                ,
                (
                SELECT
                fsid, sum(bytes) bytes, count(*) files
                FROM
                (
                SELECT
                tsname, id, bytes, mod(rownum, (SELECT count(*) FROM fs )) fsid
                FROM
                (
                SELECT
                tsname, id, bytes
                FROM
                files
                ORDER BY
                tsname,
                id
                )
                )
                GROUP BY
                fsid
                ) dist
                where dist.fsid = fs.fsid
                order by fs.fsname

                FSNAME     MAXBYTES     BYTES     FILES
                /fs1     1536     1280     2
                /fs2     2536     1280     3
                /fs3     536     768     2
                /fs4     1536     768     2
                /fs5     536     1024     2

                this clearly shows that /fs3 has maxsize of 536 and has 2 files allocated with total size of 768. This does not fit. I hope this clarifies the problem a bit more.

                with best regards,
                Ronald.
                • 5. Re: can this be done in sql (without the PL) ? distribute files
                  Ronald Rood
                  select fsname, fs.bytes maxbytes, dist.tsname, dist.bytes, dist.id fileid
                  from
                  (
                  select fs.fsname, rownum -1 fsid, fs.bytes
                  from fs
                  ) fs
                  ,
                  (
                  SELECT
                  fsid, tsname, bytes, id
                  FROM
                  (
                  SELECT
                  tsname, id, bytes, mod(rownum, (SELECT count(*) FROM fs )) fsid
                  FROM
                  (
                  SELECT
                  tsname, id, bytes
                  FROM
                  files
                  ORDER BY
                  tsname,
                  id
                  )
                  )
                  ) dist
                  where dist.fsid = fs.fsid
                  order by fs.fsname
                  returns:
                  FSNAME     MAXBYTES     TSNAME     BYTES     FILEID
                  /fs1     1536     TOOLS     768     5
                  /fs1     1536     UNDO     512     9
                  /fs2     2536     SYSTEM     512     1
                  /fs2     2536     TOOLS     256     10
                  /fs2     2536     USERS     512     6
                  /fs3     536     SYSTEM     512     2
                  /fs3     536     TOOLS     256     11
                  /fs4     1536     UNDO     512     7
                  /fs4     1536     TOOLS     256     3
                  /fs5     536     UNDO     512     8
                  /fs5     536     TOOLS     512     4

                  manually moving a little around I come to:
                  FSNAME     MAXBYTES     TSNAME     BYTES     FILEID     total
                  /fs1     1536     TOOLS     768     5     1280
                  /fs1     1536     UNDO     512     9     
                  /fs2     2536     SYSTEM     512     1     1792
                  /fs2     2536     TOOLS     256     10     
                  /fs2     2536     UNDO     512     8     
                  /fs2     2536     USERS     512     6     
                  /fs3     536     TOOLS     256     11     256
                  /fs4     1536     SYSTEM     512     2     1280
                  /fs4     1536     TOOLS     256     3     
                  /fs4     1536     UNDO     512     7     
                  /fs5     536     TOOLS     512     4     512
                  which is an acceptable solution
                  all files allocated to their /fsN fit within the size and the distribution of the files is optimal. This result will not allways be possible, when there are more files for a tsname than there are fs some fs will get more files of the same fs. I can live with that.


                  R.
                  • 6. Re: can this be done in sql (without the PL) ? distribute files
                    Alessandro Rossi
                    I found a way but I implemented it just for practice. It is a too complex computation I had to cut a couple of files from your input on files table to have an acceptable response time for the full procedure from a quite good machine.

                    Here it is.
                    Processing ...
                    create table files (
                         file_id number primary key,
                         tsname varchar2(30),
                         bytes number,
                         raidlevel number
                    )
                    Processing ...
                    create table fs (
                         fsname varchar2(256) primary key,
                         bytes number,
                         raidlevel number
                    )
                    Processing ...
                    insert into files values (1,'SYSTEM', 512,5)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (2,'SYSTEM', 512,5)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (3,'TOOLS', 256,5)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (4,'TOOLS', 512,5)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (6,'USERS', 512,10)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (7,'UNDO', 512,5)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (8,'UNDO', 512,5)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (10,'TOOLS', 256,5)
                    1 row(s) inserted
                    Processing ...
                    insert into files values (11,'TOOLS', 256,5)
                    1 row(s) inserted
                    Processing ...
                    insert into fs values ('/fs1',1536,5)
                    1 row(s) inserted
                    Processing ...
                    insert into fs values ('/fs2',2536,5)
                    1 row(s) inserted
                    Processing ...
                    insert into fs values ('/fs3',536,5)
                    1 row(s) inserted
                    Processing ...
                    insert into fs values ('/fs4',1536,5)
                    1 row(s) inserted
                    Processing ...
                    insert into fs values ('/fs5',536,10)
                    1 row(s) inserted
                    Processing ...
                    commit
                    These are your tables I cut two files because it was taking too long on the elaboration.
                    Processing ...
                    create table distributions (
                         k_distr number,
                         file_id number/*,
                         constraint distr_file_fk
                              foreign key (file_id) references files*/,
                         fsname varchar2(256)/*,
                         constraint distr_fs_fk
                              foreign key (fsname) references fs*/
                    )
                    Processing ...
                    create sequence k_distr start with 1 increment by 1 cache 10000000000
                    The table is used to represent all the possible was to put files in filesystems, the sequence is used to enumerate each one.
                    Processing ...
                    create type file_rec as object (
                         id number,
                         bytes number,
                         rl number
                    )
                    TYPE FILE_REC compiled successfully
                    Processing ...
                    create type file_tab as table
                         of file_rec
                    TYPE FILE_TAB compiled successfully
                    Processing ...
                    create type fs_rec as object (
                         fsname varchar2(256),
                         free_bytes number,
                         rl number
                    )
                    TYPE FS_REC compiled successfully
                    Processing ...
                    create type fs_tab as table
                         of fs_rec
                    TYPE FS_TAB compiled successfully
                    Processing ...
                    create type distr_rec as object (
                         file_id number,
                         fsname varchar2(256)
                    )
                    TYPE DISTR_REC compiled successfully
                    Processing ...
                    create type distr_tab as table
                         of distr_rec
                    TYPE DISTR_TAB compiled successfully
                    Processing ...
                    create PROCEDURE DISTRIBUTE_FILES(
                         files in file_tab,
                         curr_file_idx in number,
                         fss in out fs_tab,
                         prev_distr in distr_tab
                    )
                    as
                         loc_k_distr number;
                         idx number;
                         this_distr distr_tab;
                    begin
                         --dbms_output.put_line(curr_file_idx);
                         if ( curr_file_idx <= files.count ) then
                              --dbms_output.put_line('File '||files(curr_file_idx).id);
                              for ind in 1..fss.count loop
                                   --dbms_output.put_line(fss(ind).fsname);
                                   if ( fss(ind).free_bytes >= files(curr_file_idx).bytes and fss(ind).rl >= files(curr_file_idx).rl ) then
                                        --dbms_output.put_line('YES');
                                        fss(ind).free_bytes := fss(ind).free_bytes - files(curr_file_idx).bytes;
                                        select new distr_rec(file_id,fsname)
                                             bulk collect into this_distr
                                        from table (prev_distr);
                                        this_distr.extend;
                                        this_distr(this_distr.count):=new distr_rec(
                                             files(curr_file_idx).id,
                                             fss(ind).fsname
                                        );
                                        distribute_files(
                                             files,
                                             curr_file_idx+1,
                                             fss,
                                             this_distr
                                        );
                                        fss(ind).free_bytes := fss(ind).free_bytes + files(curr_file_idx).bytes;
                                   --else
                                        --dbms_output.put_line('NO');
                                   end if;
                              end loop;
                         else
                              --dbms_output.put_line('OK');
                              select k_distr.nextval
                                   into loc_k_distr
                              from dual;
                              insert into distributions (
                                   k_distr,
                                   file_id,
                                   fsname
                              ) (
                                   select loc_k_distr,
                                        file_id,
                                        fsname
                                   from table (prev_distr)
                              );
                              commit;
                         end if;
                    end;
                    PROCEDURE DISTRIBUTE_FILES compiled successfully
                    A procedure to fill the previous table and all the needed objects. This is the procedure that takes very long on long ( not really ) inputs.
                    Processing ...
                    declare
                         files file_tab;
                         fss fs_tab;
                         distr distr_tab;
                         i number;
                    begin
                         select new file_rec(file_id,bytes,raidlevel)
                              bulk collect into files
                         from files;
                         select new fs_rec(fsname,bytes,raidlevel)
                              bulk collect into fss
                         from fs;
                         i := 1;
                         distr := distr_tab();
                         DISTRIBUTE_FILES(
                              files,
                              i,
                              fss,
                              distr
                         );
                    end;
                    Processing ...
                    create index distributions_id on distributions (
                         k_distr
                    )
                    Processing ...
                    begin
                         dbms_stats.gather_schema_stats(ownname=>user,cascade=>true);
                    end;
                    Now let's call it and let's analyze the new objects.
                    Processing ...
                    select *
                    from files
                    Query finished, retrieving results...
                                    FILE_ID                            TSNAME                              BYTES                                RAIDLEVEL              
                    -------------------------------------- ------------------------------ -------------------------------------- --------------------------------------
                                                         1 SYSTEM                                                            512                                      5
                                                         2 SYSTEM                                                            512                                      5
                                                         3 TOOLS                                                             256                                      5
                                                         4 TOOLS                                                             512                                      5
                                                         6 USERS                                                             512                                     10
                                                         7 UNDO                                                              512                                      5
                                                         8 UNDO                                                              512                                      5
                                                        10 TOOLS                                                             256                                      5
                                                        11 TOOLS                                                             256                                      5

                    9 row(s) retrieved

                    Processing ...
                    select *
                    from fs
                    Query finished, retrieving results...
                                               FSNAME                                             BYTES                                RAIDLEVEL              
                    ------------------------------------------------------------ -------------------------------------- --------------------------------------
                    /fs1                                                                                           1536                                      5
                    /fs2                                                                                           2536                                      5
                    /fs3                                                                                            536                                      5
                    /fs4                                                                                           1536                                      5
                    /fs5                                                                                            536                                     10

                    5 row(s) retrieved
                    Processing ...
                    select *
                    from distributions a join files b using (file_id) join fs c using (fsname)
                    where k_distr < 5
                    order by k_distr


                    Query finished, retrieving results...
                                     FSNAME                                  FILE_ID                                K_DISTR                            TSNAME                              BYTES                                RAIDLEVEL                               BYTES_1                              RAIDLEVEL_1             
                    ---------------------------------------- -------------------------------------- -------------------------------------- ------------------------------ -------------------------------------- -------------------------------------- -------------------------------------- --------------------------------------
                    /fs1                                                                          3                                      1 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs2                                                                          4                                      1 TOOLS                                                             512                                      5                                   2536                                      5
                    /fs1                                                                         10                                      1 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs1                                                                          1                                      1 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs5                                                                          6                                      1 USERS                                                             512                                     10                                    536                                     10
                    /fs2                                                                         11                                      1 TOOLS                                                             256                                      5                                   2536                                      5
                    /fs2                                                                          8                                      1 UNDO                                                              512                                      5                                   2536                                      5
                    /fs2                                                                          7                                      1 UNDO                                                              512                                      5                                   2536                                      5
                    /fs1                                                                          2                                      1 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs1                                                                          3                                      2 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs1                                                                          2                                      2 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs1                                                                          1                                      2 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs5                                                                          6                                      2 USERS                                                             512                                     10                                    536                                     10
                    /fs3                                                                         11                                      2 TOOLS                                                             256                                      5                                    536                                      5
                    /fs1                                                                         10                                      2 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs2                                                                          8                                      2 UNDO                                                              512                                      5                                   2536                                      5
                    /fs2                                                                          7                                      2 UNDO                                                              512                                      5                                   2536                                      5
                    /fs2                                                                          4                                      2 TOOLS                                                             512                                      5                                   2536                                      5
                    /fs4                                                                         11                                      3 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs1                                                                          3                                      3 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs2                                                                          4                                      3 TOOLS                                                             512                                      5                                   2536                                      5
                    /fs5                                                                          6                                      3 USERS                                                             512                                     10                                    536                                     10
                    /fs1                                                                          1                                      3 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs2                                                                          8                                      3 UNDO                                                              512                                      5                                   2536                                      5
                    /fs2                                                                          7                                      3 UNDO                                                              512                                      5                                   2536                                      5
                    /fs1                                                                         10                                      3 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs1                                                                          2                                      3 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs2                                                                          4                                      4 TOOLS                                                             512                                      5                                   2536                                      5
                    /fs1                                                                          2                                      4 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs2                                                                          8                                      4 UNDO                                                              512                                      5                                   2536                                      5
                    /fs1                                                                          1                                      4 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs5                                                                          6                                      4 USERS                                                             512                                     10                                    536                                     10
                    /fs1                                                                         11                                      4 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs2                                                                          7                                      4 UNDO                                                              512                                      5                                   2536                                      5
                    /fs1                                                                          3                                      4 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs2                                                                         10                                      4 TOOLS                                                             256                                      5                                   2536                                      5

                    36 row(s) retrieved
                    Just to take a look at what we have.
                    Processing ...
                    with t1 as (
                         select k_distr,to_char(variance(free_space),'9999999999990.99') param1,to_char(avg(free_space),'9999999999999990.99') as mean_free_space
                         from (
                              select k_distr,fsname,c.bytes-sum(b.bytes) as  free_space
                              from distributions a join files b using (file_id) join fs c using (fsname)
                              group by k_distr,fsname,c.bytes
                         )
                         group by k_distr
                    ), t2 as (
                         select k_distr,avg(sing)/count(distinct fsname) as param2
                         from (
                              select k_distr,fsname,tsname,count(*) as sing, sum(b.bytes) as used_space
                              from distributions a join files b using (file_id) join fs c using (fsname)
                              group by k_distr,fsname,tsname
                         )
                         group by k_distr
                    ), t3 as (
                         select k_distr,count(*) as param4
                         from distributions a join files b using (file_id) join fs c using (fsname)
                         where b.raidlevel = c.raidlevel
                         group by k_distr
                    )
                    select *
                    from (
                         select *
                         from t1 join t2 using (k_distr) join t3 using (k_distr)
                         order by param2 asc, param1 asc,param4 desc,k_distr asc
                    )
                    where rownum <= 50
                    Query finished, retrieving results...
                                    K_DISTR                      PARAM1         MEAN_FREE_SPACE                   PARAM2                                 PARAM4                
                    -------------------------------------- ----------------- -------------------- -------------------------------------- --------------------------------------
                                                      1351         161376.00               568.00                                    0,2                                      9
                                                      1354         161376.00               568.00                                    0,2                                      9
                                                      1408         161376.00               568.00                                    0,2                                      9
                                                      1411         161376.00               568.00                                    0,2                                      9
                                                      2425         161376.00               568.00                                    0,2                                      9
                                                      2433         161376.00               568.00                                    0,2                                      9
                                                      2455         161376.00               568.00                                    0,2                                      9
                                                      2463         161376.00               568.00                                    0,2                                      9
                                                      2838         161376.00               568.00                                    0,2                                      9
                                                      2844         161376.00               568.00                                    0,2                                      9
                                                      2897         161376.00               568.00                                    0,2                                      9
                                                      2903         161376.00               568.00                                    0,2                                      9
                                                      5608         161376.00               568.00                                    0,2                                      9
                                                      5611         161376.00               568.00                                    0,2                                      9
                                                      5665         161376.00               568.00                                    0,2                                      9
                                                      5668         161376.00               568.00                                    0,2                                      9
                                                      6682         161376.00               568.00                                    0,2                                      9
                                                      6690         161376.00               568.00                                    0,2                                      9
                                                      6712         161376.00               568.00                                    0,2                                      9
                                                      6720         161376.00               568.00                                    0,2                                      9
                                                      7095         161376.00               568.00                                    0,2                                      9
                                                      7101         161376.00               568.00                                    0,2                                      9
                                                      7154         161376.00               568.00                                    0,2                                      9
                                                      7160         161376.00               568.00                                    0,2                                      9
                                                     10497         161376.00               568.00                                    0,2                                      9
                                                     10500         161376.00               568.00                                    0,2                                      9
                                                     10538         161376.00               568.00                                    0,2                                      9
                                                     10541         161376.00               568.00                                    0,2                                      9
                                                     11605         161376.00               568.00                                    0,2                                      9
                                                     11613         161376.00               568.00                                    0,2                                      9
                                                     11635         161376.00               568.00                                    0,2                                      9
                                                     11643         161376.00               568.00                                    0,2                                      9
                                                     11993         161376.00               568.00                                    0,2                                      9
                                                     11999         161376.00               568.00                                    0,2                                      9
                                                     12033         161376.00               568.00                                    0,2                                      9
                                                     12039         161376.00               568.00                                    0,2                                      9
                                                     15977         161376.00               568.00                                    0,2                                      9
                                                     15980         161376.00               568.00                                    0,2                                      9
                                                     16018         161376.00               568.00                                    0,2                                      9
                                                     16021         161376.00               568.00                                    0,2                                      9
                                                     17085         161376.00               568.00                                    0,2                                      9
                                                     17093         161376.00               568.00                                    0,2                                      9
                                                     17115         161376.00               568.00                                    0,2                                      9
                                                     17123         161376.00               568.00                                    0,2                                      9
                                                     17473         161376.00               568.00                                    0,2                                      9
                                                     17479         161376.00               568.00                                    0,2                                      9
                                                     17513         161376.00               568.00                                    0,2                                      9
                                                     17519         161376.00               568.00                                    0,2                                      9
                                                      1590         223840.00               568.00                                    0,2                                      9
                                                      1593         223840.00               568.00                                    0,2                                      9

                    50 row(s) retrieved
                    This are the best dispositions on the base of your criteria the param label on the columns stands for the number before the criteria you indicated in the first post.
                    Param 3 is not there because it can be obtained in different ways, anyway lower values for k_distrib may reach that scope.
                    Processing ...
                    select *
                    from distributions a join files b using (file_id) join fs c using (fsname)
                    where k_distr in (1351,1590,1408,2433)
                    order by k_distr,file_id
                    Query finished, retrieving results...
                                     FSNAME                                  FILE_ID                                K_DISTR                            TSNAME                              BYTES                                RAIDLEVEL                               BYTES_1                              RAIDLEVEL_1             
                    ---------------------------------------- -------------------------------------- -------------------------------------- ------------------------------ -------------------------------------- -------------------------------------- -------------------------------------- --------------------------------------
                    /fs1                                                                          1                                   1351 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs2                                                                          2                                   1351 SYSTEM                                                            512                                      5                                   2536                                      5
                    /fs1                                                                          3                                   1351 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs2                                                                          4                                   1351 TOOLS                                                             512                                      5                                   2536                                      5
                    /fs5                                                                          6                                   1351 USERS                                                             512                                     10                                    536                                     10
                    /fs2                                                                          7                                   1351 UNDO                                                              512                                      5                                   2536                                      5
                    /fs4                                                                          8                                   1351 UNDO                                                              512                                      5                                   1536                                      5
                    /fs3                                                                         10                                   1351 TOOLS                                                             256                                      5                                    536                                      5
                    /fs4                                                                         11                                   1351 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs1                                                                          1                                   1408 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs2                                                                          2                                   1408 SYSTEM                                                            512                                      5                                   2536                                      5
                    /fs1                                                                          3                                   1408 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs2                                                                          4                                   1408 TOOLS                                                             512                                      5                                   2536                                      5
                    /fs5                                                                          6                                   1408 USERS                                                             512                                     10                                    536                                     10
                    /fs4                                                                          7                                   1408 UNDO                                                              512                                      5                                   1536                                      5
                    /fs2                                                                          8                                   1408 UNDO                                                              512                                      5                                   2536                                      5
                    /fs3                                                                         10                                   1408 TOOLS                                                             256                                      5                                    536                                      5
                    /fs4                                                                         11                                   1408 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs1                                                                          1                                   1590 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs2                                                                          2                                   1590 SYSTEM                                                            512                                      5                                   2536                                      5
                    /fs1                                                                          3                                   1590 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs4                                                                          4                                   1590 TOOLS                                                             512                                      5                                   1536                                      5
                    /fs5                                                                          6                                   1590 USERS                                                             512                                     10                                    536                                     10
                    /fs2                                                                          7                                   1590 UNDO                                                              512                                      5                                   2536                                      5
                    /fs4                                                                          8                                   1590 UNDO                                                              512                                      5                                   1536                                      5
                    /fs2                                                                         10                                   1590 TOOLS                                                             256                                      5                                   2536                                      5
                    /fs3                                                                         11                                   1590 TOOLS                                                             256                                      5                                    536                                      5
                    /fs1                                                                          1                                   2433 SYSTEM                                                            512                                      5                                   1536                                      5
                    /fs2                                                                          2                                   2433 SYSTEM                                                            512                                      5                                   2536                                      5
                    /fs3                                                                          3                                   2433 TOOLS                                                             256                                      5                                    536                                      5
                    /fs2                                                                          4                                   2433 TOOLS                                                             512                                      5                                   2536                                      5
                    /fs5                                                                          6                                   2433 USERS                                                             512                                     10                                    536                                     10
                    /fs2                                                                          7                                   2433 UNDO                                                              512                                      5                                   2536                                      5
                    /fs4                                                                          8                                   2433 UNDO                                                              512                                      5                                   1536                                      5
                    /fs4                                                                         10                                   2433 TOOLS                                                             256                                      5                                   1536                                      5
                    /fs1                                                                         11                                   2433 TOOLS                                                             256                                      5                                   1536                                      5

                    36 row(s) retrieved
                    And finally let's see some of the disposition in the top 50. It looks fine.

                    Bye Alessandro
                    • 7. Re: can this be done in sql (without the PL) ? distribute files
                      Ronald Rood
                      Oops, that's a lot of code Alessandro, thanks for the response.
                      the solutions look very valid to me. Should I conclude that the short answer to my question is: no ?
                      or can the model clause be used to solve this in a single sql ?

                      regards,
                      Ronald.
                      • 8. Re: can this be done in sql (without the PL) ? distribute files
                        Rob van Wijk
                        Hi colleague,

                        I just had an idea about how to solve this problem using the model clause.

                        The algorithm used is not the most sophisticated one out there, so there will probably exist some better ones. This one just orders the files and the fs from biggest to smallest and assigns the biggest file to the fs with the most space at that moment.

                        I included some grouping functions in the result set, so it's a little easier to verify the results. Note that for the bonus of including the raidlevel I got stuck. It looks like I just have to include the raidlevel as a partition of the model, but I got the "self cyclic rule" error message. I'll have to look into that some other time. Meanwhile you can have a look whether the next solution is good enough:
                        SQL> select id
                          2       , decode(grouping(tsname),1,'Total ' || max(fsname),tsname) tsname
                          3       , sum(fbytes)
                          4       , fsname
                          5       , fsbytes
                          6    from ( select *
                          7             from files f
                          8                , fs
                          9            model
                        10                  dimension by
                        11                  ( dense_rank() over (order by f.bytes desc, f.id) - 1 x
                        12                  , dense_rank() over (order by fs.bytes desc, fs.fsname) - 1 y
                        13                  )
                        14                  measures
                        15                  ( ' ' assigned
                        16                  , f.id
                        17                  , f.bytes fbytes
                        18                  , f.tsname
                        19                  , f.raidlevel
                        20                  , fsname
                        21                  , fs.bytes fsbytes
                        22                  , fs.bytes tempfsbytes
                        23                  , fs.bytes maxbytes
                        24                  )
                        25                  rules iterate(1000) until (tsname[iteration_number,0] is null)
                        26                  ( tempfsbytes[iteration_number,any] =
                        27                    case
                        28                    when assigned[iteration_number-1,cv()] = 'y'
                        29                    then tempfsbytes[iteration_number-1,cv()] - fbytes[iteration_number-1,cv()]
                        30                    else tempfsbytes [nvl(nullif(iteration_number-1,-1),0),cv()]
                        31                    end
                        32                  , maxbytes[iteration_number,any] = max(tempfsbytes)[iteration_number,any]
                        33                  , assigned[iteration_number,any] =
                        34                    case
                        35                    when  maxbytes[iteration_number,cv()] = tempfsbytes[iteration_number,cv()]
                        36                      and nvl(max(assigned)[iteration_number,y<cv()],'n') = 'n'
                        37                    then 'y'
                        38                    else 'n'
                        39                    end
                        40                  )
                        41         )
                        42   where assigned = 'y'
                        43   group by grouping sets((id,tsname,fsbytes,fsname),(fsname))
                        44   order by fsname
                        45       , id
                        46  /

                                ID TSNAME     SUM(FBYTES) FSNAME        FSBYTES
                        ---------- ---------- ----------- ---------- ----------
                                 2 SYSTEM             512 /fs1             1536
                                 7 UNDO               512 /fs1             1536
                                11 TOOLS              256 /fs1             1536
                                   Total /fs1        1280 /fs1
                                 1 SYSTEM             512 /fs2             2536
                                 5 TOOLS              768 /fs2             2536
                                 6 USERS              512 /fs2             2536
                                 9 UNDO               512 /fs2             2536
                                   Total /fs2        2304 /fs2
                                 3 TOOLS              256 /fs3              536
                                   Total /fs3         256 /fs3
                                 4 TOOLS              512 /fs4             1536
                                 8 UNDO               512 /fs4             1536
                                   Total /fs4        1024 /fs4
                                10 TOOLS              256 /fs5              536
                                   Total /fs5         256 /fs5

                        16 rijen zijn geselecteerd.
                        Groet,
                        Rob.
                        • 9. Re: can this be done in sql (without the PL) ? distribute files
                          Rob van Wijk
                          I just remembered what to do about self cyclic rules: use ordered rule evaluation.

                          So here is the bonus variant taking the raidlevel into account:
                          SQL> select id
                            2       , decode(grouping(tsname),1,'Total ' || max(fsname),tsname) tsname
                            3       , sum(fbytes)
                            4       , raidlevel
                            5       , fsname
                            6       , fsbytes
                            7    from ( select *
                            8             from files f
                            9                , fs
                          10            where f.raidlevel = fs.raidlevel
                          11            model
                          12                  partition by (f.raidlevel)
                          13                  dimension by
                          14                  ( dense_rank() over (partition by f.raidlevel order by f.bytes desc, f.id) - 1 x
                          15                  , dense_rank() over (partition by f.raidlevel order by fs.bytes desc, fs.fsname) - 1 y
                          16                  )
                          17                  measures
                          18                  ( ' ' assigned
                          19                  , f.id
                          20                  , f.bytes fbytes
                          21                  , f.tsname
                          22                  , fsname
                          23                  , fs.bytes fsbytes
                          24                  , fs.bytes tempfsbytes
                          25                  , fs.bytes maxbytes
                          26                  )
                          27                  rules iterate(1000) until (tsname[iteration_number,0] is null)
                          28                  ( tempfsbytes[iteration_number,any] order by y =
                          29                    case
                          30                    when assigned[iteration_number-1,cv()] = 'y'
                          31                    then tempfsbytes[iteration_number-1,cv()] - fbytes[iteration_number-1,cv()]
                          32                    else tempfsbytes [nvl(nullif(iteration_number-1,-1),0),cv()]
                          33                    end
                          34                  , maxbytes[iteration_number,any] order by y = max(tempfsbytes)[iteration_number,any]
                          35                  , assigned[iteration_number,any] order by y =
                          36                    case
                          37                    when  maxbytes[iteration_number,cv()] = tempfsbytes[iteration_number,cv()]
                          38                      and nvl(max(assigned)[iteration_number,y<cv()],'n') = 'n'
                          39                    then 'y'
                          40                    else 'n'
                          41                    end
                          42                  )
                          43         )
                          44   where assigned = 'y'
                          45   group by grouping sets((id,tsname,raidlevel,fsbytes,fsname),(fsname))
                          46   order by fsname
                          47       , id
                          48  /

                                  ID TSNAME     SUM(FBYTES)  RAIDLEVEL FSNAME        FSBYTES
                          ---------- ---------- ----------- ---------- ---------- ----------
                                   2 SYSTEM             512          5 /fs1             1536
                                   8 UNDO               512          5 /fs1             1536
                                  11 TOOLS              256          5 /fs1             1536
                                     Total /fs1        1280            /fs1
                                   1 SYSTEM             512          5 /fs2             2536
                                   3 TOOLS              256          5 /fs2             2536
                                   5 TOOLS              768          5 /fs2             2536
                                   7 UNDO               512          5 /fs2             2536
                                     Total /fs2        2048            /fs2
                                  10 TOOLS              256          5 /fs3              536
                                     Total /fs3         256            /fs3
                                   4 TOOLS              512          5 /fs4             1536
                                   9 UNDO               512          5 /fs4             1536
                                     Total /fs4        1024            /fs4
                                   6 USERS              512         10 /fs5              536
                                     Total /fs5         512            /fs5

                          16 rijen zijn geselecteerd.
                          Regards,
                          Rob.
                          • 10. Re: can this be done in sql (without the PL) ? distribute files
                            Ronald Rood
                            Hee Rob,
                            this is fast, like we agreed before: analytics rocks ! I ran your qry on a more realistic sample of data (437 files and 10 fs) and got results within the minute.
                            1 statement(s) executed, 447 row(s) affected, exec/fetch time: 46.036/0.859 sec
                            no complaints for that at all.

                            this was on a 8 cpu linux box with 2.6Ghz amd's and 64G memory.

                            One problem I have with the results is that now it assigns files of the same tsname a bit too quick to the same fs. A file can be assigned to a fs but preferably to one that has not already been used for the tsname where the file belongs to. Ofcourse on some point this can be impossible (more files than we have fs or size constraints) but now it's too open. (still it's better than the best solution I made before so I am already quite happy)

                            I think this has to do with the chosen strategy. This does not deal with tsname at all. What happens when the assignment order changes to 'tsname,bytes' ? Looking at the 'ORA-32637: Self cyclic rule in sequential order MODEL' I get there has to be done more than just change to:
                            model
                            dimension by
                            ( dense_rank() over (order by f.tsname,f.bytes desc, f.id) - 1 x
                            , dense_rank() over (order by fs.bytes desc, fs.fsname) - 1 y
                            )


                            groeten,
                            Ronald.
                            • 11. Re: can this be done in sql (without the PL) ? distribute files
                              Alessandro Rossi
                              Actually there is no answer to this question, it is unknown about the existence of an easy solution with any programming language of it.

                              If someone would find an answer ( but it has to be right ) about this problem, there are $1,000,000 for him :-) plus probably some dedicated pages in many computer science books for the next few centuries.

                              http://en.wikipedia.org/wiki/Clay_Mathematics_Institute#Millennium_Prize_Problems

                              This is an NP problem.

                              There may be many way to find some solutions that come very close to the best one, but they must be analyzed because sometimes an algorithm that looks good in one case can't be as good as before all the times.

                              Plus in your question there are more than one parameters to evaluate and it becomes harder to take all of them with the right consideration in the building phase ( the enumeration of the possible distributions ).

                              Bye Alessandro
                              • 12. Re: can this be done in sql (without the PL) ? distribute files
                                Rob van Wijk
                                Hi colleague,

                                As said by Alessandro more or less, your criteria are a tradeoff between the ability of spreading out the files over the fs having approximately equal space left and between spreading out the tablespaces over the fs. The next solution is leads to a more optimal spread of the tablespaces over the fs, if and only if, the capacity of the different fs's are not too much apart.

                                How is this query doing against your real life data set?
                                SQL> select id
                                  2       , decode(grouping(tsname),1,'Total ' || max(fsname),tsname) tsname
                                  3       , sum(fbytes)
                                  4       , raidlevel
                                  5       , fsname
                                  6       , fsbytes
                                  7    from ( select *
                                  8             from ( select files.*
                                  9                         , avg(bytes) over (partition by tsname) avgbytes
                                10                      from files
                                11                  ) f
                                12                , fs
                                13            where f.raidlevel = fs.raidlevel
                                14            model
                                15                  partition by (f.raidlevel)
                                16                  dimension by
                                17                  ( dense_rank() over (partition by f.raidlevel order by avgbytes desc, tsname, f.bytes desc, f.id) - 1 x
                                18                  , dense_rank() over (partition by f.raidlevel order by fs.bytes desc, fs.fsname) - 1 y
                                19                  )
                                20                  measures
                                21                  ( ' ' assigned
                                22                  , f.id
                                23                  , f.bytes fbytes
                                24                  , f.tsname
                                25                  , fsname
                                26                  , fs.bytes fsbytes
                                27                  , fs.bytes tempfsbytes
                                28                  , fs.bytes maxbytes
                                29                  )
                                30                  rules iterate(1000) until (tsname[iteration_number,0] is null)
                                31                  ( tempfsbytes[iteration_number,any] order by y =
                                32                    case
                                33                    when assigned[iteration_number-1,cv()] = 'y'
                                34                    then tempfsbytes[iteration_number-1,cv()] - fbytes[iteration_number-1,cv()]
                                35                    else tempfsbytes [nvl(nullif(iteration_number-1,-1),0),cv()]
                                36                    end
                                37                  , maxbytes[iteration_number,any] order by y = max(tempfsbytes)[iteration_number,any]
                                38                  , assigned[iteration_number,any] order by y =
                                39                    case
                                40                    when  maxbytes[iteration_number,cv()] = tempfsbytes[iteration_number,cv()]
                                41                      and nvl(max(assigned)[iteration_number,y<cv()],'n') = 'n'
                                42                    then 'y'
                                43                    else 'n'
                                44                    end
                                45                  )
                                46         )
                                47   where assigned = 'y'
                                48   group by grouping sets((id,tsname,raidlevel,fsbytes,fsname),(fsname))
                                49   order by fsname
                                50       , id
                                51  /

                                        ID TSNAME     SUM(FBYTES)  RAIDLEVEL FSNAME        FSBYTES
                                ---------- ---------- ----------- ---------- ---------- ----------
                                         5 TOOLS              768          5 /fs1             1536
                                         7 UNDO               512          5 /fs1             1536
                                           Total /fs1        1280            /fs1
                                         1 SYSTEM             512          5 /fs2             2536
                                         2 SYSTEM             512          5 /fs2             2536
                                         3 TOOLS              256          5 /fs2             2536
                                         9 UNDO               512          5 /fs2             2536
                                        10 TOOLS              256          5 /fs2             2536
                                           Total /fs2        2048            /fs2
                                        11 TOOLS              256          5 /fs3              536
                                           Total /fs3         256            /fs3
                                         4 TOOLS              512          5 /fs4             1536
                                         8 UNDO               512          5 /fs4             1536
                                           Total /fs4        1024            /fs4
                                         6 USERS              512         10 /fs5              536
                                           Total /fs5         512            /fs5

                                16 rijen zijn geselecteerd.
                                Groet,
                                Rob.
                                • 13. Re: can this be done in sql (without the PL) ? distribute files
                                  Rob van Wijk
                                  Ronald,

                                  We discussed a last enhancement to show when it is not possible to distribute the files anymore due to a lack of capacity in fs.

                                  I changed two values from your insert statements to get such a situation, please see below:

                                  I also managed to get rid of one redundant measures maxbytes.

                                  So the final (?) solution becomes:
                                  SQL> create table files
                                  ...
                                  <snipped>
                                  ...
                                  SQL> insert into files values (3,'TOOLS', 768,5);

                                  1 rij is aangemaakt.
                                  ...
                                  <snipped>
                                  ...
                                  SQL> insert into fs values ('/fs2',536,5);

                                  1 rij is aangemaakt.
                                  ...
                                  <snipped>
                                  ...

                                  SQL> select id
                                    2       , decode(grouping(tsname),1,'Total',tsname) tsname
                                    3       , raidlevel
                                    4       , sum(fbytes)
                                    5       , fsname
                                    6       , fsbytes
                                    7    from ( select id
                                    8                , tsname
                                    9                , raidlevel
                                  10                , fbytes
                                  11                , case assigned when 'y' then fsname when 'x' then 'doesn''t fit' end fsname
                                  12                , case assigned when 'y' then fsbytes end fsbytes
                                  13                , assigned
                                  14             from ( select files.*
                                  15                         , avg(bytes) over (partition by tsname) avgbytes
                                  16                      from files
                                  17                  ) f
                                  18                , fs
                                  19            where f.raidlevel = fs.raidlevel
                                  20            model
                                  21                  partition by (f.raidlevel)
                                  22                  dimension by
                                  23                  ( dense_rank() over (partition by f.raidlevel order by avgbytes desc, tsname, f.bytes desc, f.id) - 1 x
                                  24                  , dense_rank() over (partition by f.raidlevel order by fs.bytes desc, fs.fsname) - 1 y
                                  25                  )
                                  26                  measures
                                  27                  ( ' ' assigned
                                  28                  , f.id
                                  29                  , f.bytes fbytes
                                  30                  , f.tsname
                                  31                  , fs.fsname
                                  32                  , fs.bytes fsbytes
                                  33                  , fs.bytes tempfsbytes
                                  34                  )
                                  35                  rules iterate(1000) until (tsname[iteration_number,0] is null)
                                  36                  ( tempfsbytes[iteration_number,any] =
                                  37                    case
                                  38                    when assigned[iteration_number-1,cv()] = 'y'
                                  39                    then tempfsbytes[iteration_number-1,cv()] - fbytes[iteration_number-1,cv()]
                                  40                    else tempfsbytes [nvl(nullif(iteration_number-1,-1),0),cv()]
                                  41                    end
                                  42                  , assigned[iteration_number,any] =
                                  43                    case
                                  44                    when  max(tempfsbytes)[iteration_number,any] = tempfsbytes[iteration_number,cv()]
                                  45                      and nvl(max(assigned)[iteration_number,y<cv()],'n') = 'n'
                                  46                    then
                                  47                      case
                                  48                      when tempfsbytes[iteration_number,cv()] < fbytes[iteration_number,cv()]
                                  49                      then 'x'
                                  50                      else 'y'
                                  51                      end
                                  52                    else 'n'
                                  53                    end
                                  54                  )
                                  55         )
                                  56   where assigned in ('y','x')
                                  57   group by grouping sets((id,tsname,fsbytes,raidlevel,fsname),(fsname))
                                  58   order by fsname
                                  59       , id
                                  60  /

                                          ID TSNAME       RAIDLEVEL SUM(FBYTES) FSNAME         FSBYTES
                                  ---------- ----------- ---------- ----------- ----------- ----------
                                           7 UNDO                 5         512 doesn't fit
                                           8 UNDO                 5         512 doesn't fit
                                           9 UNDO                 5         512 doesn't fit
                                             Total                         1536 doesn't fit
                                           1 SYSTEM               5         512 /fs1              1536
                                           3 TOOLS                5         768 /fs1              1536
                                             Total                         1280 /fs1
                                           4 TOOLS                5         512 /fs2               536
                                             Total                          512 /fs2
                                          10 TOOLS                5         256 /fs3               536
                                          11 TOOLS                5         256 /fs3               536
                                             Total                          512 /fs3
                                           2 SYSTEM               5         512 /fs4              1536
                                           5 TOOLS                5         768 /fs4              1536
                                             Total                         1280 /fs4
                                           6 USERS               10         512 /fs5               536
                                             Total                          512 /fs5

                                  17 rijen zijn geselecteerd.
                                  Groet,
                                  Rob.

                                  Message was edited by:
                                  Rob van Wijk

                                  Please note that a strange >< sign appears in the code, where I wrote and meant a < sign...
                                  • 14. Re: can this be done in sql (without the PL) ? distribute files
                                    Ronald Rood
                                    danks Rob,

                                    so the short answer to my question seems to be: yes.
                                    At least for me it's very acceptable and usable. Really amazing what can be done with analytics. I am quite sure that this code is usable in more allocation problems. The code is fast and compact. I like it.

                                    groeten,
                                    Ronald.