1 2 Previous Next 15 Replies Latest reply: Jul 2, 2012 9:32 AM by 946222 RSS

    Insert query

    user07118719
      Hi All,

      DB version :10.2

      I have table with 8 columns namely num1,num2,num3,num4,num5,num6,num7,num8. I have to insert all these column values into one table with column "Num".
      create table nums_tb
      (
      key char(16),
      num1  char(16)
      ,num2  char(16)
      ,num3  char(16)
      ,num4  char(16)
      ,num5  char(16)
      ,num6   char(16)
      ,num7   char(16)
      ,num8   char(16)
      )   ;
      
      create table nums_tmp
      (
      num1  char(16)
      )   ;
      
      I have to insert all column values into nums_tmp. I have written query like this:
      
      insert into nums_tmp
      select * from (
      select distinct num1
      from  nums_tb
      
      union
      select distinct num2
      from  nums_tb
      union
      
      select distinct num3
      from  nums_tb
      union
      
      select distinct num4
      from  nums_tb
      union
      
      select distinct num5
      from  nums_tb
      union
      
      select distinct num6
      from  nums_tb
      union
      
      select distinct num7
      from  nums_tb
      union
      
      select distinct num8
      from  nums_tb);
      The table nums_tb has 20 million records in it. So this insert is taking much of time around 5 hrs. Is there any better way to write this query to achive good performance?

      Regards,
      SK
        • 1. Re: Insert query
          sb92075
          SK wrote:
          Hi All,

          DB version :10.2

          I have table with 8 columns namely num1,num2,num3,num4,num5,num6,num7,num8. I have to insert all these column values into one table with column "Num".
          it might go faster after DISTINCT has been removed or eliminated
          • 2. Re: Insert query
            user07118719
            Thanks for the update.

            I have already tried with this. It took little bit less time. completed in 4:30 hrs. But taking 4:30 hrs is huge.

            Is there any way that we can write query to improve performance?

            Appreciate help.


            Regards,
            SK
            • 3. Re: Insert query
              sb92075
              SK wrote:
              Thanks for the update.

              I have already tried with this. It took little bit less time. completed in 4:30 hrs. But taking 4:30 hrs is huge.

              Is there any way that we can write query to improve performance?
              do 8 different/separate INSERT statements
              • 4. Re: Insert query
                Alex Fatkulin
                Do you really need to insert only unique values?
                • 5. Re: Insert query
                  user07118719
                  I can insert duplicate values also. But I have to delete duplicates once insertion is completed to make sure that table has unique values to define PK on it.

                  Regards,
                  SK
                  • 6. Re: Insert query
                    AdamMartin
                    Is there any way that we can write query to improve performance?
                    Put distinct on the outer query not each inner query, using union all in the inline view.

                    Use parallel DML. Consider using a manual sort area size.
                    alter session enable parallel dml;
                    
                    insert /*+ parallel */ 
                    into nums_tmp
                    select distinct nums.num1
                    from (
                      select num1 num1
                      from  nums_tb
                      union all
                      select num2 num1
                      from nums_tb
                      union all
                      select num3 num1
                      from nums_tb
                      union all
                      select num4 num1
                      from nums_tb
                      union all
                      select num5 num1
                      from nums_tb
                      union all
                      select num6 num1
                      from nums_tb
                      union all
                      select num7 num1
                      from nums_tb
                      union all
                      select num8 num1
                      from  nums_tb) nums;
                    • 7. Re: Insert query
                      Alex Fatkulin
                      That'll make it tricky as duplicate data will have to be filtered at some point.

                      See if something like this will work better for you (though it may not, depending on what data/how many duplicates you have):
                      begin
                           insert /*+ append */ into nums_tmp select distinct num1 from nums_tb;
                           commit;
                      
                           for i in 2 .. 7
                           loop
                                execute immediate 'merge /*+ append */ into nums_tmp t using (select distinct num'||to_char(i)||' num from nums_tb) v on ( t.num1=v.num ) when not matched then insert values (v.num)';
                                commit;
                           end loop;
                      end;
                      Consider switching to manual pga management for the session which does this and increasing sort/hash-area sizes so there is enough memory for oracle to do SORT/HASH UNIQUE.

                      You can also try your statement in parallel, i.e.:
                      insert /*+ append */ into nums_tmp
                      select /*+ parallel(8) */ * from (
                      select num1
                      from  nums_tb
                      union
                      select num2
                      from  nums_tb
                      union
                      select num3
                      from  nums_tb
                      union
                      select num4
                      from  nums_tb
                      union
                      select num5
                      from  nums_tb
                      union
                      select num6
                      from  nums_tb
                      union
                      select num7
                      from  nums_tb
                      union
                      select num8
                      from  nums_tb);
                      that's just some obvious stuff I'd try first
                      • 8. Re: Insert query
                        user07118719
                        Thanks .
                        • 9. Re: Insert query
                          Alex Fatkulin
                          Adam, I think this will mostly be HASH UNIQUE vs SORT UNIQUE (i.e. algorithm) question -- which one is faster?

                          If you look at the plan with UNION you'll notice that it essentially does UNION ALL followed by sorting:
                          SQL> select num from (
                          select num1 num
                          from  nums_tb
                          union
                          select num2 num
                          from  nums_tb);  2    3    4    5    6
                          
                          Execution Plan
                          ----------------------------------------------------------
                          Plan hash value: 1897821920
                          
                          ----------------------------------------------------------------------------------------
                          | Id  | Operation            | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
                          ----------------------------------------------------------------------------------------
                          |   0 | SELECT STATEMENT     |         | 20000 |   351K|       |   255   (2)| 00:00:01 |
                          |   1 |  VIEW                |         | 20000 |   351K|       |   255   (2)| 00:00:01 |
                          |   2 |   SORT UNIQUE        |         | 20000 |   332K|   496K|   255  (51)| 00:00:01 |
                          |   3 |    UNION-ALL         |         |       |       |       |            |          |
                          |   4 |     TABLE ACCESS FULL| NUMS_TB | 10000 |   166K|       |    68   (0)| 00:00:01 |
                          |   5 |     TABLE ACCESS FULL| NUMS_TB | 10000 |   166K|       |    68   (0)| 00:00:01 |
                          ----------------------------------------------------------------------------------------
                          though I don't recall whether this behavior is common across most version or not.

                          Getting DISTINCT on "top of the view" will allow the optimizer to use HASH UNIQUE, if it wants to.
                          • 10. Re: Insert query
                            AdamMartin
                            Adam, I think this will mostly be HASH UNIQUE vs SORT UNIQUE (i.e. algorithm) question -- which one is faster?
                            Why choose? He can use manual workareas for both a larger hash area size and a larger sort area size and let the database decide. It will only allocate what it actually needs to use. But I would caution using both parallel dml and manual workarea sizes, as the hash/sort sizes specified will apply to each process. So a 2 GB sort size x 16 parallel processes adds up quick. But by the same token, the more parallel processes there are, the fewer rows each process will have to hash/sort as it gets divided into smaller chunks.

                            Edit: By using caution, I'm not saying, don't do it. I'm just saying that he runs the risk of running out of process memory. Hopefully he tries it and reports back. It would be nice if the whole thing could be sorted in memory.
                            • 11. Re: Insert query
                              Etbin
                              Seems somehow silly, but applying some string-to-rows technique to
                              select '~'||num1||'~'||num2||'~'||num3||'~'||num4||'~'||num5||'~'||num6||'~'||num7||'~'||num8||'~' s
                              /*          num1||'~'||num2||'~'||num3||'~'||num4||'~'||num5||'~'||num6||'~'||num7||'~'||num8 s  if suits you better */
                                from nums_tb
                              you'll scan nums_tb only once ;)

                              Regards

                              Etbin
                              • 12. Re: Insert query
                                946222
                                You can add an intermediary step and resolve the first step with a single insert all statement. The additional step will be required to filter the duplicates. Something like this:
                                SQL> create table nums_tb
                                  2  (
                                  3  key char(16),
                                  4  num1  char(16)
                                  5  ,num2  char(16)
                                  6  ,num3  char(16)
                                  7  ,num4  char(16)
                                  8  ,num5  char(16)
                                  9  ,num6     char(16)
                                 10  ,num7     char(16)
                                 11  ,num8     char(16)
                                 12  )      ;
                                
                                Table created.
                                
                                SQL> 
                                SQL> create table nums_tmp
                                  2  (
                                  3  num1  char(16)
                                  4  )      ;
                                
                                Table created.
                                
                                SQL> 
                                SQL> insert into nums_tb
                                  2  select rownum as key,
                                  3           rownum+1,
                                  4           rownum+2,
                                  5           rownum+3,
                                  6           rownum+4,
                                  7           rownum+5,
                                  8           rownum+6,
                                  9           rownum+7,
                                 10           rownum+8
                                 11    from dual
                                 12  connect by level <= 100;
                                
                                100 rows created.
                                
                                SQL> 
                                SQL> commit;
                                
                                Commit complete.
                                
                                SQL> 
                                SQL> insert all
                                  2    into nums_tmp values(num1)
                                  3    into nums_tmp values(num2)
                                  4    into nums_tmp values(num3)
                                  5    into nums_tmp values(num4)
                                  6    into nums_tmp values(num5)
                                  7    into nums_tmp values(num6)
                                  8    into nums_tmp values(num7)
                                  9    into nums_tmp values(num8)
                                 10  select * from nums_tb;
                                
                                800 rows created.
                                
                                SQL> 
                                SQL> commit;
                                
                                Commit complete.
                                
                                SQL> 
                                SQL> select count(*) from nums_tmp;
                                
                                  COUNT(*)
                                ----------
                                       800
                                
                                SQL> create table nums_tmp2 nologging as
                                  2  select distinct num1 from nums_tmp;
                                
                                Table created.
                                
                                SQL> select count(*) from nums_tmp2;
                                
                                  COUNT(*)
                                ----------
                                       107
                                • 13. Re: Insert query
                                  rp0428
                                  >
                                  I have table with 8 columns namely num1,num2,num3,num4,num5,num6,num7,num8. I have to insert all these column values into one table with column "Num".
                                  >
                                  One simple way is define the primary key on the table and create an error logging table for the primary key column.

                                  Then just do 8 simple inserts, one after the other. Oracle will put any duplicate values into the error log table and you can just ignore that table when you are done.

                                  No UNIONs, no DISTINCT, no fuss, no muss.

                                  See Inserting Data with DML Error Logging in the DBA doc
                                  http://docs.oracle.com/cd/B28359_01/server.111/b28310/tables004.htm
                                  • 14. Re: Insert query
                                    AdamMartin
                                    One simple way is define the primary key on the table and create an error logging table for the primary key column.
                                    No UNIONs, no DISTINCT, no fuss, no muss.
                                    Yes, but the issue was not one of fuss or muss, but of performance. Will this method outperform the others?
                                    1 2 Previous Next