This discussion is archived
1 2 Previous Next 15 Replies Latest reply: Jul 2, 2012 7:32 AM by 946222 RSS

Insert query

user07118719 Newbie
Currently Being Moderated
Hi All,

DB version :10.2

I have table with 8 columns namely num1,num2,num3,num4,num5,num6,num7,num8. I have to insert all these column values into one table with column "Num".
create table nums_tb
(
key char(16),
num1  char(16)
,num2  char(16)
,num3  char(16)
,num4  char(16)
,num5  char(16)
,num6   char(16)
,num7   char(16)
,num8   char(16)
)   ;

create table nums_tmp
(
num1  char(16)
)   ;

I have to insert all column values into nums_tmp. I have written query like this:

insert into nums_tmp
select * from (
select distinct num1
from  nums_tb

union
select distinct num2
from  nums_tb
union

select distinct num3
from  nums_tb
union

select distinct num4
from  nums_tb
union

select distinct num5
from  nums_tb
union

select distinct num6
from  nums_tb
union

select distinct num7
from  nums_tb
union

select distinct num8
from  nums_tb);
The table nums_tb has 20 million records in it. So this insert is taking much of time around 5 hrs. Is there any better way to write this query to achive good performance?

Regards,
SK
  • 1. Re: Insert query
    sb92075 Guru
    Currently Being Moderated
    SK wrote:
    Hi All,

    DB version :10.2

    I have table with 8 columns namely num1,num2,num3,num4,num5,num6,num7,num8. I have to insert all these column values into one table with column "Num".
    it might go faster after DISTINCT has been removed or eliminated
  • 2. Re: Insert query
    user07118719 Newbie
    Currently Being Moderated
    Thanks for the update.

    I have already tried with this. It took little bit less time. completed in 4:30 hrs. But taking 4:30 hrs is huge.

    Is there any way that we can write query to improve performance?

    Appreciate help.


    Regards,
    SK
  • 3. Re: Insert query
    sb92075 Guru
    Currently Being Moderated
    SK wrote:
    Thanks for the update.

    I have already tried with this. It took little bit less time. completed in 4:30 hrs. But taking 4:30 hrs is huge.

    Is there any way that we can write query to improve performance?
    do 8 different/separate INSERT statements
  • 4. Re: Insert query
    Alex Fatkulin Explorer
    Currently Being Moderated
    Do you really need to insert only unique values?
  • 5. Re: Insert query
    user07118719 Newbie
    Currently Being Moderated
    I can insert duplicate values also. But I have to delete duplicates once insertion is completed to make sure that table has unique values to define PK on it.

    Regards,
    SK
  • 6. Re: Insert query
    AdamMartin Pro
    Currently Being Moderated
    Is there any way that we can write query to improve performance?
    Put distinct on the outer query not each inner query, using union all in the inline view.

    Use parallel DML. Consider using a manual sort area size.
    alter session enable parallel dml;
    
    insert /*+ parallel */ 
    into nums_tmp
    select distinct nums.num1
    from (
      select num1 num1
      from  nums_tb
      union all
      select num2 num1
      from nums_tb
      union all
      select num3 num1
      from nums_tb
      union all
      select num4 num1
      from nums_tb
      union all
      select num5 num1
      from nums_tb
      union all
      select num6 num1
      from nums_tb
      union all
      select num7 num1
      from nums_tb
      union all
      select num8 num1
      from  nums_tb) nums;
  • 7. Re: Insert query
    Alex Fatkulin Explorer
    Currently Being Moderated
    That'll make it tricky as duplicate data will have to be filtered at some point.

    See if something like this will work better for you (though it may not, depending on what data/how many duplicates you have):
    begin
         insert /*+ append */ into nums_tmp select distinct num1 from nums_tb;
         commit;
    
         for i in 2 .. 7
         loop
              execute immediate 'merge /*+ append */ into nums_tmp t using (select distinct num'||to_char(i)||' num from nums_tb) v on ( t.num1=v.num ) when not matched then insert values (v.num)';
              commit;
         end loop;
    end;
    Consider switching to manual pga management for the session which does this and increasing sort/hash-area sizes so there is enough memory for oracle to do SORT/HASH UNIQUE.

    You can also try your statement in parallel, i.e.:
    insert /*+ append */ into nums_tmp
    select /*+ parallel(8) */ * from (
    select num1
    from  nums_tb
    union
    select num2
    from  nums_tb
    union
    select num3
    from  nums_tb
    union
    select num4
    from  nums_tb
    union
    select num5
    from  nums_tb
    union
    select num6
    from  nums_tb
    union
    select num7
    from  nums_tb
    union
    select num8
    from  nums_tb);
    that's just some obvious stuff I'd try first
  • 8. Re: Insert query
    user07118719 Newbie
    Currently Being Moderated
    Thanks .
  • 9. Re: Insert query
    Alex Fatkulin Explorer
    Currently Being Moderated
    Adam, I think this will mostly be HASH UNIQUE vs SORT UNIQUE (i.e. algorithm) question -- which one is faster?

    If you look at the plan with UNION you'll notice that it essentially does UNION ALL followed by sorting:
    SQL> select num from (
    select num1 num
    from  nums_tb
    union
    select num2 num
    from  nums_tb);  2    3    4    5    6
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1897821920
    
    ----------------------------------------------------------------------------------------
    | Id  | Operation            | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT     |         | 20000 |   351K|       |   255   (2)| 00:00:01 |
    |   1 |  VIEW                |         | 20000 |   351K|       |   255   (2)| 00:00:01 |
    |   2 |   SORT UNIQUE        |         | 20000 |   332K|   496K|   255  (51)| 00:00:01 |
    |   3 |    UNION-ALL         |         |       |       |       |            |          |
    |   4 |     TABLE ACCESS FULL| NUMS_TB | 10000 |   166K|       |    68   (0)| 00:00:01 |
    |   5 |     TABLE ACCESS FULL| NUMS_TB | 10000 |   166K|       |    68   (0)| 00:00:01 |
    ----------------------------------------------------------------------------------------
    though I don't recall whether this behavior is common across most version or not.

    Getting DISTINCT on "top of the view" will allow the optimizer to use HASH UNIQUE, if it wants to.
  • 10. Re: Insert query
    AdamMartin Pro
    Currently Being Moderated
    Adam, I think this will mostly be HASH UNIQUE vs SORT UNIQUE (i.e. algorithm) question -- which one is faster?
    Why choose? He can use manual workareas for both a larger hash area size and a larger sort area size and let the database decide. It will only allocate what it actually needs to use. But I would caution using both parallel dml and manual workarea sizes, as the hash/sort sizes specified will apply to each process. So a 2 GB sort size x 16 parallel processes adds up quick. But by the same token, the more parallel processes there are, the fewer rows each process will have to hash/sort as it gets divided into smaller chunks.

    Edit: By using caution, I'm not saying, don't do it. I'm just saying that he runs the risk of running out of process memory. Hopefully he tries it and reports back. It would be nice if the whole thing could be sorted in memory.
  • 11. Re: Insert query
    Etbin Guru
    Currently Being Moderated
    Seems somehow silly, but applying some string-to-rows technique to
    select '~'||num1||'~'||num2||'~'||num3||'~'||num4||'~'||num5||'~'||num6||'~'||num7||'~'||num8||'~' s
    /*          num1||'~'||num2||'~'||num3||'~'||num4||'~'||num5||'~'||num6||'~'||num7||'~'||num8 s  if suits you better */
      from nums_tb
    you'll scan nums_tb only once ;)

    Regards

    Etbin
  • 12. Re: Insert query
    946222 Explorer
    Currently Being Moderated
    You can add an intermediary step and resolve the first step with a single insert all statement. The additional step will be required to filter the duplicates. Something like this:
    SQL> create table nums_tb
      2  (
      3  key char(16),
      4  num1  char(16)
      5  ,num2  char(16)
      6  ,num3  char(16)
      7  ,num4  char(16)
      8  ,num5  char(16)
      9  ,num6     char(16)
     10  ,num7     char(16)
     11  ,num8     char(16)
     12  )      ;
    
    Table created.
    
    SQL> 
    SQL> create table nums_tmp
      2  (
      3  num1  char(16)
      4  )      ;
    
    Table created.
    
    SQL> 
    SQL> insert into nums_tb
      2  select rownum as key,
      3           rownum+1,
      4           rownum+2,
      5           rownum+3,
      6           rownum+4,
      7           rownum+5,
      8           rownum+6,
      9           rownum+7,
     10           rownum+8
     11    from dual
     12  connect by level <= 100;
    
    100 rows created.
    
    SQL> 
    SQL> commit;
    
    Commit complete.
    
    SQL> 
    SQL> insert all
      2    into nums_tmp values(num1)
      3    into nums_tmp values(num2)
      4    into nums_tmp values(num3)
      5    into nums_tmp values(num4)
      6    into nums_tmp values(num5)
      7    into nums_tmp values(num6)
      8    into nums_tmp values(num7)
      9    into nums_tmp values(num8)
     10  select * from nums_tb;
    
    800 rows created.
    
    SQL> 
    SQL> commit;
    
    Commit complete.
    
    SQL> 
    SQL> select count(*) from nums_tmp;
    
      COUNT(*)
    ----------
           800
    
    SQL> create table nums_tmp2 nologging as
      2  select distinct num1 from nums_tmp;
    
    Table created.
    
    SQL> select count(*) from nums_tmp2;
    
      COUNT(*)
    ----------
           107
  • 13. Re: Insert query
    rp0428 Guru
    Currently Being Moderated
    >
    I have table with 8 columns namely num1,num2,num3,num4,num5,num6,num7,num8. I have to insert all these column values into one table with column "Num".
    >
    One simple way is define the primary key on the table and create an error logging table for the primary key column.

    Then just do 8 simple inserts, one after the other. Oracle will put any duplicate values into the error log table and you can just ignore that table when you are done.

    No UNIONs, no DISTINCT, no fuss, no muss.

    See Inserting Data with DML Error Logging in the DBA doc
    http://docs.oracle.com/cd/B28359_01/server.111/b28310/tables004.htm
  • 14. Re: Insert query
    AdamMartin Pro
    Currently Being Moderated
    One simple way is define the primary key on the table and create an error logging table for the primary key column.
    No UNIONs, no DISTINCT, no fuss, no muss.
    Yes, but the issue was not one of fuss or muss, but of performance. Will this method outperform the others?
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points