
    tuning help

    user12869307
      Hi,

      select * from v$version

      BANNER
      Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
      PL/SQL Release 10.2.0.4.0 - Production
      CORE     10.2.0.4.0     Production
      TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
      NLSRTL Version 10.2.0.4.0 - Production

      The above is the version I work on. I have a requirement for which I don't have the tables yet, since the project is in the proposal stage, but they have given approximate table sizes as below.
      For example, tables a, b, c and d have to be joined together to get an output (one or two columns from each table, or some aggregate function).
      This result must be inserted into a new table 'E'.

      Right now we have made a query like

      insert into E(col1,col2,col3...)
      select a.col1,
      b.col2,
      c.col3,
      d.col4
      from a,b,c,d
      where a.col1 = b.col1
      and b.col2 = c.col2
      and c.col3 = d.col3
      --- we can't check the performance of this now, since we do not have enough data.
      One doubt about the above insert: will the select query be executed only once and then all the rows inserted into table E, or will the select and insert happen in parallel?
      --- tables a and b have approximately 100 million records each, and c and d approximately 1.5 million records.
      How can we approach this?

      Can we use an insert like the one above, or would using a cursor and bulk collect improve the performance?
      If there is any other way, please suggest it too. Thanks for your help.
        • 1. Re: tuning help
          00125
          Hi,

          For inserting millions of records, you'd better try a cursor with BULK COLLECT and FORALL.
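
          (For reference, a minimal sketch of that approach, reusing the placeholder table and column names from the original post; it assumes table E has exactly those four columns, so each fetched record can be inserted as-is.)

          declare
            cursor c_src is
              select a.col1, b.col2, c.col3, d.col4
                from a, b, c, d
               where a.col1 = b.col1
                 and b.col2 = c.col2
                 and c.col3 = d.col3;
            type t_rows is table of c_src%rowtype;
            l_rows t_rows;
          begin
            open c_src;
            loop
              -- fetch in batches; the LIMIT clause keeps PGA memory use bounded
              fetch c_src bulk collect into l_rows limit 10000;
              exit when l_rows.count = 0;
              -- one bulk bind per batch instead of one insert per row
              forall i in 1 .. l_rows.count
                insert into e values l_rows(i);
            end loop;
            close c_src;
            commit;
          end;
          /

          As the later replies point out, a plain INSERT ... SELECT is normally faster; this only pays off when each row needs procedural processing that cannot be expressed in SQL.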
          • 2. Re: tuning help
            Keith Jamieson
            What's stopping you from creating the 4 tables, populating them with 100 million or 1.5 million rows as appropriate, and then testing the join?

            The direct sql will perform much faster than bulk collect.
            • 3. Re: tuning help
              user12869307
              We do not have proper data yet, since the project is in the proposal stage. Based on our query-writing skills with limited information, we have to win the client's confidence and get the project; then they will start providing the data and we can run full tests. Thanks for your reply. For now I will keep it as a simple select.
              • 4. Re: tuning help
                user12869307
                One more thing I asked about: when we do

                insert into target_table_name
                select * from source_table;

                how will this query execute? Will the select run completely first and then the rows be inserted into the table,
                or will the select and insert happen in parallel?
                • 5. Re: tuning help
                  Manik
                  Definitely plain SQL. You may also look at the APPEND hint in the documentation for the insert.
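
                  (For the INSERT ... SELECT pattern asked about above, a direct-path version would look something like this sketch; APPEND loads the rows above the table's high-water mark, bypassing the buffer cache.)

                  insert /*+ APPEND */ into target_table_name
                  select * from source_table;

                  -- rows loaded via direct path cannot be queried in the same session until you commit
                  commit;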

                  Cheers,
                  Manik
                  • 6. Re: tuning help
                    BluShadow
                    user12869307 wrote:
                    One more thing I asked about: when we do

                    insert into target_table_name
                    select * from source_table;

                    how will this query execute? Will the select run completely first and then the rows be inserted into the table,
                    or will the select and insert happen in parallel?
                    Theoretically speaking it can be viewed as parallel processing. It certainly doesn't load all the data from the source query into memory before inserting it, if that's what you are thinking.
                    • 7. Re: tuning help
                      user12869307
                      Thanks for the reply. Actually, a small correction: the two tables which I mentioned will increase by 100 million records every week. In what kind of scenario can we use BULK COLLECT and FORALL statements for inserting?
                      • 8. Re: tuning help
                        Stew Ashton
                        user12869307 wrote:
                        --- we can't check the performance of this now, since we do not have enough data.
                        You can generate data, lots of people do it all the time. Here is an example that will generate one million lines, but is configurable.
                        DEFINE NUMROWSA = 1000
                        define numrowsb = 1000
                        
                        create table test_a (  key1,  vn1,  vn2,  vn3,  vd1,  nomenc,  vv1,  vv2,  vv3,  vv4) as
                        with a as (select level lvl from dual connect by level <= &numrowsa)
                        , b as (select level lvl from dual connect by level <= &numrowsb)
                        select
                        rownum
                        , round(dbms_random.value(0, 99999999))
                        , round(dbms_random.value(0, 99999999))
                        , round(dbms_random.value(0, 99999999))
                        , trunc(sysdate)
                        , 'N' || mod(rownum,10)
                        , rpad('vv', 70 + round(dbms_random.value(0, 20)), '*')
                        , rpad('vv'||rownum, 70 + round(dbms_random.value(0, 20)), '*')
                        , rpad('vv', 70 + round(dbms_random.value(0, 20)), '*')
                        , rpad('vv', 56 + round(dbms_random.value(0, 32)), '*')
                        from a,b;
                        One doubt about the above insert: will the select query be executed only once and then all the rows inserted into table E, or will the select and insert happen in parallel?
                        The whole statement, both the SELECT part and the INSERT part, will be done in one session. For parallel execution, you need to do "direct path insert" using the /*+ APPEND */ hint, plus you need to enable and use "parallel DML".
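
                        (A rough sketch of that combination, reusing the placeholder names from the first post; the degree of 4 is just an arbitrary example value.)

                        alter session enable parallel dml;

                        insert /*+ APPEND PARALLEL(e, 4) */ into e (col1, col2, col3, col4)
                        select /*+ PARALLEL(a, 4) PARALLEL(b, 4) */
                               a.col1, b.col2, c.col3, d.col4
                          from a, b, c, d
                         where a.col1 = b.col1
                           and b.col2 = c.col2
                           and c.col3 = d.col3;

                        commit;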

                        I would not do parallel on such a small data set. I might do direct path insert though.

                        Why create this intermediate table? Why not just query the base tables, possibly through a view?
                        • 9. Re: tuning help
                          Keith Jamieson
                          Why do you think bulk collect is faster than direct sql?

                          You need to compare it.

                          Create a table with 1 million rows and insert into it using bulk collect and then insert into it using direct sql.

                          direct sql will be much faster.

                          You don't need real data for the test I suggested. You can make up the data as long as the appropriate columns are matched correctly.

                          If your tables are growing at 100 million rows per week, I think the partitioning option (additional license cost) is probably going to be essential.
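
                          (One possible shape for that, as a sketch only: range partitioning on a load date, one partition per weekly load. The load_date column, the other column types and the partition names are hypothetical; on 10.2 the weekly partitions have to be pre-created, since interval partitioning only arrives in 11g.)

                          create table e (
                            col1      number,
                            col2      number,
                            col3      number,
                            col4      number,
                            load_date date
                          )
                          partition by range (load_date) (
                            partition p_2012_w36 values less than (date '2012-09-10'),
                            partition p_2012_w37 values less than (date '2012-09-17'),
                            partition p_2012_w38 values less than (date '2012-09-24')
                          );

                          -- add the next week's partition ahead of each weekly load, e.g.
                          -- alter table e add partition p_2012_w39 values less than (date '2012-10-01');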



                          Also in order to find out what your query is doing, use explain plan.

                          explain plan for <your sql statement>;

                          select * from table(dbms_xplan.display);
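
                          (Applied to the insert from the first post, for example:)

                          explain plan for
                          insert into e (col1, col2, col3, col4)
                          select a.col1, b.col2, c.col3, d.col4
                            from a, b, c, d
                           where a.col1 = b.col1
                             and b.col2 = c.col2
                             and c.col3 = d.col3;

                          select * from table(dbms_xplan.display);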
                          • 10. Re: tuning help
                            Stew Ashton
                            user12869307 wrote:
                            Thanks for the reply. Actually, a small correction: the two tables which I mentioned will increase by 100 million records every week. In what kind of scenario can we use BULK COLLECT and FORALL statements for inserting?
                            You would only do BULK COLLECT + FORALL if the input requires some processing before insertion, and this processing cannot be done in SQL.

                            Are you going to add to the existing table or recreate it from scratch every week?
                            • 11. Re: tuning help
                              user12869307
                              We will be adding to the existing table; we are not recreating our target table every week. Thanks for your reply on using BULK COLLECT and FORALL. To make it clear, I am mentioning it again: four source tables are joined and the result is inserted into one target table.

                              Two source tables have 1.5 million records; the other two source tables increase by 100 million records every week.
                              • 12. Re: tuning help
                                Keith Jamieson
                                Why are you inserting the result into a target table?

                                Why can't you just create a view?

                                so instead of
                                insert into table e
                                you say
                                create view e as
                                select a.col1,
                                b.col2,
                                c.col3,
                                d.col4
                                from a,b,c,d
                                where a.col1 = b.col1
                                and b.col2 = c.col2
                                and c.col3 = d.col3;
                                • 13. Re: tuning help
                                  GVR
                                  It is hard to make any recommendations without knowing your requirements; it is always better to try with sample data. If you have to select only the new data set, then think about partitions and/or subpartitions; that way you can select data from a specific partition or subpartition, using dynamic SQL where needed.
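
                                  (For example, assuming the weekly range-partitioning scheme sketched earlier; the partition name is hypothetical and would normally be derived from the load week at run time, hence the dynamic SQL.)

                                  -- static reference to a single partition
                                  select col1, col2, col3, col4
                                    from e partition (p_2012_w37);

                                  -- dynamic version, when the partition name is only known at run time
                                  declare
                                    l_part varchar2(30) := 'P_2012_W37';
                                    l_cnt  number;
                                  begin
                                    execute immediate
                                      'select count(*) from e partition (' || l_part || ')'
                                      into l_cnt;
                                    dbms_output.put_line(l_part || ': ' || l_cnt || ' rows');
                                  end;
                                  /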
                                  • 14. Re: tuning help
                                    BluShadow
                                    user12869307 wrote:
                                    We will be adding to the existing table; we are not recreating our target table every week. Thanks for your reply on using BULK COLLECT and FORALL. To make it clear, I am mentioning it again: four source tables are joined and the result is inserted into one target table.

                                    Two source tables have 1.5 million records; the other two source tables increase by 100 million records every week.
                                    Seriously, you're going to have 5.2 billion records per year inserted into these tables? Is this a large supermarket chain's shopping transactions database?