      • 15. Re: Bulk Upload - Slower for processing the Last few records
        user9093343
        DBMS_PARALLEL_EXECUTE is an 11g feature. We are still on 10g. :(
        Also, as I said earlier, "insert into ... select" is taking too much time compared to bulk processing in my case. :(
        • 16. Re: Bulk Upload - Slower for processing the Last few records
          6363
          user9093343 wrote:

          Also, as I said earlier, "insert into ... select" is taking too much time compared to bulk processing in my case. :(
          This is incredibly hard to believe. How are you measuring the exact time taken?

          There is nothing in bulk collect that would make it faster than the driving SQL statement alone. Bulk collect only improves performance of PL/SQL code that already executes SQL in a loop.
          • 17. Re: Bulk Upload - Slower for processing the Last few records
            Billy~Verreynne
            user9093343 wrote:
            DBMS_PARALLEL_EXECUTE is an 11g feature. We are still on 10g. :(
            The same basic approach can be done in 10g.

            Have a look at Tom's example.

            He shows how to use SQL to query the data dictionary for the range of data blocks containing a specific table's data. Then turn that into multiple sub-ranges - which is the "chunking" done by DBMS_PARALLEL_EXECUTE.

            Tom calls it splitting the table into rowid ranges.

            You use this to create, for example, 10 rowid ranges for your VLT.
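
            A hedged sketch of that dictionary query (the owner and table name are placeholders, and this is my approximation of Tom's approach, not his exact SQL). It produces one rowid range per extent; adjacent ranges can then be grouped into roughly 10 chunks:
            -- MY_SCHEMA / SOURCE_TABLE are placeholders; 32767 is a high row slot to cover the last block
            select dbms_rowid.rowid_create( 1, o.data_object_id, e.relative_fno,
                                            e.block_id, 0 )                    as lo_rid,
                   dbms_rowid.rowid_create( 1, o.data_object_id, e.relative_fno,
                                            e.block_id + e.blocks - 1, 32767 ) as hi_rid
              from dba_extents e,
                   dba_objects o
             where e.owner        = 'MY_SCHEMA'
               and e.segment_name = 'SOURCE_TABLE'
               and o.owner        = e.owner
               and o.object_name  = e.segment_name
               and o.object_type  = 'TABLE'
             order by e.relative_fno, e.block_id;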

            You create a procedure that looks as follows:
            create or replace procedure ProcessRowRange( fromRow varchar2, toRow varchar2 ) is
            ..
            begin
              ..
              insert into target_table( .. )
              select ... from source_table where rowid between fromRow and toRow;
              ..
            end;
            You now need to run this procedure 10x, once per rowid range.

            You can schedule these procedures via DBMS_JOB. You can run them in parallel. You can run them in a staggered fashion. You can restart a procedure should it fail - and only process that subset of source data again via its rowid range.
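
            For example, a hypothetical scheduling sketch - rowid_ranges is an assumed staging table holding the 10 ranges as strings (lo_rid, hi_rid), not something from this thread:
            declare
              l_job binary_integer;
            begin
              for r in ( select lo_rid, hi_rid from rowid_ranges ) loop
                dbms_job.submit(
                  job  => l_job,
                  what => 'ProcessRowRange(''' || r.lo_rid || ''', ''' || r.hi_rid || ''');' );
              end loop;
              commit;  -- the submitted jobs only start running after the commit
            end;
            /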
            Also, as I said earlier, "insert into ... select" is taking too much time compared to bulk processing in my case. :(
            Tell me why you think that an insert ... select is slower. Does it read more rows? Does it write more rows? Surely the I/O load is the same.

            Can you provide any evidence that your bulk processing approach has a smaller workload?
            • 18. Re: Bulk Upload - Slower for processing the Last few records
              rp0428
              >
              What I have written here processes almost 10 million records in 10 seconds. I am worried about the remaining 1.4 million only.
              >
              Then the solution is obvious.

              Break the job into two pieces. Process the first 9 or 10 million in one run. Then, when it is finished, execute the process again to do the remaining 1.4 million.
              • 19. Re: Bulk Upload - Slower for processing the Last few records
                user9093343
                Billy - I have requested access to the DBA tables to try your suggestion in 10g. Will keep you posted.

                Also - just out of curiosity - whenever I use BULK COLLECT (& FORALL) in any situation, I see the last 10 to 20% of records take a lot longer than the initial 80% of records loaded. Any idea?
                • 20. Re: Bulk Upload - Slower for processing the Last few records
                  user9093343
                  Billy - I have no evidence as such, other than seeing it load 10M records in 10 minutes for the same snippet which I had posted in this thread.
                  • 21. Re: Bulk Upload - Slower for processing the Last few records
                    Billy~Verreynne
                    user9093343 wrote:
                    Billy - I have requested access to the DBA tables to try your suggestion in 10g. Will keep you posted.
                    As DBA, I have a super-user schema that has specific system access, such as SELECT ANY DICTIONARY. I then implement code (e.g. a function returning rowid ranges) in that schema with AUTHID DEFINER, and give an app/user schema execute rights on it. This allows an app or user access to specific super-user functionality in a controlled and managed fashion.
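
                    A minimal sketch of that pattern, assuming a privileged schema called SUPERUSER (granted SELECT ANY DICTIONARY directly, not via a role) and an app schema called APP_SCHEMA - the function itself is just an illustration:
                    create or replace function superuser.table_blocks(
                      p_owner varchar2,
                      p_table varchar2
                    ) return number
                      authid definer   -- runs with SUPERUSER's dictionary access, not the caller's
                    as
                      l_blocks number;
                    begin
                      select sum(e.blocks)
                        into l_blocks
                        from dba_extents e
                       where e.owner = p_owner
                         and e.segment_name = p_table;
                      return l_blocks;
                    end;
                    /
                    -- expose only this function, not the dictionary views themselves
                    grant execute on superuser.table_blocks to app_schema;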
                    Also - just out of curiosity - whenever I use BULK COLLECT (& FORALL) in any situation, I see the last 10 to 20% of records take a lot longer than the initial 80% of records loaded. Any idea?
                    My basic reaction is that you are misdiagnosing this - it may be what you think you are seeing, but it is not.

                    The workload that you are creating is constant and consistent (sketched below):
                    - read rows from a (select) cursor into PL/SQL PGA memory via bulk collect
                    - execute an (insert) cursor with a bulk bind (the forall loop)
                    - commit
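
                    A minimal sketch of that loop shape (the tables, the 1,000-row limit, and the assumption that source and target share the same structure are mine, not the OP's actual code):
                    declare
                      cursor c_src is select * from source_table;
                      type t_rows is table of source_table%rowtype index by pls_integer;
                      l_rows t_rows;
                    begin
                      open c_src;
                      loop
                        fetch c_src bulk collect into l_rows limit 1000;  -- read into PGA memory
                        exit when l_rows.count = 0;
                        forall i in 1 .. l_rows.count                     -- bulk bind the insert
                          insert into target_table values l_rows(i);
                        commit;                                           -- commit per iteration
                      end loop;
                      close c_src;
                    end;
                    /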

                    So there is no obvious reason why one such workload iteration would be drastically slower than another.

                    What metrics did you use to come to the conclusion that the first 80% of these workload iterations were fast and the last 20% were not?

                    There is also the issue of read consistency. Others have already pointed this problem out. You create a consistent read with the select cursor. At t1, that cursor is opened on a specific version of the data. The truth as it is seen at that specific point in time. Each commit creates a new version of the data. A new truth at that point in time.

                    A consistent read is just that - consistent. So it does not matter how many new truths/versions are created, that read sees the truth at the point in time it opened its cursor. No other truth exists for that read.

                    Oracle has to maintain that consistency and "rebuild" the truth as it is seen by that cursor, while your code is slapping the database with a new truth on each loop iteration via a commit. Sooner or later Oracle will fail to recreate the truth needed for that consistent read (the undo needed has been overwritten - ORA-01555, snapshot too old).

                    So I do not see your approach as robust - never mind performance. And as far as performance goes, I would like to see your claim of a fast first 80% and a slow last 20% substantiated with some kind of instrumentation and metrics. I am not saying that this could not be the case - but we will need to see the evidence of it in order to provide any kind of meaningful analysis.
                    • 22. Re: Bulk Upload - Slower for processing the Last few records
                      user9093343
                      That answers my question, Billy - thank you. How good is file processing compared to bulk loads?
                      • 23. Re: Bulk Upload - Slower for processing the Last few records
                        Billy~Verreynne
                        What type of file processing?

                        Oracle writing to database files?

                        Oracle reading an external table?

                        PL/SQL code writing to file via UTL_FILE?

                        Something else?
                        • 24. Re: Bulk Upload - Slower for processing the Last few records
                          user9093343
                          Oops... file processing in the sense of writing data to a file via UTL_FILE and then using SQL*Loader to load the table.
                          • 25. Re: Bulk Upload - Slower for processing the Last few records
                            Billy~Verreynne
                            SQL*Loader is fast (it is an optimised piece of software for loading "flat file" data into the database).

                            UTL_FILE I would not readily use. It is an old (and slow) style of writing structured data into an unstructured CSV-type file, only to reload and reparse it as structured data again.

                            That said, UTL_FILE performance can be addressed via parallel PL/SQL processing - running the SQL select cursor that reads the data in parallel, and having those parallel processes also write to file via UTL_FILE in parallel, courtesy of a pipelined table function (sketched below).
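
                            A hedged sketch of that pattern - the directory, table, columns and type names are assumptions, not a definitive implementation:
                            create or replace type t_number_tab as table of number;
                            /
                            create or replace function dump_in_parallel( p_cur sys_refcursor )
                              return t_number_tab
                              pipelined
                              parallel_enable ( partition p_cur by any )
                            as
                              l_c1   varchar2(4000);
                              l_c2   varchar2(4000);
                              l_file utl_file.file_type;
                              l_cnt  number := 0;
                            begin
                              -- each parallel execution server writes its own CSV file
                              l_file := utl_file.fopen( 'DUMP_DIR',
                                                        'dump_' || sys_context('userenv','sid') || '.csv',
                                                        'w', 32767 );
                              loop
                                fetch p_cur into l_c1, l_c2;
                                exit when p_cur%notfound;
                                utl_file.put_line( l_file, l_c1 || ',' || l_c2 );
                                l_cnt := l_cnt + 1;
                              end loop;
                              utl_file.fclose( l_file );
                              pipe row( l_cnt );  -- rows written by this parallel slave
                              return;
                            end;
                            /
                            -- driven by a parallel query, so both the read and the writes run in parallel
                            select sum(column_value) as total_rows_written
                              from table( dump_in_parallel( cursor(
                                     select /*+ parallel(t 8) */ to_char(col1), to_char(col2)
                                       from source_table t ) ) );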

                            Keep in mind that I/O is the single most expensive computing operation. Performance means reducing I/O to the absolute minimum. That is the fundamental reason we use indexes on tables, partition data, and so on.

                            Reading data from a table is I/O. Writing that data to file is I/O. Reading that file is I/O. Writing that data to a table is I/O.

                            That is double the I/O of reading the data from one table and writing it directly to another - half of the file-based workload is pure overhead.
                            • 26. Re: Bulk Upload - Slower for processing the Last few records
                              user9093343
                              Thank you, Billy. The suggestion to use the SQL/conventional way of loading (a normal cursor) instead of PL/SQL bulk binds worked.