23 Replies · Latest reply on Jan 7, 2019 9:51 PM by mathguy

    Handling huge CLOB input data in Procedure.

    DBQuest

      PROCEDURE PRC_INS_DOWNLOADED_ITEMS (
          i_so_id       IN  VARCHAR2,
          i_oreder_type IN  NUMBER,
          i_customer    IN  NUMBER,
          i_item_id     IN  CLOB,
          i_item_key    IN  CLOB,
          i_country_id  IN  NUMBER,
          i_quantity    IN  NUMBER,
          o_status      OUT VARCHAR2);

       

      We have the above procedure; it extracts the values from comma-separated strings and inserts them into a table.

       

      We receive 50,000 items from the vendor as comma-separated strings.

      The parameter i_item_id contains 50,000 comma-separated item ids (7 characters each). Total characters in the i_item_id CLOB = 400,000 (about 0.381 MB).

      The parameter i_item_key contains 50,000 comma-separated item keys (172 characters each). Total characters in the i_item_key CLOB = 8,650,000 (about 8.25 MB).

       

      When we get 5,000 items from the vendor, the procedure above completes within 2 minutes, but it takes about 2 hours for 50,000 keys.

      We tested the procedure locally with 50,000 items where each item key was only 1 to 5 characters long; it completed within 4 minutes.

      With item keys of 172 characters, however, it becomes very slow and takes about 2 hours.

       

      Please advise what changes are needed to handle this volume of data in the parameters.

        • 2. Re: Handling huge CLOB input data in Procedure.
          Saubhik

          Not only is this a duplicate posted in another forum, but what is the difference between this and the just-marked-answered thread Splitting values from comma separated string of CLOB data type?

          • 3. Re: Handling huge CLOB input data in Procedure.
            Gaz in Oz

            I don't know, what is the difference?

            • 4. Re: Handling huge CLOB input data in Procedure.
              Saubhik

              I don't know either; I think it's the same.

              • 5. Re: Handling huge CLOB input data in Procedure.
                Jarkko Turpeinen

                DBQuest wrote:

                [original post quoted in full]

                Well, did you read my suggestion on the other thread? It might ease the problem if you don't do row-by-row logic but use set-based processing instead.
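
                To illustrate the difference, a minimal sketch only - the actual procedure body has not been posted, and nth_token, split_list and downloaded_items below are hypothetical stand-ins for whatever the real code uses:

                -- Row-by-row (slow): one INSERT per element, and typically one full
                -- scan of each CLOB per element just to extract token i.
                for i in 1 .. l_item_count loop
                  insert into downloaded_items (item_id, item_key)
                  values (nth_token(i_item_id, i), nth_token(i_item_key, i));
                end loop;

                -- Set-based (fast): split each CLOB once into rows, then load everything
                -- with a single INSERT ... SELECT, joining the two lists on element position.
                insert into downloaded_items (item_id, item_key)
                select ids.token, keys.token
                from   table(split_list(i_item_id))  ids
                join   table(split_list(i_item_key)) keys on keys.rn = ids.rn;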

                • 6. Re: Handling huge CLOB input data in Procedure.
                  Stew Ashton

                  Hi,

                   

                  I can show you how to do this in less than 10 seconds. Are you interested? Enough to answer my questions?

                   

                  Best regards,

                  Stew Ashton

                  • 7. Re: Handling huge CLOB input data in Procedure.
                    Paulzip

                     So you show us the procedure specification without the actual procedure code. What is the point in that?!

                     

                    You : Here's a picture of the badge of my car, it's going slower than it should, make it go faster.

                    Us : We can't, you haven't told us ANYTHING we can work with.

                    • 8. Re: Handling huge CLOB input data in Procedure.
                      Saubhik

                       I have given a link to your blog, which contains other related links. Others, including Paulzip and Jarkko, have given him lots of input, but I don't think he has gone through or tested any of it. He just needs a tailor-made solution.

                      • 9. Re: Handling huge CLOB input data in Procedure.
                        Stew Ashton

                        Stew Ashton wrote:

                         

                        I can show you how to do this in less than 10 seconds. Are you interested? Enough to answer my questions?

                        I meant the solution would run in less than 10 seconds, not that the explanation would take less than 10 seconds.

                         

                         Sorry for any confusion.

                         

                        Regards,

                        Stew

                        • 10. Re: Handling huge CLOB input data in Procedure.
                          Stew Ashton

                          If anyone is interested in a technique for splitting CLOB strings, I wrote a blog post about it. I believe it is relevant to the question, though a full answer requires more input from DbQuest...

                           

                          https://stewashton.wordpress.com/2019/01/03/splitting-strings-when-they-are-clobs/

                           

                          Best regards,

                          Stew

                          • 11. Re: Handling huge CLOB input data in Procedure.
                            AndrewSayer

                            Stew Ashton wrote:

                            [blog post announcement quoted in full]

                            To handle the data directly, your solution is almost certainly the right way to go about it. You only read the clobs once rather than once per element (as in the regexp solution from a previous thread).
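
                             For anyone curious what that "read the CLOB once" idea can look like in practice, here is a minimal sketch (not the code from Stew's blog post; it assumes the separator is a comma, that individual tokens stay far below 32k characters - true here at 7 and 172 characters - and downloaded_items is just an assumed target table name):

                             create or replace type split_row_t as object (rn number, token varchar2(4000));
                             /
                             create or replace type split_tab_t as table of split_row_t;
                             /
                             create or replace function split_clob_csv(p_clob in clob, p_sep in varchar2 default ',')
                               return split_tab_t pipelined
                             is
                               c_chunk constant pls_integer := 16000;   -- characters fetched per dbms_lob.substr call
                               l_len   pls_integer := dbms_lob.getlength(p_clob);
                               l_pos   pls_integer := 1;                -- current read offset into the CLOB
                               l_rn    pls_integer := 0;                -- position of the token within the list
                               l_buf   varchar2(32767);                 -- unprocessed tail of previous chunk + current chunk
                               l_sep   pls_integer;
                             begin
                               while l_pos <= l_len loop
                                 l_buf := l_buf || dbms_lob.substr(p_clob, c_chunk, l_pos);
                                 l_pos := l_pos + c_chunk;
                                 loop                                   -- emit every complete token now in the buffer
                                   l_sep := instr(l_buf, p_sep);
                                   exit when l_sep = 0;
                                   l_rn  := l_rn + 1;
                                   pipe row (split_row_t(l_rn, substr(l_buf, 1, l_sep - 1)));
                                   l_buf := substr(l_buf, l_sep + length(p_sep));
                                 end loop;
                               end loop;
                               if l_buf is not null then
                                 pipe row (split_row_t(l_rn + 1, l_buf));  -- last token has no trailing separator
                               end if;
                               return;
                             end;
                             /

                             With something like that in place, the load becomes a single set-based statement, and the 8 MB CLOB is read in a few hundred 16k-character chunks instead of being rescanned from the start for every one of the 50,000 elements:

                             insert into downloaded_items (item_id, item_key)
                             select ids.token, keys.token
                             from   table(split_clob_csv(i_item_id))  ids
                             join   table(split_clob_csv(i_item_key)) keys on keys.rn = ids.rn;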

                             

                             I shared an alternative solution to the business problem in the duplicate thread Handling huge CLOB input parameter data: if the data from the vendor were sent in a reasonable format (one that is probably much easier to produce and less likely to have mistakes due to ordering), there would be no need to put much effort at all into loading the data.

                             

                             If I were put in this situation I would probably manipulate the two files into one CSV file using PowerShell (using piecemeal reads like you are doing) and then load that file. That combined CSV file can be read by humans easily, so it can be used for manual spot checking of the updated data.

                            • 12. Re: Handling huge CLOB input data in Procedure.
                              Stew Ashton

                              AndrewSayer wrote:

                              ...

                              If I were put in this situation I would probably manipulate the two files into one CSV file using PowerShell (using piecemeal reads like you are doing) and then load that file. That combined CSV file can be read by humans easily, so it can be used for manual spot checking of the updated data.

                              Oh, so they started life as files, not CLOBs: of course!

                               

                              In that case:

                              create table item_ids_EXT(
                                "RN" NUMBER,
                                "ITEM_ID" VARCHAR2(16)
                              )
                              ORGANIZATION EXTERNAL(
                                TYPE ORACLE_LOADER
                                DEFAULT DIRECTORY EXT_DATA_DIR
                                ACCESS PARAMETERS(
                                  records delimited by ',' 
                                  BADFILE EXT_DATA_DIR:'i_item_id.txt.bad'
                                  LOGFILE EXT_DATA_DIR:'i_item_id.txt.log'
                                  FIELDS TERMINATED BY "§" NOTRIM
                                  MISSING FIELD VALUES ARE NULL
                                  REJECT ROWS WITH ALL NULL FIELDS
                                  (
                                    "RN" recnum,
                                    "ITEM_ID" char(255)
                                  )
                                )
                                location ('i_item_id.txt')
                              )
                              REJECT LIMIT 10;
                              
                              create table item_keys_EXT(
                                "RN" NUMBER,
                                "ITEM_KEY" VARCHAR2(255)
                              )
                              ORGANIZATION EXTERNAL(
                                TYPE ORACLE_LOADER
                                DEFAULT DIRECTORY EXT_DATA_DIR
                                ACCESS PARAMETERS(
                                  records delimited by ',' 
                                  BADFILE EXT_DATA_DIR:'i_item_key.txt.bad'
                                  LOGFILE EXT_DATA_DIR:'i_item_key.txt.log'
                                  FIELDS TERMINATED BY "§" NOTRIM
                                  MISSING FIELD VALUES ARE NULL
                                  REJECT ROWS WITH ALL NULL FIELDS
                                  (
                                    "RN" recnum,
                                    "ITEM_KEY" char(255)
                                  )
                                )
                                location ('i_item_key.txt')
                              )
                              REJECT LIMIT 10;
                              
                              create table u as
                              select item_id, item_key
                              from item_ids_ext
                              join item_keys_ext using(rn);
                              
                              Table U created.
                              
                              
                              Elapsed: 00:00:00.979
                              

                               

                              Takes less than a second on my machine
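
                              (If I read the access parameters right, the trick is that records are delimited by ',' while the field terminator '§' never occurs in the data, so each comma-separated value becomes its own record, and RECNUM numbers the records so the two external tables can be joined by position.)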

                               

                              Regards,

                              Stew

                              • 13. Re: Handling huge CLOB input data in Procedure.
                                mathguy

                                Stew Ashton wrote:

                                [blog post announcement quoted in full]

                                 

                                 

                                The OP has declined to answer my question about Oracle version several times, and I wasn't going to read this thread anymore, but then I saw your post by accident. :-)

                                 

                                Here is why I wanted to know the Oracle version: We can use JSON functions and avoid PL/SQL altogether. There will be time wasted in other places - for example to convert the comma-separated strings to JSON arrays of strings - but the performance is comparable, and the solution seems simpler to me, because it does not need user-defined functions. I only tested this on Oracle 12.2, but it should also work on Oracle 12.1.

                                 

                                 I am sure this can be improved, I just don't know how; I am just learning about JSON right now and may be missing some obvious things. I am also not sure why the MATERIALIZE hint is needed - why the optimizer won't choose to do that without the hint. Without it (meaning, using the CTEs as inline views) the CREATE TABLE statement takes so long that I had to cancel it - there is no point in knowing whether it takes 3 minutes or 3 hours when it takes less than 1 second with the hint. Moreover, I don't understand why the statement SHOULD take longer if it treats the CTEs as inline views, but it does; I would have thought it should take less. Perhaps this is one direction to look for improvements.

                                 

                                I created table T exactly as you posted in your blog post.  Then:

                                 

                                 create table u as
                                 with
                                   item_ids as (
                                     select /*+ materialize */ item_id, ord
                                     from   t, json_table( '["' || replace(i_item_id, ',', '","') || '"]', '$[*]'
                                                           columns ( item_id varchar2 path '$', ord for ordinality )
                                                         )
                                   )
                                 , item_keys as (
                                     select /*+ materialize */ item_key, ord
                                     from   t, json_table( '["' || replace(i_item_key, ',', '","') || '"]', '$[*]'
                                                           columns ( item_key varchar2 path '$', ord for ordinality )
                                                         )
                                   )
                                 select item_id, item_key from item_ids join item_keys using(ord);

                                 Table created.

                                 Elapsed: 00:00:00.87

                                 

                                 

                                 

                                NOTE - This solution assumes the strings in the CSV inputs do not contain the JSON special characters (backslash and double-quote); if we can't make that assumption, the solution becomes more complicated (and slower) because we must handle that, too.
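
                                 For what it's worth, a minimal sketch of that extra handling (my addition, not part of the tested solution above): escape backslashes first, then double quotes, before turning the commas into JSON string delimiters. It still assumes the separator itself never appears inside a value.

                                 select '["' || replace(
                                                  replace(
                                                    replace(i_item_key, '\', '\\'),  -- escape backslashes first
                                                    '"', '\"'),                      -- then escape double quotes
                                                  ',', '","')                        -- finally turn commas into "," delimiters
                                             || '"]' as item_keys_json
                                 from   t;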

                                • 14. Re: Handling huge CLOB input data in Procedure.
                                  AndrewSayer

                                  Nice demo, goes to show how much easier things can be if you know the full story.
