      • 15. Re: ORA-30009: Not enough memory for CONNECT BY operation
        Etbin

        Do it in steps (what follows is just for testing)

         

        create table etbin as
          select level id,
                 to_char(trunc(sysdate) + level/24/3600,'yyyy-mm-dd hh24:mi:ss') as inc_datetime,
                 trunc(dbms_random.value(0,100)) as random_id,
                 dbms_random.string('x',20) random_string
            from dual
          connect by level <= 10;

         

        followed by

         

        begin
          for i in 1 .. 3
          loop
            insert into etbin
            select 10 * i + level id,
                   to_char(trunc(sysdate) + (10 * i + level)/24/3600,'yyyy-mm-dd hh24:mi:ss') as inc_datetime,
                   trunc(dbms_random.value(0,100)) as random_id,
                   dbms_random.string('x',20) random_string
              from dual
            connect by level <= 10;
            commit;
          end loop;
        end;
        /

         

        Regards

         

        Etbin

        • 16. Re: ORA-30009: Not enough memory for CONNECT BY operation
          James Su

          use a pipelined function instead of connect by:

           

          http://jonathanlewis.wordpress.com/2011/12/08/test-data/#comments
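
          For reference, a minimal sketch of the pipelined-function approach (the type and function names here are illustrative, not taken from the linked post) - it streams rows one at a time, so it avoids the memory cost of a single huge CONNECT BY:

          create or replace type number_tab as table of number;
          /

          create or replace function row_generator(p_rows in number)
            return number_tab pipelined
          as
          begin
            for i in 1 .. p_rows loop
              pipe row (i);      -- emit one row at a time instead of materialising the whole result
            end loop;
            return;
          end;
          /

          -- usage: select column_value as id from table(row_generator(1e8));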

          • 17. Re: ORA-30009: Not enough memory for CONNECT BY operation
            Jonathan Lewis

            You've already got the answer to the question about the 30009 - recursive descent just takes memory.

            The Gaz in Oz approach is the most common - and very efficient - workaround, but here are a couple of extra comments on your code:

             

            a) Always start with dbms_random.seed(0) - or some other seed value - so that you can reproduce the same random values on repeat tests.

            b) I tend to use exponent notation to avoid miscounting zeros in big numbers, as in 1e8 for 100,000,000.

            c) Try to be a bit clever about the use of dbms_random - it's CPU intensive:

             

            e.g.

             

            create table t1
            nologging
            as
            with generator as (
                    select
                            rownum id
                    from dual
                    connect by
                            level <= 1e4
            )
            select
            --      dbms_random.string('U',20)      padding
                    rpad(dbms_random.string('U',6),20)      padding
            from
                    generator       v1,
                    generator       v2
            where
                    rownum <= 1e6
            ;

             

             

            Note the rpad() of a 6 character random string - this gets me 20 character strings which are guaranteed nearly unique, but it does a lot less work than generating all the characters randomly. On my 1M row table the padded version took 41 seconds to run; the pure dbms_random version took 115 seconds. That matters a lot when you're trying to build 1e8 rows. (When you've decided on your data patterns, test the time to generate a small data set before you create the big one.)

             

            I tend to stick with 1e4 rows for my WITH subquery - that's good for up to 1e8 rows in the result set - and I don't often worry about adding a 3rd generator unless I want to use v1.id and v2.id (and potentially v3.id) to produce particular patterns; there happens to be an example in the following posting: https://www.red-gate.com/simple-talk/sql/oracle/massive-deletes-part-2/
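
            By way of illustration (this sketch is not taken from that posting), the ids from the two generator copies can be combined to produce patterned rather than purely random data, using the same 1e4-row generator as above:

            with generator as (
                    select rownum id
                    from dual
                    connect by level <= 1e4
            )
            select
                    v1.id + (v2.id - 1) * 1e4       unique_id,      -- distinct, sequential values
                    mod(v1.id, 100)                 bucket_id       -- repeating pattern of 100 values
            from
                    generator       v1,
                    generator       v2
            where
                    rownum <= 1e6
            ;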

             

             

            Regards

            Jonathan Lewis

            • 18. Re: ORA-30009: Not enough memory for CONNECT BY operation
              Quanwen Zhao

              Missing "as" keyword.

               

              Hi, Gaz

               

              Very nice, many thanks for the reminder and the guidance.

 

              Nevertheless, the advice Jonathan gave afterwards suits me even better - would you mind if I mark his suggestion as the correct answer?

               

              Best Regards

              Quanwen Zhao

              • 19. Re: ORA-30009: Not enough memory for CONNECT BY operation
                Quanwen Zhao

                James Su wrote:

                 

                use a pipelined function instead of connect by:

                 

                http://jonathanlewis.wordpress.com/2011/12/08/test-data/#comments

                 

                Hello, James

                 

                Thank you; following your advice, I'll read that post of Jonathan's.

                 

                Best Regards

                Quanwen Zhao

                • 20. Re: ORA-30009: Not enough memory for CONNECT BY operation
                  padders

                  > Try to be a bit clever about use of dbms_random - it's CPU intensive:

                   

                  Agreed, and dbms_random.string appears increasingly inefficient for longer strings (which is hardly surprising given that it is essentially a PL/SQL loop).

                   

                  I personally find that if you are not too precious about actual string content then ora_hash/standard_hash/sys_guid can be combined to very efficiently generate quite a range of things without needing PL/SQL function calls.
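
                  For illustration, one possible combination (these particular expressions are just examples, not something prescribed in this thread) - the hash and GUID calls stay in the SQL engine, with no per-row PL/SQL:

                  select substr(rawtohex(standard_hash(to_char(rownum), 'SHA1')), 1, 20) hash_string,
                         substr(rawtohex(sys_guid()), 1, 20)                             guid_string,
                         ora_hash(rownum, 99)                                            random_id
                    from dual
                  connect by level <= 10;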

                  • 21. Re: ORA-30009: Not enough memory for CONNECT BY operation
                    Gaz in Oz

                    Nevertheless, the advice Jonathan gave afterwards suits me even better - would you mind if I mark his suggestion as the correct answer?

                    Sure, all good.

                    • 22. Re: ORA-30009: Not enough memory for CONNECT BY operation
                      Quanwen Zhao

                      create table t1
                      nologging
                      as
                      with generator as (
                              select
                                      rownum id
                              from dual
                              connect by
                                      level <= 1e4
                      )
                      select
                      --      dbms_random.string('U',20)      padding
                              rpad(dbms_random.string('U',6),20)      padding
                      from
                              generator       v1,
                              generator       v2
                      where
                              rownum <= 1e6
                      ;

                       

                      Hi, Jonathan

                       

                      I've tested the table-creation SQL you supplied and it took only 32 seconds to complete.

                       

                      Using Gaz in Oz's SQL, I created table test1 successfully, but it took about 2 hours and 35 minutes. Why is it so slow?

                       

                      SQL> create table test1
                        2  segment creation immediate
                        3  nologging
                        4  as
                        5  select r as id
                        6       , to_char(sysdate + r/24/3600, 'yyyy-mm-dd hh24:mi:ss') as inc_datetime
                        7       , trunc(dbms_random.value(0, 100)) as random_id
                        8       , dbms_random.string('x', 20) random_string
                        9  from  (SELECT rownum r
                       10         FROM  (SELECT level r FROM dual CONNECT BY level <= 1000) d1,
                       11               (SELECT level r FROM dual CONNECT BY level <= 1000) d2,
                       12               (SELECT level r FROM dual CONNECT BY level <= 1000) d3
                       13         WHERE  rownum <= 100000000
                       14  );
                      
                      
                       Table TEST1 created.
                      
                      
                      Elapsed: 02:34:55.198
                      

                       

                      BTW, the value of the PGA_AGGREGATE_TARGET parameter is 20G.

                       

                      SQL> show parameter pga
                      NAME                 TYPE        VALUE 
                      -------------------- ----------- ----- 
                      pga_aggregate_target big integer 20G
                      

                       

                      Based on your SQL code, I've written my own version as follows:

                       

                      SQL> drop table t1;
                      
                      
                      Table dropped.
                      
                      
                      SQL> execute dbms_random.seed(0);
                      
                      
                      PL/SQL procedure successfully completed.
                      
                      
                      SQL> set timing on
                      SQL> 
                      
                      
                      SQL> create table t1
                        2  segment creation immediate
                        3  nologging
                        4  as
                        5  with generator as (
                        6  select rownum id
                        7  from dual
                        8  connect by level <= 1e4
                        9  )
                       10  select rownum as id
                       11         , to_char(sysdate + rownum/24/3600, 'yyyy-mm-dd hh24:mi:ss') as inc_datetime
                       12         , trunc(dbms_random.value(0, 100)) as random_id
                       13         , dbms_random.string('x', 20) random_string
                       14  from generator v1
                       15       , generator v2
                       16  where rownum <= 1e8
                       17  ;
                      
                      
                      Table T1 created.
                      
                      
                      Elapsed: 02:32:54.566
                      
                      
                      SQL> select sum(blocks),sum(bytes)/1024/1024 size_mb from user_segments
                        2  where segment_name ='T1'
                        3  and segment_type = 'TABLE';
                      
                      
                      SUM(BLOCKS)    SIZE_MB
                      ----------- ----------
                           770176       6017
                      
                      
                      SQL> select count(distinct random_string) from t1;
                      
                      
                      COUNT(DISTINCTRANDOM_STRING)
                      ----------------------------
                                         100000000
                      
                      
                      Elapsed: 00:01:35.487
                      
                      
                      SQL> show parameter cpu
                      NAME                            TYPE    VALUE 
                      ------------------------------- ------- ----- 
                      cpu_count                       integer 32    
                      parallel_threads_per_cpu        integer 2     
                      resource_manager_cpu_allocation integer 32
                      
                      
                      SQL> select /*+ parallel */ count(distinct random_string) from t1;
                      COUNT(DISTINCTRANDOM_STRING)
                      ----------------------------
                                         100000000
                      
                      
                      Elapsed: 00:00:05.437
                      

                       

                      Unfortunately it is still very slow as well, taking approximately 2 hours and 33 minutes.

                       

                      I understand your suggestion as follows:

                       

                      • first, produce a small result set of 1e4 (10,000) rows via CONNECT BY and ROWNUM in the "with ... as (...)" clause;
                      • second, build a cross join of two or three copies of that result set (all based on the same "generator" subquery);
                      • since 1e4 * 1e4 = 1e8, this easily produces 1e8 rows of data.

                       

                      Best Regards

                      Quanwen Zhao

                      • 23. Re: ORA-30009: Not enough memory for CONNECT BY operation
                        John_K

                        Is it any quicker if you take advantage of materialization?

                         

                        with x as ((SELECT level r FROM dual CONNECT BY level <= 1000))
                        select r as id  
                             , to_char(sysdate + r/24/3600, 'yyyy-mm-dd hh24:mi:ss') as inc_datetime  
                             , trunc(dbms_random.value(0, 100)) as random_id  
                             , dbms_random.string('x', 20) random_string  
                        from  (SELECT rownum r  
                               FROM  x d1,  
                                     x d2,  
                                     x d3  
                               WHERE  rownum <= 100000000  
                        );
                        

                         

                        Also, whilst it doesn't seem to affect the plan in this case, I always find it useful to use the cardinality hint on a row generator so the optimizer has a bit more information about the number of rows it's going to process. This is particularly important with dual, because the connect by cardinality is calculated based on dual, which is, of course, one row.

                         

                        with x as ((SELECT /*+cardinality(1000)*/ level r FROM dual CONNECT BY level <= 1000))
                        select r as id  
                             , to_char(sysdate + r/24/3600, 'yyyy-mm-dd hh24:mi:ss') as inc_datetime  
                             , trunc(dbms_random.value(0, 100)) as random_id  
                             , dbms_random.string('x', 20) random_string  
                        from  (SELECT /*+all_rows*/ rownum r  
                               FROM  x d1,  
                                     x d2,  
                                     x d3  
                               WHERE  rownum <= 100000000  
                        );
                        

                         

                        Same plan on my system - but the cardinalities are correct.

                         

                        ---------------------------------------------------------------------------------------------------------------
                        | Id  | Operation                      | Name                        | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
                        ---------------------------------------------------------------------------------------------------------------
                        |   0 | SELECT STATEMENT               |                             |    100M|  1239M|  2137K  (7)| 00:01:24 |
                        |   1 |  TEMP TABLE TRANSFORMATION     |                             |        |       |            |          |
                        |   2 |   LOAD AS SELECT               | SYS_TEMP_0FD9D6A1A_7657F032 |        |       |            |          |
                        |*  3 |    CONNECT BY WITHOUT FILTERING|                             |        |       |            |          |
                        |   4 |     FAST DUAL                  |                             |      1 |       |     2   (0)| 00:00:01 |
                        |   5 |   VIEW                         |                             |    100M|  1239M|  2137K  (7)| 00:01:24 |
                        |*  6 |    COUNT STOPKEY               |                             |        |       |            |          |
                        |   7 |     MERGE JOIN CARTESIAN       |                             |   1000M|       |  2137K  (7)| 00:01:24 |
                        |   8 |      MERGE JOIN CARTESIAN      |                             |   1000K|       |  2137   (7)| 00:00:01 |
                        |   9 |       VIEW                     |                             |   1000 |       |     2   (0)| 00:00:01 |
                        |  10 |        TABLE ACCESS FULL       | SYS_TEMP_0FD9D6A1A_7657F032 |   1000 | 13000 |     2   (0)| 00:00:01 |
                        |  11 |       BUFFER SORT              |                             |   1000 |       |  2137   (7)| 00:00:01 |
                        |  12 |        VIEW                    |                             |   1000 |       |     2   (0)| 00:00:01 |
                        |  13 |         TABLE ACCESS FULL      | SYS_TEMP_0FD9D6A1A_7657F032 |   1000 | 13000 |     2   (0)| 00:00:01 |
                        |  14 |      BUFFER SORT               |                             |   1000 |       |  2137K  (7)| 00:01:24 |
                        |  15 |       VIEW                     |                             |   1000 |       |     2   (0)| 00:00:01 |
                        |  16 |        TABLE ACCESS FULL       | SYS_TEMP_0FD9D6A1A_7657F032 |   1000 | 13000 |     2   (0)| 00:00:01 |
                        ---------------------------------------------------------------------------------------------------------------
                        

                         

                        This is more applicable, though, when you're joining the generated rows to other tables. For example, you might generate a row for every day of the year over the next 10 years and join that to another table - without cardinality hinting, the CBO may opt for a nested loops approach because it estimates that only one row will be returned from the "generator", when in reality it's more like 3.6k. I've seen problems like this compound to produce loops that get executed several billion times!
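
                        For illustration, a sketch of that scenario (the "sales" table and "sale_date" column are hypothetical): the cardinality hint tells the CBO to expect roughly 3,653 rows from the day generator instead of the single row it would otherwise derive from dual:

                         with days as (
                           select /*+ cardinality(3653) */
                                  trunc(sysdate) + level - 1 as dt
                             from dual
                           connect by level <= 3653          -- roughly 10 years of days
                         )
                         select d.dt, count(*)
                           from days d
                           join sales s                      -- hypothetical fact table
                             on trunc(s.sale_date) = d.dt
                          group by d.dt;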

                        • 24. Re: ORA-30009: Not enough memory for CONNECT BY operation
                          Quanwen Zhao
                          with x as ((SELECT level r FROM dual CONNECT BY level <= 1000))
                          select r as id
                               , to_char(sysdate + r/24/3600, 'yyyy-mm-dd hh24:mi:ss') as inc_datetime
                               , trunc(dbms_random.value(0, 100)) as random_id
                               , dbms_random.string('x', 20) random_string
                          from  (SELECT rownum r
                                 FROM  x d1,
                                       x d2,
                                       x d3
                                 WHERE  rownum <= 100000000
                          );

                           

                          Hello, John_K

                           

                          I've also tested your code on my machine and it still needed about 2 hours and 38 minutes to generate test table t1.

                           

                          SQL> set timing on
                          SQL> 
                          SQL> create table t1
                            2  segment creation immediate
                            3  nologging
                            4  as
                            5  with x as ((SELECT /*+cardinality(1000)*/ level r FROM dual CONNECT BY level <= 1e3))
                            6  select r as id
                            7       , to_char(sysdate + r/24/3600, 'yyyy-mm-dd hh24:mi:ss') as inc_datetime
                            8       , trunc(dbms_random.value(0, 100)) as random_id
                            9       , dbms_random.string('x', 20) random_string
                           10  from  (SELECT /*+all_rows*/ rownum r
                           11         FROM  x d1,
                           12               x d2,
                           13               x d3
                           14         WHERE  rownum <= 1e8
                           15  );
                          
                          
                          Table T1 created.
                          
                          
                          Elapsed: 02:37:24.056
                          

                           

                          How long does it take to create the 1e8 rows of data in table t1 on your system?

                           

                          On the other hand, I re-read the reply Jonathan gave me last time. He pointed out that it's best to use the rpad function to pad out the generated random strings, avoiding extra CPU consumption, so I've modified my code accordingly.

                           

                          SQL> drop table t1 purge;
                          
                          
                          Table dropped.
                          
                          
                          SQL> 
                          SQL> set timing on
                          SQL> 
                          SQL> create table t1
                            2  segment creation immediate
                            3  nologging
                            4  as
                            5  with x as ((SELECT level r FROM dual CONNECT BY level <= 1e4))
                            6  select r as id
                            7         , to_char(sysdate + r/24/3600, 'yyyy-mm-dd hh24:mi:ss') as inc_datetime
                            8         , trunc(dbms_random.value(0, 100)) as random_id
                            9         , rpad(dbms_random.string('x', 6), 20) random_string
                           10  from  (SELECT /*+ no_merge */
                           11         rownum r
                           12         FROM  x d1,
                           13               x d2
                           14         WHERE  rownum <= 1e8
                           15  );
                          
                          
                          Table T1 created.
                          
                          
                          Elapsed: 01:09:30.462
                          SQL> 
                          SQL> set timing off
                          SQL> 
                          SQL> select sum(bytes)/1024/1024 size_mb from user_segments
                            2  where segment_name = 'T1';
                          
                          
                             SIZE_MB
                          ----------
                                5990
                          

                           

                          Do you have a faster and more efficient method? Spending almost 1 hour and 10 minutes is still very slow this time.

                           

                          Best Regards

                          Quanwen Zhao

                          • 25. Re: ORA-30009: Not enough memory for CONNECT BY operation
                            John_K

                            It depends on what you want for your data, I guess. For example, at the moment you're padding a random string out to 20 characters - is there any reason why you can't pad that out to, say, 4000 characters? Then you don't need as many rows to reach your desired end table size, i.e.

                             create table xxt1 as
                               select rpad(dbms_random.string('X',5),20) padding from dual
                               connect by level <= 100000;
                             create table xxt2 as
                               select rpad(dbms_random.string('X',5),4000) padding from dual
                               connect by level <= 100000;
                             select sum(bytes)/1024/1024 size_mb from user_segments
                              where segment_name = 'XXT1';
                             select sum(bytes)/1024/1024 size_mb from user_segments
                              where segment_name = 'XXT2';
                            
                            SQL> set timing on
                            SQL> @a.sql
                            Table created.
                            Elapsed: 00:00:04.10
                            Table created.
                            Elapsed: 00:00:09.49
                              SIZE_MB
                            ----------
                                    3
                            Elapsed: 00:00:00.17
                              SIZE_MB
                            ----------
                                785.25
                            Elapsed: 00:00:00.13
                            SQL>
                            

                            As you can see, the table is significantly larger simply because I've padded it with a load of space. So rather than generating 100M rows, you might be able to get away with less than 1M given that your only worry seems to be the final size of the end table? Similarly, you can also add dummy columns in there that are just padded space if you're not too bothered about the content.
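
                            For example, a minimal sketch of that idea (table and column names are made up for illustration) - a fixed-width filler column bulks out the segment without any extra dbms_random work:

                             create table xxt3 as
                               select rownum id,
                                      rpad(dbms_random.string('X',5),20) random_string,
                                      rpad(' ',2000,' ') filler     -- pure padding, content irrelevant
                                 from dual
                               connect by level <= 100000;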

                            • 26. Re: ORA-30009: Not enough memory for CONNECT BY operation
                              Jonathan Lewis

                              There's not a lot you can do if you want to generate 100M rows with random strings.

                               

                              Most of the time will be spent in calls to generate random bytes - which is clear from the time difference between 6 character and 20 character strings. Some CPU time will be spent, of course, on pure string manipulation and on the date-to-string formatting that you've shown in the earlier postings.

                               

                              If you've got the resources you can parallelise the code, but not while the main (1e8) select is using ROWNUM in any way; and you may have to add a padding column to the generating "with" subquery to make sure that the parallel query slaves that scan it don't end up with just one or two of them getting data and the rest doing nothing.

                               

                              I've supplied more details in the comment on the blog note: https://jonathanlewis.wordpress.com/2018/06/11/dbms_random/
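
                              By way of a rough sketch only (one way to apply this, not the code from that blog comment): materialise the generator as a real, padded table and drive the big CTAS from a cross join with no ROWNUM predicate, so it can run in parallel:

                              create table generator nologging as
                                select rownum id, rpad('x',100) padding      -- padding spreads the rows across granules
                                  from dual
                                connect by level <= 1e4;

                              create table t1 nologging parallel 8 as
                                select /*+ parallel(8) */
                                       v1.id + (v2.id - 1) * 1e4                        id,
                                       to_char(sysdate + (v1.id + (v2.id - 1) * 1e4)/24/3600,
                                               'yyyy-mm-dd hh24:mi:ss')                 inc_datetime,
                                       trunc(dbms_random.value(0,100))                  random_id,
                                       rpad(dbms_random.string('x',6),20)               random_string
                                  from generator v1, generator v2;                      -- 1e4 * 1e4 = 1e8 rows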

                               

                               

                               

                              Regards

                              Jonathan Lewis

                              • 27. Re: ORA-30009: Not enough memory for CONNECT BY operation
                                mathguy

                                I am not sure why people generate "random" <whatever> for testing to begin with. The values are not random anyway, they are 100% deterministic. There are trivial ways to generate many values - even many distinct values - with much simpler algorithms than those used by random number generators.

                                 

                                For example - being lazy here and using ROWNUM, which is not really needed - why not generate the 20-character strings like so? Assume ROWNUM up to 1e8 - 1 to avoid TO_CHAR overflow.

                                 

                                RPAD(TO_CHAR(ROWNUM, 'FM00000000'), 20, 'A')

                                 

                                These values are guaranteed to be distinct. They can also be easily obfuscated - select a very large prime number P (for example a nine-digit prime number, find it somewhere online), and instead of ROWNUM use   MOD(P * ROWNUM + OFFSET, 1e8) for some arbitrary nine-digit OFFSET. If strings of upper-case letters are required for whatever reason, and if the distribution of the first six (or even of ALL) letters must be uniform and independent, all of these can be arranged with trivial arithmetic manipulations, much simpler and less expensive than DBMS_RANDOM functions. Isn't that sufficient?
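
                                 For illustration, a minimal sketch of that idea (the prime 999999937 and the offset 123456789 are arbitrary choices) - distinct, "shuffled" 20-character strings built with plain arithmetic and no DBMS_RANDOM calls:

                                 select rpad(to_char(mod(999999937 * rownum + 123456789, 1e8), 'FM00000000'), 20, 'A') random_string
                                   from dual
                                 connect by level <= 1e4;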
